WO2022253185A1 - Cas12 protein, gene editing system containing cas12 protein, and application - Google Patents

Cas12 protein, gene editing system containing Cas12 protein, and application

Info

Publication number
WO2022253185A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
protein
acid sequence
nucleic acid
sequence
Prior art date
Application number
PCT/CN2022/096002
Other languages
French (fr)
Chinese (zh)
Inventor
王永明
王帅
高思琪
王瑶
Original Assignee
复旦大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 复旦大学
Publication of WO2022253185A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0684 Cells of the urinary tract or kidneys
    • C12N5/0686 Kidney cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04002 Adenine deaminase (3.5.4.2)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2510/00 Genetically modified cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/10 Plasmid DNA
    • C12N2800/106 Plasmid DNA for vertebrates
    • C12N2800/107 Plasmid DNA for vertebrates for mammalian
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/22 Vectors comprising a coding region that has been codon optimised for expression in a respective host

Definitions

  • This application belongs to the technical field of gene editing, and specifically relates to Cas12 protein, a gene editing system containing the Cas12 protein and related applications.
  • the CRISPR/Cas system is an adaptive immune system evolved by bacteria and archaea to resist the invasion of foreign viruses or plasmids.
  • crRNA (CRISPR-derived RNA) and the Cas12 protein form a complex that recognizes the PAM (protospacer adjacent motif) sequence of the target site. After recognition, the crRNA base-pairs with the targeted DNA sequence, and the Cas protein cuts the DNA, producing a DNA break.
  • the CRISPR/Cas12b system also contains tracrRNA (trans-activating RNA), which forms a complex with crRNA and Cas12b to function.
  • tracrRNA and crRNA can be fused into a single-stranded guide RNA (single guide RNA, sgRNA) through a linking sequence.
  • sgRNA single guide RNA
  • NHEJ non-homologous end-joining
  • HR homologous recombination
  • In addition to basic scientific research, the CRISPR/Cas12 gene editing system also has broad clinical application prospects. When the CRISPR/Cas12 gene editing system is used for gene therapy, the Cas protein and the single-stranded guide RNA must be introduced into the body. At present, the most effective expression vector for gene therapy is adeno-associated virus (AAV). However, the DNA packaged by AAV generally does not exceed 4.5 kb. SpCas9 has been widely used because of its simple PAM sequence (it recognizes NGG) and high activity. However, the SpCas9 protein has 1368 amino acids; together with the sgRNA and promoter, it cannot be effectively packaged into AAV, which limits its clinical application.
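The size argument above can be checked with simple arithmetic. The following is a minimal sketch, not part of the application: the 4.5 kb limit and the 1368-residue length of SpCas9 come from the text, while the 700 bp allowance for promoter, sgRNA cassette and other elements (`EXTRA_BP`) and the 800-residue comparison protein are hypothetical values chosen only for illustration.

```python
# Rough AAV payload check (illustrative sketch only).
AAV_LIMIT_BP = 4500  # packaging limit quoted in the text (about 4.5 kb)

# ASSUMPTION: combined size of promoter, sgRNA cassette and other elements.
# This number is hypothetical and not taken from the application.
EXTRA_BP = 700


def coding_length_bp(num_amino_acids: int) -> int:
    """Open reading frame length: 3 nt per residue plus one stop codon."""
    return 3 * num_amino_acids + 3


def fits_in_aav(num_amino_acids: int, extra_bp: int = EXTRA_BP,
                limit_bp: int = AAV_LIMIT_BP) -> bool:
    """True if the coding sequence plus the extra elements stay within the limit."""
    return coding_length_bp(num_amino_acids) + extra_bp <= limit_bp


if __name__ == "__main__":
    print("SpCas9 ORF (bp):", coding_length_bp(1368))
    print("SpCas9 fits:", fits_in_aav(1368))
    # Hypothetical 800-residue nuclease, used only for comparison.
    print("800-aa protein fits:", fits_in_aav(800))
```

Under these assumptions the SpCas9 coding sequence alone already takes up most of the packaging capacity, which is the motivation the text gives for smaller Cas proteins.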
  • AAV adeno-associated virus
  • Cas9s with small molecular weight include SaCas9 (PAM sequence NNGRRT), St1Cas9 (PAM sequence NNAGAW), NmCas9 (PAM sequence NNNNGATT), Nme2Cas9 (PAM sequence NNNNCC), and one further Cas9 (PAM sequence NNNNRYAC).
  • these Cas9s are either easy to off-target (that is, cut at non-target sites), or have complex PAM sequences, or have low editing activity, making it difficult to be widely used.
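One way to make "simple" versus "complex" PAM concrete is to estimate how often each motif occurs in random DNA. The sketch below is an illustration under a stated assumption, namely that the four bases are equally frequent and independent, which real genomes do not satisfy. The function name `pam_frequency` is hypothetical; the PAM strings are those quoted in this document.

```python
from fractions import Fraction

# IUPAC nucleotide codes used by the PAMs quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "W": "AT",    # A or T
    "N": "ACGT",  # any nucleotide
}


def pam_frequency(pam: str) -> Fraction:
    """Expected fraction of positions matching the PAM on one strand.

    ASSUMPTION: bases are equiprobable and independent.
    """
    freq = Fraction(1)
    for code in pam:
        freq *= Fraction(len(IUPAC[code]), 4)
    return freq


if __name__ == "__main__":
    for pam in ["NGG", "NNGRRT", "NNAGAW", "NNNNGATT", "NNNNCC",
                "NNNNRYAC", "TTN", "YYN"]:
        f = pam_frequency(pam)
        print(f"{pam:9s} about 1 site per {f.denominator // f.numerator} positions")
```

Under this simplified model a less restrictive PAM such as YYN is expected far more often than a long, specific motif such as NNNNGATT.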
  • the present invention provides a conjugate comprising:
  • the Cas12 protein is the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue thereof that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity; and
  • the present invention provides a fusion protein comprising:
  • the Cas12 protein is the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue thereof that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • the present invention provides a single-stranded guide RNA comprising a CRISPR repeat sequence, the CRISPR repeat sequence having the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18, or a nucleic acid sequence that has at least 90% sequence identity with the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retains its biological activity, or a nucleic acid sequence that is obtained by modifying the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retains its biological activity.
  • the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
  • the Cas12 protein is the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue thereof that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention.
  • the present invention provides a vector comprising a nucleic acid sequence encoding the following:
  • the Cas12 protein is the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue thereof that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • the fusion protein of the second aspect of the present invention.
  • the present invention provides a vector comprising a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention.
  • the present invention provides a CRISPR/Cas12 gene editing system, which comprises:
  • a) protein component comprising:
  • the Cas12 protein is the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue thereof that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • b) nucleic acid component comprising:
  • the single-stranded guide RNA of the third aspect of the invention.
  • the present invention provides a cell comprising: the isolated nucleic acid molecule of the sixth aspect of the present invention, or the vector of the seventh aspect of the present invention.
  • the present invention provides a method for gene editing a target sequence in a cell or in vitro, the method comprising: making the Cas12 protein, the conjugate of the first aspect of the present invention or the second aspect of the present invention
  • the Cas12 protein is respectively Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6 , or having at least 80 amino acid sequences as shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 4, S
  • the present invention provides a kit comprising: the Cas12 protein, the conjugate of the first aspect of the present invention or the fusion protein of the second aspect of the present invention, and the single-stranded guide RNA of the third aspect of the present invention; the isolated nucleic acid molecule of the fourth aspect and the fifth aspect of the present invention; the vector of the sixth aspect and the seventh aspect of the present invention; or the CRISPR/Cas12 gene editing system of the eighth aspect of the present invention; or target sequences in an in vitro environment for gene editing; wherein the Cas12 protein is the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or for having any one in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ
  • the Cas12j-8 protein has few amino acids, the fewest among the eukaryotic gene editors currently available, and thus can be efficiently packaged into expression vectors such as adeno-associated virus vectors.
  • the protein has the characteristics of high specificity and simple PAM, and the protein has a small molecular weight and can be easily packaged by vector tools such as adeno-associated virus, which is very suitable for later development as a gene therapy tool.
  • the PAM of the Cas12j-8 protein is TTN, which is simple and has a wide range of editing.
  • Cas12j-8 protein has a significant advantage in editing efficiency at random sites compared with FnCas12a protein, and has a strong gene editing ability in a eukaryotic environment.
  • Cas12j-8 has a very significant editing advantage, and the editing ability at random sites is significantly higher than that of Cas12j-2, which is more suitable for the development and application research of gene editing.
  • the Cas12a protein and Cas12b protein of the present invention have higher editing activity, higher specificity, and have a relatively simple PAM sequence.
  • PAM is YYN, which expands the field of Cas12a protein and Cas12b protein, and increases the application range of Cas12a protein and Cas12b protein.
  • Figure 1 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/Cas12J-8 gene editing system
  • Figure 2 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/ChCas12b gene editing system
  • Figure 3 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/Mb4Cas12a gene editing system
  • Figure 4 shows a schematic diagram of the editing efficiency results after the CRISPR/MoCas12a gene editing system performs gene editing on two target sites;
  • Figure 5 shows a schematic diagram of the editing efficiency results after the CRISPR/BgCas12a gene editing system performs gene editing on two target sites;
  • Figure 6 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/MlCas12a gene editing system
  • Figure 7 and Figure 8 are schematic diagrams showing the specific detection results of the CRISPR/Cas12J-8 gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 9 shows a schematic diagram of the specific detection results of the CRISPR/ChCas12b gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 10 shows a schematic diagram of the specific detection results of the CRISPR/Mb4Cas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 11 shows a schematic diagram of the specific detection results of the CRISPR/MoCas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 12 shows a schematic diagram of the specific detection results of the CRISPR/BgCas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 13 shows a schematic diagram of the specific detection results of the CRISPR/MlCas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 14 shows the results of editing the target sites of each endogenous site by the Cas12J-8ABE base editor.
  • Figure 15 shows a schematic diagram of using the GFP reporter cell line library to detect the editing of the target gene by the CRISPR/Cas system.
  • Figure 16 shows cell photographs of GFP reporter cell lines processed using several CRISPR/Cas12J gene editing systems, wherein the upper image is a fluorescent image, and the lower image is an ordinary microscopic image.
  • Figure 17 shows a schematic diagram of the editing efficiency of ChCas12b and its point mutants in GFP cell lines.
  • Figure 18 shows a schematic diagram of the editing efficiency of Cas12a and its point mutants in GFP cell lines.
  • Cas12 protein is the protein component of the CRISPR/Cas12 genome editing system. Under the guidance of a single-stranded guide RNA (gRNA), it targets and cuts DNA target sequences, forming DNA double-strand breaks (DSB). DNA double-strand breaks activate the cell's inherent repair mechanisms, non-homologous end joining (NHEJ) and homologous recombination (HR), which repair the DNA damage. During repair, site-specific editing is introduced into that specific DNA sequence.
  • point mutant refers to a mutant protein that has one or more amino acid mutations relative to the wild-type protein and falls within the scope of the term "homologue" of the present invention.
  • point mutants include mutants having single point mutations or multiple point mutations.
  • the notation "AXXXB" is used to denote a point mutant in which amino acid A at position XXX is mutated to B.
  • the point mutant T207A of Mb4Cas12a protein represents the mutant that T (threonine (Thr)) at position 207 of Mb4Cas12a protein is mutated to A (alanine (Ala)).
  • the point mutant T207A-N616S of the Mb4Cas12a protein represents the mutant in which the T (threonine (Thr)) at position 207 of the Mb4Cas12a protein is mutated to A (alanine (Ala)) and the N (asparagine (Asn)) at position 616 is mutated to S (serine (Ser)).
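The "AXXXB" notation described above is straightforward to parse mechanically. The following is a minimal sketch; the function names `parse_mutations` and `apply_mutations` are hypothetical, and the four-residue toy sequence in the usage example is invented for illustration and is not any sequence from the application.

```python
import re

# One point mutation: wild-type residue, 1-based position, new residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(label: str) -> list[tuple[str, int, str]]:
    """Parse a label such as 'T207A-N616S' into (wild_type, position, mutant) tuples."""
    parsed = []
    for part in label.split("-"):
        match = _MUTATION.match(part)
        if match is None:
            raise ValueError(f"not in AXXXB form: {part!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_mutations(sequence: str, label: str) -> str:
    """Return the mutated sequence, checking that each wild-type residue matches."""
    residues = list(sequence)
    for wild_type, position, mutant in parse_mutations(label):
        if position < 1 or position > len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutations("T207A-N616S"))
    # Toy sequence (hypothetical), positions are 1-based.
    print(apply_mutations("MKTN", "T3A-N4S"))
```

Checking the wild-type residue before substituting catches off-by-one errors in position numbering, which are easy to make when a construct carries tags or linkers.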
  • sgRNA single-stranded guide RNA
  • sgRNA single guided RNA
  • a single-stranded guide RNA or sgRNA may comprise a CRISPR repeat sequence (repeat sequence) and a guide sequence (guide sequence), and the guide sequence is also referred to herein as a guide RNA (guide RNA or gRNA).
  • guide RNA or gRNA guide RNA
  • the guide sequence is also called a spacer.
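As an illustration of the two-part structure just described (CRISPR repeat plus guide sequence), the sketch below assembles a guide RNA string from a repeat and a DNA protospacer. Several things here are assumptions rather than statements from the application: the repeat is placed 5' of the guide (the arrangement commonly described for Cas12a crRNAs), the protospacer is given as the non-target strand in 5' to 3' orientation, and `PLACEHOLDER_REPEAT` is an invented stand-in, not one of SEQ ID NO: 15 to 18.

```python
# Hypothetical placeholder; NOT a real CRISPR repeat from the application.
PLACEHOLDER_REPEAT = "GGAAUUCC"


def to_rna(dna: str) -> str:
    """Convert a DNA sequence to the RNA sequence of the same strand (T -> U)."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("DNA sequence may contain only A, C, G, T")
    return dna.replace("T", "U")


def build_guide_rna(repeat_rna: str, protospacer_dna: str) -> str:
    """Concatenate repeat and guide.

    ASSUMPTION: repeat 5' of the guide; protospacer is the non-target strand,
    so the guide has the same sequence with U in place of T.
    """
    return repeat_rna.upper() + to_rna(protospacer_dna)


if __name__ == "__main__":
    print(build_guide_rna(PLACEHOLDER_REPEAT, "ACGTTGCA"))
```

For a Cas12b-type system, which the text says also uses tracrRNA, the layout of the fused single guide would differ; this sketch does not attempt to model that.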
  • a guide sequence is any polynucleotide sequence that has sufficient similarity to a target sequence to hybridize to and direct specific binding of the CRISPR/Cas12 complex to the target sequence.
  • the degree of complementarity between the guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%. Determining the optimal alignment is within the ability of one of ordinary skill in the art. For example, there are public and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, Smith-Waterman in matlab, Bowtie, Geneious, Biopython, and SeqMan.
  • CRISPR/Cas12 complex refers to the complex formed by a single-stranded guide RNA (single guide RNA) or mature crRNA together with the Cas12 protein; it includes a guide sequence that hybridizes with the target sequence and thereby brings the Cas12 protein to the target sequence.
  • the complex recognizes and cleaves polynucleotides that hybridize to the single-stranded guide RNA or mature crRNA.
  • target sequence refers to a polynucleotide that a guide sequence is designed to target, e.g., a sequence complementary to the guide sequence, wherein hybridization between the target sequence and the guide sequence promotes the activity of Cas12, such as cleavage of the target sequence. Full complementarity is not required, as long as there is sufficient complementarity to cause hybridization and allow Cas12 to exert its activity.
  • a target sequence can include any polynucleotide, such as DNA or RNA.
  • the target sequence is located in the nucleus or cytoplasm of the cell. In some cases, the target sequence may be located in an organelle of a eukaryotic cell such as the mitochondria or chloroplast.
  • "target sequence" or "target polynucleotide" as used herein may be any polynucleotide endogenous or exogenous to a cell (e.g., a eukaryotic cell).
  • the target polynucleotide can be a polynucleotide present in the nucleus of a eukaryotic cell.
  • the target polynucleotide can be a sequence encoding a gene product (e.g., protein) or a non-coding sequence (e.g., regulatory polynucleotide or dummy DNA).
  • the target sequence should be associated with a protospacer adjacent motif (PAM).
  • PAM protospacer adjacent motif
  • the exact sequence and length requirements for the PAM vary depending on the Cas protein used, but the PAM is typically a sequence of 2-5 bases adjacent to the protospacer (target sequence). Those skilled in the art will be able to identify the PAM sequence to use with a given Cas protein.
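Locating candidate target sites next to a PAM can be sketched as a simple pattern scan. The example below is illustrative only and rests on assumptions not stated in this section: the PAM lies immediately 5' of the protospacer (the arrangement generally reported for Cas12a-type nucleases), only the given strand is scanned, and the default 20-nt protospacer length is a hypothetical choice. The function name `find_sites` is also hypothetical.

```python
import re

# Regex fragments for the IUPAC codes used in this document's PAMs.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def find_sites(dna: str, pam: str = "TTN", spacer_len: int = 20):
    """Return (pam_start, pam, protospacer) for every 5'-PAM site on the given strand.

    ASSUMPTION: the PAM sits directly 5' of the protospacer.
    A lookahead is used so that overlapping sites are all reported.
    """
    pam_pattern = "".join(IUPAC_REGEX[code] for code in pam.upper())
    pattern = re.compile(rf"(?=({pam_pattern})([ACGT]{{{spacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    # Short toy sequence and a 4-nt protospacer, purely for demonstration.
    for site in find_sites("CCTTAGGGGCC", pam="TTN", spacer_len=4):
        print(site)
```

A complete design tool would also scan the reverse complement and filter candidates for off-target similarity; both are outside the scope of this sketch.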
  • As used herein, the terms "polynucleotide", "nucleic acid sequence", "nucleotide sequence" or "nucleic acid fragment" are used interchangeably and are single- or double-stranded RNA or DNA polymers, optionally containing synthetic, unnatural, or altered nucleotide bases.
  • Nucleotides are referred to by their single letter designations as follows: "A" for adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" for cytidine or deoxycytidine, "G" for guanosine or deoxyguanosine, "U" means uridine, "T" means deoxythymidine, "R" means purine (A or G), "Y" means pyrimidine (C or T), "K" means G or T, "H" means A or C or T, "I" means inosine, and "N" means any nucleotide.
  • polypeptide used herein are used interchangeably in this application to refer to a polymer of amino acid residues.
  • the term applies to amino acid polymers in which one or more amino acid residues are an artificial chemical analog of the corresponding naturally occurring amino acid, and to naturally occurring amino acid polymers.
  • polypeptide may also include modified forms including, but not limited to, glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • sequence identity or “homology” as used herein has an art-recognized meaning, and the percent sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the entire length of a polynucleotide or polypeptide, or along a region of the molecule. (See, e.g., Computational Molecular Biology, Lesk,
  • suitable conservative amino acid substitutions in peptides or proteins are well known to the skilled artisan and can generally be made without altering the biological activity of the resulting molecule.
  • those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
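The "at least 80% sequence identity" criterion used throughout this document can be illustrated with a small calculation on two sequences that have already been aligned. This is a sketch under stated assumptions: the alignment itself is taken as given (for example from one of the alignment programs named earlier), gaps are written as "-", and identity is computed as matching columns divided by all aligned columns, which is only one of several conventions in use. The peptide strings are invented examples.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    ASSUMPTION: inputs are pre-aligned, equal-length strings with '-' for gaps;
    the denominator is the number of columns that are not gap-gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # invented example peptide
    print(percent_identity(reference, "MKTAYIAKQL"))
    print(meets_threshold(reference, "MKSAYLAKQL"))
```

Because the result depends on the alignment and on how gaps are counted, a reported identity value is only meaningful together with the method used to obtain it.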
  • vector refers to a nucleic acid delivery vehicle into which a polynucleotide can be inserted.
  • When the vector enables the expression of a protein encoded by the inserted polynucleotide, or when the vector enables the transcription of the inserted polynucleotide (e.g., into mRNA or functional RNA), the vector is called an expression vector.
  • a vector can be introduced into a host cell by transformation, transduction or transfection, so that the genetic material elements it carries can be expressed in the host cell.
  • Vectors are well known to those skilled in the art, including but not limited to: plasmid vectors, viral vectors and the like.
  • the vector may also contain various regulatory sequences to regulate expression.
  • "regulatory sequence" and "regulatory element" are used interchangeably herein to refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within, or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements, reporter genes, and the like. The regulatory sequences may be of different origin, or of the same origin but arranged in a manner different from that normally found in nature. In addition, the vector may also contain an origin of replication.
  • promoter refers to a nucleic acid segment capable of controlling the transcription of another nucleic acid segment.
  • the promoter is a promoter capable of controlling the transcription of a gene in a cell, whether or not it is derived from said cell.
  • the promoter may be a constitutive promoter or a tissue specific promoter or a developmentally regulated promoter or an inducible promoter.
  • "tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed primarily, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in a particular cell or cell type.
  • a “developmentally regulated promoter” refers to a promoter whose activity is determined by developmental events.
  • An “inducible promoter” selectively expresses an operably linked DNA sequence in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
  • "Introducing" a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism means transforming cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • Transformation as used in the present invention includes stable transformation and transient transformation.
  • stable transformation refers to the introduction of a foreign nucleotide sequence into the genome, resulting in stable inheritance of the foreign gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • transient transformation refers to the introduction of a nucleic acid molecule or protein into a cell to perform a function without the stable inheritance of a foreign gene. In transient transformation, the exogenous nucleic acid sequence does not integrate into the genome.
  • complementarity refers to the ability of a nucleic acid sequence to form one or more hydrogen bonds with another nucleic acid sequence by means of traditional Watson-Crick or other non-traditional types. Percent complementarity indicates the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with another nucleic acid sequence (e.g., if 5, 6, 7, 8, 9 or 10 out of 10 residues are complementary, the percent complementarity is 50%, 60%, 70%, 80%, 90% or 100%, respectively). "Perfectly complementary" means that all contiguous residues of one nucleic acid sequence hydrogen bond with the same number of contiguous residues in the other nucleic acid sequence.
  • "Substantially complementary" refers to a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% over a region of 30, 35, 40, 45, 50 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
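The percent-complementarity definition above can be sketched in a few lines. The example makes simplifying assumptions that go beyond the text: both strands are written 5' to 3' and have equal length, pairing is antiparallel with no gaps or bulges, and only Watson-Crick pairs (A-T, A-U, G-C) are counted, so the non-traditional pairings mentioned in the definition are ignored. The sequences are invented.

```python
# Watson-Crick pairs only (ASSUMPTION: wobble and other pairings are not counted).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
                ("A", "U"), ("U", "A")}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def percent_complementarity(strand: str, other: str) -> float:
    """Percent of positions that pair when the two strands are antiparallel.

    Both inputs are written 5' to 3', so position i of `strand` is compared
    with position (n - 1 - i) of `other`.
    """
    strand, other = strand.upper(), other.upper()
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(1 for a, b in zip(strand, reversed(other))
                 if (a, b) in WATSON_CRICK)
    return 100.0 * paired / len(strand)


if __name__ == "__main__":
    target = "ACGTACGTAC"  # invented 10-nt example
    print(percent_complementarity(target, reverse_complement(target)))
    print(percent_complementarity(target, "CATGCTACGT"))
```

With a 10-nt region, each additional paired position changes the result by 10 percentage points, matching the worked figures in the definition.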
  • stringent conditions used herein in relation to hybridization refers to conditions under which a nucleic acid having complementarity to a target sequence primarily hybridizes to the target sequence and does not substantially hybridize to non-target sequences. Stringent conditions are generally sequence-dependent and depend on many factors. In general, the longer the sequence, the higher the temperature at which the sequence will specifically hybridize to its target sequence. Non-limiting examples of stringent conditions are described in Tijssen (1993), "Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes", Part I, Chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assay", Elsevier, New York.
  • hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized by hydrogen bonding of the bases between the nucleotide residues. Hydrogen bonding can occur by means of Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex may comprise two strands forming a duplex, three or more strands forming a multi-strand complex, a single self-hybridizing strand, or any combination of these.
  • a hybridization reaction may constitute a step in a broader process such as the initiation of PCR, or cleavage of a polynucleotide by an enzyme. A sequence that is capable of hybridizing to a given sequence is called the "complement" of that given sequence.
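The percent-complementarity definition given above can be illustrated with a short sketch. It is only an illustration under simplifying assumptions that are not part of the original text: both strands have equal length, both are written 5'→3', and only Watson-Crick pairs are counted. The function names are hypothetical.

```python
# Watson-Crick pairs for DNA and RNA bases; non-canonical pairs are ignored in this sketch.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(strand: str, other: str) -> float:
    """Percentage of residues in `strand` that can pair with `other`.

    Both strands are written 5'->3'; `other` is read in reverse so that
    the two strands are compared in antiparallel orientation.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("this sketch requires two non-empty strands of equal length")
    pairs = zip(strand.upper(), reversed(other.upper()))
    paired = sum((a, b) in WATSON_CRICK for a, b in pairs)
    return 100.0 * paired / len(strand)


def is_perfectly_complementary(strand: str, other: str) -> bool:
    """True when every residue of `strand` pairs with `other`."""
    return percent_complementarity(strand, other) == 100.0
```

For example, a 10-residue strand in which 8 residues can pair gives 80%, matching the worked percentages in the definition.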
  • the Cas12 protein can be derivatized, for example linked to another molecule (e.g., another protein or polypeptide).
  • the Cas12 protein can be functionally connected (by chemical coupling, gene fusion, non-covalent connection or other means) to one or more other molecular moieties, such as another protein or polypeptide, a detectable label, a pharmaceutical agent, etc.
  • the Cas12 protein can be linked to other functional units.
  • it can be linked to a nuclear localization signal (NLS) sequence to enhance the ability of the protein of the invention to enter the nucleus.
  • it can be linked with a targeting moiety so that the Cas12 protein is targeted.
  • it can be linked with a detectable label to facilitate the detection of Cas12 protein.
  • it can be linked with an epitope tag to facilitate the expression, detection, tracking and/or purification of the Cas12 protein.
  • the present invention provides a conjugate comprising:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • BgCas12a protein of the amino acid sequence shown in SEQ ID NO: 5, or
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the "biological activity" of the Cas12 protein refers to, but is not limited to, the activity of the protein in binding to a single-stranded guide RNA, endonuclease activity (including single-strand cleavage activity and double-strand cleavage activity), and/or guide RNA (gRNA)-guided binding and cleavage activity at a specific site in a target sequence.
  • said homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • in addition to being used on its own, the Cas12 protein can also be combined with other substances, such as other proteins or detectable labels, to impart other functions.
  • the modified moiety may be an additional protein or polypeptide, a detectable label, or a combination thereof.
  • said additional protein or polypeptide is selected from one or more of an epitope tag, a reporter protein, a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylases DNMT3A and MQ1, cytosine demethylase Tet1, transcription activators VP64, p65 and RTA, transcription repressor KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI.
  • Epitope tags are well known to those skilled in the art, examples of which include but are not limited to His, V5, FLAG, HA, Myc, VSV-G, Trx, etc., and those skilled in the art know how to select the appropriate epitope tag according to the intended purpose (e.g., detection or tracking).
  • Reporter proteins are well known to those skilled in the art, and examples thereof include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP, and the like.
  • Detectable labels are well known to those skilled in the art, examples of which include fluorescent dyes such as fluorescein isothiocyanate (FITC) or DAPI.
  • the Cas12 protein of the present invention can be coupled, conjugated or fused to the modified part through a linker, or directly connected to the modified part without a linker.
  • Linkers are well known in the art, examples of which include but are not limited to linkers comprising 1-50 amino acids (such as Glu or Ser) or amino acid derivatives (such as Ahx, β-Ala, GABA or Ava), or PEG, etc.
  • the present invention provides a fusion protein comprising:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G
  • the additional protein or polypeptide may be selected from one or more of epitope tags, reporter proteins, nuclear localization signal (NLS) sequences, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylases DNMT3A and MQ1, cytosine demethylase Tet1, transcription activators VP64, p65 and RTA, transcription repressor KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI.
  • Epitope tags are well known to those skilled in the art, examples of which include but are not limited to His, V5, FLAG, HA, Myc, VSV-G, Trx, etc., and those skilled in the art know how to select the appropriate epitope tag according to the intended purpose (e.g., detection or tracking).
  • Reporter proteins are well known to those skilled in the art, and examples thereof include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP, and the like.
  • Detectable labels are well known to those skilled in the art, examples of which include fluorescent dyes such as fluorescein isothiocyanate (FITC) or DAPI.
  • the Cas12 protein of the present invention can be coupled, conjugated or fused with the other protein or polypeptide through a linker, or directly connected with the other protein or polypeptide without a linker.
  • Linkers are well known in the art, examples of which include but are not limited to linkers comprising 1-50 amino acids (such as Glu or Ser) or amino acid derivatives (such as Ahx, β-Ala, GABA or Ava), or PEG, etc.
  • the fusion protein comprises: the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, adenine deaminase (ABE), and optionally a linker connecting the Cas12J-8 protein and the adenine deaminase (ABE).
  • the fusion protein is sequentially composed of the adenine deaminase (ABE), the linker, and the Cas12J-8 protein from its N-terminus to its C-terminus.
  • the amino acid sequence of the fusion protein is shown in SEQ ID NO: 7.
  • the Cas12j-8 protein has a small number of amino acids, in particular the smallest number among the eukaryotic gene editors currently available, and thus can be efficiently packaged into expression vectors such as adeno-associated virus vectors.
  • the protein has the characteristics of high specificity and simple PAM, and the protein has a small molecular weight and can be easily packaged by vector tools such as adeno-associated virus, which is very suitable for later development as a gene therapy tool.
  • the PAM of the Cas12j-8 protein is TTN, which is simple and has a wide range of editing.
  • Cas12j-8 protein has a significant advantage in editing efficiency at random sites compared with FnCas12a protein, and has a strong gene editing ability in a eukaryotic environment.
  • the Cas12j-8 protein has a very significant editing advantage, and the editing ability at random sites is significantly higher than that of the Cas12j-2 protein, which is more suitable for the development and application research of gene editing.
  • the Cas12a protein and Cas12b protein of the present invention have higher editing activity, higher specificity, and a relatively simple PAM sequence; the PAM of the Cas12a protein and Cas12b protein is YYN, which broadens the field of Cas12a and Cas12b proteins and increases their range of application.
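To make the effect of a simple PAM concrete, the following sketch scans one DNA strand for candidate target sites under a given PAM such as TTN or YYN. It is a minimal illustration, not the procedure used in this document: it assumes the PAM lies immediately 5' of the target sequence, that the target is 24 nucleotides long (the preferred spacer length stated elsewhere in this document), and that only the supplied strand is searched. The function name `find_target_sites` is hypothetical.

```python
import re

# IUPAC codes needed for the PAMs named in the text:
# N = any base, Y = pyrimidine (C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}


def find_target_sites(dna: str, pam: str, spacer_len: int = 24):
    """Return (pam_start, pam_sequence, target_sequence) for every PAM hit.

    Only the given strand is scanned; a complete search would also scan
    the reverse complement.
    """
    pam_pattern = "".join(IUPAC[base] for base in pam.upper())
    # The lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(rf"(?=({pam_pattern})([ACGT]{{{spacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]
```

Because Y matches both C and T, every TTN site is also a YYN site, while sites such as CTA or CCG are reachable only with the YYN PAM.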
  • the present invention provides a single-stranded guide RNA comprising a CRISPR repeat sequence, the CRISPR repeat sequence having:
  • the nucleic acid sequence shown in SEQ ID NO: 17, for the BgCas12a protein, its homologue, conjugate or fusion protein, or
  • the nucleic acid sequence shown in SEQ ID NO: 18, for the ChCas12b protein, its homologue, conjugate or fusion protein;
  • nucleic acid sequence modified based on the nucleic acid sequence described in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retaining its biological activity.
  • the modification may be one or more of base phosphorylation, base sulfuration, base methylation, base hydroxylation, sequence shortening and sequence lengthening.
  • shortening of said sequence and lengthening of said sequence comprise deletions or additions of one, two, three, four, five, six, seven, eight, nine or ten bases.
  • the single-stranded guide RNA may further include a CRISPR spacer sequence at the 3' end of the CRISPR repeat sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides (preferably 24 nucleotides) capable of complementary pairing with the target sequence.
  • the CRISPR spacer sequence is a sequence of 24 nucleotides in length and capable of complementary pairing with the target sequence.
  • said single-stranded guide RNA further comprises a terminator at the 3' end of said spacer sequence.
  • the terminator may be composed of at least six (e.g., seven or eight) Us.
  • the single-stranded guide RNA can combine with the above-mentioned Cas12 protein, conjugate or fusion protein to form a complex, which can recognize the corresponding PAM and thus bind to the target sequence, thereby achieving cleavage of the target sequence, that is, gene editing.
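The guide RNA layout described above (CRISPR repeat, then a 20 to 30 nucleotide spacer at its 3' end, then a terminator of at least six Us) can be sketched as a simple assembly function. This is an illustration only: the function name is hypothetical, and the repeat used in the usage note is a made-up placeholder, not the sequence of SEQ ID NO: 15 to 18, which are not reproduced in this section.

```python
def build_guide_rna(repeat: str, spacer: str, terminator_len: int = 6) -> str:
    """Assemble 5'-repeat-spacer-terminator-3' as an RNA string.

    DNA-style input (with T) is converted to RNA (with U).
    """
    repeat = repeat.upper().replace("T", "U")
    spacer = spacer.upper().replace("T", "U")
    if not repeat:
        raise ValueError("a CRISPR repeat sequence is required")
    if set(repeat + spacer) - set("ACGU"):
        raise ValueError("sequences may contain only A, C, G and U/T")
    if not 20 <= len(spacer) <= 30:
        raise ValueError("spacer must be 20 to 30 nucleotides long")
    if terminator_len < 6:
        raise ValueError("terminator must contain at least six Us")
    return repeat + spacer + "U" * terminator_len
```

For instance, `build_guide_rna("GGGAAACCC", "ACGT" * 6)` joins the placeholder repeat, a 24-nucleotide spacer and six Us into one strand.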
  • the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G,
  • the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding a fusion protein shown in SEQ ID NO:7.
  • the isolated nucleic acid molecule comprises the nucleic acid sequence shown in SEQ ID NO: 14 or its degenerate sequence.
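A degenerate sequence, as the term is used above, is a different nucleic acid sequence that encodes the same amino acid sequence because several codons map to one amino acid. The following sketch checks this for two short coding sequences using the standard genetic code. The helper names are hypothetical, and the example sequences in the usage note are invented for illustration rather than taken from the sequence listing.

```python
# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate_variants(dna_a: str, dna_b: str) -> bool:
    """True when two different DNA sequences encode the same amino acid sequence."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)
```

Here `ATGGCTAAA` and `ATGGCCAAG` differ at two positions yet both translate to Met-Ala-Lys, so each is a degenerate variant of the other.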
  • the isolated nucleic acid molecule also encodes the single-stranded guide RNA corresponding to the Cas12 protein of the third aspect of the present invention.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof (such as the fusion protein shown in SEQ ID NO: 7), such as the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for the Cas12J-8 protein, homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 19, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 15 and retaining its biological activity.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, such as the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for the Cas12a protein, homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 20, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 16 and retaining its biological activity.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homologue, conjugate or fusion protein thereof, such as the nucleic acid sequence shown in SEQ ID NO: 12, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for the BgCas12a protein, homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 21, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 17 and retaining its biological activity.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, such as the nucleic acid sequence shown in SEQ ID NO: 13, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for the ChCas12b protein, homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 22, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 18 and retaining its biological activity.
  • the invention provides an isolated nucleic acid molecule encoding the single-stranded guide RNA of the third aspect of the invention.
  • said isolated nucleic acid molecule comprises the nucleotide sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, or a degenerate sequence thereof.
  • said isolated nucleic acid molecule further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
  • the isolated nucleic acid molecule of the present invention can express the Cas12 protein, its conjugate or fusion protein, and/or the above-mentioned single-stranded guide RNA, which then perform their corresponding functions, such as gene editing.
  • the isolated nucleic acid molecule of the present invention can express the Cas12 protein, its conjugate or fusion protein, and the single-stranded guide RNA separately, or express them together as a single expression product; which expression method is selected depends on the circumstances.
  • the present invention provides a vector comprising a nucleic acid sequence encoding the following:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • said homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof.
  • the vector comprises a nucleic acid sequence encoding a fusion protein shown in SEQ ID NO:7.
  • the vector comprises the nucleic acid sequence shown in SEQ ID NO: 14 or its degenerate sequence.
  • the vector may be an expression vector, such as a plasmid vector such as a pUC19 vector, an episomal vector, a pAAV2_ITR vector, a retroviral vector, a lentiviral vector, an adenoviral vector or an adeno-associated viral vector.
  • the vector further comprises a nucleic acid sequence encoding the single-stranded guide RNA corresponding to the Cas12 protein of the third aspect of the present invention.
  • the vector comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof (such as the fusion protein shown in SEQ ID NO: 7), for example the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for this Cas12J-8 protein, homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 19, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 15 and retaining its biological activity.
  • the vector comprises a nucleic acid sequence encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, for example the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for the Cas12a protein, homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 20, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 16 and retaining its biological activity.
  • the vector comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homologue, conjugate or fusion protein thereof, such as the nucleic acid sequence shown in SEQ ID NO: 12, and comprises a nucleic acid sequence encoding, for this BgCas12a protein, homologue, conjugate or fusion protein, a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological
  • the vector comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, such as the nucleic acid sequence shown in SEQ ID NO: 13, and comprises a nucleic acid sequence encoding, for this ChCas12b protein, homologue, conjugate or fusion protein, a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological
  • the present invention provides a vector comprising a nucleic acid molecule encoding the single-stranded guide RNA of the third aspect of the present invention.
  • the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22 or a degenerate sequence thereof.
  • the vector further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
  • the nucleic acid sequence cloned in the vector can be expressed as the Cas12 protein, its conjugate or fusion protein, and/or the above-mentioned single-stranded guide RNA, which can then perform corresponding functions, such as gene editing.
  • multiple vectors such as two vectors, can be transfected into cells, one of which expresses the Cas12 protein, its conjugate or fusion protein, and the other expresses a single-stranded guide RNA. Subsequently, the expressed Cas12 protein, its conjugate or fusion protein complexes with the expressed single-stranded guide RNA to form a complex, where it performs corresponding functions, such as gene editing.
  • the nucleic acid sequence encoding the Cas12 protein, its conjugate or fusion protein and the nucleic acid sequence encoding the single-stranded guide RNA can also be cloned into one vector, so that the vector, once transfected into cells, expresses both the Cas12 protein, its conjugate or fusion protein and the single-stranded guide RNA, which then perform corresponding functions, such as gene editing.
  • the present invention provides a CRISPR/Cas12 gene editing system, which comprises:
  • a) protein component comprising:
  • Cas12 protein the Cas12 protein is:
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • BgCas12a protein of the amino acid sequence shown in SEQ ID NO: 5, or
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • 1.2) a homologue whose amino acid sequence has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999% or 100% sequence identity, or any percentage of sequence identity between 80% and 100%, with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and which retains its biological activity;
  • nucleic acid component comprising: a single-stranded guide RNA corresponding to the protein component in a) according to the third aspect of the present invention
  • the protein component and the nucleic acid component combine with each other to form a complex.
  • said homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • the protein component comprises the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 15 and retaining its biological activity.
  • the protein component comprises a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 16 and retaining its biological activity.
  • the protein component comprises the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 17 and retaining its biological activity.
  • the protein component comprises the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 18 and retaining its biological activity.
  • the expression "at least 90% sequence identity" mentioned for single-stranded guide RNAs may be, for example, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9%, or 100% sequence identity.
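How a sequence-identity threshold such as "at least 90%" might be checked can be sketched as follows. This is a simplified illustration, not the method defined by this document: it assumes the two sequences have already been aligned to equal length (with `-` marking gaps) by a separate alignment tool, and it divides matches by the number of aligned columns, which is only one of several conventions in use. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences have a gap are ignored; a residue
    aligned against a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the aligned sequences reach the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, two aligned 10-nucleotide sequences that differ at one position are 90% identical and just meet the threshold.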
  • the CRISPR/Cas12 gene editing system of the present invention can be constituted directly by the Cas12 protein described herein, its homologue, or their conjugates or fusion proteins together with the single-stranded guide RNA described herein, or can be constituted by the expression products obtained by expression of the vectors described herein.
  • the CRISPR/Cas12 gene editing system of the present invention realizes the recognition, positioning, cutting and gene editing of the target sequence through the joint action of the Cas12 protein contained therein and the single-stranded guide RNA.
  • the CRISPR/Cas12 gene editing system of the present invention can precisely locate the target sequence.
  • precise positioning has two meanings: first, the CRISPR/Cas12 gene editing system of the present invention can itself recognize and bind the target sequence; second, the CRISPR/Cas12 gene editing system of the present invention can bring other proteins fused with the Cas12 protein, or proteins that specifically recognize the sgRNA, to the position of the target sequence.
  • the CRISPR/Cas12 gene editing system of the present invention has a low tolerance to non-target sequences.
  • "low tolerance" means that the CRISPR/Cas12 gene editing system of the present invention is essentially or completely unable to recognize and bind non-target sequences, or is essentially or completely unable to bring other proteins fused with the Cas12 protein, or proteins that specifically recognize the sgRNA, to the position of a non-target sequence.
  • the CRISPR/Cas12 gene editing system of the present invention can target more DNA sequences in the genome because the PAM sequence on the target sequence recognized by the Cas12 protein contained therein is simpler.
  • the present invention provides a cell comprising: the isolated nucleic acid molecule of the fourth and fifth aspects of the present invention, or the vector of the sixth and seventh aspects of the present invention.
  • the cells may be prokaryotic or eukaryotic.
  • in the case of a eukaryotic cell, it may be, for example, a plant cell or an animal cell.
  • in the case of an animal cell, it may be, for example, a mammalian cell such as a human cell.
  • the present invention provides a method for gene editing a target sequence in a cell or in vitro environment, the method comprising combining any one of the following (1) to (4) with the intracellular or in vitro environment
  • the Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • BgCas12a protein of the amino acid sequence shown in SEQ ID NO: 5, or
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the Cas12 protein, its homologue, conjugate or fusion protein recognizes its respective protospacer adjacent motif (PAM), the PAM is located at the 5' end of the target sequence, and, for the Cas12J-8 protein, the Mb4Cas12a protein, the MlCas12a protein, the MoCas12a protein, the BgCas12a protein and the ChCas12b protein, or their respective homologues, conjugates or fusion proteins, the PAMs are 5'-TTN, 5'-YYN, 5'-YYN, 5'-YYN, 5'-YYN and 5'-TTN, respectively.
  • a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 16 and retaining its biological activity;
  • a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 17 and retaining its biological activity;
  • a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence obtained by modification based on SEQ ID NO: 18 and retaining its biological activity.
  • said homologue can be a point mutant of any one of the Cas12 proteins of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • a vector comprising the nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for the Cas12J-8 protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 15 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for the Mb4Cas12a protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and comprising coding for this BgCas12a protein, its homologue, conjugate or fusion protein comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for this ChCas12b protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • the cell is a prokaryotic cell or a eukaryotic cell, such as a plant cell or an animal cell, such as a mammalian cell such as a human cell.
  • the gene editing includes one or more of: gene knockout of the target sequence, site-directed base change, site-directed insertion, regulation of gene transcription level, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single base conversion, and chromatin imaging tracking.
  • the single base conversion comprises the conversion of the base adenine to guanine, the conversion of cytosine to thymine or the conversion of cytosine to uracil.
  • the CRISPR spacer sequence of the single-stranded guide RNA forms a complete base pairing structure with the target sequence, and forms an incomplete base pairing structure with the non-target sequence.
  • the incomplete base pairing structure refers to a structure that includes both complementary base pairs and non-complementary positions, the latter including, for example, base mismatches and/or base bulges.
  • the incomplete base pairing structure includes one or more, e.g., two or more, base mismatches.
  • the Cas12 protein of the present invention can cut the target site on the target sequence, and under the cleavage action of the Cas12 protein, a double-strand break occurs in the target sequence. Further, when the method is carried out in cells, the cleaved target sequence can be repaired by intracellular non-homologous end joining repair or homologous recombination repair pathway, thereby realizing gene editing of the target sequence.
  • the CRISPR/Cas12 gene editing system of the present invention and the gene editing method using the gene editing system were found through experiments to have editing efficiencies of 40%-70% (for the Cas12J-8 protein), 12%-56% (for the ChCas12b protein) and 10%-20% (for each of the other Cas12a proteins).
  • mismatches within the first 14 bp of the guide RNA are tolerated at a rate close to 0%. Therefore, the gene editing system can edit target genes with high specificity, has the characteristics of high editing efficiency and low off-target rate, and can be widely used for gene editing in cells or in in vitro environments.
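  • The mismatch reasoning above can be sketched as a simple position-wise comparison between a spacer and a candidate site. This is a minimal illustration only: the function names are hypothetical, positions are counted from the 5' end of the spacer (an assumption about what "first 14 bp" refers to), and bulges are not modeled.

```python
def mismatch_positions(spacer, site):
    """Return 1-based positions, counted from the 5' end, where spacer and site differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(spacer.upper(), site.upper())) if a != b]


def has_seed_mismatch(spacer, site, seed_len=14):
    """True if any mismatch falls inside the first `seed_len` positions.

    seed_len=14 mirrors the 'first 14 bp' statement in the text.
    """
    return any(pos <= seed_len for pos in mismatch_positions(spacer, site))
```

  • Under the statement above, a candidate off-target site for which `has_seed_mismatch` returns True would be expected to be edited at a rate close to 0%.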
  • the present invention provides a kit for gene editing a target sequence in a cell or in an in vitro environment, comprising:
  • the Cas12 protein is:
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • 1.2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, 100%, or any percentage between 80% and 100% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
  • a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a single-stranded guide RNA comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 15 and retains its biological activity, or a single-stranded guide RNA comprising a modified sequence that is based on SEQ ID NO: 15 and retains its biological activity;
  • a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, a homologue thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a single-stranded guide RNA comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 16 and retains its biological activity, or a single-stranded guide RNA comprising a modified sequence that is based on SEQ ID NO: 16 and retains its biological activity;
  • a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, a homologue thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 5, or a conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a single-stranded guide RNA comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological activity, or a single-stranded guide RNA comprising a modified sequence that is based on SEQ ID NO: 17 and retains its biological activity;
  • a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, a homologue thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 6, or a conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a single-stranded guide RNA comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological activity, or a single-stranded guide RNA comprising a modified sequence that is based on SEQ ID NO: 18 and retains its biological activity.
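  • The sequence identity thresholds used above (at least 80% for protein homologues, at least 90% for CRISPR repeat homologues) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with '-' marking gaps, and it divides matches by the number of aligned columns; other identity definitions exist, and the document does not state which one is used. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns of two pre-aligned sequences ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if the identity reaches the threshold (e.g. 80 for proteins, 90 for repeats)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```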
  • said homologue can be a point mutant of any one of the Cas12 proteins of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • a Cas12J-8 protein encoding an amino acid sequence shown in SEQ ID NO: 1, a homologue thereof, a conjugate or a fusion protein (such as a fusion protein shown in SEQ ID NO: 7) nucleic acid sequence (such as SEQ ID NO: 8 or the nucleic acid sequence shown in SEQ ID NO: 14) the isolated nucleic acid molecule of nucleic acid molecule, and comprise coding for this Cas12J-8 albumen, its homologue, conjugate or fusion protein comprising SEQ ID NO:
  • the isolated nucleic acid molecule of the nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 19) of the single-stranded guide RNA of the modified sequence;
  • nucleic acid sequence (SEQ ID NO : 9, SEQ ID NO: 10 or the nucleic acid sequence shown in SEQ ID NO: 11) isolated nucleic acid molecules, and comprising SEQ ID NO coding for the Cas12a protein, its homologue, conjugate or fusion protein : CRISPR repeat sequence shown in 16, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 16 and retaining its biological activity
  • isolated nucleic acid molecule of the nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 20) of the single-stranded guide RNA of the modified sequence;
  • an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 21) encoding a single-stranded guide RNA for this BgCas12a protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 17 and retaining its biological activity;
  • an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for this ChCas12b protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • a nucleic acid sequence shown in SEQ ID NO: 13 comprising a ChCas12b protein encoding an amino acid sequence shown in SEQ ID NO:
  • a vector comprising the nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for the Cas12J-8 protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 15 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for the Cas12a protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and comprising coding for this BgCas12a protein, its homologue, conjugate or fusion protein comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for this ChCas12b protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • the kit of the present invention may also contain other reagents that are useful for gene editing.
  • SEQ ID NO: 1 Cas12J-8 protein sequence
  • SEQ ID NO: 2 Mb4Cas12a protein sequence
  • SEQ ID NO: 3 MlCas12a protein sequence
  • SEQ ID NO: 4 MoCas12a protein sequence
  • SEQ ID NO: 5 BgCas12a protein sequence
  • SEQ ID NO: 6 ChCas12b protein sequence
  • SEQ ID NO: 7 fusion protein comprising Cas12J-8 protein
  • SEQ ID NO: 8 Coding sequence of Cas12J-8 protein
  • SEQ ID NO: 9 Coding sequence of Mb4Cas12a protein
  • SEQ ID NO: 10 coding sequence of MlCas12a protein
  • SEQ ID NO: 11 Coding sequence of MoCas12a protein
  • SEQ ID NO: 12 coding sequence of BgCas12a protein
  • SEQ ID NO: 13 Coding sequence of ChCas12b protein
  • SEQ ID NO: 14 fusion protein coding sequence comprising Cas12J-8 protein
  • SEQ ID NO: 15 CRISPR repeat sequence used in conjunction with Cas12J-8 protein
  • SEQ ID NO: 16 CRISPR repeat sequence used in conjunction with Mb4Cas12a, MlCas12a and MoCas12a proteins
  • SEQ ID NO: 17 CRISPR repeat sequence used in conjunction with BgCas12a protein
  • SEQ ID NO: 18 CRISPR repeat sequence used in conjunction with ChCas12b protein
  • SEQ ID NO: 19 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with Cas12J-8 protein
  • SEQ ID NO: 20 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with Mb4Cas12a, MlCas12a, and MoCas12a proteins
  • SEQ ID NO: 21 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with BgCas12a protein
  • SEQ ID NO: 22 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with ChCas12b protein
  • SEQ ID NO: 23 Cas12J-4 protein sequence
  • SEQ ID NO: 24 Cas12J-5 protein sequence
  • SEQ ID NO: 25 Cas12J-7 protein sequence
  • SEQ ID NO: 26 Cas12J-9 protein sequence
  • SEQ ID NO: 27 Coding sequence of Cas12J-4 protein
  • SEQ ID NO: 28 Coding sequence of Cas12J-5 protein
  • SEQ ID NO: 29 Coding sequence of Cas12J-7 protein
  • SEQ ID NO: 30 Coding sequence of Cas12J-9 protein
  • SEQ ID NO: 31 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-4 protein
  • SEQ ID NO: 32 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-5 protein
  • SEQ ID NO: 33 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-7 protein
  • SEQ ID NO: 34 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-9 protein
  • for each Cas12 protein listed in Table 1, its amino acid sequence was downloaded; the amino acid sequences of the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein and ChCas12b protein are shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively.
  • Table 1:
    Cas12 protein name | NCBI Protein Search ID | Amino acid sequence
    Cas12J-8 | none | SEQ ID NO: 1
    Mb4Cas12a | WP_078273923.1 | SEQ ID NO: 2
    MlCas12a | WP_065256572.1 | SEQ ID NO: 3
    MoCas12a | WP_112744621.1 | SEQ ID NO: 4
    BgCas12a | OLA11341.1 | SEQ ID NO: 5
    ChCas12b | OQB30769 | SEQ ID NO: 6
  • the coding nucleic acid sequences of the above Cas12 proteins were codon-optimized to obtain gene sequences of the Cas12 proteins that are highly expressed in human cells.
  • the optimized gene sequences of the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein and ChCas12b protein are shown in SEQ ID NO: 8 to SEQ ID NO: 13, respectively.
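  • The document does not describe how codon optimization was performed. As a rough illustration of the idea only, the sketch below back-translates a protein sequence using one codon per amino acid. The codon table lists codons commonly reported as frequent in human genes and is an assumption for this example, as are the function name and the choice of stop codon; practical tools also weigh GC content, repeats and restriction sites.

```python
# Illustrative one-codon-per-amino-acid table (an assumption; consult a
# codon-usage database before any real use).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein, stop_codon="TGA"):
    """Return a coding DNA sequence for `protein` using the table above, plus a stop codon."""
    try:
        codons = [PREFERRED_CODON[aa] for aa in protein.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]}") from None
    return "".join(codons) + stop_codon
```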
  • the highly expressed gene sequences of the Cas12 proteins obtained above (SEQ ID NO: 8 to SEQ ID NO: 13) were gene-synthesized and constructed on the slugCas9 backbone plasmid (Addgene platform, catalog #163793) to obtain the plasmid pAAV2_Cas12_ITR.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid (Addgene platform, commercially available, catalog #74707) was digested with BbsI and XhoI restriction endonucleases.
  • the digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
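  • The digestion recipe above fixes the plasmid mass, buffer and enzyme volumes and makes the reaction up to 50 μL with water, so the water volume depends on the plasmid stock concentration. A minimal sketch of that arithmetic follows; the function name and the stock concentration passed in the usage example are hypothetical, while the other defaults follow the recipe in the text.

```python
def digest_mix(plasmid_ng_per_ul, plasmid_ug=1.0, total_ul=50.0,
               buffer_fold=10.0, enzyme_ul=(1.0, 1.0)):
    """Return the volumes (in microliters) for a restriction digest.

    Defaults follow the text: 1 ug plasmid, 10x buffer, two enzymes at 1 uL each,
    50 uL total. `plasmid_ng_per_ul` is the stock concentration (user supplied).
    """
    plasmid_ul = plasmid_ug * 1000.0 / plasmid_ng_per_ul
    buffer_ul = total_ul / buffer_fold          # 10x buffer diluted to 1x
    water_ul = total_ul - plasmid_ul - buffer_ul - sum(enzyme_ul)
    if water_ul < 0:
        raise ValueError("plasmid stock is too dilute for this reaction volume")
    return {"plasmid": plasmid_ul, "buffer": buffer_ul,
            "enzymes": sum(enzyme_ul), "water": water_ul}
```

  • For example, with a hypothetical 500 ng/μL stock, `digest_mix(500)` gives 2 μL plasmid, 5 μL buffer, 2 μL enzymes and 41 μL water.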
  • the repeat sequence on the Cas12J-8 protein genome (its DNA sequence is SEQ ID NO: 19) was gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid Cas12J-8-PSK-u6-crRNA.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid was digested with BbsI and XhoI restriction enzymes.
  • the enzyme digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the truncated repeat sequences (the DNA sequences of which are SEQ ID NO: 20 and SEQ ID NO: 21, respectively) were gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid psk-BbsI-Cas12a-crRNA1.
  • the pX330_sgACTA2 plasmid (Addgene platform, catalog #63712) was digested with BsaI and NotI restriction endonucleases.
  • the enzyme digestion system was: 1 μg plasmid hU6-sa-tracr-BsaI, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BsaI and 1 μL NotI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 3 hours.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 2998bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the RNA scaffold sequence (its DNA sequence is SEQ ID NO: 22), derived according to the secondary structure, was gene-synthesized and constructed on the linearized hU6-sa-tracr-BsaI backbone to obtain the plasmid hU6-OQB30769_tracr-Bsa1.
  • the pAAV2_Cas12_ITR plasmid expressing the Cas12 protein in (1) and the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids expressing the corresponding sgRNA of each protein in (2) were linearized by PCR.
  • primer sequences are:
  • the primer sequences are:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, the target DNA fragment was purified with the gel recovery kit according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific); the DNA was stored at -20°C for long-term storage.
  • the linearized pAAV2_Cas12_ITR fragment and the linearized Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 fragments were homologously recombined according to the ratio required by the instructions.
  • the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min; 900 μL of LB medium was then added, and the cells were incubated at 37°C for 1 hour for activation and recovery.
  • the recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting single colonies were verified by Sanger sequencing.
  • Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested with BbsI restriction endonuclease.
  • the digestion system was: 1 μg plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA fragments are the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmids comprising the coding genes of the respective Cas12 proteins above, with sizes of 7135 bp (Cas12J-8 protein), 7866 bp (Mb4Cas12a protein), 7875 bp (MlCas12a protein), 7998 bp (MoCas12a protein), 7875 bp (BgCas12a protein) and 8606 bp (ChCas12b protein), respectively.
  • the DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • each gRNA was designed, and its sequence is shown in Table 2. Cohesive end sequences corresponding to the two ends of the linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR were added to the sense strand and antisense strand of each designed gRNA sequence, and two single-stranded oligonucleotide DNAs were synthesized. The specific sequences of the single-stranded oligonucleotide DNAs are also shown in the table below.
  • the single-stranded oligonucleotide DNAs were annealed to obtain double-stranded DNA.
  • the annealing reaction system was: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program was: 95°C for 5 min, then 85°C, 75°C, 65°C, 55°C, 45°C, 35°C and 25°C for 1 min each, followed by storage at 4°C, with a cooling rate of 0.3°C/s.
  • the resulting product was ligated to the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid obtained in step (2) by DNA ligase (purchased from NEB Company).
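  • The oligonucleotide design and annealing mix described above can be sketched as follows. The overhang sequences are not given in this part of the document, so `top_overhang` and `bottom_overhang` are hypothetical parameters that must be matched to the cohesive ends of the linearized plasmid; the function names are likewise illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(spacer, top_overhang, bottom_overhang):
    """Return (oligo_F, oligo_R), both written 5'->3'.

    oligo_F carries the spacer (sense strand); oligo_R carries its reverse
    complement (antisense strand). Each gets its own 5' overhang.
    """
    spacer = spacer.upper()
    oligo_f = top_overhang.upper() + spacer
    oligo_r = bottom_overhang.upper() + reverse_complement(spacer)
    return oligo_f, oligo_r


def final_oligo_um(stock_um=100.0, oligo_ul=1.0, total_ul=30.0):
    """Final concentration of each oligo in the annealing mix (1 + 1 + 28 uL = 30 uL)."""
    return stock_um * oligo_ul / total_ul
```

  • With the volumes given in the text, each oligonucleotide ends up at roughly 3.3 μM in the annealing reaction.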
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min, and 900 μL of LB medium was then added.
  • the revived E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting single colonies were verified by Sanger sequencing.
  • HEK293T cells containing the target sequence were plated in a 6-well plate, and the cell density was about 30%.
  • Opti-MEM medium purchased from Gibco
  • the liposome transfection reagent 2000 (purchased from Invitrogen) or polyethyleneimine (hereinafter referred to as PEI) (purchased from Polysciences) was flicked to mix, and 5 μL of the 2000 reagent or PEI was pipetted into 100 μL of Opti-MEM medium (purchased from Gibco), mixed gently, and allowed to stand at room temperature for 5 minutes.
  • HEK293T cells were collected three days after editing, and genomic DNA was extracted with a DNA kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP304) according to the instructions provided by the DNA kit.
  • PCR primers are as follows:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • PCR primers are as follows:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the next-generation sequencing library was subjected to paired-end sequencing on the HiSeq X Ten high-throughput sequencer (Illumina).
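  • This part of the document does not state how editing efficiency was computed from the sequencing reads. As a rough sketch under stated assumptions, the function below treats any amplicon read whose length differs from the unedited reference as carrying an insertion or deletion and reports the percentage of such reads. The function name and the length-based indel proxy are hypothetical; real analyses align reads to the reference and filter sequencing errors.

```python
def indel_frequency(reads, reference):
    """Percent of reads whose length differs from the unedited reference amplicon.

    A crude proxy for editing efficiency: substitutions are not counted and
    reads are assumed to span the full amplicon.
    """
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for read in reads if len(read) != len(reference))
    return 100.0 * edited / len(reads)
```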
  • for each Cas12 protein listed in Table 1 above, its amino acid sequence information was downloaded; the amino acid sequences of the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein and ChCas12b protein are shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively.
  • codon optimization was carried out on the coding nucleic acid sequences of the Cas12 proteins obtained above to obtain gene sequences of the Cas12 proteins that are highly expressed in human cells.
  • the gene sequences of the Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein and ChCas12b protein are shown in SEQ ID NO: 8 to SEQ ID NO: 13, respectively.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid (Addgene platform, commercially available, catalog #74707) was digested with BbsI and XhoI restriction endonucleases.
  • the digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the repeat sequence on the Cas12J-8 protein genome (its DNA sequence is SEQ ID NO: 19) was gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid Cas12J-8-PSK-u6-crRNA.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid was digested with BbsI and XhoI restriction enzymes.
  • the enzyme digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the truncated repeat sequences (the DNA sequences of which are SEQ ID NO: 20 and SEQ ID NO: 21, respectively) were gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid psk-BbsI-Cas12a-crRNA1.
  • the pX330_sgACTA2 plasmid (Addgene platform, catalog #63712) was digested with BsaI and NotI restriction endonucleases.
  • the enzyme digestion system was: 1 μg plasmid hU6-sa-tracr-BsaI, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BsaI and 1 μL NotI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 3 hours.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 2998bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • RNA Scaffold sequence (its DNA sequence is SEQ ID NO: 22) according to the secondary structure, carry out gene synthesis on this sequence, and construct it in the linearized hU6- On the sa-tracr-BsaI backbone, the plasmid hU6-OQB30769_tracr-Bsa1 was obtained.
  • the pAAV2_Cas12_ITR plasmid expressing the Cas12 protein in (1) and the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids expressing the sgRNA corresponding to each protein in (2) were linearized by PCR.
  • primer sequences are:
  • the primer sequences are:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, the target DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was stored at -20°C for long-term storage.
  • the linearized pAAV2_Cas12_ITR fragment and the linearized Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 fragments were homologously recombined according to the ratio required by the instructions.
  • the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested and linearized with BbsI restriction endonuclease.
  • the enzyme digestion system was: 1 μg plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI restriction endonuclease (purchased from NEB Company), and water to make up to 50 μL.
  • the enzyme cleavage system was reacted at 37°C for 1 hour.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the recovered DNA fragments are the linearized pAAV2_Cas12_ITR plasmids comprising the coding gene of each Cas protein above, with sizes of 7135 bp (Cas12J-8 protein), 7866 bp (Mb4Cas12a protein), 7875 bp (MlCas12a protein), 7998 bp (MoCas12a protein), 7875 bp (BgCas12a protein) and 8606 bp (ChCas12b protein), respectively.
  • the DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • the annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
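As a rough check of the annealing setup above, the sketch below computes the final oligo concentration and the approximate run time of the program. It assumes that the 0.3°C/s cooling rate applies between every pair of consecutive steps and it ignores the final 4°C hold; the names `STEPS`, `final_conc_um` and `program_minutes` are illustrative only.

```python
# (temperature in °C, hold time in minutes) taken from the annealing program.
STEPS = [(95, 5), (85, 1), (75, 1), (65, 1), (55, 1), (45, 1), (35, 1), (25, 1)]
RAMP_C_PER_S = 0.3  # cooling rate stated in the text


def final_conc_um(stock_um, stock_ul, total_ul):
    """Concentration of one oligo after dilution into the annealing mix."""
    return stock_um * stock_ul / total_ul


def program_minutes(steps, ramp_c_per_s):
    """Hold time plus ramp time, assuming a constant ramp between steps."""
    hold_min = sum(minutes for _, minutes in steps)
    ramp_s = sum(abs(a[0] - b[0]) for a, b in zip(steps, steps[1:])) / ramp_c_per_s
    return hold_min + ramp_s / 60.0


total_ul = 1 + 1 + 28                      # oligo-F + oligo-R + water
conc = final_conc_um(100.0, 1.0, total_ul)
duration = program_minutes(STEPS, RAMP_C_PER_S)
print(f"each oligo: {conc:.2f} μM, program: about {duration:.1f} min")
```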
  • the obtained products were respectively ligated to the obtained linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid by DNA ligase (purchased from NEB Company).
  • E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C
  • the Escherichia coli DH5α competent cells were activated and recovered.
  • the revived E. coli DH5α competent cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting E. coli DH5α single clones were verified by Sanger sequencing.
  • the HEK293T cell line of the GFP reporter system containing the target sequence is obtained by inserting a PAM sequence and a specific target sequence between the initiation codon ATG and the GFP coding sequence, causing a GFP frameshift mutation; the construct was then integrated into HEK293T cells by lentivirus infection to obtain the HEK293T cell line of the GFP reporter system containing the target sequence.
  • when the gene editing system cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence.
  • the editing ability and specificity of the gene editing system can be evaluated by counting the ratio of GFP positive cells by flow cytometry.
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line containing the target sequence was plated on a 6-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line containing the target sequence contains the nucleotide sequence CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is shown in Figure 7 to Figure 13, and the sequence of the target site is GGATATGTTGAAGAACACCATGAC.
  • the diluted plasmid and the diluted transfection reagent were combined and mixed gently by pipetting; the resulting mixture was allowed to stand at room temperature for 20 minutes and then added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence, which was placed in a 37°C, 5% CO₂ incubator to continue culturing.
  • flow cytometry was used to analyze the editing efficiency and off-target rate of the CRISPR gene editing system of the present invention on the target sequence.
  • the HEK293T cell line cultured in a CO₂ incubator for 3 days was collected, its specificity was assessed by flow cytometry (BD Biosciences FACSCalibur), and the proportion of GFP-positive cells was analyzed and plotted with FlowJo analysis software.
  • the Y-axis in the lower histograms in Figure 7 to Figure 13 represents the percentage (%) of GFP-positive cells
  • the X-axis represents the oligonucleotide single-stranded DNA sequence corresponding to On-target gRNA and mismatch gRNA.
  • SlugABEmax plasmid (Addgene platform, catalog#163798) as a template for PCR reaction, and the primer sequence is:
  • Primer 1 TCTGGTGGTTTCTCCCAAGAAGA
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, a DNA fragment of 4152 bp was purified with a gel recovery kit according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was set aside for use or stored at -20°C for long-term storage.
  • homologous recombination was performed on the linearized SlugABEmax backbone fragment and the commercially synthesized humanized Cas12J-8 fragment (SEQ ID NO: 8) according to the ratio required by the instructions.
  • the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, a DNA fragment of 6305 bp was purified with a gel recovery kit according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific); T4 PNK treatment and T4 DNA ligase treatment were then carried out, with the reaction system as follows:
  • the reaction conditions are as follows:
  • T4 DNA ligase N4 DNA ligase
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • the pAAV2_envTadA-dCas12J-8_ITR plasmid was digested with KpnI and NotI restriction enzymes (NEB), and the reaction system was: 2 μg plasmid pAAV2_envTadA-dCas12J-8_ITR, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL KpnI restriction endonuclease (purchased from NEB Company), 1 μL NotI restriction endonuclease (purchased from NEB Company), and water to make up to 50 μL.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA concentration of the recovered linearized fragment pAAV2_envTadA-dCas12J-8_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was stored at -20°C for long-term storage.
  • the primer sequence is:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1.5% agarose gel at 120 V for 30 min, and the 394 bp Cas12J-8 crRNA DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer; the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was set aside for use or stored at -20°C for long-term storage.
  • the linearized pAAV2_envTadA-dCas12J-8_ITR fragment and the Cas12J-8 crRNA fragment were subjected to homologous recombination according to the ratio required by the instructions, and the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • the pAAV2_envTadA-dCas12J-8-crRNA_ITR plasmid was digested with BbsI restriction endonuclease, and the digestion system was: 2 μg plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI restriction endonuclease (purchased from NEB Company), and water to make up to 50 μL.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA concentration of the recovered linearized plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • the endogenous site target sequence that meets the PAM requirements of the Cas12J-8 protein is randomly selected in the human genome, and the corresponding oligonucleotide single-stranded DNA is shown in the table below.
  • the single-stranded oligonucleotide DNAs are annealed to obtain double-stranded DNA.
  • the annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the resulting product was ligated to the linearized pAAV2_envTadA-dCas12J-8-crRNA_ITR vector by DNA ligase (purchased from NEB Company).
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • the revived E. coli DH5α competent cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting E. coli DH5α single clones were verified by Sanger sequencing.
  • the resulting pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR plasmids were transfected into wild-type HEK293T cell lines by liposomes.
  • transfection process comprises the following steps:
  • HEK293T cell lines were plated in 6-well plates according to transfection requirements, and the cell density was controlled at 30%.
  • take 2 μg of the plasmid to be transfected, pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR, add it to 100 μL of Opti-MEM medium (purchased from Gibco), and mix gently by pipetting.
  • HEK293T cells seven days after editing were collected, and genomic DNA was extracted with a DNA kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP304) according to the instructions provided by the DNA kit.
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the second round of PCR for library construction was carried out with 2× Q5 Mastermix, and the PCR primers were the same as the F2 and R2 primers given in Example 1 above.
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the second-round PCR product was purified with a gel recovery kit according to the steps provided by the manufacturer, and the next-generation sequencing library was thus prepared.
  • the next-generation sequencing library was subjected to paired-end sequencing on the high-throughput sequencer HiSeq X Ten (Illumina).
  • the next-generation sequencing results were used to calculate the editing ratio of each adenine (A) that meets the editing requirements within each endogenous target site, and the results are shown in FIG. 14. The figure shows that the Cas12J-8 ABE base editor successfully edited these endogenous target sites. Because the Cas12J-8 ABE base editor protein contains only 938 amino acids, it can be easily packaged by AAV, which makes it possible to apply the CRISPR single-base editor system in gene therapy of organisms.
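The per-adenine editing ratio described above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: it assumes the reads have already been trimmed and aligned to the reference window with no indels, and the reference and reads below are toy data invented for the example.

```python
from collections import Counter


def a_to_g_ratio(reference, reads):
    """For every A in the reference, return the fraction of reads showing G there.

    Keys are 1-based positions; reads of a different length are skipped.
    """
    usable = [r for r in reads if len(r) == len(reference)]
    ratios = {}
    for i, base in enumerate(reference):
        if base != "A":
            continue
        counts = Counter(read[i] for read in usable)
        total = sum(counts.values())
        ratios[i + 1] = counts["G"] / total if total else 0.0
    return ratios


# Toy data (hypothetical): two of four reads carry an A-to-G change at position 3.
reference = "TTACGA"
reads = ["TTACGA", "TTGCGA", "TTGCGA", "TTACGA"]
ratios = a_to_g_ratio(reference, reads)
print(ratios)
```

Real amplicon data would also need quality filtering and alignment before a calculation of this kind.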
  • the coding nucleic acid sequence of each Cas12 protein is codon-optimized, and the gene sequence of the Cas12 protein highly expressed in human cells is obtained.
  • the gene sequences of Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8 and Cas12J-9 proteins are shown in SEQ ID NO: 27-29, 8 and 30 respectively.
  • the highly expressed gene sequences of Cas12 proteins obtained above as shown in SEQ ID NO: 27-29, 8 and 30 were gene-synthesized, and respectively constructed on the slugCas9 backbone plasmid (Addgene platform, catalog#163793) to obtain each plasmid pAAV2_Cas12_ITR.
  • the pBluescriptSKII+U6-sgRNA(F+E) empty plasmid was digested with BbsI and XhoI restriction endonucleases.
  • the enzyme digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the enzyme cleavage system was reacted at 37°C for 1 hour.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • the 3296bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the pAAV2_Cas12_ITR plasmid expressing Cas12 protein in (1) and the Cas12J-PSK-u6-crRNA plasmid expressing sgRNA corresponding to each protein in (2) were linearized by PCR method.
  • primer sequences are:
  • the primer sequences are:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, the target DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was stored at -20°C for long-term storage.
  • the linearized pAAV2_Cas12_ITR fragment and the linearized Cas12J-PSK-u6-crRNA fragment were homologously recombined according to the ratio required by the instructions, and the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • the Escherichia coli DH5α clones verified by sequencing to be correctly ligated were cultured with shaking, and the plasmids were extracted to obtain each plasmid pAAV2_Cas12-hU6-sgRNA_ITR for future use.
  • Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested and linearized with BbsI restriction endonuclease.
  • the enzyme digestion system was: 1 μg plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI restriction endonuclease (purchased from NEB Company), and water to make up to 50 μL.
  • the enzyme cleavage system was reacted at 37°C for 1 hour.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA fragment is the linearized plasmid pAAV2_Cas12_ITR comprising the coding genes of the above Cas proteins.
  • the DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • Oligo-F GGATATGTTGAAGAACACCATGAC
  • Oligo-R GTCATGGTGTTCTTCAACATATCC
  • the cohesive ends of Oligo-F for Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8, and Cas12J-9 are CGAC, GGAC, AGAC, AGAC, and AGAC, respectively, and the cohesive end of Oligo-R for all of the Cas12 proteins is AAAA.
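The relationship between the two oligos can be checked with a short sketch: Oligo-R is the reverse complement of Oligo-F, and each oligo carries its cohesive end. Placing the cohesive end at the 5' end of each oligo is an assumption made here for illustration, since the text does not state the position; `build_oligos` is a hypothetical helper name.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TARGET = "GGATATGTTGAAGAACACCATGAC"      # Oligo-F core, from the text
OLIGO_R_CORE = "GTCATGGTGTTCTTCAACATATCC"  # Oligo-R core, from the text

# Cohesive ends listed in the text for each protein's Oligo-F.
F_ENDS = {"Cas12J-4": "CGAC", "Cas12J-5": "GGAC", "Cas12J-7": "AGAC",
          "Cas12J-8": "AGAC", "Cas12J-9": "AGAC"}
R_END = "AAAA"


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_oligos(protein):
    """Return (oligo_f, oligo_r), with cohesive ends assumed to sit at the 5' end."""
    return F_ENDS[protein] + TARGET, R_END + reverse_complement(TARGET)


oligo_f, oligo_r = build_oligos("Cas12J-8")
print(oligo_f)
print(oligo_r)
```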
  • the single-stranded oligonucleotide DNAs are annealed to obtain double-stranded DNA.
  • the annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the resulting product was ligated to the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid obtained in step (2) by DNA ligase (purchased from NEB Company).
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • the revived E. coli DH5α competent cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting E. coli DH5α single clones were verified by Sanger sequencing.
  • the GFP reporter system HEK293T cell line library containing the target sequence is obtained in the following manner: a 5 bp random sequence (as the PAM sequence) and a 24 bp protospacer (as the target sequence) are inserted between the initiation codon ATG and the GFP coding sequence, resulting in a GFP frameshift mutation and no expression.
  • the GFP gene containing the insert was driven by the CMV promoter and constructed on a lentiviral expression vector. This sequence is randomly inserted into the genome of HEK293T cells by lentivirus, making it a stable GFP reporter cell line library.
  • when the gene editing system cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence, and the editing ability and specificity of the gene editing system can be evaluated by counting the proportion of GFP-positive cells by flow cytometry.
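The reading-frame logic of this reporter can be illustrated as follows. The sketch assumes that the insert between ATG and the GFP coding sequence consists of exactly the 5 bp PAM plus the 24 bp protospacer, with no additional linker bases, and that repair produces a simple net insertion or deletion; both are assumptions made for the example.

```python
PAM_LEN = 5            # 5 bp random PAM, from the text
PROTOSPACER_LEN = 24   # 24 bp target sequence, from the text
INSERT_LEN = PAM_LEN + PROTOSPACER_LEN  # assumes no extra linker bases


def restores_frame(net_indel, insert_len=INSERT_LEN):
    """True if the insert length after a net indel is a multiple of 3.

    net_indel is positive for insertions and negative for deletions.
    """
    return (insert_len + net_indel) % 3 == 0


# Net indel sizes between -6 and +3 that would put GFP back in frame.
in_frame = [d for d in range(-6, 4) if restores_frame(d)]
print(in_frame)
```

Under these assumptions only a subset of repair outcomes turns a cell GFP-positive, so the GFP-positive fraction reflects editing activity without equalling the total indel rate.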
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line library containing the target sequence was plated on a 6-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line library containing the target sequence comprises the nucleotide sequence of CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is a 5bp random sequence, and the sequence of the target site (target site) is GGATATGTTGAAGAACACCATGAC (FIG. 15).
  • Opti-MEM medium purchased from Gibco
  • the diluted plasmid and the diluted transfection reagent were combined and mixed gently by pipetting; the resulting mixture was allowed to stand at room temperature for 20 minutes and then added to the culture medium of the GFP reporter system HEK293T cell line library containing the target sequence, which was placed in a 37°C, 5% CO₂ incubator to continue culturing.
  • the pAAV2_Cas12_ITR (SEQ ID NO: 13) plasmid was used as a template for a circular PCR reaction, and the primer sequences are shown in the table below:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120V for 30 min, and the gel recovery kit was used to purify the target DNA fragment according to the steps provided by the manufacturer.
  • the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific)
  • the reaction system is as follows:
  • the reaction conditions are as follows:
  • T4 DNA ligase N4 DNA ligase
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • the hU6-OQB30769_tracr-Bsa1 plasmid was digested with BsaI restriction endonuclease (NEB), and the reaction system was: 2 μg plasmid hU6-OQB30769_tracr-Bsa1, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BsaI restriction endonuclease (purchased from NEB Company), and water to make up to 50 μL.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA concentration of the recovered linearized fragment hU6-OQB30769_tracr-Bsa1 was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was stored at -20°C for long-term use.
  • the annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the obtained product was ligated to the obtained linearized hU6-OQB30769_tracr-Bsa1 plasmid by DNA ligase (purchased from NEB Company).
  • E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C
  • the Escherichia coli DH5α competent cells were activated and recovered.
  • the revived E. coli DH5α competent cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting E. coli DH5α single clones were verified by Sanger sequencing.
  • the resulting plasmids expressing the ChCas12b point mutant proteins (Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D and Q23T-P182R-E507D-E1090K) were each co-transfected with the hU6-OQB30769_tracr-Bsa1-on target sgRNA into the GFP reporter system HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC) by liposome.
  • the HEK293T cell line of the GFP reporter system containing the target sequence is obtained by inserting a PAM sequence and a specific target sequence between the initiation codon ATG and the GFP coding sequence, causing a GFP frameshift mutation; the construct was then integrated into HEK293T cells by lentivirus infection to obtain the HEK293T cell line of the GFP reporter system containing the target sequence.
  • when the gene editing system cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence.
  • the editing ability and specificity of the gene editing system can be evaluated by counting the ratio of GFP positive cells by flow cytometry.
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line containing the target sequence was plated in a 48-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line containing the target sequence contains the nucleotide sequence of CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is CGTTG, and the sequence of the target site (target site) is GGATATGTTGAAGAACACCATGAC.
  • PEI purchased from polysciences company
  • the diluted plasmid and the diluted transfection reagent were combined and mixed gently by pipetting; the resulting mixture was allowed to stand at room temperature for 15 minutes and then added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence, which was placed in a 37°C, 5% CO₂ incubator to continue culturing.
  • the HEK293T cell line cultured in a CO₂ incubator for 5 days was collected, its specificity was assessed by flow cytometry (BD Biosciences FACSCalibur), and the proportion of GFP-positive cells was analyzed and plotted with FlowJo analysis software.
  • the editing efficiency results of the ChCas12b point mutant in the GFP reporter system HEK293T cell line containing the target sequence are shown in FIG. 17 .
  • when the ChCas12b point mutant cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence.
  • the Y-axis in Figure 17 represents the percentage (%) of GFP-positive cells, the X-axis represents ChCas12b and its different point mutants, and NC represents the negative control group (no transfection plasmid).
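The mutant names above follow the usual substitution notation: wild-type residue, 1-based position, substituted residue, with combined mutants joined by hyphens. The sketch below parses such names and applies them to a sequence. The helper names are hypothetical and the demonstration uses a three-residue toy peptide, not the ChCas12b sequence.

```python
import re

MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutant(name):
    """Split a name such as 'Q23T-P182R' into (wild_type, position, new) tuples."""
    mutations = []
    for token in name.split("-"):
        match = MUTATION.match(token)
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def apply_mutations(protein, mutations):
    """Apply substitutions, checking that each wild-type residue matches."""
    residues = list(protein)
    for wild_type, position, new in mutations:
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = new
    return "".join(residues)


quadruple = parse_mutant("Q23T-P182R-E507D-E1090K")
print(quadruple)
print(apply_mutations("MQP", parse_mutant("Q2T")))  # toy peptide only
```

Checking the wild-type residue before substituting guards against off-by-one numbering when such names are applied to a real sequence.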
  • the pAAV2_Cas12_ITR (SEQ ID NO: 9 to SEQ ID NO: 12) plasmid was used as a template for a circular PCR reaction, and the primer sequences are shown in the table below:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120V for 30 min, and the gel recovery kit was used to purify the target DNA fragment according to the steps provided by the manufacturer.
  • the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific)
  • the reaction system is as follows:
  • the reaction conditions are as follows:
  • T4 DNA ligase N4 DNA ligase
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • incubate on ice for 30 min, heat shock at 42°C for 1 min
  • incubate on ice for 2 min, add 900 μL LB medium
  • incubate at 37°C for 1 hour for the activation and recovery of the Escherichia coli DH5α competent cells.
  • the recovered Escherichia coli DH5α competent cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α single clones were verified by Sanger sequencing.
  • the Escherichia coli DH5α clones verified by sequencing to be correctly ligated were cultured with shaking, and the plasmids were extracted to obtain the point mutant plasmids, which were set aside for use or stored at -20°C.
  • the psk-BbsI-Cas12a-crRNA1 plasmid was digested with BbsI restriction endonuclease (NEB), and the reaction system was: 2 μg plasmid psk-BbsI-Cas12a-crRNA1, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA concentration of the recovered linearized fragment psk-BbsI-Cas12a-crRNA1 was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was stored at -20°C for long-term use.
  • the annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the resulting product was ligated to the resulting linearized psk-BbsI-Cas12a-crRNA1 plasmid by DNA ligase (purchased from NEB Company).
  • E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • heat shock at 42°C for 1 min
  • incubate on ice for 2 min, then add 900 µL LB medium
  • incubate at 37°C
  • the Escherichia coli DH5α competent cells were activated and recovered.
  • the recovered E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting E. coli DH5α single colonies were verified by Sanger sequencing.
  • the resulting plasmids expressing the Cas12a point mutant protein were co-transfected with psk-BbsI-Cas12a-crRNA1-on target sgRNA into the GFP reporter system HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC) by liposomes.
  • the HEK293T cell line of the GFP reporter system containing the target sequence was obtained by inserting a PAM sequence and a specific target sequence between the initiation codon ATG and the GFP coding sequence, which causes a GFP frameshift mutation; the construct was then integrated into HEK293T cells by lentiviral infection to give the HEK293T cell line containing the GFP reporter system with the target sequence.
  • when the gene editing system cuts the target sequence, the cell's own repair system can restore the GFP reading frame in some cells, generating green fluorescence.
  • the editing ability and specificity of the gene editing system can be evaluated by counting the ratio of GFP positive cells by flow cytometry.
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line containing the target sequence was plated in a 48-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line containing the target sequence contains the nucleotide sequence of CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is GTTTT, and the sequence of the target site (target site) is GGATATGTTGAAGAACACCATGAC.
  • PEI (purchased from Polysciences)
  • the diluted plasmid and the diluted transfection reagent were combined and mixed by gentle pipetting; the mixture was allowed to stand at room temperature for 15 minutes, then added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for continued culture.
  • the HEK293T cell line cultured in a CO2 incubator for 5 days was collected, and its specificity was detected by flow cytometry (BD Biosciences FACSCalibur), and the positive ratio of GFP was analyzed and plotted by FlowJo analysis software.
  • the editing efficiency results of Cas12a point mutants in the GFP reporter system HEK293T cell line containing the target sequence are shown in FIG. 18 .
  • when the Cas12a point mutant cuts the target sequence, the cell's own repair system restores the GFP reading frame in some cells, resulting in green fluorescence.
  • the Y-axis in Figure 18 represents the percentage (%) of GFP-positive cells, the X-axis represents Cas12a and its different point mutants, and NC represents the negative control group (no transfection plasmid).
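
The reading-frame logic of the GFP reporter described in the bullets above can be illustrated with a short Python sketch. It assumes, purely for illustration, that only the stated PAM (GTTTT) and the 24-nt target site lie between ATG and the GFP coding sequence; any additional linker bases in the real construct are not described in the text and would change the arithmetic. The function name `gfp_in_frame` is hypothetical, and the sketch ignores stop codons and looks only at frame.

```python
# Sketch of the reading-frame arithmetic behind the GFP reporter.
# ASSUMPTION: only the PAM and the target site sit between ATG and the GFP
# coding sequence; extra linker bases (not described in the text) would
# change the numbers below. Stop codons are not considered.
PAM = "GTTTT"
TARGET = "GGATATGTTGAAGAACACCATGAC"


def gfp_in_frame(indel: int, insert_len: int = len(PAM) + len(TARGET)) -> bool:
    """Return True if GFP is in frame with ATG after a net indel.

    indel is the net change in length in base pairs (negative = deletion).
    """
    return (insert_len + indel) % 3 == 0


if __name__ == "__main__":
    print("inserted length:", len(PAM) + len(TARGET))
    print("unedited reporter in frame:", gfp_in_frame(0))
    for indel in (-5, -2, -1, 1, 2, 4):
        print(indel, gfp_in_frame(indel))
```

Under this assumption, only a subset of repair outcomes restores the frame, which is consistent with the statement that GFP fluorescence is recovered in some cells rather than all edited cells.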

Abstract

The present invention relates to the technical field of gene editing, and in particular to a CRISPR/Cas12 gene editing system and an application thereof. The gene editing system is a complex formed by a specific Cas12 protein and an sgRNA, and can accurately locate a target DNA sequence and cut it, so that a double-strand break occurs in the target sequence; gene editing refers to intracellular or in vitro gene editing. Specifically, the invention involves a specific Cas12J-8 protein, Cas12a proteins, and a Cas12b protein. The Cas12J-8 protein has a relatively small number of amino acids; the Cas12J-8 protein, the Cas12a proteins, and the Cas12b protein all have high editing efficiency; and the PAM sequences recognized by the three types of proteins are simple, so the present invention has broad prospects for application in the field of gene editing.

Description

Cas12 protein, gene editing system containing Cas12 protein, and application
Technical Field
This application belongs to the technical field of gene editing, and specifically relates to Cas12 proteins, gene editing systems containing the Cas12 proteins, and related applications.
Background
The CRISPR/Cas system is an adaptive immune system that bacteria and archaea evolved to resist invasion by foreign viruses or plasmids. In the CRISPR/Cas12a and CRISPR/Cas12j systems, crRNA (CRISPR-derived RNA) and the Cas12 protein form a complex that recognizes the PAM (protospacer adjacent motif) sequence at the target site. After recognition, the crRNA base-pairs with the target DNA sequence, and the Cas protein cleaves the DNA, producing a DNA break. The CRISPR/Cas12b system additionally contains a tracrRNA (trans-activating RNA), which forms a functional complex together with crRNA and Cas12b. The tracrRNA and crRNA can be fused through a linker sequence into a single-stranded guide RNA (single guide RNA, sgRNA). When a DNA break occurs, two main DNA damage repair mechanisms in the cell are responsible for repair: non-homologous end joining (NHEJ) and homologous recombination (HR). NHEJ repair results in base deletions or insertions and can be used for gene knockout; when a homologous template is provided, HR repair can be used for site-specific gene insertion and precise base replacement.
In addition to basic research, the CRISPR/Cas12 gene editing system has broad prospects for clinical application. When the CRISPR/Cas12 gene editing system is used for gene therapy, Cas and the single-stranded guide RNA must be delivered into the body. At present, the most effective expression vector for gene therapy is adeno-associated virus (AAV). However, the DNA packaged by AAV generally does not exceed 4.5 kb. SpCas9 is widely used because of its simple PAM sequence (it recognizes NGG) and high activity. However, the SpCas9 protein has 1368 amino acids; together with the sgRNA and promoter, it cannot be packaged efficiently into AAV, which limits its clinical application. To overcome this problem, several smaller Cas9 proteins have been developed, including SaCas9 (PAM sequence NNGRRT), St1Cas9 (PAM sequence NNAGAW), NmCas9 (PAM sequence NNNNGATT), Nme2Cas9 (PAM sequence NNNNCC), and CjCas9 (PAM sequence NNNNRYAC). However, these Cas9 proteins are prone to off-target effects (that is, cleavage at non-target sites), have complex PAM sequences, or have low editing activity, which makes them difficult to apply widely.
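
The packaging constraint described above comes down to simple arithmetic, sketched here in Python. The sketch assumes three base pairs per amino acid plus one stop codon, and uses only the figures stated in the text (a limit of about 4.5 kb and 1368 amino acids for SpCas9); the function names are hypothetical, and the sizes of promoters, sgRNA cassettes, and other elements are deliberately left out because the text does not give them.

```python
# Sketch of the AAV cargo arithmetic, using only figures stated in the text.
AAV_LIMIT_BP = 4500  # "generally does not exceed 4.5 kb"


def coding_length_bp(n_amino_acids: int) -> int:
    """Length of an open reading frame in bp, counting one stop codon."""
    return 3 * n_amino_acids + 3


def remaining_capacity_bp(n_amino_acids: int, limit_bp: int = AAV_LIMIT_BP) -> int:
    """Base pairs left for promoter, sgRNA cassette and other elements."""
    return limit_bp - coding_length_bp(n_amino_acids)


if __name__ == "__main__":
    spcas9_aa = 1368  # value stated in the text
    print("SpCas9 coding length (bp):", coding_length_bp(spcas9_aa))
    print("capacity left (bp):", remaining_capacity_bp(spcas9_aa))
```

A small remainder for the nuclease alone illustrates why a protein with fewer amino acids leaves more room for the promoter and the guide RNA expression cassette.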
Therefore, finding small CRISPR/Cas systems with high editing activity, high specificity, and simple PAM sequences offers a way to solve the above problems.
Summary of the Invention
In view of the above problems, the inventors conducted repeated studies and found a series of Cas12 proteins and corresponding single-stranded guide RNAs that together constitute CRISPR/Cas12 gene editing systems capable of efficient gene editing, thereby completing the present invention.
Therefore, in a first aspect, the present invention provides a conjugate comprising:
a) a Cas12 protein, wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity; and
b) a modifying moiety.
In a second aspect, the present invention provides a fusion protein comprising:
a) a Cas12 protein, wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity;
b) an additional protein or polypeptide; and
c) optionally, a linker for connecting the Cas12 protein or its homologue to the additional protein or polypeptide.
In a third aspect, the present invention provides a single-stranded guide RNA comprising a CRISPR repeat sequence, wherein the CRISPR repeat sequence has the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18, or a nucleic acid sequence that has at least 90% sequence identity with the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retains its biological activity, or a nucleic acid sequence that is obtained by modifying the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retains its biological activity.
In a fourth aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
a) a Cas12 protein, wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity;
b) the conjugate of the first aspect of the present invention; or
c) the fusion protein of the second aspect of the present invention.
In a fifth aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention.
In a sixth aspect, the present invention provides a vector comprising a nucleic acid sequence encoding:
a) a Cas12 protein, wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity;
b) the conjugate of the first aspect of the present invention; or
c) the fusion protein of the second aspect of the present invention.
In a seventh aspect, the present invention provides a vector comprising a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention.
In an eighth aspect, the present invention provides a CRISPR/Cas12 gene editing system comprising:
a) a protein component comprising:
1) a Cas12 protein, wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity;
2) the conjugate of the first aspect of the present invention; or
3) the fusion protein of the second aspect of the present invention; and
b) a nucleic acid component comprising:
the single-stranded guide RNA of the third aspect of the present invention.
In a ninth aspect, the present invention provides a cell comprising the isolated nucleic acid molecule of the sixth aspect of the present invention or the vector of the seventh aspect of the present invention.
In a tenth aspect, the present invention provides a method for gene editing of a target sequence in a cell or in an in vitro environment, the method comprising: contacting a Cas12 protein, the conjugate of the first aspect of the present invention or the fusion protein of the second aspect of the present invention together with the single-stranded guide RNA of the third aspect of the present invention, or the vectors of the sixth and seventh aspects of the present invention, or the CRISPR/Cas12 gene editing system of the eighth aspect of the present invention, with the target sequence in the cell or in the in vitro environment, wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity; the target sequence is located at the 5' end of the protospacer adjacent motif (PAM); and, for the Cas12J-8 protein, the Mb4Cas12a protein, the MlCas12a protein, the MoCas12a protein, the BgCas12a protein, and the ChCas12b protein, or their homologues, conjugates or fusion proteins, the PAM has the sequence 5'-TTN, 5'-YYN, 5'-YYN, 5'-YYN, 5'-YYN and 5'-TTN, respectively.
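
As a rough illustration of what the PAM requirements above mean in practice, the following Python sketch lists every position in a DNA string where a TTN or YYN motif occurs. It is a minimal sketch under stated assumptions: the function name `pam_sites` is hypothetical, only the given strand is scanned (the reverse strand is not considered), and the sketch does not encode where the target sequence sits relative to the PAM.

```python
import re

# IUPAC codes needed for the PAMs named in the text (DNA alphabet only).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}


def pam_sites(seq: str, pam: str) -> list[int]:
    """Return 0-based start positions of every PAM match, overlaps included."""
    pattern = "".join(IUPAC[base] for base in pam.upper())
    # A zero-width lookahead reports overlapping matches (for example in TTTT)
    # at each offset instead of consuming the matched bases.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


if __name__ == "__main__":
    for pam in ("TTN", "YYN"):
        print(pam, pam_sites("ACCGTCAGTTTTGGATATG", pam))
```

Because Y covers both C and T, every TTN site is also a YYN site, so a YYN-recognizing protein can in principle address a superset of the sites available to a TTN-recognizing one.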
In an eleventh aspect, the present invention provides a kit comprising: a Cas12 protein, the conjugate of the first aspect of the present invention or the fusion protein of the second aspect of the present invention together with the single-stranded guide RNA of the third aspect of the present invention; the isolated nucleic acid molecules of the fourth and fifth aspects of the present invention; the vectors of the sixth and seventh aspects of the present invention; or the CRISPR/Cas12 gene editing system of the eighth aspect of the present invention; and instructions on how to perform gene editing on a target sequence in a cell or in an in vitro environment; wherein the Cas12 protein is a Cas12J-8 protein, an Mb4Cas12a protein, an MlCas12a protein, a MoCas12a protein, a BgCas12a protein, or a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue having an amino acid sequence that has at least 80% sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity.
Our research group has developed a Cas12j-8 editing tool that performs gene editing efficiently in eukaryotic cells. The Cas12j-8 protein has a small number of amino acids, in particular the smallest number among the gene editors currently usable in eukaryotes, and can therefore be packaged efficiently into expression vectors such as adeno-associated virus vectors. The protein also has high specificity and a simple PAM, and its small molecular weight allows it to be packaged readily by vectors such as adeno-associated virus, which makes it well suited to later development as a gene therapy tool.
In addition, the PAM of the Cas12j-8 protein is TTN, which is simple and allows a wide editing range. Our experiments show that the Cas12j-8 protein has a significant advantage over the FnCas12a protein in editing efficiency at random sites and has strong gene editing ability in a eukaryotic environment. Compared with the Cas12j-2 protein of the same series, Cas12j-8 has a marked editing advantage: its editing ability at random sites is significantly higher than that of Cas12j-2, which makes it better suited to the development and application of gene editing.
Compared with other existing Cas12a and Cas12b proteins, the Cas12a and Cas12b proteins of the present invention have higher editing activity and higher specificity and have relatively simple PAM sequences. In addition, the PAM of the Cas12a and Cas12b proteins of the present invention is YYN, which expands the field of Cas12a and Cas12b proteins and broadens their range of application.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the editing efficiency after gene editing of two target sites by the CRISPR/Cas12J-8 gene editing system.
Figure 2 is a schematic diagram of the editing efficiency after gene editing of two target sites by the CRISPR/ChCas12b gene editing system.
Figure 3 is a schematic diagram of the editing efficiency after gene editing of two target sites by the CRISPR/Mb4Cas12a gene editing system.
Figure 4 is a schematic diagram of the editing efficiency after gene editing of two target sites by the CRISPR/MoCas12a gene editing system.
Figure 5 is a schematic diagram of the editing efficiency after gene editing of two target sites by the CRISPR/BgCas12a gene editing system.
Figure 6 is a schematic diagram of the editing efficiency after gene editing of two target sites by the CRISPR/MlCas12a gene editing system.
Figures 7 and 8 are schematic diagrams of the specificity test results for the CRISPR/Cas12J-8 gene editing system in the GFP reporter system HEK293T cell line.
Figure 9 is a schematic diagram of the specificity test results for the CRISPR/ChCas12b gene editing system in the GFP reporter system HEK293T cell line.
Figure 10 is a schematic diagram of the specificity test results for the CRISPR/Mb4Cas12a gene editing system in the GFP reporter system HEK293T cell line.
Figure 11 is a schematic diagram of the specificity test results for the CRISPR/MoCas12a gene editing system in the GFP reporter system HEK293T cell line.
Figure 12 is a schematic diagram of the specificity test results for the CRISPR/BgCas12a gene editing system in the GFP reporter system HEK293T cell line.
Figure 13 is a schematic diagram of the specificity test results for the CRISPR/MlCas12a gene editing system in the GFP reporter system HEK293T cell line.
Figure 14 shows the results of editing endogenous target sites with the Cas12J-8 ABE base editor.
Figure 15 is a schematic diagram of using the GFP reporter cell line library to detect editing of a target gene by the CRISPR/Cas system.
Figure 16 shows photographs of cells after the GFP reporter cell line was treated with several CRISPR/Cas12J gene editing systems; the upper panels are fluorescence images and the lower panels are ordinary microscope images.
Figure 17 is a schematic diagram of the editing efficiency of ChCas12b and its point mutants in the GFP cell line.
Figure 18 is a schematic diagram of the editing efficiency of Cas12a and its point mutants in the GFP cell line.
Detailed Description
The present invention is described in further detail below. It should be understood that the Summary of the Invention above and the detailed description below serve only to illustrate the present invention and are not intended to limit it in any way. The scope of protection of the present invention is determined by the appended claims. Those skilled in the art may modify the specific embodiments without departing from the spirit and gist of the present invention.
Definitions
Unless otherwise stated, scientific and technical terms used in this application have the meanings commonly understood by those skilled in the art. For a better understanding of the present invention, definitions and explanations of relevant terms are provided below.
The terms "Cas12 protein", "Cas12" and "Cas" are used interchangeably in this application and refer to an RNA-guided nuclease, including a Cas12 protein or a functionally active fragment thereof. The Cas12 protein is the protein component of the CRISPR/Cas12 genome editing system; guided by a single-stranded guide RNA (gRNA), it targets and cleaves a DNA target sequence to form a DNA double-strand break (DSB). DNA double-strand breaks activate the cell's intrinsic repair mechanisms, non-homologous end joining (NHEJ) and homologous recombination (HR), which repair the DNA damage in the cell. During repair, the specific DNA sequence undergoes site-specific editing.
The term "point mutant" as used herein refers to a mutant protein that has one or more amino acid mutations relative to the wild-type protein and falls within the scope of the term "homologue" in the present invention. In the present invention, point mutants include mutants with a single point mutation or with multiple point mutations. Herein, the notation "AXXXB" is used to denote a point mutant in which amino acid A at position XXX is mutated to B. For example, the point mutant T207A of the Mb4Cas12a protein is the mutant in which T (threonine, Thr) at position 207 of the Mb4Cas12a protein is mutated to A (alanine, Ala). Similarly, the point mutant T207A-N616S of the Mb4Cas12a protein is the mutant in which T (threonine, Thr) at position 207 is mutated to A (alanine, Ala) and N (asparagine, Asn) at position 616 is mutated to S (serine, Ser).
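
The "AXXXB" notation defined above is regular enough to parse mechanically. The following Python sketch is one possible way to do so; the function names `parse_mutations` and `apply_mutations` are hypothetical, positions are taken to be 1-based, and the toy sequence in the usage lines is invented for illustration and is not a Cas12 sequence.

```python
import re

# One substitution in "AXXXB" form: original residue, position, new residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(label: str) -> list[tuple[str, int, str]]:
    """Split a label such as 'T207A-N616S' into (original, position, new) tuples."""
    parsed = []
    for token in label.split("-"):
        match = MUTATION.match(token)
        if match is None:
            raise ValueError(f"not in AXXXB form: {token!r}")
        original, position, replacement = match.groups()
        parsed.append((original, int(position), replacement))
    return parsed


def apply_mutations(protein: str, label: str) -> str:
    """Apply the substitutions to a protein sequence (1-based positions)."""
    residues = list(protein)
    for original, position, replacement in parse_mutations(label):
        # Guard against applying a label to the wrong sequence or numbering.
        if residues[position - 1] != original:
            raise ValueError(f"expected {original} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutations("T207A-N616S"))
    print(apply_mutations("MTN", "T2A-N3S"))  # toy sequence, not a real protein
```

Checking the original residue before substituting is a deliberate choice: it catches numbering mismatches early, which matters when labels are applied to homologues of different length.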
The terms "single-stranded guide RNA" and "sgRNA (single guide RNA)" are used interchangeably in this application and have the meanings commonly understood by those skilled in the art. In general, a single-stranded guide RNA or sgRNA may comprise a CRISPR repeat sequence and a guide sequence; the guide sequence is also referred to herein as a guide RNA (gRNA). In the context of an endogenous CRISPR system, the guide sequence is also called a spacer. In some cases, a guide sequence is any polynucleotide sequence that has sufficient similarity to a target sequence to hybridize with the target sequence and to direct specific binding of the CRISPR/Cas12 complex to the target sequence. In certain embodiments, when optimally aligned, the degree of complementarity between the guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%. Determining the optimal alignment is within the ability of one of ordinary skill in the art. For example, there are public and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, the Smith-Waterman algorithm in MATLAB, Bowtie, Geneious, Biopython, and SeqMan.
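
To make the "degree of complementarity" concrete, here is a deliberately simplified Python sketch. It assumes an ungapped, equal-length comparison with Watson-Crick pairs only, whereas the text refers to an optimal alignment such as those produced by the tools listed above; the function name `percent_complementarity` is hypothetical, and both sequences are written 5' to 3'.

```python
# Watson-Crick pairs between an RNA guide base and a DNA target base.
PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percentage of guide positions that pair with the target strand.

    SIMPLIFICATION: ungapped, equal-length comparison; real tools compute an
    optimal alignment first. Both sequences are given 5' to 3', so the target
    strand is reversed to pair the strands in antiparallel orientation.
    """
    if len(guide_rna) != len(target_dna):
        raise ValueError("sequences must have the same length")
    reversed_target = target_dna.upper()[::-1]
    matches = sum(
        (g, t) in PAIRS for g, t in zip(guide_rna.upper(), reversed_target)
    )
    return 100.0 * matches / len(guide_rna)


if __name__ == "__main__":
    print(percent_complementarity("GGAU", "ATCC"))
    print(percent_complementarity("GGAU", "ATCA"))
```

A threshold such as "at least 80%" can then be checked by comparing the returned value with the chosen cut-off.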
The term "CRISPR/Cas12 complex" as used herein refers to the complex formed when a single-stranded guide RNA or a mature crRNA binds to a Cas12 protein. The complex comprises a guide sequence that hybridizes to the target sequence and thereby brings the Cas12 protein to that target sequence. The complex is able to recognize and cleave a polynucleotide that can hybridize to the single-stranded guide RNA or mature crRNA.
Thus, in the context of the formation of a CRISPR/Cas12 complex, "target sequence" refers to a polynucleotide targeted by a guide sequence designed for that purpose, for example a sequence complementary to the guide sequence, where hybridization between the target sequence and the guide sequence promotes the activity of Cas12, such as cleavage of the target sequence. Full complementarity is not required, provided there is sufficient complementarity to cause hybridization and promote the activity of Cas12. A target sequence can comprise any polynucleotide, such as DNA or RNA. In certain instances, the target sequence is located in the nucleus or cytoplasm of a cell. In certain instances, the target sequence may be located within an organelle of a eukaryotic cell, such as a mitochondrion or chloroplast.
The term "target sequence" or "target polynucleotide" as used herein may be any polynucleotide endogenous or exogenous to a cell (e.g., a eukaryotic cell). For example, the target polynucleotide can be a polynucleotide present in the nucleus of a eukaryotic cell. The target polynucleotide can be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or junk DNA). In certain instances, the target sequence should be associated with a protospacer adjacent motif (PAM). The exact sequence and length requirements for the PAM differ depending on the Cas protein used, but a PAM is typically a sequence of 2-5 bases adjacent to the protospacer (the target sequence). Those skilled in the art are able to identify the PAM sequence to use with a given Cas protein.
The terms "polynucleotide", "nucleic acid sequence", "nucleotide sequence" and "nucleic acid fragment" are used interchangeably herein and denote a single- or double-stranded RNA or DNA polymer that optionally contains synthetic, non-natural or altered nucleotide bases. Nucleotides are referred to by their single-letter designations as follows: "A" denotes adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" denotes cytidine or deoxycytidine, "G" denotes guanosine or deoxyguanosine, "U" denotes uridine, "T" denotes deoxythymidine, "R" denotes a purine (A or G), "Y" denotes a pyrimidine (C or T), "K" denotes G or T, "H" denotes A or C or T, "I" denotes inosine, and "N" denotes any nucleotide.
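The single-letter codes above can be applied mechanically. The following Python sketch is illustrative only and is not part of the application: the names `IUPAC`, `iupac_to_regex` and `matches` are hypothetical, "U" is treated as equivalent to "T" so that RNA and DNA strings can share one pattern (an assumption of this sketch), and inosine ("I") is left out because it has no fixed pairing partner to expand to.

```python
import re

# Codes taken from the definition above. Matching "U" wherever "T" is allowed
# is an assumption of this sketch, not something the text prescribes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "TU", "U": "TU",
    "R": "AG",     # purine
    "Y": "CTU",    # pyrimidine
    "K": "GTU",    # G or T
    "H": "ACTU",   # A or C or T
    "N": "ACGTU",  # any nucleotide
}

def iupac_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a motif written in the codes above into a regular expression."""
    return re.compile("".join("[" + IUPAC[base] + "]" for base in motif.upper()))

def matches(motif: str, sequence: str) -> bool:
    """Return True if the whole sequence is an instance of the motif."""
    return iupac_to_regex(motif).fullmatch(sequence.upper()) is not None
```

For instance, `matches("YYN", "CTA")` is true because C and T are both pyrimidines, whereas `matches("YYN", "AGA")` is false.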
The terms "polypeptide", "peptide" and "protein" are used interchangeably in this application and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
The terms sequence "identity" and "homology" as used herein have their art-recognized meaning, and the percentage of sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the full length of a polynucleotide or polypeptide, or along a region of the molecule. (See, e.g., Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991.) Although there are many methods for measuring the identity between two polynucleotides or polypeptides, the term "identity" is well known to the skilled person. Conservative amino acid substitutions suitable for peptides or proteins are likewise known to the skilled person and can generally be made without altering the biological activity of the resulting molecule. In general, those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
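As a minimal illustration of what a percentage of sequence identity means, the Python sketch below counts identical columns in two sequences that have already been aligned by some other tool. It rests on assumptions not stated in the text: gaps are written as "-", columns that are gaps in both sequences are ignored, and the denominator is the number of remaining alignment columns. Other conventions (for example dividing by the length of the shorter sequence) give different values, and the application does not say which one is intended. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a precomputed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if x == y:
            identical += 1  # a gap opposite a residue never counts as identical
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared
```

Under this convention a ten-column alignment with one substitution and one gap has 80% identity, which is the lower bound recited for the homologs elsewhere in this application.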
The term "vector" as used herein refers to a nucleic acid delivery vehicle into which a polynucleotide can be inserted. A vector is called an expression vector when it allows the protein encoded by the inserted polynucleotide to be expressed, or when it allows the inserted polynucleotide to be transcribed (e.g., into mRNA or functional RNA). A vector can be introduced into a host cell by transformation, transduction or transfection so that the genetic elements it carries are expressed in the host cell. Vectors are well known to those skilled in the art and include, but are not limited to, plasmid vectors, viral vectors and the like. A vector may also contain various regulatory sequences that regulate expression. "Regulatory sequence" and "regulatory element" are used interchangeably herein and refer to a nucleotide sequence that is located upstream of (5' non-coding sequence), within, or downstream of (3' non-coding sequence) a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements, reporter genes and the like. The regulatory sequences may be of different origins, or of the same origin but arranged in a manner different from that normally found in nature. In addition, the vector may contain an origin of replication.
The term "promoter" as used herein refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the invention, the promoter is one capable of controlling gene transcription in a cell, whether or not it is derived from that cell. The promoter may be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.
The term "constitutive promoter" as used herein refers to a promoter that generally causes a gene to be expressed in most cell types under most circumstances. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed primarily, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in one particular cell or cell type. A "developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environmental, hormonal, chemical signal, etc.).
"Introducing" a nucleic acid molecule (e.g., a plasmid, linear nucleic acid fragment, RNA, etc.) or a protein into an organism means transforming cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cells. "Transformation" as used in the present invention includes both stable transformation and transient transformation.
The term "stable transformation" as used herein refers to the introduction of an exogenous nucleotide sequence into the genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and of any successive generations.
The term "transient transformation" as used herein refers to the introduction of a nucleic acid molecule or protein into a cell, where it performs its function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
The term "complementarity" as used herein refers to the ability of one nucleic acid sequence to form one or more hydrogen bonds with another nucleic acid sequence by traditional Watson-Crick pairing or other non-traditional types of pairing. Percent complementarity indicates the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9 or 10 complementary residues out of 10 correspond to 50%, 60%, 70%, 80%, 90% and 100% complementarity, respectively). "Perfectly complementary" means that all contiguous residues of one nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in the second nucleic acid sequence. "Substantially complementary" as used herein refers to a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
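The "5 out of 10 gives 50%" arithmetic above can be reproduced with a short Python sketch. It is a simplification under stated assumptions: only Watson-Crick pairs (A-U, A-T, G-C) are counted, both strands are written 5' to 3' and have equal length, and pairing is antiparallel with no gaps or bulges. The names `WC_PAIRS` and `percent_complementarity` are hypothetical and the example sequences are made up.

```python
# Watson-Crick pairs only; wobble and other non-traditional pairs, which the
# definition also allows, are deliberately left out of this sketch.
WC_PAIRS = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

def percent_complementarity(guide: str, target: str) -> float:
    """Percentage of guide residues that pair with the antiparallel target strand.

    Both strings are written 5'->3', so the target is reversed before comparison.
    """
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (g, t) in WC_PAIRS
        for g, t in zip(guide.upper(), reversed(target.upper()))
    )
    return 100.0 * paired / len(guide)
```

With the 10-nt RNA `AUGGCUAGCU`, the DNA strand `AGCTAGCCAT` pairs at every position (100%), while `TCGATGCCAT` pairs at only five positions (50%).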
The term "stringent conditions" as used herein in relation to hybridization refers to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes to that target sequence and does not substantially hybridize to non-target sequences. Stringent conditions are generally sequence-dependent and vary with a number of factors. In general, the longer the sequence, the higher the temperature at which it hybridizes specifically to its target sequence. Non-limiting examples of stringent conditions are described in Tijssen (1993), Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization With Nucleic Acid Probes, Part I, Chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assay", Elsevier, New York.
The term "hybridization" as used herein refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized by hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute one step in a more extensive process, such as the initiation of PCR or the cleavage of a polynucleotide by an enzyme. A sequence capable of hybridizing to a given sequence is referred to as the "complement" of the given sequence.
Derivatized protein
The Cas12 protein can be derivatized, for example by linking it to another molecule (e.g., another protein or polypeptide). In general, derivatization (e.g., labeling) of the protein does not adversely affect its desired activity (e.g., the activity of binding to a single-stranded guide RNA, endonuclease activity, or the activity of binding to and cleaving a specific site of a target sequence under the direction of a guide RNA). Therefore, in the present invention, the Cas12 protein can be functionally linked (by chemical coupling, genetic fusion, non-covalent association or otherwise) to one or more other molecular moieties, such as another protein or polypeptide, a detectable label, a pharmaceutical agent and the like.
In particular, the Cas12 protein can be linked to other functional units. For example, it can be linked to a nuclear localization signal (NLS) sequence to improve the ability of the protein of the invention to enter the nucleus. It can be linked to a targeting moiety to make the Cas12 protein targetable. It can be linked to a detectable label to facilitate detection of the Cas12 protein. It can be linked to an epitope tag to facilitate expression, detection, tracking and/or purification of the Cas12 protein.
Accordingly, in a first aspect, the present invention provides a conjugate comprising:
a) a Cas12 protein, the Cas12 protein being:
1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
an MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
2) a homolog having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100% sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity;
b) a modifying moiety; and
c) optionally, a linker for connecting the Cas12 protein to the modifying moiety.
In the present invention, the "biological activity" of a Cas12 protein refers to, without limitation, the activity of the protein in binding to a single-stranded guide RNA, its endonuclease activity (including single-strand cleavage activity and double-strand cleavage activity), and/or its activity in binding to and cleaving a specific site of a target sequence under the direction of a guide RNA (gRNA).
In a preferred embodiment, the homolog may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
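The mutant names above appear to follow the common convention "wild-type residue, 1-based position, replacement residue", with hyphens joining the substitutions of a combined mutant. That reading is an assumption of the following Python sketch, as are the helper names `parse_mutant` and `apply_mutant`; the toy sequence in the usage note is invented, because the SEQ ID NO sequences are not reproduced in this section.

```python
import re
from typing import List, Tuple

# One substitution: wild-type letter, 1-based position, replacement letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_mutant(name: str) -> List[Tuple[str, int, str]]:
    """Split a name such as 'T207A-N616S-I902T' into (wild, position, new) tuples."""
    substitutions = []
    for token in name.split("-"):
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError("unrecognized substitution: " + repr(token))
        substitutions.append((match.group(1), int(match.group(2)), match.group(3)))
    return substitutions

def apply_mutant(protein: str, name: str) -> str:
    """Return the protein sequence with every substitution in `name` applied."""
    residues = list(protein)
    for wild, position, new in parse_mutant(name):
        if position < 1 or position > len(residues):
            raise ValueError("position %d is outside the sequence" % position)
        if residues[position - 1] != wild:
            # Guards against applying a mutant name to the wrong parent sequence.
            raise ValueError("expected %s at position %d" % (wild, position))
        residues[position - 1] = new
    return "".join(residues)
```

On an invented four-residue sequence, `apply_mutant("MQTK", "Q2E-T3I")` yields `"MEIK"`. Checking the wild-type residue before substituting is a cheap way to catch a numbering mismatch between a mutant name and its parent sequence.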
It will be appreciated that, beyond the Cas12 protein itself, the Cas12 protein can be combined with other substances, such as other proteins or labeling tags, to confer additional functionality.
Thus, in one embodiment, the modifying moiety may be another protein or polypeptide, a detectable label, or a combination thereof.
In a further embodiment, the other protein or polypeptide is selected from one or more of an epitope tag, a reporter protein or a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), the cytosine methylases DNMT3A and MQ1, the cytosine demethylase Tet1, the transcription activators VP64, p65 and RTA, the transcription repressor KRAB, the histone acetylase p300, the histone deacetylase LSD1, and the endonuclease FokI.
Epitope tags are well known to those skilled in the art; examples include, but are not limited to, His, V5, FLAG, HA, Myc, VSV-G, Trx and the like, and those skilled in the art know how to select a suitable epitope tag for the intended purpose (e.g., purification, detection or tracking).
Reporter proteins are well known to those skilled in the art; examples include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP and the like.
Detectable labels are well known to those skilled in the art; examples include fluorescent dyes such as fluorescein isothiocyanate (FITC) or DAPI.
The Cas12 protein of the present invention may be coupled, conjugated or fused to the modifying moiety through a linker, or may be connected to the modifying moiety directly without a linker. Linkers are well known in the art; examples include, but are not limited to, linkers comprising 1-50 amino acids (such as Glu or Ser) or amino acid derivatives (such as Ahx, β-Ala, GABA or Ava), or PEG and the like.
In a second aspect, the present invention provides a fusion protein comprising:
a) a Cas12 protein, the Cas12 protein being:
1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
an MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5,
or
a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
2) a homolog having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100% sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and that retains its biological activity;
b) another protein or polypeptide; and
c) optionally, a linker for connecting the Cas12 protein to the other protein or polypeptide.
As in the first aspect of the present invention, in a preferred embodiment the homolog may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
The other protein or polypeptide may be selected from one or more of an epitope tag, a reporter protein or a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), the cytosine methylases DNMT3A and MQ1, the cytosine demethylase Tet1, the transcription activators VP64, p65 and RTA, the transcription repressor KRAB, the histone acetylase p300, the histone deacetylase LSD1, and the endonuclease FokI.
Epitope tags are well known to those skilled in the art; examples include, but are not limited to, His, V5, FLAG, HA, Myc, VSV-G, Trx and the like, and those skilled in the art know how to select a suitable epitope tag for the intended purpose (e.g., purification, detection or tracking).
Reporter proteins are well known to those skilled in the art; examples include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP and the like.
Detectable labels are well known to those skilled in the art; examples include fluorescent dyes such as fluorescein isothiocyanate (FITC) or DAPI.
The Cas12 protein of the present invention may be coupled, conjugated or fused to the other protein or polypeptide through a linker, or may be connected to the other protein or polypeptide directly without a linker. Linkers are well known in the art; examples include, but are not limited to, linkers comprising 1-50 amino acids (such as Glu or Ser) or amino acid derivatives (such as Ahx, β-Ala, GABA or Ava), or PEG and the like.
In a preferred embodiment, the fusion protein comprises: a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, adenine deaminase (ABE), and optionally a linker connecting the Cas12J-8 protein and the adenine deaminase (ABE).
In a preferred embodiment, the fusion protein comprises, in order from its N-terminus to its C-terminus, the adenine deaminase (ABE), the linker, and the Cas12J-8 protein.
In a more preferred embodiment, the amino acid sequence of the fusion protein is as shown in SEQ ID NO: 7.
Our research group has developed a Cas12j-8 editing tool capable of efficient gene editing in a eukaryotic cell environment. The Cas12j-8 protein has a small number of amino acids, in particular the smallest number among the gene editors currently available for eukaryotes, and can therefore be packaged efficiently into expression vectors such as adeno-associated virus vectors. The protein also has high specificity and a simple PAM, and its low molecular weight allows it to be packaged readily by vector tools such as adeno-associated virus, which makes it well suited to later development as a gene therapy tool.
In addition, the PAM of the Cas12j-8 protein is TTN, which is simple and gives a broad editing range. Our experiments further show that the editing efficiency of the Cas12j-8 protein at random sites is significantly higher than that of the FnCas12a protein, indicating strong gene editing ability in a eukaryotic environment. Compared with the Cas12j-2 protein of the same series, the Cas12j-8 protein has a very pronounced editing advantage: its editing ability at random sites is significantly higher than that of Cas12j-2, making it better suited to the development and application of gene editing.
Compared with other existing Cas12a and Cas12b proteins, the Cas12a and Cas12b proteins of the present invention have higher editing activity, higher specificity and a relatively simple PAM sequence. Their PAM is YYN, which extends the field of Cas12a and Cas12b proteins and broadens their range of application.
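To make the TTN and YYN motifs concrete, the Python sketch below lists candidate sites on one DNA strand. Several points are assumptions of the sketch rather than statements of this application: the PAM is taken to lie immediately 5' of the protospacer (the arrangement generally reported for Cas12-family nucleases), only the strand supplied is scanned, the input contains only A, C, G and T, and the protospacer length is a free parameter whose default of 24 nt is purely illustrative. The names `PAM_PATTERNS` and `find_sites` are hypothetical.

```python
import re
from typing import List, Tuple

# Regular-expression forms of the two PAMs named above (Y = C or T, N = any base).
PAM_PATTERNS = {"TTN": "TT[ACGT]", "YYN": "[CT][CT][ACGT]"}

def find_sites(dna: str, pam: str, protospacer_len: int = 24) -> List[Tuple[int, str, str]]:
    """Return (PAM start index, PAM, protospacer) for every site on the given strand.

    Assumes the PAM sits directly 5' of the protospacer; the reverse strand is
    not searched, so callers would pass the reverse complement separately.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(
        "(?=(" + PAM_PATTERNS[pam] + ")([ACGT]{" + str(protospacer_len) + "}))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]
```

Because every TTN match is also a YYN match, a YYN search returns a superset of the TTN sites on the same sequence, which illustrates why a less restrictive PAM widens the set of addressable targets.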
Single-stranded guide RNA
In a third aspect, the present invention provides a single-stranded guide RNA comprising a CRISPR repeat sequence, the CRISPR repeat sequence having:
a) the nucleic acid sequence shown in SEQ ID NO: 15, for the Cas12J-8 protein or its homolog, conjugate or fusion protein,
the nucleic acid sequence shown in SEQ ID NO: 16, for the Mb4Cas12a protein, MlCas12a protein and MoCas12a protein or their homologs, conjugates or fusion proteins,
the nucleic acid sequence shown in SEQ ID NO: 17, for the BgCas12a protein or its homolog, conjugate or fusion protein, or
the nucleic acid sequence shown in SEQ ID NO: 18, for the ChCas12b protein or its homolog, conjugate or fusion protein;
or
b) a nucleic acid sequence that has at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9% or at least 100% sequence identity to the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and that retains its biological activity; or
c) a nucleic acid sequence obtained by modifying the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and that retains its biological activity.
In one embodiment, the modification may be one or more of base phosphorylation, base sulfurization, base methylation, base hydroxylation, shortening of the sequence and lengthening of the sequence.
In a further embodiment, the shortening and the lengthening of the sequence comprise a deletion or addition of one, two, three, four, five, six, seven, eight, nine or ten bases relative to the base sequence.
In yet another embodiment, the single-stranded guide RNA may further comprise a CRISPR spacer sequence at the 3' end of the CRISPR repeat sequence, the CRISPR spacer sequence being a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides (preferably 24 nucleotides) in length that is capable of complementary pairing with the target sequence.
In a preferred embodiment, the CRISPR spacer sequence is a sequence of 24 nucleotides in length that is capable of complementary pairing with the target sequence.
In a further embodiment, the single-stranded guide RNA further comprises a terminator at the 3' end of the spacer sequence. As an example, the terminator may consist of multiple U residues, such as at least six (e.g., seven or eight).
所述单链向导RNA能够与上述的Cas12蛋白、缀合物或者融合蛋白结合而形成复合物,该复合物可以识别相应的PAM并由此与靶序列结合,进而实现对靶序列的剪切或者说基因编辑。The single-stranded guide RNA can combine with the above-mentioned Cas12 protein, conjugate or fusion protein to form a complex, which can recognize the corresponding PAM and thus bind to the target sequence, thereby realizing the shearing of the target sequence or Say gene editing.
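The guide RNA layout described above (CRISPR repeat, then a 20 to 30 nucleotide spacer at its 3' end, then a terminator of at least six U residues) can be sketched as a small assembly routine. This is an illustrative sketch only: `build_sgrna` is a hypothetical helper, and the repeat used in the example is a made-up placeholder, not the sequence of SEQ ID NO: 15 to 18.

```python
RNA_BASES = set("ACGU")


def build_sgrna(repeat: str, spacer: str, terminator_len: int = 7) -> str:
    """Assemble 5'-repeat-spacer-polyU-3' and enforce the stated length limits."""
    # Accept DNA-style input and convert it to RNA letters.
    repeat = repeat.upper().replace("T", "U")
    spacer = spacer.upper().replace("T", "U")
    for name, seq in (("repeat", repeat), ("spacer", spacer)):
        if not seq or set(seq) - RNA_BASES:
            raise ValueError(f"{name} must be a non-empty A/C/G/U sequence")
    # The text allows spacers of 20 to 30 nt, with 24 nt preferred.
    if not 20 <= len(spacer) <= 30:
        raise ValueError("spacer must be 20 to 30 nucleotides long")
    # The example terminator in the text is at least six U residues.
    if terminator_len < 6:
        raise ValueError("terminator needs at least six U residues")
    return repeat + spacer + "U" * terminator_len


if __name__ == "__main__":
    placeholder_repeat = "GGGAAACCC"   # hypothetical, NOT a sequence from the listing
    placeholder_spacer = "ACGU" * 6    # 24 nt, the preferred length
    print(build_sgrna(placeholder_repeat, placeholder_spacer))
```

In practice the spacer is chosen to be complementary to the intended target sequence; the placeholder above only demonstrates the length check.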
Encoding nucleic acids and vectors
In a fourth aspect, the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
a) a Cas12 protein, the Cas12 protein being:
1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
an MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5,
or
a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
2) a homologue whose amino acid sequence has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 99.95%, 99.99% or 99.999%, or 100%, or any percentage from 80% to 100%, sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and which retains its biological activity;
b) the conjugate of the first aspect of the present invention; or
c) the fusion protein of the second aspect of the present invention.
As in the first and second aspects of the present invention, in a preferred embodiment the homologue may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
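Mutant labels such as T207A or Q144R-D148G follow the usual convention of wild-type residue, position, and replacement residue, with hyphens joining combined substitutions. The sketch below shows how such a label could be applied to a protein sequence. It assumes that positions are 1-based and counted on the corresponding wild-type sequence; `apply_point_mutations` is a hypothetical helper, and the five-residue sequence in the example is a toy string, not any of SEQ ID NO: 1 to 6.

```python
import re

# One substitution: wild-type residue, 1-based position, new residue (e.g. T207A).
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutations(protein: str, label: str) -> str:
    """Apply a hyphen-joined mutation label such as 'Q144R-D148G' to a sequence."""
    residues = list(protein.upper())
    for token in label.upper().split("-"):
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"unrecognised mutation token: {token!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Guard against applying a label to the wrong parent sequence.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_protein = "MQTAK"  # toy sequence for illustration only
    print(apply_point_mutations(toy_protein, "Q2E-K5R"))
```

The wild-type check is a design choice: it makes a mismatch between label and parent sequence fail loudly instead of silently producing an unintended variant.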
In one embodiment, the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof.
In one embodiment, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the fusion protein shown in SEQ ID NO: 7.
In a preferred embodiment, the isolated nucleic acid molecule comprises the nucleic acid sequence shown in SEQ ID NO: 14 or a degenerate sequence thereof.
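A degenerate sequence differs from the listed nucleic acid sequence at the nucleotide level but encodes the same amino acid sequence, because most amino acids are specified by more than one codon. The sketch below illustrates this with the standard genetic code. It assumes the standard code applies to the expression host, and the helper names and the nine-nucleotide examples are hypothetical illustrations, not sequences from the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(dna_a: str, dna_b: str) -> bool:
    """True when two different DNA sequences encode the same protein."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # Both encode Met-Ala-Lys; they differ in the wobble positions.
    print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```

Codon-optimised versions of a coding sequence for a particular host are a common practical case of such degenerate variants.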
In a further embodiment, the isolated nucleic acid molecule also encodes the single-stranded guide RNA of the third aspect of the present invention that corresponds to the Cas12 protein.
As an example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or its homologue, conjugate or fusion protein (for example the fusion protein shown in SEQ ID NO: 7), for example the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 19, encoding a single-stranded guide RNA for that Cas12J-8 protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 15 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 15 and retains its biological activity.
As an example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 20, encoding a single-stranded guide RNA for that Cas12a protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 16 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 16 and retains its biological activity.
As an example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 12; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 21, encoding a single-stranded guide RNA for that BgCas12a protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 17 and retains its biological activity.
As an example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 13; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 22, encoding a single-stranded guide RNA for that ChCas12b protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 18 and retains its biological activity.
In a fifth aspect, the present invention provides an isolated nucleic acid molecule encoding the single-stranded guide RNA of the third aspect of the present invention.
In one embodiment, the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, or a degenerate sequence thereof.
In a preferred embodiment, the isolated nucleic acid molecule further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
After the isolated nucleic acid molecule of the present invention has been transfected into suitable cells using tools known in the art, such as an expression vector, it can express the Cas12 protein, its conjugate or fusion protein, and/or the single-stranded guide RNA described above, which then perform their functions in the cell, for example gene editing.
In addition, the isolated nucleic acid molecule of the present invention may express the Cas12 protein, its conjugate or fusion protein, and the single-stranded guide RNA separately, or may express these products together; the choice of expression mode depends on the circumstances.
Furthermore, the expression products have the corresponding effects and/or functions described above, which are not repeated here for brevity.
In a sixth aspect, the present invention provides a vector comprising a nucleic acid sequence encoding:
a) a Cas12 protein, the Cas12 protein being:
1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
an MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5,
or
a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
2) a homologue whose amino acid sequence has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 99.95%, 99.99% or 99.999%, or 100%, or any percentage from 80% to 100%, sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and which retains its biological activity;
b) the conjugate of the first aspect of the present invention; or
c) the fusion protein of the second aspect of the present invention.
In a preferred embodiment, the homologue may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
In one embodiment, the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof.
In one embodiment, the vector comprises a nucleic acid sequence encoding the fusion protein shown in SEQ ID NO: 7.
In a preferred embodiment, the vector comprises the nucleic acid sequence shown in SEQ ID NO: 14 or a degenerate sequence thereof.
The vector may be an expression vector, for example a plasmid vector (such as a pUC19 vector), an episomal vector, a pAAV2_ITR vector, a retroviral vector, a lentiviral vector, an adenoviral vector or an adeno-associated viral vector.
In yet another embodiment, the vector further comprises a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention that corresponds to the Cas12 protein.
As an example, the vector comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or its homologue, conjugate or fusion protein (for example the fusion protein shown in SEQ ID NO: 7), for example the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 19, encoding a single-stranded guide RNA for that Cas12J-8 protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 15 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 15 and retains its biological activity.
As an example, the vector comprises a nucleic acid sequence encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 20, encoding a single-stranded guide RNA for that Cas12a protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 16 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 16 and retains its biological activity.
As an example, the vector comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 12; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 21, encoding a single-stranded guide RNA for that BgCas12a protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 17 and retains its biological activity.
As an example, the vector comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 13; and further comprises a nucleic acid sequence, for example the nucleic acid sequence shown in SEQ ID NO: 22, encoding a single-stranded guide RNA for that ChCas12b protein, homologue, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 18 and retains its biological activity.
In a seventh aspect, the present invention provides a vector comprising a nucleic acid molecule encoding the single-stranded guide RNA of the third aspect of the present invention.
In one embodiment, the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, or a degenerate sequence thereof.
In a preferred embodiment, the vector further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
As follows from the description above, after the vector of the present invention has been transfected into a cell, the nucleic acid sequence cloned in the vector can be expressed as the Cas12 protein, its conjugate or fusion protein, and/or the single-stranded guide RNA described above, which then perform their functions in the cell, for example gene editing.
In addition, multiple vectors, for example two vectors, may be transfected into the cell, one vector expressing the Cas12 protein, its conjugate or fusion protein, and the other expressing the single-stranded guide RNA. The expressed Cas12 protein, conjugate or fusion protein then assembles with the expressed single-stranded guide RNA into a complex, which performs the corresponding function in the cell, for example gene editing.
Of course, the nucleic acid sequence encoding the Cas12 protein, its conjugate or fusion protein and the nucleic acid sequence encoding the single-stranded guide RNA may also be cloned into a single vector, so that after transfection into the cell the vector expresses both the Cas12 protein, its conjugate or fusion protein and the single-stranded guide RNA, which then perform the corresponding function in the cell, for example gene editing.
CRISPR/Cas12 gene editing system
In an eighth aspect, the present invention provides a CRISPR/Cas12 gene editing system comprising:
a) a protein component comprising:
1) a Cas12 protein, the Cas12 protein being:
1.1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
an MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
1.2) a homologue whose amino acid sequence has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 99.95%, 99.99% or 99.999%, or 100%, or any percentage from 80% to 100%, sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and which retains its biological activity;
2) the conjugate of the first aspect of the present invention; or
3) the fusion protein of the second aspect of the present invention; and
b) a nucleic acid component comprising the single-stranded guide RNA of the third aspect of the present invention that corresponds to the protein component in a);
wherein the protein component and the nucleic acid component bind to each other to form a complex.
In a preferred embodiment, the homologue may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
As an example, the protein component comprises the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA that comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 15 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 15 and retains its biological activity.
As an example, the protein component comprises a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA that comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 16 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 16 and retains its biological activity.
As an example, the protein component comprises the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA that comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 17 and retains its biological activity.
As an example, the protein component comprises the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA that comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 18 and retains its biological activity.
Above, the expression "at least 90% sequence identity" used for the single-stranded guide RNA may be, for example, at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.9%, or 100%, sequence identity.
The CRISPR/Cas12 gene editing system of the present invention may be constituted directly by the Cas12 protein described herein, its homologue, or a conjugate or fusion protein thereof, together with the single-stranded guide RNA described herein, or it may be constituted by the expression products obtained from the vectors described herein.
The CRISPR/Cas12 gene editing system of the present invention recognizes, locates, cleaves and edits the target sequence through the joint action of the Cas12 protein and the single-stranded guide RNA that it contains.
The CRISPR/Cas12 gene editing system of the present invention can precisely locate the target sequence. "Precisely locate" has two meanings: first, the CRISPR/Cas12 gene editing system of the present invention can itself recognize and bind the target sequence; second, it can bring other proteins fused to the Cas12 protein, or proteins that specifically recognize the sgRNA, to the position of the target sequence.
The CRISPR/Cas12 gene editing system of the present invention has low tolerance for non-target sequences. Herein, "low tolerance" means that the system essentially or completely fails to recognize and bind non-target sequences, or essentially or completely fails to bring other proteins fused to the Cas12 protein, or proteins that specifically recognize the sgRNA, to the position of a non-target sequence.
Because the PAM sequence recognized on the target sequence by the Cas12 protein contained in the CRISPR/Cas12 gene editing system of the present invention is simpler, the system can target more DNA sequences in the genome.
Cell
In a ninth aspect, the present invention provides a cell comprising: the isolated nucleic acid molecules of the fourth and fifth aspects of the present invention, or the vectors of the sixth and seventh aspects of the present invention.
As an example, the cell may be a prokaryotic cell or a eukaryotic cell. The eukaryotic cell may be, for example, a plant cell or an animal cell. The animal cell may be, for example, a mammalian cell such as a human cell.
Method
In a tenth aspect, the present invention provides a method for gene editing of a target sequence in a cell or in an in vitro environment, the method comprising contacting any one of the following (1) to (4) with the target sequence in the cell or in the in vitro environment:
(1) a Cas12 protein, the conjugate of the first aspect of the present invention, or the fusion protein of the second aspect of the present invention, and the single-stranded guide RNA of the third aspect of the present invention corresponding to said Cas12 protein,
wherein the Cas12 protein is:
1) the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
the Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
the MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
the MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100%, sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
(2) the vectors of the sixth and seventh aspects of the present invention;
(3) the vector of the sixth aspect of the present invention; and
(4) the CRISPR/Cas12 gene editing system of the eighth aspect of the present invention;
wherein, after contact with the target sequence, the Cas12 protein, or its homologue, conjugate or fusion protein, recognizes its respective protospacer adjacent motif (PAM), the PAM being located at the 5' end of the target sequence; and, for the Cas12J-8 protein, the Mb4Cas12a protein, the MlCas12a protein, the MoCas12a protein, the BgCas12a protein and the ChCas12b protein, or their respective homologues, conjugates or fusion proteins, the PAMs are 5'-TTN, 5'-YYN, 5'-YYN, 5'-YYN, 5'-YYN and 5'-TTN, respectively.
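The PAM rule above can be illustrated with a minimal Python sketch that scans one strand of a DNA string for a PAM followed by a protospacer. The protein-to-PAM table is taken from the paragraph above; the function name `find_targets`, the 20 nt default protospacer length and the single-strand scan are illustrative assumptions and are not specified in the text.

```python
import re

# IUPAC degenerate codes used by the PAMs in the text: Y = C or T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}

# PAM per protein as stated in the text (written 5'->3').
PAMS = {
    "Cas12J-8": "TTN",
    "Mb4Cas12a": "YYN",
    "MlCas12a": "YYN",
    "MoCas12a": "YYN",
    "BgCas12a": "YYN",
    "ChCas12b": "TTN",
}

def find_targets(seq, protein, spacer_len=20):
    """Return (pam_start, pam, protospacer) for every PAM on the given strand
    that is immediately followed by a full-length protospacer.

    spacer_len is a hypothetical default; the text does not fix a length here.
    """
    pam_regex = "".join(IUPAC[base] for base in PAMS[protein])
    # A lookahead is used so that overlapping PAM sites are all reported.
    pattern = re.compile(rf"(?=({pam_regex})([ACGT]{{{spacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]
```

Because 5'-YYN admits CC, CT, TC and TT in its first two positions, a YYN scan reports every site a TTN scan reports plus additional ones, which is consistent with the statement that a simpler PAM allows more genomic sequences to be targeted. A fuller tool would also scan the reverse complement.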
For item (1) above: as an example, the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15, or comprising a modified sequence engineered from SEQ ID NO: 15 that retains its biological activity;
as an example, a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 16 that retains its biological activity;
as an example, a nucleic acid sequence of the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, a homologue thereof, or a conjugate or fusion protein of either, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 17 that retains its biological activity;
as an example, the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 18 that retains its biological activity.
In a preferred embodiment, the homologue may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: point mutants of the Mb4Cas12a protein, for example T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; point mutants of the MlCas12a protein, for example V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; point mutants of the MoCas12a protein, for example Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; point mutants of the BgCas12a protein, for example Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or point mutants of the ChCas12b protein, for example Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
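As a hedged illustration of the mutant notation used above (for example T207A-N616S, read as wild-type residue, 1-based position, replacement residue), the following Python sketch applies such substitutions to a protein string and computes percent identity against the parent. The function names and the toy sequence in the usage note are hypothetical; the actual SEQ ID NO sequences are not reproduced here, and the identity calculation assumes two equal-length, already aligned sequences rather than performing an alignment.

```python
import re

def apply_point_mutations(protein_seq, mutant_name):
    """Apply substitutions written like 'T207A-N616S' (1-based positions)."""
    residues = list(protein_seq)
    for token in mutant_name.split("-"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognised substitution: {token}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        # Guard against applying a mutation to the wrong parent sequence.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)

def percent_identity(a, b):
    """Percent identity of two equal-length, already aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

For a toy 10-residue parent `"MKTAYIAKQR"`, `apply_point_mutations(parent, "T3A-Q9E")` changes two residues, so `percent_identity` against the parent is 80.0. On a full-length Cas12 protein, a handful of substitutions would leave the identity far above the 80% lower bound recited for homologues.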
For item (2) above:
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14) encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof (for example, the fusion protein shown in SEQ ID NO: 7), and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for that Cas12J-8 protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 15 that retains its biological activity;
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11) encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for that Mb4Cas12a protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 16 that retains its biological activity;
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homologue, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 21) encoding a single-stranded guide RNA for that BgCas12a protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 17 that retains its biological activity;
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for that ChCas12b protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 18 that retains its biological activity.
In one embodiment, the cell is a prokaryotic cell or a eukaryotic cell; the eukaryotic cell is, for example, a plant cell or an animal cell, and the animal cell is, for example, a mammalian cell such as a human cell.
In one embodiment, the gene editing comprises one or more of the following, performed on the target sequence: gene knockout, site-directed base change, site-directed insertion, regulation of gene transcription level, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base conversion, and chromatin imaging and tracking.
Further, in one embodiment, the single-base conversion comprises conversion of adenine to guanine, conversion of cytosine to thymine, or conversion of cytosine to uracil.
In one embodiment, in the method, the CRISPR spacer sequence of the single-stranded guide RNA forms a fully complementary base-paired structure with the target sequence, and forms an incompletely complementary base-paired structure with non-target sequences.
Herein, an incompletely complementary base-paired structure refers to a structure that includes some complementary base pairs and some non-complementary pairings, the non-complementary pairings including, for example, base mismatches and/or base bulges.
In one embodiment, the incompletely complementary base-paired structure includes one or more, for example two or more, base mismatches.
Thus, the Cas12 protein of the present invention can cleave the target site on the target sequence, and the target sequence undergoes a double-strand break as a result of this cleavage. Further, when the method is carried out in a cell, the cleaved target sequence can be repaired by the cell's non-homologous end joining or homologous recombination repair pathway, thereby achieving gene editing of the target sequence.
The CRISPR/Cas12 gene editing system of the present invention, and the gene editing method using it, were found experimentally to have editing efficiencies of 40%-70% (for the Cas12J-8 protein), 12%-56% (for the ChCas12b protein) and 10%-20% (for each of the remaining Cas12a proteins). In addition, for the CRISPR/Cas12J-8 gene editing system, mismatches within the first 14 bp of the guide RNA are tolerated at a rate close to 0%. The gene editing system can therefore edit target genes with high specificity, offers high editing efficiency and a low off-target rate, and can be widely applied to gene editing in cells or in in vitro environments.
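The mismatch criterion described above can be sketched in Python as a simple position-by-position comparison. This is only an illustration under stated assumptions: the spacer is compared with the same-sense protospacer (so U is treated as T), bulges are not modelled, positions are counted from the first base of the spacer, and the function names and the `seed_len=14` default (taken from the "first 14 bp" statement for Cas12J-8) are hypothetical choices rather than part of the described method.

```python
def mismatch_positions(spacer, protospacer):
    """Return 1-based positions where spacer and protospacer differ.

    Only substitutions are considered; bulges are not modelled.
    """
    if len(spacer) != len(protospacer):
        raise ValueError("sequences must have the same length")
    # Treat RNA 'U' as equivalent to DNA 'T' for the comparison.
    s = spacer.upper().replace("U", "T")
    p = protospacer.upper().replace("U", "T")
    return [i + 1 for i, (x, y) in enumerate(zip(s, p)) if x != y]

def has_seed_mismatch(spacer, protospacer, seed_len=14):
    """True if any mismatch falls within the first seed_len positions."""
    return any(pos <= seed_len for pos in mismatch_positions(spacer, protospacer))
```

Under these assumptions, a candidate off-target site for which `has_seed_mismatch` returns `True` would be expected to be edited rarely, if at all, by the Cas12J-8 system, while mismatches only beyond position 14 are not covered by the statement in the text.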
Kit
In an eleventh aspect, the present invention provides a kit for gene editing of a target sequence in a cell or in an in vitro environment, the kit comprising:
a) any one selected from the following 1) to 6):
1) a Cas12 protein or a homologue thereof, the conjugate of the first aspect of the present invention, or the fusion protein of the second aspect of the present invention, and the single-stranded guide RNA of the third aspect of the present invention corresponding to said Cas12 protein,
wherein the Cas12 protein is:
1.1) the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
the Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
the MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
the MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5,
or
the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
or
1.2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100%, sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
2) the isolated nucleic acid molecules of the fourth and fifth aspects of the present invention;
3) the isolated nucleic acid molecule of the fifth aspect of the present invention;
4) the vectors of the sixth and seventh aspects of the present invention;
5) the vector of the sixth aspect of the present invention; or
6) the CRISPR/Cas12 gene editing system of the eighth aspect of the present invention;
and
b) instructions describing how to perform gene editing on a target sequence in a cell or in an in vitro environment.
For item 1) above: as an example, the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence engineered from SEQ ID NO: 15 that retains its biological activity;
as an example, a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, a homologue thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a conjugate or fusion protein of either, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence engineered from SEQ ID NO: 16 that retains its biological activity;
as an example, the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, a homologue thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 5, or a conjugate or fusion protein of either, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence engineered from SEQ ID NO: 17 that retains its biological activity;
as an example, the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, a homologue thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 6, or a conjugate or fusion protein of either, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence engineered from SEQ ID NO: 18 that retains its biological activity.
In a preferred embodiment, the homologue may be a point mutant of the Cas12 protein shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to: point mutants of the Mb4Cas12a protein, for example T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; point mutants of the MlCas12a protein, for example V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; point mutants of the MoCas12a protein, for example Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; point mutants of the BgCas12a protein, for example Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or point mutants of the ChCas12b protein, for example Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K.
For item 2) above:
as an example, an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14) encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof (for example, the fusion protein shown in SEQ ID NO: 7), and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for that Cas12J-8 protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 15 that retains its biological activity;
as an example, an isolated nucleic acid molecule comprising a nucleic acid sequence (the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11) encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for that Cas12a protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 16 that retains its biological activity;
as an example, an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homologue, conjugate or fusion protein thereof, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 21) encoding a single-stranded guide RNA for that BgCas12a protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 17 that retains its biological activity;
as an example, an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for that ChCas12b protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 18 that retains its biological activity.
For item 4) above:
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14) encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof (for example, the fusion protein shown in SEQ ID NO: 7), and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for that Cas12J-8 protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 15 that retains its biological activity;
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11) encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for that Cas12a protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 16 that retains its biological activity;
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homologue, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 21) encoding a single-stranded guide RNA for that BgCas12a protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 17 that retains its biological activity;
as an example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for that ChCas12b protein, homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or comprising a modified sequence engineered from SEQ ID NO: 18 that retains its biological activity.
当然,本领域技术人员可以理解,本发明试剂盒中还可以包含其他有助于进行基因编辑的试剂。Of course, those skilled in the art can understand that the kit of the present invention may also contain other reagents that are helpful for gene editing.
Brief description of the sequences involved in the present invention
SEQ ID NO: 1: Cas12J-8 protein sequence
SEQ ID NO: 2: Mb4Cas12a protein sequence
SEQ ID NO: 3: MlCas12a protein sequence
SEQ ID NO: 4: MoCas12a protein sequence
SEQ ID NO: 5: BgCas12a protein sequence
SEQ ID NO: 6: ChCas12b protein sequence
SEQ ID NO: 7: Fusion protein comprising the Cas12J-8 protein
SEQ ID NO: 8: Coding sequence of the Cas12J-8 protein
SEQ ID NO: 9: Coding sequence of the Mb4Cas12a protein
SEQ ID NO: 10: Coding sequence of the MlCas12a protein
SEQ ID NO: 11: Coding sequence of the MoCas12a protein
SEQ ID NO: 12: Coding sequence of the BgCas12a protein
SEQ ID NO: 13: Coding sequence of the ChCas12b protein
SEQ ID NO: 14: Coding sequence of the fusion protein comprising the Cas12J-8 protein
SEQ ID NO: 15: CRISPR repeat sequence used with the Cas12J-8 protein
SEQ ID NO: 16: CRISPR repeat sequence used with the Mb4Cas12a, MlCas12a and MoCas12a proteins
SEQ ID NO: 17: CRISPR repeat sequence used with the BgCas12a protein
SEQ ID NO: 18: CRISPR repeat sequence used with the ChCas12b protein
SEQ ID NO: 19: DNA sequence of the CRISPR repeat sequence of the single-stranded guide RNA associated with the Cas12J-8 protein
SEQ ID NO: 20: DNA sequence of the CRISPR repeat sequence of the single-stranded guide RNA associated with the Mb4Cas12a, MlCas12a and MoCas12a proteins
SEQ ID NO: 21: DNA sequence of the CRISPR repeat sequence of the single-stranded guide RNA associated with the BgCas12a protein
SEQ ID NO: 22: DNA sequence of the CRISPR repeat sequence of the single-stranded guide RNA associated with the ChCas12b protein
SEQ ID NO: 23: Cas12J-4 protein sequence
SEQ ID NO: 24: Cas12J-5 protein sequence
SEQ ID NO: 25: Cas12J-7 protein sequence
SEQ ID NO: 26: Cas12J-9 protein sequence
SEQ ID NO: 27: Coding sequence of the Cas12J-4 protein
SEQ ID NO: 28: Coding sequence of the Cas12J-5 protein
SEQ ID NO: 29: Coding sequence of the Cas12J-7 protein
SEQ ID NO: 30: Coding sequence of the Cas12J-9 protein
SEQ ID NO: 31: DNA sequence of the CRISPR repeat sequence used with the Cas12J-4 protein
SEQ ID NO: 32: DNA sequence of the CRISPR repeat sequence used with the Cas12J-5 protein
SEQ ID NO: 33: DNA sequence of the CRISPR repeat sequence used with the Cas12J-7 protein
SEQ ID NO: 34: DNA sequence of the CRISPR repeat sequence used with the Cas12J-9 protein
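For readers who handle these sequences programmatically, the listing above can be held in a small lookup table. The following is only an illustrative sketch: the dictionary name `SEQ_IDS`, its keys and the helper `seq_id` are hypothetical, and only the numbers are taken from the listing.

```python
# Illustrative lookup of SEQ ID numbers from the sequence listing.
# The names SEQ_IDS and seq_id and the key names are hypothetical.
SEQ_IDS = {
    "Cas12J-8":  {"protein": 1, "coding": 8,  "repeat": 15, "repeat_dna": 19},
    "Mb4Cas12a": {"protein": 2, "coding": 9,  "repeat": 16, "repeat_dna": 20},
    "MlCas12a":  {"protein": 3, "coding": 10, "repeat": 16, "repeat_dna": 20},
    "MoCas12a":  {"protein": 4, "coding": 11, "repeat": 16, "repeat_dna": 20},
    "BgCas12a":  {"protein": 5, "coding": 12, "repeat": 17, "repeat_dna": 21},
    "ChCas12b":  {"protein": 6, "coding": 13, "repeat": 18, "repeat_dna": 22},
    # For these four proteins the listing only gives a repeat DNA sequence.
    "Cas12J-4":  {"protein": 23, "coding": 27, "repeat_dna": 31},
    "Cas12J-5":  {"protein": 24, "coding": 28, "repeat_dna": 32},
    "Cas12J-7":  {"protein": 25, "coding": 29, "repeat_dna": 33},
    "Cas12J-9":  {"protein": 26, "coding": 30, "repeat_dna": 34},
}


def seq_id(protein: str, kind: str) -> str:
    """Return the SEQ ID label for a protein and sequence kind."""
    return f"SEQ ID NO: {SEQ_IDS[protein][kind]}"


if __name__ == "__main__":
    print(seq_id("BgCas12a", "coding"))
```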
Examples
The present invention is now described with reference to the following examples, which are intended to illustrate rather than limit the invention. Those skilled in the art will appreciate that the examples are provided solely to describe the invention in detail and are not intended to limit the scope of protection claimed.
Unless otherwise specified, the experiments and methods described in the examples were performed essentially according to conventional methods well known in the art and described in the various references. Where specific conditions are not stated in the examples, conventional conditions or the conditions recommended by the manufacturer were used. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products.
Example 1
(1) Construction of plasmid pAAV2_Cas12_ITR
The amino acid sequence of each Cas12 protein was downloaded according to the accession number listed in Table 1. The amino acid sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively.
Table 1. Cas12 proteins with their NCBI protein search IDs and sequence numbers
Cas12 protein name | NCBI protein search ID | Amino acid sequence
Cas12J-8 | none | SEQ ID NO: 1
Mb4Cas12a | WP_078273923.1 | SEQ ID NO: 2
MlCas12a | WP_065256572.1 | SEQ ID NO: 3
MoCas12a | WP_112744621.1 | SEQ ID NO: 4
BgCas12a | OLA11341.1 | SEQ ID NO: 5
ChCas12b | OQB30769 | SEQ ID NO: 6
The nucleic acid sequences encoding the above Cas12 proteins were codon-optimized to obtain gene sequences for high expression of the Cas12 proteins in human cells. The optimized gene sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 8 to SEQ ID NO: 13, respectively.
The high-expression gene sequences of the Cas12 proteins shown in SEQ ID NO: 8 to SEQ ID NO: 13 were synthesized and cloned into the slugCas9 backbone plasmid (Addgene, catalog #163793) to obtain plasmid pAAV2_Cas12_ITR.
(2-1) Construction of plasmid Cas12J-8-PSK-u6-crRNA
The pBluescriptSKII+U6-sgRNA(F+E)empty plasmid (Addgene, commercially available, catalog #74707) was digested with the restriction endonucleases BbsI and XhoI. The digestion system was: 1 μg of plasmid psk-BbsI-Sasg, 5 μL of 10× CutSmart buffer (NEB), 1 μL of BbsI and 1 μL of XhoI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 1 hour.
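The volumes in this digestion system follow from simple arithmetic once the plasmid stock concentration is known. The sketch below shows one way to compute the water volume; the function name `digestion_mix` and the 500 ng/μL stock concentration are assumptions for illustration and do not come from the example.

```python
def digestion_mix(plasmid_ng_per_ul, plasmid_ug=1.0, buffer_ul=5.0,
                  enzyme_ul=(1.0, 1.0), total_ul=50.0):
    """Return the component volumes (in microlitres) of a restriction digest.

    plasmid_ng_per_ul is the stock concentration of the plasmid prep
    (an assumed input; the example only states the 1 ug mass).
    """
    plasmid_ul = plasmid_ug * 1000.0 / plasmid_ng_per_ul
    water_ul = total_ul - plasmid_ul - buffer_ul - sum(enzyme_ul)
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {
        "plasmid_ul": plasmid_ul,
        "buffer_ul": buffer_ul,
        "enzymes_ul": sum(enzyme_ul),
        "water_ul": water_ul,
    }


if __name__ == "__main__":
    # Hypothetical stock at 500 ng/uL: 2 uL of plasmid, 41 uL of water.
    print(digestion_mix(500.0))
```

With 5 μL of 10× buffer in a 50 μL reaction, the buffer ends up at 1×, which matches the system described above.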
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 3296 bp DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
Based on the repeat sequence in the genome associated with the Cas12J-8 protein (DNA sequence: SEQ ID NO: 19), the repeat sequence was synthesized and cloned into the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain plasmid Cas12J-8-PSK-u6-crRNA.
(2-2) Construction of plasmid psk-BbsI-Cas12a-crRNA1
The pBluescriptSKII+U6-sgRNA(F+E)empty plasmid was digested with the restriction endonucleases BbsI and XhoI. The digestion system was: 1 μg of plasmid psk-BbsI-Sasg, 5 μL of 10× CutSmart buffer (NEB), 1 μL of BbsI and 1 μL of XhoI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 1 hour.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 3296 bp DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
Based on the repeats in the genomes associated with the Cas12a proteins, the truncated repeat sequences (DNA sequences: SEQ ID NO: 20 and SEQ ID NO: 21, respectively) were synthesized and cloned into the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain plasmid psk-BbsI-Cas12a-crRNA1.
(2-3) Construction of plasmid hU6-OQB30769_tracr-Bsa1
The pX330_sgACTA2 plasmid (Addgene, catalog #63712) was digested with the restriction endonucleases BsaI and NotI. The digestion system was: 1 μg of plasmid hU6-sa-tracr-BsaI, 5 μL of 10× CutSmart buffer (NEB), 1 μL of BsaI and 1 μL of NotI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 3 hours.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 2998 bp DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
The repeat and tracr were identified in the ChCas12b genome, and the RNA scaffold sequence (DNA sequence: SEQ ID NO: 22) was inferred from the secondary structure. This sequence was synthesized and cloned into the linearized hU6-sa-tracr-BsaI backbone to obtain plasmid hU6-OQB30769_tracr-Bsa1.
(3) Construction of the plasmid vector pAAV2_Cas12-hU6-sgRNA_ITR
The pAAV2_Cas12_ITR plasmids expressing the Cas12 proteins from (1), and the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids expressing the sgRNA corresponding to each protein from (2), were linearized by PCR.
For the pAAV2_Cas12_ITR plasmid, the primer sequences were:
ATCATGGGAAATAGGCCCTCAGGTACCTCCCCAGCATGC; and
CGAGGGGGGGCCCGGTACATCATGGGAAATAGGCCCTC;
For the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids, the primer sequences were:
GAGGGCCTATTTCCCATGAT; and
GTACCGGGCCCCCCCTCG.
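Overlap-based assembly generally relies on the two PCR products sharing terminal homology. One way to sanity-check the primers listed above is to test whether a vector primer begins with the reverse complement of an sgRNA-cassette primer. The sketch below does this; the helper names `reverse_complement` and `overlap_with` are hypothetical, and reading the 5' portions of the vector primers as homology arms is an interpretation rather than something the example states.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overlap_with(vector_primer: str, insert_primer: str) -> bool:
    """True if vector_primer starts with the reverse complement of insert_primer."""
    return vector_primer.startswith(reverse_complement(insert_primer))


# Primer sequences copied from the example text.
VECTOR_PRIMERS = (
    "ATCATGGGAAATAGGCCCTCAGGTACCTCCCCAGCATGC",
    "CGAGGGGGGGCCCGGTACATCATGGGAAATAGGCCCTC",
)
INSERT_PRIMERS = (
    "GAGGGCCTATTTCCCATGAT",
    "GTACCGGGCCCCCCCTCG",
)

if __name__ == "__main__":
    for vp in VECTOR_PRIMERS:
        for ip in INSERT_PRIMERS:
            print(vp, ip, overlap_with(vp, ip))
```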
The reaction system was as follows:
Figure PCTCN2022096002-appb-000001
The PCR program was as follows:
Figure PCTCN2022096002-appb-000002
The PCR products were electrophoresed on a 1% agarose gel at 120 V for 30 min, and the target DNA fragments were purified with a gel extraction kit according to the manufacturer's protocol. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragments were either set aside for use or stored at -20 °C for long-term storage.
The linearized pAAV2_Cas12_ITR fragments were combined with the corresponding linearized Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 fragments by homologous recombination, in the ratio specified in the product instructions. The homologous recombination enzyme used was [Figure PCTCN2022096002-appb-000003] High-Fidelity DNA Assembly Master Mix (NEB). The reaction system was as follows:
Figure PCTCN2022096002-appb-000004
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000005
Figure PCTCN2022096002-appb-000006
The ligation product was added to E. coli DH5α competent cells (Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42 °C for 1 min and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37 °C for 1 hour to allow recovery.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and cultured inverted in a 37 °C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to be correctly assembled were grown in shaking culture, and the plasmid was extracted to obtain plasmid pAAV2_Cas12-hU6-sgRNA_ITR for later use.
(4) Preparation of linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR
Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested with the restriction endonuclease BbsI. The digestion system was: 1 μg of plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL of 10× CutSmart buffer (NEB) and 1 μL of BbsI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 1 hour.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water. The DNA fragment is the linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR containing the gene encoding the respective Cas12 protein, with sizes of 7135 bp (Cas12J-8), 7866 bp (Mb4Cas12a), 7875 bp (MlCas12a), 7998 bp (MoCas12a), 7875 bp (BgCas12a) and 8606 bp (ChCas12b).
The DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was either set aside for use or stored at -20 °C for long-term storage.
(5) Preparation of plasmid pAAV2_Cas12-hU6-sgRNA_ITR
The gRNAs were designed; their sequences are shown in Table 2. For each designed gRNA sequence, the sticky-end sequences matching the two ends of the linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR were added to the corresponding sense and antisense strands, and two single-stranded DNA oligonucleotides were synthesized. The specific sequences of these two oligonucleotides are also shown in the table below.
Figure PCTCN2022096002-appb-000007
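The oligonucleotide design described above, in which a sticky-end sequence is added to the sense strand and another to the antisense strand, can be sketched as follows. The actual overhangs and spacers are given only in the table image, so the overhangs `CACC` and `AAAC` and the spacer `GATTACA` used here are purely hypothetical placeholders, as is the function name `design_oligos`.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(spacer: str, sense_overhang: str, antisense_overhang: str):
    """Build the two oligos for a spacer, both written 5' to 3'.

    The overhang arguments stand in for the sticky ends left on the
    linearized plasmid; their real sequences are not stated in the text.
    """
    sense = sense_overhang + spacer.upper()
    antisense = antisense_overhang + reverse_complement(spacer)
    return sense, antisense


if __name__ == "__main__":
    # Hypothetical spacer and overhangs, for illustration only.
    oligo_f, oligo_r = design_oligos("GATTACA", "CACC", "AAAC")
    print(oligo_f)
    print(oligo_r)
```

After annealing, the spacer portions of the two oligos pair with each other and the overhangs remain single-stranded for ligation.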
The single-stranded DNA oligonucleotides were annealed to obtain double-stranded DNA. The annealing reaction system was: 1 μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R and 28 μL of water. The annealing mixture was vortexed, placed in a PCR instrument and subjected to the following annealing program: 95 °C for 5 min, 85 °C for 1 min, 75 °C for 1 min, 65 °C for 1 min, 55 °C for 1 min, 45 °C for 1 min, 35 °C for 1 min, 25 °C for 1 min, then hold at 4 °C, with a cooling rate of 0.3 °C/s. After annealing, the product was ligated with DNA ligase (NEB) into the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid obtained in step (4).
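The annealing system and step-down program can be expressed as a short calculation. The sketch below derives the final oligonucleotide concentration and lists the temperature steps; the function names are hypothetical, and the ramp-time estimate assumes the 0.3 °C/s cooling rate applies to every 10 °C transition, which the example does not state explicitly.

```python
def final_concentration_um(stock_um: float, stock_ul: float, total_ul: float) -> float:
    """Final concentration after diluting stock_ul of stock into total_ul."""
    return stock_um * stock_ul / total_ul


def annealing_steps(start_c=95, end_c=25, step_c=10, first_hold_min=5, hold_min=1):
    """Return (temperature in C, hold in min) pairs for the step-down program."""
    steps = []
    temp = start_c
    while temp >= end_c:
        hold = first_hold_min if temp == start_c else hold_min
        steps.append((temp, hold))
        temp -= step_c
    return steps


if __name__ == "__main__":
    total_ul = 1 + 1 + 28  # oligo-F + oligo-R + water
    print(f"each oligo: {final_concentration_um(100, 1, total_ul):.2f} uM")
    steps = annealing_steps()
    for temp, hold in steps:
        print(f"{temp} C for {hold} min")
    # Assumed: each 10 C drop is ramped at 0.3 C/s.
    ramp_s = (len(steps) - 1) * 10 / 0.3
    print(f"hold time: {sum(h for _, h in steps)} min, ramps: about {ramp_s:.0f} s")
```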
1 μL of the ligation product was added to E. coli DH5α competent cells (Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42 °C for 1 min and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37 °C for 1 hour to allow recovery.
The recovered E. coli DH5α cells were spread on LB agar plates containing the corresponding antibiotic and cultured inverted in a 37 °C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to be correctly ligated were grown in shaking culture, and the plasmid was extracted to obtain plasmid pAAV2_Cas12-hU6-sgRNA_ITR expressing the target sgRNA sequence, for later use.
(6) Transfection of the HEK293T cell line with plasmid pAAV2_Cas12-hU6-sgRNA_ITR expressing the Cas protein and sgRNA
On day 0, HEK293T cells containing the target sequence were seeded in 6-well plates as required for transfection, at a cell density of approximately 30%.
On day 1, transfection was performed as follows:
2 μg of the plasmid to be transfected, pAAV2_Cas12-hU6-sgRNA_ITR, was added to 100 μL of Opti-MEM medium (Gibco) and mixed by gentle pipetting.
The liposome transfection reagent [Figure PCTCN2022096002-appb-000008] 2000 (Invitrogen) or polyethyleneimine (hereinafter PEI; Polysciences) was mixed by flicking the tube. Then 5 μL of [Figure PCTCN2022096002-appb-000009] 2000 or PEI was added to 100 μL of Opti-MEM medium (Gibco), mixed gently and left to stand at room temperature for 5 min.
The diluted transfection reagent and the diluted plasmid were combined, mixed by gentle pipetting and left to stand at room temperature for 20 min. The mixture was then added to the culture medium of the HEK293T cells to be transfected, and the cells were cultured in a 37 °C, 5% CO2 incubator for a further 3 days.
(7) Preparation of the next-generation sequencing library
HEK293T cells were collected three days after editing, and genomic DNA was extracted with a DNA kit (Tiangen Biotech (Beijing) Co., Ltd., DP304) according to the kit instructions.
The first round of library-construction PCR was performed with 2× Q5 Master Mix, using the following PCR primers:
Table 3. First-round PCR primers for next-generation sequencing
Figure PCTCN2022096002-appb-000010
The reaction system was as follows:
Figure PCTCN2022096002-appb-000011
Figure PCTCN2022096002-appb-000012
The PCR program was as follows:
Figure PCTCN2022096002-appb-000013
The second round of library-construction PCR was performed with 2× Q5 Master Mix, using the following PCR primers:
F2 primer:
Figure PCTCN2022096002-appb-000014
R2 primer:
Figure PCTCN2022096002-appb-000015
The reaction system was as follows:
Figure PCTCN2022096002-appb-000016
The PCR program was as follows:
Figure PCTCN2022096002-appb-000017
The second-round PCR products were purified with a gel extraction kit according to the manufacturer's protocol to recover DNA fragments of 330 bp, 327 bp, 279 bp, 239 bp, 311 bp and 298 bp, where 330 bp and 327 bp are the sizes for sites A1 and A7, 279 bp and 239 bp are the sizes for sites E2 and E3, and 311 bp and 298 bp are the sizes for sites A3 and A4, respectively. This completed the preparation of the next-generation sequencing library.
(8) Analysis of next-generation sequencing results
The prepared next-generation sequencing library was subjected to paired-end sequencing on a HiSeq X Ten high-throughput sequencer (Illumina).
The editing efficiencies at the two target sites for each system, calculated from the next-generation sequencing data, are shown in Figures 1 to 6, where the X axis represents the target site and the Y axis represents the editing efficiency (Indels%). The figures show that gene editing systems containing the Cas12J-8, Mb4Cas12a, MoCas12a, BgCas12a, MlCas12a or ChCas12b protein can all be used for gene editing in cells, and that the gene editing system containing the Cas12J-8 protein has higher editing activity than the existing gene editing system based on the Cas12J-2 protein.
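The example does not describe how Indels% was computed from the reads. As a rough illustration only, the sketch below treats any read whose length differs from the reference amplicon as carrying an indel and reports the percentage of such reads. This length-based rule and the function name `indel_percent` are assumptions; a real analysis would align reads to the reference and inspect the region around the cut site.

```python
def indel_percent(reads, reference):
    """Percentage of reads whose length differs from the reference amplicon.

    A deliberately naive stand-in for alignment-based indel calling:
    substitutions are ignored, and any length change counts as an indel.
    """
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for read in reads if len(read) != len(reference))
    return 100.0 * edited / len(reads)


if __name__ == "__main__":
    # Toy data: two unedited reads, one deletion, one insertion.
    reference = "ACGTACGT"
    reads = ["ACGTACGT", "ACGTACGT", "ACGACGT", "ACGTTACGT"]
    print(indel_percent(reads, reference))
```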
Example 2
(1) Construction of plasmid pAAV2_Cas12_ITR
The amino acid sequence information of each Cas12 protein was downloaded according to the accession number listed in Table 1 above. The amino acid sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively.
The nucleic acid sequences encoding the above Cas12 proteins were codon-optimized to obtain gene sequences for high expression of the Cas proteins in human cells. The gene sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 8 to SEQ ID NO: 13, respectively.
The high-expression gene sequences of the Cas proteins shown in SEQ ID NO: 8 to SEQ ID NO: 13 were synthesized and cloned into the slugCas9 backbone plasmid (Addgene, catalog #163793) to obtain plasmid pAAV2_Cas12_ITR.
(2-1) Construction of plasmid Cas12J-8-PSK-u6-crRNA
The pBluescriptSKII+U6-sgRNA(F+E)empty plasmid (Addgene, commercially available, catalog #74707) was digested with the restriction endonucleases BbsI and XhoI. The digestion system was: 1 μg of plasmid psk-BbsI-Sasg, 5 μL of 10× CutSmart buffer (NEB), 1 μL of BbsI and 1 μL of XhoI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 1 hour.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 3296 bp DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
Based on the repeat sequence in the genome associated with the Cas12J-8 protein (DNA sequence: SEQ ID NO: 19), the repeat sequence was synthesized and cloned into the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain plasmid Cas12J-8-PSK-u6-crRNA.
(2-2) Construction of plasmid psk-BbsI-Cas12a-crRNA1
The pBluescriptSKII+U6-sgRNA(F+E)empty plasmid was digested with the restriction endonucleases BbsI and XhoI. The digestion system was: 1 μg of plasmid psk-BbsI-Sasg, 5 μL of 10× CutSmart buffer (NEB), 1 μL of BbsI and 1 μL of XhoI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 1 hour.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 3296 bp DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
Based on the repeats in the genomes associated with the Cas12a proteins, the truncated repeat sequences (DNA sequences: SEQ ID NO: 20 and SEQ ID NO: 21, respectively) were synthesized and cloned into the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain plasmid psk-BbsI-Cas12a-crRNA1.
(2-3) Construction of plasmid hU6-OQB30769_tracr-Bsa1
The pX330_sgACTA2 plasmid (Addgene, catalog #63712) was digested with the restriction endonucleases BsaI and NotI. The digestion system was: 1 μg of plasmid hU6-sa-tracr-BsaI, 5 μL of 10× CutSmart buffer (NEB), 1 μL of BsaI and 1 μL of NotI (NEB), made up to 50 μL with water. The digestion was carried out at 37 °C for 3 hours.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 2998 bp DNA fragment was excised from the agarose gel and recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
The repeat and tracr were identified in the ChCas12b genome, and the RNA scaffold sequence (DNA sequence: SEQ ID NO: 22) was inferred from the secondary structure. This sequence was synthesized and cloned into the linearized hU6-sa-tracr-BsaI backbone to obtain plasmid hU6-OQB30769_tracr-Bsa1.
(3) Construction of the plasmid vector pAAV2_Cas12-hU6-sgRNA_ITR
The pAAV2_Cas12_ITR plasmids expressing the Cas12 proteins from (1), and the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids expressing the sgRNA corresponding to each protein from (2), were linearized by PCR.
For the pAAV2_Cas12_ITR plasmid, the primer sequences were:
ATCATGGGAAATAGGCCCTCAGGTACCTCCCCAGCATGC; and
CGAGGGGGGGCCCGGTACATCATGGGAAATAGGCCCTC;
For the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids, the primer sequences were:
GAGGGCCTATTTCCCATGAT; and
GTACCGGGCCCCCCCTCG.
反应体系如下:The reaction system is as follows:
Figure PCTCN2022096002-appb-000018
Figure PCTCN2022096002-appb-000018
PCR运行程序如下:The PCR running procedure is as follows:
Figure PCTCN2022096002-appb-000019
Figure PCTCN2022096002-appb-000019
PCR产物在1%琼脂糖凝胶上以120V电压电泳30min,用胶回收试剂盒依据厂家提供的步骤,纯化得到目的DNA片段,用NanoDrop TM Lite分光光度计(Thermo Scientific)测定DNA浓度,备用或置于-20℃进行长期保存。 The PCR product was electrophoresed on 1% agarose gel with 120V voltage for 30min, and the gel recovery kit was used to purify the target DNA fragment according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop TM Lite spectrophotometer (Thermo Scientific). Store at -20°C for long-term storage.
The linearized pAAV2_Cas12_ITR fragments were joined to the corresponding linearized Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 fragments by homologous recombination, at the ratio specified in the manufacturer's instructions. The homologous recombination enzyme used was [Figure PCTCN2022096002-appb-000020] High-Fidelity DNA Assembly Master Mix (NEB). The reaction system was as follows:
Figure PCTCN2022096002-appb-000021
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000022
The ligation product was added to Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min, and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37°C for 1 hour to allow the competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct construct were cultured with shaking and the plasmid was extracted to obtain plasmid pAAV2_Cas12-hU6-sgRNA_ITR, which was kept for later use.
(4) Preparation of linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR
Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was linearized by digestion with BbsI restriction endonuclease. The digestion system was: 1 μg of plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL of 10× CutSmart buffer (purchased from NEB), 1 μL of BbsI restriction endonuclease (purchased from NEB), and water to 50 μL. The digestion was carried out at 37°C for 1 hour.
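The digestion system specifies the plasmid by mass, so the plasmid and water volumes depend on the concentration of the plasmid prep. The sketch below shows that arithmetic under the stated 50 μL total, 5 μL of 10× buffer and 1 μL of enzyme. The 200 ng/μL concentration in the example is a hypothetical value, not one given in the text.

```python
# Sketch: volumes for a 50 uL restriction digest as described in the text.
# The plasmid stock concentration is an assumed input, not a value from the source.

def digestion_mix(dna_ug: float, dna_conc_ng_per_ul: float,
                  buffer_ul: float = 5.0, enzyme_ul: float = 1.0,
                  total_ul: float = 50.0) -> dict:
    """Return the volume (uL) of each component of the digest."""
    dna_ul = dna_ug * 1000.0 / dna_conc_ng_per_ul
    water_ul = total_ul - buffer_ul - enzyme_ul - dna_ul
    if water_ul < 0:
        raise ValueError("plasmid stock too dilute for the requested total volume")
    return {
        "plasmid_ul": dna_ul,
        "buffer_10x_ul": buffer_ul,
        "enzyme_ul": enzyme_ul,
        "water_ul": water_ul,
    }


if __name__ == "__main__":
    # 1 ug of plasmid from a hypothetical 200 ng/uL stock
    print(digestion_mix(1.0, 200.0))
```

With 5 μL of 10× buffer in 50 μL, the buffer ends up at 1× regardless of the plasmid concentration.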
The digestion products were then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragments were excised from the agarose gel and recovered with a gel recovery kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water. These DNA fragments are the linearized plasmids pAAV2_Cas12_ITR containing the coding genes of the respective Cas proteins, with sizes of 7135 bp (Cas12J-8 protein), 7866 bp (Mb4Cas12a protein), 7875 bp (MlCas12a protein), 7998 bp (MoCas12a protein), 7875 bp (BgCas12a) and 8606 bp (ChCas12b).
The DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The plasmid was used directly or stored at -20°C for long-term storage.
(5) Preparation of plasmid pAAV2_Cas12-U6-on target sgRNA or pAAV2_Cas12-U6-mismatch sgRNA
The on-target gRNA and mismatch gRNA sequences were designed; their corresponding single-stranded oligonucleotide DNAs are shown in Table 4 below, in which the mismatched bases are shown in bold and underlined.
The single-stranded oligonucleotide DNAs corresponding to the on-target gRNA and to each mismatch gRNA were annealed separately. The annealing reaction system was: 1 μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R, and 28 μL of water. After mixing by vortexing, the annealing system was placed in a PCR instrument and the following annealing program was run: 95°C for 5 min, 85°C for 1 min, 75°C for 1 min, 65°C for 1 min, 55°C for 1 min, 45°C for 1 min, 35°C for 1 min, 25°C for 1 min, then hold at 4°C, with a cooling rate of 0.3°C/s. After annealing, each product was ligated into the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid with DNA ligase (purchased from NEB).
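The annealing program and reaction system above can be written down as data to make the derived quantities explicit: the final concentration of each oligo, the total hold time, and the time spent ramping. This is a rough sketch only. It assumes the 0.3°C/s rate applies to every temperature change, including the final drop to 4°C, and all names are illustrative.

```python
# Sketch: step-down annealing program from the text, expressed as data.
# Assumption: the 0.3 C/s ramp applies between every pair of consecutive temperatures.

ANNEAL_STEPS = [  # (temperature in C, hold time in minutes)
    (95, 5), (85, 1), (75, 1), (65, 1), (55, 1), (45, 1), (35, 1), (25, 1),
]
FINAL_HOLD_C = 4
RAMP_C_PER_S = 0.3


def hold_minutes(steps) -> int:
    """Total programmed hold time, excluding ramps and the final 4 C hold."""
    return sum(minutes for _, minutes in steps)


def ramp_seconds(steps, rate: float, final_hold_c: float) -> float:
    """Time spent changing temperature at a constant ramp rate."""
    temps = [t for t, _ in steps] + [final_hold_c]
    return sum(abs(a - b) for a, b in zip(temps, temps[1:])) / rate


def final_oligo_um(stock_um: float = 100.0, oligo_ul: float = 1.0,
                   total_ul: float = 30.0) -> float:
    """Final concentration of each oligo in the 1 + 1 + 28 uL reaction."""
    return stock_um * oligo_ul / total_ul


if __name__ == "__main__":
    print("hold time (min):", hold_minutes(ANNEAL_STEPS))
    print("ramp time (s):", round(ramp_seconds(ANNEAL_STEPS, RAMP_C_PER_S, FINAL_HOLD_C), 1))
    print("each oligo (uM):", round(final_oligo_um(), 2))
```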
1 μL of the ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min, and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37°C for 1 h to allow the competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing the corresponding antibiotic and incubated inverted in a 37°C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct construct were cultured with shaking and the plasmids were extracted, giving plasmid pAAV2_Cas12-hU6-on target gRNA expressing the on-target gRNA sequence and plasmids pAAV2_Cas12-hU6-mismatch gRNA expressing the different mismatch gRNA sequences, which were kept for later use.
(7) The resulting plasmid pAAV2_Cas12-hU6-on target gRNA expressing the on-target gRNA sequence and the plasmids pAAV2_Cas12-U6-mismatch gRNA expressing the mismatch gRNA sequences were each transfected by lipofection into the GFP reporter HEK293T cell line containing the target sequence (GGATATGTTGAAGAACACCATGAC).
Figure PCTCN2022096002-appb-000023
Figure PCTCN2022096002-appb-000024
Figure PCTCN2022096002-appb-000025
The GFP reporter HEK293T cell line containing the target sequence was obtained as follows: a PAM sequence and a specific target sequence were inserted between the start codon ATG and the GFP coding sequence, causing a frameshift in GFP, and the construct was then integrated into HEK293T cells by lentiviral infection. When the gene editing system cuts the target sequence, the cell's own repair machinery restores the GFP reading frame in a fraction of cells, which then produce green fluorescence. The editing ability and specificity of the gene editing system can therefore be evaluated by measuring the proportion of GFP-positive cells by flow cytometry.
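The logic of the frameshift reporter can be illustrated with simple modular arithmetic: if the sequence inserted between ATG and GFP is not a multiple of 3 nt long, only indels whose net length brings the total back to a multiple of 3 can restore the reading frame. The sketch below assumes a hypothetical 4-nt PAM in front of the 24-nt target (28 nt in total). The actual insert length in the reporter is not stated in the text, and the sketch ignores stop codons created by the repair.

```python
# Sketch: which indel sizes could restore the GFP reading frame.
# The insert length is an assumption (hypothetical 4-nt PAM + 24-nt target);
# stop codons introduced by the repair are not considered.

TARGET = "GGATATGTTGAAGAACACCATGAC"   # target site from the text (24 nt)
ASSUMED_PAM_LEN = 4                   # hypothetical PAM length


def restores_frame(insert_len: int, indel: int) -> bool:
    """True if an indel of net length `indel` (negative = deletion) leaves the
    sequence between ATG and GFP a multiple of 3 nt long."""
    return (insert_len + indel) % 3 == 0


if __name__ == "__main__":
    insert_len = ASSUMED_PAM_LEN + len(TARGET)
    print("insert length:", insert_len, "in frame:", insert_len % 3 == 0)
    for indel in range(-6, 4):
        if indel != 0:
            print(f"indel {indel:+d}: frame restored = {restores_frame(insert_len, indel)}")
```

Because only a subset of repair outcomes restores the frame, the GFP-positive fraction is expected to underestimate the total editing rate.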
The transfection procedure comprised the following steps:
On day 0, the GFP reporter HEK293T cell line containing the target sequence was seeded in 6-well plates as required for transfection, with the cell density controlled at 30%.
The GFP reporter HEK293T cell line containing the target sequence carries the nucleotide sequence CMV-ATG-PAM-target site-GFP, in which the PAM sequences are shown in Figures 7 to 13 and the sequence of the target site is GGATATGTTGAAGAACACCATGAC.
On day 1, transfection was performed as follows:
2 μg of the plasmid pAAV2_Cas12-U6-on target gRNA or 2 μg of the plasmid pAAV2_Cas12-U6-mismatch gRNA to be transfected was added to 100 μL of Opti-MEM medium (purchased from Gibco) and mixed by gentle pipetting.
The transfection reagent [Figure PCTCN2022096002-appb-000026] 2000 (purchased from Invitrogen) or PEI (purchased from Polysciences) was mixed by gently flicking the tube, and 5 μL of [Figure PCTCN2022096002-appb-000027] 2000 or PEI was added to 100 μL of Opti-MEM medium, mixed gently, and left at room temperature for 5 min.
The diluted plasmid and the diluted transfection reagent were combined and mixed by gentle pipetting. The mixture was left at room temperature for 20 min, then added to the culture medium of the GFP reporter HEK293T cell line containing the target sequence, and the cells were returned to an incubator at 37°C and 5% CO2 for further culture.
Flow cytometry was used to analyze the editing efficiency and off-target rate of the CRISPR gene editing system of the present invention at the target sequence.
Specifically, the HEK293T cells were collected after 3 days of culture in the CO2 incubator and analyzed on a flow cytometer (BD Biosciences FACSCalibur) to assess specificity. The proportion of GFP-positive cells was analyzed and plotted with FlowJo software.
The specificity results for the CRISPR/Cas12 gene editing system of the present invention in the GFP reporter HEK293T cell line containing the target sequence are shown in Figures 7 to 13. The bar at the top of each figure is a schematic of the GFP reporter system: a specific PAM sequence and target sequence are inserted between the start codon ATG and the GFP coding sequence, causing a frameshift in GFP. When the gene editing system cuts the target sequence, the cell's own repair machinery therefore restores the GFP reading frame in a fraction of cells, which produce green fluorescence. In the bar charts at the bottom of Figures 7 to 13, the Y axis shows the percentage of GFP-positive cells (%) and the X axis shows the single-stranded oligonucleotide DNA sequences corresponding to the on-target gRNA and the mismatch gRNAs. As Figures 7 to 13 show, the CRISPR gene editing system of the present invention edited the target site in the GFP reporter HEK293T cell line in every case, and the proportion of editing mediated by mismatch gRNAs was significantly lower than that mediated by the on-target gRNA. This indicates that the CRISPR gene editing system of the present invention has high editing activity, a low off-target rate and high specificity. In the results for the CRISPR/Cas12J-8 gene editing system, no obvious tolerance of single-base mismatches was observed within the first 14 bp, indicating that the CRISPR/Cas12J-8 gene editing system places a very strict requirement on complete pairing between the gRNA and the target sequence, with low mismatch tolerance and correspondingly higher safety in practical use.
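One simple way to summarize such a comparison is to express the GFP-positive percentage obtained with each mismatch gRNA relative to the on-target gRNA. The following sketch illustrates that calculation only. The percentages are made-up placeholders, not data from Figures 7 to 13, and the text does not state that this particular metric was used.

```python
# Sketch: mismatch-gRNA activity relative to the on-target gRNA.
# All numbers below are hypothetical placeholders, not measured values.

def relative_activity(on_target_pct: float, mismatch_pct: dict) -> dict:
    """Return GFP-positive % of each mismatch gRNA divided by the on-target %."""
    if on_target_pct <= 0:
        raise ValueError("on-target GFP-positive percentage must be positive")
    return {name: pct / on_target_pct for name, pct in mismatch_pct.items()}


if __name__ == "__main__":
    example = relative_activity(
        on_target_pct=20.0,                      # hypothetical
        mismatch_pct={"mm_pos1": 1.0, "mm_pos20": 10.0},  # hypothetical
    )
    for name, ratio in example.items():
        print(f"{name}: {ratio:.2f} of on-target activity")
```

A ratio near 0 means the mismatch abolishes editing at that position, while a ratio near 1 means the mismatch is tolerated.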
Example 3
(1) Preparation of linearized plasmid SlugABEmax
PCR was performed with the SlugABEmax plasmid (Addgene, catalog #163798) as template, using the following primers:
Primer 1: TCTGGTGGTTCTCCCAAGAAGA
Primer 2: TGACCCCCCGCTGCTGCCCC
The reaction system was as follows:
Figure PCTCN2022096002-appb-000028
Figure PCTCN2022096002-appb-000029
The PCR program was as follows:
Figure PCTCN2022096002-appb-000030
The PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min. A 4152 bp DNA fragment was purified with a gel recovery kit following the manufacturer's protocol, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was used directly or stored at -20°C for long-term storage.
(2) Preparation of plasmid pAAV2_envTadA-Cas12J-8_ITR
The linearized SlugABEmax backbone fragment and the commercially synthesized humanized Cas12J-8 fragment (SEQ ID NO: 8) were joined by homologous recombination at the ratio specified in the manufacturer's instructions. The homologous recombination enzyme used was [Figure PCTCN2022096002-appb-000031] High-Fidelity DNA Assembly Master Mix (NEB). The reaction system was as follows:
Figure PCTCN2022096002-appb-000032
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000033
The ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min, and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37°C for 1 hour to allow the competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct construct were cultured with shaking and the plasmid was extracted to obtain plasmid pAAV2_envTadA-Cas12J-8_ITR, which was kept for later use.
(3) Preparation of plasmid pAAV2_envTadA-dCas12J-8_ITR
Circular (whole-plasmid) PCR was performed with pAAV2_envTadA-Cas12J-8_ITR as template, using the following primers:
Primer 3: CAACCTGGTGAAAAAGAACAACTTC
Primer 4: GCGATGCCGATCACATCGCACA
The reaction system was as follows:
Figure PCTCN2022096002-appb-000034
The PCR program was as follows:
Figure PCTCN2022096002-appb-000035
Figure PCTCN2022096002-appb-000036
The PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min. A 6305 bp DNA fragment was purified with a gel recovery kit following the manufacturer's protocol, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was then treated with T4 PNK followed by T4 DNA ligase. The reaction system was as follows:
Figure PCTCN2022096002-appb-000037
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000038
1 μL of T4 DNA ligase (NEB) was added to the reaction, which was mixed by vortexing and incubated at room temperature for 2 h.
The ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min, and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37°C for 1 hour to allow the competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct construct were cultured with shaking and the plasmid was extracted to obtain plasmid pAAV2_envTadA-dCas12J-8_ITR, which was kept for later use.
(5) Preparation of linearized pAAV2_envTadA-dCas12J-8_ITR
The pAAV2_envTadA-dCas12J-8_ITR plasmid was digested with the restriction endonucleases Kpn1 and Not1 (NEB). The reaction system was: 2 μg of plasmid pAAV2_envTadA-dCas12J-8_ITR, 5 μL of 10× CutSmart buffer (purchased from NEB), 1 μL of Kpn1 restriction endonuclease (purchased from NEB), 1 μL of Not1 restriction endonuclease (purchased from NEB), and water to 50 μL. The digestion was carried out at 37°C for 2 hours.
The digestion products were then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel and recovered with a gel recovery kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
The DNA concentration of the recovered linearized fragment pAAV2_envTadA-dCas12J-8_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was used directly or stored at -20°C for long-term storage.
(6) Preparation of plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR
PCR was performed with Cas12J-8-PSK-u6-crRNA as template, using the following primers:
Primer 5: GGAGGTACCGATCCGACGCGCCATCTCTAG
Primer 6: CCTGCGGCCGCGGGCCCCCCCTCGAAAAAAAAAC
The reaction system was as follows:
Figure PCTCN2022096002-appb-000039
The PCR program was as follows:
Figure PCTCN2022096002-appb-000040
Figure PCTCN2022096002-appb-000041
The PCR product was electrophoresed on a 1.5% agarose gel at 120 V for 30 min. A 394 bp Cas12J-8 crRNA DNA fragment was purified with a gel recovery kit following the manufacturer's protocol, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The fragment was used directly or stored at -20°C for long-term storage.
The linearized pAAV2_envTadA-dCas12J-8_ITR fragment and the Cas12J-8 crRNA fragment were joined by homologous recombination at the ratio specified in the manufacturer's instructions. The homologous recombination enzyme used was [Figure PCTCN2022096002-appb-000042] High-Fidelity DNA Assembly Master Mix (NEB). The reaction system was as follows:
Figure PCTCN2022096002-appb-000043
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000044
The ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min, and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37°C for 1 hour to allow the competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct construct were cultured with shaking and the plasmid was extracted to obtain plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR, which was kept for later use.
(7) Preparation of plasmid pAAV2_envTadA-dCas12J-8-sgRNA_ITR
The pAAV2_envTadA-dCas12J-8-crRNA_ITR plasmid was digested with BbsI restriction endonuclease. The digestion system was: 2 μg of plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR, 5 μL of 10× CutSmart buffer (purchased from NEB), 1 μL of BbsI restriction endonuclease (purchased from NEB), and water to 50 μL. The digestion was carried out at 37°C for 2 hours.
The digestion products were then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel and recovered with a gel recovery kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, with final elution in ultrapure water.
The DNA concentration of the recovered linearized plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The plasmid was used directly or stored at -20°C for long-term storage.
Endogenous target sequences satisfying the PAM requirement of the Cas12J-8 protein were selected at random from the human genome; their corresponding single-stranded oligonucleotide DNAs are shown in the table below.
Figure PCTCN2022096002-appb-000045
The single-stranded oligonucleotide DNAs were annealed to give double-stranded DNA. The annealing reaction system was: 1 μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R, and 28 μL of water. After mixing by vortexing, the annealing system was placed in a PCR instrument and the following annealing program was run: 95°C for 5 min, 85°C for 1 min, 75°C for 1 min, 65°C for 1 min, 55°C for 1 min, 45°C for 1 min, 35°C for 1 min, 25°C for 1 min, then hold at 4°C, with a cooling rate of 0.3°C/s. After annealing, the product was ligated into the linearized pAAV2_envTadA-dCas12J-8-crRNA_ITR vector with DNA ligase (purchased from NEB).
1 μL of the ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min, and incubated on ice for 2 min. 900 μL of LB medium was then added and the cells were cultured at 37°C for 1 hour to allow the competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing the corresponding antibiotic and incubated inverted in a 37°C incubator. The resulting single colonies were verified by Sanger sequencing.
E. coli DH5α clones confirmed by sequencing to carry the correct construct were cultured with shaking and the plasmid was extracted to obtain plasmid pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR expressing the target sgRNA sequence, which was kept for later use.
(8) Transfection of the wild-type HEK293T cell line with the pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR plasmids
The resulting pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR plasmids were each transfected into the wild-type HEK293T cell line by lipofection.
The transfection procedure comprised the following steps:
On day 0, the HEK293T cell line was seeded in 6-well plates as required for transfection, with the cell density controlled at 30%.
On day 1, transfection was performed as follows:
2 μg of the plasmid pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR to be transfected was added to 100 μL of Opti-MEM medium (purchased from Gibco) and mixed by gentle pipetting.
The transfection reagent [Figure PCTCN2022096002-appb-000046] 2000 (purchased from Invitrogen) or PEI (purchased from Polysciences) was mixed by gently flicking the tube, and 5 μL of [Figure PCTCN2022096002-appb-000047] 2000 or PEI was added to 100 μL of Opti-MEM medium, mixed gently, and left at room temperature for 5 min.
The diluted plasmid and the diluted transfection reagent were combined and mixed by gentle pipetting. The mixture was left at room temperature for 20 min, then added to the culture medium of the HEK293T cells to be transfected, and the cells were cultured for a further 7 days in an incubator at 37°C and 5% CO2.
(9) Preparation of the next-generation sequencing library
HEK293T cells were collected seven days after editing, and genomic DNA was extracted with a DNA kit (Tiangen Biotech (Beijing) Co., Ltd., DP304) according to the instructions supplied with the kit.
The first round of library PCR was performed with 2× Q5 Master Mix, using the PCR primers shown in the table below:
Table 6: PCR primers for each endogenous site
Figure PCTCN2022096002-appb-000048
The reaction system was as follows:
Figure PCTCN2022096002-appb-000049
Figure PCTCN2022096002-appb-000050
The PCR program was as follows:
Figure PCTCN2022096002-appb-000051
The second round of library PCR was performed with 2× Q5 Master Mix, using the F2 and R2 primers given in Example 1 above.
The reaction system was as follows:
Figure PCTCN2022096002-appb-000052
The PCR program was as follows:
Figure PCTCN2022096002-appb-000053
The second-round PCR products were purified with a gel recovery kit following the manufacturer's protocol, which completed preparation of the next-generation sequencing library.
(10) Analysis of the next-generation sequencing results
The prepared next-generation sequencing library was subjected to paired-end sequencing on a HiSeq X Ten high-throughput sequencer (Illumina).
The sequencing results were processed to obtain the editing ratio of the adenines (A) meeting the editing requirements within each endogenous target site; the results are shown in Figure 14. As the figure shows, the Cas12J-8 ABE base editor successfully performed single-base editing at these endogenous target sites in cells. Moreover, the Cas12J-8 ABE base editor protein contains only 938 amino acids and can be readily packaged by AAV, which makes application of the CRISPR single-base editor system to gene therapy in living organisms possible.
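The text does not describe how the editing ratio was computed. One common approach for an adenine base editor is to take, at each target adenine, the fraction of reads that carry G instead of A. The sketch below shows that calculation on toy reads that are assumed to be already aligned to the amplicon and in the same orientation; the reads and function name are hypothetical.

```python
# Sketch: A-to-G editing fraction at one position of an amplicon.
# Assumes reads are pre-aligned, same orientation, no indels; reads are toy data.

from collections import Counter


def a_to_g_fraction(reads, position: int) -> float:
    """Fraction of reads covering `position` that carry G at that position."""
    bases = Counter(read[position] for read in reads if len(read) > position)
    covered = sum(bases.values())
    return bases["G"] / covered if covered else 0.0


if __name__ == "__main__":
    toy_reads = ["CCAT", "CCGT", "CCGT", "CCAT"]  # hypothetical aligned reads
    print("A-to-G fraction at position 2:", a_to_g_fraction(toy_reads, 2))
```

A real pipeline would also need read merging, quality filtering and alignment, which are outside the scope of this sketch.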
实施例4Example 4
(1)构建质粒pAAV2_Cas12_ITR(1) Construction of plasmid pAAV2_Cas12_ITR
文末序列表中示出了Cas12J-4、Cas12J-5、Cas12J-7、Cas12J-8和Cas12J-9蛋白的氨基酸序列(分别如SEQ ID NO:23-25、1和26所示)。The amino acid sequences of Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8 and Cas12J-9 proteins are shown in the sequence table at the end of the text (respectively as shown in SEQ ID NO: 23-25, 1 and 26).
将各Cas12蛋白的编码核酸序列进行密码子优化,获得所述Cas12蛋白在人细胞中高表达的基因序列。Cas12J-4、Cas12J-5、Cas12J-7、Cas12J-8和Cas12J-9蛋白的基因序列分别由SEQ ID NO:27-29、8和30所示。The coding nucleic acid sequence of each Cas12 protein is codon-optimized, and the gene sequence of the Cas12 protein highly expressed in human cells is obtained. The gene sequences of Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8 and Cas12J-9 proteins are shown in SEQ ID NO: 27-29, 8 and 30 respectively.
The highly expressed gene sequences of the Cas12 proteins shown in SEQ ID NO: 27-29, 8 and 30 were produced by gene synthesis and each cloned into the slugCas9 backbone plasmid (Addgene, catalog #163793) to give the respective pAAV2_Cas12_ITR plasmids.
(2) Construction of plasmid Cas12J-PSK-u6-crRNA
The pBluescriptSKII+ U6-sgRNA(F+E)empty plasmid (Addgene, commercially available, catalog #74707) was digested with the restriction endonucleases BbsI and XhoI. The digestion system was: 1 μg of plasmid psk-BbsI-Sasg, 5 μL of 10×CutSmart buffer (purchased from NEB), 1 μL of BbsI and 1 μL of XhoI restriction endonuclease (purchased from NEB), made up to 50 μL with water. The digestion was carried out at 37°C for 1 hour.
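The volume of water needed to reach 50 μL depends on the concentration of the plasmid stock, which is not stated. The following sketch shows the arithmetic under an assumed stock concentration; the 500 ng/μL value and the helper name are hypothetical.

```python
def digest_mix(plasmid_ug: float, stock_ng_per_ul: float,
               enzymes_ul: dict[str, float],
               total_ul: float = 50.0, buffer_x: float = 10.0) -> dict[str, float]:
    """Return the pipetting volumes (in μL) for a restriction digest of fixed total volume."""
    dna_ul = plasmid_ug * 1000.0 / stock_ng_per_ul   # μg -> ng, then divide by ng/μL
    buffer_ul = total_ul / buffer_x                   # 10× buffer diluted to 1×
    water_ul = total_ul - dna_ul - buffer_ul - sum(enzymes_ul.values())
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {"dna_ul": dna_ul, "buffer_ul": buffer_ul, **enzymes_ul, "water_ul": water_ul}


# 1 μg plasmid from a hypothetical 500 ng/μL stock, 1 μL each of BbsI and XhoI.
mix = digest_mix(1.0, 500.0, {"BbsI_ul": 1.0, "XhoI_ul": 1.0})
print(mix)
```

The same helper covers the single-enzyme digests in the later steps by passing one enzyme and the corresponding plasmid mass.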
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The 3296 bp DNA fragment was excised from the agarose gel, recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, and finally eluted with ultrapure water.
Based on the repeat sequences in the genomes of the Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8 and Cas12J-9 proteins (DNA sequences shown in SEQ ID NO: 31 to 33, 19 and 34, respectively), the repeat sequences were produced by gene synthesis and each cloned into the linearized pBluescriptSKII+ U6-sgRNA(F+E)empty backbone to give the respective Cas12J-PSK-u6-crRNA plasmids.
(3) Construction of the plasmid pAAV2_Cas12-hU6-sgRNA_ITR vector
The pAAV2_Cas12_ITR plasmids expressing the Cas12 proteins from (1) and the Cas12J-PSK-u6-crRNA plasmids expressing the sgRNA corresponding to each protein from (2) were linearized by PCR.
For the pAAV2_Cas12_ITR plasmids, the primer sequences were:
ATCATGGGAAATAGGCCCTCAGGTACCTCCCCAGCATGC; and
CGAGGGGGGGCCCGGTACATCATGGGAAATAGGCCCTC;
For the Cas12J-PSK-u6-crRNA plasmids, the primer sequences were:
GAGGGCCTATTTCCCATGAT; and
GTACCGGGCCCCCCCTCG.
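Homologous recombination of two PCR products requires that their ends share sequence, which is normally introduced through the 5' portions of the primers. The sketch below is one way to check whether a backbone primer begins with the reverse complement of a cassette primer; the helper names are hypothetical, and the check says nothing about overlap length requirements of any particular assembly kit.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_5prime_homology(primer_a: str, primer_b: str) -> bool:
    """True if primer_a starts with the reverse complement of primer_b."""
    return primer_a.upper().startswith(reverse_complement(primer_b))


# Primer sequences as listed in the text above.
backbone_primers = [
    "ATCATGGGAAATAGGCCCTCAGGTACCTCCCCAGCATGC",
    "CGAGGGGGGGCCCGGTACATCATGGGAAATAGGCCCTC",
]
cassette_primers = ["GAGGGCCTATTTCCCATGAT", "GTACCGGGCCCCCCCTCG"]

for backbone in backbone_primers:
    for cassette in cassette_primers:
        print(backbone[:12], cassette[:12], shares_5prime_homology(backbone, cassette))
```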
The reaction system was as follows:
Figure PCTCN2022096002-appb-000054
Figure PCTCN2022096002-appb-000055
The PCR program was as follows:
Figure PCTCN2022096002-appb-000056
The PCR products were electrophoresed on a 1% agarose gel at 120 V for 30 min, and the target DNA fragments were purified with a gel extraction kit according to the manufacturer's protocol. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragments were used immediately or stored at -20°C for the long term.
The linearized pAAV2_Cas12_ITR fragment and the corresponding linearized Cas12J-PSK-u6-crRNA fragment were joined by homologous recombination at the ratio specified in the product manual. The recombination reagent used was
Figure PCTCN2022096002-appb-000057
High-Fidelity DNA Assembly Master Mix (NEB), and the reaction system was as follows:
Figure PCTCN2022096002-appb-000058
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000059
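Assembly ratios are usually expressed in moles, whereas the spectrophotometer reports mass concentration. A minimal sketch of the conversion, using the standard approximation of 650 g/mol per base pair of double-stranded DNA, is given below. The fragment lengths and the 1:2 ratio are hypothetical; the ratio actually used is the one in the manufacturer's instructions, which the text does not reproduce.

```python
DALTONS_PER_BP = 650.0  # average molar mass of one base pair of dsDNA


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a dsDNA mass in ng to pmol."""
    return mass_ng * 1000.0 / (length_bp * DALTONS_PER_BP)


def pmol_to_ng(amount_pmol: float, length_bp: int) -> float:
    """Convert a dsDNA amount in pmol to ng."""
    return amount_pmol * length_bp * DALTONS_PER_BP / 1000.0


# Hypothetical fragments: 100 ng of a 5000 bp backbone and a 400 bp cassette at a 1:2 molar ratio.
backbone_pmol = ng_to_pmol(100.0, 5000)
cassette_ng = pmol_to_ng(2 * backbone_pmol, 400)
print(round(backbone_pmol, 4), round(cassette_ng, 1))
```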
The ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min. Then 900 μL of LB medium was added and the cells were cultured at 37°C for 1 hour to allow the E. coli DH5α competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator; single E. coli DH5α colonies were verified by Sanger sequencing.
Clones confirmed by sequencing to be correctly assembled were grown in shaking culture and the plasmids were extracted, giving the respective pAAV2_Cas12-hU6-sgRNA_ITR plasmids for later use.
(4) Preparation of linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR
Each pAAV2_Cas12-hU6-sgRNA_ITR plasmid prepared in (3) was linearized by digestion with the restriction endonuclease BbsI. The digestion system was: 1 μg of plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL of 10×CutSmart buffer (purchased from NEB) and 1 μL of BbsI restriction endonuclease (purchased from NEB), made up to 50 μL with water. The digestion was carried out at 37°C for 1 hour.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel, recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, and finally eluted with ultrapure water. This DNA fragment is the linearized plasmid pAAV2_Cas12_ITR containing the coding genes of the Cas proteins described above.
The DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was used immediately or stored at -20°C for the long term.
(5) Preparation of plasmid pAAV2_Cas12-hU6-sgRNA_ITR
A gRNA (GGAUAUGUUGAAGAACACCAUGAC) was designed. The sticky-end sequences matching the two ends of the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid were added to the sense and antisense strands of the designed gRNA sequence, respectively, and two single-stranded DNA oligonucleotides were synthesized. Their sequences are as follows:
Oligo-F: GGATATGTTGAAGAACACCATGAC
Oligo-R: GTCATGGTGTTCTTCAACATATCC
The sticky ends of Oligo-F for Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8 and Cas12J-9 were CGAC, GGAC, AGAC, AGAC and AGAC, respectively, and the sticky end of Oligo-R was AAAA for all Cas12 proteins.
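The oligo design described above can be sketched as follows. The sketch assumes that each sticky-end sequence is placed at the 5' end of its oligonucleotide, which is the usual layout for cloning into a BbsI-cut vector but is not stated explicitly in the text; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_oligos(grna_rna: str, overhang_f: str, overhang_r: str) -> tuple[str, str]:
    """Return (Oligo-F, Oligo-R) with the overhangs prepended at the 5' ends (assumption)."""
    spacer_dna = grna_rna.upper().replace("U", "T")   # RNA spacer written as DNA
    oligo_f = overhang_f + spacer_dna
    oligo_r = overhang_r + reverse_complement(spacer_dna)
    return oligo_f, oligo_r


GRNA = "GGAUAUGUUGAAGAACACCAUGAC"
F_OVERHANGS = {"Cas12J-4": "CGAC", "Cas12J-5": "GGAC", "Cas12J-7": "AGAC",
               "Cas12J-8": "AGAC", "Cas12J-9": "AGAC"}

for protein, overhang in F_OVERHANGS.items():
    print(protein, *build_oligos(GRNA, overhang, "AAAA"))
```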
The single-stranded DNA oligonucleotides were annealed to give double-stranded DNA. The annealing reaction system was: 1 μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R and 28 μL of water. The mixture was vortexed and placed in a PCR thermocycler to run the annealing program: 95°C for 5 min, 85°C for 1 min, 75°C for 1 min, 65°C for 1 min, 55°C for 1 min, 45°C for 1 min, 35°C for 1 min and 25°C for 1 min, followed by a hold at 4°C, with a cooling rate of 0.3°C/s. After annealing, the product was ligated with DNA ligase (purchased from NEB) into the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid obtained in step (2).
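For orientation, the arithmetic behind the annealing mix and program can be written out as below. The sketch assumes complete annealing, so the duplex concentration equals the final concentration of each oligonucleotide, and it counts only the ramps between the listed hold temperatures, not the final cooling to 4°C.

```python
# (temperature in °C, hold time in minutes) as listed in the annealing program
ANNEALING_STEPS = [(95, 5), (85, 1), (75, 1), (65, 1), (55, 1), (45, 1), (35, 1), (25, 1)]
RAMP_C_PER_S = 0.3


def duplex_concentration_um(oligo_ul: float, stock_um: float, water_ul: float) -> float:
    """Final duplex concentration in μM, assuming both oligos anneal completely."""
    total_ul = 2 * oligo_ul + water_ul
    return oligo_ul * stock_um / total_ul


def program_minutes(steps: list[tuple[int, int]], ramp_c_per_s: float) -> float:
    """Total hold time plus ramp time between consecutive hold temperatures, in minutes."""
    hold_min = sum(minutes for _, minutes in steps)
    ramp_s = sum(abs(a[0] - b[0]) for a, b in zip(steps, steps[1:])) / ramp_c_per_s
    return hold_min + ramp_s / 60.0


duplex_um = duplex_concentration_um(1.0, 100.0, 28.0)
run_min = program_minutes(ANNEALING_STEPS, RAMP_C_PER_S)
print(round(duplex_um, 2), round(run_min, 1))
```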
1 μL of the ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min. Then 900 μL of LB medium was added and the cells were cultured at 37°C for 1 hour to allow the E. coli DH5α competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing the corresponding antibiotic and incubated inverted in a 37°C incubator; single E. coli DH5α colonies were verified by Sanger sequencing.
Clones confirmed by sequencing to be correctly ligated were grown in shaking culture and the plasmids were extracted, giving the pAAV2_Cas12-hU6-sgRNA_ITR plasmids that express the target sgRNA sequence, for later use.
(6) The resulting gRNA-expressing pAAV2_Cas12-hU6-sgRNA_ITR plasmids were each transfected by lipofection into a GFP reporter HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC).
The GFP reporter HEK293T cell line library containing the target sequence was obtained as follows. A 5 bp random sequence (serving as the PAM sequence) and a 24 bp protospacer (serving as the target sequence) were inserted between the start codon ATG and the GFP coding sequence, causing a frameshift so that GFP is not expressed. The GFP gene carrying this insert was placed under the control of the CMV promoter and cloned into a lentiviral expression vector. The sequence was then randomly integrated into the genome of HEK293T cells by lentiviral transduction, yielding a stable GFP reporter cell line library. After a gene editing system cleaves the target sequence, the cell's own repair machinery restores the GFP reading frame in a fraction of the cells, which then fluoresce green. The editing capability and specificity of the gene editing system can be assessed by measuring the proportion of GFP-positive cells by flow cytometry.
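The frameshift logic of the reporter can be made concrete with a little arithmetic. The sketch below assumes that the 5 bp PAM and the 24 bp protospacer are the only extra bases between ATG and an otherwise in-frame GFP coding sequence, and that only the net indel length matters (it ignores the possibility that a repair product introduces a stop codon).

```python
PAM_LEN = 5
PROTOSPACER_LEN = 24
INSERT_LEN = PAM_LEN + PROTOSPACER_LEN   # 29 bp, not a multiple of 3, so GFP is out of frame


def restores_frame(net_indel: int) -> bool:
    """True if an indel of this net length (negative = deletion) puts GFP back in frame."""
    return (INSERT_LEN + net_indel) % 3 == 0


frame_restoring = [d for d in range(-6, 7) if restores_frame(d)]
print(INSERT_LEN % 3, frame_restoring)
```

Under these assumptions only about one in three possible indel lengths restores fluorescence, which is why the GFP-positive fraction reports on editing but underestimates the total indel frequency.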
The transfection procedure comprised the following steps:
On day 0, the GFP reporter HEK293T cell line library containing the target sequence was plated in 6-well plates as required for transfection, with the cell density kept at 30%.
The GFP reporter HEK293T cell line library containing the target sequence comprises the nucleotide sequence CMV-ATG-PAM-target site-GFP, in which the PAM sequence is a 5 bp random sequence and the target site sequence is GGATATGTTGAAGAACACCATGAC (FIG. 15).
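A 5 bp random PAM gives 4^5 = 1024 possible library members. The following sketch enumerates them and builds the ATG-PAM-target segment for each; it assumes the three elements are directly adjacent, and the variable names are hypothetical.

```python
from itertools import product

TARGET_SITE = "GGATATGTTGAAGAACACCATGAC"

# All 5-nt PAM candidates over the four DNA bases
pams = ["".join(bases) for bases in product("ACGT", repeat=5)]

# ATG-PAM-target segment for each library member (elements assumed to be contiguous)
segments = {pam: "ATG" + pam + TARGET_SITE for pam in pams}

print(len(pams), segments["AAAAA"])
```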
On day 1, transfection was carried out as follows:
2 μg of each pAAV2_Cas12-hU6-sgRNA_ITR plasmid to be transfected was added to 100 μL of Opti-MEM medium (purchased from Gibco) and mixed by gentle pipetting.
The transfection reagent, either
Figure PCTCN2022096002-appb-000060
2000 (purchased from Invitrogen) or PEI (purchased from Polysciences), was mixed by flicking the tube, and 5 μL of
Figure PCTCN2022096002-appb-000061
2000 or PEI was added to 100 μL of Opti-MEM medium, mixed gently and left to stand at room temperature for 5 min.
The diluted plasmid and the diluted transfection reagent were combined and mixed by gentle pipetting. The mixture was left at room temperature for 20 min, then added to the culture medium of the GFP reporter HEK293T cell line library containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for further culture.
Editing of the target gene in the HEK293T cell line library by each CRISPR/Cas12 system was then observed under a fluorescence microscope; the results are shown in FIG. 16. As the figure shows, only the library cells treated with the CRISPR/Cas12J-8 system emitted green fluorescence, indicating that this system successfully edited the target gene in the cells. In contrast, library cells treated with any of the other CRISPR/Cas12J gene editing systems emitted no fluorescence, indicating that those systems could not edit the target gene effectively.
Example 5
(1) Construction of ChCas12b point mutant plasmids
Circular PCR was performed using the pAAV2_Cas12_ITR plasmid (SEQ ID NO: 13) as the template. The primer sequences are listed in the table below:
Table 7: PCR primers for constructing ChCas12b point mutants
Figure PCTCN2022096002-appb-000062
Figure PCTCN2022096002-appb-000063
The reaction system was as follows:
Figure PCTCN2022096002-appb-000064
The PCR program was as follows:
Figure PCTCN2022096002-appb-000065
The PCR products were electrophoresed on a 1% agarose gel at 120 V for 30 min, and the target DNA fragments were purified with a gel extraction kit according to the manufacturer's protocol. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragments were then subjected to T4 PNK treatment and T4 DNA ligase treatment. The reaction system was as follows:
Figure PCTCN2022096002-appb-000066
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000067
1 μL of T4 DNA ligase (NEB) was added to the reaction, which was vortexed and then incubated at room temperature for 2 h.
The ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min. Then 900 μL of LB medium was added and the cells were cultured at 37°C for 1 hour to allow the E. coli DH5α competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator; single E. coli DH5α colonies were verified by Sanger sequencing.
Clones confirmed by sequencing to be correctly ligated were grown in shaking culture and the plasmids were extracted, giving the point mutant plasmids Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D and Q23T-P182R-E507D-E1090K, which were used immediately or stored at -20°C for the long term.
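The mutant names follow the common convention of wild-type residue, 1-based position and substituted residue, joined by hyphens for combined mutants. A small sketch of how such labels can be parsed and applied to a protein sequence is shown below; the helper names are hypothetical and the short test sequence is a toy, not the ChCas12b sequence.

```python
import re

MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(label: str) -> list[tuple[str, int, str]]:
    """Split a label such as 'Q23T-P182R' into (wild type, position, mutant) tuples."""
    parsed = []
    for part in label.split("-"):
        match = MUTATION.match(part)
        if match is None:
            raise ValueError(f"not a point-mutation label: {part!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_mutations(protein: str, label: str) -> str:
    """Apply the substitutions, checking that each wild-type residue matches the sequence."""
    residues = list(protein)
    for wild_type, position, mutant in parse_mutations(label):
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = mutant
    return "".join(residues)


print(parse_mutations("Q23T-P182R-E507D-E1090K"))
print(apply_mutations("MQPE", "Q2T-E4D"))   # toy sequence for illustration
```

The wild-type check is useful when verifying Sanger reads, since a mismatch usually signals a numbering offset rather than a real sequence difference.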
(2) Preparation of linearized plasmid hU6-OQB30769_tracr-Bsa1
The hU6-OQB30769_tracr-Bsa1 plasmid was digested with the restriction endonuclease Bsa1 (NEB). The reaction system was: 2 μg of plasmid hU6-OQB30769_tracr-Bsa1, 5 μL of 10×CutSmart buffer (purchased from NEB) and 1 μL of Bsa1 restriction endonuclease (purchased from NEB), made up to 50 μL with water. The digestion was carried out at 37°C for 2 hours.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel, recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, and finally eluted with ultrapure water.
The DNA concentration of the recovered linearized hU6-OQB30769_tracr-Bsa1 fragment was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was used immediately or stored at -20°C for the long term.
(3) Construction of plasmid hU6-OQB30769_tracr-Bsa1-on target sgRNA
The on-target gRNA sequence was designed; the corresponding single-stranded DNA oligonucleotides are shown in Table 8 below:
Table 8: Single-stranded DNA oligonucleotides for the on-target gRNA
Figure PCTCN2022096002-appb-000068
The single-stranded DNA oligonucleotides corresponding to the on-target gRNA were annealed. The annealing reaction system was: 1 μL of 100 μM oligo-F, 1 μL of 100 μM oligo-R and 28 μL of water. The mixture was vortexed and placed in a PCR thermocycler to run the annealing program: 95°C for 5 min, 85°C for 1 min, 75°C for 1 min, 65°C for 1 min, 55°C for 1 min, 45°C for 1 min, 35°C for 1 min and 25°C for 1 min, followed by a hold at 4°C, with a cooling rate of 0.3°C/s. After annealing, the product was ligated with DNA ligase (purchased from NEB) into the linearized hU6-OQB30769_tracr-Bsa1 plasmid.
1 μL of the ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min. Then 900 μL of LB medium was added and the cells were cultured at 37°C for 1 h to allow the E. coli DH5α competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing the corresponding antibiotic and incubated inverted in a 37°C incubator; single E. coli DH5α colonies were verified by Sanger sequencing.
Clones confirmed by sequencing to be correctly ligated were grown in shaking culture and the plasmid was extracted, giving the plasmid hU6-OQB30769_tracr-Bsa1-on target sgRNA, which expresses the on-target gRNA sequence above, for later use.
The plasmids expressing the ChCas12b point mutant proteins (Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D and Q23T-P182R-E507D-E1090K) were each co-transfected with hU6-OQB30769_tracr-Bsa1-on target sgRNA by lipofection into the GFP reporter HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC).
The GFP reporter HEK293T cell line containing the target sequence was obtained as follows. A PAM sequence and a specific target sequence were inserted between the start codon ATG and the GFP coding sequence, causing a frameshift in GFP, and the construct was then integrated into HEK293T cells by lentiviral infection to give the GFP reporter HEK293T cell line containing the target sequence. After a gene editing system cleaves the target sequence, the cell's own repair machinery restores the GFP reading frame in a fraction of the cells, which then fluoresce green. The editing capability and specificity of the gene editing system can be assessed by measuring the proportion of GFP-positive cells by flow cytometry.
The transfection procedure comprised the following steps:
On day 0, the GFP reporter HEK293T cell line containing the target sequence was plated in 48-well plates as required for transfection, with the cell density kept at 30%.
The GFP reporter HEK293T cell line containing the target sequence comprises the nucleotide sequence CMV-ATG-PAM-target site-GFP, in which the PAM sequence is CGTTG and the target site sequence is GGATATGTTGAAGAACACCATGAC.
On day 1, transfection was carried out as follows:
0.5 μg of each plasmid expressing a ChCas12b point mutant protein (Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D and Q23T-P182R-E507D-E1090K) was added together with 0.3 μg of hU6-OQB30769-tracr-Bsa1-on target sgRNA to 17.5 μL of Opti-MEM medium (purchased from Gibco) and mixed by gentle pipetting.
PEI (purchased from Polysciences) was mixed by flicking the tube, and 0.8 μL of PEI was added to 17.5 μL of Opti-MEM medium, mixed gently and left to stand at room temperature for 5 min.
The diluted plasmid and the diluted transfection reagent were combined and mixed by gentle pipetting. The mixture was left at room temperature for 15 min, then added to the culture medium of the GFP reporter HEK293T cell line containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for further culture.
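When several wells receive the same sgRNA plasmid and reagent, the per-well amounts above are usually scaled into a master mix. The sketch below does this scaling; the per-well values come from the protocol, while the optional overage and the function name are assumptions added for illustration.

```python
# Per-well amounts taken from the 48-well protocol above
PER_WELL = {
    "editor_plasmid_ug": 0.5,
    "sgrna_plasmid_ug": 0.3,
    "optimem_for_dna_ul": 17.5,
    "pei_ul": 0.8,
    "optimem_for_pei_ul": 17.5,
}


def scale_transfection(n_wells: int, overage: float = 0.0) -> dict[str, float]:
    """Scale the per-well amounts to n_wells, with an optional fractional overage (assumption)."""
    if n_wells < 1 or overage < 0:
        raise ValueError("n_wells must be >= 1 and overage must be >= 0")
    factor = n_wells * (1.0 + overage)
    return {name: amount * factor for name, amount in PER_WELL.items()}


# PEI volume per μg of total DNA in a single well
pei_per_ug_dna = PER_WELL["pei_ul"] / (PER_WELL["editor_plasmid_ug"] + PER_WELL["sgrna_plasmid_ug"])

eight_wells = scale_transfection(8)
print(pei_per_ug_dna, eight_wells)
```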
The editing efficiency of the ChCas12b point mutants on the target sequence was analyzed by flow cytometry.
Specifically, the HEK293T cells were collected after 5 days of culture in the CO2 incubator and analyzed for specificity on a flow cytometer (BD Biosciences FACSCalibur); the proportion of GFP-positive cells was analyzed and plotted with FlowJo software.
The editing efficiencies of the ChCas12b point mutants in the GFP reporter HEK293T cell line containing the target sequence are shown in FIG. 17. After a ChCas12b point mutant cleaves the target sequence, the cell's own repair machinery restores the GFP reading frame in a fraction of the cells, which then fluoresce green. In FIG. 17, the Y axis shows the percentage of GFP-positive cells (%), the X axis shows ChCas12b and its point mutants, and NC denotes the negative control group (no plasmid transfected). As FIG. 17 shows, all of the ChCas12b point mutants edited the target site in the GFP reporter HEK293T cell line, with editing efficiencies comparable to that of wild-type ChCas12b.
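The GFP-positive percentage plotted in such an analysis is the fraction of recorded events whose fluorescence exceeds a gate. The text performs this step in FlowJo; the sketch below only illustrates the calculation. The intensities are invented toy values, and placing the gate at the maximum of the negative control is a simplification chosen for this example rather than the gating used for FIG. 17.

```python
def percent_gfp_positive(intensities: list[float], threshold: float) -> float:
    """Percentage of events whose GFP intensity lies above the gate threshold."""
    if not intensities:
        return 0.0
    positive = sum(1 for value in intensities if value > threshold)
    return 100.0 * positive / len(intensities)


# Toy fluorescence values, invented for illustration only.
negative_control = [10, 12, 15, 9, 11, 14, 13, 10, 12, 11]
edited_sample = [10, 250, 13, 300, 12, 11, 280, 9, 14, 12]

gate = max(negative_control)   # simplistic gate: above every negative-control event
print(percent_gfp_positive(negative_control, gate), percent_gfp_positive(edited_sample, gate))
```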
Example 6
(1) Construction of Cas12a point mutant plasmids
Circular PCR was performed using the pAAV2_Cas12_ITR plasmids (SEQ ID NO: 9 to SEQ ID NO: 12) as templates. The primer sequences are listed in the table below:
Table 9: PCR primers for constructing Cas12a point mutants
Figure PCTCN2022096002-appb-000069
Figure PCTCN2022096002-appb-000070
Figure PCTCN2022096002-appb-000071
The reaction system was as follows:
Figure PCTCN2022096002-appb-000072
The PCR program was as follows:
Figure PCTCN2022096002-appb-000073
The PCR products were electrophoresed on a 1% agarose gel at 120 V for 30 min, and the target DNA fragments were purified with a gel extraction kit according to the manufacturer's protocol. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragments were then subjected to T4 PNK treatment and T4 DNA ligase treatment. The reaction system was as follows:
Figure PCTCN2022096002-appb-000074
The reaction conditions were as follows:
Figure PCTCN2022096002-appb-000075
1 μL of T4 DNA ligase (NEB) was added to the reaction, which was vortexed and then incubated at room temperature for 2 h.
The ligation product was added to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubated on ice for 30 min, heat-shocked at 42°C for 1 min and incubated on ice for 2 min. Then 900 μL of LB medium was added and the cells were cultured at 37°C for 1 hour to allow the E. coli DH5α competent cells to recover.
The recovered E. coli DH5α cells were spread on LB agar plates containing ampicillin and incubated inverted in a 37°C incubator; single E. coli DH5α colonies were verified by Sanger sequencing.
Clones confirmed by sequencing to be correctly ligated were grown in shaking culture and the plasmids were extracted, giving the point mutant plasmids, which were used immediately or stored at -20°C for the long term.
(2) Preparation of linearized plasmid psk-BbsI-Cas12a-crRNA1
The psk-BbsI-Cas12a-crRNA1 plasmid was digested with the restriction endonuclease BbsI (NEB). The reaction system was: 2 μg of plasmid psk-BbsI-Cas12a-crRNA1, 5 μL of 10×CutSmart buffer (purchased from NEB) and 1 μL of BbsI restriction endonuclease (purchased from NEB), made up to 50 μL with water. The digestion was carried out at 37°C for 2 hours.
The digestion product was then electrophoresed on a 1% agarose gel at 120 V for 30 min.
The DNA fragment was excised from the agarose gel, recovered with a gel extraction kit (Tiangen Biotech (Beijing) Co., Ltd., DP209) according to the manufacturer's instructions, and finally eluted with ultrapure water.
The DNA concentration of the recovered linearized psk-BbsI-Cas12a-crRNA1 fragment was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was used immediately or stored at -20°C for the long term.
(3) Construction of plasmid psk-BbsI-Cas12a-crRNA1-on target sgRNA
The on-target gRNA sequence was designed; the corresponding single-stranded DNA oligonucleotides are shown in Table 10 below:
Table 10: Single-stranded DNA oligonucleotides for the on-target gRNA
Figure PCTCN2022096002-appb-000076
将所得的on target gRNA对应的寡核苷酸单链DNA进行退火。退火反应体系为:1μL 100μM oligo-F、1μL 100μM oligo-R、28μL水。将该退火体系震荡混匀后,放置于PCR仪中运行退火程序;退火程序如下:95℃_5min,85℃_1min,75℃_1min,65℃_1min,55℃_1min,45℃_1min,35℃_1min,25℃_1min,4℃保存,降温速率0.3℃/s。退火后,将所得的产物通过DNA连接酶(购于NEB公司)连接至所得的线性化psk-BbsI-Cas12a-crRNA1质粒。Anneal the oligonucleotide single-stranded DNA corresponding to the obtained on target gRNA. The annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water. After shaking and mixing the annealing system, place it in a PCR instrument to run the annealing program; the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s. After annealing, the resulting product was ligated to the resulting linearized psk-BbsI-Cas12a-crRNA1 plasmid by DNA ligase (purchased from NEB Company).
取1μL所得连接产物加到大肠杆菌DH5α感受态细胞(购于上海唯地生物技术有限公司)中,冰上孵育30min,42℃热激1min,冰上孵育2min,加入900μL LB培养基,37℃培养1h进行大肠杆菌DH5α感受态细胞的活化复苏。Take 1 μL of the obtained ligation product and add it to E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.), incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C After culturing for 1 h, the Escherichia coli DH5α competent cells were activated and recovered.
将复苏后的大肠杆菌DH5α感受态细胞涂布在含有对应抗性的LB固体平板在37℃培养箱倒置培养,得到的大肠杆菌DH5α单克隆进行Sanger测序验证。The revived E. coli DH5α competent cells were spread on the LB solid plate containing the corresponding resistance and cultured upside down in a 37°C incubator, and the obtained E. coli DH5α monoclonal was verified by Sanger sequencing.
将测序验证连接正确的大肠杆菌DH5α克隆摇菌,提取质粒,即可得到表达上述on target gRNA序列的质粒psk-BbsI-Cas12a-crRNA1-on target sgRNA,备用。Shake the Escherichia coli DH5α clone that has been correctly connected by sequencing and extract the plasmid to obtain the plasmid psk-BbsI-Cas12a-crRNA1-on target sgRNA expressing the above on target gRNA sequence, which is ready for use.
Each plasmid expressing a Cas12a point-mutant protein was co-transfected with psk-BbsI-Cas12a-crRNA1-on target sgRNA, by the liposome method, into the GFP reporter HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC).
The GFP reporter HEK293T cell line containing the target sequence was obtained as follows: a PAM sequence and a specific target sequence were inserted between the start codon ATG and the GFP coding sequence, causing a frameshift in GFP; the construct was then integrated into HEK293T cells by lentiviral infection. After the gene editing system cleaves the target sequence, the cell's own repair machinery restores the GFP reading frame in a fraction of cells, which then fluoresce green. The editing activity and specificity of the gene editing system can therefore be assessed by counting the proportion of GFP-positive cells by flow cytometry.
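The frameshift logic of the reporter can be illustrated with a short sketch. It uses the PAM (GTTTT) and target site quoted in this example and assumes, purely for illustration, that nothing else sits between the ATG and the GFP coding sequence; any additional bases in the real construct, and any stop codons created by a repair event, are ignored. The function names are hypothetical.

```python
# Elements placed upstream of the GFP coding sequence in the reporter.
START = "ATG"
PAM = "GTTTT"
TARGET = "GGATATGTTGAAGAACACCATGAC"


def upstream_length(indel=0):
    """Bases between the start of ATG and the first GFP base, after an indel.

    indel > 0 is a net insertion, indel < 0 a net deletion (in bases).
    """
    return len(START) + len(PAM) + len(TARGET) + indel


def gfp_in_frame(indel=0):
    """True if GFP is read in the same frame as the ATG."""
    return upstream_length(indel) % 3 == 0


def frame_restoring_indels(max_size=6):
    """Net indel sizes within +/- max_size that put GFP back in frame."""
    return [d for d in range(-max_size, max_size + 1) if d != 0 and gfp_in_frame(d)]


if __name__ == "__main__":
    print("unedited reporter in frame:", gfp_in_frame())
    print("frame-restoring net indels:", frame_restoring_indels())
```

Only about one in three indel sizes restores the frame under this simplified model, which is consistent with the statement that only a fraction of edited cells become GFP-positive.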
The transfection procedure comprised the following steps:
Day 0: the GFP reporter HEK293T cell line containing the target sequence was plated in 48-well plates as required for transfection, with the cell density controlled at 30%.
This GFP reporter HEK293T cell line contains the nucleotide sequence CMV-ATG-PAM-target site-GFP, in which the PAM sequence is GTTTT and the target site sequence is GGATATGTTGAAGAACACCATGAC.
Day 1: transfection was performed as follows:
0.5 μg of the plasmid expressing a Cas12a point-mutant protein and 0.3 μg of psk-BbsI-Cas12a-crRNA1-on target sgRNA were added together to 17.5 μL of Opti-MEM medium (purchased from Gibco) and mixed by gentle pipetting.
PEI (purchased from Polysciences) was mixed by flicking the tube; 0.8 μL of PEI was added to 17.5 μL of Opti-MEM medium, mixed gently, and left at room temperature for 5 min.
The diluted plasmids and the diluted transfection reagent were combined and mixed by gentle pipetting. The mixture was left at room temperature for 15 min, then added to the culture medium of the GFP reporter HEK293T cell line containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for further culture.
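The per-well amounts in the Day 1 steps can be scaled to several wells with a small helper. This is a sketch only: the quantities are taken from the steps above, while the function names and the `overage` parameter (extra volume to cover pipetting loss) are assumptions that do not appear in the original protocol.

```python
# Per-well amounts quoted in the transfection steps.
CAS12A_PLASMID_UG = 0.5
GUIDE_PLASMID_UG = 0.3
PEI_UL = 0.8
OPTIMEM_PER_TUBE_UL = 17.5  # one tube for DNA, one tube for PEI


def per_well():
    """Total DNA, PEI-to-DNA ratio and Opti-MEM volume for one well."""
    dna_ug = CAS12A_PLASMID_UG + GUIDE_PLASMID_UG
    return {
        "dna_ug": dna_ug,
        "pei_ul_per_ug_dna": PEI_UL / dna_ug,
        "optimem_ul": 2 * OPTIMEM_PER_TUBE_UL,
    }


def master_mix(n_wells, overage=1.1):
    """Scale the per-well recipe to n_wells (overage is a hypothetical safety factor)."""
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    factor = n_wells * overage
    return {
        "cas12a_plasmid_ug": CAS12A_PLASMID_UG * factor,
        "guide_plasmid_ug": GUIDE_PLASMID_UG * factor,
        "pei_ul": PEI_UL * factor,
        "optimem_dna_tube_ul": OPTIMEM_PER_TUBE_UL * factor,
        "optimem_pei_tube_ul": OPTIMEM_PER_TUBE_UL * factor,
    }


if __name__ == "__main__":
    print(per_well())
    print(master_mix(12))
```

The recipe as written corresponds to 1 μL of PEI per μg of total plasmid DNA.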
The editing efficiency of the Cas12a point mutants at the target sequence was analyzed by flow cytometry.
Specifically, the HEK293T cells were collected after 5 days of culture in the CO2 incubator, their specificity was assessed on a flow cytometer (BD Biosciences FACSCalibur), and the proportion of GFP-positive cells was analyzed and plotted with FlowJo software.
The editing efficiencies of the Cas12a point mutants in the GFP reporter HEK293T cell line containing the target sequence are shown in Figure 18. After a Cas12a point mutant cleaves the target sequence, the cell's own repair machinery restores the GFP reading frame in a fraction of cells, which then fluoresce green. In Figure 18, the Y-axis is the percentage of GFP-positive cells (%), the X-axis shows Cas12a and its point mutants, and NC is the negative control (no plasmid transfected). Figure 18 shows that every Cas12a point mutant edited the target site in the GFP reporter HEK293T cell line, with editing efficiency comparable to that of wild-type Cas12a.
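The Y-axis quantity in Figure 18 is a simple ratio of event counts. The sketch below shows one way to compute it and to compare a mutant with wild type; the background subtraction against the NC sample and the normalization to wild type are assumptions added for illustration and are not described in the text, and no values from Figure 18 are reproduced here.

```python
def percent_gfp_positive(gfp_positive_events, total_events):
    """Percentage of analyzed cells that fall in the GFP-positive gate."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= gfp_positive_events <= total_events:
        raise ValueError("gfp_positive_events must lie between 0 and total_events")
    return 100.0 * gfp_positive_events / total_events


def relative_to_wild_type(mutant_percent, wild_type_percent, background_percent=0.0):
    """Background-subtracted mutant signal as a fraction of the wild-type signal.

    background_percent would come from the NC sample (assumed usage).
    """
    denominator = wild_type_percent - background_percent
    if denominator <= 0:
        raise ValueError("wild-type signal must exceed background")
    return (mutant_percent - background_percent) / denominator
```

A value close to 1.0 from `relative_to_wild_type` would correspond to the "comparable to wild type" reading of the figure.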
Figure PCTCN2022096002-appb-000077
Figure PCTCN2022096002-appb-000078
Figure PCTCN2022096002-appb-000079
Figure PCTCN2022096002-appb-000080
Figure PCTCN2022096002-appb-000081
Figure PCTCN2022096002-appb-000082
Figure PCTCN2022096002-appb-000083
Figure PCTCN2022096002-appb-000084
Figure PCTCN2022096002-appb-000085
Figure PCTCN2022096002-appb-000086
Figure PCTCN2022096002-appb-000087
Figure PCTCN2022096002-appb-000088
Figure PCTCN2022096002-appb-000089
Figure PCTCN2022096002-appb-000090
Figure PCTCN2022096002-appb-000091
Figure PCTCN2022096002-appb-000092

Claims (15)

  1. A conjugate comprising:
    a) a Cas12 protein, wherein the Cas12 protein is:
    1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
    an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
    an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
    a MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
    a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
    a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6; or
    2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100% sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    preferably, the homologue is a point mutant of the Mb4Cas12a protein, for example T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, for example V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, for example Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, for example Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, for example Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    b) a modifying moiety;
    for example, the modifying moiety is selected from an additional protein or polypeptide, a detectable label, or a combination thereof;
    for example, the additional protein or polypeptide is one or more selected from an epitope tag, a reporter protein or a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), the cytosine methylases DNMT3A and MQ1, the cytosine demethylase Tet1, the transcriptional activators VP64, p65 and RTA, the transcriptional repressor KRAB, the histone acetylase p300, the histone deacetylase LSD1, and the endonuclease FokI;
    and
    c) optionally, a linker connecting the Cas12 protein and the modifying moiety.
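The sequence-identity thresholds recited in item 2) can be made concrete with a small calculation. The claim does not state how identity is computed, so the following is only a sketch under one common convention: the two sequences are already aligned (gaps written as `-`), and identity is the number of identical residue columns divided by the number of alignment columns. The function names and the toy sequences in the usage note are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences have a gap are ignored; a residue
    paired with a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold_percent=80.0):
    """True if the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```

For example, `percent_identity("MKTAY", "MKSAY")` evaluates four matching columns out of five. Other conventions (for instance, dividing by the length of the shorter sequence) give different numbers for gapped alignments.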
  2. A fusion protein comprising:
    a) a Cas12 protein, wherein the Cas12 protein is:
    1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
    an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
    an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
    a MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
    a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
    a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6; or
    2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100% sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    preferably, the homologue is a point mutant of the Mb4Cas12a protein, for example T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, for example V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, for example Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, for example Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, for example Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    b) an additional protein or polypeptide;
    for example, the additional protein or polypeptide is one or more selected from an epitope tag, a reporter protein or a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), the cytosine methylases DNMT3A and MQ1, the cytosine demethylase Tet1, the transcriptional activators VP64, p65 and RTA, the transcriptional repressor KRAB, the histone acetylase p300, the histone deacetylase LSD1, and the endonuclease FokI;
    and
    c) optionally, a linker connecting the Cas12 protein and the additional protein or polypeptide;
    for example, the linker is 1-50 amino acids in length;
    preferably, the fusion protein comprises: the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, adenine deaminase (ABE), and optionally a linker connecting the Cas12J-8 protein and the adenine deaminase (ABE);
    preferably, the fusion protein comprises, in order from its N-terminus to its C-terminus, the adenine deaminase (ABE), the linker, and the Cas12J-8 protein;
    more preferably, the amino acid sequence of the fusion protein is shown in SEQ ID NO: 7.
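The point-mutant labels used in the claims (for example T207A-N616S-I902T) follow the usual convention of wild-type residue, 1-based position, substituted residue, with hyphens joining combined substitutions. The sketch below parses such labels and applies them to a sequence; it assumes that convention, the function names are illustrative, and the short sequence in the usage note is a toy string, not any SEQ ID NO from this document.

```python
import re

# One substitution: wild-type residue, 1-based position, new residue (e.g. "T207A").
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(label):
    """Split a label such as "T207A-N616S" into (wild_type, position, new) tuples."""
    parsed = []
    for part in label.split("-"):
        match = _MUTATION.match(part)
        if match is None:
            raise ValueError(f"unrecognized mutation: {part!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_mutations(protein, label):
    """Return the protein sequence with every substitution in the label applied."""
    residues = list(protein)
    for wild_type, position, new in parse_mutations(label):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)
```

With a toy sequence, `apply_mutations("MKTAY", "T3A")` replaces the threonine at position 3 with alanine. Checking the wild-type residue before substituting guards against applying a label to the wrong parent protein.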
  3. A single-stranded guide RNA comprising a CRISPR repeat sequence, wherein the CRISPR repeat sequence has:
    a) the nucleic acid sequence shown in SEQ ID NO: 15, for the Cas12J-8 protein or a homologue, conjugate or fusion protein thereof,
    the nucleic acid sequence shown in SEQ ID NO: 16, for the Mb4Cas12a protein, the MlCas12a protein and the MoCas12a protein, or a homologue, conjugate or fusion protein thereof,
    the nucleic acid sequence shown in SEQ ID NO: 17, for the BgCas12a protein or a homologue, conjugate or fusion protein thereof, or
    the nucleic acid sequence shown in SEQ ID NO: 18, for the ChCas12b protein or a homologue, conjugate or fusion protein thereof;
    or
    b) a nucleic acid sequence that has at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9% or at least 100% sequence identity to the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and that retains its biological activity; or
    c) a nucleic acid sequence that is obtained by modifying the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and that retains its biological activity,
    for example, the modification is one or more of base phosphorylation, base sulfuration, base methylation, base hydroxylation, sequence shortening and sequence lengthening,
    for example, the sequence shortening and the sequence lengthening comprise a deletion or an addition of one, two, three, four, five, six, seven, eight, nine or ten bases relative to the base sequence.
  4. The single-stranded guide RNA of claim 3, wherein the single-stranded guide RNA further comprises a CRISPR spacer sequence at the 3' end of the CRISPR repeat sequence, the CRISPR spacer sequence being 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides in length (preferably 24 nucleotides) and capable of complementary pairing with a target sequence.
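The repeat-then-spacer layout and the 20 to 30 nucleotide spacer length recited in this claim can be expressed as a small validation helper. This is a sketch under stated assumptions: the repeat is passed in as a parameter because SEQ ID NO: 15 to 18 are not reproduced in this section, the spacer is supplied as a DNA string and simply transcribed (T to U) without any strand selection, and all names are hypothetical.

```python
MIN_SPACER_NT = 20
MAX_SPACER_NT = 30


def to_rna(sequence):
    """Upper-case a DNA or RNA string and write it with U instead of T."""
    rna = sequence.upper().replace("T", "U")
    if set(rna) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, T/U")
    return rna


def build_guide(repeat, spacer):
    """Return repeat followed by spacer (spacer at the 3' end), written 5' to 3'."""
    spacer_rna = to_rna(spacer)
    if not MIN_SPACER_NT <= len(spacer_rna) <= MAX_SPACER_NT:
        raise ValueError(
            f"spacer must be {MIN_SPACER_NT}-{MAX_SPACER_NT} nt, got {len(spacer_rna)}"
        )
    return to_rna(repeat) + spacer_rna
```

For instance, passing the 24-nucleotide target site used in the reporter assay as the spacer satisfies the preferred length. Whether a given spacer actually pairs with the intended strand of the target must be checked separately.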
  5. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
    a) a Cas12 protein, wherein the Cas12 protein is:
    1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
    an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
    an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
    a MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
    a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
    a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6; or
    2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100% sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    preferably, the homologue is a point mutant of the Mb4Cas12a protein, for example T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, for example V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, for example Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, for example Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, for example Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    b) the conjugate of claim 1; or
    c) the fusion protein of claim 2;
    for example, the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof;
    for example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the fusion protein shown in SEQ ID NO: 7;
    preferably, the isolated nucleic acid molecule comprises the nucleic acid sequence shown in SEQ ID NO: 14 or a degenerate sequence thereof.
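A "degenerate sequence" is understood here as a different nucleotide sequence that encodes the same amino acid sequence through synonymous codons. The following sketch tests that property using the standard genetic code; it assumes in-frame coding DNA with no introns, the function names are illustrative, and the six-base strings in the usage note are toy inputs rather than any SEQ ID NO from this document.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def translate(dna):
    """Translate in-frame coding DNA; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    try:
        return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))
    except KeyError as error:
        raise ValueError(f"invalid codon: {error.args[0]}") from None


def encodes_same_protein(dna_a, dna_b):
    """True if two coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For example, `encodes_same_protein("ATGGCT", "ATGGCC")` compares two sequences that differ only in the third base of an alanine codon.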
  6. The isolated nucleic acid molecule of claim 5, further comprising a nucleic acid sequence encoding the single-stranded guide RNA of any one of claims 3 to 4 that corresponds to the Cas12 protein;
    for example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homologue, conjugate or fusion protein thereof (for example the fusion protein shown in SEQ ID NO: 7), for example the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for that Cas12J-8 protein, homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 19, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence that has at least 90% sequence identity to SEQ ID NO: 15 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 15 and retains its biological activity;
    for example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homologue, conjugate or fusion protein thereof, for example the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for that Cas12a protein, homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 20, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence that has at least 90% sequence identity to SEQ ID NO: 16 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 16 and retains its biological activity;
    for example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homologue, conjugate or fusion protein thereof, for example the nucleic acid sequence shown in SEQ ID NO: 12, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for that BgCas12a protein, homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 21, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence that has at least 90% sequence identity to SEQ ID NO: 17 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 17 and retains its biological activity;
    for example, the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homologue, conjugate or fusion protein thereof, for example the nucleic acid sequence shown in SEQ ID NO: 13, and further comprises a nucleic acid sequence encoding a single-stranded guide RNA for that ChCas12b protein, homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 22, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence that has at least 90% sequence identity to SEQ ID NO: 18 and retains its biological activity, or a modified sequence that is derived from SEQ ID NO: 18 and retains its biological activity.
  7. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding the single-stranded guide RNA of any one of claims 3 to 4;
    for example, the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, or a degenerate sequence thereof, and preferably further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
  8. A vector comprising a nucleic acid sequence encoding:
    a) a Cas12 protein, wherein the Cas12 protein is:
    1) a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
    an Mb4Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2,
    an MlCas12a protein having the amino acid sequence shown in SEQ ID NO: 3,
    a MoCas12a protein having the amino acid sequence shown in SEQ ID NO: 4,
    a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or
    a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6; or
    2) a homologue having an amino acid sequence that has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage from 80% to 100% sequence identity to the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    preferably, the homologue is a point mutant of the Mb4Cas12a protein, for example T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, for example V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, for example Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, for example Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, for example Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    b) the conjugate of claim 1; or
    c) the fusion protein of claim 2;
    for example, the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof;
    for example, the vector comprises a nucleic acid sequence encoding the fusion protein shown in SEQ ID NO: 7;
    preferably, the vector comprises the nucleic acid sequence shown in SEQ ID NO: 14 or a degenerate sequence thereof;
    for example, the vector is a plasmid vector such as a pUC19 vector, an episomal vector, a pAAV2_ITR vector, a retroviral vector, a lentiviral vector, an adenoviral vector or an adeno-associated viral vector.
  9. 根据权利要求8所述的载体,其中,所述载体进一步包含编 码权利要求3至4中任一项所述的与所述Cas12蛋白对应的单链向导RNA的核酸序列;The carrier according to claim 8, wherein, the carrier further comprises the nucleic acid sequence of the single-stranded guide RNA corresponding to the Cas12 protein described in any one of the coding claims 3 to 4;
    例如,所述载体包含编码具有SEQ ID NO:1所示氨基酸序列的Cas12J-8蛋白、其同源物、缀合物或融合蛋白(例如SEQ ID NO:7所示的融合蛋白)的核酸序列,例如SEQ ID NO:8或SEQ ID NO:14所示的核酸序列,并且包含编码针对该Cas12J-8蛋白、其同源物、缀合物或融合蛋白的包含SEQ ID NO:15所示CRISPR重复序列、包含与SEQ ID NO:15具有至少90%序列同一性且保留其生物学活性的同源序列、或包含基于SEQ ID NO:15改造得到的且保留其生物学活性的改造序列的单链向导RNA的核酸序列,例如SEQ ID NO:19所示的核酸序列;For example, the vector comprises a nucleic acid sequence encoding a Cas12J-8 protein having an amino acid sequence shown in SEQ ID NO: 1, a homologue thereof, a conjugate or a fusion protein (such as a fusion protein shown in SEQ ID NO: 7) , such as SEQ ID NO: 8 or the nucleotide sequence shown in SEQ ID NO: 14, and comprising coding for the Cas12J-8 protein, its homologue, conjugate or fusion protein comprising the CRISPR shown in SEQ ID NO: 15 A repeat sequence, a homologous sequence comprising at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a single sequence comprising a modified sequence based on SEQ ID NO: 15 and retaining its biological activity The nucleic acid sequence of the strand guide RNA, such as the nucleic acid sequence shown in SEQ ID NO: 19;
    例如,所述载体包含编码具有SEQ ID NO:2、SEQ ID NO:3或SEQ ID NO:4所示氨基酸序列的Cas12a蛋白、其同源物、缀合物或融合蛋白的核酸序列,例如SEQ ID NO:9、SEQ ID NO:10或SEQ ID NO:11所示的核酸序列,并且包含编码针对该Cas12a蛋白、其同源物、缀合物或融合蛋白的包含SEQ ID NO:16所示CRISPR重复序列、包含与SEQ ID NO:16具有至少90%序列同一性且保留其生物学活性的同源序列、或包含基于SEQ ID NO:16改造得到的且保留其生物学活性的改造序列的单链向导RNA的核酸序列,例如SEQ ID NO:20所示的核酸序列;For example, the vector comprises a nucleic acid sequence encoding a Cas12a protein, its homologue, a conjugate or a fusion protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, such as SEQ ID NO: ID NO: 9, SEQ ID NO: 10 or the nucleic acid sequence shown in SEQ ID NO: 11, and comprises coding for the Cas12a protein, its homologue, conjugate or fusion protein comprising SEQ ID NO: shown in 16 CRISPR repeat sequence, comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or comprising a modified sequence based on SEQ ID NO: 16 and retaining its biological activity The nucleic acid sequence of the single-stranded guide RNA, such as the nucleic acid sequence shown in SEQ ID NO: 20;
    For example, the vector comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or a homolog, conjugate or fusion protein thereof, such as the nucleic acid sequence set forth in SEQ ID NO: 12; and further comprises a nucleic acid sequence, such as that set forth in SEQ ID NO: 21, encoding a single-stranded guide RNA for that BgCas12a protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 17 and retaining its biological activity;
    For example, the vector comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or a homolog, conjugate or fusion protein thereof, such as the nucleic acid sequence set forth in SEQ ID NO: 13; and further comprises a nucleic acid sequence, such as that set forth in SEQ ID NO: 22, encoding a single-stranded guide RNA for that ChCas12b protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 18 and retaining its biological activity.
  10. A vector comprising a nucleic acid sequence encoding the single-stranded guide RNA of any one of claims 3 to 4;
    For example, the vector comprises the nucleic acid sequence set forth in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, or a degenerate sequence thereof, and preferably further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
  11. A CRISPR/Cas12 gene editing system, comprising:
    a) a protein component comprising:
    1) a Cas12 protein, the Cas12 protein being:
    1.1) the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1,
    the Mb4Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2,
    the MlCas12a protein having the amino acid sequence set forth in SEQ ID NO: 3,
    the MoCas12a protein having the amino acid sequence set forth in SEQ ID NO: 4,
    the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or
    the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or
    1.2) a homolog having an amino acid sequence that has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 99.95%, 99.99%, 99.999% or 100%, or any percentage from 80% to 100%, sequence identity to the amino acid sequence set forth in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    Preferably, the homolog is a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    2) the conjugate of claim 1, or
    3) the fusion protein of claim 2; and
    b) a nucleic acid component comprising the single-stranded guide RNA of any one of claims 3 to 4 that corresponds to the protein component in a);
    wherein the protein component and the nucleic acid component bind to each other to form a complex;
    For example, the protein component comprises the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog, conjugate or fusion protein thereof, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 15 and retaining its biological activity;
    For example, the protein component comprises the Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homolog, conjugate or fusion protein thereof, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 16 and retaining its biological activity;
    For example, the protein component comprises the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or a homolog, conjugate or fusion protein thereof, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 17 and retaining its biological activity;
    For example, the protein component comprises the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or a homolog, conjugate or fusion protein thereof, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 18 and retaining its biological activity.
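    The point-mutant names recited in claim 11 (for example T207A or T207A-N616S-I902T) appear to follow the conventional substitution notation: original residue, 1-based position, replacement residue, with hyphens joining the substitutions of a combined mutant. The following is a minimal Python sketch of that reading only. The helper names `parse_mutant` and `apply_mutant` are hypothetical, and the toy sequence in the usage note is not any of the claimed SEQ ID NO sequences.

```python
import re

# One substitution: original residue, 1-based position, replacement residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutant(name):
    """Split a name such as 'T207A-N616S' into (original, position, replacement) tuples."""
    substitutions = []
    for part in name.split("-"):
        match = SUBSTITUTION.match(part)
        if match is None:
            raise ValueError(f"not a substitution: {part!r}")
        substitutions.append((match.group(1), int(match.group(2)), match.group(3)))
    return substitutions


def apply_mutant(protein, name):
    """Return the protein sequence with every substitution in `name` applied."""
    residues = list(protein)
    for original, position, replacement in parse_mutant(name):
        if position > len(residues) or residues[position - 1] != original:
            raise ValueError(f"residue {position} is not {original}")
        residues[position - 1] = replacement
    return "".join(residues)
```

    For instance, `apply_mutant("MQT", "Q2E")` replaces the glutamine at position 2 of a three-residue toy sequence with glutamate. The check on the original residue is a design choice that guards against applying a mutant name to the wrong parent protein.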
  12. A cell comprising the isolated nucleic acid molecule of any one of claims 5 to 7, or the vector of any one of claims 8 to 10;
    For example, the cell is a prokaryotic cell or a eukaryotic cell; the eukaryotic cell is, for example, a plant cell or an animal cell; and the animal cell is, for example, a mammalian cell such as a human cell.
  13. A method for gene editing of a target sequence in a cell or in an in vitro environment, the method comprising contacting any one of the following (1) to (4) with the target sequence in the cell or in the in vitro environment:
    (1) a Cas12 protein, the conjugate of claim 1 or the fusion protein of claim 2, together with the single-stranded guide RNA of any one of claims 3 to 4 that corresponds to the Cas12 protein,
    wherein the Cas12 protein is:
    1) the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1,
    the Mb4Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2,
    the MlCas12a protein having the amino acid sequence set forth in SEQ ID NO: 3,
    the MoCas12a protein having the amino acid sequence set forth in SEQ ID NO: 4,
    the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or
    the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or
    2) a homolog having an amino acid sequence that has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 99.95%, 99.99%, 99.999% or 100%, or any percentage from 80% to 100%, sequence identity to the amino acid sequence set forth in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    Preferably, the homolog is a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    For example, the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog, conjugate or fusion protein thereof, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15, or an engineered sequence derived from SEQ ID NO: 15 and retaining its biological activity;
    For example, the Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homolog, conjugate or fusion protein thereof, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 16 and retaining its biological activity;
    For example, a nucleic acid sequence of the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, of a homolog thereof, or of a conjugate or fusion protein of either, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 17 and retaining its biological activity;
    For example, the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or a homolog, conjugate or fusion protein thereof, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 18 and retaining its biological activity;
    (2) the vector of claim 8 and the vector of claim 10;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 8 or SEQ ID NO: 14) encoding the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog, conjugate or fusion protein thereof (for example, the fusion protein set forth in SEQ ID NO: 7), and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 19) encoding a single-stranded guide RNA for that Cas12J-8 protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 15 and retaining its biological activity;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11) encoding the Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homolog, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 20) encoding a single-stranded guide RNA for that Mb4Cas12a protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 16 and retaining its biological activity;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or a homolog, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 21) encoding a single-stranded guide RNA for that BgCas12a protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 17 and retaining its biological activity;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or a homolog, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 22) encoding a single-stranded guide RNA for that ChCas12b protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 18 and retaining its biological activity;
    (3) the vector of claim 9; and
    (4) the CRISPR/Cas12 gene editing system of claim 11;
    wherein, after contact with the target sequence, the Cas12 protein, or its homolog, conjugate or fusion protein, recognizes its respective protospacer adjacent motif (PAM), the PAM being located at the 5' end of the target sequence; and, for the Cas12J-8 protein, the Mb4Cas12a protein, the MlCas12a protein, the MoCas12a protein, the BgCas12a protein and the ChCas12b protein, or their respective homologs, conjugates or fusion proteins, the PAM is 5'-TTN, 5'-YYN, 5'-YYN, 5'-YYN, 5'-YYN and 5'-TTN, respectively;
    For example, the cell is a prokaryotic cell or a eukaryotic cell; the eukaryotic cell is, for example, a plant cell or an animal cell; and the animal cell is, for example, a mammalian cell such as a human cell;
    For example, the gene editing comprises one or more of gene knockout of the target sequence, site-directed base change, site-directed insertion, regulation of gene transcription level, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base conversion, and chromatin imaging and tracking; for example, the single-base conversion comprises conversion of adenine to guanine, cytosine to thymine, or cytosine to uracil.
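    The PAM requirement recited in claim 13 can be illustrated computationally. The sketch below is only an illustration under stated assumptions: it scans one strand of a DNA string for the recited PAMs (in IUPAC codes, Y is C or T and N is any base) and reports the window immediately 3' of each PAM. The function name `find_targets` and the default window length of 20 nucleotides are hypothetical, since the claim does not specify a target length.

```python
import re

# IUPAC codes needed for the recited PAMs: Y = C or T, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}

# PAMs as recited in claim 13, keyed by protein name.
PAMS = {
    "Cas12J-8": "TTN",
    "Mb4Cas12a": "YYN",
    "MlCas12a": "YYN",
    "MoCas12a": "YYN",
    "BgCas12a": "YYN",
    "ChCas12b": "TTN",
}


def find_targets(dna, protein, target_len=20):
    """Return (pam_start, pam, target) for each site where the PAM lies directly 5' of the target.

    target_len is an illustrative assumption, not a value taken from the claims.
    Only the strand given in `dna` is scanned.
    """
    pam_regex = "".join(IUPAC[code] for code in PAMS[protein])
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=({pam_regex})([ACGT]{{{target_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]
```

    With a toy input, `find_targets("GGTTACCCCAAAA", "Cas12J-8", target_len=4)` reports a single site whose PAM is TTA, whereas the broader 5'-YYN motif of the Cas12a proteins matches more positions in the same string.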
  14. The method of claim 13, wherein the CRISPR spacer sequence of the single-stranded guide RNA forms a fully complementary base-paired structure with the target sequence, and forms an incompletely complementary base-paired structure with a non-target sequence;
    For example, the incompletely complementary base-paired structure comprises one or more, for example two or more, base mismatches.
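    The distinction in claim 14 between complete and incomplete base pairing can be sketched as a mismatch count. This is a minimal Python illustration that assumes Watson-Crick pairing only (wobble pairs are not treated as matches) and antiparallel hybridization. The function name `count_mismatches` is hypothetical, and the sequences in the usage note are toy examples rather than sequences from the application.

```python
# RNA base -> DNA base it pairs with (Watson-Crick pairs only).
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_mismatches(spacer_rna, dna_strand):
    """Count positions where the spacer does not pair with the DNA strand.

    Both sequences are written 5'->3'; dna_strand is the strand the spacer hybridizes to.
    """
    if len(spacer_rna) != len(dna_strand):
        raise ValueError("spacer and DNA strand must have equal length")
    # Antiparallel pairing: spacer position i faces the DNA strand read from its 3' end.
    paired = zip(spacer_rna.upper(), reversed(dna_strand.upper()))
    return sum(PAIR[rna_base] != dna_base for rna_base, dna_base in paired)
```

    A count of zero corresponds to the complete pairing recited for the target sequence, for example `count_mismatches("AUGC", "GCAT")`; any count of one or more corresponds to the incomplete pairing recited for a non-target sequence.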
  15. A kit for gene editing of a target sequence in a cell or in an in vitro environment, comprising:
    a. any one selected from the following 1) to 6):
    1) a Cas12 protein, the conjugate of claim 1 or the fusion protein of claim 2, together with the single-stranded guide RNA of any one of claims 3 to 4 that corresponds to the Cas12 protein,
    wherein the Cas12 protein is:
    a) the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1,
    the Mb4Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2,
    the MlCas12a protein having the amino acid sequence set forth in SEQ ID NO: 3,
    the MoCas12a protein having the amino acid sequence set forth in SEQ ID NO: 4,
    the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or
    the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, or
    b) a homolog having an amino acid sequence that has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, 99.95%, 99.99%, 99.999% or 100%, or any percentage from 80% to 100%, sequence identity to the amino acid sequence set forth in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and that retains its biological activity;
    Preferably, the homolog is a point mutant of the Mb4Cas12a protein, such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T; a point mutant of the MlCas12a protein, such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K; a point mutant of the MoCas12a protein, such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K; a point mutant of the BgCas12a protein, such as Q144R, D148G, V279I, Q144R-D148G or Q144R-D148G-V279I; or a point mutant of the ChCas12b protein, such as Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D or Q23T-P182R-E507D-E1090K;
    For example, the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog, conjugate or fusion protein thereof, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 15 and retaining its biological activity;
    For example, the Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, a homolog thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a conjugate or fusion protein of either, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 16 and retaining its biological activity;
    For example, the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, a homolog thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 5, or a conjugate or fusion protein of either, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 17 and retaining its biological activity;
    For example, the ChCas12b protein having the amino acid sequence set forth in SEQ ID NO: 6, a homolog thereof having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 6, or a conjugate or fusion protein of either, together with a single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 18 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 18 and retaining its biological activity;
    2) the isolated nucleic acid molecule of claim 5 and the isolated nucleic acid molecule of claim 7;
    For example, an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 8 or SEQ ID NO: 14) encoding the Cas12J-8 protein having the amino acid sequence set forth in SEQ ID NO: 1, or a homolog, conjugate or fusion protein thereof (for example, the fusion protein set forth in SEQ ID NO: 7), and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 19) encoding a single-stranded guide RNA for that Cas12J-8 protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 15 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 15 and retaining its biological activity;
    For example, an isolated nucleic acid molecule comprising a nucleic acid sequence (the nucleic acid sequence set forth in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11) encoding the Cas12a protein having the amino acid sequence set forth in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homolog, conjugate or fusion protein thereof, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 20) encoding a single-stranded guide RNA for that Cas12a protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 16 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 16 and retaining its biological activity;
    For example, an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence set forth in SEQ ID NO: 5, or a homolog, conjugate or fusion protein thereof, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence set forth in SEQ ID NO: 21) encoding a single-stranded guide RNA for that BgCas12a protein, homolog, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence set forth in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity to SEQ ID NO: 17 and retaining its biological activity, or an engineered sequence derived from SEQ ID NO: 17 and retaining its biological activity;
    例如,包含编码具有SEQ ID NO:6所示氨基酸序列的ChCas12b蛋白、其同源物、缀合物或融合蛋白的核酸序列(例如SEQ ID NO:13所示的核酸序列)的分离的核酸分子,以及包含编码针对该ChCas12b蛋白、其同源物、缀合物或融合蛋白的包含SEQ ID NO:18所示CRISPR重复序列、包含与SEQ ID NO:18具有至少90%序列同一性且保留其生物学活性的同源序列、或包含基于SEQ ID NO:18改造得到的且保留其生物学活性的改造序列的单链向导RNA的核酸序列(例如SEQ ID NO:22所示的核酸序列)的分离的核酸分子;For example, an isolated nucleic acid molecule comprising a nucleic acid sequence (such as a nucleic acid sequence shown in SEQ ID NO: 13) encoding a ChCas12b protein having an amino acid sequence shown in SEQ ID NO: 6, its homologue, a conjugate or a fusion protein , and comprising a CRISPR repeat sequence coding for the ChCas12b protein, its homologue, conjugate or fusion protein comprising SEQ ID NO: 18, comprising at least 90% sequence identity with SEQ ID NO: 18 and retaining its Biologically active homologous sequence, or the nucleotide sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) of the single-stranded guide RNA comprising the modified sequence obtained based on SEQ ID NO: 18 and retaining its biological activity isolated nucleic acid molecules;
    3) the isolated nucleic acid molecule according to claim 6;
    4) the vector according to claim 8 and the vector according to claim 10;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14) encoding a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, or a homolog, conjugate or fusion protein thereof (for example, the fusion protein shown in SEQ ID NO: 7), and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for that Cas12J-8 protein, homolog, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 15 and retains its biological activity, or a modified sequence that is engineered from SEQ ID NO: 15 and retains its biological activity;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11) encoding a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, or a homolog, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for that Cas12a protein, homolog, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 16 and retains its biological activity, or a modified sequence that is engineered from SEQ ID NO: 16 and retains its biological activity;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 12) encoding a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, or a homolog, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 21) encoding a single-stranded guide RNA for that BgCas12a protein, homolog, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological activity, or a modified sequence that is engineered from SEQ ID NO: 17 and retains its biological activity;
    For example, a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 13) encoding a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, or a homolog, conjugate or fusion protein thereof, and a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for that ChCas12b protein, homolog, conjugate or fusion protein, wherein the single-stranded guide RNA comprises the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological activity, or a modified sequence that is engineered from SEQ ID NO: 18 and retains its biological activity;
    5) the vector according to claim 9; or
    6) the CRISPR/Cas12 gene editing system according to claim 11;
    and
    b. instructions on how to perform gene editing on a target sequence in a cell or in an in vitro environment.
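The pairings recited in the examples above (protein, coding nucleic acid, CRISPR repeat, and guide RNA coding sequence) can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `KitComponent`, `KIT_COMPONENTS`, and `percent_identity` are hypothetical and do not appear in the application. The identity function assumes two pre-aligned sequences of equal length; the claims do not specify how the "at least 90% sequence identity" is to be computed, and in practice an alignment tool would normally be used.

```python
from typing import Dict, NamedTuple, Tuple


class KitComponent(NamedTuple):
    """SEQ ID NO pairings as recited in the claim examples (hypothetical layout)."""
    protein_seq_ids: Tuple[int, ...]   # amino acid sequences of the Cas12 protein
    coding_seq_ids: Tuple[int, ...]    # example nucleic acid sequences encoding the protein
    repeat_seq_id: int                 # CRISPR repeat sequence used in the guide RNA
    sgrna_seq_id: int                  # example nucleic acid sequence encoding the guide RNA


# Values are taken from the examples in the claim text above.
KIT_COMPONENTS: Dict[str, KitComponent] = {
    "Cas12J-8": KitComponent((1,), (8, 14), 15, 19),
    "Cas12a": KitComponent((2, 3, 4), (9, 10, 11), 16, 20),
    "BgCas12a": KitComponent((5,), (12,), 17, 21),
    "ChCas12b": KitComponent((6,), (13,), 18, 22),
}


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences.

    Assumption: the inputs are already aligned; no gaps are introduced here.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 90.0) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For instance, `KIT_COMPONENTS["BgCas12a"].repeat_seq_id` gives the repeat sequence number recited for BgCas12a. Passing the identity check is only one part of the claimed condition: a homologous sequence must also retain its biological activity, which cannot be determined from sequence comparison alone.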
PCT/CN2022/096002 2021-05-31 2022-05-30 Cas12 protein, gene editing system containing cas12 protein, and application WO2022253185A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110606220.9 2021-05-31
CN202110606220.9A CN113373130B (en) 2021-05-31 2021-05-31 Cas12 protein, gene editing system containing Cas12 protein and application

Publications (1)

Publication Number Publication Date
WO2022253185A1 (en) 2022-12-08

Family

ID=77575235

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096002 WO2022253185A1 (en) 2021-05-31 2022-05-30 Cas12 protein, gene editing system containing cas12 protein, and application

Country Status (2)

Country Link
CN (1) CN113373130B (en)
WO (1) WO2022253185A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113373130B (en) * 2021-05-31 2023-12-22 复旦大学 Cas12 protein, gene editing system containing Cas12 protein and application
CN114438055B (en) * 2021-10-26 2022-08-26 山东舜丰生物科技有限公司 Novel CRISPR enzymes and systems and uses
CN114441772B (en) * 2022-01-29 2023-03-21 北京大学 Methods and reagents for detecting target molecules capable of binding to RNA in cells
CN114438056B (en) * 2022-03-03 2023-11-21 吉林省农业科学院 CasF2 protein, CRISPR/Cas gene editing system and application thereof in plant gene editing
WO2023216037A1 (en) * 2022-05-07 2023-11-16 上海鲸奇生物科技有限公司 Development of dna-targeting gene editing tool
WO2023232109A1 (en) * 2022-06-01 2023-12-07 中国科学院遗传与发育生物学研究所 Novel crispr gene editing system
CN116286742B (en) * 2022-09-29 2023-11-17 隆平生物技术(海南)有限公司 CasD protein, CRISPR/CasD gene editing system and application thereof in plant gene editing
WO2024089629A1 (en) * 2022-10-27 2024-05-02 Geneditbio Limited Cas12 protein, crispr-cas system and uses thereof
CN116144631B (en) * 2023-01-17 2023-09-15 华中农业大学 Heat-resistant endonuclease and mediated gene editing system thereof
CN116410955B (en) * 2023-03-10 2023-12-19 华中农业大学 Two novel endonucleases and application thereof in nucleic acid detection

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018148511A1 (en) * 2017-02-10 2018-08-16 Zymergen Inc. A modular universal plasmid design strategy for the assembly and editing of multiple dna constructs for multiple hosts
CN109312316A (en) * 2016-02-15 2019-02-05 本森希尔生物系统股份有限公司 The composition and method of modifier group
WO2020086144A2 (en) * 2018-08-15 2020-04-30 Zymergen Inc. APPLICATIONS OF CRISPRi IN HIGH THROUGHPUT METABOLIC ENGINEERING
CN112301016A (en) * 2020-07-23 2021-02-02 广州美格生物科技有限公司 Application of novel mlCas12a protein in nucleic acid detection
CN113373130A (en) * 2021-05-31 2021-09-10 复旦大学 Cas12 protein, gene editing system containing Cas12 protein and application

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019201331A1 (en) * 2018-04-20 2019-10-24 中国农业大学 Crispr/cas effector protein and system
US11124783B2 (en) * 2018-09-13 2021-09-21 The Board Of Regents Of The University Of Oklahoma Variant CAS9 proteins with improved DNA cleavage selectivity
EP4023766B1 (en) * 2018-09-20 2024-04-03 Institute Of Zoology, Chinese Academy Of Sciences Method for detecting nucleic acid
US20200216825A1 (en) * 2019-01-08 2020-07-09 Integrated Dna Technologies, Inc. CAS12a MUTANT GENES AND POLYPEPTIDES ENCODED BY SAME
AU2020231380A1 (en) * 2019-03-07 2021-09-23 The Regents Of The University Of California CRISPR-Cas effector polypeptides and methods of use thereof
CN110747187B (en) * 2019-11-13 2022-10-21 电子科技大学 Cas12a protein for identifying TTTV and TTV double-PAM sites, plant genome directed editing vector and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN 13 October 2019 (2019-10-13), ANONYMOUS : "type V CRISPR-associated protein Cas12a/Cpf1 [Moraxella bovis]", XP093010372, retrieved from NCBI Database accession no. WP_078273923.1 *
DATABASE PROTEIN 13 October 2019 (2019-10-13), ANONYMOUS : "type V CRISPR-associated protein Cas12a/Cpf1 [Moraxella ovis] ", XP093010373, retrieved from NCBI Database accession no. WP_112744621.1 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116179512A (en) * 2023-03-16 2023-05-30 华中农业大学 Endonuclease with wide target recognition range and application thereof
CN116179512B (en) * 2023-03-16 2023-09-15 华中农业大学 Endonuclease with wide target recognition range and application thereof

Also Published As

Publication number Publication date
CN113373130A (en) 2021-09-10
CN113373130B (en) 2023-12-22

Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22815230; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 22815230; Country of ref document: EP; Kind code of ref document: A1)