WO2022253185A1 - Cas12 protein, gene editing system containing Cas12 protein, and application - Google Patents

Cas12 protein, gene editing system containing Cas12 protein, and application

Info

Publication number
WO2022253185A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
protein
acid sequence
nucleic acid
sequence
Prior art date
Application number
PCT/CN2022/096002
Other languages
English (en)
Chinese (zh)
Inventor
王永明
王帅
高思琪
王瑶
Original Assignee
复旦大学
Application filed by 复旦大学
Publication of WO2022253185A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85: Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86: Viral vectors
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00: Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06: Animal cells or tissues; Human cells or tissues
    • C12N5/0602: Vertebrate cells
    • C12N5/0684: Cells of the urinary tract or kidneys
    • C12N5/0686: Kidney cells
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/78: Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y305/00: Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04: Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04002: Adenine deaminase (3.5.4.2)
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2510/00: Genetically modified cells
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011: Details
    • C12N2750/14011: Parvoviridae
    • C12N2750/14111: Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141: Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143: Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00: Nucleic acids vectors
    • C12N2800/10: Plasmid DNA
    • C12N2800/106: Plasmid DNA for vertebrates
    • C12N2800/107: Plasmid DNA for vertebrates for mammalian
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00: Nucleic acids vectors
    • C12N2800/22: Vectors comprising a coding region that has been codon optimised for expression in a respective host

Definitions

  • This application belongs to the technical field of gene editing, and specifically relates to Cas12 protein, a gene editing system containing the Cas12 protein and related applications.
  • the CRISPR/Cas system is an adaptive immune system evolved by bacteria and archaea to resist the invasion of foreign viruses or plasmids.
  • crRNA (CRISPR-derived RNA) and the Cas12 protein form a complex that recognizes the PAM (protospacer adjacent motif) sequence of the target site. After recognition, the crRNA base-pairs with the targeted DNA sequence, and the Cas protein cuts the DNA, resulting in DNA breakage and damage.
  • the CRISPR/Cas12b system also contains tracrRNA (trans-activating RNA), which forms a complex with crRNA and Cas12b to function.
  • tracrRNA and crRNA can be fused into a single-stranded guide RNA (single guide RNA, sgRNA) through a linking sequence.
  • sgRNA single guide RNA
  • NHEJ non-homologous end-joining
  • HR homologous recombination
  • In addition to basic scientific research, the CRISPR/Cas12 gene editing system also has broad clinical application prospects. When the CRISPR/Cas12 gene editing system is used for gene therapy, the Cas protein and the single-stranded guide RNA must be introduced into the body. At present, the most effective expression vector for gene therapy is adeno-associated virus (AAV). However, the DNA packaged by AAV generally does not exceed 4.5 kb. SpCas9 has been widely used because of its simple PAM sequence (it recognizes NGG) and high activity. However, the SpCas9 protein has 1368 amino acids; together with the sgRNA and promoter, it cannot be effectively packaged into AAV, which limits its clinical application.
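    The packaging constraint described above can be made concrete with a rough size budget. The following is a minimal Python sketch and is not taken from the application: the 4.5 kb limit and the 1368-residue SpCas9 length come from the text, while every other element size (promoter, polyA signal, sgRNA cassette, NLS and linkers) and the 700-residue comparison protein are hypothetical round numbers chosen only for illustration.

```python
AAV_CAPACITY_BP = 4500   # packaging limit stated in the text (4.5 kb)
SPCAS9_AA = 1368         # SpCas9 length stated in the text

# Hypothetical element sizes in base pairs; NOT values from the application.
ASSUMED_ELEMENTS_BP = {
    "promoter": 500,
    "polyA_signal": 200,
    "sgRNA_cassette": 350,
    "NLS_and_linkers": 100,
}


def coding_length_bp(n_amino_acids: int) -> int:
    """Open reading frame length: three nucleotides per residue plus a stop codon."""
    return 3 * n_amino_acids + 3


def cassette_length_bp(n_amino_acids: int, elements: dict = ASSUMED_ELEMENTS_BP) -> int:
    """Total insert length: nuclease ORF plus all accessory elements."""
    return coding_length_bp(n_amino_acids) + sum(elements.values())


def fits_in_aav(n_amino_acids: int, capacity_bp: int = AAV_CAPACITY_BP) -> bool:
    """True if the whole cassette stays within the packaging limit."""
    return cassette_length_bp(n_amino_acids) <= capacity_bp


if __name__ == "__main__":
    for name, length in (("SpCas9", SPCAS9_AA), ("hypothetical 700-aa nuclease", 700)):
        total = cassette_length_bp(length)
        print(f"{name}: {total} bp, fits in AAV: {fits_in_aav(length)}")
```

    Under these assumed element sizes the SpCas9 coding sequence alone takes up most of the budget, which is the motivation the text gives for seeking smaller nucleases.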
  • AAV adeno-associated virus
  • Several Cas9s with small molecular weight are known, including SaCas9 (PAM sequence NNGRRT), St1Cas9 (PAM sequence NNAGAW), NmCas9 (PAM sequence NNNNGATT), Nme2Cas9 (PAM sequence NNNNCC) and a further Cas9 whose PAM sequence is NNNNRYAC.
  • However, these Cas9s are either prone to off-target editing (that is, cutting at non-target sites), have complex PAM sequences, or have low editing activity, which makes them difficult to use widely.
  • the present invention provides a conjugate comprising:
  • The Cas12 protein is a Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity; and
  • the present invention provides a fusion protein comprising:
  • The Cas12 protein is a Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • the present invention provides a single-stranded guide RNA comprising a CRISPR repeat sequence, the CRISPR repeat sequence having the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18, or a nucleic acid sequence having at least 90% sequence identity with the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retaining its biological activity, or a nucleic acid sequence obtained by modifying the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retaining its biological activity.
  • the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
  • The Cas12 protein is a Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention.
  • the present invention provides a vector comprising a nucleic acid sequence encoding the following:
  • The Cas12 protein is a Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • the fusion protein of the second aspect of the present invention.
  • the present invention provides a vector comprising a nucleic acid sequence encoding the single-stranded guide RNA of the third aspect of the present invention.
  • the present invention provides a CRISPR/Cas12 gene editing system, which comprises:
  • a) a protein component comprising:
  • The Cas12 protein is a Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or a homologue that has at least 80% sequence identity with any one of the amino acid sequences shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 and retains its biological activity;
  • b) a nucleic acid component comprising:
  • the single-stranded guide RNA of the third aspect of the invention.
  • the present invention provides a cell comprising: the isolated nucleic acid molecule of the sixth aspect of the present invention, or the vector of the seventh aspect of the present invention.
  • the present invention provides a method for gene editing a target sequence in a cell or in vitro, the method comprising: making the Cas12 protein, the conjugate of the first aspect of the present invention or the second aspect of the present invention
  • the Cas12 protein is respectively Cas12J-8 protein, Mb4Cas12a protein, MlCas12a protein, MoCas12a protein, BgCas12a protein or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6 , or having at least 80 amino acid sequences as shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 4, S
  • the present invention provides a kit comprising: the Cas12 protein, the conjugate of the first aspect of the present invention or the fusion protein of the second aspect of the present invention and the single-stranded guide RNA of the third aspect of the present invention, the isolated nucleic acid molecule of the fourth aspect and the fifth aspect of the present invention, the vector of the sixth aspect and the seventh aspect of the present invention, or the CRISPR/Cas12 gene editing system of the eighth aspect of the present invention; or target sequences in an in vitro environment for gene editing; wherein the Cas12 protein is a Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a or ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively, or for having any one in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ
  • The Cas12j-8 protein has few amino acids, the fewest among the gene editors currently available for eukaryotic cells, and can therefore be efficiently packaged into expression vectors such as adeno-associated virus vectors.
  • The protein combines high specificity with a simple PAM; its small molecular weight allows it to be packaged easily by vector tools such as adeno-associated virus, which makes it well suited to later development as a gene therapy tool.
  • the PAM of the Cas12j-8 protein is TTN, which is simple and has a wide range of editing.
  • Cas12j-8 protein has a significant advantage in editing efficiency at random sites compared with FnCas12a protein, and has a strong gene editing ability in a eukaryotic environment.
  • Cas12j-8 has a very significant editing advantage, and the editing ability at random sites is significantly higher than that of Cas12j-2, which is more suitable for the development and application research of gene editing.
  • the Cas12a protein and Cas12b protein of the present invention have higher editing activity, higher specificity, and have a relatively simple PAM sequence.
  • Their PAM is YYN, which broadens the range of sites that Cas12a and Cas12b proteins can target and increases their range of application.
  • Figure 1 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/Cas12J-8 gene editing system
  • Figure 2 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/ChCas12b gene editing system
  • Figure 3 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/Mb4Cas12a gene editing system
  • Figure 4 shows a schematic diagram of the editing efficiency results after the CRISPR/MoCas12a gene editing system performs gene editing on two target sites;
  • Figure 5 shows a schematic diagram of the editing efficiency results after the CRISPR/BgCas12a gene editing system performs gene editing on two target sites;
  • Figure 6 shows a schematic diagram of the results of editing efficiency after gene editing of two target sites by the CRISPR/MlCas12a gene editing system
  • Figure 7 and Figure 8 are schematic diagrams showing the specific detection results of the CRISPR/Cas12J-8 gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 9 shows a schematic diagram of the specific detection results of the CRISPR/ChCas12b gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 10 shows a schematic diagram of the specific detection results of the CRISPR/Mb4Cas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 11 shows a schematic diagram of the specific detection results of the CRISPR/MoCas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 12 shows a schematic diagram of the specific detection results of the CRISPR/BgCas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 13 shows a schematic diagram of the specific detection results of the CRISPR/MlCas12a gene editing system in the GFP reporter system HEK293T cell line;
  • Figure 14 shows the results of editing the target sites of each endogenous site by the Cas12J-8ABE base editor.
  • Figure 15 shows a schematic diagram of using the GFP reporter cell line library to detect the editing of the target gene by the CRISPR/Cas system.
  • Figure 16 shows cell photographs of GFP reporter cell lines processed using several CRISPR/Cas12J gene editing systems, wherein the upper image is a fluorescent image, and the lower image is an ordinary microscopic image.
  • Figure 17 shows a schematic diagram of the editing efficiency of ChCas12b and its point mutants in GFP cell lines.
  • Figure 18 shows a schematic diagram of the editing efficiency of Cas12a and its point mutants in GFP cell lines.
  • The Cas12 protein is the protein component of the CRISPR/Cas12 genome editing system. Guided by a single-stranded guide RNA (gRNA), it targets and cuts DNA target sequences, forming DNA double-strand breaks (DSBs). DNA double-strand breaks activate the cell's intrinsic repair mechanisms, non-homologous end joining (NHEJ) and homologous recombination (HR), which repair the DNA damage. During this repair, site-specific editing is introduced into the targeted DNA sequence.
  • gRNA single-stranded guide RNA
  • DSB DNA double-strand breaks
  • The term “point mutant” refers to a mutein with one or more amino acid mutations relative to the wild-type protein, and falls within the scope of the term “homologue” of the present invention.
  • point mutants include mutants having single point mutations or multiple point mutations.
  • The notation “AXXXB” is used to denote a point mutant, in which amino acid A at position XXX is mutated to B.
  • The point mutant T207A of the Mb4Cas12a protein denotes the mutant in which T (threonine (Thr)) at position 207 of the Mb4Cas12a protein is mutated to A (alanine (Ala)).
  • The point mutant T207A-N616S of the Mb4Cas12a protein denotes the mutant in which T (threonine (Thr)) at position 207 of the Mb4Cas12a protein is mutated to A (alanine (Ala)) and N (asparagine (Asn)) at position 616 is mutated to S (serine (Ser)).
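    The mutant notation described above is regular enough to parse mechanically. The following Python sketch is illustrative only and is not part of the application: the function names are hypothetical, and the four-residue test sequence is a toy string rather than any Cas12 sequence.

```python
import re

# One substitution: wild-type residue, 1-based position, new residue (e.g. "T207A").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_point_mutant(label: str) -> list:
    """Split a label such as 'T207A-N616S' into (wild_type, position, mutant) tuples."""
    substitutions = []
    for part in label.split("-"):
        match = _SUBSTITUTION.match(part)
        if match is None:
            raise ValueError(f"not in AXXXB notation: {part!r}")
        substitutions.append((match.group(1), int(match.group(2)), match.group(3)))
    return substitutions


def apply_point_mutant(protein: str, label: str) -> str:
    """Return the mutated sequence, checking each wild-type residue first."""
    residues = list(protein)
    for wild_type, position, mutant in parse_point_mutant(label):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


if __name__ == "__main__":
    print(parse_point_mutant("T207A-N616S"))
    print(apply_point_mutant("MKTN", "T3A-N4S"))  # toy sequence, not a real protein
```

    Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when positions are counted from 1.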
  • a single-stranded guide RNA or sgRNA may comprise a CRISPR repeat sequence (repeat sequence) and a guide sequence (guide sequence), and the guide sequence is also referred to herein as a guide RNA (guide RNA or gRNA).
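    As a minimal illustration of the composition described above, the Python sketch below joins a CRISPR repeat sequence and a guide (spacer) sequence into one RNA string. It is not taken from the application: the repeat used here is a made-up placeholder rather than any of SEQ ID NO: 15 to SEQ ID NO: 18, the function names are hypothetical, and placing the repeat 5' of the guide is an assumption that would need to be checked for each Cas12 protein.

```python
def to_rna(sequence: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet (T becomes U)."""
    return sequence.upper().replace("T", "U")


def build_guide_rna(repeat: str, guide: str, repeat_first: bool = True) -> str:
    """Concatenate a CRISPR repeat and a guide sequence into one RNA string.

    repeat_first=True places the repeat 5' of the guide; this ordering is an
    assumption of the sketch, not a statement from the source text.
    """
    repeat_rna, guide_rna = to_rna(repeat), to_rna(guide)
    for name, rna in (("repeat", repeat_rna), ("guide", guide_rna)):
        if not rna or set(rna) - set("ACGU"):
            raise ValueError(f"{name} must be a non-empty A/C/G/U(T) sequence")
    return repeat_rna + guide_rna if repeat_first else guide_rna + repeat_rna


# Made-up placeholder, NOT one of SEQ ID NO: 15 to 18.
PLACEHOLDER_REPEAT = "GGAACCUUGGAACC"

if __name__ == "__main__":
    print(build_guide_rna(PLACEHOLDER_REPEAT, "ACGTACGTACGTACGTACGT"))
```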
  • the guide sequence is also called a spacer.
  • a guide sequence is any polynucleotide sequence that has sufficient similarity to a target sequence to hybridize to and direct specific binding of the CRISPR/Cas12 complex to the target sequence.
  • the degree of complementarity between the guide sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99%. Determining the optimal alignment is within the ability of one of ordinary skill in the art. For example, there are public and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, Smith-Waterman in MATLAB, Bowtie, Geneious, Biopython, and SeqMan.
  • The term “CRISPR/Cas12 complex” refers to the complex formed by a single-stranded guide RNA (single guide RNA) or mature crRNA together with the Cas12 protein; it includes a guide sequence that hybridizes to the target sequence and thereby brings the Cas12 protein to bind the target sequence.
  • the complex recognizes and cleaves polynucleotides that hybridize to the single-stranded guide RNA or mature crRNA.
  • The term “target sequence” refers to a polynucleotide that a guide sequence is designed to target, e.g., a sequence complementary to the guide sequence, wherein hybridization between the target sequence and the guide sequence promotes Cas12 activity, such as cleavage of the target sequence. Full complementarity is not required, as long as there is sufficient complementarity to cause hybridization and allow Cas12 to exert its activity.
  • a target sequence can include any polynucleotide, such as DNA or RNA.
  • the target sequence is located in the nucleus or cytoplasm of the cell. In some cases, the target sequence may be located in an organelle of a eukaryotic cell such as the mitochondria or chloroplast.
  • The term “target sequence” or “target polynucleotide” as used herein may be any polynucleotide endogenous or exogenous to a cell (e.g., a eukaryotic cell).
  • the target polynucleotide can be a polynucleotide present in the nucleus of a eukaryotic cell.
  • the target polynucleotide can be a sequence encoding a gene product (e.g., protein) or a non-coding sequence (e.g., regulatory polynucleotide or junk DNA).
  • The target sequence should be associated with a protospacer adjacent motif (PAM).
  • The exact sequence and length requirements for the PAM vary depending on the Cas protein used, but the PAM is typically a 2-5 base sequence adjacent to the protospacer (target sequence). Those skilled in the art will be able to identify the PAM sequence to use with a given Cas protein.
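    The relationship between a PAM and its adjacent protospacer can be sketched as a simple scan over a DNA string. The Python example below is illustrative only and is not taken from the application: it assumes the PAM lies immediately 5' of the protospacer on the scanned strand, uses a hypothetical default protospacer length of 20 nt, and searches one strand only. The function names and the short test sequence are likewise made up for the example.

```python
# Degenerate symbols needed for PAMs such as TTN, YYN or NNGRRT.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def pam_matches(pam_pattern: str, bases: str) -> bool:
    """True if every base is allowed by the degenerate symbol at its position."""
    return len(pam_pattern) == len(bases) and all(
        base in DEGENERATE[symbol] for symbol, base in zip(pam_pattern, bases)
    )


def find_target_sites(dna: str, pam_pattern: str, protospacer_len: int = 20) -> list:
    """List (pam_start, pam, protospacer) for each PAM followed by a full protospacer.

    Assumption of this sketch: the PAM sits directly 5' of the protospacer.
    """
    dna = dna.upper()
    pam_len = len(pam_pattern)
    sites = []
    for start in range(len(dna) - pam_len - protospacer_len + 1):
        pam = dna[start:start + pam_len]
        if pam_matches(pam_pattern, pam):
            spacer_start = start + pam_len
            sites.append((start, pam, dna[spacer_start:spacer_start + protospacer_len]))
    return sites


if __name__ == "__main__":
    for site in find_target_sites("GATTCACGTAGG", "TTN", protospacer_len=5):
        print(site)
```

    A fuller search would also scan the reverse-complement strand and filter candidates for off-target similarity; both are omitted here to keep the example short.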
  • As used herein, the terms “polynucleotide”, “nucleic acid sequence”, “nucleotide sequence” and “nucleic acid fragment” are used interchangeably and refer to single- or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
  • Nucleotides are referred to by their single-letter designations as follows: “A” for adenosine or deoxyadenosine (for RNA or DNA, respectively), “C” for cytidine or deoxycytidine, “G” for guanosine or deoxyguanosine, “U” for uridine, “T” for deoxythymidine, “R” for purine (A or G), “Y” for pyrimidine (C or T), “K” for G or T, “H” for A or C or T, “I” for inosine, and “N” for any nucleotide.
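    The single-letter designations above can be used to expand a degenerate motif into the concrete DNA sequences it stands for. The Python sketch below is illustrative and not part of the application; it covers only the DNA symbols listed in the text (A, C, G, T, R, Y, K, H, N) and deliberately rejects “U” and “I”, which do not map onto a set of DNA bases. The function name is hypothetical.

```python
from itertools import product

# Degenerate DNA symbols as defined in the text above.
DNA_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def expand_degenerate(pattern: str) -> list:
    """Return every concrete DNA sequence matched by a degenerate pattern."""
    try:
        choices = [DNA_CODES[symbol] for symbol in pattern.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported symbol: {exc.args[0]}") from None
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(expand_degenerate("TTN"))
    print(len(expand_degenerate("YYN")), "sequences match YYN")
    print(len(expand_degenerate("NNGRRT")), "sequences match NNGRRT")
```

    Counting the expansions gives a quick sense of how permissive a motif is: a pattern that expands to more sequences occurs more often in a genome and therefore offers more candidate sites.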
  • The term “polypeptide” and related terms are used interchangeably in this application to refer to a polymer of amino acid residues.
  • The term applies to amino acid polymers in which one or more amino acid residues are an artificial chemical analog of the corresponding naturally occurring amino acid, and to naturally occurring amino acid polymers.
  • The term “polypeptide” may also include modified forms including, but not limited to, glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • The term “sequence identity” or “homology” as used herein has an art-recognized meaning, and the percent sequence identity between two nucleic acid or polypeptide molecules or regions can be calculated using published techniques. Sequence identity can be measured along the entire length of a polynucleotide or polypeptide, or along a region of the molecule. (See, e.g., Computational Molecular Biology, Lesk,
  • Suitable conservative amino acid substitutions in peptides or proteins are well known to the skilled artisan and can generally be made without altering the biological activity of the resulting molecule.
  • Those skilled in the art recognize that single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al., Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).
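    Thresholds such as “at least 80% sequence identity” presuppose a way of counting identical positions. The Python sketch below shows one simple convention and is not the method of the application: it assumes the two sequences have already been aligned (for example with one of the alignment programs named earlier), counts identical non-gap columns, and divides by the number of alignment columns that are not gaps in both sequences. Other conventions use a different denominator and give different percentages. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical non-gap columns divided by all columns
    except those that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides, not Cas12 sequences.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("AC-EF", "ACDEF"))
```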
  • vector refers to a nucleic acid delivery vehicle into which a polynucleotide can be inserted.
  • When the vector enables the expression of a protein encoded by the inserted polynucleotide, or when the vector enables the transcription of the inserted polynucleotide (e.g., into mRNA or functional RNA), the vector is called an expression vector.
  • a vector can be introduced into a host cell by transformation, transduction or transfection, so that the genetic material elements it carries can be expressed in the host cell.
  • Vectors are well known to those skilled in the art, including but not limited to: plasmid vectors, viral vectors and the like.
  • the vector may also contain various regulatory sequences to regulate expression.
  • The terms “regulatory sequence” and “regulatory element” are used interchangeably herein to refer to a nucleotide sequence that is located upstream (5' non-coding sequence), within or downstream (3' non-coding sequence) of a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoter sequences, transcription initiation sequences, enhancer sequences, selection elements, reporter genes, and the like. The regulatory sequences may be of different origin, or of the same origin but arranged in a manner different from that normally found in nature. In addition, the vector may also contain an origin of replication.
  • promoter refers to a nucleic acid segment capable of controlling the transcription of another nucleic acid segment.
  • the promoter is a promoter capable of controlling the transcription of a gene in a cell, whether or not it is derived from said cell.
  • the promoter may be a constitutive promoter or a tissue specific promoter or a developmentally regulated promoter or an inducible promoter.
  • The terms “tissue-specific promoter” and “tissue-preferred promoter” are used interchangeably and refer to a promoter that drives expression primarily, but not necessarily exclusively, in one tissue or organ, and that may also drive expression in a particular cell or cell type.
  • a “developmentally regulated promoter” refers to a promoter whose activity is determined by developmental events.
  • An “inducible promoter” selectively expresses an operably linked DNA sequence in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
  • “Introducing” a nucleic acid molecule (e.g., plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism means transforming cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • Transformation as used in the present invention includes stable transformation and transient transformation.
  • stable transformation refers to the introduction of a foreign nucleotide sequence into the genome, resulting in stable inheritance of the foreign gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • transient transformation refers to the introduction of a nucleic acid molecule or protein into a cell to perform a function without the stable inheritance of a foreign gene. In transient transformation, the exogenous nucleic acid sequence does not integrate into the genome.
  • The term “complementarity” refers to the ability of a nucleic acid sequence to form one or more hydrogen bonds with another nucleic acid sequence by traditional Watson-Crick base pairing or other non-traditional types of pairing. Percent complementarity indicates the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with another nucleic acid sequence (e.g., if 5, 6, 7, 8, 9 or 10 out of 10 residues are complementary, the percent complementarity is 50%, 60%, 70%, 80%, 90% or 100%, respectively). “Perfectly complementary” means that all contiguous residues of one nucleic acid sequence hydrogen bond with the same number of contiguous residues in the other nucleic acid sequence.
  • “Substantially complementary” refers to a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% over a region of 30, 35, 40, 45, 50 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
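    The percent-complementarity arithmetic above (5 to 10 paired residues out of 10 giving 50% to 100%) can be reproduced with a short calculation. The Python sketch below is illustrative and not part of the application: it counts only Watson-Crick pairs, requires two strands of equal length, and pairs them in antiparallel orientation without allowing gaps or bulges. The function name and the 10-nt test strands are hypothetical.

```python
# Watson-Crick pairs for DNA and RNA strands.
WATSON_CRICK = {
    ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"),
}


def percent_complementarity(strand: str, other: str) -> float:
    """Percentage of positions that pair when two 5'->3' strands are antiparallel."""
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        1
        for base, partner in zip(strand.upper(), reversed(other.upper()))
        if (base, partner) in WATSON_CRICK
    )
    return 100.0 * paired / len(strand)


if __name__ == "__main__":
    # All 10 positions pair, then only 5 of 10 pair.
    print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))
    print(percent_complementarity("ACGTACGTAC", "CATGCTACGT"))
```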
  • The term “stringent conditions” used herein in relation to hybridization refers to conditions under which a nucleic acid having complementarity to a target sequence primarily hybridizes to the target sequence and does not substantially hybridize to non-target sequences. Stringent conditions are generally sequence-dependent and depend on many factors. In general, the longer the sequence, the higher the temperature at which the sequence will specifically hybridize to its target sequence. Non-limiting examples of stringent conditions are described in Tijssen (1993), “Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization With Nucleic Acid Probes”, Part I, Chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assay”, Elsevier, New York.
  • The term “hybridization” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized by hydrogen bonding of the bases between the nucleotide residues. Hydrogen bonding can occur by means of Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex may comprise two strands forming a duplex, three or more strands forming a multi-strand complex, a single self-hybridizing strand, or any combination of these.
  • a hybridization reaction may constitute a step in a broader process such as the initiation of PCR, or cleavage of a polynucleotide by an enzyme. A sequence that is capable of hybridizing to a given sequence is called the "complement" of that given sequence.
  • the Cas12 protein can be derivatized, for example linked to another molecule (e.g., another protein or polypeptide).
  • the Cas12 protein can be functionally linked (by chemical coupling, gene fusion, non-covalent association or other means) to one or more other molecular moieties, such as another protein or polypeptide, a detectable label, a pharmaceutical agent, etc.
  • the Cas12 protein can be linked to other functional units.
  • it can be linked to a nuclear localization signal (NLS) sequence to enhance the ability of the protein of the invention to enter the nucleus.
  • it can be linked with a targeting moiety so that the Cas12 protein is targeted.
  • it can be linked with a detectable label to facilitate the detection of Cas12 protein.
  • it can be linked with an epitope tag to facilitate the expression, detection, tracking and/or purification of the Cas12 protein.
  • the present invention provides a conjugate comprising:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • BgCas12a protein of the amino acid sequence shown in SEQ ID NO: 5, or
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the "biological activity" of the Cas12 protein includes, but is not limited to, the activity of binding a single-stranded guide RNA, endonuclease activity (including single-strand cleavage activity and double-strand cleavage activity), and/or guide RNA (gRNA)-guided binding and cleavage at a specific site in a target sequence.
  • said homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
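  • The point-mutant labels used above (for example T207A or T207A-N616S-I902T) follow the common convention of wild-type residue, 1-based position, substituted residue, with combined mutants joined by hyphens. The Python sketch below parses such labels and applies them to a protein sequence. The function names `parse_mutant` and `apply_mutant` and the short toy sequence are hypothetical, since the SEQ ID NO sequences themselves are not reproduced in this section.

```python
import re

# One substitution: wild-type residue, 1-based position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutant(label: str) -> list[tuple[str, int, str]]:
    """Split a label such as 'T207A-N616S' into (wild_type, position, new) tuples."""
    mutations = []
    for part in label.split("-"):
        match = MUTATION_RE.match(part)
        if match is None:
            raise ValueError(f"unrecognised mutation: {part!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def apply_mutant(protein: str, label: str) -> str:
    """Return `protein` with every substitution in `label` applied.

    Raises ValueError if a position is out of range or the wild-type
    residue in the label does not match the sequence.
    """
    residues = list(protein)
    for wild_type, position, new in parse_mutant(label):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutant("T207A-N616S-I902T"))
    # Toy 6-residue sequence, not a real Cas12 protein.
    print(apply_mutant("MKTAYN", "T3A-N6S"))
```

  • Checking the wild-type residue before substituting guards against applying a label to the wrong parent protein or to a sequence with shifted numbering.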
  • in addition to being used on its own, the Cas12 protein can be combined with other substances, such as other proteins or detectable labels, to impart additional functions.
  • the modified moiety may be an additional protein or polypeptide, a detectable label, or a combination thereof.
  • said additional protein or polypeptide is selected from one or more of an epitope tag, a reporter protein, a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylases DNMT3A and MQ1, cytosine demethylase Tet1, transcription activators VP64, p65 and RTA, transcription repressor KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI.
  • Epitope tags are well known to those skilled in the art, examples of which include but are not limited to His, V5, FLAG, HA, Myc, VSV-G, Trx, etc., and those skilled in the art know how to select an appropriate epitope tag according to the intended purpose (e.g., detection or tracking).
  • Reporter proteins are well known to those skilled in the art, and examples thereof include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP, and the like.
  • Detectable labels are well known to those skilled in the art, examples of which include fluorescent dyes such as fluorescein isothiocyanate (FITC) or DAPI.
  • the Cas12 protein of the present invention can be coupled, conjugated or fused to the modified part through a linker, or directly connected to the modified part without a linker.
  • Linkers are well known in the art, examples of which include but are not limited to linkers comprising 1-50 amino acids (such as Glu or Ser) or amino acid derivatives (such as Ahx, β-Ala, GABA or Ava), or PEG, etc.
  • the present invention provides a fusion protein comprising:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G
  • the additional protein or polypeptide may be selected from one or more of an epitope tag, a reporter protein, a nuclear localization signal (NLS) sequence, cytosine deaminase (CBE), adenine deaminase (ABE), cytosine methylases DNMT3A and MQ1, cytosine demethylase Tet1, transcription activators VP64, p65 and RTA, transcription repressor KRAB, histone acetylase p300, histone deacetylase LSD1, and endonuclease FokI.
  • Epitope tags are well known to those skilled in the art, examples of which include but are not limited to His, V5, FLAG, HA, Myc, VSV-G, Trx, etc., and those skilled in the art know how to select an appropriate epitope tag according to the intended purpose (e.g., detection or tracking).
  • Reporter proteins are well known to those skilled in the art, and examples thereof include, but are not limited to, GST, HRP, CAT, GFP, HcRed, DsRed, CFP, YFP, BFP, and the like.
  • Detectable labels are well known to those skilled in the art, examples of which include fluorescent dyes such as fluorescein isothiocyanate (FITC) or DAPI.
  • the Cas12 protein of the present invention can be coupled, conjugated or fused with the other protein or polypeptide through a linker, or directly connected with the other protein or polypeptide without a linker.
  • Linkers are well known in the art, examples of which include but are not limited to linkers comprising 1-50 amino acids (such as Glu or Ser) or amino acid derivatives (such as Ahx, β-Ala, GABA or Ava), or PEG, etc.
  • the fusion protein comprises: the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, an adenine deaminase (ABE), and optionally a linker connecting the Cas12J-8 protein and the adenine deaminase (ABE).
  • the fusion protein consists, in order from its N-terminus to its C-terminus, of the adenine deaminase (ABE), the linker, and the Cas12J-8 protein.
  • the amino acid sequence of the fusion protein is shown in SEQ ID NO: 7.
  • the Cas12J-8 protein is small; in particular, it has the fewest amino acids among the gene editors currently available for eukaryotic cells, and can therefore be efficiently packaged into expression vectors such as adeno-associated virus vectors.
  • the protein has high specificity and a simple PAM, and its small molecular weight allows it to be easily packaged by vector tools such as adeno-associated virus, which makes it well suited to later development as a gene therapy tool.
  • the PAM of the Cas12J-8 protein is TTN, which is simple and gives a wide editing range.
  • Cas12j-8 protein has a significant advantage in editing efficiency at random sites compared with FnCas12a protein, and has a strong gene editing ability in a eukaryotic environment.
  • the Cas12j-8 protein has a very significant editing advantage, and the editing ability at random sites is significantly higher than that of the Cas12j-2 protein, which is more suitable for the development and application research of gene editing.
  • the Cas12a protein and Cas12b protein of the present invention have higher editing activity, higher specificity, and a relatively simple PAM sequence; the PAM of the Cas12a protein and Cas12b protein is YYN, which broadens the range of targetable sites and increases the application range of Cas12a and Cas12b proteins.
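  • To make the PAM notation concrete, the sketch below scans one strand of a DNA sequence for candidate target sites. It assumes the PAM lies immediately 5' of the target sequence, as stated elsewhere in this description, and uses the standard IUPAC meanings Y = C or T and N = any base. The function name `find_target_sites`, the default 24-nucleotide target length, and the single-strand search are illustrative assumptions, not a description of how sites were selected in the invention.

```python
import re

# IUPAC codes needed for the PAMs named in the text (TTN and YYN).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}


def find_target_sites(dna: str, pam: str, target_len: int = 24):
    """Yield (pam_start, pam_sequence, target) for each PAM hit on the given strand.

    The target is the `target_len` nucleotides directly 3' of the PAM.
    Only the strand passed in is searched; a full scan would also
    search the reverse complement.
    """
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=({pam_regex})([ACGT]{{{target_len}}}))")
    for match in pattern.finditer(dna.upper()):
        yield match.start(), match.group(1), match.group(2)


if __name__ == "__main__":
    demo = "GG" + "TTA" + "ACGT" * 6 + "CC"  # made-up sequence
    for hit in find_target_sites(demo, "TTN"):
        print(hit)
```

  • Because YYN matches any two pyrimidines followed by any base, it accepts every TTN site plus CCN, CTN and TCN sites, which is why a YYN PAM covers more candidate positions than TTN.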
  • the present invention provides a single-stranded guide RNA comprising a CRISPR repeat sequence, the CRISPR repeat sequence having:
  • for the BgCas12a protein, its homologue, conjugate or fusion protein, the nucleic acid sequence shown in SEQ ID NO: 17, or
  • for the ChCas12b protein, its homologue, conjugate or fusion protein, the nucleic acid sequence shown in SEQ ID NO: 18;
  • a nucleic acid sequence obtained by modifying the nucleic acid sequence shown in any one of SEQ ID NO: 15 to SEQ ID NO: 18 and retaining its biological activity.
  • the modification may be one or more of base phosphorylation, base sulfuration, base methylation, base hydroxylation, sequence shortening and sequence lengthening.
  • the shortening or lengthening of said sequence comprises the deletion or addition of one, two, three, four, five, six, seven, eight, nine or ten bases.
  • the single-stranded guide RNA may further include a CRISPR spacer sequence at the 3' end of the CRISPR repeat sequence, and the CRISPR spacer sequence is a sequence of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides (preferably 24 nucleotides) capable of complementary pairing with the target sequence.
  • the CRISPR spacer sequence is a sequence of 24 nucleotides in length and capable of complementary pairing with the target sequence.
  • said single-stranded guide RNA further comprises a terminator at the 3' end of said spacer sequence.
  • the terminator may consist of a run of at least six (e.g., seven or eight) U residues.
  • the single-stranded guide RNA can combine with the above-mentioned Cas12 protein, conjugate or fusion protein to form a complex, which recognizes the corresponding PAM and binds to the target sequence, thereby cleaving the target sequence, i.e., achieving gene editing.
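  • The layout of the single-stranded guide RNA described above (CRISPR repeat, then a 20 to 30 nucleotide spacer at its 3' end, then a terminator of at least six U residues) can be sketched as a simple string assembly. The function name `build_sgrna` and the placeholder repeat used in the example are hypothetical; the actual repeat sequences are those of SEQ ID NO: 15 to SEQ ID NO: 18, which are not reproduced in this section.

```python
def build_sgrna(repeat: str, spacer: str, terminator_length: int = 6) -> str:
    """Assemble 5'-repeat-spacer-terminator-3' as an RNA string.

    Enforces the ranges given in the text: a spacer of 20 to 30
    nucleotides and a terminator of at least six U residues.
    DNA-style input (T) is converted to RNA (U).
    """
    if not 20 <= len(spacer) <= 30:
        raise ValueError("spacer must be 20 to 30 nucleotides long")
    if terminator_length < 6:
        raise ValueError("terminator must contain at least six U residues")

    def to_rna(sequence: str) -> str:
        return sequence.upper().replace("T", "U")

    return to_rna(repeat) + to_rna(spacer) + "U" * terminator_length


if __name__ == "__main__":
    placeholder_repeat = "GCGCAUAUGCGC"  # hypothetical, NOT a SEQ ID NO sequence
    spacer = "ACGT" * 6                  # 24 nt, the preferred length in the text
    print(build_sgrna(placeholder_repeat, spacer))
```

  • Validating the spacer and terminator lengths at assembly time keeps a construct within the ranges stated in the description before it is ordered or cloned.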
  • the present invention provides an isolated nucleic acid molecule comprising a nucleic acid sequence encoding:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G,
  • the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding a fusion protein shown in SEQ ID NO:7.
  • the isolated nucleic acid molecule comprises the nucleic acid sequence shown in SEQ ID NO: 14 or its degenerate sequence.
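  • A "degenerate sequence" is read here as a different nucleotide sequence that encodes the same amino acid sequence through codon degeneracy. Under that reading, and assuming the standard genetic code, the Python sketch below translates two coding sequences and checks whether one is a degenerate variant of the other. The function names `translate` and `is_degenerate_variant` and the short example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence (length divisible by 3) into protein."""
    sequence = coding_sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(
        CODON_TABLE[sequence[i:i + 3]] for i in range(0, len(sequence), 3)
    )


def is_degenerate_variant(sequence_a: str, sequence_b: str) -> bool:
    """True if the sequences differ at the DNA level but encode the same protein."""
    return (
        sequence_a.upper() != sequence_b.upper()
        and translate(sequence_a) == translate(sequence_b)
    )


if __name__ == "__main__":
    # GCT and GCC both encode Ala; AAA and AAG both encode Lys.
    print(translate("ATGGCTAAA"))
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```

  • A check of this kind is useful when a coding sequence is codon-optimized for a host cell, since the optimized sequence should still translate to the unchanged protein.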
  • the isolated nucleic acid molecule also encodes the single-stranded guide RNA corresponding to the Cas12 protein of the third aspect of the present invention.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, its homologue, conjugate or fusion protein (such as the fusion protein shown in SEQ ID NO: 7), for example the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14, and further comprises a nucleic acid sequence encoding the single-stranded guide RNA for the Cas12J-8 protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 15 and retaining its biological activity, for example the nucleic acid sequence shown in SEQ ID NO: 19.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11, and further comprises a nucleic acid sequence encoding the single-stranded guide RNA for the Cas12a protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity, for example the nucleic acid sequence shown in SEQ ID NO: 20.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 12, and further comprises a nucleic acid sequence encoding the single-stranded guide RNA for the BgCas12a protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 17 and retaining its biological activity, for example the nucleic acid sequence shown in SEQ ID NO: 21.
  • the isolated nucleic acid molecule comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 13, and further comprises a nucleic acid sequence encoding the single-stranded guide RNA for the ChCas12b protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 18 and retaining its biological activity, for example the nucleic acid sequence shown in SEQ ID NO: 22.
  • the invention provides an isolated nucleic acid molecule encoding the single-stranded guide RNA of the third aspect of the invention.
  • the isolated nucleic acid molecule comprises the nucleic acid sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, or a degenerate sequence thereof.
  • said isolated nucleic acid molecule further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
  • the isolated nucleic acid molecule of the present invention can express the Cas12 protein, its conjugate or fusion protein, and/or the above-mentioned single-stranded guide RNA, which then perform their corresponding functions, such as gene editing.
  • the isolated nucleic acid molecule of the present invention can express the Cas12 protein, its conjugate or fusion protein, and the single-stranded guide RNA separately, or express them together as a single expression product; the choice of expression method depends on the circumstances.
  • the present invention provides a vector comprising a nucleic acid sequence encoding the following:
  • Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • said homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 and SEQ ID NO: 13, or a degenerate sequence thereof.
  • the vector comprises a nucleic acid sequence encoding a fusion protein shown in SEQ ID NO:7.
  • the vector comprises the nucleic acid sequence shown in SEQ ID NO: 14 or its degenerate sequence.
  • the vector may be an expression vector, such as a plasmid vector such as a pUC19 vector, an episomal vector, a pAAV2_ITR vector, a retroviral vector, a lentiviral vector, an adenoviral vector or an adeno-associated viral vector.
  • the vector further comprises a nucleic acid sequence encoding the single-stranded guide RNA corresponding to the Cas12 protein of the third aspect of the present invention.
  • the vector comprises a nucleic acid sequence encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, its homologue, conjugate or fusion protein (such as the fusion protein shown in SEQ ID NO: 7), for example the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14, and further comprises a nucleic acid sequence encoding the single-stranded guide RNA for the Cas12J-8 protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 15 and retaining its biological activity, for example the nucleic acid sequence shown in SEQ ID NO: 19.
  • the vector comprises a nucleic acid sequence encoding the Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, its homologue, conjugate or fusion protein, for example the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11, and further comprises a nucleic acid sequence encoding the single-stranded guide RNA for the Cas12a protein, its homologue, conjugate or fusion protein, the single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity, for example the nucleic acid sequence shown in SEQ ID NO: 20.
  • the vector comprises a nucleic acid sequence encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 12, and further comprises a nucleic acid sequence encoding, for the BgCas12a protein, its homologue, conjugate or fusion protein, a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 17 and retains its biological
  • the vector comprises a nucleic acid sequence encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, such as the nucleic acid sequence shown in SEQ ID NO: 13, and further comprises a nucleic acid sequence encoding, for the ChCas12b protein, its homologue, conjugate or fusion protein, a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, comprising a homologous sequence that has at least 90% sequence identity with SEQ ID NO: 18 and retains its biological
  • the present invention provides a vector comprising a nucleic acid molecule encoding the single-stranded guide RNA of the third aspect of the present invention.
  • the vector comprises the nucleic acid sequence shown in any one of SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22 or a degenerate sequence thereof.
  • the vector further comprises a nucleic acid sequence encoding a CRISPR spacer sequence.
  • the nucleic acid sequence cloned in the vector can be expressed as the Cas12 protein, its conjugate or fusion protein, and/or the above-mentioned single-stranded guide RNA, which can then perform their corresponding functions, such as gene editing.
  • multiple vectors such as two vectors, can be transfected into cells, one of which expresses the Cas12 protein, its conjugate or fusion protein, and the other expresses a single-stranded guide RNA. Subsequently, the expressed Cas12 protein, its conjugate or fusion protein complexes with the expressed single-stranded guide RNA to form a complex, where it performs corresponding functions, such as gene editing.
  • the nucleic acid sequence encoding the Cas12 protein, its conjugate or fusion protein, and the nucleic acid sequence encoding the single-stranded guide RNA can also be cloned into a single vector, so that transfecting this vector into cells expresses both the Cas12 protein, its conjugate or fusion protein, and the single-stranded guide RNA, which then perform their corresponding functions, such as gene editing.
  • the present invention provides a CRISPR/Cas12 gene editing system, which comprises:
  • a) protein component comprising:
  • Cas12 protein the Cas12 protein is:
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • BgCas12a protein of the amino acid sequence shown in SEQ ID NO: 5, or
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • 1.2) a homologue having at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999% or 100% sequence identity, or any percentage of sequence identity between 80% and 100%, with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and retaining its biological activity;
  • b) a nucleic acid component comprising: the single-stranded guide RNA according to the third aspect of the present invention that corresponds to the protein component in a);
  • the protein component and the nucleic acid component combine with each other to form a complex.
  • said homologue can be a point mutant of any one of the Cas12 proteins shown in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • the protein component comprises the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 15 and retaining its biological activity.
  • the protein component comprises the Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity.
  • the protein component comprises the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 17 and retaining its biological activity.
  • the protein component comprises the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and the nucleic acid component comprises a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • the expression "at least 90% sequence identity" mentioned for single-stranded guide RNAs may be, for example, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.9%, or 100% sequence identity.
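  • The identity thresholds used in this description (at least 80% for protein homologues, at least 90% for guide RNA repeat homologues) can be illustrated with a small calculation. The sketch below assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and defines identity as matching columns divided by alignment columns; that definition, the function names `percent_identity` and `meets_identity_threshold`, and the example sequences are assumptions for illustration. In practice the alignment itself would come from a dedicated alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a residue
    facing a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 90.0
) -> bool:
    """True if the aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACGUACGUAC"  # made-up 10-nt sequence
    variant = "ACGUACGUAG"    # one substitution
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant, threshold=90.0))
```

  • Note that a sequence-identity threshold is only half of the stated criterion: a homologous sequence must also retain its biological activity, which has to be established experimentally.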
  • the CRISPR/Cas12 gene editing system of the present invention can be constituted directly by the Cas12 protein described herein, its homologue, or their conjugates or fusion proteins, together with the single-stranded guide RNA described herein, or can be constituted by the expression products obtained by expression from the vectors described herein.
  • the CRISPR/Cas12 gene editing system of the present invention realizes the recognition, positioning, cutting and gene editing of the target sequence through the joint action of the Cas12 protein contained therein and the single-stranded guide RNA.
  • the CRISPR/Cas12 gene editing system of the present invention can precisely locate the target sequence.
  • precise positioning has two meanings: first, the CRISPR/Cas12 gene editing system of the present invention can itself recognize and bind the target sequence; second, the CRISPR/Cas12 gene editing system of the present invention can bring other proteins fused to the Cas12 protein, or proteins that specifically recognize the sgRNA, to the position of the target sequence.
  • the CRISPR/Cas12 gene editing system of the present invention has a low tolerance to non-target sequences.
  • the so-called "low tolerance" means that the CRISPR/Cas12 gene editing system of the present invention is essentially or completely unable to recognize and bind non-target sequences, and essentially or completely unable to bring other proteins fused to the Cas12 protein, or proteins that specifically recognize the sgRNA, to the position of a non-target sequence.
  • the CRISPR/Cas12 gene editing system of the present invention can target more DNA sequences in the genome because the PAM sequence on the target sequence recognized by the Cas12 protein contained therein is simpler.
  • the present invention provides a cell comprising: the isolated nucleic acid molecule of the fourth and fifth aspects of the present invention, or the vector of the sixth and seventh aspects of the present invention.
  • the cells may be prokaryotic or eukaryotic.
  • the eukaryotic cell may be, for example, a plant cell or an animal cell.
  • the animal cell may be, for example, a mammalian cell such as a human cell.
  • the present invention provides a method for gene editing a target sequence in a cell or in vitro environment, the method comprising combining any one of the following (1) to (4) with the intracellular or in vitro environment
  • the Cas12 protein is:
  • Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1,
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • BgCas12a protein of the amino acid sequence shown in SEQ ID NO: 5, or
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • the Cas12 protein, its homologue, conjugate or fusion protein recognizes its respective protospacer adjacent motif (PAM), which is located at the 5' end of the target sequence; for the Cas12J-8 protein, the Mb4Cas12a protein, the MlCas12a protein, the MoCas12a protein, the BgCas12a protein and the ChCas12b protein, or their respective homologues, conjugates or fusion proteins, the PAMs are 5'-TTN, 5'-YYN, 5'-YYN, 5'-YYN, 5'-YYN and 5'-TTN, respectively.
  • a Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, its homologue, conjugate or fusion protein, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • a BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 17 and retaining its biological activity;
  • a ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • said homologue can be a point mutant of any one of the Cas12 proteins of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, and point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
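The mutant names above follow the usual wild-type residue / position / replacement notation, with combined mutants joined by hyphens. The following is a minimal sketch of how such a name could be applied to a protein sequence, assuming 1-based positions; the function name `apply_point_mutations` and the toy sequence are hypothetical, and the real mutants would have to be applied to the sequences in the sequence listing.

```python
import re


def apply_point_mutations(protein_seq, mutant_name):
    """Apply mutations written like 'T207A' or 'T207A-N616S' (1-based positions)."""
    residues = list(protein_seq)
    for token in mutant_name.split("-"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognised mutation: {token}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1
        # Refuse to mutate if the stated wild-type residue is not at that position.
        if index < 0 or index >= len(residues) or residues[index] != wild_type:
            raise ValueError(f"{token}: position {index + 1} is not {wild_type}")
        residues[index] = replacement
    return "".join(residues)


# Toy sequence for illustration only.
mutant = apply_point_mutations("MKTAY", "T3A-Y5F")
```

Checking the wild-type residue before substituting catches off-by-one numbering errors early.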
  • a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for the Cas12J-8 protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 15 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for the Mb4Cas12a protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and a vector comprising a nucleic acid sequence encoding a single-stranded guide RNA for this BgCas12a protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for this ChCas12b protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • the cell is a prokaryotic cell or a eukaryotic cell, such as a plant cell or an animal cell, such as a mammalian cell such as a human cell.
  • the gene editing includes one or more of gene knockout of the target sequence, site-directed base change, site-directed insertion, regulation of gene transcription level, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single base conversion, and chromatin imaging tracking.
  • the single base conversion comprises the conversion of the base adenine to guanine, the conversion of cytosine to thymine or the conversion of cytosine to uracil.
  • the CRISPR spacer sequence of the single-stranded guide RNA forms a complete base pairing structure with the target sequence, and forms an incomplete base pairing structure with the non-target sequence.
  • the incomplete base complementary pairing structure refers to a structure that includes both complementary base pairs and non-complementary positions, for example base mismatches and/or base bulges.
  • the incomplete complementary base pairing structure includes one or more, e.g., two or more, base mismatches.
  • the Cas12 protein of the present invention can cut the target site on the target sequence, and under the cleavage action of the Cas12 protein, a double-strand break occurs in the target sequence. Further, when the method is carried out in cells, the cleaved target sequence can be repaired by intracellular non-homologous end joining repair or homologous recombination repair pathway, thereby realizing gene editing of the target sequence.
  • the CRISPR/Cas12 gene editing system of the present invention, and the gene editing method using it, were found through experiments to have editing efficiencies of 40%-70% (for the Cas12J-8 protein), 12%-56% (for the ChCas12b protein) and 10%-20% (for each of the remaining Cas12a proteins).
  • mismatches within the first 14 bp of the guide RNA are tolerated at a rate close to 0%. Therefore, the gene editing system can edit target genes with high specificity, combines high editing efficiency with a low off-target rate, and can be widely used for gene editing in cells or in in vitro environments.
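The statement about the first 14 bp can be illustrated with a small position check. This is only a sketch under stated assumptions: positions are counted from the 5' end of the spacer, spacer and candidate site are compared base by base at equal length (bulges are not modelled), and the names `mismatch_positions` and `has_mismatch_in_first_14` as well as the toy sequences are hypothetical.

```python
def mismatch_positions(spacer, site):
    """Return 1-based positions at which spacer and candidate site differ."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    pairs = zip(spacer.upper(), site.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


def has_mismatch_in_first_14(spacer, site, window=14):
    """True if any mismatch falls inside the first `window` positions."""
    return any(pos <= window for pos in mismatch_positions(spacer, site))


# Toy 20-nt spacer and two hypothetical off-target sites.
spacer = "ACGTACGTACGTACGTACGT"
site_a = "ACCTACGTACGTACGTAGGT"  # differs at positions 3 and 18
site_b = "ACGTACGTACGTACGTAGGT"  # differs at position 18 only
flags = (has_mismatch_in_first_14(spacer, site_a), has_mismatch_in_first_14(spacer, site_b))
```

Under the behaviour described in the text, a site like `site_a` would be expected to be edited at a rate close to 0%, while the sketch makes no prediction for `site_b`.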
  • the present invention provides a kit for gene editing a target sequence in a cell or in an in vitro environment, comprising:
  • the Cas12 protein is:
  • MlCas12a protein of the amino acid sequence shown in SEQ ID NO: 3,
  • ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6,
  • 1.2) a homologue whose amino acid sequence has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, at least 99.95%, at least 99.99%, at least 99.999%, at least 100%, or any percentage between 80% and 100%, sequence identity with the amino acid sequence shown in any one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, and which retains its biological activity;
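For orientation, percent sequence identity between two already-aligned sequences can be computed as below. This is a sketch only: the text does not say which alignment tool or identity definition was used, so the choice of counting identical residues over all alignment columns, the gap character `-`, and the function names are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Identity is defined here as identical residues divided by the number of
    alignment columns; other definitions exist and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns that are gaps in both sequences carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the pair reaches the given identity threshold (80% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned peptides, not sequences from the sequence listing.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
```

A homologue in the sense of item 1.2 would additionally have to retain biological activity, which no sequence comparison can establish.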
  • a Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, its homologue, conjugate or fusion protein, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence based on SEQ ID NO: 15 and retaining its biological activity;
  • the Cas12a protein having the amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, its homologue having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, their conjugates or fusion proteins, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 5, their conjugates or fusion proteins, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence based on SEQ ID NO: 17 and retaining its biological activity;
  • the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue having an amino acid sequence with at least 80% sequence identity to SEQ ID NO: 6, their conjugates or fusion proteins, and a single-stranded guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a single-stranded guide RNA comprising a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a single-stranded guide RNA comprising a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • said homologue can be a point mutant of any one of the Cas12 proteins of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, including but not limited to point mutants of the Mb4Cas12a protein such as T207A, R432G, I902T, T207A-N616S or T207A-N616S-I902T, point mutants of the MlCas12a protein such as V68T, R347Q, V68T-R347Q or V68T-R347Q-V1109K, point mutants of the MoCas12a protein such as Q784E, T902I, I1105K, Q784E-I1105K or Q784E-T902I-I1105K, and point mutants of the BgCas12a protein such as Q144R, D148G, V279I, Q144R
  • an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14) encoding the Cas12J-8 protein having the amino acid sequence shown in SEQ ID NO: 1, its homologue, conjugate or fusion protein (such as the fusion protein shown in SEQ ID NO: 7), and an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for this Cas12J-8 protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 15 and retaining its biological activity;
  • an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11), and an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for the Cas12a protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 21) encoding a single-stranded guide RNA for this BgCas12a protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 17 and retaining its biological activity;
  • an isolated nucleic acid molecule comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and an isolated nucleic acid molecule comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for this ChCas12b protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • a vector comprising a nucleic acid sequence (for example, the nucleic acid sequence shown in SEQ ID NO: 8 or SEQ ID NO: 14), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 19) encoding a single-stranded guide RNA for the Cas12J-8 protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 15, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 15 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 15 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11), and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 20) encoding a single-stranded guide RNA for the Cas12a protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 16, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 16 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 16 and retaining its biological activity;
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 12) encoding the BgCas12a protein having the amino acid sequence shown in SEQ ID NO: 5, its homologue, conjugate or fusion protein, and a vector comprising a nucleic acid sequence encoding a single-stranded guide RNA for this BgCas12a protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 17, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 17 and retaining its biological
  • a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 13) encoding the ChCas12b protein having the amino acid sequence shown in SEQ ID NO: 6, its homologue, conjugate or fusion protein, and a vector comprising a nucleic acid sequence (such as the nucleic acid sequence shown in SEQ ID NO: 22) encoding a single-stranded guide RNA for this ChCas12b protein, its homologue, conjugate or fusion protein, the guide RNA comprising the CRISPR repeat sequence shown in SEQ ID NO: 18, a homologous sequence having at least 90% sequence identity with SEQ ID NO: 18 and retaining its biological activity, or a modified sequence based on SEQ ID NO: 18 and retaining its biological activity.
  • the kit of the present invention may also contain other reagents that are useful for gene editing.
  • SEQ ID NO: 1 Cas12J-8 protein sequence
  • SEQ ID NO: 2 Mb4Cas12a protein sequence
  • SEQ ID NO: 3 MlCas12a protein sequence
  • SEQ ID NO: 4 MoCas12a protein sequence
  • SEQ ID NO: 5 BgCas12a protein sequence
  • SEQ ID NO: 6 ChCas12b protein sequence
  • SEQ ID NO: 7 fusion protein comprising Cas12J-8 protein
  • SEQ ID NO: 8 Coding sequence of Cas12J-8 protein
  • SEQ ID NO: 9 Coding sequence of Mb4Cas12a protein
  • SEQ ID NO: 10 coding sequence of MlCas12a protein
  • SEQ ID NO: 11 Coding sequence of MoCas12a protein
  • SEQ ID NO: 12 coding sequence of BgCas12a protein
  • SEQ ID NO: 13 Coding sequence of ChCas12b protein
  • SEQ ID NO: 14 fusion protein coding sequence comprising Cas12J-8 protein
  • SEQ ID NO: 15 CRISPR repeat sequence used in conjunction with Cas12J-8 protein
  • SEQ ID NO: 16 CRISPR repeat sequence used in conjunction with Mb4Cas12a, MlCas12a and MoCas12a proteins
  • SEQ ID NO: 17 CRISPR repeat sequence used in conjunction with BgCas12a protein
  • SEQ ID NO: 18 CRISPR repeat sequence used in conjunction with ChCas12b protein
  • SEQ ID NO: 19 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with Cas12J-8 protein
  • SEQ ID NO: 20 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with Mb4Cas12a, MlCas12a, and MoCas12a proteins
  • SEQ ID NO: 21 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with BgCas12a protein
  • SEQ ID NO: 22 DNA sequence of CRISPR repeat sequence of single-stranded guide RNA associated with ChCas12b protein
  • SEQ ID NO: 23 Cas12J-4 protein sequence
  • SEQ ID NO: 24 Cas12J-5 protein sequence
  • SEQ ID NO: 25 Cas12J-7 protein sequence
  • SEQ ID NO: 26 Cas12J-9 protein sequence
  • SEQ ID NO: 27 Coding sequence of Cas12J-4 protein
  • SEQ ID NO: 28 Coding sequence of Cas12J-5 protein
  • SEQ ID NO: 29 Coding sequence of Cas12J-7 protein
  • SEQ ID NO: 30 Coding sequence of Cas12J-9 protein
  • SEQ ID NO: 31 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-4 protein
  • SEQ ID NO: 32 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-5 protein
  • SEQ ID NO: 33 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-7 protein
  • SEQ ID NO: 34 DNA sequence of CRISPR repeat sequence used in conjunction with Cas12J-9 protein
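The relationships in the sequence listing above (protein, coding sequence, CRISPR repeat and the DNA form of the repeat) can be summarised as a small lookup table. The SEQ ID numbers are taken from the listing for the six proteins of SEQ ID NO: 1 to 6; the names `SeqIds`, `SEQ_IDS` and `proteins_sharing_repeat` are illustrative.

```python
from typing import NamedTuple


class SeqIds(NamedTuple):
    protein: int      # amino acid sequence
    coding: int       # coding sequence of the protein
    repeat: int       # CRISPR repeat sequence used with the protein
    repeat_dna: int   # DNA sequence of the repeat of the single-stranded guide RNA


# SEQ ID NO values as given in the sequence listing.
SEQ_IDS = {
    "Cas12J-8": SeqIds(1, 8, 15, 19),
    "Mb4Cas12a": SeqIds(2, 9, 16, 20),
    "MlCas12a": SeqIds(3, 10, 16, 20),
    "MoCas12a": SeqIds(4, 11, 16, 20),
    "BgCas12a": SeqIds(5, 12, 17, 21),
    "ChCas12b": SeqIds(6, 13, 18, 22),
}


def proteins_sharing_repeat(repeat_id):
    """Names of the proteins that use the given CRISPR repeat SEQ ID NO."""
    return sorted(name for name, ids in SEQ_IDS.items() if ids.repeat == repeat_id)


shared = proteins_sharing_repeat(16)
```

The lookup makes explicit that Mb4Cas12a, MlCas12a and MoCas12a share one repeat (SEQ ID NO: 16), whereas the other proteins each have their own.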
  • For each Cas12 protein listed in Table 1, the amino acid sequence was downloaded; the amino acid sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively.
  • Cas12 protein name | NCBI Protein Search ID | Amino acid sequence
    Cas12J-8 | none | SEQ ID NO: 1
    Mb4Cas12a | WP_078273923.1 | SEQ ID NO: 2
    MlCas12a | WP_065256572.1 | SEQ ID NO: 3
    MoCas12a | WP_112744621.1 | SEQ ID NO: 4
    BgCas12a | OLA11341.1 | SEQ ID NO: 5
    ChCas12b | OQB30769 | SEQ ID NO: 6
  • The coding nucleic acid sequences of the above Cas12 proteins were codon-optimized to obtain gene sequences that are highly expressed in human cells. The optimized gene sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 8 to SEQ ID NO: 13, respectively.
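The text does not describe how the codon optimization was carried out. Purely to illustrate the idea, the sketch below back-translates a protein using one fixed codon per amino acid; the table `PREFERRED_CODON` is an illustrative assumption (each codon does encode the listed residue, but the choice is not taken from the patent), and real optimization tools also weigh GC content, repeats and restriction sites.

```python
# Illustrative one-codon-per-residue table (an assumption of this sketch).
PREFERRED_CODON = {
    "A": "GCC", "R": "CGG", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def back_translate(protein, add_stop=True):
    """Back-translate a protein into DNA using the fixed codon table above."""
    protein = protein.upper().rstrip("*")
    try:
        codons = [PREFERRED_CODON[residue] for residue in protein]
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None
    if add_stop:
        codons.append(PREFERRED_CODON["*"])
    return "".join(codons)


def translate(dna):
    """Translate DNA built from the table above (not a general translator)."""
    reverse = {codon: residue for residue, codon in PREFERRED_CODON.items()}
    return "".join(reverse[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Toy peptide, not a sequence from the sequence listing.
dna = back_translate("MKTAY")
```

The round trip through `translate` is a quick check that the optimized sequence still encodes the same protein.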
  • The codon-optimized gene sequences of each Cas12 protein obtained above (SEQ ID NO: 8 to SEQ ID NO: 13) were gene-synthesized and constructed on the slugCas9 backbone plasmid (Addgene platform, catalog #163793) to obtain the plasmid pAAV2_Cas12_ITR.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid (Addgene platform, commercially available, catalog #74707) was digested with BbsI and XhoI restriction endonucleases.
  • the digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
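The digestion recipe above fixes the plasmid mass and the total volume, so the plasmid and water volumes depend on the concentration of the plasmid prep. A minimal sketch of that arithmetic follows; the function name `digestion_mix` and the example concentration of 500 ng/μL are assumptions, while the default amounts are those stated in the text.

```python
def digestion_mix(plasmid_ng_per_ul, plasmid_ug=1.0, buffer_ul=5.0,
                  enzyme_ul=(1.0, 1.0), total_ul=50.0):
    """Volumes (in microlitres) for a restriction digest made up to total_ul with water."""
    plasmid_ul = plasmid_ug * 1000.0 / plasmid_ng_per_ul
    enzymes_ul = sum(enzyme_ul)
    water_ul = total_ul - plasmid_ul - buffer_ul - enzymes_ul
    if water_ul < 0:
        raise ValueError("plasmid prep is too dilute for this reaction volume")
    return {
        "plasmid_ul": plasmid_ul,
        "buffer_10x_ul": buffer_ul,
        "enzymes_ul": enzymes_ul,
        "water_ul": water_ul,
    }


# Hypothetical plasmid prep at 500 ng/uL.
mix = digestion_mix(500.0)
```

With 5 μL of 10× buffer in 50 μL the buffer ends up at 1×; for the single-enzyme BbsI digest described later, `enzyme_ul=(1.0,)` would be passed instead.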
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296 bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The repeat sequence on the Cas12J-8 protein genome (its DNA sequence is SEQ ID NO: 19) was gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid Cas12J-8-PSK-u6-crRNA.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid was digested with BbsI and XhoI restriction endonucleases.
  • the digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296 bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The truncated repeat sequences (whose DNA sequences are SEQ ID NO: 20 and SEQ ID NO: 21, respectively) were gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid psk-BbsI-Cas12a-crRNA1.
  • the pX330_sgACTA2 plasmid (Addgene platform, catalog #63712) was digested with BsaI and NotI restriction endonucleases.
  • the digestion system was: 1 μg plasmid hU6-sa-tracr-BsaI, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BsaI and 1 μL NotI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 3 hours.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 2998 bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The RNA scaffold sequence (its DNA sequence is SEQ ID NO: 22) was designed according to the secondary structure, gene-synthesized, and constructed on the linearized hU6-sa-tracr-BsaI backbone to obtain the plasmid hU6-OQB30769_tracr-Bsa1.
  • the pAAV2_Cas12_ITR plasmid expressing the Cas12 protein in (1) and the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids expressing the corresponding sgRNA of each protein in (2) were linearized by PCR.
  • primer sequences are:
  • the primer sequences are:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, the target DNA fragment was purified with the gel recovery kit according to the steps provided by the manufacturer, and the DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific). The purified DNA was kept at -20°C for long-term storage.
  • the linearized pAAV2_Cas12_ITR fragment and the linearized Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 fragments were assembled by homologous recombination in the ratio required by the instructions.
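The required fragment ratio is not stated in the text. As an illustration of how a molar ratio translates into DNA masses, the sketch below uses the common approximation of about 650 g/mol per base pair of double-stranded DNA; the function names, the fragment sizes and the 2:1 ratio in the example are all hypothetical.

```python
BP_MASS = 650.0  # approximate average mass of one base pair of dsDNA, in g/mol


def ng_to_pmol(ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) into picomoles."""
    return ng * 1000.0 / (length_bp * BP_MASS)


def insert_ng_needed(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio):
    """Mass of insert (ng) giving the requested insert:vector molar ratio."""
    insert_pmol = ng_to_pmol(vector_ng, vector_bp) * insert_to_vector_ratio
    return insert_pmol * insert_bp * BP_MASS / 1000.0


# Hypothetical example: 100 ng of a 6000 bp backbone, a 1000 bp insert, 2:1 ratio.
needed = insert_ng_needed(100.0, 6000, 1000, 2.0)
```

The concentrations measured on the spectrophotometer would then be used to turn these masses into pipetting volumes.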
  • the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were incubated on ice for 30 min, heat-shocked at 42°C for 1 min, incubated on ice for 2 min, supplemented with 900 μL LB medium, and incubated at 37°C for 1 hour for activation and recovery.
  • the recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator, and the resulting Escherichia coli DH5α monoclonal colonies were verified by Sanger sequencing.
  • Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested with BbsI restriction endonuclease.
  • the digestion system was: 1 μg plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA fragments are the linearized plasmids pAAV2_Cas12-hU6-sgRNA_ITR comprising the coding gene of each Cas12 protein above, with sizes of 7135 bp (Cas12J-8 protein), 7866 bp (Mb4Cas12a protein), 7875 bp (MlCas12a protein), 7998 bp (MoCas12a protein), 7875 bp (BgCas12a) and 8606 bp (ChCas12b), respectively.
  • the DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the DNA was kept at -20°C for long-term storage.
  • Each gRNA was designed; the sequences are shown in Table 2. Cohesive end sequences corresponding to the two ends of the linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR were added to the sense strand and the antisense strand of each designed gRNA sequence, and two single-stranded oligonucleotide DNAs were synthesized. The specific sequences of the single-stranded oligonucleotide DNAs are also shown in the table below.
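The oligo design step can be sketched as follows: the sense oligo carries the spacer, the antisense oligo carries its reverse complement, and each receives an overhang matching one cohesive end of the linearized plasmid. The actual overhangs are those in Table 2 and are not reproduced here; the overhang strings, spacer and function names in the example are placeholders.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(spacer, fwd_overhang, rev_overhang):
    """Return (oligo_F, oligo_R), both written 5'->3'.

    fwd_overhang and rev_overhang stand for the cohesive-end sequences of the
    linearized plasmid; the caller must supply the real ones.
    """
    spacer = spacer.upper()
    oligo_f = fwd_overhang.upper() + spacer
    oligo_r = rev_overhang.upper() + reverse_complement(spacer)
    return oligo_f, oligo_r


# Placeholder spacer and overhangs, for illustration only.
oligo_f, oligo_r = design_oligos("AAGC", "CACC", "AAAC")
```

After annealing, the spacer portions of the two oligos pair with each other and the overhangs remain single-stranded for ligation.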
  • the two single-stranded oligonucleotide DNAs were annealed to obtain double-stranded DNA.
  • the annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is: 95°C for 5 min, 85°C for 1 min, 75°C for 1 min, 65°C for 1 min, 55°C for 1 min, 45°C for 1 min, 35°C for 1 min, 25°C for 1 min, then storage at 4°C, with a cooling rate of 0.3°C/s.
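Two quantities follow directly from the annealing recipe: the final concentration of each oligo and the approximate run time of the program. The sketch below computes both, assuming the 0.3°C/s rate applies to every ramp between consecutive hold steps and leaving out the final cool-down to 4°C; the names `STEPS`, `final_concentration_um` and `program_seconds` are illustrative.

```python
# (temperature in deg C, hold time in seconds), as listed in the annealing program.
STEPS = [(95, 300), (85, 60), (75, 60), (65, 60), (55, 60), (45, 60), (35, 60), (25, 60)]
RAMP_C_PER_S = 0.3


def final_concentration_um(stock_um=100.0, volume_ul=1.0, total_ul=30.0):
    """Final concentration of one oligo after dilution into the reaction."""
    return stock_um * volume_ul / total_ul


def program_seconds(steps=STEPS, ramp=RAMP_C_PER_S):
    """Total hold time plus ramp time between consecutive steps, in seconds."""
    hold = sum(seconds for _, seconds in steps)
    ramp_degrees = sum(abs(a[0] - b[0]) for a, b in zip(steps, steps[1:]))
    return hold + ramp_degrees / ramp


oligo_um = final_concentration_um()   # each oligo in the 30 uL reaction
minutes = program_seconds() / 60.0    # run time up to the end of the 25 deg C hold
```

With the stated volumes (1 + 1 + 28 μL) each oligo ends up at roughly 3.3 μM, and the program takes about 16 minutes before the 4°C hold.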
  • the resulting product was ligated into the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid obtained in step (2) with DNA ligase (purchased from NEB Company).
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were incubated on ice for 30 min, heat-shocked at 42°C for 1 min, incubated on ice for 2 min, and supplemented with 900 μL LB medium.
  • the recovered E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator, and the resulting E. coli DH5α monoclonal colonies were verified by Sanger sequencing.
  • HEK293T cells containing the target sequence were plated in a 6-well plate at a cell density of about 30%.
  • the liposome transfection reagent 2000 (purchased from Invitrogen) or polyethyleneimine (hereinafter referred to as PEI) (purchased from Polysciences) was mixed by flicking; 5 μL of the transfection reagent 2000 or PEI was pipetted into 100 μL of Opti-MEM medium (purchased from Gibco), mixed gently, and allowed to stand at room temperature for 5 minutes.
  • HEK293T cells were collected three days after editing, and genomic DNA was extracted with a DNA kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP304) according to the instructions provided with the kit.
  • PCR primers are as follows:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • PCR primers are as follows:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the next-generation sequencing library was subjected to paired-end sequencing on the high-throughput sequencer HiSeq X Ten (Illumina).
  • For each Cas12 protein listed in Table 1 above, the amino acid sequence information was downloaded; the amino acid sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 1 to SEQ ID NO: 6, respectively.
  • Codon optimization was carried out on the coding nucleic acid sequences of the Cas12 proteins obtained above to obtain gene sequences of the Cas12 proteins that are highly expressed in human cells.
  • the gene sequences of the Cas12J-8, Mb4Cas12a, MlCas12a, MoCas12a, BgCas12a and ChCas12b proteins are shown in SEQ ID NO: 8 to SEQ ID NO: 13, respectively.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid (Addgene platform, commercially available, catalog #74707) was digested with BbsI and XhoI restriction endonucleases.
  • the digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296 bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The repeat sequence on the Cas12J-8 protein genome (its DNA sequence is SEQ ID NO: 19) was gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid Cas12J-8-PSK-u6-crRNA.
  • the pBluescriptSKII+U6-sgRNA(F+E)empty plasmid was digested with BbsI and XhoI restriction endonucleases.
  • the digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 1 hour.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 3296 bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The truncated repeat sequences (whose DNA sequences are SEQ ID NO: 20 and SEQ ID NO: 21, respectively) were gene-synthesized and constructed on the linearized pBluescriptSKII+U6-sgRNA(F+E)empty backbone to obtain the plasmid psk-BbsI-Cas12a-crRNA1.
  • the pX330_sgACTA2 plasmid (Addgene platform, catalog #63712) was digested with BsaI and NotI restriction endonucleases.
  • the digestion system was: 1 μg plasmid hU6-sa-tracr-BsaI, 5 μL 10× CutSmart buffer (purchased from NEB Company), 1 μL BsaI and 1 μL NotI restriction endonuclease (purchased from NEB Company), made up to 50 μL with water.
  • the digestion reaction was incubated at 37°C for 3 hours.
  • the digested products were electrophoresed on a 1% agarose gel at 120 V for 30 min.
  • the 2998 bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • RNA Scaffold sequence (its DNA sequence is SEQ ID NO: 22) according to the secondary structure, carry out gene synthesis on this sequence, and construct it in the linearized hU6- On the sa-tracr-BsaI backbone, the plasmid hU6-OQB30769_tracr-Bsa1 was obtained.
  • The pAAV2_Cas12_ITR plasmid expressing Cas12 protein in (1) and the Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 plasmids expressing the corresponding sgRNA of each protein in (2) were linearized by PCR.
  • primer sequences are:
  • the primer sequences are:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • The PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, and the target DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragments were stored at -20°C for long-term storage.
  • the linearized pAAV2_Cas12_ITR fragment and the linearized Cas12J-8-PSK-u6-crRNA, psk-BbsI-Cas12a-crRNA1 and hU6-OQB30769_tracr-Bsa1 fragments were homologously recombined according to the ratio required by the instructions.
  • the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested and linearized with BbsI restriction endonuclease.
  • The enzyme digestion system was: 1 μg plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL BbsI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • The enzyme cleavage system was allowed to react at 37°C for 1 hour.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The recovered DNA fragments are the linearized pAAV2_Cas12_ITR plasmids comprising the coding gene of each Cas protein above; their sizes are 7135 bp (Cas12J-8), 7866 bp (Mb4Cas12a), 7875 bp (MlCas12a), 7998 bp (MoCas12a), 7875 bp (BgCas12a) and 8606 bp (ChCas12b), respectively.
  • The DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • The annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
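  • The arithmetic behind the annealing setup above can be sketched as follows: the final concentration of each oligo in the 30 μL mix, and the total hold and ramp time of the step-down program. Applying the 0.3°C/s cooling rate between every pair of consecutive holds is an assumption; the source only lists the rate. The final 4°C storage step is left out of the timing.

```python
def final_conc_um(stock_um, oligo_ul, total_ul):
    """Final concentration (uM) after diluting oligo_ul of stock into total_ul."""
    return stock_um * oligo_ul / total_ul


total_ul = 1 + 1 + 28  # oligo-F + oligo-R + water, as listed in the text
conc = final_conc_um(100, 1, total_ul)

# (temperature in degC, hold time in minutes) for each listed step
steps = [(95, 5)] + [(t, 1) for t in range(85, 24, -10)]
hold_min = sum(minutes for _, minutes in steps)

# Assumed: the 0.3 degC/s cooling rate applies between consecutive holds
ramp_s = sum((a[0] - b[0]) / 0.3 for a, b in zip(steps, steps[1:]))

print(f"each oligo: {conc:.2f} uM in {total_ul} uL")
print(f"holds: {hold_min} min, ramps: {ramp_s:.0f} s")
```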
  • the obtained products were respectively ligated to the obtained linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid by DNA ligase (purchased from NEB Company).
  • E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C to activate and recover the cells.
  • The recovered E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The GFP reporter system HEK293T cell line containing the target sequence was obtained by inserting a PAM sequence and a specific target sequence between the initiation codon ATG and the GFP coding sequence, causing a GFP frameshift mutation; the construct was then integrated into HEK293T cells by lentiviral infection.
  • When the gene editing system cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence.
  • the editing ability and specificity of the gene editing system can be evaluated by counting the ratio of GFP positive cells by flow cytometry.
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line containing the target sequence was plated on a 6-well plate, and the cell density was controlled at 30%.
  • The GFP reporter system HEK293T cell line containing the target sequence contains the nucleotide sequence CMV-ATG-PAM-target site-GFP, wherein the PAM sequences are shown in Figures 7 to 13 and the sequence of the target site is GGATATGTTGAAGAACACCATGAC.
  • The diluted plasmid and the diluted transfection reagent were combined and gently mixed by pipetting; the resulting mixture was allowed to stand at room temperature for 20 minutes, then added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for continued culture.
  • Flow cytometry was used to analyze the editing efficiency and off-target rate of the CRISPR gene editing system of the present invention on the target sequence.
  • The HEK293T cell line cultured in a CO2 incubator for 3 days was collected, its specificity was assessed by flow cytometry (BD Biosciences FACSCalibur), and the proportion of GFP-positive cells was analyzed and plotted with FlowJo analysis software.
  • The Y-axis of the lower histograms in Figures 7 to 13 represents the percentage (%) of GFP-positive cells, and the X-axis represents the oligonucleotide single-stranded DNA sequences corresponding to the on-target gRNA and the mismatch gRNAs.
  • The SlugABEmax plasmid (Addgene, catalog #163798) was used as the template for the PCR reaction, with the following primer sequences:
  • Primer 1 TCTGGTGGTTTCTCCCAAGAAGA
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • The PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, and the 4152 bp DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was kept for use or stored at -20°C for long-term storage.
  • homologous recombination was performed on the linearized SlugABEmax backbone fragment and the humanized Cas12J-8 fragment (SEQ ID NO: 8) synthesized by the company according to the ratio required by the instructions.
  • the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • The PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, and the 6305 bp DNA fragment was purified using a gel recovery kit according to the steps provided by the manufacturer. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was then treated with T4 PNK and T4 DNA ligase, respectively; the reaction system was as follows:
  • the reaction conditions are as follows:
  • T4 DNA ligase N4 DNA ligase
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The pAAV2_envTadA-dCas12J-8_ITR plasmid was digested with KpnI and NotI restriction enzymes (NEB), and the reaction system was: 2 μg plasmid pAAV2_envTadA-dCas12J-8_ITR, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL KpnI restriction endonuclease (purchased from NEB), 1 μL NotI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The DNA concentration of the recovered linearized fragment pAAV2_envTadA-dCas12J-8_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was stored at -20°C for long-term storage.
  • the primer sequence is:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • The PCR product was electrophoresed on a 1.5% agarose gel at 120 V for 30 min, and the 394 bp Cas12J-8 crRNA DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was kept for use or stored at -20°C for long-term storage.
  • the linearized pAAV2_envTadA-dCas12J-8_ITR fragment and the Cas12J-8crRNA fragment were subjected to homologous recombination according to the ratio required by the instructions, and the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The pAAV2_envTadA-dCas12J-8-crRNA_ITR plasmid was digested with BbsI restriction endonuclease, and the digestion system was: 2 μg plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL BbsI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The DNA concentration of the recovered linearized plasmid pAAV2_envTadA-dCas12J-8-crRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • the endogenous site target sequence that meets the PAM requirements of the Cas12J-8 protein is randomly selected in the human genome, and the corresponding oligonucleotide single-stranded DNA is shown in the table below.
  • The single-stranded oligonucleotide DNAs were annealed to obtain double-stranded DNA.
  • The annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the resulting product was ligated to the linearized pAAV2_envTadA-dCas12J-8-crRNA_ITR vector by DNA ligase (purchased from NEB Company).
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, and add 900 μL LB medium.
  • The recovered E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • the resulting pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR plasmids were transfected into wild-type HEK293T cell lines by liposomes.
  • transfection process comprises the following steps:
  • HEK293T cell lines were plated in 6-well plates according to transfection requirements, and the cell density was controlled at 30%.
  • Take 2 μg of the plasmid to be transfected, pAAV2_envTadA-dCas12J-8-crRNA-gRNA_ITR, add it to 100 μL of Opti-MEM medium (purchased from Gibco), and mix gently by pipetting.
  • HEK293T cells seven days after editing were collected, and genomic DNA was extracted with a DNA kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP304) according to the instructions provided by the DNA kit.
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the second round of PCR for PCR library construction was carried out, and 2 ⁇ Q5 Mastermix was used for PCR reaction, and the PCR primers were the same as the F2 primers and R2 primers given in Example 1 above.
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • The second-round PCR product was purified with a gel recovery kit according to the steps provided by the manufacturer, yielding the next-generation sequencing library.
  • The next-generation sequencing library was subjected to paired-end sequencing on a HiSeq X Ten high-throughput sequencer (Illumina).
  • The next-generation sequencing results were analyzed to obtain the editing ratio of each adenine (A) that meets the editing requirements within each endogenous target site; the results are shown in FIG. 14. The figure shows that the Cas12J-8 ABE base editor successfully edited these endogenous target sites. Because the Cas12J-8 ABE base editor protein contains only 938 amino acids, it can be easily packaged by AAV, which makes it possible to apply the CRISPR single-base editor system in gene therapy of organisms.
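  • The source does not say how the editing ratio is calculated from the sequencing reads. One plausible reading is a per-position A-to-G fraction, sketched below under the assumption that reads are already aligned and trimmed to the reference window. The function name `a_to_g_ratio` and the toy reads are hypothetical and are not data from the patent.

```python
from collections import Counter


def a_to_g_ratio(reference, reads):
    """Per-position fraction of reads showing G where the reference has A.

    Assumes every read is already aligned to, and the same length as,
    the reference window; reads of a different length are skipped.
    """
    usable = [r for r in reads if len(r) == len(reference)]
    ratios = {}
    for i, ref_base in enumerate(reference):
        if ref_base != "A":
            continue
        counts = Counter(r[i] for r in usable)
        total = sum(counts.values())
        ratios[i] = counts["G"] / total if total else 0.0
    return ratios


reference = "GGATATGT"  # toy window, for illustration only
reads = ["GGATATGT", "GGGTATGT", "GGGTGTGT", "GGATATGT"]
print(a_to_g_ratio(reference, reads))
```

  • A real pipeline would also need read alignment, quality filtering and handling of indels, none of which is shown here.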
  • the coding nucleic acid sequence of each Cas12 protein is codon-optimized, and the gene sequence of the Cas12 protein highly expressed in human cells is obtained.
  • the gene sequences of Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8 and Cas12J-9 proteins are shown in SEQ ID NO: 27-29, 8 and 30 respectively.
  • the highly expressed gene sequences of Cas12 proteins obtained above as shown in SEQ ID NO: 27-29, 8 and 30 were gene-synthesized, and respectively constructed on the slugCas9 backbone plasmid (Addgene platform, catalog#163793) to obtain each plasmid pAAV2_Cas12_ITR.
  • The pBluescriptSKII+U6-sgRNA(F+E) empty plasmid was digested with BbsI and XhoI restriction endonucleases.
  • The enzyme digestion system was: 1 μg plasmid psk-BbsI-Sasg, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL BbsI and 1 μL XhoI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • The enzyme cleavage system was allowed to react at 37°C for 1 hour.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • the 3296bp DNA fragment was excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the pAAV2_Cas12_ITR plasmid expressing Cas12 protein in (1) and the Cas12J-PSK-u6-crRNA plasmid expressing sgRNA corresponding to each protein in (2) were linearized by PCR method.
  • primer sequences are:
  • the primer sequences are:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • The PCR product was electrophoresed on a 1% agarose gel at 120 V for 30 min, and the target DNA fragment was purified with a gel recovery kit according to the steps provided by the manufacturer. The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragments were stored at -20°C for long-term storage.
  • the linearized pAAV2_Cas12_ITR fragment and the linearized Cas12J-PSK-u6-crRNA fragment were homologously recombined according to the ratio required by the instructions, and the homologous recombination enzyme used was High-fidelity DNA Assembly Master Mix (NEB), the reaction system is as follows:
  • the reaction conditions are as follows:
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The Escherichia coli DH5α clones confirmed by sequencing to be correctly assembled were grown in shaking culture, and the plasmids were extracted to obtain each plasmid pAAV2_Cas12-hU6-sgRNA_ITR for future use.
  • Each plasmid pAAV2_Cas12-hU6-sgRNA_ITR prepared in (3) was digested and linearized with BbsI restriction endonuclease.
  • The enzyme digestion system was: 1 μg plasmid pAAV2_Cas12-hU6-sgRNA_ITR, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL BbsI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • The enzyme cleavage system was allowed to react at 37°C for 1 hour.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • the DNA fragment is the linearized plasmid pAAV2_Cas12_ITR comprising the coding genes of the above Cas proteins.
  • The DNA concentration of the recovered linearized plasmid pAAV2_Cas12-hU6-sgRNA_ITR was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the plasmid was stored at -20°C for long-term storage.
  • Oligo-F GGATATGTTGAAGAACACCATGAC
  • Oligo-R GTCATGGTGTTCTTCAACATATCC
  • The cohesive ends of Oligo-F for Cas12J-4, Cas12J-5, Cas12J-7, Cas12J-8, and Cas12J-9 are CGAC, GGAC, AGAC, AGAC, and AGAC, respectively, and the cohesive end of Oligo-R for all Cas12 proteins is AAAA.
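  • Oligo-R as listed is the reverse complement of Oligo-F, which is what lets the two strands anneal into a duplex. The sketch below checks this and assembles the full oligos with their cohesive ends. Placing each cohesive end at the 5' end of its oligo is an assumption (the usual convention for annealed-oligo cloning); the source does not state the orientation, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


OLIGO_F = "GGATATGTTGAAGAACACCATGAC"
OLIGO_R = "GTCATGGTGTTCTTCAACATATCC"

# Cohesive ends as listed in the text
F_ENDS = {
    "Cas12J-4": "CGAC",
    "Cas12J-5": "GGAC",
    "Cas12J-7": "AGAC",
    "Cas12J-8": "AGAC",
    "Cas12J-9": "AGAC",
}
R_END = "AAAA"


def oligo_pair(protein):
    """Full (forward, reverse) oligos for one protein.

    Assumption: each cohesive end is prepended at the 5' end.
    """
    return F_ENDS[protein] + OLIGO_F, R_END + OLIGO_R


print(reverse_complement(OLIGO_F) == OLIGO_R)
print(oligo_pair("Cas12J-4"))
```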
  • The single-stranded oligonucleotide DNAs were annealed to obtain double-stranded DNA.
  • The annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the resulting product was ligated to the linearized pAAV2_Cas12-hU6-sgRNA_ITR plasmid obtained in step (2) by DNA ligase (purchased from NEB Company).
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, and add 900 μL LB medium.
  • The recovered E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The GFP reporter system HEK293T cell line library containing the target sequence was obtained as follows: a 5 bp random sequence (as the PAM sequence) and a 24 bp protospacer (as the target sequence) were inserted between the initiation codon ATG and the GFP coding sequence, resulting in a GFP frameshift mutation and no GFP expression.
  • The GFP gene containing the insert was driven by the CMV promoter and constructed on a lentiviral expression vector. This sequence was randomly inserted into the genome of HEK293T cells by lentivirus, producing a stable GFP reporter cell line library.
  • When the gene editing system cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence; the editing ability and specificity of the gene editing system can then be evaluated by counting the proportion of GFP-positive cells by flow cytometry.
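  • The frameshift logic of the reporter can be checked with simple arithmetic. The sketch below assumes the 5 bp PAM and the 24 bp protospacer are the only bases between ATG and the GFP coding sequence; the source does not give the full construct, so any linker bases would change the numbers. The function name is hypothetical.

```python
PAM_LEN = 5
PROTOSPACER_LEN = 24
INSERT_LEN = PAM_LEN + PROTOSPACER_LEN  # 29 bp between ATG and GFP (assumed)


def gfp_in_frame(indel=0):
    """True if GFP is in frame with ATG after a net indel.

    indel is in bp: positive for insertions, negative for deletions.
    Only the reading frame is checked; stop codons are ignored.
    """
    return (INSERT_LEN + indel) % 3 == 0


# Net indel sizes between -10 and +10 bp that would restore the frame
restoring = [d for d in range(-10, 11) if d != 0 and gfp_in_frame(d)]

print(gfp_in_frame())
print(restoring)
```

  • Under this assumption the unedited insert leaves GFP out of frame, and only a subset of repair outcomes restores it, so the GFP-positive fraction is a lower bound on cleavage events.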
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line library containing the target sequence was plated on a 6-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line library containing the target sequence comprises the nucleotide sequence of CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is a 5bp random sequence, and the sequence of the target site (target site) is GGATATGTTGAAGAACACCATGAC (FIG. 15).
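  • Since the PAM position is a 5 bp random sequence, the library can in principle contain 4^5 = 1024 distinct PAMs. A minimal sketch enumerating them (the function name is hypothetical):

```python
from itertools import product


def all_pams(length=5, alphabet="ACGT"):
    """Every possible PAM of the given length over the DNA alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=length)]


pams = all_pams()
print(len(pams))
```

  • Whether every one of these PAMs is actually represented in the cell library depends on library coverage, which the source does not report.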
  • Opti-MEM medium purchased from Gibco
  • The diluted plasmid and the diluted transfection reagent were combined and gently mixed by pipetting; the resulting mixture was allowed to stand at room temperature for 20 minutes, then added to the culture medium of the GFP reporter system HEK293T cell line library containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for continued culture.
  • the pAAV2_Cas12_ITR (SEQ ID NO: 13) plasmid was used as a template for a circular PCR reaction, and the primer sequences are shown in the table below:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120V for 30 min, and the gel recovery kit was used to purify the target DNA fragment according to the steps provided by the manufacturer.
  • The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific).
  • the reaction system is as follows:
  • the reaction conditions are as follows:
  • T4 DNA ligase N4 DNA ligase
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The hU6-OQB30769_tracr-Bsa1 plasmid was digested with BsaI restriction endonuclease (NEB), and the reaction system was: 2 μg plasmid hU6-OQB30769_tracr-Bsa1, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL BsaI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The DNA concentration of the recovered linearized fragment hU6-OQB30769_tracr-Bsa1 was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was stored at -20°C for long-term use.
  • The annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the obtained product was ligated to the obtained linearized hU6-OQB30769_tracr-Bsa1 plasmid by DNA ligase (purchased from NEB Company).
  • E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C to activate and recover the cells.
  • The recovered E. coli DH5α cells were spread on LB solid plates containing the corresponding antibiotic and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The resulting plasmids expressing the ChCas12b point mutant proteins (Q23T, P182R, E507D, E1090K, Q23T-P182R, Q23T-E507D, Q23T-P182R-E507D and Q23T-P182R-E507D-E1090K) were each co-transfected with the hU6-OQB30769_tracr-Bsa1-on-target sgRNA into the GFP reporter system HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC) by liposome.
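  • The mutant names follow the standard substitution notation: Q23T means the glutamine (Q) at position 23 is replaced by threonine (T), and hyphens join several substitutions in one variant. A small parser for this notation is sketched below; the function and pattern names are hypothetical and are not part of the described method.

```python
import re

# One substitution: wild-type residue, 1-based position, mutant residue
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(name):
    """Split e.g. 'Q23T-P182R' into [('Q', 23, 'T'), ('P', 182, 'R')]."""
    parsed = []
    for token in name.split("-"):
        match = MUTATION.match(token)
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        parsed.append((wild_type, int(position), mutant))
    return parsed


for variant in ["Q23T", "Q23T-P182R-E507D-E1090K"]:
    print(variant, parse_variant(variant))
```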
  • The GFP reporter system HEK293T cell line containing the target sequence was obtained by inserting a PAM sequence and a specific target sequence between the initiation codon ATG and the GFP coding sequence, causing a GFP frameshift mutation; the construct was then integrated into HEK293T cells by lentiviral infection.
  • When the gene editing system cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence.
  • the editing ability and specificity of the gene editing system can be evaluated by counting the ratio of GFP positive cells by flow cytometry.
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line containing the target sequence was plated in a 48-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line containing the target sequence contains the nucleotide sequence of CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is CGTTG, and the sequence of the target site (target site) is GGATATGTTGAAGAACACCATGAC.
  • PEI purchased from polysciences company
  • The diluted plasmid and the diluted transfection reagent were combined and gently mixed by pipetting; the resulting mixture was allowed to stand at room temperature for 15 minutes, then added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for continued culture.
  • the HEK293T cell line cultured in a CO2 incubator for 5 days was collected, and its specificity was detected by flow cytometry (BD Biosciences FACSCalibur), and the positive ratio of GFP was analyzed and plotted by FlowJo analysis software.
  • the editing efficiency results of the ChCas12b point mutant in the GFP reporter system HEK293T cell line containing the target sequence are shown in FIG. 17 .
  • When the ChCas12b point mutant cuts the target sequence, the cells restore the GFP reading frame through their own repair system and produce green fluorescence.
  • the Y-axis in Figure 17 represents the percentage (%) of GFP-positive cells, the X-axis represents ChCas12b and its different point mutants, and NC represents the negative control group (no transfection plasmid).
  • the pAAV2_Cas12_ITR (SEQ ID NO: 9 to SEQ ID NO: 12) plasmid was used as a template for a circular PCR reaction, and the primer sequences are shown in the table below:
  • the reaction system is as follows:
  • the PCR running procedure is as follows:
  • the PCR product was electrophoresed on a 1% agarose gel at 120V for 30 min, and the gel recovery kit was used to purify the target DNA fragment according to the steps provided by the manufacturer.
  • The DNA concentration was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific).
  • the reaction system is as follows:
  • the reaction conditions are as follows:
  • T4 DNA ligase N4 DNA ligase
  • Escherichia coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.) were transformed as follows: incubate on ice for 30 min, heat shock at 42°C for 1 min, incubate on ice for 2 min, add 900 μL LB medium, and incubate at 37°C for 1 hour to activate and recover the cells.
  • The recovered Escherichia coli DH5α cells were spread on LB solid plates containing ampicillin and cultured upside down in a 37°C incubator; the resulting single colonies were verified by Sanger sequencing.
  • The Escherichia coli DH5α clones confirmed by sequencing to be correctly assembled were grown in shaking culture, and the plasmids were extracted to obtain the point mutant plasmids, which were kept for use or stored at -20°C.
  • The psk-BbsI-Cas12a-crRNA1 plasmid was digested with BbsI restriction endonuclease (NEB), and the reaction system was: 2 μg plasmid psk-BbsI-Cas12a-crRNA1, 5 μL 10× CutSmart buffer (purchased from NEB), 1 μL BbsI restriction endonuclease (purchased from NEB), made up to 50 μL with water.
  • the enzyme cleavage system was allowed to react at 37°C for 2 hours.
  • digested products were electrophoresed on 1% agarose gel at 120V for 30min.
  • DNA fragments were excised from the agarose gel, recovered with a gel recovery kit (Tiangen Biochemical Technology (Beijing) Co., Ltd., DP209) according to the instructions provided by the manufacturer, and finally eluted with ultrapure water.
  • The DNA concentration of the recovered linearized fragment psk-BbsI-Cas12a-crRNA1 was measured with a NanoDrop™ Lite spectrophotometer (Thermo Scientific), and the fragment was stored at -20°C for long-term use.
  • The annealing reaction system is: 1 μL 100 μM oligo-F, 1 μL 100 μM oligo-R, 28 μL water.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, 4°C storage, cooling rate 0.3°C/s.
  • the resulting product was ligated to the resulting linearized psk-BbsI-Cas12a-crRNA1 plasmid by DNA ligase (purchased from NEB Company).
  • E. coli DH5α competent cells (purchased from Shanghai Weidi Biotechnology Co., Ltd.)
  • heat shock at 42°C for 1 min
  • incubate on ice for 2 min, then add 900 µL LB medium
  • incubate at 37°C
  • the Escherichia coli DH5α competent cells were activated and recovered.
  • the recovered E. coli DH5α cells were spread on LB agar plates containing the corresponding antibiotic and cultured inverted in a 37°C incubator, and the resulting single colonies were verified by Sanger sequencing.
  • the resulting plasmids expressing the Cas12a point-mutant proteins were co-transfected, together with psk-BbsI-Cas12a-crRNA1-on target sgRNA, into the GFP reporter system HEK293T cell line library containing the target sequence (GGATATGTTGAAGAACACCATGAC) using liposomes.
  • the GFP reporter system HEK293T cell line containing the target sequence was obtained by inserting a PAM sequence and a specific target sequence between the initiation codon ATG and the GFP coding sequence, causing a GFP frameshift mutation, and then integrating the construct into HEK293T cells by lentiviral infection.
  • when the gene editing system cuts the target sequence, the cell's own repair system restores the GFP reading frame in a fraction of the cells, which then produce green fluorescence.
  • the editing ability and specificity of the gene editing system can be evaluated by counting the ratio of GFP positive cells by flow cytometry.
  • transfection process comprises the following steps:
  • the GFP reporter system HEK293T cell line containing the target sequence was plated in a 48-well plate, and the cell density was controlled at 30%.
  • the GFP reporter system HEK293T cell line containing the target sequence carries the nucleotide sequence CMV-ATG-PAM-target site-GFP, wherein the PAM sequence is GTTTT and the target site sequence is GGATATGTTGAAGAACACCATGAC.
  • PEI (purchased from Polysciences)
  • the diluted plasmid and the diluted transfection reagent were combined and mixed gently by pipetting; the mixture was allowed to stand at room temperature for 15 minutes, then added to the culture medium of the GFP reporter system HEK293T cell line containing the target sequence, and the cells were returned to a 37°C, 5% CO2 incubator for continued culture.
  • the HEK293T cells were collected after 5 days of culture in the CO2 incubator and their specificity was assessed by flow cytometry (BD Biosciences FACSCalibur); the proportion of GFP-positive cells was analyzed and plotted with FlowJo analysis software.
  • the editing efficiency results of the Cas12a point mutants in the GFP reporter system HEK293T cell line containing the target sequence are shown in FIG. 18.
  • when the Cas12a point mutant cuts the target sequence, the cellular repair system restores the GFP reading frame in some cells, resulting in green fluorescence.
  • the Y-axis in FIG. 18 represents the percentage (%) of GFP-positive cells, the X-axis represents Cas12a and its different point mutants, and NC represents the negative control group (no plasmid transfected).
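
The frameshift logic of the GFP reporter described above can be illustrated with a short Python sketch. It is only a minimal model under stated assumptions: it assumes that the insert between ATG and the GFP coding sequence consists of exactly the PAM (GTTTT) plus the target site, with no additional linker bases, and it ignores premature stop codons. The function names are hypothetical and are not part of the described method.

```python
# Illustrative sketch only. Assumption: the reporter insert is exactly
# PAM + target site, with no extra linker bases between ATG and GFP.
PAM = "GTTTT"
TARGET = "GGATATGTTGAAGAACACCATGAC"


def gfp_in_frame(insert_length):
    """GFP stays in frame with ATG only if the insert length is a multiple of 3."""
    return insert_length % 3 == 0


def frame_restoring_indels(insert_length, max_size=6):
    """Net indel sizes (negative = deletion) that bring the insert back in frame."""
    return [
        delta
        for delta in range(-max_size, max_size + 1)
        if delta != 0 and gfp_in_frame(insert_length + delta)
    ]


insert_length = len(PAM) + len(TARGET)
restoring = frame_restoring_indels(insert_length)
print(insert_length, gfp_in_frame(insert_length), restoring)
# prints: 29 False [-5, -2, 1, 4]
```

Under these assumptions the 29-nt insert shifts GFP out of frame, and only repair outcomes whose net length change is +1, +4, -2, -5 (and so on) would restore fluorescence. This is consistent with the observation that only a fraction of edited cells become GFP-positive, so the GFP-positive percentage would underestimate the total cleavage frequency.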

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Urology & Nephrology (AREA)
  • Virology (AREA)
  • Cell Biology (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to the technical field of gene editing, and in particular to a CRISPR/Cas12 gene editing system and an application thereof. The gene editing system is a complex formed by a specific Cas12 protein and a gRNA; it can accurately locate a target DNA sequence and cleave it, damaging the target sequence through a double-strand break. Gene editing here refers to intracellular or in vitro gene editing. More particularly, the invention involves a specific Cas12J-8 protein, a Cas12a protein and a Cas12b protein. The specific Cas12J-8 protein has a relatively small number of amino acids; the Cas12J-8, Cas12a and Cas12b proteins all have high editing efficiency; and the PAM sequences recognized by the three types of protein are simple, so the present invention has broad application prospects in the field of gene editing.
PCT/CN2022/096002 2021-05-31 2022-05-30 Cas12 protein, gene editing system containing Cas12 protein, and application WO2022253185A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110606220.9 2021-05-31
CN202110606220.9A CN113373130B (zh) 2021-05-31 2021-05-31 Cas12 protein, gene editing system containing Cas12 protein, and application

Publications (1)

Publication Number Publication Date
WO2022253185A1 (fr) 2022-12-08

Family

ID=77575235

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096002 WO2022253185A1 (fr) 2021-05-31 2022-05-30 Cas12 protein, gene editing system containing Cas12 protein, and application

Country Status (2)

Country Link
CN (1) CN113373130B (fr)
WO (1) WO2022253185A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116179512A (zh) * 2023-03-16 2023-05-30 华中农业大学 Endonuclease with a broad target recognition range and application thereof

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113373130B (zh) * 2021-05-31 2023-12-22 复旦大学 Cas12 protein, gene editing system containing Cas12 protein, and application
CN114438055B (zh) * 2021-10-26 2022-08-26 山东舜丰生物科技有限公司 Novel CRISPR enzyme and system, and application
CN114441772B (zh) * 2022-01-29 2023-03-21 北京大学 Method and reagent for detecting intracellular target molecules capable of binding RNA
CN116555226A (zh) * 2022-03-03 2023-08-08 吉林省农业科学院 CasF2 protein, CRISPR/Cas gene editing system, and application thereof in plant gene editing
WO2023216037A1 (fr) * 2022-05-07 2023-11-16 上海鲸奇生物科技有限公司 Development of a DNA-targeting gene editing tool
WO2023232109A1 (fr) * 2022-06-01 2023-12-07 中国科学院遗传与发育生物学研究所 Novel CRISPR gene editing system
CN116286742B (zh) * 2022-09-29 2023-11-17 隆平生物技术(海南)有限公司 CasD protein, CRISPR/CasD gene editing system, and application thereof in plant gene editing
WO2024089629A1 (fr) * 2022-10-27 2024-05-02 Geneditbio Limited Cas12 protein, CRISPR-Cas system and uses thereof
CN116144631B (zh) * 2023-01-17 2023-09-15 华中农业大学 Thermostable endonuclease and gene editing system mediated thereby
CN116410955B (zh) * 2023-03-10 2023-12-19 华中农业大学 Two novel endonucleases and their application in nucleic acid detection

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018148511A1 (fr) * 2017-02-10 2018-08-16 Zymergen Inc. A modular universal plasmid design strategy for the assembly and editing of multiple DNA constructs for multiple hosts
CN109312316A (zh) * 2016-02-15 2019-02-05 本森希尔生物系统股份有限公司 Compositions and methods for modifying genomes
WO2020086144A2 (fr) * 2018-08-15 2020-04-30 Zymergen Inc. Applications of CRISPRi in high-throughput metabolic engineering
CN112301016A (zh) * 2020-07-23 2021-02-02 广州美格生物科技有限公司 Application of novel mlCas12a protein in nucleic acid detection
CN113373130A (zh) * 2021-05-31 2021-09-10 复旦大学 Cas12 protein, gene editing system containing Cas12 protein, and application

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019201331A1 (fr) * 2018-04-20 2019-10-24 中国农业大学 CRISPR/Cas effector protein and system
US11124783B2 (en) * 2018-09-13 2021-09-21 The Board Of Regents Of The University Of Oklahoma Variant CAS9 proteins with improved DNA cleavage selectivity
EP4023766B1 (fr) * 2018-09-20 2024-04-03 Institute Of Zoology, Chinese Academy Of Sciences Method for detecting nucleic acid
WO2020146297A1 (fr) * 2019-01-08 2020-07-16 Integrated Dna Technologies, Inc. Cas12a mutant genes and polypeptides encoded by same
CA3130789A1 (fr) * 2019-03-07 2020-09-10 The Regents Of The University Of California CRISPR-Cas effector polypeptides and methods of use thereof
CN110747187B (zh) * 2019-11-13 2022-10-21 电子科技大学 Cas12a protein recognizing dual TTTV and TTV PAM sites, plant genome targeted editing vector, and method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109312316A (zh) * 2016-02-15 2019-02-05 本森希尔生物系统股份有限公司 Compositions and methods for modifying genomes
WO2018148511A1 (fr) * 2017-02-10 2018-08-16 Zymergen Inc. A modular universal plasmid design strategy for the assembly and editing of multiple DNA constructs for multiple hosts
WO2020086144A2 (fr) * 2018-08-15 2020-04-30 Zymergen Inc. Applications of CRISPRi in high-throughput metabolic engineering
CN112301016A (zh) * 2020-07-23 2021-02-02 广州美格生物科技有限公司 Application of novel mlCas12a protein in nucleic acid detection
CN113373130A (zh) * 2021-05-31 2021-09-10 复旦大学 Cas12 protein, gene editing system containing Cas12 protein, and application

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN 13 October 2019 (2019-10-13), ANONYMOUS : "type V CRISPR-associated protein Cas12a/Cpf1 [Moraxella bovis]", XP093010372, retrieved from NCBI Database accession no. WP_078273923.1 *
DATABASE PROTEIN 13 October 2019 (2019-10-13), ANONYMOUS : "type V CRISPR-associated protein Cas12a/Cpf1 [Moraxella ovis] ", XP093010373, retrieved from NCBI Database accession no. WP_112744621.1 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
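CN116179512A (zh) * 2023-03-16 2023-05-30 华中农业大学 Endonuclease with a broad target recognition range and application thereof
CN116179512B (zh) * 2023-03-16 2023-09-15 华中农业大学 Endonuclease with a broad target recognition range and application thereof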

Also Published As

Publication number Publication date
CN113373130B (zh) 2023-12-22
CN113373130A (zh) 2021-09-10

Similar Documents

Publication Publication Date Title
WO2022253185A1 (fr) Cas12 protein, gene editing system containing Cas12 protein, and application
US10781432B1 (en) Engineered cascade components and cascade complexes
Xu et al. Engineered miniature CRISPR-Cas system for mammalian genome regulation and editing
US11840694B2 (en) Truncated CRISPR-Cas proteins for DNA targeting
DK3350327T3 (en) CONSTRUCTED CRISPR CLASS-2-NUCLEIC ACID TARGETING-NUCLEIC ACID
CN113286880A (zh) Methods and compositions for modulating a genome
WO2017161068A1 (fr) Mutant Cas proteins
CN113881652B (zh) Novel Cas enzyme and system, and application
CN112105728A (zh) CRISPR/Cas effector protein and system
WO2019120193A1 (fr) Split single-base gene editing systems and application thereof
CN113015798B (zh) CRISPR-Cas12a enzyme and system
US20230340481A1 (en) Systems and methods for transposing cargo nucleotide sequences
WO2020087631A1 (fr) System and method for genome editing based on C2c1 nucleases
CN113583999A (zh) Cas9 protein, gene editing system containing Cas9 protein, and application
KR102151064B1 (ko) Composition for gene editing comprising a guide RNA containing a matched 5' nucleotide, and gene editing method using the same
EP4271805A1 (fr) Novel nucleic acid-guided nucleases
WO2021081384A1 (fr) Synthetic nucleases
CN113652411A (zh) Cas9 protein, gene editing system containing Cas9 protein, and application
CN116751762A (zh) Cas12b protein, single guide RNA, gene editing system comprising them, and related applications
WO2023165613A1 (fr) Use of a 5' to 3' exonuclease in a gene editing system, gene editing system, and gene editing method
WO2022188816A1 (fr) Improved CG base editing system
CN117025570A (zh) Cas12a mutant protein, gene editing system containing Cas12a mutant protein, and application
CN116144629A (zh) Cas9 protein, gene editing system containing Cas9 protein, and application
US20230242922A1 (en) Gene editing tools
CN116804190A (zh) SlugCas9 mutant protein and related applications

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22815230

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 22815230

Country of ref document: EP

Kind code of ref document: A1