WO2023059115A1 - Target system for genome editing and uses thereof - Google Patents

Publication number
WO2023059115A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
seq
cas12f1
gene editing
rna
Prior art date
Application number
PCT/KR2022/015067
Other languages
English (en)
Korean (ko)
Inventor
김용삼
김도연
Original Assignee
주식회사 진코어
Priority date
Filing date
Publication date
Application filed by 주식회사 진코어
Publication of WO2023059115A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907 Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Definitions

  • The present invention relates to a Tiny nuclease-augment RNA-based Genome Editing Technology (TaRGET) system developed using a small Cas12f1 variant or a homologue thereof and an engineered guide RNA.
  • Specifically, it relates to a Cas12f1 variant or homologue thereof that is smaller and has improved cleavage efficiency compared to conventional Cas endonucleases, and to a gene editing system that uses the variant together with a guide RNA engineered for it.
  • Genome editing technology, which freely corrects the genetic information of living organisms as needed, makes it possible to introduce desired genetic information into the genomes of bacteria, yeast, plants, and animal cells, including human cells.
  • Gene editing technology is regarded as a key technology that will create new high-tech bio industries, such as cell engineering, model animal production, transgenic plant production, and gene therapy for cancer, genetic diseases, and infectious diseases. Accordingly, gene editing technology has developed rapidly in recent years, and various studies are in progress.
  • Gene editing is performed through a gene editing system that can accurately find a target gene or target nucleic acid sequence and cut or modify the site, and the CRISPR/Cas system is representative.
  • The Cas endonuclease forms a complex with a CRISPR RNA (crRNA) that recognizes a target gene sequence, and in some cases a transactivating CRISPR RNA (tracrRNA) that binds to the Cas endonuclease can be added.
  • A single guide RNA (sgRNA) form, in which the crRNA and tracrRNA are connected by a linker, is mainly used, and this guide RNA directs the Cas endonuclease of the CRISPR gene editing system to the target to be cut or modified.
  • The Cas endonuclease located at the target gene site recognizes a protospacer adjacent motif (PAM) adjacent to the target gene sequence, and then cuts or modifies a base pair (bp) inside or outside the target nucleic acid sequence.
  • Target nucleic acids cut by the genome editing technology system are repaired through DNA repair mechanisms of homology directed repair (HDR) or non-homologous end joining (NHEJ) processes.
  • In NHEJ, random bases are inserted or deleted between the cut DNA ends (insertion and deletion, indel); as a result, a frameshift mutation or premature termination mutation arises in the coding region of the gene, and the target gene is knocked out.
  • The homology directed repair (HDR) mechanism, in contrast, requires donor DNA to restore the cut DNA.
  • Gene editing is completed through these repair processes (Jinek, M. et al., 2012).
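The link between an indel and a knockout can be illustrated with a minimal sketch. It assumes the simplest textbook rule, namely that an indel inside a coding region shifts the reading frame whenever its net length change is not a multiple of three; the function name is hypothetical and is not taken from the patent.

```python
def causes_frameshift(inserted_bases: int, deleted_bases: int) -> bool:
    """Return True if the net indel length is not a multiple of 3.

    Assumption: the indel lies inside a coding region, so any net length
    change that is not divisible by 3 shifts the downstream reading frame.
    """
    if inserted_bases < 0 or deleted_bases < 0:
        raise ValueError("base counts must be non-negative")
    net_change = inserted_bases - deleted_bases
    return net_change % 3 != 0


# A 1-bp insertion shifts the frame; a 3-bp deletion does not.
print(causes_frameshift(1, 0))
print(causes_frameshift(0, 3))
```

An in-frame indel can still disrupt the protein (for example by removing a critical residue), so this check only covers the frameshift route described above.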
  • Class 1 includes type I, type III and type IV Cas nucleolytic proteins, and type II, type V and type VI Cas nucleolytic proteins are classified as Class 2 (Koonin et al., 2017, Makarova et al., 2020).
  • Class 2 CRISPR/Cas system is characterized in that its effector complex contains a large single protein with multiple domains.
  • Cas9 (type II) derived from Streptococcus pyogenes, which has been the most actively studied to date, and CRISPR/Cpf1 (type V), which is actively studied for gene editing, are representative Class 2 nucleolytic proteins (Chylinski et al., 2014; Shmakov et al., 2015).
  • However, the CRISPR/SpCas9 system, which has been the most actively researched and is known to be highly efficient, has the disadvantage that the corresponding gene is very large.
  • the SpCas9 gene alone is over 4.3 kb, and when guide RNA, promoter and poly A sequence, which are various gene expression constructs, are added, it exceeds 5 kb.
  • AAV adeno-associated virus
  • CRISPR systems such as SaCas9 or CjCas9 have been discovered and developed and used as gene editing tools.
  • the biggest problem with these two gene editing tools is that they show relatively inferior gene editing efficiency compared to SpCas9 or Cpf1.
  • In addition, if these CRISPR systems are to be used for base editing, prime editing, or epigenetic regulation, they encounter the same AAV delivery limit as SpCas9.
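The size argument above can be made concrete with a rough sketch. The 4.3 kb figure for the SpCas9 gene comes from the text; the AAV packaging capacity of about 4.7 kb is a commonly cited approximation that is assumed here and is not stated in this passage, and the promoter, guide RNA cassette, and poly A sizes are hypothetical placeholders chosen only so that the total matches the "exceeds 5 kb" statement.

```python
# Assumption: approximate AAV packaging capacity in base pairs (not from the text).
AAV_CAPACITY_BP = 4700


def total_cargo_bp(components: dict) -> int:
    """Sum the lengths (in bp) of all expression-cassette components."""
    return sum(components.values())


def fits_in_aav(components: dict, capacity_bp: int = AAV_CAPACITY_BP) -> bool:
    """Return True if the whole cassette fits within the assumed capacity."""
    return total_cargo_bp(components) <= capacity_bp


# SpCas9 size is from the text; the other sizes are illustrative placeholders.
spcas9_cassette = {
    "SpCas9 gene": 4300,
    "promoter": 500,
    "guide RNA cassette": 350,
    "poly A signal": 200,
}

print(total_cargo_bp(spcas9_cassette), fits_in_aav(spcas9_cassette))
```

Replacing the nuclease entry with a smaller gene shows how much room remains for additional modules such as base editing or prime editing domains.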
  • Non-Patent Document 1 Jinek, M. et al., A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity, Science, Vol. 337, 816-821 (2012)
  • Non-Patent Document 2 Koonin, EV. et al., Mobile genetic elements and evolution of CRISPR-Cas system; All the way there and back, Genome Biol. Evol., Vol. 9, no. 10, 2812-2825 (2017)
  • Non-Patent Document 3 Makarova, KS. et al., Evolutionary classification of the CRISPR-Cas system: a burst of class 2 and derived variants, Nat. Rev. Microbiol., Vol. 18, 67-83 (2020)
  • Non-Patent Document 4 Chylinski, K. et al., Classification and evolution of type II CRISPR-Cas system, Nucleic Acids Research, Vol. 42, no. 10, 6091-6105 (2014)
  • Non-Patent Document 5 Shmakov, S. et al., Discovery and Functional Characterization of diverse class 2 CRISPR-Cas system, Vol. 60, 385-397 (2015)
  • Non-Patent Document 6 Karvelis, T. et al., PAM recognition by miniature CRISPR-Cas12f nucleases triggers programmable double-stranded DNA target cleavage, Nucleic Acids Research, Vol. 48, no. 9, 5016-5023 (2020)
  • Non-Patent Document 7 Harrington, LB. et al., Programmed DNA destruction by miniature CRISPR-Cas14 enzymes, Science, Vol. 362, 839-842 (2018)
  • Non-Patent Document 8 Takeda, SN. et al., Structure of the miniature type V-F CRISPR-Cas effector enzyme, Molecular Cell 81, 1-13 (2021)
  • Non-Patent Document 9 Xiao, R. et al., Structural basis for the dimerization-dependent CRISPR-Cas12f nuclease, bioRxiv, 1-20 (2020)
  • Non-Patent Document 10 Wang, D. et al., Adeno-associated virus vector as a platform for gene therapy delivery, Nat. Rev. Drug Discov., Vol. 18, no. 5, 358-378 (2019)
  • the object of the present invention is to solve all of the above problems.
  • An object of the present invention is to provide a Cas12f1 variant protein, a homologous protein thereof, or a small endonuclease comprising the same, which is distinguished from the Cas proteins that act as endonucleases in conventional CRISPR systems.
  • Another object of the present invention is to provide a guide RNA capable of increasing the indel efficiency of the Cas12f1 mutant protein or a homologous protein thereof.
  • Another object of the present invention is to provide a gene editing system based on the Cas12f1 mutant protein.
  • Another object of the present invention is to provide a nucleic acid encoding a gene editing molecule based on a Cas12f1 mutant protein or a vector system for expressing the same.
  • Another object of the present invention is to provide a composition for gene editing based on the Cas12f1 mutant protein.
  • Another object of the present invention is to provide a method for editing a gene using a Cas12f1 mutant protein.
  • a small endonuclease including a Cas12f1 variant protein or a homolog protein thereof and a polynucleotide encoding the same are provided.
  • A gene editing system comprising a small endonuclease comprising a Cas12f1 mutant protein or a homologous protein thereof, and a guide RNA, is provided.
  • A vector system is provided comprising a first nucleic acid construct to which a nucleotide sequence encoding a small endonuclease including a Cas12f1 mutant protein or a homologous protein thereof is operably linked, and/or a second nucleic acid construct to which a nucleotide sequence encoding a guide RNA is operably linked.
  • A gene editing composition comprising the gene editing system, the vector system, or both systems is provided.
  • A gene editing method comprising the step of contacting the gene editing system, the vector system, or the gene editing composition with a target gene or a target nucleic acid is provided.
  • The present invention provides a miniature gene editing system (named "Hypercompact TaRGET system") comprising a novel Cas12f1 variant-based CRISPR protein, not previously known as an endonuclease, and an engineered guide RNA that exhibits excellent gene editing efficiency when used together with the mutant protein.
  • The miniaturized gene editing system of the present invention has the advantage that all of the gene editing tools required for editing various target genomes can be loaded into a single adeno-associated virus (AAV) vector.
  • Gene editing systems containing proteins such as Cas9 or Cpf1, which have mainly been used for chromosome editing, are too large to be packaged readily into an AAV vector, a clinically validated intracellular delivery means. The miniaturized gene editing system is presented as a new gene editing system that solves this limitation, the biggest obstacle to using AAV vectors for gene editing.
  • the miniaturized gene editing system according to the present invention exhibits excellent target gene editing efficiency by including a novel Cas12f1-based mutant CRISPR protein and a guide RNA engineered appropriately therefor.
  • FIG. 1 shows Modification Sites (MS) MS1 to MS5 for an engineered guide RNA (hereinafter “augment RNA”) according to an embodiment of the present invention.
  • Figures 2a and 2b show exemplary structures showing various modification sites for producing engineered single guide RNA (sgRNA) according to an embodiment of the present invention:
  • Figure 2a shows exemplary modification sites of the canonical sgRNA for the Cas12f1 mutant.
  • Figure 2b shows an exemplary modification site of mature form sgRNA for Cas12f1 variants engineered according to an embodiment of the present invention.
  • Figures 3a to 3d show the indel efficiency, with augment RNA according to an embodiment of the present invention, of Cas12f1, the Cas12f1 variant, Cas12f1 variant v1, Cas12f1 variant v2, Cas12f1 variant v3, and the Cas12f1 variant (SEQ ID NO: 1) with amino acids added to its N-terminus or C-terminus.
  • FIG. 3a shows the results of measuring the indel efficiency of Cas12f1 variants with respect to the target sequence, Target-1.
  • Figure 3b shows the results of measuring the indel efficiency of Cas12f1 variants with respect to the target sequence, Target-2.
  • Figure 3c shows the results of measuring the indel efficiency of Cas12f1 variants with respect to the target sequence, Target-3.
  • Figure 3d shows the result of measuring the indel efficiency for the target sequences Target-1 and Target-2 of the Cas12f1 mutant protein to which amino acids are added to the N-terminus or C-terminus.
  • Cas12f1_ge4.0 was used as a guide RNA.
  • Figure 4 shows the results of comparative measurement of intracellular indel efficiencies of existing gene editing proteins (SpCas9, AsCas12a) and Cas12f1 variants.
  • Figures 5a and 5b show the results of measuring the indel efficiency (%) of the miniature gene editing system including the Cas12f1 variant and augment RNA having one or more modifications of MS1 to MS5 in each region of the wild-type guide RNA.
  • Figure 5a shows the indel efficiency (%) for the target sequence Target-1.
  • Figure 5b shows the indel efficiency (%) for the target sequence Target-2.
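The figures report indel efficiency as a percentage. This passage does not say how the percentage is computed, so the following is only a sketch under the common assumption that it is the fraction of sequencing reads at the target site that carry an insertion or deletion; the function name and the read counts are hypothetical.

```python
def indel_efficiency_percent(indel_reads: int, total_reads: int) -> float:
    """Percentage of reads at the target site that contain an indel.

    Assumption: efficiency = 100 * (reads with indel) / (all aligned reads).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must be between 0 and total_reads")
    return 100.0 * indel_reads / total_reads


# Hypothetical counts: 250 edited reads out of 1000 aligned reads.
print(indel_efficiency_percent(250, 1000))
```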
  • Figure 6 shows the result of confirming the indel efficiency (%) for the target sequence Target-1 of the microgene editing system including augment RNA and Cas12f1 variant according to an embodiment of the present invention.
  • Figures 7a to 7d show the results of measuring the indel efficiency (%) of the miniature gene editing system including the Cas12f1 variant and augment RNA further having one or more modifications of MS3 to MS5 in each region of the mature form sgRNA.
  • Figures 7a and 7b are graphs showing indel efficiency (%) for the target sequence Target-1, respectively.
  • Figures 7c and 7d are graphs showing indel efficiency (%) for the target sequence Target-2, respectively.
  • FIG. 8a and 8b show the results of measuring the indel efficiency (%) for the target sequence Target-1 of the microgene editing system including augment RNA and Cas12f1 variant according to an embodiment of the present invention:
  • FIG. 8a is a graph showing the indel efficiency (%) when using augmented RNA with MS3-3, MS3-3/MS4-3 or MS3-3/MS4-3/MS5-3 modifications in mature form sgRNA.
  • Figure 8b is a graph showing the indel efficiency (%) when using augmented RNA having MS3-3, MS3-3/MS4-3 or MS3-3/MS4-3/MS5-3 modifications and MS2 modifications in mature form sgRNA.
  • The term "genome editing protein" or "nucleolytic protein" refers to an (endo)nuclease that, after recognizing a protospacer adjacent motif (PAM) in a target nucleic acid (DNA or RNA) or target gene, is capable of generating a DNA double strand break (DSB) at a base pair (bp) inside or outside the target nucleic acid sequence.
  • the gene editing protein or nucleic acid degradation protein is also referred to as an effector protein constituting a gene editing system or a nucleic acid construct for gene editing.
  • the effector protein may be a nucleolytic protein capable of binding to guide RNA (gRNA) or engineered RNA, or a peptide fragment capable of binding to a target nucleic acid or target gene.
  • The term "gene editing CRISPR/Cas system" or "gene editing system" refers to a complex containing a gene editing protein or nuclease, such as a Cas endonuclease, and a nucleic acid targeting molecule corresponding to the nuclease; that is, a complex capable of binding to a target nucleic acid or target gene and cleaving or editing a target site of the target nucleic acid or gene.
  • the nucleic acid targeting molecule may be represented by guide RNA (gRNA), but is not limited thereto.
  • The term "Hypercompact TaRGET system" refers to a complex containing a nucleic acid degrading enzyme, such as a small gene editing protein or a small endonuclease, and a nucleic acid targeting molecule corresponding to the nucleic acid degrading enzyme. It is used as a term differentiated from conventional gene editing systems.
  • the nucleic acid targeting molecule may be represented by guide RNA (gRNA), but is not limited thereto.
  • the system refers to a complex capable of binding to a target nucleic acid or target gene and cleaving or editing the target site of the target nucleic acid or gene.
  • The term "nucleic acid construct" refers to a construct comprising, as components, a nucleotide sequence encoding a genome editing protein or a nucleolytic protein and/or a nucleotide sequence encoding a guide RNA. It may further include a nucleotide sequence encoding another kind of (poly)peptide or a linker.
  • the nucleic acid construct can be used as a vector for gene editing or a component constituting the hypercompact TaRGET system of the present invention.
  • The term "target nucleic acid" refers to a gene or nucleic acid that is targeted for gene editing by a miniature gene editing system (e.g., the Hypercompact TaRGET system).
  • Target nucleic acids or target genes may be used interchangeably and may refer to the same target.
  • The target gene may be any of an intrinsic gene or nucleic acid of the target cell, a gene or nucleic acid derived from the outside, or an artificially synthesized nucleic acid or gene, and may be single-stranded DNA, double-stranded DNA, and/or RNA.
  • the target gene or target nucleic acid is not particularly limited as long as it can be a target of gene editing by the miniaturized gene editing system.
  • target region refers to a sequence present in a target nucleic acid or target gene, which is recognized by the miniaturized gene editing system of the present invention to cleave the target gene or target nucleic acid.
  • target site or target sequence may be appropriately selected depending on the purpose.
  • The term "guide RNA" refers to an RNA that is capable of forming a complex with a gene editing protein or a nucleolytic protein and that comprises a guide sequence sufficiently complementary to a target nucleic acid sequence to hybridize with it and cause sequence-specific binding of the complex to the target nucleic acid sequence. "Guide molecule" and "guide RNA" are used interchangeably herein.
  • The term "scaffold region" refers to the part of a guide RNA (gRNA) that can interact with genome editing proteins or nucleolytic proteins, and may refer to the remainder of a naturally occurring guide RNA excluding the spacer.
  • spacer sequence refers to a polynucleotide that hybridizes to a portion of a target sequence in a miniaturized gene editing system.
  • the spacer sequence refers to 10 to 50 consecutive nucleotides near the 3'-end of the crRNA of the guide RNA in the microgene editing system.
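The scaffold and spacer definitions can be illustrated with a short sketch that joins a scaffold and a spacer into one guide RNA while enforcing the 10 to 50 nucleotide spacer length given above. Placing the spacer at the very 3' end of the guide is a simplifying assumption, and the function name and example sequences are hypothetical placeholders, not sequences from the patent.

```python
VALID_RNA_BASES = set("ACGU")


def build_guide_rna(scaffold: str, spacer: str,
                    min_len: int = 10, max_len: int = 50) -> str:
    """Return scaffold + spacer (5' to 3'), validating the spacer length.

    Assumption: the spacer sits at the 3' end of the guide RNA.
    """
    scaffold, spacer = scaffold.upper(), spacer.upper()
    if not min_len <= len(spacer) <= max_len:
        raise ValueError(f"spacer must be {min_len} to {max_len} nt long")
    if not set(scaffold + spacer) <= VALID_RNA_BASES:
        raise ValueError("sequences may only contain A, C, G, and U")
    return scaffold + spacer


# Placeholder sequences for illustration only.
guide = build_guide_rna("GGAAUCACC", "ACGUACGUACGUACGUACGU")
print(len(guide))
```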
  • The terms "tracrRNA" and "crRNA" include all meanings recognizable to one of ordinary skill in the art of gene editing. They can be used to refer to each molecule of a dual guide RNA found in nature, and can also be used to refer to each corresponding part of a single guide RNA in which the tracrRNA and crRNA are connected by a linker. Unless otherwise stated, when only tracrRNA and crRNA are described, they mean the tracrRNA and crRNA constituting a genome editing system.
  • a vector refers collectively to any material capable of carrying genetic material into a cell, unless otherwise specified.
  • A vector may be, but is not limited to, a DNA molecule including a nucleic acid encoding an effector protein of a genome editing system, which is the genetic material to be delivered, and/or a nucleic acid encoding a guide RNA (gRNA).
  • the "vector” may be an "expression vector” containing essential regulatory elements operably linked so that the inserted gene is normally expressed.
  • operably linked means, in gene expression technology, that a particular element is linked to another element so that the particular element can function in its intended manner.
  • The term "engineered" is used to distinguish a substance or molecule from substances, molecules, etc. having a configuration that already exists in nature, and means that an artificial modification has been applied to the substance or molecule.
  • For example, an "engineered guide RNA" is a guide RNA (gRNA) in which artificial changes have been made to the composition of a guide RNA (gRNA) existing in nature. It may be referred to as augmented RNA within this specification.
  • The terms "polynucleotide" and "nucleic acid" may be used interchangeably and refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides. Accordingly, the terms refer to single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or polymers comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
  • polynucleotide and “nucleic acid” should be understood to include single-stranded (eg sense or antisense) and double-stranded polynucleotides, as applicable to the embodiments described herein.
  • The term "polypeptide" refers to a polymeric form of amino acids of any length, which may include genetically encoded and non-genetically encoded amino acids, chemically or biochemically modified or derivatized amino acids, and polypeptides having modified peptide backbones. The term includes fusion proteins, including, but not limited to, fusion proteins with heterologous amino acid sequences; fusions with heterologous and homologous leader sequences, with or without an N-terminal methionine residue; immunologically tagged proteins; and the like.
  • A, T, C, G, and U may be interpreted as a base, nucleoside, or nucleotide of DNA or RNA, depending on context and description.
  • When meaning a base, each may be interpreted as one selected from among adenine (A), guanine (G), cytosine (C), thymine (T), and uracil (U).
  • When meaning a nucleoside, each may be interpreted as adenosine (A), thymidine (T), cytidine (C), guanosine (G), or uridine (U), respectively, and when meaning a nucleotide in a sequence, each should be interpreted as a nucleotide including the corresponding nucleoside.
  • The term "about" means an amount, level, value, number, frequency, percentage, dimension, size, weight, or length that varies by 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1% relative to a reference amount, level, value, number, frequency, percentage, dimension, size, weight, or length.
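The definition of "about" amounts to a relative tolerance check. A minimal sketch follows; the function name and the default tolerance of 10% are illustrative choices, and any of the listed percentages (30 down to 1) could be passed instead.

```python
def is_about(value: float, reference: float, tolerance_pct: float = 10.0) -> bool:
    """Return True if value lies within tolerance_pct percent of reference."""
    if tolerance_pct < 0:
        raise ValueError("tolerance_pct must be non-negative")
    allowed_deviation = abs(reference) * tolerance_pct / 100.0
    return abs(value - reference) <= allowed_deviation


# 105 is within 10% of 100; 131 is outside even the widest 30% band.
print(is_about(105, 100, 10))
print(is_about(131, 100, 30))
```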
  • The present inventors found that the TnpB protein, known as a factor constituting a transposase derived from a bacterial species, has an amino acid sequence similar to that of the Cas12f1 protein and is superior to existing nucleolytic proteins, including the Cas9 protein, which has been studied the most to date.
  • The present inventors defined the protein as a Cas12f1 mutant protein and identified for the first time that it exhibits high-efficiency activity as an ultra-small gene editing protein.
  • the present inventors have developed a Cas12f1 variant, a subminiature gene editing protein, as a system that can be easily loaded into an adeno-associated virus (AAV) vector and can be effectively delivered in vivo to edit target nucleic acids or target genes in cells.
  • The present inventors confirmed for the first time that more efficient gene editing with a wider range of applications is possible by using the new subminiature gene editing protein, the Cas12f1 mutant protein, rather than a previously known Cas endonuclease such as Cas9 or Cpf1.
  • The present invention was completed based on the finding that the novel miniature gene editing system can be used for various kinds of genome editing.
  • The present invention provides a hypercompact TaRGET system comprising a small endonuclease comprising a Cas12f1 mutant protein or a homologous protein thereof, for use in specifically and efficiently editing a target nucleic acid or target gene, and a guide RNA.
  • The present invention also relates to a small nucleic acid editing construct or vector comprising a small endonuclease containing a Cas12f1 mutant protein or a homologous protein thereof, a method for editing a target site in a target nucleic acid or a target gene using the same, and a composition therefor.
  • The subminiature gene editing system according to the present invention is a meaningful result in that it overcomes the constraint that most previously studied Cas endonucleases, and gene editing systems including them, are too large to be loaded into an adeno-associated virus (AAV) vector, which is approved by the FDA as an intracellular delivery vehicle.
  • The hypercompact TaRGET system has high gene editing specificity and editing efficiency for cleaving a specific target region of a target nucleic acid or target gene.
  • As a target nucleic acid editing system, the hypercompact TaRGET system according to the present invention has a wide range of applications, both in research on technologies for various kinds of gene editing and as a new treatment for gene-related diseases.
  • A Cas12f1 variant protein or homologous protein thereof, or a (small) endonuclease comprising the same, is provided, characterized in that it exhibits excellent activity in cleaving a target site of a target nucleic acid and that the size of the nucleolytic protein is significantly smaller (about 1/3) than that of the existing CRISPR/Cas9 system.
  • Cas12f1 variant proteins include both naturally occurring Cas12f1 variants and engineered Cas12f1 variants.
  • The Cas12f1 mutant protein includes the amino acid sequence of SEQ ID NO: 1 or an amino acid sequence in which 1 to 28 amino acids are removed or substituted from the N-terminus of that sequence (provided that the Cas12f1 protein consisting of the amino acid sequence of SEQ ID NO: 5 is excluded).
  • the Cas12f1 variant protein may contain or be a sequence derived from the transposase accessory protein TnpB protein of the IS200/IS605 family similar in size to the Cas12f1 protein belonging to the V-F subtype of Class 2, type V CRISPR/nucleolysis proteins.
  • The TnpB protein is a protein conventionally known as a transposase. Until now, the TnpB protein has been known only as a transposon-encoded nuclease, and it has not been known whether the TnpB protein has Cas endonuclease activity. A guide RNA for the TnpB protein is also not known.
  • The present inventors confirmed for the first time that a Cas12f1 variant partially based on the TnpB protein sequence, or an engineered Cas12f1 variant, which is similar in size to the Cas12f1 protein belonging to the smallest group of nucleolytic proteins, has excellent endonuclease activity for targeting and editing a target nucleic acid or target gene, and completed the present invention by producing a guide RNA that exhibits excellent editing efficiency when used together with the Cas12f1 mutant protein.
  • The Cas12f1 mutant protein belongs to the group with the smallest molecular weight among currently known nucleolytic proteins and is highly effective at targeting and editing target nucleic acids or target genes by forming a complex with the engineered short guide RNA (gRNA) of the present invention, so it offers a great advantage in constructing a miniaturized gene editing system for intracellular application.
  • The Cas12f1 mutant protein or its homologous protein, which is a small gene editing protein, has a T-rich PAM such as 5'-TTTA-3' or 5'-TTTG-3'. It is therefore possible to select a thymine (T)-rich sequence as a target nucleic acid or target gene, which broadens the choice of nucleolytic proteins for genome editing.
  • the Cas12f1 variant protein may be a variant protein comprising or consisting of the amino acid sequence of SEQ ID NO: 1.
  • the Cas12f1 mutant protein may be a Cas12f1 mutant protein comprising or consisting of an amino acid sequence in which 1 to 28 amino acids are removed or substituted at the N-terminus of the amino acid sequence of SEQ ID NO: 1.
  • the Cas12f1 protein consisting of the amino acid sequence of SEQ ID NO: 5 is not included.
  • the Cas12f1 mutant protein may be a Cas12f1 mutant protein comprising or consisting of any one amino acid sequence selected from the group consisting of SEQ ID NO: 2 to SEQ ID NO: 4.
  • the Cas12f1 mutant protein may further include one or more amino acids in the Cas12f1 protein consisting of the amino acid sequence of SEQ ID NO: 5.
  • the Cas12f1 variant protein may comprise or consist of a Cas12f1 variant v1 protein (SEQ ID NO: 2) comprising the N-terminal 26 aa of CasX at the N-terminus of the Cas12f1 protein, a Cas12f1 variant v2 protein (SEQ ID NO: 3) comprising a 28-aa random sequence, or a Cas12f1 variant v3 protein (SEQ ID NO: 4) comprising a 26-aa random sequence.
  • the homologous protein of the Cas12f1 variant protein may be a TnpB protein from various biological species or include variants derived therefrom.
  • the homologous protein may include any one amino acid sequence selected from the group consisting of SEQ ID NO: 141 to SEQ ID NO: 232.
  • Homologous proteins are proteins that share the same in vivo activity (i.e., endonuclease activity) as the Cas12f1 variant protein and that, regardless of their sequence similarity (or identity), are conserved without losing the characteristics derived from a common ancestor.
  • the Cas12f1 mutant protein may consist of an amino acid sequence in which 1 to 600 amino acids are added to the N-terminus or C-terminus of the amino acid sequence of SEQ ID NO: 1.
  • the amino acid sequence added to the N-terminus or C-terminus may be the amino acid sequence of SEQ ID NO: 233 or SEQ ID NO: 234.
  • An NLS sequence may be further included between the additional sequence and the Cas12f1 mutant protein.
  • the Cas12f1 mutant protein may have the same function as the wild-type Cas12f1 protein, or may have an altered function compared to the wild-type Cas12f1 protein. More specifically, the alteration includes modification of all or part function, loss of all or part function, and/or addition of additional function.
  • the Cas12f1 mutant protein may include any modifications without particular limitation, as long as those skilled in the art can apply them to the nucleolytic protein of the microgene editing system.
  • the Cas12f1 mutant protein may be used not only for DNA double-strand cleavage, but also for cleavage of single-stranded DNA or RNA or of DNA-RNA hybrid duplexes, and for base editing or prime editing.
  • the miniaturized gene editing system of the present invention cleaves nucleic acid at a target site of a target nucleic acid or target gene.
  • the target site may be located in the nucleus of a cell.
  • the Cas12f1 mutant protein or its homologue protein used in the miniaturized gene editing system of the present invention may include one or more nuclear localization signal (NLS) sequences that localize it into the nucleus.
  • one or more nuclear localization signal sequences may have an amount or activity strength sufficient to induce the Cas12f1 variant protein or its homolog protein to be targeted into the nucleus in a detectable amount in the nucleus of a eukaryotic cell (including mammalian cell).
  • the difference in activity intensity may be caused by the number of NLSs included in the Cas12f1 mutant protein or its homologue protein, the type of specific NLS(s) used, or a combination of these factors.
  • the variant protein or homolog thereof may have about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the amino-terminus (N-terminus), about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the carboxy-terminus (C-terminus), or a combination thereof.
  • it may include zero or one or more NLS sequences at the amino-terminus (N-term) and/or zero or one or more NLS sequences at the carboxy-terminus (C-term).
  • each NLS sequence may be selected independently of the others, such that a single NLS can be present in more than one copy and can be combined with one or more other NLSs that are themselves present in one or more copies.
  • the NLS sequence is heterologous to the protein, exemplified by, but not limited to, the NLS sequence below.
  • the NLS may be, for example: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO: 54); the bipartite NLS from nucleoplasmin, having the sequence KRPAATKKAGQAKKKK (SEQ ID NO: 55); the c-myc NLS, having the amino acid sequence PAAKRVKLD (SEQ ID NO: 56) or RQRRNELKRSP (SEQ ID NO: 57); the NLS sequence of hRNPA1 M9; the NLS sequence of the IBB domain from importin-alpha; the NLS sequences of myoma T protein and of human p53; the NLS sequence of mouse c-abl IV; the NLS sequence of influenza virus NS1; the NLS sequence of hepatitis virus delta antigen; the NLS sequence of mouse Mx1 protein; the NLS sequence of human poly(ADP-ribose) polymerase; or the NLS sequence of the steroid hormone receptor (human) glucocorticoid.
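  • The NLS arrangements described above can be sketched in Python. This is only an illustration: the four peptide sequences are those quoted in the text, while the function name `fuse_nls`, the dictionary keys, and the toy protein are hypothetical and do not correspond to any sequence in the sequence listing.

```python
# NLS peptides quoted in the text, keyed by hypothetical short names.
NLS = {
    "SV40": "PKKKRKV",                    # SEQ ID NO: 54
    "nucleoplasmin": "KRPAATKKAGQAKKKK",  # SEQ ID NO: 55
    "c-myc_a": "PAAKRVKLD",               # SEQ ID NO: 56
    "c-myc_b": "RQRRNELKRSP",             # SEQ ID NO: 57
}


def fuse_nls(protein, n_term=(), c_term=()):
    """Return the protein with the named NLS peptides added at each terminus.

    Zero, one, or several NLSs (including repeated copies) may be given
    for either terminus; they are concatenated in the order listed.
    """
    prefix = "".join(NLS[name] for name in n_term)
    suffix = "".join(NLS[name] for name in c_term)
    return prefix + protein + suffix


# Hypothetical placeholder peptide, NOT a real Cas12f1 sequence.
toy_protein = "MGSAAAGGGKLE"
fused = fuse_nls(toy_protein, n_term=["SV40"], c_term=["nucleoplasmin", "SV40"])
```

  • Here one NLS is placed at the N-terminus and two (of different types) at the C-terminus; an NLS may appear in more than one copy, as in the description.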
  • the Cas12f1 mutant protein or its homologue protein may be a fusion of various enzymes that may be involved in gene expression in cells.
  • the Cas12f1 analog protein to which the enzyme is fused may cause various quantitative and/or qualitative changes in gene expression in cells.
  • the various enzymes additionally coupled may be DNMT, TET, KRAB, DHAC, LSD, p300, Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase or variants thereof.
  • the Cas12f1 mutant protein or its homolog protein to which the reverse transcriptase is fused can also function as a prime editor.
  • the following two conditions are required for the miniaturized gene editing system to locate a target nucleic acid or target site of a target gene and accurately cleave the target site nucleic acid.
  • nucleotide sequence of a certain length that can be recognized by the Cas12f1 mutant protein or its homologue protein in the target nucleic acid or target gene.
  • gRNA guide RNA
  • a target nucleic acid or a target site nucleic acid of a target gene can be precisely cut or edited.
  • a nucleotide sequence of a certain length recognized by the Cas12f1 mutant protein or its homologue protein is referred to as a protospacer adjacent motif (PAM) sequence.
  • the PAM sequence is a unique sequence determined according to the Cas12f1 mutant protein, which is a small gene editing protein, or its homologue protein. This means that when determining the target sequence of the Cas12f1 mutant protein or its homologous protein complex in the microgene editing system, the target sequence must be determined within a sequence adjacent to the PAM sequence.
  • the PAM sequence of the Cas12f1 mutant protein or its homologue protein may be a T-rich sequence. More specifically, the PAM sequence of the Cas12f1 mutant protein or its homologue protein may be 5'-TTTN-3'.
  • N is one of deoxythymidine (T), deoxyadenosine (A), deoxycytidine (C), or deoxyguanosine (G).
  • the PAM sequence of the Cas12f1 mutant protein or its homologue protein may be 5'-TTTA-3', 5'-TTTT-3', 5'-TTTC-3' or 5'-TTTG-3'.
  • the PAM sequence of the Cas12f1 mutant protein or its homologue protein may be 5'-TTTA-3' or 5'-TTTG-3'.
  • the PAM sequence of the Cas12f1 mutant protein or its homologous protein may be different from the PAM sequence of the wild-type Cas12f1 mutant protein (or a protein consisting of a sequence derived from TnpB).
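  • A minimal Python sketch of PAM-adjacent target selection follows. It assumes, which the text above does not state, that the protospacer lies immediately 3' of the PAM on the scanned strand and is 20 nt long; the function name `find_targets`, the default length, and the example DNA are hypothetical. Only one strand is scanned.

```python
def find_targets(dna, pams=("TTTA", "TTTG"), spacer_len=20):
    """List (pam_start, pam, protospacer) for each PAM found on this strand.

    pam_start is a 0-based index. A hit is reported only if a full
    protospacer of spacer_len nucleotides follows the 4-nt PAM.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 4 - spacer_len + 1):
        pam = dna[i:i + 4]
        if pam in pams:
            hits.append((i, pam, dna[i + 4:i + 4 + spacer_len]))
    return hits


# Hypothetical example: one 5'-TTTG-3' PAM followed by a 20-nt stretch.
example_dna = "CC" + "TTTG" + "ACGTACGTACGTACGTACGT" + "GG"
hits = find_targets(example_dna)
```

  • Passing `pams=("TTTA", "TTTT", "TTTC", "TTTG")` widens the scan to the full 5'-TTTN-3' set.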
  • Embodiments of the present invention were devised to overcome the limits on intracellular delivery imposed by the molecular weight of the Cas9 protein of the prior art. Therefore, in addition to selecting the low-molecular-weight Cas12f1 mutant protein or its homologue protein as the gene editing protein of the miniaturized gene editing system of the present invention, the guide RNA was artificially engineered to be much shorter than the naturally occurring guide RNA (gRNA) for the Cas12f1 mutant or its homologue, so as to minimize size and increase indel efficiency at the target.
  • the guide RNA (gRNA) that exists in nature for the Cas12f1 variant protein may be a guide RNA (gRNA) found in nature for the Cas12f1 protein that is similar in size to the Cas12f1 variant protein.
  • the guide RNA (gRNA) may have the nucleotide sequence of SEQ ID NO: 6.
  • the guide RNA (gRNA) for the Cas12f1 mutant protein is an engineered guide RNA (engineered gRNA or augment RNA) obtained by adding a new structure to, or modifying part of the structure of, a guide RNA (gRNA) found in nature. It is characterized by including a U-rich tail, a new component, at the 3'-end of the guide RNA (gRNA).
  • the guide RNA is a guide RNA engineered by deleting, replacing, or adding one or more nucleotides in the wild-type guide RNA consisting of the nucleotide sequence of SEQ ID NO: 6, and the spacer portion of the engineered guide RNA complementary to the target sequence may consist of 15 to 50 nucleotides.
  • the engineered guide RNA can include an engineered tracrRNA sequence comprising scaffold regions 1 to 4 and/or engineered crRNA sequences comprising scaffold regions 5 to 6.
  • the engineered guide RNA may include a U-rich tail sequence as a seventh region at the 3' end of the crRNA.
  • the engineered guide RNA may be engineered at one or more transformation sites selected from transformation sites MS1 to MS5.
  • Figure 1 shows MS1 to MS5, which are modified sites included in an engineered guide RNA according to an embodiment of the present invention.
  • the engineered guide RNA may include an engineered tracrRNA sequence comprising a scaffold in which one or more of the first to fourth regions is modified, an engineered crRNA sequence comprising a scaffold in which one or more of the fifth to sixth regions is modified, and/or a U-rich tail sequence that is a modified seventh region.
  • the fourth region of tracrRNA and the fifth region of crRNA are complementary binding sites and include modification site 1 (MS1) and modification site 4 (MS4) of the guide RNA (gRNA).
  • the seventh region, the U-rich tail sequence corresponds to modification site 2 (MS2).
  • the first region includes modification site 3 (MS3), and the second region includes modification site 5 (MS5).
  • the engineered guide RNA includes modifications in any one of MS1 to MS5 above, and may include any combination of one or more modifications selected from among them.
  • the miniaturized gene editing system may be engineered to optimize the lengths of tracrRNA and crRNA constituting the guide RNA, and to remove unnecessary scaffold sequences to produce highly efficient guide RNA. Manipulation in the scaffold sequence made it possible to produce a short guide RNA, and as a result, the guide RNA synthesis cost was reduced and additional loading space was secured when inserted into a viral vector.
  • the miniaturized gene editing system including the engineered guide RNA optimized for the Cas12f1 mutant protein of the present invention greatly improves the cleavage or editing efficiency of the target nucleic acid or target gene and can also be loaded into an adeno-associated virus (AAV) vector, which makes it more advantageous for use as a therapeutic.
  • the wild-type tracrRNA contains five contiguous uridines (U), and this sequence acts as a transcription termination signal under specific conditions. When the five contiguous uridines act as a termination signal, normal expression of the tracrRNA is inhibited and formation of normal guide RNA is also inhibited, which reduces the cleavage or editing efficiency of the miniaturized gene editing system of the present invention for the target nucleic acid or target gene.
  • the present inventors therefore developed an artificially modified tracrRNA in which at least one uridine (U) of the five contiguous uridines of the wild-type tracrRNA (SEQ ID NO: 58) is converted to another base (A, C, T or G).
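  • The poly-uridine check and substitution described above can be sketched as follows. This is an illustration under stated assumptions: the fourth-region sequence is taken from SEQ ID NO: 14 as quoted in this description with its 5'-NNNNN-3' stretch written as UUUUU (an assumed wild-type form), the replacement 5'-GUGCU-3' is the example substitution quoted in this description, and the function names are hypothetical.

```python
import re


def has_poly_u(rna, run=5):
    """Return True if rna contains `run` or more contiguous uridines."""
    return re.search(r"U{%d,}" % run, rna.upper()) is not None


def replace_first_u5(rna, replacement="GUGCU"):
    """Replace the first UUUUU stretch with a same-length replacement."""
    if len(replacement) != 5:
        raise ValueError("replacement must be 5 nt long")
    return rna.replace("UUUUU", replacement, 1)


# Fourth region with NNNNN written as UUUUU (assumed wild-type form).
wild_type_region4 = "CAAAUUCAUUUUUCCUCUCCAAUUCUGCACAA"
modified_region4 = replace_first_u5(wild_type_region4)
```

  • After the substitution, `has_poly_u` no longer finds a run of five uridines in the region.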
  • the engineered guide RNA is one in which a new structure is added to a guide RNA found in nature and a part of the structure is removed or modified, and a new structure, a U-rich tail, is included at the 3'-end.
  • the engineered guide RNA including the U-rich tail serves to increase the rate of nucleic acid cleavage or editing of target nucleic acids or target genes in the microgene editing system.
  • the present inventors produced a highly efficient guide RNA that forms a complex with a Cas12f1 mutant protein or a homologous protein thereof and increases the cleavage or editing efficiency of a target nucleic acid or target gene, and thereby completed a micro-gene editing system including it.
  • the engineered guide RNA is characterized in that at least a portion of the scaffold region interacting with the Cas12f1 mutant protein is modified.
  • the scaffold region includes tracrRNA and parts of crRNA, and does not necessarily refer to one molecule of RNA.
  • the sequence of the engineered guide RNA may include an engineered tracrRNA sequence comprising a modification in one or more of the scaffold first to fourth regions and/or an engineered crRNA sequence comprising a modification in one or more of the scaffold fifth to sixth regions. It may further include a U-rich tail sequence, which is a modified seventh region.
  • engineered guide RNA may further include a linker or tag, if necessary.
  • the engineered scaffold region may be a combination of modifications in any one or more regions of the first to seventh regions described above with scaffold regions found in nature.
  • the engineered tracrRNA may be a tracrRNA modified so as not to include five or more contiguous uridine sequences (MS1 modification).
  • the engineered tracrRNA may be a tracrRNA modified not to include five or more contiguous uridine sequences and modified to have a shorter length than wild-type tracrRNA.
  • the engineered tracrRNA may include a first region, a second region, a third region, and a fourth region (including MS1 modification) in order from the 5'-end to the 3'-end.
  • the engineered crRNA may include a fifth region, a sixth region, and a guide sequence (spacer sequence) in order from the 5'-end to the 3'-end.
  • the fourth region may include any polynucleotide having sufficient complementarity to bind to the direct repeat of the crRNA.
  • the first region may be a 5'-CUUCACUGAUAAAGUGGAGAA-3' (SEQ ID NO: 7) sequence or a partial sequence of SEQ ID NO: 7 sequence.
  • the partial sequence of SEQ ID NO: 7 may be the 3'-terminal portion that remains after nucleotides are removed sequentially from the 5'-end of SEQ ID NO: 7.
  • the first region may be 5'-GAUAAAGUGGAGAA-3' (SEQ ID NO: 8), 5'-UGGAGAA-3' or 5'-A-3'.
  • the engineered tracrRNA may have all sequences corresponding to the first region (sites 1-21) removed.
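  • The sequential 5'-end removal described for the first region can be sketched as follows. The function name `five_prime_truncations` is hypothetical; the sequences are those quoted above.

```python
REGION1 = "CUUCACUGAUAAAGUGGAGAA"  # SEQ ID NO: 7, positions 1-21 of the tracrRNA


def five_prime_truncations(seq):
    """Return every non-empty sequence left after removing 1..len-1 nt from the 5'-end."""
    return [seq[i:] for i in range(1, len(seq))]


variants = five_prime_truncations(REGION1)

# The three first-region examples given in the text are all 3'-terminal remainders.
examples = ["GAUAAAGUGGAGAA", "UGGAGAA", "A"]
all_found = all(example in variants for example in examples)
```

  • Removing all 21 nucleotides corresponds to the case in which the entire first region is deleted.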
  • the second region may be a 5'-CCGCUUCACCAAAAGCUGUCCCUUAGGGGAUUAGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 9) sequence or a partial sequence of SEQ ID NO: 9 sequence.
  • Part of the sequence of SEQ ID NO: 9 may be a sequence in which at least one pair of nucleotides forming a complementary bond and/or at least one or more nucleotides not forming a complementary bond in the sequence of SEQ ID NO: 9 are deleted.
  • the second region may be a 5'-CCGCUUCACCAAUUAGUUGAGUGAAGGUG-3' (SEQ ID NO: 10) sequence, a 5'-CCGCUUCACCAAAAGCUUAGGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 11) sequence, or a 5'-CCGCUUCACCAAAAGCUGUUUAGAUUAGAAUCUUGAGUGAAGGUG-3' (SEQ ID NO: 12) sequence.
  • the loop portion included in any one of SEQ ID NOs: 9 to 12 is a 5'-UUAG-3' sequence, which may be substituted with a 5'-GAAA-3' sequence if necessary.
  • the third region (front part of MS4, 72-129 region) may be a 5'-GGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAA-3' (SEQ ID NO: 13) sequence or a sequence having at least 70% sequence identity with the sequence of SEQ ID NO: 13.
  • the fourth region (part of MS4 including MS1, sites 130-161) may be a 5'-CAAAUUCANNNNNCCUCUCCAAUUCUGCACAA-3' (SEQ ID NO: 14) sequence or a partial sequence of SEQ ID NO: 14 sequence.
  • the internal 5'-NNNNN-3' moiety is the MS1 moiety, where each N can be A, C, G or U.
  • a partial sequence of SEQ ID NO: 14 may be a sequence that includes the 5'-CAAAUUCANNNNN-3' (SEQ ID NO: 15) sequence of SEQ ID NO: 14 and does not include some sequences at the 3'-end.
  • the fourth region may include a 5'-NNNNN-3' region substituted with 5'-NNNVN-3' or 5'-NVNNN-3', wherein each N is independently A, C, G or U, and V may be A, C or G.
  • the fourth region may be a sequence including a 5'-CAAAUUCANNNCN-3' (SEQ ID NO: 18) sequence and not including some sequences at the 3'-end.
  • each N is independently A, C, G or U.
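  • The degenerate symbols used for the MS1 site can be checked mechanically. The sketch below translates a pattern written with N, V and B into a regular expression, using the meanings given in this description (N = A, C, G or U; V = A, C or G; B = U, C or G, which is used for the crRNA fifth region). The names `IUPAC_RNA` and `matches` are hypothetical.

```python
import re

# Degenerate RNA symbols as defined in this description.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "N": "[ACGU]",  # any nucleotide
    "V": "[ACG]",   # not U
    "B": "[CGU]",   # not A
}


def matches(pattern, rna):
    """Return True if rna has the same length as pattern and fits it position by position."""
    regex = "".join(IUPAC_RNA[symbol] for symbol in pattern)
    return re.fullmatch(regex, rna) is not None
```

  • For example, `matches("NNNVN", "GUGCU")` and `matches("NNNCN", "GUGCU")` are both true, whereas `matches("NNNVN", "UUUUU")` and `matches("NVNNN", "UUUUU")` are false, so either substituted form excludes the five-uridine stretch.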
  • the crRNA can be a wild-type crRNA or an engineered crRNA.
  • the wild-type crRNA may include a wild-type repeat sequence and a spacer sequence, which is a guide sequence, in order from the 5'-end to the 3'-end.
  • the wild-type repetitive sequence may be a 5'-GUUGCAGAACCCGAAUAGACGAAUGAAGGAAUGCAAC-3' (SEQ ID NO: 19) sequence.
  • the engineered guide RNA includes an engineered tracrRNA (trans-activating CRISPR RNA) or an engineered crRNA (CRISPR RNA), wherein the engineered tracrRNA is modified so as not to contain five or more contiguous uridines and to have fewer nucleotides, and thus a shorter sequence length, than wild-type tracrRNA, and the engineered crRNA may include the nucleotide sequence of SEQ ID NO: 19 or a partial sequence thereof.
  • the fifth region may be a 5'-GUUGCAGAACCCGAAUAGNNNNNUGAAGGA-3' (SEQ ID NO: 20) sequence or a partial sequence of SEQ ID NO: 20 sequence.
  • Each N can independently be A, C, G or U.
  • A partial sequence of SEQ ID NO: 20 may be a sequence that includes the 5'-NNNNNUGAAGGA-3' (SEQ ID NO: 21) sequence of SEQ ID NO: 20 and does not include some nucleotides at the 5'-end (e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 or 17 nucleotides).
  • the fifth region may include the sequence 5'-NBNNNUGAAGGA-3' (SEQ ID NO: 22).
  • each N may independently be A, C, G or U
  • the B may be U, C or G.
  • the sixth region may be the 5'-AUGCAAC-3' (SEQ ID NO: 23) sequence or a sequence having at least 70% sequence identity or sequence similarity to the 5'-AUGCAAC-3' sequence.
  • the engineered crRNA may further include a U-rich tail sequence as a seventh region at the 3'-end of the crRNA.
  • the U-rich tail sequence may be a 5'-(UaN)dUe-3' sequence, a 5'-UaVUaVUe-3' sequence, or a 5'-UaVUaVUaVUe-3' sequence.
  • N may each be A, C, G or U
  • V may independently be A, C or G.
  • each a may be an integer of 0 to 4
  • d may be an integer of 0 to 3.
  • e may be an integer from 0 to 10.
  • the seventh region may be U4AU4 (i.e., 5'-UUUUAUUUU-3').
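  • The general form 5'-(UaN)dUe-3' can be expanded programmatically. In this sketch each repeat is passed as an (a, N) pair so that a and N may differ between repeats, and d is simply the number of pairs; the function name `u_rich_tail` and this parameterization are illustrative assumptions, while the ranges for a, d and e are those given above.

```python
def u_rich_tail(units, e):
    """Build 5'-(U_a N)_d U_e-3' from a list of (a, N) pairs and a trailing U count.

    units: one (a, N) pair per repeat, so d == len(units); 0 <= d <= 3
    a:     number of uridines before N in that repeat; 0 <= a <= 4
    N:     one of A, C, G or U
    e:     number of trailing uridines; 0 <= e <= 10
    """
    if not 0 <= len(units) <= 3:
        raise ValueError("d must be an integer from 0 to 3")
    if not 0 <= e <= 10:
        raise ValueError("e must be an integer from 0 to 10")
    tail = ""
    for a, n in units:
        if not 0 <= a <= 4 or len(n) != 1 or n not in "ACGU":
            raise ValueError("each unit needs 0 <= a <= 4 and N in A, C, G, U")
        tail += "U" * a + n
    return tail + "U" * e


u4au4 = u_rich_tail([(4, "A")], 4)  # the U4AU4 tail
```

  • With V (A, C or G) in place of N and two or three repeats, the same function yields the 5'-UaVUaVUe-3' and 5'-UaVUaVUaVUe-3' forms.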
  • the engineered crRNA may include a fifth region (including MS1 modification), a sixth region, and a guide sequence (spacer sequence) in order from the 5'-end to the 3'-end.
  • the fifth region may be a 5'-GUUGCAGAACCCGAAUAGNBNNNUGAAGGA-3' (SEQ ID NO: 59) sequence or a partial sequence of SEQ ID NO: 59 sequence.
  • N may independently be A, C, G or U
  • B may be U, C or G.
  • Part of the sequence of SEQ ID NO: 59 may be a sequence that includes the 5'-NBNNNUGAAGGA-3' (SEQ ID NO: 60) sequence of the sequence of SEQ ID NO: 59 and does not include a partial sequence at the 3'-end.
  • N may independently be A, C, G or U.
  • B can be U, C or G.
  • Engineered guide RNA can be a dual guide RNA or a single guide RNA.
  • the engineered guide RNA may further include a linker sequence.
  • the linker sequence may be located between the engineered tracrRNA and crRNA, and may be 5'-GAAA-3' or 5'-UUAG-3'.
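  • The difference between a single guide RNA and a dual guide RNA can be sketched as string assembly. The order used here (tracrRNA, linker, crRNA repeat, spacer, U-rich tail at the 3'-end) follows the description; the function names, the short tracrRNA fragment used as a stand-in, and the spacer are hypothetical placeholders, not sequences from the sequence listing.

```python
def single_guide(tracr, cr_repeat, spacer, linker="GAAA", tail=""):
    """Join 5'-tracrRNA, linker, crRNA repeat, spacer and tail-3' into one molecule."""
    return tracr + linker + cr_repeat + spacer + tail


def dual_guide(tracr, cr_repeat, spacer, tail=""):
    """Return (tracrRNA, crRNA) as two separate molecules, with no linker."""
    return tracr, cr_repeat + spacer + tail


TRACR = "CAAAUUCAGUGCU"                   # placeholder fragment standing in for a full tracrRNA
CR_REPEAT = "GAAUAGAGCAAUGAAGGAAUGCAAC"  # engineered crRNA repeat quoted in this description
SPACER = "ACGUACGUACGUACGUACGU"          # hypothetical 20-nt guide sequence

sg = single_guide(TRACR, CR_REPEAT, SPACER, tail="UUUUAUUUU")
dg_tracr, dg_crrna = dual_guide(TRACR, CR_REPEAT, SPACER, tail="UUUUAUUUU")
```

  • Passing `linker="UUAG"` selects the other linker mentioned above.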
  • the engineered crRNA may include the sequence 5'-GUUGCAGAACCCGAAUAGNGNNNUGAAGGAAUGCAAC-3' (SEQ ID NO: 28).
  • each N may independently be A, C, G or U.
  • the engineered tracrRNA may include or consist of any one nucleotide sequence selected from SEQ ID NO: 29 to SEQ ID NO: 32, which are obtained from SEQ ID NO: 24 (MS1), SEQ ID NO: 25 (MS1/MS3), SEQ ID NO: 26 (MS1/MS5-3) or SEQ ID NO: 27 (MS1/MS3/MS5-3) by substituting the internal 5'-NNNCN-3' sequence with the 5'-GUGCU-3' sequence.
  • the engineered crRNA may contain or consist of the nucleotide sequence 5'-GUUGCAGAACCCGAAUAGAGCAAUGAAGGAAUGCAAC-3' (SEQ ID NO: 33), in which the internal 5'-NGNNN-3' sequence of 5'-GUUGCAGAACCCGAAUAGNGNNNUGAAGGAAUGCAAC-3' (SEQ ID NO: 28) is substituted with the 5'-AGCAA-3' sequence.
  • the engineered crRNA may include a 5'-GAAUAGNGNNNUGAAGGAAUGCAAC-3' (SEQ ID NO: 38) sequence and a guide sequence.
  • each N may independently be A, C, G or U.
  • the engineered tracrRNA may contain or consist of any one nucleotide sequence selected from SEQ ID NO: 39 to SEQ ID NO: 42, which are obtained from SEQ ID NO: 34 (MS1/MS4-2), SEQ ID NO: 35 (MS1/MS3/MS4-2), SEQ ID NO: 36 (MS1/MS5-3/MS4-2) or SEQ ID NO: 37 (MS1/MS3/MS5-3/MS4-2) by substituting the internal 5'-NNNCN-3' sequence with the 5'-GUGCU-3' sequence.
  • the engineered crRNA may comprise or consist of the sequence 5'-GAAUAGAGCAAUGAAGGAAUGCAAC-3' (SEQ ID NO: 43), in which the internal 5'-NGNNN-3' sequence of 5'-GAAUAGNGNNNUGAAGGAAUGCAAC-3' (SEQ ID NO: 38) is substituted with the 5'-AGCAA-3' sequence.
  • the engineered tracrRNA may have the first stem-loop structure and/or the second stem-loop structure from the 5'-end of wild-type tracrRNA removed (herein, the first region containing the MS3 modification and the second region containing the MS5 modification).
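  • The substitution of the degenerate MS1 stretch in the crRNA can be reproduced with a small helper. The templates and the replacement are those quoted in this description (SEQ ID NOs: 28 and 38, and 5'-AGCAA-3'); the function name `fill_placeholder` is hypothetical.

```python
def fill_placeholder(template, placeholder, replacement):
    """Replace the single degenerate stretch in template with a concrete sequence."""
    if len(placeholder) != len(replacement):
        raise ValueError("replacement must have the same length as the placeholder")
    if template.count(placeholder) != 1:
        raise ValueError("placeholder must occur exactly once in the template")
    return template.replace(placeholder, replacement)


SEQ28 = "GUUGCAGAACCCGAAUAGNGNNNUGAAGGAAUGCAAC"
SEQ38 = "GAAUAGNGNNNUGAAGGAAUGCAAC"

full_length = fill_placeholder(SEQ28, "NGNNN", "AGCAA")  # SEQ ID NO: 33
shortened = fill_placeholder(SEQ38, "NGNNN", "AGCAA")    # SEQ ID NO: 43
```

  • The shortened result is the 3'-terminal 25 nucleotides of the full-length one.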
  • the second stem-loop structure may be removed while still having at least one double helix duplex structure composed of at least two or more nucleotides, and the loop structure may not be removed.
  • the engineered tracrRNA may be one that does not contain five or more contiguous uridines at any polynucleotide site (herein, the fourth region comprising MS1 and/or MS4 modifications) having sufficient complementarity to bind with the crRNA sequence.
  • the engineered tracrRNA may include a sequence of 1 to 10, 2 to 10, 3 to 10, 4 to 10, 5 to 10, 6 to 10, 1 to 8, 2 to 8, 3 to 8, 4 to 8, 5 to 8, 1 to 6, 2 to 6, or 3 to 6 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10) arbitrary nucleotides having sufficient complementarity to bind to the crRNA sequence.
  • the engineered guide RNA may comprise any one nucleotide sequence selected from the group consisting of SEQ ID NO: 44 to SEQ ID NO: 48 and SEQ ID NO: 105 to SEQ ID NO: 137, or a sequence in which a U-rich tail sequence is added to the 3'-end of any one nucleotide sequence selected from the group consisting of SEQ ID NOs: 125 to 137.
  • the engineered guide RNA (augment RNA) for the subminiature gene editing protein of the present invention, that is, the Cas12f1 mutant protein or its homolog protein, may be an augment RNA having a modification in MS1 (SEQ ID NO: 44), an augment RNA having modifications in MS1/MS2 (SEQ ID NO: 45), augment RNA Cas12f1_ge3.0 (SEQ ID NO: 46) with modifications in MS1/MS2/MS3, augment RNA Cas12f1_ge4.0 (SEQ ID NO: 47) with modifications in MS2/MS3/MS4, and/or augment RNA Cas12f1_ge4.1 (SEQ ID NO: 48) with modifications in MS2/MS3/MS4/MS5.
  • for details of the engineered guide RNA (augment RNA) of the present invention, the disclosures of the engineered guide RNA, engineered tracrRNA (trans-activating CRISPR RNA) and engineered crRNA (CRISPR RNA) in the PCT/KR2020/014961, PCT/KR2021/013923, PCT/KR2021/013933 and PCT/KR2021/013898 applications are incorporated herein by reference.
  • the guide RNA can be divided into a sequence portion that interacts with the Cas12f1 mutant protein to form the guide RNA and Cas12f1 mutant protein complex, a sequence portion that enables the complex to find a target nucleic acid, and a U-rich tail sequence portion.
  • a sequence portion that interacts with the Cas12f1 mutant protein to form a guide RNA and Cas12f1 mutant protein complex may be referred to as a scaffold sequence.
  • the scaffold sequence may include sequences of two or more RNA molecules, such as tracrRNA and crRNA.
  • the scaffold sequence may include a tracrRNA sequence among augment RNA sequences and a CRISPR RNA repeat sequence included in crRNA.
  • the tracrRNA sequence may be a modified version of all or part of a tracrRNA sequence found in nature.
  • the CRISPR RNA repeat sequence may be a modified version of all or part of the CRISPR RNA repeat sequence found in nature.
  • the scaffold sequence may include an engineered tracrRNA sequence, a linker sequence, and a CRISPR RNA repeat sequence included in the engineered crRNA sequence.
  • the engineered tracrRNA sequence may be a modified version of all or part of a tracrRNA sequence found in nature.
  • the scaffold region includes tracrRNA and part of crRNA, and does not necessarily refer to one molecule of RNA.
  • the scaffold region may be further subdivided into a first region, a second region, a third region, a fourth region, a fifth region, and a sixth region.
  • the subdivided regions are described according to the tracrRNA and crRNA regions, the first to fourth regions are included in the tracrRNA, and the fifth to sixth regions are included in the crRNA, that is, the crRNA repeat sequence portion.
  • the engineered scaffold region is different from the scaffold region of a guide RNA found in nature, and a portion of the scaffold region is modified.
  • the engineered scaffold region may be obtained by removing some of guide RNA scaffold regions found in nature.
  • the engineered scaffold region may be obtained by removing one or more nucleotides included in the scaffold region of a guide RNA found in nature.
  • the scaffold region is a region that includes parts of tracrRNA and crRNA, and is a region that interacts with the Cas12f1 mutant protein or its homologue protein.
  • the first region of the scaffold is a region including the 5'-end of tracrRNA. The first region may include nucleotides forming a stem structure in the guide RNA and the Cas12f1 mutant protein complex, and may include nucleotides adjacent thereto.
  • the first region may include a region that does not interact with the Cas12f1 mutant protein in the guide RNA and the Cas12f1 mutant protein complex.
  • the first region may mean from the 1st nucleotide to the 21st nucleotide from the 5'-end of the wild-type tracrRNA comprising the nucleotide sequence of SEQ ID NO: 58.
  • the sequence of the first region may be 5'-CUUCACUGAUAAAGUGGAGAA-3' (SEQ ID NO: 7).
  • the sequence of the first region may be a part of the sequence of SEQ ID NO: 7.
  • The part of the sequence of SEQ ID NO: 7 may be the 3'-terminal portion that remains after nucleotides are removed sequentially from the 5'-end of SEQ ID NO: 7.
  • the first region may be 5'-GAUAAAGUGGAGAA-3' (SEQ ID NO: 8), 5'-UGGAGAA-3' or 5'-A-3'.
  • the engineered tracrRNA may have all sequences corresponding to the first region (sites 1-21) removed.
  • the scaffold second region refers to a region located in the 3'-end direction of the first region in tracrRNA.
  • the second region may include nucleotides forming a stem structure within the guide RNA and the Cas12f1 mutant protein complex, and may include nucleotides adjacent thereto. At this time, the stem structure is different from the stem included in the first region.
  • the second region includes Stem 2 part (Takeda et al., Structure of the miniature type V-F CRISPR-Cas effector enzyme, Molecular Cell 81, 1-13 (2021)).
  • the second region may include one or more nucleotides adjacent to the Stem 2 portion.
  • the second region may include a region that does not interact with the Cas12f1 variant protein in the guide RNA and the Cas12f1 variant protein complex.
  • the second region may mean from the 22nd nucleotide to the 71st nucleotide from the 5'-end of the wild-type tracrRNA comprising the nucleotide sequence of SEQ ID NO: 58.
  • the sequence of the second region may be 5'-CCGCUUCACCAAAAGCUGUCCCUUAGGGGAUUAGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 9).
  • the second region may be a part of the sequence of SEQ ID NO: 9.
  • Part of the sequence of SEQ ID NO: 9 may be a sequence in which at least one pair of nucleotides forming a complementary bond and/or at least one or more nucleotides not forming a complementary bond in the sequence of SEQ ID NO: 9 are deleted.
  • it may be a 5'-CCGCUUCACCAAUUAGUUGAGUGAAGGUG-3' (SEQ ID NO: 10) sequence, a 5'-CCGCUUCACCAAAAGCUUAGGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 11) sequence, or a 5'-CCGCUUCACCAAAAGCUGUUUAGAUUAGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 12) sequence.
  • the loop portion included in any one of SEQ ID NOs: 9 to 12 is a 5'-UUAG-3' sequence, which may be substituted with a 5'-GAAA-3' sequence if necessary.
  • the scaffold third region refers to a region located in the 3'-end direction of the second region in tracrRNA.
  • the third region may include nucleotides forming complementary bonds with nucleotides forming the stem structure in the guide RNA and the Cas12f1 protein complex and some nucleotides included in the crRNA, and may include nucleotides adjacent thereto.
  • the third region may mean from the 72nd nucleotide to the 129th nucleotide from the 5'-end of the wild-type tracrRNA comprising the nucleotide sequence of SEQ ID NO: 58.
  • the sequence of the third region may be a 5'-GGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAA-3' (SEQ ID NO: 13) sequence or a sequence having at least 70% sequence homology to the SEQ ID NO: 13 sequence.
  • the scaffold fourth region refers to a region located in the 3'-end direction of the third region of tracrRNA.
  • the fourth region includes nucleotides capable of forming complementary bonds with some nucleotides included in the crRNA in the guide RNA and the Cas12f1 mutant protein complex, and may include nucleotides adjacent thereto.
  • the fourth region may include one or more nucleotides complementarily binding to one or more nucleotides included in the fifth region of crRNA.
  • the fourth region may include a region that does not interact with the Cas12f1 variant protein in the guide RNA and the Cas12f1 variant protein complex.
  • the fourth region may mean from the 130th nucleotide to the 161st nucleotide from the 5'-end of the wild-type tracrRNA comprising the nucleotide sequence of SEQ ID NO: 58.
  • the fourth region is a portion of MS4 including MS1, and may be a 5'-CAAAUUCANNNNNCCUCUCCAAUUCUGCACAA-3' (SEQ ID NO: 14) sequence or a partial sequence of SEQ ID NO: 14 sequence.
  • an internal 5'-NNNNN-3' portion is an MS1 portion, and N may be A, C, G, or U, respectively.
  • a part of the sequence of SEQ ID NO: 14 may be a sequence that includes a 5'-CAAAUUCANNNNN-3' (SEQ ID NO: 15) sequence of the sequence of SEQ ID NO: 14 and does not include a part of the sequence at the 3'-end. Specifically, it may be a 5'-CAAAUUCANNNNNCCUCCAAUUC-3' (SEQ ID NO: 16) sequence, a 5'-CAAAUUCANNNNNCCUCUC-3' (SEQ ID NO: 17) sequence, or a 5'-CAAAUUCANNNNN-3' (SEQ ID NO: 15) sequence.
  • the fourth region may include a region in which the 5'-NNNNN-3' site is substituted with 5'-NNNVN-3' or 5'-NVNNN-3'.
  • each N may independently be A, C, G or U.
  • V may be A, C or G.
  • the fourth region may be a sequence including a 5'-CAAAUUCANNNCN-3' (SEQ ID NO: 18) sequence and not including some sequences at the 3'-end.
  • N is each independently A, C, G or U.
  • the fifth region of the scaffold refers to the region containing the 5'-end of the crRNA.
  • the fifth region includes a nucleotide forming a complementary bond with one or more nucleotides of the fourth region in the guide RNA and the Cas12f1 mutant protein complex, and may include a nucleotide adjacent thereto.
  • the fifth region may include one or more nucleotides complementarily binding to one or more nucleotides included in the fourth region.
  • the fifth region may include a region that does not interact with the Cas12f1 mutant protein in the guide RNA and the Cas12f1 mutant protein complex.
  • the fifth region may mean from the 1st nucleotide to the 30th nucleotide from the 5'-end of the wild-type crRNA repeat sequence including the nucleotide sequence of SEQ ID NO: 19.
  • the fifth region may be a 5'-GUUGCAGAACCCGAAUAGNNNNNUGAAGGA-3' (SEQ ID NO: 20) sequence or a partial sequence of SEQ ID NO: 20 sequence.
  • N may independently be A, C, G or U.
  • Part of the sequence of SEQ ID NO: 20 may be a sequence that includes the 5'-NNNNNUGAAGGA-3' (SEQ ID NO: 21) sequence among the nucleotide sequences of SEQ ID NO: 20 and does not include a partial sequence at the 5'-end.
  • the fifth region may include the sequence 5'-NBNNNUGAAGGA-3' (SEQ ID NO: 22).
  • each N may independently be A, C, G or U.
  • B may be U, C or G.
  • the sequence of the fifth region may be 5'-GUUGCAGAACCCGAAUAGNBNNNUGAAGGA-3' (SEQ ID NO: 59).
  • each N may independently be A, C, G or U.
  • B may be U, C or G.
  • the sequence of the fifth region may be 5'-GUUGCAGAACCCGAAUAGACGAAUGAAGGA-3' (SEQ ID NO: 65).
  • the fifth region may mean from the 21st nucleotide to the 30th nucleotide from the 5'-end of the wild-type crRNA repeat sequence including the nucleotide sequence of SEQ ID NO: 19.
  • the sequence of the fifth region may be 5'-GAAUGAAGGA-3' (SEQ ID NO: 66).
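The degenerate symbols used above (N, V, B, and R elsewhere in this description) follow the usual IUPAC nucleotide codes. As a minimal sketch, assuming those standard meanings and a hypothetical helper name `matches`, the following Python checks whether a concrete RNA sequence is an instance of a degenerate pattern such as SEQ ID NO: 59. It is an illustration only, not a procedure taken from the specification.

```python
import re

# IUPAC codes as used in this section (RNA alphabet, U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "N": "[ACGU]",  # any base
    "V": "[ACG]",   # any base except U
    "B": "[CGU]",   # any base except A
    "R": "[AG]",    # purine
}

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a degenerate RNA pattern into a compiled regular expression."""
    return re.compile("".join(IUPAC_RNA[ch] for ch in pattern))

def matches(pattern: str, seq: str) -> bool:
    """Return True if seq (5'->3') is a full-length instance of pattern."""
    return pattern_to_regex(pattern).fullmatch(seq) is not None

SEQ_ID_59 = "GUUGCAGAACCCGAAUAGNBNNNUGAAGGA"  # degenerate fifth region
SEQ_ID_65 = "GUUGCAGAACCCGAAUAGACGAAUGAAGGA"  # concrete fifth region

print(matches(SEQ_ID_59, SEQ_ID_65))
```

Here the position coded B is occupied by C in SEQ ID NO: 65, so the concrete sequence satisfies the pattern; a sequence carrying A at that position would not.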
  • the sixth region of the scaffold refers to a region located in the 3'-end direction of the fifth region in crRNA. It may mean from the 31st nucleotide to the 37th nucleotide from the 5'-end of the wild-type crRNA repeat sequence comprising the nucleotide sequence of SEQ ID NO: 19.
  • the sixth region may be a 5'-AUGCAAC-3' (SEQ ID NO: 23) sequence or a sequence having at least 70% sequence homology to the 5'-AUGCAAC-3' sequence.
  • the sixth region includes a nucleotide that forms a complementary bond with one or more nucleotides of the third region within the guide RNA and the Cas12f1 mutant protein complex, and may include a nucleotide adjacent thereto.
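The fifth and sixth regions are defined by 1-based nucleotide positions counted from the 5'-end of the wild-type crRNA repeat (SEQ ID NO: 19). The short Python sketch below, with a hypothetical helper `region`, shows how those positions map onto the sequences quoted in this section.

```python
WT_CRRNA_REPEAT = "GUUGCAGAACCCGAAUAGACGAAUGAAGGAAUGCAAC"  # SEQ ID NO: 19

def region(seq: str, first: int, last: int) -> str:
    """Return nucleotides first..last (1-based, inclusive, from the 5'-end)."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("positions out of range")
    return seq[first - 1:last]

fifth_full = region(WT_CRRNA_REPEAT, 1, 30)    # fifth region, nucleotides 1-30
fifth_short = region(WT_CRRNA_REPEAT, 21, 30)  # fifth region, nucleotides 21-30
sixth = region(WT_CRRNA_REPEAT, 31, 37)        # sixth region, nucleotides 31-37

print(fifth_full)
print(fifth_short)
print(sixth)
```

The three slices reproduce SEQ ID NO: 65, SEQ ID NO: 66 and SEQ ID NO: 23 respectively, which confirms that the position ranges and the quoted sequences are mutually consistent.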
  • the engineered crRNA according to an embodiment of the present invention may further include a U-rich tail sequence as a seventh region at the 3'-end of the crRNA.
  • the engineered scaffold region synergizes with the aforementioned U-rich tail, thereby improving the gene editing efficiency of the miniaturized gene editing system using the engineered guide RNA.
  • the U-rich tail sequence may be 5'-(UaN)dUe-3', 5'-UaVUaVUe-3' or 5'-UaVUaVUaVUe-3'.
  • N may be A, C, G or U, and V may independently be A, C or G.
  • a may be an integer of 0 to 4.
  • d may be an integer of 0 to 3.
  • e may be an integer of 0 to 10.
  • the seventh region may be U4AU4 (that is, 5'-UUUUAUUUU-3').
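The formula 5'-(UaN)dUe-3' can be read as "d repeats of a uridines followed by one base N, then e uridines". The following Python sketch expands the formula for given parameters. The function name `u_rich_tail` is hypothetical, and the sketch assumes the same value of a and the same base N in every repeat, which is a simplification of the general formula.

```python
def u_rich_tail(a: int, n: str, d: int, e: int) -> str:
    """Expand 5'-(UaN)dUe-3' into an explicit RNA sequence.

    a: uridines per repeat unit (0 to 4)
    n: the single base closing each repeat unit (A, C, G or U)
    d: number of repeat units (0 to 3)
    e: trailing uridines (0 to 10)
    """
    if not (0 <= a <= 4 and 0 <= d <= 3 and 0 <= e <= 10):
        raise ValueError("a, d or e outside the ranges given in the text")
    if len(n) != 1 or n not in "ACGU":
        raise ValueError("n must be one of A, C, G, U")
    return ("U" * a + n) * d + "U" * e

print(u_rich_tail(a=4, n="A", d=1, e=4))  # the U4AU4 tail
print(u_rich_tail(a=0, n="A", d=0, e=6))  # a plain run of six uridines
```

With a = 4, N = A, d = 1 and e = 4 the formula yields the U4AU4 tail mentioned above.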
  • the engineered guide RNA may include a U-rich tail rich in uridine (U) at the 3'-terminal portion.
  • the U-rich tail sequence is basically rich in uridine, and includes a sequence in which one or more uridines are consecutive.
  • the U-rich tail sequence may further include additional bases other than uridine, depending on the actual use environment and expression environment of the engineered miniaturized gene editing system, for example, the internal environment of eukaryotic cells or prokaryotic cells.
  • the U-rich tail sequence provided in the embodiment of the present invention more preferably includes one ribonucleoside other than uridine (A, C or G) after every run of 1 to 5 consecutive uridines (U).
  • the modified uridine contiguous sequence is particularly useful when designing a vector expressing an engineered crRNA.
  • the U-rich tail sequence may include a sequence in which one or more of UV, UUV, UUUV, UUUUV, and/or UUUUUV is repeated.
  • the V is one of adenosine (A), cytidine (C), and guanosine (G).
  • the sequence of the U-rich tail can be expressed as (UaN)bUc.
  • N is one of A, U, C, or G.
  • a, b, and c are integers, a may be 1 or more and 5 or less, b may be 0 or more and 2 or less, and c may be 1 or more and 10 or less.
  • the sequence of the U-rich tail may be 5'-U-3', 5'-UU-3', 5'-UUU-3', 5'-UUUU-3', 5'-UUUUU-3', 5'-UUUUUU-3', 5'-UUURUUU-3' (SEQ ID NO: 67), 5'-UUURUUURUUU-3' (SEQ ID NO: 68), 5'-UUUURU-3' (SEQ ID NO: 69), 5'-UUUURUU-3' (SEQ ID NO: 70), 5'-UUUURUUU-3' (SEQ ID NO: 71), 5'-UUUURUUU-3' (SEQ ID NO: 72), 5'-UUUURUUUU-3' (SEQ ID NO: 73) or 5'-UUUURUUUUUU-3' (SEQ ID NO: 74).
  • R may be A or G.
  • the U-rich tail sequence may be one in which R is A in any one of SEQ ID NOs: 67 to 74, and may include or consist of any one nucleotide sequence selected from the group consisting of SEQ ID NOs: 75 to 82.
  • the U-rich tail sequence may be one in which R is G in any one of SEQ ID NOs: 67 to 74, and may include or consist of any one nucleotide sequence selected from the group consisting of SEQ ID NOs: 83 to 90.
  • the sequence of the U-rich tail is 5'-UUUAUUUU-3' (SEQ ID NO: 80), 5'-UUUAUUUUU-3' (SEQ ID NO: 82), 5'-UUUUGUUUUUU-3' (SEQ ID NO: 90), or 5'-UUUUUU-3' (SEQ ID NO: 91).
  • N may be A, C, G or U.
  • the tracrRNA comprises a first region, a second region, a third region and a fourth region.
  • a first region, a second region, a third region, and a fourth region are sequentially linked from the 5'-end to the 3'-end.
  • the sequence of the crRNA in the scaffold region includes a crRNA repeat sequence and a spacer sequence.
  • the crRNA repeat sequence may be 5'-GAAUGAAGGAAUGCAAC-3' (SEQ ID NO: 63) or 5'-GGAAUGCAAC-3' (SEQ ID NO: 64).
  • the crRNA repeat sequence may include a fifth region and a sixth region.
  • the spacer sequence may vary depending on the target sequence, and generally includes 10 to 50 nucleotides.
  • the crRNA has a fifth region, a sixth region, and a spacer connected in order from the 5'-end to the 3'-end.
  • the crRNA can be a wild-type crRNA or an engineered crRNA.
  • the crRNA may include a wild-type repeat sequence and a spacer sequence, which is a guide sequence, in order from the 5'-end to the 3'-end.
  • the wild-type repetitive sequence may be a 5'-GUUGCAGAACCCGAAUAGACGAAUGAAGGAAUGCAAC-3' (SEQ ID NO: 19) sequence.
  • a spacer sequence is a sequence complementary to a target site sequence in a target nucleic acid or target gene, and is linked to the 3'-end side of the crRNA repeat sequence.
  • the spacer sequence is a sequence homologous to the protospacer sequence adjacent to the PAM (Protospacer Adjacent Motif) sequence recognized by the Cas12f1 mutant protein, and thymidine (T) in the protospacer sequence is substituted with uridine (U).
  • the spacer sequence portion of the crRNA may complementarily bind to the target nucleic acid. In one embodiment, the spacer sequence portion of the crRNA may complementarily bind to the target sequence portion of the target nucleic acid.
  • the spacer sequence when the target nucleic acid is double-stranded DNA, the spacer sequence may be a sequence complementary to the target sequence included in the target strand of the double-stranded DNA.
  • the spacer sequence when the target nucleic acid is double-stranded DNA, the spacer sequence may include a sequence homologous to a protospacer sequence included in a non-target strand of the double-stranded DNA.
  • the spacer sequence may have the same nucleotide sequence as the protospacer sequence, but may have a sequence in which each of thymidine (T) included in the nucleotide sequence is substituted with uridine (U).
  • the spacer sequence may include an RNA sequence corresponding to the DNA sequence of the protospacer.
  • the length of the spacer sequence may be 10 nucleotides to 50 nucleotides in length.
  • the spacer sequence may be between 17 nucleotides and 30 nucleotides in length. More preferably, the length of the spacer sequence may be 17 nucleotides to 25 nucleotides in length.
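The relationship between protospacer, spacer and target strand described above can be sketched in a few lines of Python. The function names and the example protospacer are hypothetical and chosen only for illustration; the length tiers are the ranges stated in this section.

```python
def spacer_from_protospacer(protospacer_dna: str) -> str:
    """Spacer RNA: same sequence as the protospacer (non-target strand), T -> U."""
    dna = protospacer_dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("protospacer must contain only A, C, G, T")
    return dna.replace("T", "U")

def target_strand_site(protospacer_dna: str) -> str:
    """Reverse complement of the protospacer: the target-strand sequence
    (written 5'->3') to which the spacer binds complementarily."""
    complement = str.maketrans("ACGT", "TGCA")
    return protospacer_dna.upper().translate(complement)[::-1]

def spacer_length_tier(spacer: str) -> str:
    """Classify spacer length against the ranges given in the text."""
    n = len(spacer)
    if 17 <= n <= 25:
        return "17-25 nt (more preferred)"
    if 17 <= n <= 30:
        return "17-30 nt"
    if 10 <= n <= 50:
        return "10-50 nt"
    return "outside 10-50 nt"

protospacer = "CACACACACAGTGGGCTACC"  # hypothetical 20-nt protospacer
spacer = spacer_from_protospacer(protospacer)
print(spacer, spacer_length_tier(spacer))
print(target_strand_site(protospacer))
```

PAM recognition is deliberately left out of this sketch, since the PAM sequence recognized by the Cas12f1 mutant protein is not specified in this section.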
  • Engineered guide RNAs according to embodiments of the present invention may be single guide RNAs or dual guide RNAs.
  • Dual guide RNA means that the guide RNA is composed of two molecules of RNA, tracrRNA and crRNA.
  • Single guide RNA (sgRNA) means that the 3'-end of the engineered tracrRNA and the 5'-end of the engineered crRNA are connected through a linker.
  • the engineered single guide RNA further includes a linker sequence, and the tracrRNA sequence and the crRNA sequence may be linked through the linker sequence.
  • the 3'-end of the fourth region and the 5'-end of the fifth region included in the engineered scaffold may be connected through a linker. More preferably, the 3'-end of the fourth region and the 5'-end of the fifth region may be linked by a linker 5'-GAAA-3'.
  • a tracrRNA sequence, a linker sequence, a crRNA sequence, and a U-rich tail sequence are sequentially connected from the 5'-end to the 3'-end.
  • a portion of the tracrRNA sequence and all or a portion of the CRISPR RNA repeat sequence included in the crRNA sequence have sequences complementary to each other.
  • the single guide RNA may have a sequence selected from the group consisting of SEQ ID NOs: 44 to 48.
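The 5'-to-3' ordering of a single guide RNA described above (tracrRNA, linker, crRNA repeat and spacer, then the U-rich tail) can be illustrated with the following Python sketch. The function `assemble_sgrna` is hypothetical. The tracrRNA value is only a stand-in (the first-region sequence of SEQ ID NO: 7) and the spacer is a run of N placeholders, so the output is not a functional guide RNA.

```python
LINKER = "GAAA"                        # preferred linker named in the text
CRRNA_REPEAT = "GAAUGAAGGAAUGCAAC"     # SEQ ID NO: 63 (fifth + sixth regions)
U_RICH_TAIL = "UUUUAUUUU"              # U4AU4

def assemble_sgrna(tracr: str, crrna_repeat: str, spacer: str,
                   u_tail: str = "", linker: str = LINKER) -> str:
    """Concatenate sgRNA parts in 5'->3' order:
    tracrRNA - linker - crRNA repeat - spacer - U-rich tail."""
    return tracr + linker + crrna_repeat + spacer + u_tail

tracr_stand_in = "CUUCACUGAUAAAGUGGAGAA"  # placeholder only, not a full tracrRNA
spacer_placeholder = "N" * 20             # target-dependent, so left as N

sgrna = assemble_sgrna(tracr_stand_in, CRRNA_REPEAT, spacer_placeholder, U_RICH_TAIL)
print(sgrna)
print(len(sgrna))
```

Keeping the parts as separate arguments makes it straightforward to swap in a different engineered scaffold or tail without touching the spacer.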
  • the engineered guide RNA according to an embodiment of the present invention may be a dual guide RNA in which tracrRNA and crRNA form separate RNA molecules.
  • a portion of the tracrRNA and a portion of the crRNA may have complementary sequences to form a double-stranded RNA.
  • a portion including the 3'-end of tracrRNA and a portion including the CRISPR RNA repeat sequence of crRNA may form a double strand.
  • the engineered guide RNA can bind to the Cas12f1 mutant protein to form a complex between the guide RNA and the Cas12f1 mutant protein, and recognizes a target sequence complementary to a spacer sequence included in the crRNA sequence to edit a target nucleic acid including the target sequence.
  • the sequence of the tracrRNA may include a complementary sequence having 0 to 20 mismatches with the CRISPR RNA repeat sequence.
  • the tracrRNA sequence may include a complementary sequence with 0 to 8 or 8 to 12 mismatches with the CRISPR RNA repeat sequence.
  • the engineered guide RNA provided by the present invention may be a single molecule of single guide RNA (sgRNA).
  • the engineered scaffold region may be one in which one or more of each region is modified, and additionally, the 3'-end of the fourth region of tracrRNA and the 5'-end of the fifth region of crRNA may be connected through a linker.
  • the engineered scaffold region is modified at one or more places in the scaffold region found in nature, and the 3'-end of the fourth region and the 5'-end of the fifth region are linked through a linker.
  • the linker may be 5'-GAAA-3'.
  • the engineered scaffold region includes regions corresponding to each portion of a scaffold region found in nature. Specifically, the engineered scaffold region includes a first region, a second region, a third region, a fourth region, a fifth region, and a sixth region, which correspond to the first to sixth regions, respectively, of the scaffold region found in nature.
  • the engineered scaffold region may not include a region corresponding to the first region and/or the second region among scaffold regions found in nature.
  • the miniaturized gene editing protein is a protein comprising or consisting of any one amino acid sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 4, and the guide RNA may comprise or consist of any one nucleotide sequence selected from the group consisting of SEQ ID NO: 44 to SEQ ID NO: 48.
  • the miniaturized gene editing protein is a protein comprising or consisting of any one amino acid sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 4, and the guide RNA may be an augment RNA having a modification in MS1 (SEQ ID NO: 44), an augment RNA having modifications in MS1/MS2 (SEQ ID NO: 45), augment RNA Cas12f1_ge3.0 (SEQ ID NO: 46) with modifications in MS1/MS2/MS3, augment RNA Cas12f1_ge4.0 (SEQ ID NO: 47) with modifications in MS2/MS3/MS4, or augment RNA Cas12f1_ge4.1 (SEQ ID NO: 48) having modifications in MS2/MS3/MS4/MS5.
  • the guide RNA may be MS1/MS3-1 augment RNA (SEQ ID NO: 105), MS1/MS3-2 augment RNA (SEQ ID NO: 106), MS1/MS3-3 augment RNA (SEQ ID NO: 107), MS1/MS4*-1 augment RNA (SEQ ID NO: 108), MS1/MS4*-2 augment RNA (SEQ ID NO: 109), MS1/MS4*-3 augment RNA (SEQ ID NO: 110), MS1/MS5-1 augment RNA (SEQ ID NO: 111), MS1/MS5-2 augment RNA (SEQ ID NO: 112), MS1/MS5-3 augment RNA (SEQ ID NO: 113), MS1/MS2/MS4*-2 augment RNA (SEQ ID NO: 114), MS1/MS3-3/MS4*-2 augment RNA (SEQ ID NO: 115), MS1/MS2/MS5-3 augment RNA (SEQ ID NO: 116), MS1/MS3-3/MS5-3 sgRNA (SEQ ID NO: 117)
  • the guide RNA may be 5'-CUUCACUGAUAAAGUGGAGAACCGCUUCACCAAAAGCUGUCCCuuagGGGAUUAGAACUUGAGUGAAGGUGGGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAACAAAUUCA UUU gaaa GAA UGAAGGAAUGCAACNNNNNNNNNNNNNNNNNN-3' (SEQ ID NO: 12).
  • the guide RNA may be an augment RNA having some modification of the nucleic acid sequence in the mature form sgRNA.
  • MS3-1 augment RNA (SEQ ID NO: 125)
  • MS3-2 augment RNA (SEQ ID NO: 126)
  • MS3-3 augment RNA (SEQ ID NO: 127)
  • MS4-1 augment RNA (SEQ ID NO: 128)
  • MS4-2 augment RNA (SEQ ID NO: 129)
  • MS4-3 augment RNA (SEQ ID NO: 130)
  • MS5-1 augment RNA (SEQ ID NO: 131)
  • MS5-2 augment RNA (SEQ ID NO: 132)
  • MS5-3 augment RNA (SEQ ID NO: 133)
  • MS3-3/MS4-3 augment RNA (SEQ ID NO: 134)
  • MS3-3/MS5-3 augment RNA (SEQ ID NO: 135)
  • MS4-3/MS5-3 augment RNA (SEQ ID NO: 136)
  • MS3-3/MS4-3/MS5-3 augment RNA (SEQ ID NO: 137)
  • the augment RNA having some modification of the nucleic acid sequence in the mature form sgRNA may be an augment RNA in which the MS2 modification of the present invention is added to each of the engineered augment RNAs composed of the nucleotide sequences of SEQ ID NOs: 125 to 137.
  • the MS2 modification is a U-rich tail sequence, and the sequence may be a 5'-(UaN)dUe-3' sequence, a 5'-UaVUaVUe-3' sequence, or a 5'-UaVUaVUaVUe-3' sequence.
  • N can be A, C, G or U.
  • Each V may independently be A, C or G.
  • the a may be an integer of 0 to 4.
  • d may be an integer from 0 to 3.
  • e may be an integer from 0 to 10.
  • the MS2 modification may be U4AU4.
  • the engineered scaffold region included in the engineered guide RNA for the Cas12f1 mutant protein or its homolog protein may be one in which one or more nucleotides included in the first region of the scaffold region are removed. More specifically, the removed nucleotide may be a nucleotide included in a portion of the first region that forms a stem structure in the guide RNA and Cas12f1 mutant protein complex.
  • the removed nucleotide may be a nucleotide belonging to Stem 1 (Takeda et al., Structure of the miniature type V-F CRISPR-Cas effector enzyme, Molecular Cell 81, 1-13 (2021)) in the first region.
  • the removed nucleotide may be a nucleotide that does not interact with the Cas12f1 mutant protein in the guide RNA and Cas12f1 mutant protein complex in the first region.
  • the modified first region may be 5'-CUUCACUGAUAAAGUGGAGAA-3' (SEQ ID NO: 7) or a partial sequence of SEQ ID NO: 7.
  • the partial sequence of SEQ ID NO: 7 may be one in which at least 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 nucleotides are sequentially removed from the sequence of SEQ ID NO: 7.
  • the partial sequence of SEQ ID NO: 7 may be one in which nucleotides are sequentially removed starting from the 5'-end of SEQ ID NO: 7, so that the remaining sequence is the 3'-end portion.
  • the first region may be 5'-GAUAAAGUGGAGAA-3' (SEQ ID NO: 8), 5'-UGGAGAA-3', or 5'-A-3'. Alternatively, the entire first region may be removed.
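Sequential removal from the 5'-end of the first region can be illustrated with a simple slice. In the Python sketch below (the helper name `truncate_5p` is hypothetical), removing 7, 14 or 20 nucleotides from SEQ ID NO: 7 reproduces the three truncated first regions listed above, and removing all 21 nucleotides corresponds to deleting the first region entirely.

```python
FIRST_REGION = "CUUCACUGAUAAAGUGGAGAA"  # SEQ ID NO: 7, 21 nucleotides

def truncate_5p(seq: str, n_removed: int) -> str:
    """Remove n_removed nucleotides from the 5'-end, keeping the 3' portion."""
    if not 0 <= n_removed <= len(seq):
        raise ValueError("cannot remove more nucleotides than the sequence has")
    return seq[n_removed:]

for n in (7, 14, 20, 21):
    print(n, repr(truncate_5p(FIRST_REGION, n)))
```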
  • the engineered scaffold region may include a modified second region.
  • the modified second region is one in which one or more nucleotides are removed from the second region of the scaffold region.
  • the removed nucleotide is a nucleotide selected from a region forming a stem structure in the guide RNA and the Cas12f1 mutant protein complex.
  • the removal of the nucleotide may occur in a portion forming a stem structure in the second region, and the nucleotide may be removed in base pair units.
  • the removed nucleotide may be a nucleotide included in a portion forming a stem structure in the guide RNA and Cas12f1 mutant protein complex in the second region.
  • the modified second region (MS5 region, 22-71 region) of the engineered scaffold region may be a 5'-CCGCUUCACCAAAAGCUGUCCCUUAGGGGAUUAGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 9) sequence or a partial sequence of SEQ ID NO: 9. It may be one in which 1 to 50 nucleotides are removed from the second region of the scaffold region found in nature.
  • Part of the sequence of SEQ ID NO: 9 may be a sequence in which at least one pair of nucleotides forming a complementary bond and/or at least one or more nucleotides not forming a complementary bond in SEQ ID NO: 9 are deleted.
  • the 5'-UUAG-3' sequence of the loop portion included in some sequences of SEQ ID NO: 9 may be optionally substituted with the 5'-GAAA-3' sequence.
  • the second region may have a loop removed.
  • the modified second region may be one in which, based on the nucleotide sequence of SEQ ID NO: 9, one or more of the 1st to 22nd nucleotides and/or the 27th to 50th nucleotides from the 5'-end are removed from the second region of the scaffold region found in nature.
  • the modified second region may be one in which, based on the sequence of SEQ ID NO: 11, one or more of the 1st to 22nd nucleotides and/or the 27th to 50th nucleotides from the 5'-end are removed from the second region of the scaffold region found in nature, and the 23rd to 26th nucleotides may be substituted with other nucleotides.
  • it may be a 5'-CCGCUUCACCAAUUAGUUGAGUGAAGGUG-3' (SEQ ID NO: 10) sequence, a 5'-CCGCUUCACCAAAAGCUUAGGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 11) sequence, or a 5'-CCGCUUCACCAAAAGCUGUUUAGAUUAGAACUUGAGUGAAGGUG-3' (SEQ ID NO: 12) sequence.
  • the engineered scaffold region provided by the present invention may be a scaffold region found in nature in which the second region is removed.
  • the engineered scaffold region may not have a region corresponding to a second region of a scaffold region found in nature.
  • the sequence of the engineered scaffold region from which the second region is removed may be 5'-CUUCACUGAUAAAGUGGAGAAGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAACAAAUUCAUUUGAAAGAAUGAAGGAAUGCAAC-3' (SEQ ID NO: 92).
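The optional substitution of the 5'-UUAG-3' loop of the second region with 5'-GAAA-3' is best expressed by position, because the tetranucleotide UUAG occurs more than once in SEQ ID NO: 9. The Python sketch below uses a hypothetical helper `substitute` and assumes the loop is the 23rd to 26th nucleotides from the 5'-end, consistent with the position ranges given for the second region above.

```python
SECOND_REGION = "CCGCUUCACCAAAAGCUGUCCCUUAGGGGAUUAGAACUUGAGUGAAGGUG"  # SEQ ID NO: 9

def substitute(seq: str, first: int, last: int, new: str) -> str:
    """Replace nucleotides first..last (1-based, inclusive) with new."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("positions out of range")
    return seq[:first - 1] + new + seq[last:]

loop = SECOND_REGION[22:26]            # nucleotides 23-26
n_uuag = SECOND_REGION.count("UUAG")   # UUAG is not unique in the sequence
with_gaaa_loop = substitute(SECOND_REGION, 23, 26, "GAAA")

print(loop, n_uuag)
print(with_gaaa_loop)
```

A plain text search-and-replace on "UUAG" would also alter the second occurrence at nucleotides 31 to 34, which is why the sketch indexes by position instead.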
  • the engineered scaffold region may include a modified third region.
  • the modified third region is one in which one or more nucleotides are removed from the third region of the scaffold region found in nature.
  • the removed nucleotide is a nucleotide selected from a region forming a stem structure in the guide RNA and the Cas12f1 mutant protein complex.
  • the modified third region (pre-MS1 region, 72-129 region) of the engineered scaffold region may be 5'-GGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAA-3' (SEQ ID NO: 13) or a sequence having at least 70% (for example, 70%, 80% or 90%) sequence homology to SEQ ID NO: 13. It may be one in which 1 to 20 nucleotides are removed from the third region of the scaffold region found in nature.
  • the modified third region may be one in which, based on the nucleotide sequence of SEQ ID NO: 13, 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 consecutive nucleotides are removed from the 28th to 37th nucleotides and/or the 42nd to 51st nucleotides from the 5'-end of the third region of the scaffold region found in nature.
  • the modified third region may be one in which, based on the nucleotide sequence of SEQ ID NO: 13, among the 27th to 36th nucleotides and the 42nd to 51st nucleotides from the 5'-end of the third region of the scaffold region found in nature, one or more nucleotide pairs forming base pairs in the guide RNA and the Cas12f1 mutant protein complex are removed.
  • the modified third region may be one in which, based on the nucleotide sequence of SEQ ID NO: 13, among the 27th to 36th nucleotides and the 42nd to 51st nucleotides from the 5'-end of the third region of the scaffold region found in nature, one or more base-paired nucleotides and/or one or more non-base-paired nucleotides in the guide RNA and the Cas12f1 mutant protein complex are removed.
  • the modified third region is characterized by comprising 5'-GCUGCUUGCAUCAGCCUAAUGUCGAG-3' (SEQ ID NO: 93), 5'-UUCG-3', and 5'-CUCGA-3' sequences.
  • the engineered scaffold region provided herein may be one in which the fourth region and the fifth region are modified from scaffold regions found in nature. Since the fourth region and the fifth region include parts constituting the stem by hybridizing with each other in the guide RNA and the Cas12f1 mutant protein complex, the corresponding parts may be modified together to constitute an engineered scaffold region.
  • the modified fourth region is characterized by the removal of one or more nucleotides from the fourth region of the scaffold region found in nature.
  • the modified fifth region is characterized by the removal of one or more nucleotides from the fifth region of the scaffold region found in nature.
  • the modified fourth region is characterized by having a 5'-CAAA-3' or 5'-AACAAA-3' sequence in the 5'-terminal direction.
  • the modified fifth region is characterized by having a 5'-GGA-3' sequence in the 3'-end direction.
  • the modified fourth region of the engineered scaffold region may be one in which 1 to 7 nucleotides are removed from the fourth region of the scaffold region found in nature. In one embodiment, the modified fourth region of the engineered scaffold region may be one in which 1 to 28 nucleotides are removed from the fourth region of the scaffold region found in nature.
  • the modified fourth region may be one in which, based on the nucleotide sequence of SEQ ID NO: 27, at least one of the 9th to 15th nucleotides from the 5'-end is removed from the fourth region of the scaffold region found in nature.
  • the modified fourth region may be one in which, based on the nucleotide sequence of SEQ ID NO: 27, at least one of the 9th to 36th nucleotides from the 5'-end is removed from the fourth region of the scaffold region found in nature.
  • the modification in the fourth region includes 5'-CAAAUUCANNNNN-3' (SEQ ID NO: 15) of SEQ ID NO: 14 and includes some sequences (eg, at least one, two, or three) from the 3'-end.
  • the fourth region may include a region in which the 5'-NNNNN-3' site is substituted with 5'-NNNVN-3' or 5'-NVNNN-3'.
  • each N is independently A, C, G or U, and V may be A, C or G.
  • the fourth region includes 5'-CAAAUUCANNNCN-3' (SEQ ID NO: 18) and includes some sequences from the 3'-end (e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 or 18 nucleotide sequences), wherein each N is independently A, C, G or U.
  • the modified fifth region may be one in which, based on the nucleotide sequence of SEQ ID NO: 19, at least one of the 1st to 7th nucleotides from the 5'-end is removed from the fifth region of the scaffold region found in nature. In one embodiment, the modified fifth region may be one in which, based on the nucleotide sequence of SEQ ID NO: 19, at least one of the 1st to 27th nucleotides from the 5'-end is removed from the fifth region of the scaffold region found in nature.
  • the modified fourth and fifth regions may be ones in which, among the 9th to 15th nucleotides from the 5'-end based on the nucleotide sequence of SEQ ID NO: 14 and the 1st to 7th nucleotides from the 5'-end based on the nucleotide sequence of SEQ ID NO: 19, in the fourth and fifth regions of the scaffold region found in nature, one or more nucleotides that are base-paired and/or one or more nucleotides that are not base-paired in the guide RNA and Cas12f1 mutant protein complex are removed.
  • the modified fourth and fifth regions may be ones in which, among the 9th to 15th nucleotides from the 5'-end based on the nucleotide sequence of SEQ ID NO: 14 and the 1st to 7th nucleotides from the 5'-end based on the nucleotide sequence of SEQ ID NO: 19, in the fourth and fifth regions of the scaffold region found in nature, one or more pairs of nucleotides that are base-paired in the guide RNA and the Cas12f1 mutant protein complex and/or one or more pairs of mismatched nucleotides are removed.
  • the modified fourth and fifth regions may be ones in which, among the 9th to 36th nucleotides from the 5'-end based on the nucleotide sequence of SEQ ID NO: 14 and the 1st to 27th nucleotides from the 5'-end based on the nucleotide sequence of SEQ ID NO: 19, in the fourth and fifth regions of the scaffold region found in nature, one or more nucleotides that form base pairs in the guide RNA and Cas12f1 mutant protein complex and/or one or more nucleotides that do not form base pairs are removed.
  • the sequence of the modified fourth region may be or include 5'-AACAAA-3', 5'-AACAAAU-3', 5'-AACAAAUU-3', 5'-AACAAAUUC-3', 5'-AACAAAUUCA-3' (SEQ ID NO: 94), 5'-AACAAAUUCAU-3' (SEQ ID NO: 95), 5'-AACAAAUUCAUU-3' (SEQ ID NO: 96), 5'-CAAA-3', 5'-CAAAU-3', 5'-CAAAUU-3', 5'-CAAAUUC-3', 5'-CAAAUUCA-3', 5'-CAAAUUCAU-3' or 5'-CAAAUUCAUU-3'.
  • the sequence of the modified fifth region may be or include 5'-GGA-3', 5'-AGGA-3', 5'-AAGGA-3', 5'-GAAGGA-3', 5'-UGAAGGA-3', 5'-AUGAAGGA-3' or 5'-AAUGAAGGA-3'.
  • the engineered scaffold region in which the fourth region and the fifth region are modified may be a nucleic acid comprising, from the 5'-end to the 3'-end, at least one sequence selected from the group consisting of 5'-AACAAA-3', 5'-AACAAAU-3', 5'-AACAAAUU-3', 5'-AACAAAUUC-3', 5'-AACAAAUUCA-3' (SEQ ID NO: 94), 5'-AACAAAUUCAU-3' (SEQ ID NO: 95) and 5'-AACAAAUUCAUU-3' (SEQ ID NO: 96); and at least one sequence selected from the group consisting of 5'-GGA-3', 5'-AGGA-3', 5'-AAGGA-3', 5'-GAAGGA-3', 5'-UGAAGGA-3', 5'-AUGAAGGA-3' and 5'-AAUGAAGGA-3'.
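The modified fourth-region sequences listed above are successive 5' prefixes of 5'-AACAAAUUCAUU-3' and 5'-CAAAUUCAUU-3', and the modified fifth-region sequences are successive 3' suffixes of 5'-AAUGAAGGA-3'. The Python sketch below regenerates the lists from those three parent sequences; the helper names `prefixes` and `suffixes` are hypothetical, and this prefix and suffix reading is an observation about the listed sequences rather than a statement made in the specification.

```python
def prefixes(seq: str, min_len: int) -> list[str]:
    """All 5' prefixes of seq with at least min_len nucleotides."""
    return [seq[:n] for n in range(min_len, len(seq) + 1)]

def suffixes(seq: str, min_len: int) -> list[str]:
    """All 3' suffixes of seq with at least min_len nucleotides."""
    return [seq[-n:] for n in range(min_len, len(seq) + 1)]

fourth_aacaaa = prefixes("AACAAAUUCAUU", 6)  # 5'-AACAAA-3' ... SEQ ID NO: 96
fourth_caaa = prefixes("CAAAUUCAUU", 4)      # 5'-CAAA-3' ... 5'-CAAAUUCAUU-3'
fifth = suffixes("AAUGAAGGA", 3)             # 5'-GGA-3' ... 5'-AAUGAAGGA-3'

print(fourth_aacaaa)
print(fourth_caaa)
print(fifth)
```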
  • the sixth region is a region including nucleotides belonging to crRNA in the PK(R:AR-1) region.
  • the sixth region of the engineered scaffold may be the same as the sixth region of the scaffold found in nature, or may be modified within the extent that the function of the sixth region is not damaged.
  • the sixth region may be 5'-AUGCAAC-3' (SEQ ID NO: 23) or a sequence having at least 70% sequence homology to the above sequence.
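The text does not state how sequence homology is calculated. As a rough illustration only, the Python sketch below assumes the simplest possible measure, position-by-position identity between two sequences of equal length without gaps; real comparisons would normally use an alignment tool. The function name `percent_identity` and the variant sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: homology is approximated by per-position identity; this is
    an illustrative simplification, not the method defined by the document.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)

SIXTH_REGION = "AUGCAAC"  # SEQ ID NO: 23

for variant in ("AUGCAAC", "AUGCAAU", "AUGCUUU"):  # hypothetical variants
    pid = percent_identity(SIXTH_REGION, variant)
    print(variant, round(pid, 1), pid >= 70.0)
```

Under this simplified measure, a 7-nucleotide region tolerates at most two substitutions, since 5 of 7 identical positions is about 71% while 4 of 7 is about 57%.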
  • modification in the seventh region of the guide RNA includes providing a U-rich tail sequence at the 3'-end of the crRNA to improve gene editing efficiency of the gene editing system of the present invention.
  • the U-rich tail sequence is characterized in that it basically contains uridine in abundance, and includes a sequence in which one or more uridines are consecutive.
  • the engineered crRNA may further include a U-rich tail sequence as a seventh region at the 3'-end of the crRNA.
  • the U-rich tail sequence may be 5'-(UaN)dUe-3', 5'-UaVUaVUe-3' or 5'-UaVUaVUaVUe-3'.
  • N can each be A, C, G or U.
  • V may each independently be A, C or G.
  • the a may be an integer of 0 to 4.
  • d may be an integer from 0 to 3.
  • e may be an integer from 0 to 10.
  • the U-rich tail sequence may include 1 to 10 uridine repeat sequences.
  • the U-rich tail sequence may further include additional bases other than uridine, depending on the actual use environment and expression environment of the gene editing system including the engineered guide RNA, for example, the internal environment of eukaryotic cells or prokaryotic cells.
  • the U-rich tail sequence may include a sequence in which one or more of UV, UUV, UUUV, and/or UUUUV are repeated.
  • V is one of adenosine (A), cytidine (C), and guanosine (G).
  • the U-rich tail sequence is characterized in that it is linked to the 3'-end of the crRNA sequence included in the miniaturized gene editing system of the present invention.
  • the U-rich tail sequence serves to increase the cleavage efficiency of the target nucleic acid of the augment RNA and Cas12f1 mutant protein complex provided in the present invention.
  • the target nucleic acid may be single-stranded DNA, double-stranded DNA and/or RNA.
  • the term "tail sequence" used herein may mean not only the uridine (U)-rich RNA sequence itself, but also the DNA sequence encoding it, and is to be interpreted appropriately depending on the context. The present inventors experimentally determined the structure of the U-rich tail sequence and its effect, which are described in detail with specific embodiments below.
  • a U-rich tail sequence can be expressed as Ux.
  • the x may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20.
  • x may be an integer within a range bounded by any two of the values listed immediately above.
  • x can be an integer between 1 and 6.
  • x may be an integer between 1 and 20.
  • x may be an integer greater than or equal to 20.
  • the U-rich tail sequence may be expressed as (UaN)nUb.
  • N is one of adenosine (A), uridine (U), cytidine (C), and guanosine (G).
  • a is an integer between 1 and 5.
  • n is an integer greater than or equal to 0.
  • n may be an integer between 0 and 2.
  • b may be 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
  • b may be an integer within a range bounded by any two of the values listed immediately above. For example, b may be an integer between 1 and 6.
  • the U-rich tail sequence may be expressed as (UaV)nUb.
  • the V is one of adenosine (A), cytidine (C), and guanosine (G).
  • a is an integer between 1 and 4.
  • n is an integer greater than or equal to 0.
  • n may be 1 or 2.
  • b may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20.
  • b may be an integer within a range bounded by any two of the values listed immediately above.
  • b may be an integer between 1 and 6.
  • b may be an integer between 1 and 20.
  • b may be an integer greater than or equal to 20.
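The (UaV)nUb form can be checked with a regular expression. The Python sketch below, with a hypothetical function `is_uav_tail`, accepts zero or more units of 1 to 4 uridines followed by one of A, C or G, and then at least one trailing uridine. It assumes that a may differ from one unit to the next and places no upper bound on n or b, which is one possible reading of the ranges given above.

```python
import re

# (UaV)nUb with a = 1..4 per unit, n >= 0, b >= 1; V is A, C or G.
UAV_TAIL = re.compile(r"(?:U{1,4}[ACG])*U+")

def is_uav_tail(seq: str) -> bool:
    """Return True if seq (5'->3') has the (UaV)nUb form described in the text."""
    return UAV_TAIL.fullmatch(seq) is not None

for tail in ("UUUUUU", "UUUUAUUUU", "UUUUGUUUUUU", "UUUUUAUU", "UUUUA"):
    print(tail, is_uav_tail(tail))
```

The last two examples are rejected: one has five uridines before the non-uridine base, and the other lacks the trailing uridine run.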
  • the U-rich tail sequence may be a combination of a sequence represented by Ux and a sequence represented by (UaV)n.
  • the U-rich tail sequence may be expressed as (U)n1-V1-(U)n2-V2-Ux.
  • V1 and V2 are each one of adenosine (A), cytidine (C), and guanosine (G).
  • n1 and n2 may each be an integer between 1 and 4.
  • x may be an integer between 1 and 20.
  • the length of the U-rich tail sequence may be 1 nt, 2 nt, 3 nt, 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, or 20 nt. In one embodiment, the length of the U-rich tail sequence may be 20 nt or more.
  • the sequence of the U-rich tail may be 5'-U-3', 5'-UU-3', 5'-UUU-3', 5'-UUUU-3', 5'-UUUUU-3', 5'-UUUUUU-3', 5'-UUURUUU-3' (SEQ ID NO: 67), 5'-UUURUUURUUU-3' (SEQ ID NO: 68), 5'-UUUURU-3' (SEQ ID NO: 69), 5'-UUUURUU-3' (SEQ ID NO: 70), 5'-UUUURUU-3' (SEQ ID NO: 71), 5'-UUUURUUU-3' (SEQ ID NO: 72), 5'-UUUURUUUU-3' (SEQ ID NO: 73), or 5'-UUUURUUUUU-3' (SEQ ID NO: 74).
  • R can be A or G.
  • sequence of the U-rich tail is one in which R is A in any one of SEQ ID NOs: 67 to 74, and includes or consists of any one of SEQ ID NOs: 75 to 82.
  • the U-rich tail sequence may be one in which R is G in any one of SEQ ID NOs: 67 to 74, and may include or consist of any one of SEQ ID NOs: 83 to 90.
  • the sequence of the U-rich tail is 5'-UUUUAUUUU-3' (SEQ ID NO: 80), 5'-UUUUAUUUUU-3' (SEQ ID NO: 82), 5'-UUUUGUUUUUU-3' (SEQ ID NO: 90) or 5'-UUUUUU-3' (SEQ ID NO: 91).
  • the U-rich tail sequence may include or consist of any one nucleotide sequence selected from the group consisting of SEQ ID NO: 67 to SEQ ID NO: 91.
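The degenerate symbol R (A or G) in the tail sequences above can be expanded mechanically into concrete sequences. The sketch below is illustrative only: the function name `expand_r` is an assumption, and for sequences with two R positions it also enumerates mixed A/G combinations, which the text above does not necessarily list as separate SEQ ID NOs.

```python
from itertools import product


def expand_r(seq: str) -> list[str]:
    """Expand every degenerate R in an RNA string into A and G.

    Hypothetical helper: returns all combinations, in A-before-G order.
    """
    positions = [i for i, ch in enumerate(seq) if ch == "R"]
    variants = []
    for choice in product("AG", repeat=len(positions)):
        chars = list(seq)
        for pos, base in zip(positions, choice):
            chars[pos] = base  # substitute one concrete purine per R
        variants.append("".join(chars))
    return variants


if __name__ == "__main__":
    for variant in expand_r("UUUURUUUU"):
        print(variant)
```

A sequence without any R is returned unchanged as a single-element list.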
  • the engineered tracrRNA of the present invention may optionally further include an additional sequence.
  • the additional sequence may be located at the 3'-end of the engineered tracrRNA.
  • the additional sequence may be located at the 3'-end of the fourth region.
  • the additional sequence may also be located at the 5'-end of the engineered tracrRNA.
  • the additional sequence may be located at the 5'-end of the first region.
  • the additional sequence may be from 1 to 40 nucleotides.
  • the additional sequence may be any nucleotide sequence or an arbitrarily arranged nucleotide sequence.
  • the additional sequence may be the sequence 5'-AUAAAGGUGA-3' (SEQ ID NO: 97).
  • the additional sequence may be a known nucleotide sequence.
  • the additional sequence may be a hammerhead ribozyme nucleotide sequence.
  • the hammerhead ribozyme nucleotide sequence may be a 5'-CUGAUGAGUCCGUGAGGACGAAACGAGUAAGCUCGUC-3' (SEQ ID NO: 98) sequence or a 5'-CUGCUCGAAUGAGCAAAGCAGGAGUGCCUGAGUAGUC-3' (SEQ ID NO: 99) sequence.
  • the engineered tracrRNA or engineered crRNA may have at least one nucleotide chemically modified, if necessary.
  • the chemical modification may be a modification of various covalent bonds that may occur in bases and/or sugars of nucleotides.
  • the chemical modification may be methylation, halogenation, acetylation, phosphorylation, phosphorothioate linkage, locked nucleic acid (LNA), 2'-O-methyl 3'phosphorothioate (MS), or 2'-O-methyl 3'thioPACE (MSP).
  • cleavage or editing of a target nucleic acid or target gene within a cell is significantly improved compared to the case of using a guide RNA found in nature.
  • the engineered guide RNA is optimized in length while remaining highly efficient, which may bring the following effects: reduced guide RNA synthesis cost; additional space or capacity when inserted into a viral vector; normal expression of tracrRNA; increased expression of operable guide RNA; increased stability of the guide RNA; increased stability of the guide RNA and Cas12f1 mutant protein complex; highly efficient formation of the guide RNA and Cas12f1 mutant protein complex; increased cleavage efficiency of a target nucleic acid by the microgene editing system including the guide RNA and Cas12f1 mutant protein complex; and increased efficiency of editing target nucleic acids by the system. Accordingly, using the above-described engineered guide RNA with the Cas12f1 mutant protein or its homologue protein makes it possible to cut or edit genes in cells with high efficiency, overcoming the above-mentioned limitations of the prior art.
  • the engineered guide RNA since the engineered guide RNA has a shorter length compared to guide RNAs found in nature, its application potential is high in the field of gene editing technology.
  • the miniaturized gene editing system including the guide RNA and the Cas12f1 mutant protein complex has a very small size and excellent editing efficiency, and can be used in various gene editing technologies.
  • a composition for gene editing comprising the gene editing system described above is provided.
  • a gene editing composition including the vector system described below or both the gene editing system and the vector system is provided.
  • the composition for gene editing includes a small endonuclease including a Cas12f1 mutant protein or a homologue thereof, or a nucleic acid encoding the endonuclease; and a guide RNA or a nucleic acid encoding the guide RNA.
  • composition for gene editing of the present invention may further include appropriate materials required for gene editing use, in addition to each component of the miniaturized gene editing system according to the present invention.
  • so that each component of the miniaturized gene editing system provided by the present invention can be expressed in cells, one aspect of the present invention provides nucleic acids or polynucleotides encoding each component of the miniaturized gene editing system.
  • the nucleic acid or polynucleotide includes a nucleic acid sequence encoding a gene editing protein and/or guide RNA included in the miniaturized gene editing system to be expressed.
  • the sequence of the nucleic acid or polynucleotide includes not only a nucleic acid sequence encoding a wild-type gene editing protein and a wild-type guide RNA, but also a nucleic acid sequence encoding an augment RNA engineered for the purpose, a nucleic acid sequence encoding a codon-optimized gene editing protein, a nucleic acid sequence encoding an engineered gene editing protein, or a nucleic acid sequence encoding a gene editing protein with lost or reduced DNA double-strand break activity.
  • the nucleic acid or polynucleotide may include a sequence configured to express the Cas12f1 mutant protein, which is a miniaturized gene editing protein, or a homologous protein thereof.
  • the Cas12f1 mutant protein or its homologue protein may be a protein having an activity of cleaving double-stranded or single-stranded DNA.
  • the nucleic acid or polynucleotide may include a sequence configured to express a Cas12f1 mutant protein.
  • the wild-type Cas12f1 mutant protein may be a protein consisting of the amino acid sequence of SEQ ID NO: 1.
  • the homologue of the Cas12f1 mutant protein according to the present invention may be a protein in which 1 to 28 amino acids are added to the N-terminus of the Cas12f1 protein consisting of the amino acid sequence of SEQ ID NO: 5.
  • the nucleic acid or polynucleotide may include a sequence encoding a Cas12f1 mutant protein or a homologous protein thereof.
  • the nucleic acid or polynucleotide may include a human codon-optimized nucleic acid sequence encoding a Cas12f1 mutant protein or a homologous protein thereof.
  • the Cas12f1 mutant protein may be a protein consisting of any one amino acid sequence selected from the group consisting of SEQ ID NOs: 1 to 4, and the homologous protein may be a protein consisting of any one amino acid sequence selected from the group consisting of SEQ ID NOs: 141 to 232.
  • the nucleic acid or polynucleotide may include a sequence encoding a modified Cas12f1 variant protein or a Cas12f1 variant fusion protein.
  • the nucleic acid or polynucleotide may include a sequence configured to express a Cas12f1 mutant protein modified to cleave only one strand of the duplex of the target nucleic acid.
  • the modified Cas12f1 mutant protein is capable of cutting only one strand of the double strands of the target nucleic acid, and is modified to perform base editing or prime editing on the strand that is not cut.
  • the nucleic acid or polynucleotide may include a sequence encoding a Cas12f1 mutant protein modified to perform base editing or prime editing on the target nucleic acid or a gene expression control function.
  • the nucleic acid or polynucleotide may include a sequence configured to express a guide RNA (augment RNA) engineered to have optimal targeting efficiency for the Cas12f1 mutant, or a sequence configured to express one or more different engineered guide RNAs.
  • the augment RNA sequence may include a scaffold sequence, a spacer sequence, and a U-rich tail sequence.
  • the augment RNA sequence may include a modified tracrRNA sequence and/or a modified crRNA sequence, and may include a U-rich tail sequence.
  • the U-rich tail sequence may be expressed as (UaN)nUb.
  • N is one of adenosine (A), uridine (U), cytidine (C), and guanosine (G).
  • a is an integer of 1 or more and 4 or less, n is 0, 1, or 2, and b is an integer of 1 or more and 10 or less.
  • the U-rich tail sequence may be expressed as (UaV)nUb.
  • a, n, and b are integers, a may be 1 or more and 4 or less, n may be 0 or more, and b may be 1 or more and 10 or less.
  • a method may be used in which a vector containing a sequence encoding each component of the gene editing system is introduced into a target cell so that each component of the gene editing system is expressed within the target cell.
  • in the miniaturized gene editing system of the present invention for editing a target nucleic acid or target gene, it is desirable, in order to achieve excellent targeting efficiency, that the guide RNA and each component of the Cas12f1 mutant protein complex be operably linked and included in one vector. Here, an effector protein may be linked to a nucleolytic protein or a guide molecule as necessary to form a fused protein.
  • the fused form of the protein may include an orthogonal RNA-binding protein or adapter protein present in a bacteriophage coat protein.
  • the coat protein may be selected from MS2, Qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, ϕCb5, ϕCb8r, ϕCb12r, ϕCb23r, 7s, and PRR1.
  • the fused form of the protein can be delivered through one or more lipid nanoparticles.
  • the Cas12f1 mutant protein or its homologue protein, which is a miniaturized gene editing protein corresponding to the components of the miniaturized gene editing system of the present invention, can be delivered to cells as one or more guide RNAs or one or more mRNA molecules encoding them.
  • the RNA molecule may be delivered through one or more lipid nanoparticles.
  • the components of the miniaturized gene editing system of the present invention may be in the form of one or more DNA molecules.
  • one or more DNA molecules may include one or more regulatory elements operably configured to express gene editing proteins or guide molecules. If desired, one or more regulatory elements may include an inducible promoter.
  • the DNA molecules constituting the miniaturized gene editing system may be contained in one or more adeno-associated virus (AAV) vectors and delivered into cells.
  • all of the DNA molecules can be contained in a single adeno-associated virus (AAV) vector and delivered into cells.
  • the components of the vector enabling the microgene editing system of the present invention to be expressed in cells include the following.
  • the sequence of the vector must include at least one of the nucleic acid sequences encoding each component of the miniaturized gene editing system.
  • the vector system comprises a first nucleic acid construct operably linked to a nucleotide encoding a small endonuclease including a Cas12f1 variant protein or a homolog protein thereof; and a second nucleic acid construct to which a nucleotide sequence encoding a guide RNA is operably linked.
  • the first nucleic acid construct and the second nucleic acid construct may be located on the same vector or on different/separate vectors of the vector system.
  • the connection may be made directly or through a linker.
  • the nucleic acid construct may include a nucleic acid encoding an engineered guide RNA.
  • the guide RNA engineered herein may include an engineered tracrRNA and/or an engineered crRNA.
  • the engineered guide RNA may have the same configuration as the previously described embodiment of the engineered guide RNA.
  • the guide RNA may include or consist of the nucleic acid sequence of an augment RNA having a modification in MS1 (SEQ ID NO: 44), an augment RNA having modifications in MS1/MS2 (SEQ ID NO: 45), the augment RNA Cas12f1_ge3.0 having modifications in MS1/MS2/MS3 (SEQ ID NO: 46), the augment RNA Cas12f1_ge4.0 having modifications in MS2/MS3/MS4 (SEQ ID NO: 47), and/or the augment RNA Cas12f1_ge4.1 having modifications in MS2/MS3/MS4/MS5 (SEQ ID NO: 48).
  • the Cas12f1 mutant protein, a novel ultra-small gene editing protein, is a protein comprising any one amino acid sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 4, and the nucleic acid construct may include a nucleic acid encoding the protein or a codon-optimized nucleic acid encoding the protein.
  • the miniaturized gene editing protein may be a gene editing protein consisting of any one amino acid sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 4, and the codon-optimized nucleic acid encoding it may be a human codon-optimized nucleic acid consisting of any one nucleotide sequence selected from SEQ ID NO: 101 to SEQ ID NO: 104.
  • a peptide of a certain length may be added to the Cas12f1 mutant protein or a homologous protein thereof, which is a novel microgene editing protein of the nucleic acid construct.
  • the peptide may include or consist of a nucleic acid sequence encoding any one amino acid sequence selected from the group consisting of SEQ ID NO: 49 to SEQ ID NO: 51.
  • the nucleic acid construct may include one or more nuclear localization signal (NLS) or nuclear export signal (NES) sequences at the N-terminus or C-terminus.
  • the NLS sequence refers to a peptide of a certain length, or its sequence, that is attached to a protein to be transported and acts as a kind of "tag" when a substance outside the cell nucleus is transported into the nucleus by nuclear transport.
  • the NES sequence refers to a peptide of a certain length, or its sequence, that is attached to a protein to be transported and acts as a kind of "tag" when a substance inside the cell nucleus is transported out of the nucleus by nuclear transport.
  • the NLS sequence may include or consist of a nucleotide sequence of SEQ ID NO: 52 or SEQ ID NO: 53 or a nucleic acid sequence encoding any one amino acid sequence selected from SEQ ID NO: 54 to SEQ ID NO: 57.
  • the sequence of the vector includes a nucleic acid sequence encoding a guide RNA and/or a gene editing protein included in the miniaturized gene editing system to be expressed.
  • contents related to the nucleic acid sequence refer to the contents described in "III. Nucleic acids encoding components of the microgene editing system".
  • the vectors may be configured to express two or more different engineered guide RNAs.
  • the vector may be configured to express the first augment RNA and the second augment RNA.
  • the first augment RNA sequence includes a first scaffold sequence, a first spacer sequence, and a first U-rich tail sequence, and the second augment RNA sequence includes a second scaffold sequence, a second spacer sequence, and a second U-rich tail sequence.
  • the vector may include a nucleic acid sequence encoding an additional expression element that a person skilled in the art desires to express as needed.
  • the additional expression element may be a tag.
  • the additional expression element may be a resistance gene for a herbicide such as glyphosate, glufosinate ammonium, or phosphinothricin, or a resistance gene for an antibiotic such as ampicillin, kanamycin, G418, bleomycin, hygromycin, or chloramphenicol.
  • in order to be expressed in a cell, the vector must contain one or more regulatory and/or control elements.
  • the regulatory and/or control elements may include promoters, enhancers, introns, polyadenylation signals, Kozak consensus sequences, Internal Ribosome Entry Sites (IRES), splice acceptors, 2A sequences, and/or origins of replication, but are not limited thereto.
  • the origin of replication may be the f1 origin of replication, the SV40 origin of replication, the pMB1 origin of replication, the adeno origin of replication, the AAV origin of replication and/or the BBV origin of replication, but is not limited thereto.
  • a promoter sequence is operably linked to the sequence encoding each component to activate the RNA transcription factor in the cell.
  • the promoter sequence can be designed differently according to the corresponding RNA transcription factor or expression environment, and is not limited as long as it can properly express the components of the hypercompact TaRGET system in cells.
  • the promoter sequence may be a promoter that promotes transcription by RNA polymerase I (Pol I), RNA polymerase II (Pol II), or RNA polymerase III (Pol III).
  • the promoter may be the SV40 early promoter, the mouse mammary tumor virus long terminal repeat (LTR) promoter, the adenovirus major late promoter (Ad MLP), the herpes simplex virus (HSV) promoter, a cytomegalovirus (CMV) promoter such as the CMV immediate early promoter region (CMVIE), the Rous sarcoma virus (RSV) promoter, the human U6 small nuclear promoter (U6), an enhanced U6 promoter, the 7SK promoter (7SK), or the human H1 promoter (H1).
  • when the vector sequence includes a promoter sequence, transcription of a sequence operably linked to the promoter is induced by an RNA transcription factor, and a termination signal that induces termination of transcription by the RNA transcription factor may be included.
  • the termination signal may vary depending on the type of promoter sequence. Specifically, when the promoter is a U6 or H1 promoter, the promoter recognizes a TTTTT(T5) or TTTTTT(T6) sequence, which is a thymidine (T) sequence, as a termination signal.
  • the sequence of the engineered guide RNA provided in the present invention includes a U-rich tail sequence at its 3'-end. Accordingly, the sequence encoding the engineered guide RNA includes a T-rich sequence corresponding to the U-rich tail sequence at its 3'-end.
  • some promoter sequences recognize a contiguous thymidine (T) sequence, for example, a sequence in which 5 or more thymidines (T) are consecutively linked, as a termination signal. In some cases, the T-rich sequence can be recognized as a termination signal.
  • when the vector sequence provided herein includes a sequence encoding an engineered guide RNA, the sequence encoding the U-rich tail sequence included in the augment RNA sequence may be used as a termination signal.
  • when the vector sequence includes a U6 or H1 promoter sequence and a sequence encoding an engineered guide RNA operably linked thereto, a portion of the sequence encoding the U-rich tail sequence included in the augment RNA sequence can be recognized as a termination signal.
  • the U-rich tail sequence includes a sequence in which 5 or more uridines (U) are consecutively linked.
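The relationship between a U-rich tail and a T5/T6 termination signal described above can be checked on the DNA that encodes the tail. The sketch below is a simplified illustration: the function names `tail_to_dna` and `has_t_run` are assumptions, and a run of consecutive T is treated as the only criterion for a termination signal, which is a simplification of how U6 or H1 driven transcription actually terminates.

```python
import re


def tail_to_dna(tail_rna: str) -> str:
    """Return the T-rich DNA sequence that encodes a U-rich RNA tail."""
    return tail_rna.upper().replace("U", "T")


def has_t_run(dna: str, min_len: int = 5) -> bool:
    """Report whether the DNA contains at least min_len consecutive T.

    Hypothetical check: min_len=5 models a T5 signal, min_len=6 a T6 signal.
    """
    return re.search("T{%d,}" % min_len, dna.upper()) is not None


if __name__ == "__main__":
    for tail in ("UUUUAUUUU", "UUUUGUUUUUU", "UUUUUU"):
        dna = tail_to_dna(tail)
        print(tail, dna, has_t_run(dna, 5), has_t_run(dna, 6))
```

Under this simplified criterion, a tail such as 5'-UUUUAUUUU-3' does not by itself encode a run of five T, whereas tails ending in five or more U do.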
  • the vector may be configured to express additional components such as NLS, NES and/or tag proteins, if necessary.
  • the additional component may be expressed independently of the Cas12f1 variant protein, a homologue of the Cas12f1 variant protein, and/or an engineered guide RNA (gRNA) for the Cas12f1 variant.
  • the additional component may be expressed linked, directly or through a linker, to the Cas12f1 variant protein, a homologue of the Cas12f1 variant protein, and/or an engineered guide RNA (gRNA) for the Cas12f1 variant.
  • the nucleic acid construct encoding the components of the miniaturized gene editing system comprises at least one nuclear localization signal (NLS) sequence at the N-terminus or C-terminus of the nucleic acid.
  • the NLS sequence may include or consist of a nucleotide sequence encoding any one amino acid sequence selected from SEQ ID NO: 54 to SEQ ID NO: 57.
  • the additional component may be a component that is generally expressed when a microgene editing system is to be expressed, and known technologies widely recognized by those skilled in the art may be referred to.
  • the present invention provides a nucleic acid encoding the engineered guide RNA (gRNA) according to the present invention and/or a nucleic acid included in a vector or the like to express the components of the miniaturized gene editing system.
  • the nucleic acid may be DNA or RNA existing in nature, or may be a modified nucleic acid in which a part or all of the nucleic acid is chemically modified.
  • one or more nucleotides of the nucleic acid may be chemically modified.
  • the chemical modification may include all modifications of nucleic acids known to those skilled in the art.
  • a vector according to the present invention may be a viral vector. More specifically, the viral vector may be one or more selected from the group consisting of retrovirus, lentivirus, adenovirus, adeno-associated virus, vaccinia virus, poxvirus, and herpes simplex virus. In one embodiment, the viral vector may be an adeno-associated viral vector.
  • the vector according to the present invention may be a non-viral vector. More specifically, the non-viral vector may be one or more selected from the group consisting of plasmid, phage, naked DNA, DNA complex, and mRNA.
  • the plasmid may be selected from the group consisting of the pcDNA series, pSC101, pGV1106, pACYC177, ColE1, pKT230, pME290, pBR322, pUC8/9, pUC6, pBD9, pHC79, pIJ61, pLAFR1, pHV14, the pGEX series, the pET series, and pUC19.
  • the phage may be M13, and the vector may be a PCR amplicon.
  • the vector according to the present invention may be designed in the form of a linear or circular vector.
  • when a linear vector is used, RNA transcription is terminated at the 3'-end of the vector even if the sequence of the linear vector does not separately include a termination signal.
  • RNA transcription is not terminated unless the sequence of the circular vector separately includes a termination signal. Therefore, when a circular vector is used as the vector, a termination signal corresponding to a transcription factor related to each promoter sequence must be included in order to express the intended target.
  • the present invention provides a method for editing or targeting a target nucleic acid or target gene in a target cell or in vitro using an engineered guide RNA (augment RNA) having optimal target editing activity for a Cas12f1 mutant protein or a homolog thereof.
  • the gene editing method may be a method of cleaving a nucleic acid at a target site.
  • the target gene or target nucleic acid includes a target sequence, and the target nucleic acid may be single-stranded DNA, double-stranded DNA and/or RNA.
  • a gene editing method comprising the step of contacting the gene editing system of the present invention, the vector system of the present invention, or the gene editing composition of the present invention with a target gene or a target nucleic acid.
  • the gene editing method includes delivering a guide RNA (augment RNA) engineered for a Cas12f1 mutant, a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding each of them, into a target cell containing a target nucleic acid or target gene. As a result, the complex of the engineered guide RNA and the Cas12f1 mutant protein is injected into the target cell, or formation of the complex is induced in the target cell, and the target gene is edited by the complex.
  • Gene editing includes cleavage of double-stranded DNA, single-stranded DNA, or DNA and RNA hybrid double-stranded nucleic acids having a target sequence in a target gene or target nucleic acid.
  • the Cas12f1 mutant protein may be a wild-type Cas12f1 mutant protein, an engineered Cas12f1 mutant protein, a modified Cas12f1 mutant protein, or a Cas12f1 mutant homologue protein.
  • the gene editing method may include delivering, into a target cell, a nucleic acid encoding a Cas12f1 mutant protein or a homologous protein thereof, and an engineered guide RNA (augment RNA) or a nucleic acid encoding the same.
  • the engineered guide RNA (augment RNA) sequence includes the altered scaffold region sequence, spacer sequence, and U-rich tail sequence.
  • the sequence of the altered scaffold region may have the same characteristics and structures as described in the above-described "3. 3. Engineered guide RNA for Cas12f1 mutant protein" and "(2) Scaffold region" sections.
  • the engineered tracrRNA may include/consist of any one nucleotide sequence selected from SEQ ID NO: 29 to SEQ ID NO: 32, and the engineered crRNA may include/consist of the nucleotide sequence of SEQ ID NO: 33.
  • the engineered tracrRNA may include/consist of any one nucleotide sequence selected from SEQ ID NO: 39 to SEQ ID NO: 42, and the engineered crRNA may include/consist of the nucleotide sequence of SEQ ID NO: 43.
  • the engineered guide RNA may include/consist of any one nucleotide sequence selected from SEQ ID NO: 44 to SEQ ID NO: 48.
  • the spacer sequence may complementarily bind to a target gene or target nucleic acid contained in the target cell, and the U-rich tail sequence may be expressed as (UaV)nUb.
  • a, n and b are integers, a is 1 or more and 4 or less, n is 0 or more, and b is 1 or more and 10 or less.
  • the U-rich tail sequence may be expressed as (UaN)nUb.
  • N is one of adenosine (A), uridine (U), cytidine (C), and guanosine (G).
  • a is an integer of 1 or more and 4 or less
  • n is 0, 1, or 2
  • b is an integer of 1 or more and 10 or less.
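The constraints on (UaN)nUb listed above (a from 1 to 4, n of 0, 1, or 2, and b from 1 to 10) can be expressed as a single regular expression. The following is a minimal sketch; the name `is_valid_u_rich_tail` is an assumption, and the pattern lets each repeat use its own a and N, which is one possible reading of the formula.

```python
import re

# Up to two repeats of (1-4 U followed by any of A/U/C/G), then 1-10 U.
_TAIL_PATTERN = re.compile(r"(?:U{1,4}[AUCG]){0,2}U{1,10}")


def is_valid_u_rich_tail(seq: str) -> bool:
    """Check whether an RNA string fits (U^a N)^n U^b with the stated ranges."""
    return _TAIL_PATTERN.fullmatch(seq.upper()) is not None


if __name__ == "__main__":
    for candidate in ("UUUUAUUUU", "UUUUG", "UUAUUGUUCUU"):
        print(candidate, is_valid_u_rich_tail(candidate))
```

Because N may itself be U, any run of 1 to 20 uridines also satisfies the pattern; the tail must always end in at least one U.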
  • the sequence of the U-rich tail is 5'-U-3', 5'-UU-3', 5'-UUU-3', 5'-UUUU-3', 5'-UUUUU-3', 5'-UUUUU-3', 5'-UUURUUU-3' (SEQ ID NO: 67), 5'-UUURUUUU-3' (SEQ ID NO: 68), 5'-UUUURU-3' (SEQ ID NO: 69), 5'-UUUURUU-3' (SEQ ID NO: 70), 5'-UUUURUU-3' (SEQ ID NO: 71), 5'-UUUURUUU-3' (SEQ ID NO: 72), 5'-UUUURUUUU-3' (SEQ ID NO:
  • R may be A or G.
  • the sequence of the U-rich tail may be any one of SEQ ID NO: 75 to SEQ ID NO: 90.
  • the sequence of the U-rich tail may be 5'-UUUUAUUUU-3' (SEQ ID NO: 80) or 5'-UUUUGUUUU-3' (SEQ ID NO: 88).
  • An object to be gene-edited by the hypercompact TaRGET system of the present invention may be nucleic acid in vitro or nucleic acid in a prokaryotic or eukaryotic cell. More specifically, the eukaryotic cells may be yeast, insect cells, plant cells, animal cells, and/or human cells, but are not limited thereto.
  • the target nucleic acid, target gene, or target sequence may be determined in consideration of the purpose of gene editing, the environment of the subject to be edited, the PAM sequence recognized by the Cas12f1 mutant protein or its homologue protein, and/or other variables.
  • the method can be performed without particular limitation using known techniques.
  • a spacer sequence in the guide RNA corresponding thereto is designed.
  • the spacer sequence is designed as a sequence capable of binding to the target sequence.
  • the spacer sequence is designed as a sequence capable of complementary binding to the target nucleic acid or target gene.
  • the spacer sequence may be designed as a sequence complementary to the target sequence included in the target strand sequence of the target nucleic acid or target gene.
  • the spacer sequence may be designed as an RNA sequence corresponding to the DNA sequence of the protospacer included in the non-target strand sequence of the target nucleic acid.
  • the spacer sequence may have the same nucleotide sequence as the protospacer sequence, and may be designed as a sequence in which each of thymidine (T) contained in the nucleotide sequence is substituted with uridine (U).
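The spacer design rule just described (same sequence as the protospacer on the non-target strand, with each T replaced by U) can be sketched as follows. The function name `spacer_from_protospacer` and the input validation are illustrative assumptions.

```python
def spacer_from_protospacer(protospacer_dna: str) -> str:
    """Return the RNA spacer matching a DNA protospacer (5' to 3').

    Hypothetical helper: the protospacer is read from the non-target strand,
    and every thymidine (T) is replaced with uridine (U).
    """
    seq = protospacer_dna.upper()
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected DNA characters: {sorted(invalid)}")
    return seq.replace("T", "U")


if __name__ == "__main__":
    print(spacer_from_protospacer("acgtTTGA"))
```

The resulting spacer is complementary to the target sequence on the target strand, because the two DNA strands are complementary to each other.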
  • the spacer sequence may be a complementary sequence having 60% or more sequence identity with the target sequence.
  • the spacer sequence may be a complementary sequence having 60% to 90% sequence homology with the target sequence. More preferably, the spacer sequence may be a complementary sequence having 90% to 100% sequence homology with the target sequence.
  • the spacer sequence according to the present invention may be a complementary sequence having 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mismatches with the target sequence. In one embodiment, the spacer sequence may have 1 to 5 mismatches with the target sequence. In addition, the spacer sequence may have 6 to 10 mismatches with the target sequence.
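Mismatch counts and percent identity of the kind discussed above can be computed by comparing the spacer, position by position, with the RNA form of the protospacer. This is a minimal sketch under stated assumptions: the function names are hypothetical, both sequences must have equal length, and comparison against the non-target-strand protospacer is used as a stand-in for complementarity to the target strand.

```python
def count_mismatches(spacer_rna: str, protospacer_dna: str) -> int:
    """Count positions where the spacer differs from the protospacer (T read as U)."""
    expected = protospacer_dna.upper().replace("T", "U")
    spacer = spacer_rna.upper()
    if len(spacer) != len(expected):
        raise ValueError("spacer and protospacer must have the same length")
    return sum(1 for s, e in zip(spacer, expected) if s != e)


def percent_identity(spacer_rna: str, protospacer_dna: str) -> float:
    """Return the percentage of matching positions (gap-free comparison)."""
    mismatches = count_mismatches(spacer_rna, protospacer_dna)
    length = len(spacer_rna)
    return 100.0 * (length - mismatches) / length


if __name__ == "__main__":
    print(count_mismatches("ACGUAUGA", "ACGTTTGA"))
    print(percent_identity("ACGUAUGA", "ACGTTTGA"))
```

A gap-free comparison keeps the example short; alignments with insertions or deletions would need a different approach.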
  • the base correction and gene editing method provided herein takes advantage of the fact that the hypercompact TaRGET system according to the present invention has an activity of recognizing and editing a target sequence for a target nucleic acid or target gene.
  • the gene editing method provided herein is based on the premise that a microgene editing system or vector containing an engineered guide RNA (gRNA) for a Cas12f1 variant in a target cell contacts a target sequence site of a target nucleic acid or target gene.
  • the gene editing method of the present invention includes effective delivery of the microgene editing system into a target cell.
  • the nucleic acid construct of the microgene and/or each component of the microgene editing system including the microgene is brought into contact with or induces contact with a target sequence site of a target nucleic acid or target gene in a target cell.
  • the gene editing method may include delivering, into a target cell, a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same, and an engineered guide RNA (augment RNA) or a nucleic acid encoding the same.
  • the gene editing method may include delivering, into a target cell, a Cas12f1 mutant protein or a homologous protein thereof, and the engineered guide RNA (augment RNA).
  • the gene editing method may include delivering, into a target cell, a Cas12f1 mutant protein or a homologous protein thereof, and a nucleic acid encoding an engineered guide RNA (augment RNA).
  • the gene editing method may include delivering, into a target cell, a nucleic acid encoding a Cas12f1 mutant protein or a homologue thereof, and the engineered guide RNA (augment RNA).
  • the gene editing method may include delivering, into a target cell, a nucleic acid encoding a Cas12f1 mutant protein or a homologue thereof, and a nucleic acid encoding an engineered guide RNA (augment RNA).
  • the induction is not particularly limited as long as it is a method in which a microgene editing system or a microgenetic editing nucleic acid construct containing an engineered guide RNA (augment RNA) for the Cas12f1 variant is brought into contact with a target nucleic acid in a cell.
  • the delivery form of the ultracompact gene editing nucleic acid construct, and of the hypercompact TaRGET system including the same, for the method of the present invention is not particularly limited as long as it can deliver into cells, in an appropriate form, a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same, and the engineered guide RNA (augment RNA) or a nucleic acid encoding it.
  • a ribonucleoprotein particle in which the guide RNA engineered for the Cas12f1 mutant and the Cas12f1 mutant protein are bound can be used.
  • the gene editing method may include injecting a guide RNA engineered for the Cas12f1 mutant and a complex of the guide RNA and the Cas12f1 mutant protein to which the Cas12f1 mutant protein binds into a target cell.
  • a Cas12f1 variant protein or a homologous protein thereof, or a nucleic acid encoding the same may be used.
  • a non-viral vector containing an engineered guide RNA or a nucleic acid encoding the same may be used.
  • the gene editing method may include injecting a non-viral vector comprising a nucleic acid sequence encoding a Cas12f1 mutant protein and a nucleic acid sequence encoding a guide RNA engineered for the Cas12f1 mutant into a target cell.
  • the non-viral vector may be a plasmid, naked DNA, DNA complex, mRNA or linear PCR amplicon, but is not limited thereto.
  • the gene editing method may include injecting into a target cell a first non-viral vector comprising a nucleic acid sequence encoding a Cas12f1 variant protein and a second non-viral vector comprising a nucleic acid sequence encoding a guide RNA engineered for the Cas12f1 variant.
  • the first non-viral vector and the second non-viral vector may each be one selected from the group consisting of plasmid, naked DNA, DNA complex, mRNA, and linear PCR amplicon, but are not limited thereto.
  • a virus comprising a nucleic acid sequence encoding a Cas12f1 variant protein and a nucleic acid sequence encoding a guide RNA engineered for the Cas12f1 variant can be used.
  • the gene editing method may include injecting one virus comprising a nucleic acid sequence encoding a Cas12f1 mutant protein and a nucleic acid sequence encoding a guide RNA engineered for the Cas12f1 mutant into a target cell.
  • the virus may be one selected from the group consisting of retrovirus, lentivirus, adenovirus, adeno-associated virus, vaccinia virus, poxvirus, and herpes simplex virus, but is not limited thereto.
  • the virus may be an adeno-associated virus.
  • the gene editing method may include injecting into target cells a first virus comprising a nucleic acid sequence encoding a Cas12f1 mutant protein or a homologous protein thereof and a second virus comprising a nucleic acid sequence encoding a guide RNA engineered for the Cas12f1 variant.
  • the first viral vector and the second viral vector may each be one selected from the group consisting of retroviruses, lentiviruses, adenoviruses, adeno-associated viruses, vaccinia viruses, poxviruses, and herpes simplex viruses, but are not limited thereto.
  • in the delivery form, a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same; and a guide RNA engineered for the Cas12f1 variant or a nucleic acid encoding the guide RNA may be delivered using nanoparticles.
  • the delivery method may be to deliver, using nanoparticles, a Cas12f1 variant protein or a nucleic acid encoding the same, a first engineered guide RNA for the Cas12f1 variant or a nucleic acid encoding the same, and/or a second engineered guide RNA for the Cas12f1 variant or a nucleic acid encoding the same.
  • the delivery method may be a cationic liposome method, lithium acetate-DMSO, lipid-mediated transfection, calcium phosphate precipitation, lipofection, electroporation, gene gun, sonoporation, magnetofection, transient cell compression or squeezing, PEI (polyethyleneimine)-mediated transfection, DEAE-dextran-mediated transfection, and/or nanoparticle-mediated nucleic acid delivery, but is not limited thereto.
  • delivery of a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same, may be performed by combining the above delivery forms.
  • the gene editing method may include delivering a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same, in a first delivery form, and delivering a guide RNA engineered for the Cas12f1 variant, or a nucleic acid encoding the same, in a second delivery form.
  • the first delivery form and the second delivery form may each be any one of the above-described delivery forms.
  • the delivery form is not particularly limited as long as it can deliver the miniaturized gene editing nucleic acid construct according to the present invention in a single vector, or the components of the miniaturized gene editing system including the same, to the environment where the target nucleic acid or target gene to be edited exists.
  • the gene editing method may include delivering into a cell a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same; and a guide RNA engineered for the Cas12f1 variant or a nucleic acid encoding the same, wherein these components may be delivered into the cell simultaneously or sequentially with a time difference.
  • the gene editing method may include simultaneously delivering a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same; and a guide RNA engineered for the Cas12f1 variant or a nucleic acid encoding the same.
  • the gene editing method may include delivering a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same, into a cell, and then, with a time difference, delivering a guide RNA engineered for the Cas12f1 variant or a nucleic acid encoding the same into the cell.
  • the gene editing method may include delivering a guide RNA engineered for a Cas12f1 mutant, or a nucleic acid encoding the same, into a cell, and then, with a time difference, delivering the Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same, into the cell.
  • the gene editing method provided by the present invention may include delivering into a target cell a Cas12f1 mutant protein or a homologous protein thereof, or a nucleic acid encoding the same; and two or more guide RNAs engineered for the Cas12f1 variant, or nucleic acids encoding the same.
  • two or more guide RNAs and Cas12f1 mutant protein complexes targeting different sequences may be injected into a target cell, or the two or more guide RNAs and Cas12f1 mutant protein complexes may be formed in a target cell.
  • two or more target genes or target nucleic acids contained in a cell can be edited.
  • the gene editing method includes delivering a Cas12f1 variant protein or a nucleic acid encoding the same, a first engineered guide RNA for the Cas12f1 variant or a nucleic acid encoding the same, and a second engineered guide RNA for the Cas12f1 variant or a nucleic acid encoding the same, into a target cell containing the target gene or target nucleic acid.
  • each of the components may be delivered into cells using one or more of the above-described delivery forms and delivery methods.
  • two or more components may be delivered into cells simultaneously or sequentially with a time difference.
  • the hypercompact TaRGET system includes an engineered guide RNA that maximizes the target activity or gene editing activity of the Cas12f1 mutant protein or its homologue protein.
  • the guide RNA may be an augment RNA with modifications in MS1 (SEQ ID NO: 44), an augment RNA with modifications in MS1/MS2 (SEQ ID NO: 45), the augment RNA Cas12f1_ge3.0 with modifications in MS1/MS2/MS3 (SEQ ID NO: 46), the augment RNA Cas12f1_ge4.0 with modifications in MS2/MS3/MS4 (SEQ ID NO: 47), and/or the augment RNA Cas12f1_ge4.1 with modifications in MS2/MS3/MS4/MS5 (SEQ ID NO: 48).
  • This is a simple example and is not limited thereto.
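The augment RNA versions listed above can be kept in a small lookup structure, which makes it easy to ask which versions carry a given modification site. The sketch below is only illustrative: the class name `AugmentRNA`, the helper `versions_with`, and the labels used for the two unnamed versions (SEQ ID NO: 44 and 45) are assumptions introduced here; the modification sites and SEQ ID NOs are the ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AugmentRNA:
    label: str                 # name from the text, or a hypothetical label
    modification_sites: tuple  # modification sites (MS) carried by this version
    seq_id_no: int             # SEQ ID NO quoted in the text


# The first two labels are hypothetical; the text gives these versions no name.
AUGMENT_RNAS = (
    AugmentRNA("MS1", ("MS1",), 44),
    AugmentRNA("MS1/MS2", ("MS1", "MS2"), 45),
    AugmentRNA("Cas12f1_ge3.0", ("MS1", "MS2", "MS3"), 46),
    AugmentRNA("Cas12f1_ge4.0", ("MS2", "MS3", "MS4"), 47),
    AugmentRNA("Cas12f1_ge4.1", ("MS2", "MS3", "MS4", "MS5"), 48),
)


def versions_with(site):
    """Return the labels of all listed versions that carry the given site."""
    return [rna.label for rna in AUGMENT_RNAS if site in rna.modification_sites]


if __name__ == "__main__":
    print(versions_with("MS4"))
```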
  • composition for gene editing of the present invention may further include appropriate materials required for gene editing use in addition to each component of the hypercompact TaRGET system according to the present invention.
  • the present invention also provides a method for editing nucleic acids comprising the step of contacting the miniaturized gene editing system according to the present invention or the composition comprising the same with a target sequence.
  • the nucleic acid editing may be nucleic acid cleavage.
  • the gene editing method may include delivering, into a eukaryotic cell, the minigene editing system in the form of a ribonucleoprotein particle in which the guide RNA engineered for the Cas12f1 variant is bound to the miniaturized gene editing protein according to the present invention, that is, the Cas12f1 mutant protein or its homolog protein. In this case, the delivery may be performed using electroporation or lipofection.
  • the gene editing method preferably uses one adeno-associated virus (AAV) vector comprising both a nucleic acid sequence encoding a guide RNA engineered for the Cas12f1 mutant and a nucleic acid sequence encoding a Cas12f1 mutant protein or a homologous protein thereof, the AAV vector being used for delivery into cells containing the target nucleic acid or target gene.
  • Example 1 Manufacturing of components of a hypercompact gene editing system (Hypercompact TaRGET system)
  • Example 1.1 Gene editing proteins and human codon-optimized nucleic acids encoding them
  • the present invention provides, as a protein constituting the Hypercompact TaRGET (Tiny nuclease-augment RNA-based Genome Editing Technology) system, which is a miniaturized gene editing system, a Cas12f1 variant protein or a homolog protein thereof.
  • the Cas12f1 mutant protein includes the amino acid sequence of SEQ ID NO: 1, or a protein consisting of an amino acid sequence in which 1 to 28 amino acids are removed or substituted from the N-terminus of that sequence (provided that the amino acid sequence of SEQ ID NO: 5 is excluded). As representative examples of such engineered Cas12f1 mutant proteins, the Cas12f1 variant v1 protein (SEQ ID NO: 2), comprising the N-terminal 26 aa of CasX at the N-terminus of Cas12f1; the Cas12f1 variant v2 protein (SEQ ID NO: 3), comprising a 28-aa random sequence; and the Cas12f1 variant v3 protein (SEQ ID NO: 4), comprising a 26-aa random sequence, are provided.
  • the Cas12f1 mutant protein includes a protein consisting of an amino acid sequence in which 1 to 600 amino acids are added to the N-terminus or C-terminus of the Cas12f1 variant comprising or consisting of the amino acid sequence of SEQ ID NO: 1.
  • 1 to 600 amino acids added to the N-terminus or C-terminus may include or consist of the amino acid sequence of SEQ ID NO: 233 or SEQ ID NO: 234, and one or more amino acids may be added between the additional sequence and the Cas12f1 mutant protein.
  • An NLS sequence may further be included.
  • a homolog of the Cas12f1 mutant protein may be a protein comprising or consisting of any one amino acid sequence selected from SEQ ID NO: 141 to SEQ ID NO: 232.
  • in the present invention, human codon-optimized genes encoding Cas12f1 mutant proteins were also obtained, using a codon optimization program, in order to construct a hypercompact TaRGET system expressed in human cells and a hypercompact gene editing nucleic acid construct for nucleic acid cleavage.
  • Table 1 shows the human codon-optimized Cas12f1 variant base sequence encoding the Cas12f1 variant protein prepared above and the amino acid sequence of the Cas12f1 variant protein.
  • Table 2 shows the nucleotide sequences of human codon-optimized nucleic acids encoding Cas12f1 variant v1 to v3 proteins, respectively. These were used as nucleic acids encoding the gene editing proteins constituting the miniaturized gene editing system according to the present invention.
  • the miniaturized gene editing nucleic acid construct was prepared by the following method.
  • the nucleic acid construct used in the present invention includes the gene sequence of human codon-optimized Cas12f1 variants (including engineered variants). PCR amplification was performed using the gene sequence as a template, and the amplified sequence was cloned by the Gibson assembly method into a vector having a promoter capable of expression in a eukaryotic cell system and a poly(A) signal sequence. After cloning, the sequence of the obtained recombinant plasmid vector was confirmed by Sanger sequencing.
  • the gene prepared in Example 1.1 was expressed, and the protein was purified.
  • the nucleic acid construct was cloned into the pMAL-c2 plasmid vector and transformed into BL21(DE3) E. coli cells.
  • the transformed E. coli colonies were grown in LB broth at 37°C until an optical density of 0.7 was reached.
  • the transformed E. coli cells were cultured overnight at 18°C in the presence of 0.1 mM isopropylthio-β-D-galactoside.
  • the cultured cells were collected by centrifugation at 3,500 g for 30 minutes, and the collected cells were resuspended in lysis buffer containing 20 mM Tris-HCl (pH 7.6), 500 mM NaCl, 5 mM β-mercaptoethanol, and 5% glycerol.
  • after the cells were resuspended in lysis buffer, they were disrupted by sonication.
  • the sample containing the disrupted cells was centrifuged at 15,000 g for 30 minutes, and the resulting supernatant was filtered through a 0.45 µm syringe filter (Millipore). The filtered supernatant was loaded onto a Ni2+-affinity column using an FPLC purification system (ÄKTA Purifier, GE Healthcare). Bound fractions were eluted with a gradient of 80-400 mM imidazole in 20 mM Tris-HCl (pH 7.5).
  • the eluted protein was cleaved by treatment with TEV protease for 16 hours.
  • the cleaved protein was purified on a Heparin column with a 0.15-1.6 M NaCl linear gradient.
  • the recombinant Cas12f1 variant protein purified on a heparin column was dialyzed against a solution of 20 mM Tris pH 7.6, 150 mM NaCl, 5 mM β-mercaptoethanol, and 5% glycerol.
  • the dialyzed protein was purified by passing it through an MBP column and then re-purified on a monoS column (GE Healthcare) or EnrichS with a linear gradient of 0.5-1.2 M NaCl.
  • the re-purified proteins were collected and dialyzed against a solution of 20 mM Tris pH 7.6, 150 mM NaCl, 5 mM β-mercaptoethanol, and 5% glycerol to obtain the purified microgene editing protein (small endonuclease) used in the present invention. The concentration of the miniaturized gene editing protein was quantified by the Bradford method using bovine serum albumin (BSA) as a standard, and was checked electrophoretically on a Coomassie blue-stained SDS-PAGE gel.
  • Example 1.3 Engineered guide RNAs for Cas12f1 variants or homologs thereof
  • an engineered guide RNA (augment RNA) having high efficiency targeting activity and gene editing activity for Cas12f1 mutant protein or its homologue was prepared.
  • the endonuclease activity of the Cas12f1 mutant protein or its homologue protein constituting the system is important, but it was also expected that the activity would differ greatly depending on the degree to which the gene editing protein binds to the target nucleic acid or target gene site. Accordingly, an engineered augment RNA for the Cas12f1 mutant was produced as follows.
  • the augment RNA for the Cas12f1 mutant protein or its homologue protein is obtained from a guide RNA found in nature by adding a new structure or by deleting or modifying some of its structures, and may include a new U-rich tail at the 3'-end.
  • the augment RNA is characterized by including an engineered tracrRNA sequence including modified scaffold regions 1 to 4, an engineered crRNA sequence including modified scaffold regions 5 to 6, and/or a U-rich tail sequence as the seventh region (see FIGS. 2a and 2b).
  • the fourth region and the fifth region are sites complementary to each other and include modification site 1 (MS1) and modification site 4 (MS4), and the seventh region, the U-rich tail sequence, corresponds to modification site 2 (MS2).
  • the first region is modification site 3 (MS3), and the second region includes modification site 5 (MS5).
  • FIG. 1 shows in detail the "modification sites (MS)" MS1 to MS5, which are the sites modified in the naturally occurring guide RNA in order to prepare, from the wild-type guide RNA for the Cas12f1 mutant, the high-efficiency augment RNA for the Cas12f1 mutant protein and its homologues provided in the present invention.
  • FIGS. 2a and 2b show exemplary structures of various modified regions for the production of the engineered single guide RNA (augment RNA) of the present invention.
  • FIG. 2a illustrates the modification sites of a canonical sgRNA for a Cas12f1 mutant.
  • FIG. 2b illustrates the modification sites of a mature form sgRNA for an engineered Cas12f1 mutant.
  • engineered guide RNAs for Cas12f1 variants (Cas12f1 variant augment RNAs) were produced based on "3. Engineered guide RNA for ultra-small gene editing system" and "7. Modification to make a single guide RNA" in the above-mentioned "II. High-efficiency miniaturized gene editing system/composition" section.
  • these augment RNAs are only representative examples of the engineered guide RNAs used in the present invention, and the Cas12f1 mutant augment RNAs of the present invention are not limited to the exemplified sequences.
  • the 5'-NNNNNNNNNNNNNNNNNN-3' part is a spacer sequence, and may consist of 15 or more and 50 or less base sequences.
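The spacer constraint stated above (15 or more and 50 or less bases) can be checked mechanically before a guide RNA is designed. The following is a minimal sketch; the function name `is_valid_spacer` and the restriction to the unmodified RNA alphabet A, C, G, U are assumptions made for illustration.

```python
RNA_BASES = set("ACGU")
MIN_SPACER_LEN = 15  # lower bound stated in the text
MAX_SPACER_LEN = 50  # upper bound stated in the text


def is_valid_spacer(spacer):
    """True if the spacer is 15-50 nt long and uses only A, C, G, U."""
    seq = spacer.upper()
    return MIN_SPACER_LEN <= len(seq) <= MAX_SPACER_LEN and set(seq) <= RNA_BASES


if __name__ == "__main__":
    print(is_valid_spacer("CACACACACAGUGGGCUACC"))  # 20-nt RNA spacer
    print(is_valid_spacer("ACGU"))                  # too short
```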
  • sgRNAs Sequence (5' to 3') SEQ ID NO: Canonical sgRNAs CUUCACUGAUAAAGUGGAGAACCGCUUCACCAAAAGCUGUCCCuuagGGGAUUAGAACUUGAGUGAAGGUGGGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAACAAAUUCAUUUUUCCUCUCCAAUUCUGCACAAgaaaGUUGCAGAACCCGAAUAGacgaaUGAAGGAAUGNNNNNNCANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  • sgRNA in which MS1 was removed from canonical sgRNA was prepared, and it was named "mature form sgRNA".
  • the mature form sgRNA may consist of the nucleotide sequence of SEQ ID NO: 124, wherein one or more pairs of complementary nucleotide sequences may be additionally removed.
  • Table 5 shows exemplary Cas12f1 mutant mature form sgRNAs and their specific base sequences.
  • sgRNAs Sequence (5' to 3') SEQ ID NO: Mature form sgRNAs CUUCACUGAUAAAGUGGAGAACCGCUUCACCAAAAGCUGUCCCuuagGGGAUUAGAACUUGAGUGAAGGUGGGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAACAAAUUCA UUU gaaa GAA UGAAGGAAUGCAACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 124 MS3-1 GAUAAAGUGGAGAACCGCUUCACCAAAAGCUGUCCCuuagGGGAUUAGAACUUGAGGUGAAGGUGGGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAACAAAUUCA UUU gaaa GAA UGAAGGAAUGCAACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 124
  • the Cas12f1 mutant augment RNA exemplified above was prepared by the following method. First, to prepare the engineered guide RNA, a previously designed guide RNA was chemically synthesized, and then a PCR amplicon containing the synthesized guide RNA sequence and the T7 promoter sequence was prepared. U-rich tail ligation to the 3'-end of the engineered Cas12f1 mutant guide RNA was performed using Pfu PCR Master Mix5 (Biofact) in the presence of a sequence-modified primer and a Cas12f1 mutant guide RNA plasmid vector. The PCR amplicons were purified using the HiGene™ Gel & PCR Purification System (Biofact).
  • modification of the second, fourth and fifth regions of the engineered scaffold region of the engineered Cas12f1 mutant guide RNA was performed by linearizing the guide RNA-encoding vector with the ApoI and BamHI restriction enzymes and cloning into it synthetic oligonucleotides (Macrogen) carrying the modified sequence.
  • modification of the first region of the engineered scaffold region of the engineered Cas12f1 mutant guide RNA was performed by PCR amplification of canonical or engineered template plasmid vectors using a forward primer targeting the 5'-end of tracrRNA and a reverse primer targeting the U6 promoter region.
  • the PCR amplification was performed with Q5 Hot Start high-fidelity DNA polymerase (NEB), and the PCR product was ligated using KLD Enzyme Mix (NEB).
  • the ligated PCR products were transformed into DH5α E. coli cells. Mutagenesis was confirmed by Sanger sequencing analysis.
  • the modified plasmid vector was purified using the NucleoBond® Xtra Midi EF kit (MN). One microgram of the purified plasmid was used as a template for mRNA synthesis using T7 RNA polymerase (NEB) and NTPs (Jena Bioscience). The guide RNA engineered for the Cas12f1 variant was purified using the Monarch® RNA cleanup kit (NEB), aliquoted into cryogenic vials, and stored in liquid nitrogen.
  • amplicons of canonical guide RNA and engineered guide RNA were prepared.
  • the template DNA plasmid of the canonical guide RNA and the template DNA plasmid of the augment RNA were amplified by PCR using a U6-complementary forward primer and a protospacer sequence-complementary reverse primer.
  • PCR amplification products were purified using the HiGene™ Gel & PCR Purification System (Biofact) to obtain canonical guide RNA and augment RNA amplicons.
  • In vitro transcription was performed using the PCR amplicon as a template and NEB T7 polymerase.
  • the in vitro transcription result was treated with DNase I (NEB), purified using Monarch RNA Cleanup Kit (NEB), and guide RNA was obtained. Thereafter, a plasmid vector containing a previously designed guide RNA sequence and a T7 promoter sequence was prepared according to the T-blunt plasmid (Biofact) cloning method.
  • the microgene editing system may be a ribonucleoprotein (RNP) formed by the interaction between an engineered guide RNA (augment RNA) and one Cas12f1 mutant protein, or an RNP formed by the interaction between an engineered guide RNA and two Cas12f1 mutant proteins.
  • the miniaturized gene editing protein (small endonuclease) purified in Example 1.2 and the engineered guide RNA prepared in Example 1.3 were incubated together at concentrations of 300 nM and 900 nM, respectively, at room temperature for 10 minutes to prepare a ribonucleoprotein particle (RNP).
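The RNP assembly step fixes only the final concentrations (300 nM protein, 900 nM guide RNA); the volumes to pipette follow from the usual dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the stock concentrations, the 20 µl reaction volume, and the function names `stock_volume_ul` and `rnp_mix` are assumptions, not values from the text.

```python
def stock_volume_ul(final_nm, stock_nm, total_ul):
    """Volume of stock (µl) giving final_nm in total_ul, from C1*V1 = C2*V2."""
    if stock_nm < final_nm:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_nm * total_ul / stock_nm


def rnp_mix(total_ul, protein_stock_nm, grna_stock_nm,
            protein_final_nm=300.0, grna_final_nm=900.0):
    """Volumes of protein stock, guide RNA stock and buffer for one RNP reaction."""
    protein_ul = stock_volume_ul(protein_final_nm, protein_stock_nm, total_ul)
    grna_ul = stock_volume_ul(grna_final_nm, grna_stock_nm, total_ul)
    buffer_ul = total_ul - protein_ul - grna_ul
    if buffer_ul < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    return {"protein_ul": protein_ul, "guide_rna_ul": grna_ul, "buffer_ul": buffer_ul}


if __name__ == "__main__":
    # Hypothetical stocks: 3 µM protein, 9 µM guide RNA, 20 µl reaction
    print(rnp_mix(20.0, 3000.0, 9000.0))
```

With the defaults, the guide RNA is in threefold molar excess over the protein, matching the 300 nM and 900 nM figures given above.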
  • the Cas12f1 variant a miniaturized gene editing protein, was human codon-optimized for expression in human cells, and an oligonucleotide of the codon-optimized Cas12f1 mutant gene was constructed.
  • oligonucleotides containing the nucleotide sequence of the prepared Cas12f1 mutant gene, with a nuclear localization signal (NLS) sequence and a linker sequence at the 5'-end and 3'-end, respectively, were synthesized (Bionics) to obtain a polynucleotide of the human codon-optimized Cas12f1 variant nucleic acid construct for target nucleic acid or target gene cleavage of the present invention.
  • the polynucleotide of the codon-optimized Cas12f1 variant nucleic acid construct was cloned into a plasmid containing a sequence encoding eGFP operably linked to a chicken β-actin (CBA) promoter and a self-cleaving T2A peptide (2A).
  • a template DNA for the canonical guide RNA used in this experiment was synthesized (Twist Bioscience) and cloned into a pTwist Amp plasmid vector.
  • template DNA for the engineered guide RNA was prepared using an enzyme cloning technique and cloned into the pTwist Amp plasmid.
  • an amplicon of the canonical guide RNA or engineered guide RNA was prepared. If necessary, the prepared amplicon was cloned into a T-blunt plasmid (Biofact).
  • oligonucleotides encoding engineered tracrRNA and engineered crRNA were cloned into the pSilencer 2.0 vector (ThermoFisher Scientific) using the restriction enzymes BamHI and HindIII (NEB).
  • a vector expressing the components of the miniaturized gene editing system was constructed by using Gibson assembly to clone a polynucleotide encoding the engineered Cas12f1 variant augment RNA into a vector comprising the human codon-optimized Cas12f1 variant gene or a nucleic acid construct comprising the same.
  • a vector expressing the miniaturized gene editing system was constructed from an AAV inverted terminal repeat plasmid vector (AAV vector) comprising 1) a sequence encoding eGFP linked to a chicken β-actin (CBA) promoter and a self-cleaving T2A peptide (2A), and 2) a sequence encoding a Cas12f1 mutant protein or a homologue thereof.
  • transcription of the nucleic acid construct encoding the Cas12f1 mutant protein or its homolog protein and transcription of the guide RNA were driven by the chicken β-actin and U6 promoters, respectively.
  • the AAV plasmid vector (AAV vector) can be appropriately changed according to the purpose of gene editing or modification, such as eGFP, the number of Cas12f1 mutant augment RNA, and / or the addition of effector proteins.
  • the AAV vector and helper plasmid were transfected into HEK 293T cells.
  • the transfected HEK 293T cells were cultured in DMEM medium containing 2% FBS.
  • Recombinant pseudotyped AAV vector stocks were generated using PEIpro (Polyplus-transfection) and PEI coprecipitation using triple-transfection for plasmids at equal molar ratios.
  • the cells were lysed and the AAV vector was purified from the lysate by iodixanol (Sigma-Aldrich) step gradient ultra-centrifugation.
  • HEK 293T (ATCC CRL-11268), HeLa (ATCC CCL-2), U-2 OS (ATCC HTB-96) and K-562 (ATCC CCL-243) cells were cultured at 37°C and 5% CO2 in DMEM medium supplemented with 10% heat-inactivated FBS, 1% penicillin/streptomycin and 0.1 mM non-essential amino acids.
  • 1.0 × 10⁵ HEK 293T cells were seeded 1 day before transfection with the nucleic acid construct, a vector containing the same, or an engineered guide RNA.
  • Cell transfection was performed by electroporation or lipofection.
  • for electroporation, 2-5 µg each of the nucleic acid construct, the plasmid vector containing the same, or the DNA encoding the engineered guide RNA was transfected into 4 × 10⁵ HEK 293T cells using the Neon transfection system (Invitrogen). Electroporation was performed under conditions of 1300 V, 10 mA, and 3 pulses.
  • the ribonucleoprotein particles (RNP) prepared according to Example 1.4 were transfected into cells using electroporation; alternatively, after transfection through lipofection, the engineered guide RNA prepared according to Example 1.3 was transfected into the cells 1 day later using electroporation.
  • the region containing the protospacer in genomic DNA isolated from HEK 293T cells was amplified by PCR with target-specific primers in the presence of KAPA HiFi HotStart DNA polymerase (Roche). The amplification followed the manufacturer's instructions. The resulting PCR amplicons, carrying Illumina TruSeq HT dual indexes, were subjected to 150-bp paired-end sequencing using an Illumina iSeq 100.
  • the MAUND is provided at https://github.com/ibscge/maund.
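The indel frequency reported from deep sequencing is, at its core, the share of reads that differ from the unedited amplicon by an insertion or deletion. The sketch below illustrates that ratio with a deliberately simplified rule: a read is counted as an indel read when its length differs from the reference. This is an assumption for illustration only and is not the MAUND algorithm; the function name `indel_frequency` and the toy reads are hypothetical.

```python
def indel_frequency(reference, reads):
    """Percentage of reads whose length differs from the reference amplicon.

    Simplified stand-in for indel calling: any length change is treated as
    an insertion or deletion; substitutions are ignored.
    """
    if not reads:
        raise ValueError("no reads supplied")
    ref_len = len(reference)
    indel_reads = sum(1 for read in reads if len(read) != ref_len)
    return 100.0 * indel_reads / len(reads)


if __name__ == "__main__":
    reference = "ACGTACGT"   # toy amplicon
    reads = [
        "ACGTACGT",          # unedited
        "ACGACGT",           # 1-nt deletion
        "ACGTTACGT",         # 1-nt insertion
        "ACGTACGT",          # unedited
    ]
    print(f"{indel_frequency(reference, reads):.1f}% indel reads")
```

A real analysis would restrict the comparison to a window around the expected cleavage site and filter low-quality reads.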
  • PCR products were obtained using BioFACT TM Lamp Pfu DNA polymerase.
  • the PCR product (100-300 µg) was reacted with 10 units of T7E1 enzyme (NEB) in a 25 µl reaction mixture at 37°C for 30 minutes.
  • the 20 µl reaction mixture was loaded directly onto a 10% acrylamide gel and the digested PCR products were run in a TBE buffer system. After the gel was stained with an ethidium bromide solution, the gel image was digitized using a Printgraph 2 M gel imaging system (Atto). The digitized result was analyzed to evaluate gene editing efficiency.
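The text does not state how the digitized band intensities were converted into an editing efficiency. A commonly used estimate for T7E1 assays is indel % = 100 × (1 − √(1 − f)), where f is the fraction of signal in the cleaved bands. The sketch below applies that estimate purely as an assumption; the function name `t7e1_indel_percent` and the example intensities are hypothetical.

```python
import math


def t7e1_indel_percent(uncut_intensity, cut_intensities):
    """Estimate indel % from gel band intensities.

    Uses the widely used approximation 100 * (1 - sqrt(1 - f)), where f is the
    cleaved fraction. The source text does not specify its own formula.
    """
    cut_total = sum(cut_intensities)
    total = uncut_intensity + cut_total
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    cleaved_fraction = cut_total / total
    return 100.0 * (1.0 - math.sqrt(1.0 - cleaved_fraction))


if __name__ == "__main__":
    # Hypothetical densitometry values: one uncut band, two cleavage products
    print(f"{t7e1_indel_percent(75.0, [15.0, 10.0]):.1f}% estimated indels")
```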
  • the adeno-associated virus (AAV) vector constructed in Example 2 was transduced into HEK 293T cells. After 3, 5 and 7 days, genomic DNA was obtained from the transfected HEK 293T cells and purified using a Genomic DNA prep kit (QIAGEN, catalog #: 69504). After the target nucleic acid or the target region of the target gene was amplified by PCR in the purified product, the final PCR product was analyzed using targeted deep sequencing. For library generation, the target site was amplified using the KAPA HiFi HotStart PCR kit (KAPA Biosystem #: KK2501). This library was sequenced using a MiniSeq TruSeq HT Dual Index system (Illumina).
  • whether the hypercompact TaRGET system has the activity of cleaving the target sequence of a target nucleic acid or gene in cells, and how this target sequence cleavage activity varies with the type of engineered guide RNA, were investigated.
  • Deletion and insertion (indel) may occur by cleavage of a nucleic acid within a target nucleic acid or target gene.
  • the indel arises from non-homologous end joining (NHEJ), in which the two ends formed by a double-strand break in DNA are brought into repeated contact and rejoined to repair the break; this results in insertions and/or deletions (indels) of parts of the nucleic acid sequence at the NHEJ repair site. Consequently, nucleic acid editing in which one or more bases are deleted and/or added within a target gene or target nucleic acid may occur by cleavage of the target nucleic acid by the gene editing system.
  • Target name; Target sequence (5' to 3'); SEQ ID NO:
  • Target-1; [TTTG]CACACACACAGTGGGCTACC; 138
  • Target-2; [TTTG]CATCCCCAGGACACACACAC; 139
  • Target-3; [TTTA]AGAACACATACCCCTGGGCC; 140
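The target entries above use the notation [PAM]protospacer. The sketch below splits such an entry and writes the protospacer in RNA letters, under the assumption that the guide RNA spacer has the same sequence as the listed protospacer with T replaced by U; the text does not spell out this convention, and the function name `parse_target` is hypothetical.

```python
import re

TARGET_PATTERN = re.compile(r"\[([ACGT]+)\]([ACGT]+)")


def parse_target(entry):
    """Split '[PAM]protospacer' and return (pam, protospacer, rna_spacer).

    Assumption: the spacer is the protospacer written as RNA (T -> U).
    """
    match = TARGET_PATTERN.fullmatch(entry)
    if match is None:
        raise ValueError(f"not in [PAM]protospacer form: {entry!r}")
    pam, protospacer = match.groups()
    return pam, protospacer, protospacer.replace("T", "U")


if __name__ == "__main__":
    targets = {
        "Target-1": "[TTTG]CACACACACAGTGGGCTACC",
        "Target-2": "[TTTG]CATCCCCAGGACACACACAC",
        "Target-3": "[TTTA]AGAACACATACCCCTGGGCC",
    }
    for name, entry in targets.items():
        pam, protospacer, spacer = parse_target(entry)
        print(name, pam, len(protospacer), spacer)
```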
  • the indel efficiency at the target sequences was investigated for MS1/MS2/MS3 augment RNA (Cas12f1_ge3.0), MS2/MS3/MS4 augment RNA (Cas12f1_ge4.0), and MS2/MS3/MS4/MS5 augment RNA (Cas12f1_ge4.1).
  • for all of the target sequences Target-1 to Target-3, the engineered sgRNAs, that is, MS1/MS2/MS3 augment RNA (Cas12f1_ge3.0), MS2/MS3/MS4 augment RNA (Cas12f1_ge4.0) and MS2/MS3/MS4/MS5 augment RNA (Cas12f1_ge4.1), showed similar indel efficiencies, each allowing the Cas12f1 mutant protein or the Cas12f1 mutant v1 to v3 proteins to cleave the target nucleic acid with an efficiency of 90% or more.
  • in contrast, with unengineered canonical sgRNAs, the Cas12f1 mutant protein and the Cas12f1 mutant v1 to v3 proteins produced no indels at all at the target nucleic acids (Target-1 to Target-3) (Figs. 3a to 3c).
  • amino acids were added by linking the amino acid sequence of SEQ ID NO: 233 to the N-terminus of the Cas12f1 mutant protein (wtTnpB), or the amino acid sequence of SEQ ID NO: 234 to the C-terminus of the Cas12f1 mutant protein, via the NLS sequence (SEQ ID NO: 54; PKKKRKV).
  • the engineered guide RNA is a highly efficient guide RNA that enables the Cas12f1 mutant protein to cut target nucleic acids compared to canonical sgRNA
  • the ultracompact gene editing system of the present invention (Hypercompact TaRGET system) including it
  • the Cas12f1 mutant protein which is a small gene editing protein
  • using the MS2/MS3/MS4/MS5 augment RNA (Cas12f1_ge4.1), engineered to have the shortest length, the indel efficiency of the microgene editing system was compared with representative gene editing systems known to have excellent indel activity.
  • the CRISPR/SpCas9 system, the CRISPR/AsCas12a system, the CRISPR/Cas12f1 system, and, as representatives of the hypercompact TaRGET system of the present invention, the Cas12f1 variant system, the Cas12f1 variant v1 system, and the Cas12f1 variant v2 system were each used in HEK 293T cells.
  • the indel efficiency at the endogenous locus 5'-[TTTA]AGAACACATACCCCTGGGCC-3' (Target-3, SEQ ID NO: 140), was confirmed by deep sequencing analysis.
  • the CRISPR/SpCas9 system showed an indel efficiency of about 10%
  • the Cas12f1 variant system, the Cas12f1 variant v1 system, and the Cas12f1 variant v2 system showed indel efficiencies of 45%, 55%, and 38%, respectively (FIG. 4).
  • This confirms that the hypercompact TaRGET system including the miniature gene editing protein Cas12f1 variant according to the present invention, in addition to the advantage that its small size expands the range of use in various gene editing applications, cuts the target nucleic acid or target gene with significantly higher efficiency than the most researched and currently used CRISPR/Cas systems.
  • Example 5.3 Indel activity analysis according to the combination of augment RNA and Cas12f1 variant
  • Example 5.3.1 Comparison of indel activity according to augmented RNA
  • the engineered guide RNA (augment RNA) resulted in superior target nucleic acid cleavage activity for Cas12f1 mutant proteins (including variants v1 to v3) compared to canonical sgRNA.
  • each of the modified regions MS1 to MS5 in the canonical sgRNA was further subdivided into 3 compartments.
  • variously engineered augment RNAs were produced. Indel activity of the engineered augmented RNA was tested.
  • the hypercompact TaRGET system including the canonical sgRNA (full length) and the Cas12f1 mutant protein did not cleave the target strand, but it was used for the test.
  • the engineered augment RNA affected the indel efficiency of the Cas12f1 mutant protein to the target nucleic acid according to its nucleotide sequence and target sequence.
  • MS1/MS2/MS3 augment RNA, MS1/MS2/MS4*-2 augment RNA
  • MS1/MS3-3/MS4*-2 augment RNA, MS1/MS2/MS3-3/MS4*-2 augment RNA
  • MS1/MS2/MS3-3/MS4*-2 augment RNA showed a high indel efficiency of about 50% to 65%
  • the augment RNA and MS1/MS2/MS3-3/MS4*-2/MS5-3 augment RNA showed an indel efficiency of about 30% to 40% (Fig. 5a).
  • MS1/MS2/MS3 augment RNA, MS1/MS2/MS3-3/MS4*-2 augment RNA
  • MS1/MS2/MS3-3/MS5-3 augment RNA, MS1/MS2/MS3-3/MS4*-2/MS5-3 augment RNA
  • MS1/MS2/MS3-3/MS4*-2/MS5-3 augment RNA showed an indel efficiency of about 35% to 45%
  • MS1/MS2/MS4*-2 augment RNA, MS1/MS3-3/MS4*-2 augment RNA, MS1/MS2/MS5-3 augment RNA, MS1/MS3-3/MS5-3 augment RNA, MS1/MS4*-2/MS5-3 augment RNA, MS1/MS2/MS4*-2/MS5-3 augment RNA and MS1/MS3-3/MS4*-2/MS5-3 augment RNA showed an indel efficiency of about
  • the indel efficiency at the target nucleic acid was confirmed for the microgene editing system including the MS1/MS2/MS3 augment RNA, MS1/MS2/MS4*-2 augment RNA, MS1/MS3-3/MS4*-2 augment RNA, or MS1/MS2/MS3-3/MS4*-2 augment RNA and the Cas12f1 mutant protein.
  • Cas12f1 variant v2 and Cas12f1 variant v3 showed very good indel efficiency of about 45% to about 65%, similar to the Cas12f1 mutant protein (FIG. 6).
  • an indel efficiency of about 15% was rather low.
  • Both augment RNA and MS1/MS2/MS3-3/MS4*-2 augment RNA significantly increased the indel efficiency of the genetic scissors system containing the Cas12f1 variant v1 (Fig. 6).
  • Example 5.3.3 Comparison of indel activity of Cas12f1 variants according to augmented RNA based on mature form sgRNA
  • Mature form sgRNA: 5'-CUUCACUGAUAAAGUGGAGAACCGCUUCACCAAAAGCUGUCCCuuagGGGAUUAGAACUUGAGUGAAGGUGGGCUGCUUGCAUCAGCCUAAUGUCGAGAAGUGCUUUCUUCGGAAAGUAACCCUCGAAACAAAUUCA UUU gaaa GAA UGAAGGAAUGCAACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN-3' (SEQ ID NO: 124)
  • augment RNAs with some modifications of the mature form sgRNA sequence were prepared (Table 5), and the indel efficiency of the hypercompact TaRGET system of the present invention with these augment RNAs was measured.
  • the indel efficiency of the hypercompact TaRGET system including the MS3-3 augment RNA (SEQ ID NO: 127), MS3-3/MS4-3 augment RNA (SEQ ID NO: 134), or MS3-3/MS4-3/MS5-3 augment RNA (SEQ ID NO: 137) and the Cas12f1 mutant protein was confirmed.
  • the Cas12f1 variant proteins of the present invention (including the Cas12f1 variant v1 to v3 proteins) and homologous proteins of the Cas12f1 variant protein exhibiting the same biological activity as these;
  • whereas there is little nucleic acid cleavage activity when the canonical guide RNA is included, it is concluded that in the microgene editing system including the augment RNA the cleavage activity for the target nucleic acid or target gene is increased by the modification in which at least one nucleotide sequence is deleted or substituted, or by the modification in which a U-rich tail is added to the 3'-terminus of the augment RNA or canonical guide RNA.
  • Sequences omitted from the sequence listing electronic file attached hereto are provided below.
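The target notation used above for Target-3, 5'-[TTTA]AGAACACATACCCCTGGGCC-3', and the indel efficiencies read from deep sequencing can be illustrated with a minimal Python sketch. It is not taken from the application. The function names are hypothetical, the bracketed part is assumed to be the PAM and the rest the protospacer, the RNA-alphabet copy is only an assumption about how a matching spacer would be written, and the percentage formula (indel-containing reads over total reads) is the usual convention rather than the stated method. The read counts in the usage line are made up for illustration.

```python
import re


def split_target(notation):
    """Split a target written as 5'-[PAM]PROTOSPACER-3' into (PAM, protospacer).

    Assumes the bracketed bases are the PAM; this reading is an assumption.
    """
    match = re.fullmatch(r"5'-\[([ACGT]+)\]([ACGT]+)-3'", notation)
    if match is None:
        raise ValueError(f"unrecognized target notation: {notation!r}")
    return match.group(1), match.group(2)


def protospacer_as_rna(protospacer):
    """Rewrite a DNA protospacer in the RNA alphabet (T -> U)."""
    return protospacer.replace("T", "U")


def indel_efficiency(indel_reads, total_reads):
    """Percentage of sequencing reads that carry an insertion or deletion."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * indel_reads / total_reads


# Target-3 as written in the text (SEQ ID NO: 140).
pam, protospacer = split_target("5'-[TTTA]AGAACACATACCCCTGGGCC-3'")
print(pam, protospacer, protospacer_as_rna(protospacer))

# Hypothetical read counts, for illustration only.
print(indel_efficiency(indel_reads=450, total_reads=1000))
```

Keeping the PAM separate from the protospacer makes it easy to check that a candidate site has the expected 5' T-rich motif before its efficiency is compared across systems.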

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Organic Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Zoology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Biophysics (AREA)
  • Virology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Cell Biology (AREA)
  • Mycology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The present invention relates to a Cas12f1 mutant protein, which is a novel hypercompact gene editing protein, or a homologous protein thereof, and to a hypercompact gene editing system (hypercompact TaRGET system) comprising same. The hypercompact gene editing system not only increases gene editing efficiency but can also be packaged in a single AAV vector, allowing it to be delivered efficiently to a target site in a cell. The hypercompact gene editing system for nucleic acid editing of the present invention is a next-generation gene editing system that broadens the selection of gene editing proteins applicable to various target genes and nucleic acid types and replaces existing large gene editing systems that are difficult to deliver into cells, and it can be usefully employed for the treatment of and research on genetic diseases through nucleic acid editing.
PCT/KR2022/015067 2021-10-06 2022-10-06 Target system for genome editing and uses thereof WO2023059115A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2021-0132306 2021-10-06
KR20210132306 2021-10-06

Publications (1)

Publication Number Publication Date
WO2023059115A1 true WO2023059115A1 (fr) 2023-04-13

Family

ID=85804529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/015067 WO2023059115A1 (fr) 2021-10-06 2022-10-06 Target system for genome editing and uses thereof

Country Status (2)

Country Link
KR (1) KR20230051095A (fr)
WO (1) WO2023059115A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11814620B2 (en) 2021-05-10 2023-11-14 Mammoth Biosciences, Inc. Effector proteins and methods of use

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200190494A1 (en) * 2018-12-14 2020-06-18 Pioneer Hi-Bred International, Inc. Novel crispr-cas systems for genome editing
KR20210053228A (ko) * 2019-10-29 2021-05-11 주식회사 진코어 Engineered guide RNA for increasing the efficiency of the CRISPR/Cas12f1 system, and use thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
UA118014C2 (uk) 2012-05-25 2018-11-12 Те Ріджентс Оф Те Юніверсіті Оф Каліфорнія Method for modifying target DNA

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN ANONYMOUS CHRISTIAN : "MAG TPA: IS200/IS605 family element transposase accessory protein TnpB [Candidatus Woesearchaeota archaeon]", XP093020751, retrieved from GENBANK Database accession no. HIH05586 *
KARVELIS TAUTVYDAS, BIGELYTE GRETA, YOUNG JOSHUA K, HOU ZHENGLIN, ZEDAVEINYTE RIMANTE, BUDRE KAROLINA, PAULRAJ SUSHMITHA, DJUKANOV: "PAM recognition by miniature CRISPR–Cas12f nucleases triggers programmable double-stranded DNA target cleavage", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, GB, vol. 48, no. 9, 21 May 2020 (2020-05-21), GB , pages 5016 - 5023, XP055920188, ISSN: 0305-1048, DOI: 10.1093/nar/gkaa208 *
KIM DO YON; LEE JEONG MI; MOON SU BIN; CHIN HYUN JUNG; PARK SEYEON; LIM YOUJUNG; KIM DAESIK; KOO TAEYOUNG; KO JEONG-HEON; KIM YONG: "Efficient CRISPR editing with a hypercompact Cas12f1 and engineered guide RNAs delivered by adeno-associated virus", NATURE BIOTECHNOLOGY, NATURE PUBLISHING GROUP US, NEW YORK, vol. 40, no. 1, 2 September 2021 (2021-09-02), New York, pages 94 - 102, XP037667066, ISSN: 1087-0156, DOI: 10.1038/s41587-021-01009-z *

Also Published As

Publication number Publication date
KR20230051095A (ko) 2023-04-17

Similar Documents

Publication Publication Date Title
WO2021086083A2 (fr) Engineered guide RNA for increasing the efficiency of a CRISPR/Cas12f1 system, and use thereof
WO2019009682A2 (fr) Target-specific CRISPR mutant
WO2016021973A1 (fr) Genome editing using RGENs derived from the Campylobacter jejuni CRISPR/Cas system
WO2017217768A1 (fr) Method for screening targeted genetic scissors using a multi-target system of on-target and off-target activity, and use thereof
WO2019103442A2 (fr) Genome editing composition using a CRISPR/Cpf1 system, and use thereof
WO2014065596A1 (fr) Composition for cleaving a target DNA comprising a guide RNA specific for the target DNA and a Cas protein-encoding nucleic acid or Cas protein, and use thereof
WO2017188797A1 (fr) Method for highly efficient in vivo evaluation of the activity of an RNA-guided nuclease
WO2018062866A2 (fr) Cell-permeable (CP) Cas9 recombinant protein and uses thereof
WO2022075816A1 (fr) Engineered guide RNA for increasing the efficiency of the CRISPR/Cas12f1 (Cas14a1) system, and use thereof
WO2022075808A1 (fr) Engineered guide RNA comprising a U-rich tail for increasing the efficiency of a CRISPR/Cas12f1 system, and use thereof
WO2022060185A1 (fr) Targeted deaminase and base editing using same
WO2023059115A1 (fr) Target system for genome editing and uses thereof
WO2022220503A1 (fr) Gene expression regulatory system using a CRISPR system
WO2022075813A1 (fr) Engineered guide RNA for increasing the efficiency of the CRISPR/Cas12f1 system, and use thereof
CN101310015A (zh) LAGLIDADG homing endonuclease variants having mutations in two functional subdomains, and use thereof
WO2018231018A2 (fr) Platform for expressing a protein of interest in the liver
WO2018088694A2 (fr) Artificially engineered Schwann cell (SC) function control system
WO2023153845A2 (fr) Target system for homology-directed repair and gene editing method using same
WO2020235974A2 (fr) Single base substitution protein, and composition comprising same
WO2015199387A2 (fr) Helicobacter pylori α-1,2 fucosyltransferase gene and protein with improved soluble protein expression, and their use in the production of α-1,2 fucosyl oligosaccharide
WO2018230976A1 (fr) Genome editing system for repeat expansion mutation
WO2020022803A1 (fr) Gene editing of anticoagulants
WO2023229222A1 (fr) Engineered Cas12f protein with expanded target range and uses thereof
WO2023191570A1 (fr) Gene editing system for the treatment of Usher syndrome
WO2022240262A1 (fr) Composition and method for treating LCA10 using an RNA-guided nuclease

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22878936

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE