AU2022201166B2 - Type II CRISPR/Cas9 genome editing system and the application thereof

Publication number
AU2022201166B2
Authority
AU
Australia
Prior art keywords
sequence
seq
sgrna
crispr
nts
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
AU2022201166A
Other versions
AU2022201166A1 (en)
Inventor
Long Huang
Rui Tian
Hongxian XIE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Shu Tong Medical Technology Co Ltd
Original Assignee
Zhuhai Shu Tong Medical Technology Co Ltd
Application filed by Zhuhai Shu Tong Medical Technology Co Ltd
Priority to AU2022201166A
Publication of AU2022201166A1
Application granted
Publication of AU2022201166B2
Legal status: Active

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C12Y ENZYMES
    • C12Y301/00 Hydrolases acting on ester bonds (3.1)

Abstract

The disclosure relates to a Type II CRISPR/Cas9 genome editing system, belonging to the technical field of genome editing. The genome editing system comprises a Cas9 protein, helper proteins, a CRISPR RNA and a trans-activating CRISPR RNA; wherein the Cas9 protein is a DNA endonuclease, and the Cas9 protein has an amino acid sequence as shown in SEQ ID NO: 1, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 1. According to the disclosure, a Type II CRISPR/Cas9 genome editing system was discovered in Faecalibaculum rodentium through bioinformatics analysis; the genome editing system is applied to editing prokaryotic or eukaryotic genes and provides a new option for the genome editing toolbox.

Description

TYPE II CRISPR/CAS9 GENOME EDITING SYSTEM AND THE APPLICATION THEREOF
TECHNICAL FIELD
The disclosure relates to a Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium, belonging to the technical field of genome editing.
BACKGROUND ART
Gene editing technology makes it possible to modify DNA sequences at specific sites. For example, the first generation of genome editing tools, zinc finger nucleases (ZFNs), and the second generation of genome editing tools, such as transcription activator-like effector nucleases (TALENs), can both be used to modify targeted genomes. However, these tools are difficult to design, not easy to manufacture, expensive and not universal. The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated protein) system is an adaptive immune system derived from archaea and bacteria, and serves as the third-generation genome editing tool. Unlike the previous genome editing tools, which rely on protein-DNA recognition, this method uses nucleic acid base complementary pairing to identify the target DNA sequence and guide the Cas effector protein to carry out site-specific cleavage, and has the advantages of high applicability, simple design, low cost and high efficiency. Cas proteins contain a variety of effector domains, which play roles in different activities such as nucleic acid recognition, stabilization of complex structures, and hydrolysis of DNA phosphodiester bonds. Among them, the Type II CRISPR/Cas9 system derived from Streptococcus pyogenes (SpCas9) has become the most widely used CRISPR/Cas system due to its high cleavage efficiency. This system recognizes the Protospacer Adjacent Motif (PAM) sequence, i.e. "NGG", on the targeted polynucleotide and cleaves the adjacent target sequence, leaving blunt ends and thereby effecting genome editing. In large and diverse metagenomes harboring uncultured or even undiscovered microorganisms, there may be a large number of undiscovered CRISPR/Cas9 systems whose activities in prokaryotes and eukaryotes, as well as in vitro environments, need to be confirmed.
In 2015, Dr. Byoung-Chan Kim's team isolated a new anaerobic strain, ALO17, from the feces of C57BL/6J laboratory mice, analyzed the phylogenetic relationship of the strain based on the prokaryotic 16S rRNA gene sequence, and found that the strain is closely related to Holdemanella biformis DSM 3989T, Faecalicoccus pleomorphus ATCC 29734T, Faecalitalea cylindroides ATCC 27803T, and Allobaculum stercoricanis DSM 13633T (sequence homologies of 87.4%, 87.3%, 86.9% and 86.9%, respectively). On the basis of multiple lines of taxonomic evidence, this species is considered to represent a new genus of the family Erysipelotrichaceae, and was named Faecalibaculum rodentium gen. nov., sp. nov. In the past five years, scientists in various countries have studied this strain in two fields, i.e., the intestinal microbial environment and high-fat diet, and the intestinal microbial environment and tumorigenesis. However, it has not been reported in the field of genome editing.
SUMMARY OF THE DISCLOSURE
In view of the defects of the prior art, the disclosure provides a Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium, wherein the genome editing system has new physical and chemical properties and can identify a plurality of different PAM (protospacer adjacent motif) sequences, including NGTA and NNTA (N is A, C, G, or T).
In a first aspect, there is provided a Type II CRISPR/Cas9 genome editing system, comprising a Cas9 protein, helper proteins, a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA), in the functional form of a ribonucleoprotein complex (RNP complex) of the Cas9 protein and a guide RNA formed by hybridizing the crRNA with the tracrRNA;
wherein the Cas9 protein is a DNA endonuclease, and the Cas9 protein has an amino acid sequence as shown in SEQ ID NO: 1, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 1; wherein the helper proteins comprise a Cas1 helper protein, a Cas2 helper protein and a Csn2 helper protein; wherein the Cas1 helper protein has an amino acid sequence as shown in SEQ ID NO: 2, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 2; wherein the Cas2 helper protein has an amino acid sequence as shown in SEQ ID NO: 3, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 3; wherein the Csn2 helper protein has an amino acid sequence as shown in SEQ ID NO: 4, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 4; wherein the CRISPR RNA is generated by transcription of a CRISPR Array, and has an RNA sequence as shown in SEQ ID NO: 5, or an RNA sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 5; and wherein the tracrRNA comprises a sequence complementary to a direct repeat sequence of the crRNA, and the tracrRNA has a nucleic acid sequence as shown in SEQ ID NO: 8, or a nucleic acid sequence with at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 8.
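The homology thresholds recited above can be illustrated with a small calculation. The following is a minimal sketch only: it assumes that "homology" is read as percent identity over a pre-computed pairwise alignment (with "-" for gaps), which is one common reading and not a definition given by the disclosure. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned (equal length, '-' = gap).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the identity reaches the chosen threshold (e.g. 80, 85, 90, 95, 98, 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example with made-up peptides (not SEQ ID NO: 1): 9 of 10 columns match.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQW"))  # 90.0
```

In practice the alignment itself would come from a dedicated alignment tool; only the thresholding step is sketched here.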
In some embodiments, the Cas9 protein cleaves a double-stranded DNA complementary to a crRNA upstream of a PAM sequence by a nuclease domain, wherein the nuclease domain is selected from an HNH-like nuclease domain, a RuvC-like nuclease domain, or a combination thereof.
In some embodiments, the term "ribonucleoprotein complex" preferably refers to the "FrCas9 protein complex" according to the disclosure.
The mutations at several key amino acid sites of the specific Cas9 protein are explained in detail as follows. Mutation of E to A at amino acid position 796 results in a nickase. Mutation of N to A at amino acid position 902 results in a nickase. Mutation of H to A at amino acid position 1010 results in a nickase. Mutation of D to A at amino acid position 1013 results in a nickase.
Simultaneous mutation of E to A at amino acid position 796 and D to A at amino acid position 1013 results in a Cas9 protein that does not cleave but retains binding (i.e., dead Cas9).
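The substitutions listed above can be written in the usual shorthand (E796A, N902A, H1010A, D1013A) and applied to a sequence programmatically. The sketch below is illustrative only: SEQ ID NO: 1 is not reproduced in this section, so a placeholder 1372-residue sequence is used that carries the stated wild-type residues at the four positions; `toy_protein` and `apply_substitutions` are hypothetical helper names.

```python
import re

# Variants named in the text (1-based positions in the FrCas9 protein).
NICKASE_VARIANTS = ["E796A", "N902A", "H1010A", "D1013A"]
DEAD_CAS9_VARIANT = ["E796A", "D1013A"]


def toy_protein(length: int = 1372) -> str:
    """Placeholder sequence (NOT SEQ ID NO: 1) with the stated wild-type residues."""
    residues = ["G"] * length
    for pos, aa in ((796, "E"), (902, "N"), (1010, "H"), (1013, "D")):
        residues[pos - 1] = aa
    return "".join(residues)


def apply_substitutions(protein: str, codes: list[str]) -> str:
    """Apply substitutions such as 'E796A', checking the wild-type residue first."""
    residues = list(protein)
    for code in codes:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
        if match is None:
            raise ValueError(f"cannot parse substitution {code!r}")
        wild_type, pos, mutant = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = mutant
    return "".join(residues)


dead = apply_substitutions(toy_protein(), DEAD_CAS9_VARIANT)
print(dead[795], dead[1012])  # A A
```

Checking the wild-type residue before substituting guards against off-by-one errors when the numbering of a working sequence differs from the reference.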
In some embodiments, the CRISPR Array comprises a direct repeat sequence and a spacer sequence, wherein the direct repeat sequence has a nucleic acid sequence as shown in SEQ ID NO: 6, or a nucleic acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 6; and wherein the spacer sequence has a nucleic acid sequence as shown in SEQ ID NO: 7, or a nucleic acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 7.
In some embodiments, the guide RNA formed by hybridizing the crRNA with the tracrRNA has a scaffold composed of a sequence of 7 to 24 nts of a crRNA direct repeat sequence and a tracrRNA sequence; preferably the two parts are fused by a "GAAA", "TGAA", or "AAAC" linker to form an sgRNA scaffold; preferably, an sgRNA scaffold formed by fusion of the crRNA direct repeat sequence of 20 nts and a full length sequence of the tracrRNA by "GAAA" is as shown in SEQ ID NO: 9, and on this basis, preferably, a highly efficient variant of an sgRNA scaffold is an sgRNA scaffold formed by fusion of the first 18 to 14 nts of the crRNA direct repeat sequence and the last 69 to 65 nts of the tracrRNA by a linker sequence (for example, "GAAA"), preferably selected from the following five scaffolds:
(1) an sgRNA scaffold with a length of 91 nts, which comprises 18 nts direct repeat sequence and 69 nts tracrRNA, as shown in SEQ ID NO: 10;
(2) an sgRNA scaffold with a length of 89 nts, which comprises 17 nts direct repeat sequence and 68 nts tracrRNA, as shown in SEQ ID NO: 11;
(3) an sgRNA scaffold with a length of 87 nts, which comprises 16 nts direct repeat sequence and 67 nts tracrRNA, as shown in SEQ ID NO: 12;
(4) an sgRNA scaffold with a length of 85 nts, which comprises 15 nts direct repeat sequence and 66 nts tracrRNA, as shown in SEQ ID NO: 13;
(5) an sgRNA scaffold with a length of 83 nts, which comprises 14 nts direct repeat sequence and 65 nts tracrRNA, as shown in SEQ ID NO: 14;
optionally, the sgRNA scaffold has a nucleic acid sequence with at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% homology to any of SEQ ID NOs: 9 to 14.
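The five scaffold lengths listed above follow from simple arithmetic: direct repeat portion plus a four-base linker plus tracrRNA portion. A minimal sketch of that bookkeeping, assuming the "GAAA" linker named in the text (the dictionary and function names are hypothetical):

```python
LINKER = "GAAA"  # four-base linker named in the text

# SEQ ID NO -> (direct repeat nts, tracrRNA nts), as listed in items (1) to (5).
SCAFFOLD_VARIANTS = {
    10: (18, 69),
    11: (17, 68),
    12: (16, 67),
    13: (15, 66),
    14: (14, 65),
}


def scaffold_length(direct_repeat_nts: int, tracr_nts: int, linker: str = LINKER) -> int:
    """Total scaffold length = direct repeat part + linker + tracrRNA part."""
    return direct_repeat_nts + len(linker) + tracr_nts


for seq_id, (dr, tracr) in SCAFFOLD_VARIANTS.items():
    print(f"SEQ ID NO: {seq_id}: {dr} + {len(LINKER)} + {tracr} = {scaffold_length(dr, tracr)} nts")
```

Each variant removes one nucleotide from each side of the linker, so consecutive scaffolds differ by 2 nts (91, 89, 87, 85, 83).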
In some embodiments, the Type II CRISPR/Cas9 genome editing system, derived from Faecalibaculum rodentium, binds or cleaves a specific DNA in a biological process (such as genome editing process), by complementary pairing recognition of the guide RNA and a target of the specific DNA, and wherein a length of a paired binding part of the guide RNA and the target of the specific DNA ranges from 14 to 30 bps (preferably 20 to 23 bps); wherein the specific DNA is a DNA of prokaryote or eukaryote.
In some embodiments, the length of the paired binding part of the guide RNA and the target of the specific DNA is 21 bps, 22 bps or 23 bps, and wherein the RNP complex is highly sensitive to base mismatch of 14 bps close to a protospacer adjacent motif and the 14 bps is a seed region.
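The notion of a 14 bp PAM-proximal seed region can be made concrete by locating mismatches relative to the PAM. The sketch below assumes that the guide and the target are given as same-length DNA-alphabet strings written 5' to 3' with the PAM following the 3' end, so position 1 is the base adjacent to the PAM; the function names are hypothetical.

```python
def mismatch_positions(guide: str, target: str) -> list[int]:
    """Mismatch positions counted from the PAM-proximal end (1 = next to the PAM)."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    n = len(guide)
    return sorted(
        n - i
        for i, (g, t) in enumerate(zip(guide.upper(), target.upper()))
        if g != t
    )


def seed_mismatches(guide: str, target: str, seed_len: int = 14) -> list[int]:
    """Mismatches that fall inside the PAM-proximal seed region."""
    return [p for p in mismatch_positions(guide, target) if p <= seed_len]


guide = "ACGTACGTACGTACGTACGTAC"            # 22-nt toy guide
distal = "T" + guide[1:]                     # mismatch at the PAM-distal end
proximal = guide[:-1] + "A"                  # mismatch adjacent to the PAM
print(seed_mismatches(guide, distal))        # []
print(seed_mismatches(guide, proximal))      # [1]
```

A candidate off-target site with any seed mismatch would, per the text, be expected to be poorly tolerated by the RNP complex; this sketch only flags such positions and does not predict cleavage.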
In some embodiments, the protospacer adjacent motif required for a function of binding or cleaving DNA is 5'-NNTA-3' downstream of the guide RNA/sgRNA recognition sequence.
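Candidate target sites with a 5'-NNTA-3' PAM immediately downstream of the recognition sequence can be enumerated with a simple scan of both strands. This is a minimal sketch under stated assumptions: a 22-nt recognition sequence is used as the default (within the 21 to 23 range given above), and the function name and return format are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_targets(seq: str, protospacer_len: int = 22) -> list[tuple[str, int, str, str]]:
    """Return (strand, start, protospacer, PAM) for every site followed by NNTA.

    Start coordinates are 0-based on the scanned strand, i.e. '-' hits are
    reported in reverse-complement coordinates.
    """
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]{2}TA))" % protospacer_len)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


toy = "A" * 22 + "CGTA"
print(find_targets(toy))
```

For the toy input above the only hit is on the "+" strand, since the reverse complement begins with "TACG" and its last four bases are "TTTT".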
The disclosure also provides the use of the abovementioned Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium in editing DNA, CRISPR activation or interference. Preferably, the DNA is prokaryotic or eukaryotic DNA.
As another aspect of the disclosure, the disclosure further provides the use of the above Type II CRISPR/Cas9 genome editing system in the preparation of a nickase, a dead Cas9, a base editor, or a prime editor. Preferably, the preparation is in a prokaryote or a eukaryote.
Compared with the prior art, the disclosure has the desirable beneficial effects as follows.
(1) According to the disclosure, through bioinformatics analysis, a Type II CRISPR/Cas9 genome editing system in Faecalibaculum rodentium is found, and the genome editing system is applied to editing prokaryotic or eukaryotic genes and provides a new option in the genome editing toolbox.
(2) The disclosure provides a Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium, which has new physical and chemical properties and can identify a plurality of different PAM sequences, wherein the specific sequence of PAM recognized by the genome editing system is 5'-NNTA-3' downstream of the guide RNA/sgRNA recognition sequence.
(3) The disclosure provides a Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium, which has high cleavage efficiency at the same DNA sites compared with the most commonly used SpCas9, while its off-target activity is lower than that of SpCas9, such that it is a safer and more effective genome editing tool.
(4) The disclosure provides a Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium, wherein the PAM recognized by the genome editing system is palindromic, and targets are therefore distributed "back-to-back" on a genome at a higher density than those of SpCas9.
(5) The disclosure provides a Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium, which has high flexibility and can be modified into a base editor and a prime editor, and is a genome editing tool capable of being widely applied to different scenarios.
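The palindromic property mentioned in item (4) can be checked directly: the core "TA" of the PAM is its own reverse complement, so every "TA" on one strand is also a "TA" at the same location on the opposite strand, whereas "GG" reads as "CC" on the opposite strand. The sketch below only demonstrates this property on a made-up sequence; it is not the genome-wide analysis of the disclosure, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_sites(seq: str, motif: str) -> tuple[int, int]:
    """Count (possibly overlapping) motif occurrences on the + and - strands."""

    def count(s: str) -> int:
        return sum(1 for i in range(len(s) - len(motif) + 1) if s[i:i + len(motif)] == motif)

    return count(seq.upper()), count(reverse_complement(seq))


print(reverse_complement("TA"))            # TA  (palindromic)
print(reverse_complement("GG"))            # CC  (not palindromic)
print(count_sites("TTAGGCTATA", "TA"))     # (3, 3)
print(count_sites("TTAGGCTATA", "GG"))     # (1, 0)
```

Because each "TA" is shared by both strands, one genomic position can anchor a target on each strand, which is the "back-to-back" arrangement described in the text.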
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a composition diagram of a Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure.
Figure 2 is a structural diagram of Cas9 protein of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure.
Figure 3 is an RNA secondary structure prediction diagram and an optimal sgRNA scaffold diagram of a guide RNA molecule recognized by the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure. Figure 3A shows the full length 95 nts gRNA. Figure 3B shows an sgRNA scaffold with a length of 91 nts, which comprises 18 nts direct repeat sequence and 69 nts tracrRNA. Figure 3C shows an sgRNA scaffold with a length of 89 nts, which comprises 17 nts direct repeat sequence and 68 nts tracrRNA. Figure 3D shows an sgRNA scaffold with a length of 87 nts, which comprises 16 nts direct repeat sequence and 67 nts tracrRNA. Figure 3E shows an sgRNA scaffold with a length of 85 nts, which comprises 15 nts direct repeat sequence and 66 nts tracrRNA. Figure 3F shows an sgRNA scaffold with a length of 83 nts, which comprises 14 nts direct repeat sequence and 65 nts tracrRNA.
Figure 4 is a schematic diagram of prokaryotic PAM sequences of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure.
Figure 5 is a schematic diagram of prokaryotic interference experiments of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure.
Figure 6 is a schematic diagram of eukaryotic cleaving of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure. Figure 6A is an ODN-PCR gel diagram. Figure 6B is a Sanger sequence peak.
Figure 7 is a schematic diagram showing the optimal length of sgRNA in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure. Figure 7A is a graph showing the cleavage efficiency of different sgRNA lengths at the HEK293 SITE2-T2 site. Figure 7B is a graph showing the cleavage efficiency of different sgRNA lengths at the DNMT1-T3 site. Figure 7C is a graph showing the cleavage efficiencies of different sgRNA lengths at the RNF2-T6 site.
Figure 8 is a schematic diagram showing the optimal length of the sgRNA recognition sequence of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure.
Figure 9 is a schematic diagram showing the comparison of the on-target efficiency and off-target activity by GUIDE-seq of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure as compared with SpCas9.
Figure 10 is a graph showing experimental results of the FrCas9-BE base editor, wherein Figure 10A is a schematic diagram and results confirming the FrCas9 nickase form by ODN breakpoint PCR; Figure 10B is the editing window for nFrCas9-BE4Gam; Figure 10C is the editing window for nFrCas9-ABE7.10; Figure 10D is a Venn diagram of pathogenic mutations in the ClinVar database that can be uniquely corrected by the SpCas9 and FrCas9 base editors; Figure 10E shows the C > T editing efficiency for FrCas9-BE4Gam using 2 "back-to-back" sgRNAs simultaneously.
Figure 11 is a diagram showing the distribution of FrCas9 and SpCas9 targets in the human genome, where Figure 11A shows the distribution of 5'-GG-3' (representing SpCas9 PAM) in the GRCh38 human genome; Figure 11B shows the distribution of 5'-TA-3' (representing FrCas9 PAM) in the GRCh38 human genome.
Figure 12 shows the application of FrCas9-specific targeted TATA box in CRISPR interference and CRISPR activation, wherein Figure 12A is a schematic diagram of the TATA-box of the ABCA1 gene targeted by FrCas9; Figure 12B shows CRISPR interference of FrCas9 and SpCas9 by targeting TATA-box, and CRISPR activation of FrCas9 and SpCas9 by targeting TATA-box; Figure 12C shows CRISPR activation of FrCas9 and SpCas9 by targeting the TATA-box.
Figure 13 is a schematic diagram of the Prime Editing genome editing system.
Figure 14 shows the genome editing effect of the FrCas9-PE Prime Editing system verified by Sanger sequencing, wherein the measured sequence was CGAACACTCAAGGTAAT (SEQ ID NO: 33).
DETAILED DESCRIPTION OF THE DISCLOSURE
In order to better explain the objects, technical solutions and advantages of the disclosure, the disclosure will be further described below with reference to specific embodiments.
A Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium comprises a Cas9 protein, helper proteins, a CRISPR RNA and a trans-activating CRISPR RNA, as shown in Figure 1. The Cas9 protein is a DNA endonuclease, and the Cas9 protein has an amino acid sequence as shown in SEQ ID NO: 1, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 1.
As a preferred embodiment of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium according to the disclosure, the Cas9 protein cleaves a double-stranded DNA complementary to crRNA upstream of the PAM sequence by a nuclease domain, as shown in Figure 2. The nuclease domain is selected from an HNH-like nuclease domain, a RuvC-like nuclease domain, or a combination thereof.
The Cas9 protein (Faecalibaculum rodentium Cas9), abbreviated as FrCas9 protein, comprises 1372 amino acids (SEQ ID NO: 1), and is a multi-domain and multifunctional DNA endonuclease. It efficiently cleaves a double-stranded DNA complementary to sgRNA upstream of the PAM by a nuclease domain, for example, cleaving the DNA strand complementary to the sgRNA sequence by an HNH-like nuclease domain, or cleaving the non-complementary DNA strand by a RuvC-like nuclease domain. Among the key sites, mutation of E to A at amino acid position 796 results in a nickase. Mutation of N to A at amino acid position 902 results in a nickase. Mutation of H to A at amino acid position 1010 results in a nickase. Mutation of D to A at amino acid position 1013 results in a nickase. Simultaneous mutation of E to A at amino acid position 796 and D to A at amino acid position 1013 results in a Cas9 protein that does not cleave but retains binding.
As a preferred embodiment of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium according to the disclosure, the helper proteins comprise a Cas1 helper protein, a Cas2 helper protein and a Csn2 helper protein. The Cas1 helper protein has an amino acid sequence as shown in SEQ ID NO: 2, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 2. The Cas2 helper protein has an amino acid sequence as shown in SEQ ID NO: 3, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 3. The Csn2 helper protein has an amino acid sequence as shown in SEQ ID NO: 4, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 4.
The helper proteins Cas1, Cas2 and Csn2 according to the disclosure participate in exogenous gene capture and maturation of crRNA.
As a preferred embodiment of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium according to the disclosure, the CRISPR RNA is generated by transcription of a CRISPR Array. The CRISPR RNA has an RNA sequence as shown in SEQ ID NO: 5, or an RNA sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 5. The CRISPR Array comprises a direct repeat sequence and a spacer sequence. The direct repeat sequence is as shown in SEQ ID NO: 6. The spacer sequence is as shown in SEQ ID NO: 7.
The CRISPR RNA (crRNA) according to the disclosure guides the Cas protein to recognize an intruding foreign genome by base complementarity. When bacteria are exposed to invasion by a bacteriophage or virus, a short segment of foreign DNA is integrated as a new spacer between the repeat sequences of the CRISPR array in the host chromosome, thereby providing a genetic record of infection. When the cell is invaded by the foreign gene again, the CRISPR array is transcribed to produce a precursor crRNA (pre-crRNA) with a spacer sequence at the 5' end, which has a length of 30 bps and is complementary to the sequence from the invading foreign gene. The 3' end is a repeat sequence with a length of 36 bps.
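The repeat-spacer organization described above can be illustrated by pulling the spacers out of an array. This is a minimal sketch assuming that all repeats in the array are exact copies of one known repeat sequence, which real arrays do not always satisfy; the toy sequences and the function name are hypothetical and are not SEQ ID NO: 6 or 7.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the sequences found between consecutive copies of the repeat.

    Assumption: repeats are exact matches. Sequence before the first repeat
    (leader) and after the last repeat is discarded.
    """
    parts = array.upper().split(repeat.upper())
    if len(parts) < 3:
        return []  # fewer than two repeats: no flanked spacer
    return [p for p in parts[1:-1] if p]


# Toy array: leader + (repeat + spacer) x 2 + terminal repeat.
toy_repeat = "GGCC"
toy_array = "CA" + "GGCC" + "AAAT" + "GGCC" + "TTTA" + "GGCC"
print(extract_spacers(toy_array, toy_repeat))  # ['AAAT', 'TTTA']
```

With the lengths given in the text, each extracted spacer would be expected to be 30 nts long and each repeat 36 nts, which can serve as a sanity check on a real array.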
As a preferred embodiment of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium according to the disclosure, the trans-activating CRISPR RNA comprises a sequence complementary to a direct repeat sequence of the CRISPR RNA as shown in SEQ ID NO: 8.
The trans-activating crRNA (tracrRNA) according to the disclosure is a non-protein-coding RNA, and participates in the maturation of crRNA and the formation of sgRNA. Under the action of the tracrRNA and Cas9 nuclease, 0 to 16 nts at the upstream end of the spacer sequence and 12 to 29 nts at the downstream end of the repeat sequence are removed from the pre-crRNA to form mature crRNA, which binds the tracrRNA to form a tracrRNA-crRNA complex that comprises a part recognizing the foreign DNA sequence with a length ranging from 14 to 30 bps. A tetraloop of four bases (for example, a "GAAA", "TGAA" or "AAAC" sequence) may be added between the downstream end of the crRNA and the upstream end of the tracrRNA to join the tracrRNA and the crRNA, in order to form an sgRNA comprising two bulges and three duplex structures upstream, as well as a stem-loop structure downstream, which can be further divided into a part that recognizes a foreign DNA sequence and a scaffold part.
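The trimming and fusion steps described above can be sketched as string operations. The sketch assumes that the 0 to 16 nts are removed from the 5' end of the 30-nt spacer and the 12 to 29 nts from the 3' end of the 36-nt repeat, which is consistent with the 14 to 30 nt recognition part and the 7 to 24 nt repeat part stated elsewhere in the disclosure. All sequences below are single-letter placeholders rather than real sequences, and the function names are hypothetical.

```python
LINKER = "GAAA"  # one of the tetraloops named in the text


def mature_crrna(spacer: str, repeat: str, trim_5prime: int = 0, trim_3prime: int = 12) -> str:
    """Trim a pre-crRNA (spacer + repeat) to a mature crRNA."""
    if not 0 <= trim_5prime <= 16:
        raise ValueError("5' trim must be between 0 and 16 nts")
    if not 12 <= trim_3prime <= 29:
        raise ValueError("3' trim must be between 12 and 29 nts")
    return spacer[trim_5prime:] + repeat[: len(repeat) - trim_3prime]


def build_sgrna(guide: str, repeat: str, tracr: str,
                dr_len: int = 18, tracr_len: int = 69, linker: str = LINKER) -> str:
    """Fuse guide + first dr_len nts of the repeat + linker + last tracr_len nts of tracrRNA."""
    if not 0 < dr_len <= len(repeat) or not 0 < tracr_len <= len(tracr):
        raise ValueError("requested lengths exceed the available sequence")
    return guide + repeat[:dr_len] + linker + tracr[-tracr_len:]


# Placeholder sequences: 'S' = spacer, 'R' = repeat, 'T' = tracrRNA, 'G' = guide.
crrna = mature_crrna("S" * 30, "R" * 36, trim_5prime=8, trim_3prime=18)
sgrna = build_sgrna("G" * 22, "R" * 36, "T" * 80)
print(len(crrna), len(sgrna))  # 40 113
```

With the defaults shown, the scaffold part of the sgRNA is 18 + 4 + 69 = 91 nts, so a 22-nt guide gives 113 nts in total.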
The cleavage by the endonuclease can be further optimized by adjusting the length of the part of the sgRNA recognizing the foreign DNA sequence and the length of the tracrRNA. The early experiments of the disclosure show that the optimal length of the part of the sgRNA recognizing the exogenous DNA sequence is 21 bps, 22 bps or 23 bps (as shown in Figure 8), and that the optimal scaffold is one of the following five, as shown in Figures 3 and 7: 91 nts (18 nts crRNA direct repeat + 69 nts tracrRNA), 89 nts (17 nts crRNA direct repeat + 68 nts tracrRNA), 87 nts (16 nts crRNA direct repeat + 67 nts tracrRNA), 85 nts (15 nts crRNA direct repeat + 66 nts tracrRNA), and 83 nts (14 nts crRNA direct repeat + 65 nts tracrRNA).
As a preferred embodiment of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium according to the disclosure, the Faecalibaculum rodentium-derived Type II CRISPR/Cas9 genome editing system binds or cleaves structures of DNA functions in a genome editing process.
As a preferred embodiment of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium according to the disclosure, the DNA is a DNA of prokaryote or eukaryote.
In some embodiments, the FrCas9 according to the disclosure can recognize a variety of Protospacer Adjacent Motif (PAM) sequences immediately downstream of the targeting sequence and form a DNA double-strand break (DSB). Two important factors are needed for the FrCas9 protein to recognize the targeting sequence: one is the nucleotide sequence complementary to the crRNA spacer; and the other is the PAM sequence adjacent to the complementary sequence. The depletion experiment shows that FrCas9 has a cleavage effect in the prokaryotic system, and preliminarily verifies that the third and fourth positions of the PAM sequence recognized by this newly discovered Type II CRISPR/Cas9 system are TA, as shown in Figure 4. It is further confirmed by interference experiment that the PAM downstream of the targeting sequence recognized by FrCas9 is 5'-NNTA-3', as shown in Figure 5. All the above PAMs are verified by eukaryotic experiment, as shown in Figure 6. By artificially designing the spacer sequence in the crRNA, this CRISPR-Cas9 system can target almost all DNA sequences of interest in the genome, producing a site-specific blunt-ended double-strand break (DSB). Repair of the DSB by non-homologous end joining results in small random insertions/deletions (indels) at the cleavage site that inactivate the gene of interest. Alternatively, by high-fidelity homology-directed repair, precise genomic modifications at the DSB site can be made using homologous repair templates.
Most human genetic diseases are caused by single-base mutations, which cannot be treated by traditional methods. The base editor is one of the latest and most effective ways to achieve accurate genome editing. It uses a CRISPR-Cas protein in nickase form to locate a specific DNA target, and a DNA deaminase to modify this DNA target and correct the disease-causing base without producing a double-strand DNA break. In some embodiments, the combination of E796A nFrCas9 with the optimized fourth-generation cytidine base editor BE4Gam can successfully construct an E796A nFrCas9-BE4Gam cytosine base editor (CBE), such that the C:G base pair in DNA can be mutated to T:A. The combination of E796A nFrCas9 with the seventh-generation adenine base editor ABE7.10 can successfully construct an E796A nFrCas9-ABE7.10 adenine base editor (ABE), such that the A:T base pair in DNA can be mutated to G:C.
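The outcome of the two base editor types on a target strand can be sketched as a windowed substitution: C to T for a cytosine base editor and A to G for an adenine base editor. The editing window used below (positions 4 to 8 counted from the 5' end of the protospacer) is purely a hypothetical placeholder; the actual editing windows of the FrCas9 editors are the subject of Figures 10B and 10C and are not stated in this section. The sketch also idealizes editing as complete conversion of every editable base in the window.

```python
def _edit_in_window(protospacer: str, source: str, product: str,
                    window: tuple[int, int]) -> str:
    """Replace `source` bases by `product` inside a 1-based inclusive window."""
    start, end = window
    return "".join(
        product if base == source and start <= pos <= end else base
        for pos, base in enumerate(protospacer.upper(), start=1)
    )


def cytosine_base_edit(protospacer: str, window: tuple[int, int] = (4, 8)) -> str:
    """Idealized CBE outcome: every C in the (hypothetical) window becomes T."""
    return _edit_in_window(protospacer, "C", "T", window)


def adenine_base_edit(protospacer: str, window: tuple[int, int] = (4, 8)) -> str:
    """Idealized ABE outcome: every A in the (hypothetical) window becomes G."""
    return _edit_in_window(protospacer, "A", "G", window)


print(cytosine_base_edit("ACCACCACCACC"))  # ACCATTATCACC
print(adenine_base_edit("ACCACCACCACC"))   # ACCGCCGCCACC
```

Bases outside the window are left unchanged, which is why designing the guide so that the intended base falls inside the window, and bystander bases fall outside it, matters for base editing.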
In some embodiments, the PAM of FrCas9 of the disclosure is NNTA, which occurs within the TATA-box (one of the elements constituting the eukaryotic promoter), such that FrCas9 can specifically target the TATA-box to exert a unique CRISPR interference (CRISPRi)/activation (CRISPRa) effect. Among them, CRISPRi can be achieved by directly cleaving and thereby destroying the TATA-box with the active form of FrCas9, or by directly binding to the TATA-box with dFrCas9 (i.e., dead Cas9), which has no cleavage activity. CRISPRa can be achieved by, but is not limited to, targeting dFrCas9-VP64 to the TATA-box site.
Prime Editing (PE) is a new, accurate genome editing tool. The technology can replace single bases freely and insert or delete multiple bases accurately, greatly reducing the harmful indel by-products of genome editing and significantly improving editing accuracy. It is widely considered to be a significant advance in genome editing. In some embodiments, the FrCas9 of the disclosure is fused with a reverse transcriptase, and the corresponding pegRNA can be designed according to its PAM sequence to establish a FrCas9-PE genome editing system. Specifically, two sequences are added to the 3' end of the pegRNA. The first sequence is a primer binding site (PBS), which is complementary to the end of the nicked target DNA strand and initiates reverse transcription; the second sequence is a reverse transcription template (RT template), which carries the target point mutation or insertion/deletion mutation to achieve accurate genome editing.
A further purpose of the disclosure is to provide the use of the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium in editing prokaryotic or eukaryotic genes, CRISPR activation or CRISPR interference.
As a preferred embodiment of the use described according to the disclosure, the Type II CRISPR/Cas9 genome editing system derived from Faecalibaculum rodentium is used to bind or cleave structures of DNA function at the DNA level.
Example 1
In the example, the RNA secondary structure of the guide RNA molecule recognized by the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure was predicted. The structure formed by the transcribed tracrRNA and repeat was predicted by simulating the binding of the two RNAs, and the obtained RNA secondary structure is shown in Figure 3.
(1) Materials: Predicted tracrRNA and repeat sequences, and predicted anti-repeat sequences.
(2) Software: NUPACK (http://www.nupack.org/partition/new)
(3) Prediction method: The in vitro interaction of 1 µl each of the two RNAs at 37 °C was simulated with the NUPACK online application, and the secondary structure of the resulting RNA complex was predicted to obtain the RNA secondary structure shown in Figure 2.
As shown in Figure 3, under the action of the tracrRNA and the Cas9 nuclease, 0 to 16 nts upstream of the spacer sequence and 12 to 29 nts downstream of the repeat sequence were removed from the pre-crRNA to form mature crRNA. The mature crRNA was combined with the tracrRNA to form a tracrRNA-crRNA complex that recognizes an exogenous DNA sequence complementary to the spacer sequence, the sequence being 14 to 30 bp in length. The two RNAs were joined into an sgRNA by adding a four-base tetraloop between the downstream end of the crRNA and the upstream end of the tracrRNA; the sgRNA contains two bulges and three duplex structures upstream and three stem-loop structures downstream.
Example 2
In the example, the Protospacer Adjacent Motif (PAM) recognized by the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium in the prokaryotic system was 5'-NNTA-3'.
(1) Materials: Genes related to the CRISPR/Cas9 genome editing system predicted by the above implementation.
(2) Verification method: In this example, a prokaryotic verification system was constructed for the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure, to verify its cleavage effect and preliminarily explore the recognized PAM sequence by a second generation sequencing technology, and the result is shown in Figure 4.
Detailed protocols are as follows.
(a) The Faecalibaculum rodentium Type II CRISPR/Cas9 genome editing system (comprising an endonuclease Cas9, helper proteins Cas1, Cas2, Csn2, a CRISPR array and a non-coding RNA tracrRNA) of the disclosure was inserted into a pACYC184 vector, wherein the Cas9 coding sequence was codon-optimized for Escherichia coli. The natural spacer sequences and the library spacer sequences were added into the CRISPR array, and a strong heterologous promoter J23119 was added upstream of the Cas9 protein and the CRISPR array to construct the prokaryotic expression plasmid pACYC184-FrCas9.
(b) Seven random bases (16,384 possible insertions in total) were added 3' of the library spacer sequence. The two restriction endonuclease sites EcoRI and NcoI were selected from the multiple cloning site (MCS) of the pUC19 vector. The library was cloned into the vector to construct the target-library plasmid.
(c) The plasmid pACYC184-FrCas9 or empty pACYC184 was co-electroporated with the target-library plasmid into E. coli DH5α. After resuscitation at C for 2 h, the cells were spread uniformly on SOB medium containing both ampicillin sodium (100 µg/mL) and chloramphenicol (34 µg/mL) and incubated at 25 °C for 30 h, and the plasmids were collected by alkaline lysis.
(d) PCR amplification was performed on the region containing the spacer sequence and the seven random bases. Adapters were added to both ends of the PCR product for second-generation sequencing. The PAM depletion value (PPDV) relative to an empty-vector control group was calculated. The PAM sequence of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium was generated with WebLogo, wherein the PAM sequence is 5'-NNTA-3'.
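The library size and the idea of a depletion score in steps (b) to (d) can be sketched as follows. The text does not give the formula for the PAM depletion value, so the ratio below (normalised frequency in the empty-vector control divided by normalised frequency in the FrCas9 sample) is only an assumed stand-in, and the function names, the pseudocount and the toy counts in the usage note are hypothetical.

```python
from itertools import product


def pam_library(n_random=7):
    """Enumerate every possible run of n random bases (4**n sequences)."""
    return ["".join(bases) for bases in product("ACGT", repeat=n_random)]


def depletion_ratios(control_counts, treated_counts, pseudocount=1.0):
    """Assumed depletion score per PAM: control frequency / treated frequency.

    Values well above 1 indicate PAMs that were lost in the FrCas9 sample.
    The pseudocount avoids division by zero for fully depleted PAMs.
    """
    pams = set(control_counts) | set(treated_counts)
    control_total = sum(control_counts.values()) + pseudocount * len(pams)
    treated_total = sum(treated_counts.values()) + pseudocount * len(pams)
    ratios = {}
    for pam in pams:
        control_freq = (control_counts.get(pam, 0) + pseudocount) / control_total
        treated_freq = (treated_counts.get(pam, 0) + pseudocount) / treated_total
        ratios[pam] = control_freq / treated_freq
    return ratios
```

With toy counts such as `{"GGTA": 100, "GGGG": 100}` for the control and `{"GGTA": 10, "GGGG": 100}` for the FrCas9 sample, the score for `GGTA` is the larger of the two, which is the pattern a sequence logo of depleted PAMs would summarise.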
Figure 3 is a schematic diagram of the conserved PAM sequence recognized in a prokaryotic system by the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure. Second-generation sequencing analysis was performed on library DNA obtained through a depletion experiment to calculate the PAM depletion value (PPDV) relative to an empty-vector control group. The PAM sequence of FrCas9 generated by WebLogo is 5'-NNTA-3'.
Example 3
In the example, the Protospacer Adjacent Motif (PAM) recognized in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure was verified by an interference experiment, and its cleaving ability at the prokaryotic level and potential genome editing ability in eukaryotes were determined. The results of the interference experiments, together with schematic diagrams of the various possible PAM sequences recognized by the system, are shown in Figure 5.
(1) Materials: The pACYC184-FrCas9, target-library plasmid obtained in Example 4, and the preliminarily recognized PAM.
(2) Verification method: In the example, the Protospacer Adjacent Motif (PAM) recognized in a prokaryotic system by the Type II CRISPR/Cas9 genome editing system of the Faecalibaculum rodentium according to the disclosure was further determined through an interference experiment.
Detailed protocols are as follows.
(a) A total of 16 combined sequences were obtained by adding NNTA to the 3' end of the spacer sequence, and each was cloned into pUC19 through the EcoRI and NcoI restriction endonuclease sites to construct the target plasmids.
(b) The 16 target plasmids were each electroporated into electrocompetent E. coli DH5α cells containing the FrCas9-related loci, and the cultures were serially diluted after resuscitation at 25 °C for 2 h. The dilutions were spotted onto SOB medium containing both ampicillin sodium (100 µg/mL) and chloramphenicol (34 µg/mL) and incubated overnight at 25 °C, and the number of single colonies was observed.
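The 16 combinations in step (a) follow from the two free positions of NNTA (4 x 4). A minimal sketch of the enumeration is given below; the function names are illustrative and the spacer passed to `target_inserts` is a placeholder, not a sequence from the disclosure.

```python
from itertools import product


def nnta_pams():
    """Return the 16 PAMs obtained by varying the two N positions of NNTA."""
    return ["".join(nn) + "TA" for nn in product("ACGT", repeat=2)]


def target_inserts(spacer):
    """Map each PAM to the insert formed by placing it 3' of the spacer."""
    return {pam: spacer.upper() + pam for pam in nnta_pams()}
```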
Figure 5 is a schematic diagram of an interference experiment of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure. Based on the NNTA found in the depletion experiment, 16 target plasmids with different combinations of NN were constructed. The single-colony count was observed in the interference experiment. In Figure 5A, the leftmost column is FrCas9 alone, and the right side shows the target plasmids subjected to FrCas9 cleavage. The results showed that the colony count in the right column was decreased. Figure 5B is a statistical diagram of the cleaving effects on the PAM sequences recognized by the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure, and the Figure illustrates that the cfu of target plasmids with 14 different combinations of NN is decreased as compared with that of the control group. The above interference results indicated that the Protospacer Adjacent Motif (PAM) recognized by the newly discovered CRISPR/Cas9 system in the prokaryotic system was 5'-NNTA-3' (N represents any base selected from the group consisting of A, T, C and G).
Example 4
In the example, the tracrRNA range required for cleavage of targeted DNA sequences in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure was verified by interference experiments, and the results of the interference experiments are shown in Figure 5C.
(1) Materials: pACYC184-FrCas9, target plasmid, preliminarily recognized PAM obtained in Example 5.
(2) Verification method: In the example, the tracrRNA range required for cleaving the targeted DNA sequence was further determined in a prokaryotic system by the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium.
Detailed protocols are as follows.
(a) By the Gibson cloning method, a total of six gene sequences covering all possible non-coding regions in wild-type Faecalibaculum rodentium were cloned into a target plasmid to construct the target-NC1 to target-NC6 plasmids, wherein NC6 is formed by splicing together the full length of the non-coding regions, and NC1 to NC5 represent the first to fifth possible non-coding segments, respectively. In each NC plasmid, a strong heterologous promoter J23119 was added upstream of the non-coding region, wherein "+" represents a forward non-coding region and "-" represents a reverse non-coding region;
(b) From the pACYC184-FrCas9 obtained in Example 5, all possible non-coding regions were removed by a PCR homologous recombination method, and the CRISPR-related protein and CRISPR array gene sequences were retained, thus constructing pACYC184-ΔFrCas9;
(c) The 12 target plasmids were each electroporated into electrocompetent E. coli DH5α cells containing pACYC184-ΔFrCas9, and the cultures were serially diluted after resuscitation at 25 °C for 2 h. The dilutions were spotted onto SOB medium containing both ampicillin sodium (100 µg/mL) and chloramphenicol (34 µg/mL) and incubated overnight at 25 °C, and the number of single colonies was observed.
Figure 5C is a schematic diagram of the interference experiment of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure, and the results show that the single colonies of NC1 and NC6 were significantly fewer than those of NC2 to NC5, indicating that the candidate non-coding region in the first segment can assist FrCas9 in effective, targeted cleavage of DNA sequences in E. coli.
Example 5
In the example, the ability of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium to cleave targeted DNA sequences in eukaryotic cells was verified by an ODN experiment. The ODN-PCR result is shown in Figure 6A, and the Sanger sequencing result is shown in Figure 6B.
(1) Materials: All amino acid sequences, CRISPR array sequences, tracrRNA sequences, and the recognized PAM of the Faecalibaculum rodentium editing gene obtained in Examples 1-6.
(2) Verification method: In the example, the ability of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium to cleave targeted DNA sequences in eukaryotic cells was verified through an ODN experiment.
Detailed protocols are as follows.
(a) The human codon-optimized FrCas9 protein sequence was synthesized and cloned into a PX330 eukaryotic vector to construct a PX330-FrCas9 plasmid.
(b) The 30 bp upstream of NGTA, NATA, NCTA and NTTA on the human RNF2 gene were selected as target sequences. A CRISPR array was constructed by the Gibson method. The strong human U6 promoter was added upstream of the CRISPR array to construct a PX330-FrCas9-array plasmid.
(c) The strong mouse U6 promoter was added upstream of the tracrRNA determined in Example 4 to construct a PX330-FrCas9-array-tracrRNA plasmid.
(d) 2.5 µg of the PX330-FrCas9-array-tracrRNA plasmids constructed above, targeting different gene sites, and 1.5 µl of ODN were electroporated into HEK293T cells in good condition. All the cells were collected after 72 h for DNA extraction.
(e) ODN-PCR was performed with a pair of primers, one designed near the targeted RNF2 gene site and one on the ODN sequence, and agarose gel electrophoresis was performed to check for a band as preliminary identification of a targeted cleavage event.
(f) Sanger sequencing verified the successful insertion of the ODN into the target site and confirmed that FrCas9 had the ability to edit the target DNA in eukaryotic cells.
The results of ODN-PCR are shown in Figure 6A. NGTA, NATA, NCTA and NTTA all gave target bands of the correct size. As shown in the Sanger sequencing results in Figure 8, the ODN was inserted 3 to 4 bp upstream of the PAM, indicating that the cleavage position of FrCas9 on the target gene was consistent with that previously reported for SpCas9.
Example 6
In the example, the optimal length of the sgRNA recognition part in eukaryotic cells in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure was verified by an ODN experiment, and the results are shown in Figure 7.
(1) Materials: All amino acid sequences, sgRNA sequences and the recognized PAM of the Faecalibaculum rodentium editing gene obtained in Examples 1-6;
(2) Verification method: In the example, the optimal length of the sgRNA recognition part in eukaryotic cells in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium was verified through an ODN experiment.
Detailed protocols are as follows.
(a) The human codon-optimized FrCas9 protein sequence was synthesized and cloned into a PX330 eukaryotic vector to construct a PX330-FrCas9 plasmid.
(b) The 30 bp upstream of GGTA, near a SpCas9 target site found to have a good cleavage effect in a previous study, was selected as the target sequence, and sgRNAs with recognition lengths of 19 to 23 bp were constructed by the Gibson method. The strong human U6 promoter was added upstream of the sgRNA to construct a PX330-FrCas9-sgRNA plasmid.
(c) 2.5 µg of the PX330-FrCas9-sgRNA plasmids constructed above, targeting different gene sites, and 1.5 µl of ODN were electroporated into HEK293T cells in good condition. All the cells were collected after 72 h for DNA extraction.
(d) ODN-PCR was performed with a pair of primers, one designed near the target gene site and one on the ODN sequence, and agarose gel electrophoresis was conducted to check for a band as preliminary identification of a targeted cleavage event and to compare the band intensities of sgRNAs with recognition lengths of 19 to 23 bp.
(e) An amplicon high-throughput sequencing library was built to quantify the indel rate of the target region, and the cleavage effects of sgRNAs with recognition lengths of 19 to 23 bp were compared.
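The indel rate in step (e) is, in essence, the fraction of amplicon reads that carry an insertion or deletion. The text does not describe the analysis pipeline, so the sketch below assumes the reads have already been aligned to the reference amplicon by some external aligner and that each alignment is summarised by a CIGAR string; both the input format and the function names are assumptions.

```python
import re

# An insertion (I) or deletion (D) operation inside a CIGAR string.
INDEL_OP = re.compile(r"\d+[ID]")


def has_indel(cigar):
    """Return True if the alignment contains at least one insertion or deletion."""
    return bool(INDEL_OP.search(cigar))


def indel_rate(cigars):
    """Fraction of aligned reads that carry an indel (0.0 for an empty input)."""
    cigars = list(cigars)
    if not cigars:
        return 0.0
    return sum(has_indel(c) for c in cigars) / len(cigars)
```

A real analysis would usually count only indels that overlap a window around the expected cleavage site; that refinement is omitted here to keep the sketch short.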
As shown in Figure 7, the optimal sgRNA recognition length for FrCas9 was 21 bps, 22 bps, or 23 bps.
Example 7
In the example, the optimal length of an sgRNA scaffold in eukaryotic cells in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure was verified by an ODN experiment, and the results are shown in Figure 8.
(1) Materials: All amino acid sequences, sgRNA sequences and the recognized PAM of the Faecalibaculum rodentium editing gene obtained in Examples 1-6.
(2) Verification method: In the example, an ODN experiment was used to verify the optimal length of the sgRNA scaffold in eukaryotic cells in the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium.
Detailed protocols are as follows.
(a) The human codon-optimized FrCas9 protein sequence was synthesized and cloned into a PX330 eukaryotic vector to construct a PX330-FrCas9 plasmid.
(b) The 30 bp upstream of GGTA, near a SpCas9 target site found to have a good cleavage effect in a previous study, was selected as the target sequence, and sgRNAs with scaffold lengths of 71 nts to 95 nts were constructed by the Gibson method. The strong human U6 promoter was added upstream of the sgRNA to construct a PX330-FrCas9-sgRNA plasmid.
(c) 2.5 µg of the PX330-FrCas9-sgRNA plasmids constructed above, targeting different gene sites, and 1.5 µl of ODN were electroporated into HEK293T cells in good condition. All the cells were collected after 72 h for DNA extraction.
(d) ODN-PCR was performed with a pair of primers, one designed near the targeted gene site and one on the ODN sequence, and agarose gel electrophoresis was conducted to check for bands as preliminary identification of whether targeted cleavage occurred, and the band intensities of sgRNAs with scaffold lengths of 71 nts to 95 nts were compared.
(e) An amplicon high-throughput sequencing library was built to quantify the indel rate of the target region, and the cleavage effects of sgRNAs with scaffold lengths of 71 nts to 95 nts were compared.
As shown in Figures 8 and 3, the optimal sgRNA scaffold length for FrCas9 is 83 nts to 91 nts.
Example 8
In the example, an ODN experiment verified that, in eukaryotic cells, the cleaving effect of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium was higher than that of SpCas9 and its off-target rate was lower than that of SpCas9.
(1) Materials: All amino acid sequences, 22 bp sgRNA sequence, 5'-GGTA-3' PAM sequence, SpCas9 amino acid sequence, SpCas9 sgRNA sequence and SpCas9 5'-NGG-3' PAM sequence of the Faecalibaculum rodentium editing gene obtained in Examples 1-6;
(2) Verification method: In the example, the GUIDE-seq experiment was used to verify that, in eukaryotic cells, the cleaving effect of the Type II CRISPR/Cas9 genome editing system of Faecalibaculum rodentium according to the disclosure was higher than that of SpCas9 and the off-target rate was lower than that of SpCas9.
Detailed protocols are as follows.
(a) The human codon-optimized FrCas9 protein sequence was synthesized and cloned into a PX330 eukaryotic vector to construct a PX330-FrCas9 plasmid.
(b) The 30 bp upstream of GGTA, near a SpCas9 target site found to have a good cleavage effect in a previous study, was selected as the target sequence, and an sgRNA with a recognition length of 22 bp was constructed by the Gibson method. The strong human U6 promoter was added upstream of the sgRNA to construct a PX330-FrCas9-sgRNA plasmid, and a PX330-SpCas9-sgRNA plasmid carrying a 20 bp SpCas9 sgRNA was constructed at the same time.
(c) 2.5 µg of the PX330-FrCas9-sgRNA plasmid or the PX330-SpCas9-sgRNA plasmid constructed above, targeting different gene sites, and 1.5 µl of ODN were electroporated into HEK293T cells in good condition. All the cells were collected after 72 h for DNA extraction.
(d) ODN-PCR was performed with a pair of primers, one designed near the targeted gene site and one on the ODN sequence, and agarose gel electrophoresis was conducted to check for bands as preliminary identification of whether targeted cleavage occurred, and the DNA showing bands was used for GUIDE-seq library construction.
(e) Through bioinformatics analysis, the on-target cleaving effects and off-targets of SpCas9 and FrCas9 at the same site were compared.
As shown in Figure 9, at the DYRK1A-T2 site FrCas9 had 3257 on-target reads, higher than the 2456 on-target reads of SpCas9. Meanwhile, no off-target was detected for FrCas9, while SpCas9 showed off-targets at three sites. At the GRIB2B-T9 site FrCas9 had 34970 on-target reads, higher than the 20434 on-target reads of SpCas9. Meanwhile, no off-target was detected for FrCas9, while SpCas9 showed off-targets at 3 sites. The above data indicated that FrCas9 was a Cas9 protein superior to SpCas9 in cleavage efficiency and specificity.
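The comparison above can be expressed as a simple ratio of on-target read counts. The sketch below uses only the four read counts and the off-target site counts quoted in the text; the dictionary layout and function name are illustrative, and the site names are copied as they appear above.

```python
# On-target GUIDE-seq read counts quoted in the text for the two sites.
ON_TARGET_READS = {
    "DYRK1A-T2": {"FrCas9": 3257, "SpCas9": 2456},
    "GRIB2B-T9": {"FrCas9": 34970, "SpCas9": 20434},
}

# Number of off-target sites reported in the text for each nuclease.
OFF_TARGET_SITES = {
    "DYRK1A-T2": {"FrCas9": 0, "SpCas9": 3},
    "GRIB2B-T9": {"FrCas9": 0, "SpCas9": 3},
}


def fold_over_spcas9(site):
    """Ratio of FrCas9 to SpCas9 on-target reads at one site."""
    reads = ON_TARGET_READS[site]
    return reads["FrCas9"] / reads["SpCas9"]
```

On these numbers the ratio is roughly 1.33 at the first site and roughly 1.71 at the second.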
Example 9 FrCas9 can be used for base editing
Most human genetic diseases are caused by single-base mutations, which cannot be treated by traditional methods. The base editor is one of the latest and most effective ways to achieve accurate genome editing. It uses a CRISPR-Cas protein in nickase form to locate specific DNA targets and a DNA deaminase to modify the DNA target, correcting the disease-causing bases without producing double-strand DNA breaks. The cytosine base editor (CBE) mutates the C:G base pair in DNA to T:A, and the adenine base editor (ABE) mutates the A:T base pair to G:C.
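The two conversions described above can be illustrated on the protospacer strand with a short sketch. The window passed to the function is a free parameter chosen by the caller and is not taken from the disclosure; positions are assumed to be 1-based and counted from the 5' end of the protospacer, and the function name is illustrative.

```python
# Conversions described in the text: a CBE turns C into T, an ABE turns A into G.
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def apply_base_edit(protospacer, editor, window):
    """Convert every editable base inside a 1-based, inclusive window.

    `window` is a (first, last) pair supplied by the caller; it is a
    hypothetical parameter, not a measured editing window.
    """
    source, converted = CONVERSIONS[editor]
    first, last = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if first <= position <= last and base == source:
            edited.append(converted)
        else:
            edited.append(base)
    return "".join(edited)
```

The sketch converts every matching base in the window, whereas real editors convert each base with some probability.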
First, the three point mutations E796A, H1010A and D1013A were each introduced to generate different FrCas9 nickases (nFrCas9). Then, E796A nFrCas9 was combined with the optimized fourth-generation cytidine base editor BE4Gam and with the seventh-generation adenine base editor ABE7.10. It is observed that the editing window of FrCas9-BE4Gam was 6th th (Figure 10B) bases and that of FrCas9-ABE7.10 was 6t1h t1h bases (Figure 10C). Based on the above characteristics, the targeting scopes of FrCas9-BE4Gam and FrCas9-ABE7.10 in the ClinVar database were calculated. Of the pathogenic mutations that could be precisely corrected by FrCas9-BE4Gam, 90.38% (235/260) were unique events not covered by SpCas9-BE4Gam. Of the pathogenic mutations that could be precisely corrected by FrCas9-ABE7.10, 92.21% (1196/1297) were unique events not covered by SpCas9-ABE7.10 (Figure 10D). Therefore, the TA-rich PAM of FrCas9 greatly expanded the targets in the human genome at which base editors can correct human disease-associated mutations.
Furthermore, the PAM of FrCas9 (5'-NNTA-3') is palindromic, which allows sgRNAs to exist in "back-to-back" pairs (Figure 10E). This feature could broaden the scope of FrCas9 base editors by modifying two close editing windows at the same time (Figure 10E) and increase the target distribution and density of FrCas9 sgRNAs. The distributions of 5'-GG-3' (representing the SpCas9 PAM) and 5'-TA-3' (representing the FrCas9 PAM) in the human genome were calculated (Figures 11A and 11B). Compared to SpCas9 (median = 5 bp, mean = 8.66 bp), FrCas9 showed a denser distribution (median = 1 bp, mean = 6.16 bp) in the human genome, providing additional applicable loci.
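One way to compare PAM densities of this kind is to collect every occurrence of the dinucleotide on either strand and summarise the spacing between neighbours. The sketch below does this with start-to-start distances; the text does not state how its distances were measured, so this definition is an assumption, the function names are illustrative, and the sketch is not expected to reproduce the reported medians and means.

```python
import re
from statistics import mean, median

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def motif_starts(seq, motif):
    """Sorted plus-strand start positions of a motif found on either strand."""
    seq = seq.upper()
    starts = set()
    # A palindromic motif such as TA equals its own reverse complement,
    # so the set below then holds a single pattern.
    for pattern in {motif, reverse_complement(motif)}:
        starts.update(m.start() for m in re.finditer(f"(?={pattern})", seq))
    return sorted(starts)


def gap_stats(seq, motif):
    """Return (median, mean) start-to-start distance between neighbouring sites."""
    positions = motif_starts(seq, motif)
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if not gaps:
        return None
    return median(gaps), mean(gaps)
```

Calling `gap_stats(sequence, "TA")` and `gap_stats(sequence, "GG")` on the same sequence gives a like-for-like comparison under this assumed distance definition.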
Example 10 FrCas9 can target TATA-box to modulate gene expression as an effective tool for CRISPR activation and inhibition
The TATA-box (TATA box/Hogness box) is one of the elements that constitute the promoter of eukaryotes. Its consensus sequence is TATA(A/T)A(A/T) (non-template strand sequence). It lies about 30 bp (-25 to -32 bp) upstream of the transcription initiation point of most eukaryotic genes and consists essentially of A-T base pairs, and it determines the selection of the transcription initiation site of genes. It is one of the binding sites of RNA polymerase, which can start transcription only after binding firmly to the TATA-box. Since the PAM sequence of FrCas9 is NNTA, it has natural advantages in targeting the TATA-box.
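The relationship between the consensus TATA(A/T)A(A/T) and the NNTA PAM can be shown with two regular expressions. The sketch below scans only the given strand, and its function names and example sequence are illustrative assumptions rather than material from the disclosure.

```python
import re

# Consensus from the text, written as a regular expression (lookahead so
# that overlapping matches are all reported).
TATA_BOX = re.compile(r"(?=(TATA[AT]A[AT]))")
NNTA_PAM = re.compile(r"(?=([ACGT]{2}TA))")


def tata_boxes(seq):
    """Return (start, sequence) for every TATA-box consensus match."""
    return [(m.start(), m.group(1)) for m in TATA_BOX.finditer(seq.upper())]


def pams_overlapping(seq, box_start, box_len=7):
    """Return (start, pam) for every NNTA PAM overlapping a TATA-box."""
    box_end = box_start + box_len
    return [
        (m.start(), m.group(1))
        for m in NNTA_PAM.finditer(seq.upper())
        if m.start() < box_end and m.start() + 4 > box_start
    ]
```

Because the first four bases of the consensus are themselves TATA, every consensus match contains at least one NNTA PAM, which is the "natural advantage" noted above.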
FrCas9 CRISPR interference (CRISPRi) was tested in three genes with TATA-box promoters, ABCA1, UCP3 and RANKL (Figure 12A). By cleaving the TATA-box, FrCas9 reduced ABCA1, UCP3 and RANKL expression by 31.37%, 49.91% and 39.62%, respectively. Meanwhile, dFrCas9, which has no cleavage activity, was used to bind directly to the TATA-box, and the expression of ABCA1, UCP3 and RANKL decreased by 61.67%, 45.61% and 42.60%, respectively (Figure 12B). Accordingly, FrCas9 possesses unique potential for efficient genome engineering of TATA-box related genetic diseases.
Further, FrCas9 CRISPR activation (CRISPRa) using dFrCas9-VP64 directly targeting the TATA-box was tested, and its performance was compared with that of dSpCas9-VP64 targeting the region upstream of the TATA-box. The CRISPRa experiments were conducted in the ABCA1, SOD1, GH1 and BLM2 genes. The results showed that dFrCas9-VP64 enables effective transcriptional activation. Moreover, the fold activation by dFrCas9-VP64 in ABCA1, GH1 and BLM2 was higher than that by dSpCas9-VP64, while the fold activation of the SOD1 gene was comparable to that by dSpCas9-VP64 (Figure 12C). Therefore, FrCas9 is a promising tool for CRISPR screening due to its unique 5'-NNTA-3' PAM.
Example 11 FrCas9 can be used for Prime Editing, PE
Prime Editing (PE) is a new, precise gene editing tool. This technology can greatly reduce the harmful indel by-products of the gene editing process, significantly improve editing accuracy, and has the potential to overcome fundamental barriers to the treatment of genetic diseases with existing gene editing methods. The PE system consists of two components: an engineered guide RNA (pegRNA) and a prime editor protein. The pegRNA has two functions: directing the editing protein to the target site and carrying the edit template sequence. The prime editor protein consists of a mutated Cas9 nickase (which cuts only one DNA strand) fused to a reverse transcriptase. After Cas9 nicks the target site, the reverse transcriptase uses the pegRNA as a template for reverse transcription, thereby incorporating the desired edit into the DNA strand, and the corrected sequence preferentially replaces the original genomic DNA, thereby permanently editing the target site (Figure 13).
Based on the biological characteristics of FrCas9, FrCas9 was fused with a reverse transcriptase, the corresponding pegRNA was designed according to its PAM sequence, the FrCas9-PE gene editing system was established, and its gene editing efficiency was optimized. Compared with the sgRNA, the 3' end of the pegRNA carries two additional sequences. The first sequence is the primer binding site (PBS), which is complementary to the end of the nicked target DNA strand and initiates the reverse transcription process. The second sequence is a reverse transcription template (RT template), which carries the target point mutations or indel mutations to achieve precise gene editing. Previous studies have shown that the lengths of the PBS and RT template sequences significantly affect the gene editing efficiency of the PE system, and that the effect varies by gene locus. Therefore, we explored the optimization of editing efficiency through combinations of PBS and RT template sequences of different lengths.
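The layout of the pegRNA 3' extension described above can be sketched as follows: the PBS is the reverse complement of the bases just upstream of the nick on the nicked strand, and the RT template is the reverse complement of the desired edited sequence downstream of the nick. The nick position, the PBS length and the scaffold are all caller-supplied parameters here; none of them is taken from the disclosure, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def peg_extension(nicked_strand, nick_index, edited_downstream, pbs_len):
    """Build the 3' extension (RT template followed by PBS), written 5'->3' as DNA.

    nicked_strand     : the strand that is nicked, written 5'->3'
    nick_index        : number of bases 5' of the nick (hypothetical input)
    edited_downstream : desired sequence 3' of the nick, including the edit
    pbs_len           : PBS length in bases (hypothetical input)
    """
    if nick_index < pbs_len:
        raise ValueError("not enough sequence upstream of the nick for the PBS")
    # The PBS pairs with the 3' end of the nicked strand, which primes synthesis.
    pbs = reverse_complement(nicked_strand[nick_index - pbs_len:nick_index])
    # Copying the RT template extends the nicked strand with the edited sequence.
    rt_template = reverse_complement(edited_downstream)
    return rt_template + pbs


def assemble_pegrna(spacer, scaffold, extension):
    """Join spacer, scaffold and 3' extension and write the result as RNA."""
    return (spacer + scaffold + extension).upper().replace("T", "U")
```

Varying `pbs_len` and the length of `edited_downstream` over a grid is one simple way to generate the PBS and RT template combinations mentioned in the text.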
At the cellular level, the gene editing effect of the FrCas9-PE system was preliminarily verified. In HEK293T cells, the RNF2 gene locus, which is commonly used for evaluating CRISPR gene editing efficiency, was targeted to verify the gene editing function of the FrCas9-PE system. Our current experimental results show that FrCas9-PE can produce site-specific base editing effects. We will further modify the existing FrCas9-PE, including codon optimization, optimization of the position and number of nuclear localization sequences, reverse transcriptase modification, etc. (Figure 14).
The reference to any prior art in this specification is not, and should not be taken as, an acknowledgement or any form of suggestion that such prior art forms part of the common general knowledge.
It will be understood that the terms "comprise" and "include" and any of their derivatives (e.g. comprises, comprising, includes, including) as used in this specification, and the claims that follow, are to be taken to be inclusive of features to which the term refers, and are not meant to exclude the presence of any additional features unless otherwise stated or implied.
In some cases, a single embodiment may, for succinctness and/or to assist in understanding the scope of the disclosure, combine multiple features. It is to be understood that in such a case, these multiple features may be provided separately (in separate embodiments), or in any other suitable combination. Alternatively, where separate features are described in separate embodiments, these separate features may be combined into a single embodiment unless otherwise stated or implied. This also applies to the claims, which can be recombined in any combination. That is, a claim may be amended to include a feature defined in any other claim. Further, a phrase referring to "at least one of" a list of items refers to any combination of those items, including single members. As an example, "at least one of: a, b, or c" is intended to cover: a, b, c, a-b, a-c, b-c, and a-b-c.
It will be appreciated by those skilled in the art that the disclosure is not restricted in its use to the particular application or applications described. Neither is the present disclosure restricted in its preferred embodiment with regard to the particular elements and/or features described or depicted herein. It will be appreciated that the disclosure is not limited to the embodiment or embodiments disclosed, but is capable of numerous rearrangements, modifications and substitutions without departing from the scope as set forth and defined by the following claims.
SEQUENCE LISTING 21 Feb 2022
<110> ZHUHAI SHU TONG MEDICAL TECHNOLOGY CO., LTD
<120> TYPE II CRISPR/CAS9 GENOME EDITING SYSTEM AND THE APPLICATION THEREOF
<130>
<160> 33
<170> PatentIn version 3.3
<210> 1 <211> 1372 <212> PRT <213> Faecalibaculum rodentium
<400> 1 Met Cys Thr Lys Glu Ser Glu Lys Leu Asn Lys Asn Ala Asp Tyr Tyr 1 5 10 15 Ile Gly Leu Asp Met Gly Thr Ser Ser Ala Gly Trp Ala Val Ser Asp 20 25 30 Ser Glu Tyr Asn Leu Ile Arg Arg Lys Gly Lys Asp Leu Trp Gly Val 35 40 45 Arg Gln Phe Glu Glu Ala Lys Thr Ala Ala Glu Arg Arg Gly Phe Arg 50 55 60 Val Ala Arg Arg Arg Lys Gln Arg Gln Gln Val Arg Asn Arg Leu Leu 65 70 75 80 Ser Glu Glu Phe Gln Asn Glu Ile Thr Lys Ile Asp Ser Gly Phe Leu 85 90 95 Lys Arg Met Glu Asp Ser Arg Phe Val Ile Ser Asp Lys Arg Val Pro 100 105 110 Glu Lys Tyr Thr Leu Phe Asn Asp Ser Gly Tyr Thr Asp Val Glu Tyr 115 120 125 Tyr Asn Gln Tyr Pro Thr Ile Tyr His Leu Arg Lys Ala Leu Ile Glu 130 135 140 Ser Asn Glu Arg Phe Asp Ile Arg Leu Val Phe Leu Gly Ile His Ser 145 150 155 160 Leu Phe Gln His Pro Gly His Phe Leu Asp Lys Gly Asp Val Asp Thr 165 170 175 Asp Asn Thr Gly Pro Glu Glu Leu Ile Gln Phe Leu Glu Asp Cys Met 180 185 190 Asn Glu Ile Gln Ile Ser Ile Pro Leu Val Ser Asn Gln Lys Val Leu 195 200 205 Thr Asp Ile Leu Thr Asp Ser Arg Ile Thr Arg Arg Asp Lys Glu Gln 210 215 220 Gln Ile Leu Glu Ile Leu Gln Pro Asn Lys Glu Ser Lys Lys Ala Val 225 230 235 240 Ser Gln Phe Val Lys Val Leu Thr Gly Gln Lys Ala Lys Leu Gly Asp 245 250 255 Leu Ile Met Met Glu Asp Lys Asp Thr Glu Glu Tyr Lys Tyr Ser Phe 260 265 270
Ser Phe Arg Glu Lys Thr Leu Glu Glu Ile Leu Pro Asp Ile Glu Gly
275 280 285 Val Ile Asp Gly Leu Ala Leu Glu Tyr Ile Glu Ser Ile Tyr Ser Leu 290 295 300 Tyr Ser Trp Ser Leu Leu Asn Ser Tyr Met Lys Asp Thr Leu Thr Gly 305 310 315 320 His Tyr Tyr Ser Tyr Leu Ala Glu Ala Arg Val Ala Ala Tyr Asp Lys 325 330 335 His His Ser Asp Leu Val Lys Leu Lys Thr Leu Phe Arg Glu Tyr Ile 340 345 350
Pro Glu Glu Tyr Asp Asn Phe Phe Arg Lys Met Glu Lys Ala Asn Tyr 355 360 365 Ser His Tyr Ile Gly Ser Thr Glu Tyr Asp Gly Glu Lys Arg Cys Arg 370 375 380 Thr Ala Lys Ala Lys Gln Glu Asp Phe Tyr Lys Ser Ile Asn Lys Met 385 390 395 400 Leu Glu Lys Ile Pro Glu Cys Ser Glu Lys Thr Glu Ile Gln Lys Glu 405 410 415 Ile Ile Glu Gly Thr Phe Leu Leu Lys Gln Thr Gly Pro Gln Asn Gly 420 425 430 Phe Val Pro Asn Gln Leu Gln Leu Lys Glu Leu Arg Lys Ile Leu Gln 435 440 445 Asn Ala Ser Lys His Tyr Pro Phe Leu Thr Glu Lys Asp Glu Arg Asp 450 455 460 Met Thr Ala Ile Asp Arg Ile Glu Ala Leu Phe Ser Phe Arg Ile Pro 465 470 475 480 Tyr Tyr Ile Gly Pro Leu Lys Asn Thr Asp Asn Gln Gly His Gly Trp 485 490 495 Ala Val Arg Arg Asp Gly His Glu Gln Ile Pro Val Arg Pro Trp Asn 500 505 510 Phe Glu Glu Ile Ile Asp Glu Ser Ala Ser Ala Asp Leu Phe Ile Lys 515 520 525 Asn Leu Val Asn Ser Cys Thr Tyr Leu Arg Thr Glu Lys Val Leu Pro 530 535 540 Lys Ser Ser Leu Leu Tyr Gln Glu Phe Glu Val Leu Asn Glu Leu Asn 545 550 555 560 Asn Leu Arg Ile Asn Gly Met Tyr Pro Asp Glu Ile Gln Pro Gly Leu 565 570 575 Lys Arg Met Ile Phe Glu Gln Cys Phe Tyr Ser Gly Lys Lys Val Thr 580 585 590 Gly Lys Lys Leu Gln Leu Phe Leu Arg Ser Val Leu Thr Asn Ser Ser 595 600 605 Thr Glu Glu Phe Val Leu Thr Gly Ile Asp Lys Asp Phe Lys Ser Ser 610 615 620 Leu Ser Ser Tyr Lys Lys Phe Cys Glu Leu Phe Gly Val Lys Thr Leu 625 630 635 640 Asn Asp Thr Gln Lys Val Met Ala Glu Gln Ile Ile Glu Trp Ser Thr 645 650 655 Val Tyr Gly Asp Ser Arg Lys Phe Leu Lys Arg Lys Leu Glu Asp Asn 660 665 670 Tyr Pro Glu Leu Thr Asp Gln Gln Ile Arg Arg Ile Ala Gly Phe Lys 675 680 685 Phe Ser Glu Trp Gly Asn Leu Ser Arg Ala Phe Leu Glu Met Glu Gly 690 695 700
Tyr Lys Asp Glu Ala Gly Asn Pro Val Thr Ile Ile Arg Ala Leu Arg
705 710 715 720 Asp Thr Gln Lys Asn Leu Met Gln Leu Leu Ser Asn Asp Ser Ala Phe 725 730 735 Ala Lys Lys Leu Gln Glu Leu Asn Asp Tyr Val Thr Arg Asp Ile Trp 740 745 750 Ser Ile Glu Pro Asp Asp Leu Asp Gly Met Tyr Leu Ser Ala Pro Val 755 760 765 Arg Arg Met Ile Trp Gln Thr Phe Leu Ile Leu Arg Glu Val Val Asp 770 775 780
Thr Ile Gly Tyr Ser Pro Lys Lys Ile Phe Met Glu Met Ala Arg Gly 785 790 795 800 Glu Gln Glu Lys Lys Arg Thr Ala Ser Arg Lys Lys Gln Leu Ile Asp 805 810 815 Leu Tyr Lys Glu Ala Gly Met Lys Asn Asp Glu Leu Phe Gly Asp Leu 820 825 830 Glu Ser Leu Glu Glu Ala Gln Leu Arg Ser Lys Lys Leu Tyr Leu Tyr 835 840 845 Phe Arg Gln Met Gly Arg Asp Ile Tyr Ser Gly Lys Leu Ile Asp Phe 850 855 860 Met Asp Val Leu His Gly Asn Arg Tyr Asp Ile Asp His Ile His Pro 865 870 875 880 Gln Ser Lys Lys Lys Asp Asp Ser Leu Glu Asn Asn Leu Val Leu Thr 885 890 895 Ser Lys Asp Phe Asn Asn His Ile Lys Gln Asp Val Tyr Pro Ile Pro 900 905 910 Glu Gln Ile Gln Ser Arg Gln Lys Gly Phe Trp Ala Met Leu Leu Lys 915 920 925 Gln Gly Phe Met Ser Gln Glu Lys Tyr Asn Arg Leu Met Arg Thr Thr 930 935 940 Pro Phe Thr Asp Glu Glu Leu Ala Glu Phe Val Asn Arg Gln Leu Val 945 950 955 960 Glu Thr Arg Gln Gly Thr Lys Ala Ile Ile Ser Leu Ile Asn Gln Cys 965 970 975 Phe Pro Asp Ser Glu Val Val Tyr Val Lys Ala Gly Asn Thr Ser Asp 980 985 990 Phe Arg Gln Arg Phe Asp Ile Pro Lys Ser Arg Asp Leu Asn Asn Tyr 995 1000 1005 His His Ala Val Asp Ala Tyr Leu Asn Ile Val Val Gly Asn Val Tyr 1010 1015 1020 Asp Thr Lys Phe Thr Lys Asn Pro Ile Asn Phe Ile Lys Lys Met Arg 1025 1030 1035 1040 Lys Ser Gly Asn Leu His Ser Tyr Ser Leu Arg Arg Met Tyr Asp Phe 1045 1050 1055 Asn Val Gln Arg Gly Asp Gln Thr Ala Trp Val Ala Glu Asn Asp Thr 1060 1065 1070 Thr Leu Lys Thr Val Lys Lys Thr Ala Phe Lys Thr Ser Pro Met Val 1075 1080 1085 Thr Lys Arg Thr Tyr Glu Arg Lys Gly Gly Leu Ala Asp Ser Val Leu 1090 1095 1100 Ile Ala Ala Lys Lys Ala Lys Pro Gly Val His Leu Pro Val Lys Thr 1105 1110 1115 1120 Ser Asp Ser Arg Phe Ala Asn Gln Val Ser Thr Tyr Gly Gly Tyr Asp 1125 1130 1135
Asn Val Lys Gly Ser His Phe Phe Leu Val Glu His Gln Gln Lys Lys
1140 1145 1150 Lys Thr Ile Arg Ser Ile Glu Asn Val Pro Ile His Leu Lys Glu Lys 1155 1160 1165 Leu Lys Thr Lys Glu Glu Leu Glu His Tyr Cys Ala Gln Val Leu Gly 1170 1175 1180 Met Val Gln Pro Asp Val Arg Leu Thr Arg Ile Pro Met Tyr Ser Leu 1185 1190 1195 1200 Leu Leu Ile Asp Gly Tyr Tyr Tyr Tyr Leu Thr Gly Arg Thr Gly Gly 1205 1210 1215
Asn Leu Ser Leu Ser Asn Ala Val Glu Leu Cys Leu Pro Ala Lys Glu 1220 1225 1230 Gln Ala His Ile Arg Met Ile Ser Lys Ile Ala Gly Gly Arg Ser Thr 1235 1240 1245 Asp Ala Leu Ser Ala Glu Ala Lys Asp Asp Phe Arg Lys Lys Asn Leu 1250 1255 1260 Arg Leu Tyr Asp Glu Leu Ala Glu Lys His Arg Ser Thr Ile Phe Ser 1265 1270 1275 1280 Lys Arg Lys Asn Pro Ile Gly Pro Lys Leu Leu Lys Tyr Arg Glu Ala 1285 1290 1295 Phe Val Lys Gln Thr Ile Glu Asn Gln Cys Lys Val Ile Leu Gln Ile 1300 1305 1310 Leu Lys Leu Thr Ser Thr Asn Cys Lys Thr Ser Ala Asp Leu Lys Leu 1315 1320 1325 Ile Gly Gly Ser Gly Gln Glu Gly Val Met Ser Ile Ser Lys Leu Leu 1330 1335 1340 Arg Ala Glu Lys Tyr Ala Glu Phe Tyr Leu Ile Cys Gln Ser Pro Ser 1345 1350 1355 1360 Gly Ile Tyr Glu Thr Arg Lys Asn Leu Leu Thr Ile 1365 1370
<210> 2 <211> 294 <212> PRT <213> Faecalibaculum rodentium
<400> 2 Met Thr Trp Arg Thr Ile Thr Ile Ser Ser His Ser Lys Leu Asp Tyr 1 5 10 15 Gln Met Gly Tyr Leu Val Val Arg Gly Glu Ser Ile Lys Arg Ile His 20 25 30 Leu Ser Glu Ile Ser Val Leu Ile Ile Glu Asn Thr Ala Val Ser Leu 35 40 45 Thr Ala Tyr Leu Val Ser Glu Leu Val Lys Asn Lys Ile Lys Leu Leu 50 55 60 Phe Cys Asp Glu Lys Arg Ser Pro Leu Ala Glu Val Ser Glu Leu Tyr 65 70 75 80 Gly Gly His Asp Ser Ser Asp Met Val Arg Lys Gln Ile Glu Ile Pro 85 90 95 Gln Glu Arg Lys Asp Ile Ala Trp Gln Ser Ile Ile Met Ser Lys Ile 100 105 110 Ser Asn Gln Phe Ala Val Leu His Asn Phe Asp Cys Pro Asn Gln Glu 115 120 125
Leu Leu Leu Gln Tyr Ile Asn Glu Val Leu Pro Gly Asp Val Thr Asn
130 135 140 Arg Glu Gly His Ala Ala Lys Val Tyr Phe Asn Ser Leu Phe Gly Lys 145 150 155 160 Ser Phe Tyr Arg Ala Ser Glu Cys Ala Leu Asn Ala Ala Leu Asn Tyr 165 170 175 Gly Tyr Ser Val Leu Leu Ser Ala Val Ser Arg Glu Ile Ala Gly Tyr 180 185 190 Gly Phe Leu Thr Gln Leu Gly Ile Phe His Asp Asn Cys Asp Asn Lys 195 200 205
Tyr Asn Leu Ser Cys Asp Leu Met Glu Pro Phe Arg Pro Val Val Asp 210 215 220 Tyr Leu Val Lys Ser Asn Ile Val Glu Val Phe Glu Lys Glu Gln Lys 225 230 235 240 Gln Lys Ile Leu Gln Leu Leu Gln Phe Lys Ile Gln Ile Asn Asp Arg 245 250 255 Gln Glu Thr Val Gln Asn Ala Ile Ser Ile Phe Val His Ser Val Leu 260 265 270 Asp Tyr Leu Leu Asp Pro Ser Val Tyr Ile Lys Val Pro Arg Ile Asp 275 280 285 Phe Thr Lys Asn Val Val 290
<210> 3 <211> 68 <212> PRT <213> Faecalibaculum rodentium
<400> 3 Met Met Gln Glu Ser Val Tyr Cys Lys Leu Thr Thr Asn Gln Ser Ser 1 5 10 15 Ala Glu Thr Val Leu Lys Met Val Arg Ala Asn Lys Pro Pro Glu Gly 20 25 30 Leu Ile Gln Thr Leu Ile Ile Thr Glu Lys Gln Phe Ser Lys Met Asp 35 40 45 Phe Ile Leu Gly Gln Pro Asn Ser Asp Val Val Ala Thr Asp Glu Ser 50 55 60 Val Leu Asp Leu 65
<210> 4 <211> 223 <212> PRT <213> Faecalibaculum rodentium
<400> 4 Met Arg Leu Leu Ile Asp Arg Leu Leu Leu Ser Ala Glu Leu Asn Ile 1 5 10 15 Asp Lys Ala Thr Thr Ile Ile Ile Glu Asn Pro Lys Ala Phe Arg Met 20 25 30 Val Ile Lys Asp Leu Ile Glu Gln Glu Asn Gly Gln Gly Gly Leu Leu 35 40 45
Arg Ile Val Glu Gly Asp Lys Glu Leu Cys Leu Ser Lys Ser Ala Ile
50 55 60 Leu Val Leu Asn Pro Tyr Leu Ala Asp Leu Asn Cys Arg Lys Phe Leu 65 70 75 80 Gln Leu Ala Tyr Ser Glu Leu Gln Ala Met Thr Gly Glu Phe Leu Glu 85 90 95 Asp Gln Ala Val Val Leu Ser Ala Met Thr Gly Tyr Leu Ser Lys Ile 100 105 110 Cys Asp Gln Ser Arg Phe Asp Phe Leu Glu Phe Ser Ala Ile Pro Asp 115 120 125
Trp Ala Ser Val Phe Lys Ala Trp Gly Leu Arg Phe Glu Gln Ala Ile 130 135 140 Pro Gly Leu Leu Pro Ser Leu Ile Gln Tyr Leu Gln Leu Ala Ala Thr 145 150 155 160 Phe Pro Gln Phe Lys Leu Ile Ile Phe Ile Asn Leu Lys Gln Tyr Leu 165 170 175 Leu Pro Glu Glu Gln Phe Glu Leu Phe Lys Met Ala Glu Tyr Leu Gln 180 185 190 Leu Lys Val Leu Leu Val Glu Ser Ala Gln Asn Tyr Lys Ser Asp Arg 195 200 205 Glu Asp Leu Ile Ile Ile Asp Lys Asp Leu Cys Glu Ile Gln Ser 210 215 220
<210> 5 <211> 20 <212> RNA <213> Faecalibaculum rodentium
<400> 5 guuugagugu cuuguuaauu 20
<210> 6 <211> 36 <212> DNA <213> Faecalibaculum rodentium
<400> 6 gtttgagtgt cttgttaatt cggaagtatt tcaaac 36
<210> 7 <211> 216 <212> DNA <213> Faecalibaculum rodentium
<400> 7 gtttgagtgt cttgttaatt cgggagtagc tctctcgttt gagtgtcttg ttaattcgga 60 agtaagctca acttttgagt gtcttgttaa ttcggaagta tctcaaacgt ttgagtgtct 120 tgttaattca gaagtatttc aaacgtttga gtgtcttgtt aattcggaag tattccaaac 180 gtttgagtgt cttgttaatt cggaagtatt tcaaac 216
<210> 8 <211> 71 <212> RNA <213> Faecalibaculum rodentium
<400> 8 aauuaacaag augaguucaa aucaggcucc uagagagauc cgaacuuacc uucauggcgg 60 gcauugugcc c 71
<210> 9 <211> 95 <212> RNA <213> Faecalibaculum rodentium
<400> 9 guuugagugu cuuguuaauu gaaaaauuaa caagaugagu ucaaaucagg cuccuagaga 60 gauccgaacu uaccuucaug gcgggcauug ugccc 95
<210> 10 <211> 91 <212> RNA <213> Faecalibaculum rodentium
<400> 10 guuugagugu cuuguuaaga aauuaacaag augaguucaa aucaggcucc uagagagauc 60 cgaacuuacc uucauggcgg gcauugugcc c 91
<210> 11 <211> 89 <212> RNA <213> Faecalibaculum rodentium
<400> 11 guuugagugu cuuguuagaa auaacaagau gaguucaaau caggcuccua gagagauccg 60 aacuuaccuu cauggcgggc auugugccc 89
<210> 12 <211> 87 <212> RNA <213> Faecalibaculum rodentium
<400> 12 guuugagugu cuuguugaaa aacaagauga guucaaauca ggcuccuaga gagauccgaa 60 cuuaccuuca uggcgggcau ugugccc 87
<210> 13 <211> 85 <212> RNA <213> Faecalibaculum rodentium
<400> 13 guuugagugu cuugugaaaa caagaugagu ucaaaucagg cuccuagaga gauccgaacu 60 uaccuucaug gcgggcauug ugccc 85
<210> 14 <211> 83 <212> RNA <213> Faecalibaculum rodentium
<400> 14 guuugagugu cuuggaaaca agaugaguuc aaaucaggcu ccuagagaga uccgaacuua 60 ccuucauggc gggcauugug ccc 83
<210> 15 <211> 79 <212> RNA <213> Faecalibaculum rodentium
<400> 15 guuugagugu cugaaaagau gaguucaaau caggcuccua gagagauccg aacuuaccuu 60
cauggcgggc auugugccc 79
<210> 16 <211> 75 <212> RNA <213> Faecalibaculum rodentium
<400> 16 guuugagugu gaaaaugagu ucaaaucagg cuccuagaga gauccgaacu uaccuucaug 60
gcgggcauug ugccc 75
<210> 17 <211> 71 <212> RNA <213> Faecalibaculum rodentium
<400> 17 guuugaguga aagaguucaa aucaggcucc uagagagauc cgaacuuacc uucauggcgg 60
gcauugugcc c 71
<210> 18 <211> 69 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 18 ctacctgcat accgttatta acatatgaca actcaattaa acgccacatc catcggcgct 60
ttggtcggc 69
<210> 19 <211> 62 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 19 atgttaccac tggattgagg ataccgttat taacatatga caactcaatt aaactctggt 60
ac 62
<210> 20 <211> 69 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 20 tcagtttata tgagttacaa cgaacaccgt ttaattgagt tgtcatatgt taataacggt 60
attcaggta 69
<210> 21 <211> 69 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 21 ctacctacat accgttatta acatatgaca actcaattaa acgtcagcac ctgggacccc 60
gccaccgtg 69
<210> 22 <211> 34 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 22 gtttaattga gttgtcatat gttaataacg gtat 34
<210> 23 <211> 26 <212> DNA <213> Artificial Sequence
<221> misc_feature <222> (22)..(26) <223> n is a, c, g, or t;r is a or g;w is a or t
<400> 23 tgttaccact ggattgaggt cnnrwr 26
<210> 24 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 24 tgttaccact ggattgaggt ctggta 26
<210> 25 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 25 aagaaccact ggattgaggt ctggct 26
<210> 26 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 26 gggaaccact ggattaaggt ccagat 26
<210> 27 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 27 agttaccaat ggattgagat ggggag 26
<210> 28 <211> 26 <212> DNA <213> Artificial Sequence
<221> misc_feature <222> (22)..(26) <223> n is a, c, g, or t;r is a or g;w is a or t
<400> 28 aaaaaggaag gagttctttg tnnrwr 26
<210> 29 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 29 aaaaaggaag gagttctttg taggta 26
<210> 30 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 30 taaaaggaag gtgttctttg tggggt 26
<210> 31 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 31 gaaaaggaag gtgctctttg tgggtg 26
<210> 32 <211> 26 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 32 ggaaaaggtg gaattctttg ttggta 26 2022201166
<210> 33 <211> 17 <212> DNA <213> Artificial Sequence
<223> Synthetic
<400> 33 cgaacacctc aggtaat 17
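
As an illustration of how the scaffold entries in this listing relate to one another, the following Python sketch reassembles SEQ ID NOs: 9 to 14 from the first nucleotides of the 20 nt repeat (SEQ ID NO: 5), a "gaaa" linker, and the last nucleotides of the tracrRNA (SEQ ID NO: 8). The names `build_scaffold` and `TRUNCATIONS` are illustrative assumptions and do not appear in the listing; the nucleotide counts are the ones implied by the listed scaffold lengths.

```python
# SEQ ID NO: 5 (crRNA direct repeat, 20 nt) and SEQ ID NO: 8 (tracrRNA, 71 nt),
# copied from the sequence listing with the block spacing removed.
REPEAT = "guuugagugucuuguuaauu"
TRACR = "".join("""
    aauuaacaag augaguucaa aucaggcucc uagagagauc
    cgaacuuacc uucauggcgg gcauugugcc c
""".split())
LINKER = "gaaa"


def build_scaffold(repeat_nts: int, tracr_nts: int, linker: str = LINKER) -> str:
    """Fuse the first `repeat_nts` of the repeat to the last `tracr_nts` of the tracrRNA."""
    if not 0 < repeat_nts <= len(REPEAT):
        raise ValueError("repeat_nts must be between 1 and the repeat length")
    if not 0 < tracr_nts <= len(TRACR):
        raise ValueError("tracr_nts must be between 1 and the tracrRNA length")
    return REPEAT[:repeat_nts] + linker + TRACR[-tracr_nts:]


# SEQ ID NO -> (repeat nts kept, tracrRNA nts kept); an illustrative lookup table.
TRUNCATIONS = {
    9: (20, 71),
    10: (18, 69),
    11: (17, 68),
    12: (16, 67),
    13: (15, 66),
    14: (14, 65),
}

if __name__ == "__main__":
    for seq_id, (repeat_nts, tracr_nts) in TRUNCATIONS.items():
        scaffold = build_scaffold(repeat_nts, tracr_nts)
        print(f"SEQ ID NO: {seq_id}: {len(scaffold)} nt {scaffold}")
```

Each truncation removes the same number of nucleotides from the 3' end of the repeat and the 5' end of the tracrRNA, which shortens the repeat:anti-repeat duplex while leaving the rest of the tracrRNA unchanged.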

Claims (3)

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:
1. A Type II CRISPR/Cas9 genome editing system, comprising a Cas9 protein, helper proteins, a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA) in the functional form of a ribonucleoprotein complex (RNP complex) of the Cas9 protein and a guide RNA formed by hybridizing the crRNA with the tracrRNA;
wherein the Cas9 protein is a DNA endonuclease, and the Cas9 protein has an amino acid sequence as shown in SEQ ID NO: 1, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 1;
wherein the helper proteins comprise a Cas1 helper protein, a Cas2 helper protein and a Csn2 helper protein;
wherein the Cas1 helper protein has an amino acid sequence as shown in SEQ ID NO: 2, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 2;
wherein the Cas2 helper protein has an amino acid sequence as shown in SEQ ID NO: 3, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 3;
wherein the Csn2 helper protein has an amino acid sequence as shown in SEQ ID NO: 4, or an amino acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the amino acid sequence as shown in SEQ ID NO: 4;
wherein the CRISPR RNA is generated by transcription of a CRISPR Array, and has an RNA sequence as shown in SEQ ID NO: 5, or an RNA sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 5; and
wherein the tracrRNA comprises a sequence complementary to a direct repeat sequence of the crRNA, and the tracrRNA has a nucleic acid sequence as shown in SEQ ID NO: 8, or a nucleic acid sequence with at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 8.
2. The Type II CRISPR/Cas9 genome editing system according to claim 1, wherein the Cas9 protein cleaves a double-stranded DNA complementary to a crRNA upstream of a PAM sequence by a nuclease domain, wherein the nuclease domain is selected from an HNH-like nuclease domain, a RuvC-like nuclease domain, or a combination thereof.
3. The Type II CRISPR/Cas9 genome editing system according to claim 1, wherein the CRISPR Array comprises a direct repeat sequence and a spacer sequence, wherein the direct repeat sequence has a nucleic acid sequence as shown in SEQ ID NO: 6, or a nucleic acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 6; and wherein the spacer sequence has a nucleic acid sequence as shown in SEQ ID NO: 7, or a nucleic acid sequence with at least 80%, 85%, 90%, 95%, 98%, or 99% homology to the nucleic acid sequence as shown in SEQ ID NO: 7.
4. The Type II CRISPR/Cas9 genome editing system according to claim 1, wherein the guide RNA formed by hybridizing the crRNA with the tracrRNA has a scaffold composed of a sequence of 7 to 24 nts of a crRNA direct repeat sequence and a tracrRNA sequence.
5. The Type II CRISPR/Cas9 genome editing system according to any one of claims 1 to 4, wherein the crRNA direct repeat sequence and the tracrRNA sequence are fused by a "GAAA", "TGAA", or "AAAC" linker to form an sgRNA scaffold.
6. The Type II CRISPR/Cas9 genome editing system according to claim 5, wherein the sgRNA scaffold is formed by fusion of a crRNA direct repeat sequence of 20 nts and a full length sequence of the tracrRNA by "GAAA", as shown in SEQ ID NO: 9.
7. The Type II CRISPR/Cas9 genome editing system according to claim 5, wherein the sgRNA scaffold is formed by fusion of the first 18 to 14 nts of the crRNA direct repeat sequence and the last 69 to 65 nts of the tracrRNA by a linker sequence.
8. The Type II CRISPR/Cas9 genome editing system according to claim 5, wherein the sgRNA scaffold is selected from the following five scaffolds:
(1) an sgRNA scaffold with a length of 91 nts, which comprises 18 nts direct repeat sequence and 69 nts tracrRNA, as shown in SEQ ID NO: 10;
(2) an sgRNA scaffold with a length of 89 nts, which comprises 17 nts direct repeat sequence and 68 nts tracrRNA, as shown in SEQ ID NO: 11;
(3) an sgRNA scaffold with a length of 87 nts, which comprises 16 nts direct repeat sequence and 67 nts tracrRNA, as shown in SEQ ID NO: 12;
(4) an sgRNA scaffold with a length of 85 nts, which comprises 15 nts direct repeat sequence and 66 nts tracrRNA, as shown in SEQ ID NO: 13;
(5) an sgRNA scaffold with a length of 83 nts, which comprises 14 nts direct repeat sequence and 65 nts tracrRNA, as shown in SEQ ID NO: 14;
optionally, the sgRNA scaffold has a nucleic acid sequence with at least 70%, 80%, 85%, 90%, 95%, 98%, or 99% homology to any of SEQ ID NOs: 9 to 14.
9. The Type II CRISPR/Cas9 genome editing system according to any one of claims 1 to 8, wherein the Type II CRISPR/Cas9 genome editing system is derived from Faecalibaculum rodentium, and wherein a specific DNA is bound or cleaved in a genome editing process, by complementary pairing recognition of the guide RNA and a target of the specific DNA, and wherein a length of a paired binding part of the guide RNA and the target of the specific DNA ranges from 14 to 30 bps;
wherein the specific DNA is a DNA of prokaryote or eukaryote.
10. The Type II CRISPR/Cas9 genome editing system according to claim 9, wherein the length of the paired binding part is 21 bps, 22 bps or 23 bps, and wherein the RNP complex is highly sensitive to base mismatch of 14 bps close to a protospacer adjacent motif and the 14 bps is a seed region.
11. The Type II CRISPR/Cas9 genome editing system according to claim 9, wherein a protospacer adjacent motif required for a function of binding or cleaving DNA is 5'-NNTA-3' downstream of the guide RNA/sgRNA recognition sequence, wherein N is A, T, C or G.
12. Use of the Type II CRISPR/Cas9 genome editing system according to any one of claims 1 to 11 in editing or binding DNA, CRISPR activation, or CRISPR interference.
13. The use according to claim 12, wherein the DNA is prokaryotic or eukaryotic DNA.
14. Use of the Type II CRISPR/Cas9 genome editing system according to any one of claims 1 to 11 in the preparation of a nickase, a dead Cas9, a base editor, or a prime editor.
15. The use according to claim 14, wherein the preparation is in a prokaryote or a eukaryote.
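
The PAM requirement of claim 11 and the pairing lengths of claims 9 and 10 can be illustrated with a minimal Python sketch that enumerates candidate target sites on one strand of a DNA sequence. It assumes that the 5'-NNTA-3' PAM immediately follows the protospacer on the strand as written, it defaults to a 22 nt protospacer (one of the lengths named in claim 10), and it does not scan the reverse strand; `find_targets` is a hypothetical helper and not part of the claimed system.

```python
import re


def find_targets(dna: str, spacer_len: int = 22):
    """Return (start, protospacer, pam) for every site followed by an NNTA PAM.

    Only the strand given in `dna` is searched; `start` is 0-based.
    """
    if not 14 <= spacer_len <= 30:
        raise ValueError("spacer_len is expected to lie between 14 and 30")
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]{{2}}TA))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # SEQ ID NO: 24 from the sequence listing: 22 nt target followed by GGTA.
    for start, protospacer, pam in find_targets("tgttaccactggattgaggtctggta"):
        print(start, protospacer, pam)
```

A scan of a full genome would also need the reverse complement and handling of ambiguous bases, both of which are omitted here for brevity.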
Fig. 1 [drawing not reproduced; legible label: 31 spacers]
Fig. 2 [drawing not reproduced; legible labels: REC, HNH, PI; E796A, N902A, H1010A, D1013A]
Fig. 3A to Fig. 3F [drawings not reproduced: sgRNA secondary structures of 95 nt (Fig. 3A), 91 nt (Fig. 3B), 89 nt (Fig. 3C), 87 nt (Fig. 3D), 85 nt (Fig. 3E) and 83 nt (Fig. 3F); legible labels: Target binding area, Repeat, Anti-repeat, Linker, Stem loop 1, Stem loop 2]
The sequence of 95 nt sgRNA is designated SEQ ID NO: 9; the sequence of 91 nt sgRNA is designated SEQ ID NO: 10; the sequence of 89 nt sgRNA is designated SEQ ID NO: 11; the sequence of 87 nt sgRNA is designated SEQ ID NO: 12; the sequence of 85 nt sgRNA is designated SEQ ID NO: 13; and the sequence of 83 nt sgRNA is designated SEQ ID NO: 14.
Fig. 4 [sequence logo not reproduced; generated with WebLogo 3.7.4]
Fig. 5A [image not reproduced; legible labels: PAMs CCTA, CGTA, GATA, TCTA, TGTA, GTTA, CATA, GCTA, CTTA, AATA, GGTA, ATTA, ACTA, AGTA, TATA, and Control; dilutions 10^0 to 10^-4]
Fig. 5B [chart not reproduced; legible labels: frcas9 locus, empty locus; x-axis: PAM; logarithmic scale from 1 to 10000]
Fig. 5C [image not reproduced; legible labels: dilutions 10^0 to 10^-3; Positive control, Negative control, NC1+, NC1-, NC2+, NC2-, NC3+, NC3-]
Fig. 6A [gel image not reproduced; legible size markers: 500 bp, 300 bp]
Fig. 6B [sequencing traces not reproduced; legible labels: FANCF-T4, DYRK1A-T2, RNF2-T6, FANCF-T1, PAM, ODN tag]
The sequence of FANCF-T4 is designated SEQ ID NO: 18; the sequence of DYRK1A-T2 is designated SEQ ID NO: 19; the sequence of RNF2-T6 is designated SEQ ID NO: 20; the sequence of FANCF-T1 is designated SEQ ID NO: 21; and the sequence of ODN tag is designated SEQ ID NO: 22.
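
The relationship between the ODN tag (SEQ ID NO: 22) and the tagged target sequences (SEQ ID NOs: 18 to 21) can be checked with a short Python sketch that searches a sequence for the tag in either orientation. The helper names `reverse_complement` and `locate_tag` are assumptions made for illustration; the sequences are copied from the sequence listing.

```python
# Lower-case DNA complement table.
COMPLEMENT = str.maketrans("acgt", "tgca")

# SEQ ID NO: 22 (ODN tag, 34 nt).
ODN_TAG = "gtttaattgagttgtcatatgttaataacggtat"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def locate_tag(read: str, tag: str = ODN_TAG):
    """Return (0-based start, strand) of the first tag occurrence, or None.

    Strand is "+" when the tag is found as written and "-" when its
    reverse complement is found.
    """
    read = read.lower()
    for strand, probe in (("+", tag.lower()), ("-", reverse_complement(tag))):
        start = read.find(probe)
        if start != -1:
            return start, strand
    return None


if __name__ == "__main__":
    # SEQ ID NO: 20 (RNF2-T6) with the block spacing of the listing removed.
    rnf2_t6 = "".join(
        "tcagtttata tgagttacaa cgaacaccgt ttaattgagt tgtcatatgt "
        "taataacggt attcaggta".split()
    )
    print(locate_tag(rnf2_t6))
```

The start index marks where the tag begins within the listed sequence, so the bases on either side of it are the genomic flanks of the integration site.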
Fig. 7A [chart not reproduced; HEK293 SITE2-T2; x-axis: sgRNA spacer lengths (bp), 19 to 23]
Fig. 7B [chart not reproduced; DNMT1-T3; x-axis: sgRNA spacer lengths (bp), 19 to 23]
Fig. 7C [chart not reproduced; RNF2-T6; x-axis: sgRNA spacer lengths (bp), 19 to 23]
Fig. 8 [alignment and chart not reproduced; rows: FrCas9 sgRNA and the 91 nt, 87 nt, 83 nt, 79 nt, 75 nt and 71 nt sgRNAs, each joined by a "gaaa" linker; x-axis: Indel Rate (%), 0 to 60]
The sequence of FrCas9 sgRNA is designated SEQ ID NO: 9; the sequence of 91 nt sgRNA is designated SEQ ID NO: 10; the sequence of 87 nt sgRNA is designated SEQ ID NO: 12; the sequence of 83 nt sgRNA is designated SEQ ID NO: 14; the sequence of 79 nt sgRNA is designated SEQ ID NO: 15; the sequence of 75 nt sgRNA is designated SEQ ID NO: 16; and the sequence of 71 nt sgRNA is designated SEQ ID NO: 17.
Fig. 9 [image not reproduced; panels DYRK1A-T2 and GRIN2B-T9, each with SpCas9 and FrCas9 columns; legible values in order of appearance: DYRK1A-T2: 2456, 3257, 1076, 3, 3; GRIN2B-T9: 20434, 34970, 990, 27, 9]
There are 5 nucleotide sequences from top to bottom of DYRK1A-T2, which are sequence 1 as set forth in SEQ ID NO: 23, sequence 2 as set forth in SEQ ID NO: 24, sequence 3 as set forth in SEQ ID NO: 25, sequence 4 as set forth in SEQ ID NO: 26, and sequence 5 as set forth in SEQ ID NO: 27. There are 5 nucleotide sequences from top to bottom of GRIN2B-T9, which are sequence 1 as set forth in SEQ ID NO: 28, sequence 2 as set forth in SEQ ID NO: 29, sequence 3 as set forth in SEQ ID NO: 30, sequence 4 as set forth in SEQ ID NO: 31, and sequence 5 as set forth in SEQ ID NO: 32.
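
The DYRK1A-T2 sequences listed for Fig. 9 (SEQ ID NOs: 24 to 27) can be compared position by position. The Python sketch below counts mismatches separately in the PAM-distal part and in the PAM-proximal 14 bp, which claim 10 calls the seed region. It assumes that the first 22 nt of each 26 nt sequence form the protospacer and that the seed is its last 14 positions; `split_mismatches` is an illustrative helper, not a method described in the document.

```python
def split_mismatches(on_target: str, candidate: str, seed_len: int = 14):
    """Return (distal_mismatches, seed_mismatches) for two protospacers.

    Both sequences must have the same length; the seed is taken as the last
    `seed_len` positions, i.e. the end assumed to lie next to the PAM.
    """
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    if not 0 <= seed_len <= len(on_target):
        raise ValueError("seed_len must not exceed the protospacer length")
    boundary = len(on_target) - seed_len
    mismatches = [a != b for a, b in zip(on_target.lower(), candidate.lower())]
    return sum(mismatches[:boundary]), sum(mismatches[boundary:])


if __name__ == "__main__":
    protospacer_len = 22  # assumption: positions 1 to 22 pair with the guide
    on_target = "tgttaccactggattgaggtctggta"[:protospacer_len]   # SEQ ID NO: 24
    candidate = "aagaaccactggattgaggtctggct"[:protospacer_len]   # SEQ ID NO: 25
    print(split_mismatches(on_target, candidate))
```

Separating the two counts makes it easy to see whether a candidate site differs from the intended target inside or outside the seed region.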
Fig. 10A [gel image not reproduced; legible labels: RNF2 gene, sgRNA1, sgRNA2, sgRNA3; 500 bp marker]
Fig. 10B [chart not reproduced; nFrCas9(E796A)-BE4-Gam; sites CBE-1, CBE-2, CBE-3, CBE-11; x-axis: Base position (bp), 1 to 22]
Fig. 10C [chart not reproduced; nFrCas9(E796A)-ABE7.10; sites ABE-1, ABE-2, ABE-3, ABE-7, ABE-9; x-axis: Base position (bp), 1 to 22]
Fig. 10D [diagram not reproduced; ClinVar 20210404 pathogenic mutations; groups: uniquely targeting T>C or A>G mutations, uniquely targeting G>A or C>T mutations; editors: BE4Gam, FrCas9-BE4Gam, ABE7.10, FrCas9-ABE7.10; legible counts in order of appearance: 346, 25, 235, 2277, 101, 1196]
Fig. 10E [chart not reproduced; FrCas9-BE4Gam at positions C7 and C8; conditions: only forward sgRNA, only reverse sgRNA, forward sgRNA + reverse sgRNA, control; PAM labels: NNTANN, NNATNN]
Fig. 11A [histogram not reproduced; SpCas9 (NGG, NCC); N = 3.75 x 10^8; mean = 8.66 bp; median = 5 bp; x-axis: Distance (bp)]
Fig. 11B [histogram not reproduced; FrCas9 (NNTANN, NNATNN); N = 2.99 x 10^8; mean = 6.19 bp; median = 1 bp; x-axis: Distance (bp)]
[Bar charts on drawing sheet 13/14 not reproduced; legible axis label: Relative expression]
Fig. 13 [schematic not reproduced; legible labels: RT, Target, PAM, pegRNA, nCas9(H840A)]
Fig. 14 [sequencing trace not reproduced; legible labels: FrCas9-PE, GTA-insertion]
AU2022201166A 2022-02-21 2022-02-21 Type ii crispr/cas9 genome editing system and the application thereof Active AU2022201166B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2022201166A AU2022201166B2 (en) 2022-02-21 2022-02-21 Type ii crispr/cas9 genome editing system and the application thereof

Publications (2)

Publication Number Publication Date
AU2022201166A1 AU2022201166A1 (en) 2023-09-07
AU2022201166B2 true AU2022201166B2 (en) 2024-02-22

Family

ID=87849863

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2022201166A Active AU2022201166B2 (en) 2022-02-21 2022-02-21 Type ii crispr/cas9 genome editing system and the application thereof

Country Status (1)

Country Link
AU (1) AU2022201166B2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160130608A1 (en) * 2012-05-25 2016-05-12 The Regents Of The University Of California Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription

Also Published As

Publication number Publication date
AU2022201166A1 (en) 2023-09-07

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)