CN117897485A

CN117897485A - Gene editing systems comprising an RNA guide targeting hydroxyacid oxidase 1 (HAO1) and uses thereof

Info

Publication number: CN117897485A
Application number: CN202280052114.3A
Authority: CN
Inventors: Q. N. Wessells; J. R. Haswell; T. M. Ditommaso; N. M. Jakimo; S. Sengupta
Original assignee: Arbor Biotechnologies
Current assignee: Arbor Biotechnologies
Priority date: 2021-06-04
Filing date: 2022-06-03
Publication date: 2024-04-16

Abstract

Provided herein are gene editing systems and/or compositions for editing the HAO1 gene that comprise RNA guides targeting HAO1. Also provided herein are methods of introducing edits into the HAO1 gene and/or treating primary hyperoxaluria (PH) using the gene editing systems, as well as methods for characterizing the gene editing systems.

Description

Gene editing systems comprising an RNA guide targeting hydroxyacid oxidase 1 (HAO1) and uses thereof

Cross Reference to Related Applications

The present application claims the benefit of U.S. Provisional Application No. 63/197,073, filed June 4, 2021; U.S. Provisional Application No. 63/225,046; U.S. Provisional Application No. 63/292,889, filed December 22, 2021; and U.S. Provisional Application No. 63/300,727, filed in 2022, each of which is incorporated herein by reference in its entirety.

Sequence listing

The present application contains a sequence listing that has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. The ASCII copy, created in 2022, is named 116928-0040-0004wo00_seq.txt and is 367,354 bytes in size.

Background

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) genes, collectively referred to as CRISPR-Cas or CRISPR/Cas systems, are adaptive immune systems in archaea and bacteria that defend particular species against foreign genetic elements.

Disclosure of Invention

The present disclosure is based, at least in part, on the development of a gene editing system for the hydroxyacid oxidase 1 (HAO1) gene. The system involves Cas12i CRISPR nuclease polypeptides (e.g., Cas12i2 polypeptides) and RNA guides that direct cleavage by the CRISPR nuclease polypeptide at a genetic locus within the HAO1 gene. As reported herein, the gene editing system disclosed herein has successfully edited the HAO1 gene with high editing efficiency and accuracy.

Without being bound by theory, the gene editing systems disclosed herein may further exhibit one or more of the following advantageous features. Compared to SpCas9 and Cas12a, Cas12i effectors are smaller (1033 to 1093 aa) and bind short mature crRNAs (40-43 nt), which is preferable in terms of delivery and cost of synthesis. Cas12i cleavage results in larger deletions than the small deletions and +1 insertions induced by Cas9 cleavage. Cas12i PAM sequences also differ from those of Cas9. Thus, Cas12i polypeptides and RNA guides can disrupt larger and different portions of a genetic locus of interest than Cas9. In an unbiased tagmentation-based tag integration site sequencing (TTISS) analysis, SpCas9 showed more potential off-target sites, with a higher number of unique integration events, than Cas12i2. See WO/2021/202800. Thus, Cas12i, such as Cas12i2, may be more specific than Cas9.

Thus, provided herein are gene editing systems for editing HAO1 genes, pharmaceutical compositions or kits comprising such gene editing systems, methods of producing genetically modified cells using the gene editing systems, and the resulting cells produced thereby. Also provided herein are uses of the gene editing systems disclosed herein, pharmaceutical compositions and kits comprising such gene editing systems, and/or the genetically modified cells produced thereby, for treating Primary Hyperoxaluria (PH) in a subject.

In some aspects, the disclosure features a system for gene editing of a hydroxyacid oxidase 1 (HAO1) gene, the system comprising (i) a Cas12i polypeptide or a first nucleic acid encoding the Cas12i polypeptide, and (ii) an RNA guide or a second nucleic acid encoding the RNA guide. The RNA guide includes a spacer sequence specific for a target sequence within the HAO1 gene, the target sequence being adjacent to a protospacer adjacent motif (PAM) that comprises the motif 5'-TTN-3' and is located 5' of the target sequence.
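The PAM geometry described above (a 5'-TTN-3' motif sitting immediately 5' of the target sequence on the PAM strand) can be illustrated with a short scan. This is a minimal sketch, not the method used in the disclosure: the function name `find_ttn_targets`, the 20-nucleotide default target length, and the toy locus are assumptions made for illustration, and only the strand passed in is scanned (the opposite strand would need its reverse complement).

```python
import re

def find_ttn_targets(pam_strand, target_len=20):
    """Return (pam_start, pam, target) for each 5'-TTN-3' PAM that is
    immediately followed (3') by a full-length candidate target sequence."""
    seq = pam_strand.upper()
    # A zero-width lookahead lets overlapping PAMs (e.g. in 'TTTT') all be reported.
    pattern = r"(?=(TT[ACGT])([ACGT]{%d}))" % target_len
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, seq)]

# Hypothetical toy locus: a 'TTC' PAM placed directly 5' of the target
# sequence listed as SEQ ID NO: 1047. The flanking bases are invented.
toy_locus = "GACGA" + "TTC" + "CGGAGCATCCTTGGATACAG" + "GACGA"
hits = find_ttn_targets(toy_locus)
for start, pam, target in hits:
    print(start, pam, target)
```

Each reported hit is only a candidate; whether a site is actually edited depends on factors the scan does not model.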

In some embodiments, the Cas12i polypeptide may be a Cas12i2 polypeptide. In some embodiments, the Cas12i polypeptide may be a Cas12i4 polypeptide.

In some embodiments, the Cas12i polypeptide is a Cas12i2 polypeptide comprising an amino acid sequence that is at least 95% identical to SEQ ID NO: 922 and includes one or more mutations relative to SEQ ID NO: 922. In some embodiments, the one or more mutations in the Cas12i2 polypeptide are at position D581, G624, F626, P868, I926, V1030, E1035, and/or S1046 of SEQ ID NO: 922. In some examples, the one or more mutations are amino acid substitutions, optionally D581R, G624R, F626R, P868T, I926R, V1030G, E1035R, S1046G, or a combination thereof.

In one example, the Cas12i2 polypeptide includes mutations at positions D581, D911, I926, and V1030 (e.g., amino acid substitutions of D581R, D911R, I926R, and V1030G). In another example, the Cas12i2 polypeptide includes mutations at positions D581, I926, and V1030 (e.g., amino acid substitutions of D581R, I926R, and V1030G). In yet another example, the Cas12i2 polypeptide includes mutations at positions D581, I926, V1030, and S1046 (e.g., amino acid substitutions of D581R, I926R, V1030G, and S1046G). In yet another example, the Cas12i2 polypeptide includes mutations at positions D581, G624, F626, I926, V1030, E1035, and S1046 (e.g., amino acid substitutions of D581R, G624R, F626R, I926R, V1030G, E1035R, and S1046G). In another example, the Cas12i2 polypeptide includes mutations at positions D581, G624, F626, P868, I926, V1030, E1035, and S1046 (e.g., amino acid substitutions of D581R, G624R, F626R, P868T, I926R, V1030G, E1035R, and S1046G).
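The substitution labels above follow the usual convention of wild-type residue, 1-based position, and replacement residue (for example, D581R replaces aspartate at position 581 with arginine). The sketch below shows one way such labels could be parsed and applied. The helper names and the five-residue toy peptide are hypothetical; the sequence of SEQ ID NO: 922 is not reproduced here.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_substitution(label):
    """Split a label such as 'D581R' into ('D', 581, 'R')."""
    match = SUBSTITUTION_RE.match(label)
    if match is None:
        raise ValueError("malformed substitution label: %r" % label)
    return match.group(1), int(match.group(2)), match.group(3)

def apply_substitutions(protein, labels):
    """Apply each substitution after checking the wild-type residue matches."""
    residues = list(protein)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if position > len(residues) or residues[position - 1] != wild_type:
            raise ValueError("%s does not match the reference sequence" % label)
        residues[position - 1] = replacement
    return "".join(residues)

# Toy reference peptide (invented), not a Cas12i2 sequence.
toy_reference = "MDGIV"
toy_variant = apply_substitutions(toy_reference, ["D2R", "V5G"])
print(toy_variant)
```

Checking the wild-type residue before substituting catches labels that were numbered against a different reference sequence.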

An exemplary Cas12i2 polypeptide for use in any of the gene editing systems disclosed herein may include the amino acid sequence of any of SEQ ID NOs 923-927. In one example, an exemplary Cas12i2 polypeptide for use in any of the gene editing systems disclosed herein comprises the amino acid sequence of SEQ ID NO: 924. In another example, an exemplary Cas12i2 polypeptide for use in any of the gene editing systems disclosed herein includes the amino acid sequence of SEQ ID No. 927.

In some embodiments, the gene editing system can include a first nucleic acid encoding the Cas12i polypeptide (e.g., a Cas12i2 polypeptide as disclosed herein). In some cases, the first nucleic acid is located in a first vector (e.g., a viral vector, such as an adeno-associated viral (AAV) vector). In some cases, the first nucleic acid is messenger RNA (mRNA). In some cases, the nucleic acid encoding the Cas12i polypeptide (e.g., a Cas12i2 polypeptide as disclosed herein) is codon optimized.

In some embodiments, the target sequence may be within exon 1 or exon 2 of the HAO1 gene. In some examples, the target sequence includes 5'-CAAAGTCTATATATGACTAT-3' (SEQ ID NO: 1025), 5'-GGAAGTACTGATTTAGCATG-3' (SEQ ID NO: 1026), 5'-TAGATGGAAGCTGTATCCAA-3' (SEQ ID NO: 1046), 5'-CGGAGCATCCTTGGATACAG-3' (SEQ ID NO: 1047), or 5'-AGGACAGAGGGTCAGCATGC-3' (SEQ ID NO: 1052). In a specific example, the target sequence may be the nucleotide sequence of SEQ ID NO. 1047.

In some embodiments, the spacer sequence may be 20-30 nucleotides in length. In some examples, the spacer sequence is 20 nucleotides in length. In some examples, the spacer sequence comprises 5'-CAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 1093); 5'-GGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 1094); 5'-UAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 1095); 5'-CGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 1096); or 5'-AGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 1097). In a specific example, the spacer sequence can include SEQ ID NO: 1096.
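Each listed spacer is the RNA counterpart of one of the listed DNA target sequences, differing only by U in place of T. The sketch below makes that relationship explicit. The function name is hypothetical, and pairing the target and spacer SEQ ID NOs by their order of listing is an assumption drawn from the text rather than from the sequence listing itself.

```python
def target_to_spacer(target_dna):
    """Return the RNA equivalent of a DNA target sequence (T replaced by U)."""
    return target_dna.upper().replace("T", "U")

# Target sequences as listed (DNA), keyed by SEQ ID NO.
TARGETS = {
    1025: "CAAAGTCTATATATGACTAT",
    1026: "GGAAGTACTGATTTAGCATG",
    1046: "TAGATGGAAGCTGTATCCAA",
    1047: "CGGAGCATCCTTGGATACAG",
    1052: "AGGACAGAGGGTCAGCATGC",
}

# Spacer sequences as listed (RNA), keyed by SEQ ID NO.
SPACERS = {
    1093: "CAAAGUCUAUAUAUGACUAU",
    1094: "GGAAGUACUGAUUUAGCAUG",
    1095: "UAGAUGGAAGCUGUAUCCAA",
    1096: "CGGAGCAUCCUUGGAUACAG",
    1097: "AGGACAGAGGGUCAGCAUGC",
}

# Assumed pairing: the n-th listed target corresponds to the n-th listed spacer.
pairs = list(zip(sorted(TARGETS), sorted(SPACERS)))
all_match = all(target_to_spacer(TARGETS[t]) == SPACERS[s] for t, s in pairs)
print(all_match)
```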

In some embodiments, the RNA guide includes a spacer and a direct repeat sequence. In some examples, the direct repeat sequence is 23-36 nucleotides in length. In one example, the direct repeat is at least 90% identical to any one of SEQ ID NOs 1-10 or a fragment thereof of at least 23 nucleotides in length. In some embodiments, the direct repeat is any one of SEQ ID NOs 1-10 or a fragment thereof of at least 23 nucleotides in length. As a non-limiting example, the direct repeat is 5'-AGAAAUCCGUCUUUCAUUGACGG-3' (SEQ ID NO: 10).

In specific examples, the RNA guide may include the following nucleotide sequences: 5'-AGAAAUCCGUCUUUCAUUGACGGCAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 967), 5'-AGAAAUCCGUCUUUCAUUGACGGGGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 968), 5'-AGAAAUCCGUCUUUCAUUGACGGUAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 988), 5'-AGAAAUCCGUCUUUCAUUGACGGCGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 989), or 5'-AGAAAUCCGUCUUUCAUUGACGGAGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 994). In a specific example, the RNA guide may comprise SEQ ID NO: 989.
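The listed RNA guides can be read as the 23-nucleotide direct repeat of SEQ ID NO: 10 followed directly by a 20-nucleotide spacer, giving a 43-nucleotide guide. The sketch below assembles guides that way and compares them with the listed sequences. The function name `build_guide` and the dictionary layout are illustrative assumptions, and the direct-repeat-then-spacer order is inferred from the listed sequences.

```python
# Direct repeat given as a non-limiting example (SEQ ID NO: 10).
DIRECT_REPEAT = "AGAAAUCCGUCUUUCAUUGACGG"

def build_guide(spacer_rna, direct_repeat=DIRECT_REPEAT):
    """Concatenate direct repeat (5') and spacer (3') into one RNA guide."""
    return direct_repeat + spacer_rna

# Listed guide SEQ ID NO -> spacer from which it is assumed to be built.
GUIDE_SPACERS = {
    967: "CAAAGUCUAUAUAUGACUAU",
    968: "GGAAGUACUGAUUUAGCAUG",
    988: "UAGAUGGAAGCUGUAUCCAA",
    989: "CGGAGCAUCCUUGGAUACAG",
    994: "AGGACAGAGGGUCAGCAUGC",
}

# Guide sequences as listed in the text, with extraction spaces removed.
LISTED_GUIDES = {
    967: "AGAAAUCCGUCUUUCAUUGACGGCAAAGUCUAUAUAUGACUAU",
    968: "AGAAAUCCGUCUUUCAUUGACGGGGAAGUACUGAUUUAGCAUG",
    988: "AGAAAUCCGUCUUUCAUUGACGGUAGAUGGAAGCUGUAUCCAA",
    989: "AGAAAUCCGUCUUUCAUUGACGGCGGAGCAUCCUUGGAUACAG",
    994: "AGAAAUCCGUCUUUCAUUGACGGAGGACAGAGGGUCAGCAUGC",
}

guides = {seq_id: build_guide(spacer) for seq_id, spacer in GUIDE_SPACERS.items()}
print(guides == LISTED_GUIDES)
```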

In some embodiments, the system may include a second nucleic acid encoding the RNA guide. In some examples, the nucleic acid encoding the RNA guide may be located in a viral vector. In some examples, the viral vector comprises both the first nucleic acid encoding the Cas12i2 polypeptide and the second nucleic acid encoding the RNA guide.

In some embodiments, any of the systems described herein can include the first nucleic acid encoding the Cas12i2 polypeptide in a first vector and the second nucleic acid encoding the RNA guide in a second vector. In some examples, the first vector and/or the second vector is a viral vector. In some examples, the first vector and the second vector are the same vector. In other examples, the first vector and the second vector are different vectors.

In some embodiments, any of the systems described herein can include one or more lipid nanoparticles (LNPs) that encompass (a) the Cas12i2 polypeptide or the first nucleic acid encoding the Cas12i2 polypeptide, (b) the RNA guide or the second nucleic acid encoding the RNA guide, or both (a) and (b).

In some embodiments, the systems described herein can include an LNP that encompasses the Cas12i2 polypeptide or the first nucleic acid encoding the Cas12i2 polypeptide, and a viral vector comprising the second nucleic acid encoding the RNA guide. In some examples, the viral vector is an AAV vector. In other embodiments, the systems described herein can include an LNP that encompasses the RNA guide or the second nucleic acid encoding the RNA guide, and a viral vector comprising the first nucleic acid encoding the Cas12i2 polypeptide. In some examples, the viral vector is an AAV vector.

In some aspects, the present disclosure also provides a pharmaceutical composition comprising any of the gene editing systems disclosed herein, or a kit comprising components of the gene editing system.

In other aspects, the disclosure also features a method for editing a hydroxyacid oxidase 1 (HAO1) gene in a cell, the method comprising contacting a host cell with any of the systems disclosed herein to perform gene editing of the HAO1 gene in the host cell. In some examples, the host cell is cultured in vitro. In other examples, the contacting step is performed by administering the system for editing the HAO1 gene to a subject comprising the host cell.

Also within the scope of the present disclosure is a cell comprising a disrupted hydroxyacid oxidase 1 (HAO1) gene, which can be produced by contacting a host cell with the system disclosed herein to genetically edit the HAO1 gene in the host cell.

In still other aspects, the present disclosure provides a method for treating primary hyperoxaluria (PH) in a subject. The method may comprise administering to a subject in need thereof any system for editing a hydroxyacid oxidase 1 (HAO1) gene or any modified cell disclosed herein. In some embodiments, the subject may be a human patient having PH. In some examples, the PH is PH1, PH2, or PH3. In one embodiment, the PH is PH1.

Also provided herein is an RNA guide comprising (i) a spacer sequence as disclosed herein that is specific for a target sequence in a hydroxyacid oxidase 1 (HAO1) gene, wherein the target sequence is adjacent to a protospacer adjacent motif (PAM) that comprises the motif 5'-TTN-3' and is located 5' of the target sequence; and (ii) a direct repeat sequence.

In some embodiments, the spacer may be 20-30 nucleotides in length. In some examples, the spacer is 20 nucleotides in length.

In some embodiments, the direct repeat sequence may be 23-36 nucleotides in length. In some examples, the direct repeat sequence is 23 nucleotides in length.

In some embodiments, the target sequence may be within exon 1 or exon 2 of the HAO1 gene. In some examples, the target sequence includes 5'-CAAAGTCTATATATGACTAT-3' (SEQ ID NO: 1025), 5'-GGAAGTACTGATTTAGCATG-3' (SEQ ID NO: 1026), 5'-TAGATGGAAGCTGTATCCAA-3' (SEQ ID NO: 1046), 5'-CGGAGCATCCTTGGATACAG-3' (SEQ ID NO: 1047), or 5'-AGGACAGAGGGTCAGCATGC-3' (SEQ ID NO: 1052). In a specific example, the target sequence can include SEQ ID NO 1047.

In some embodiments, the spacer sequence may be as follows: 5'-CAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 1093); 5'-GGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 1094); 5'-UAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 1095); 5'-CGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 1096); or 5'-AGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 1097). In a specific example, the spacer sequence can include SEQ ID NO: 1096.

In some embodiments, the direct repeat is at least 90% identical to any one of SEQ ID NOs 1-10 or a fragment thereof of at least 23 nucleotides in length. In some examples, the direct repeat is any one of SEQ ID NOs 1-10 or a fragment thereof of at least 23 nucleotides in length. As a non-limiting example, the direct repeat is 5'-AGAAAUCCGUCUUUCAUUGACGG-3' (SEQ ID NO: 10).

In some embodiments, the RNA guide may comprise the nucleotide sequence: 5'-AGAAAUCCGUCUUUCAUUGACGGCAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 967), 5'-AGAAAUCCGUCUUUCAUUGACGGGGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 968), 5'-AGAAAUCCGUCUUUCAUUGACGGUAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 988), 5'-AGAAAUCCGUCUUUCAUUGACGGCGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 989), or 5'-AGAAAUCCGUCUUUCAUUGACGGAGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 994). In a specific example, the RNA guide may comprise SEQ ID NO: 989.

Also provided herein are uses of any of the gene editing systems disclosed herein, pharmaceutical compositions or kits comprising such gene editing systems, or genetically modified cells produced by the gene editing systems for treating PH in a subject, as well as uses of the same for the manufacture of a medicament for treating PH in a subject.

The details of one or more embodiments of the invention are set forth in the description below. Other features or advantages of the present invention will become apparent from the following drawings and detailed description of several embodiments, and also from the appended claims.

Drawings

The following drawings form a part of the present specification and are included to further demonstrate certain aspects of the present disclosure, which aspects may be better understood by reference to the drawings in combination with the detailed description of the specific embodiments presented herein.

FIG. 1 is a graph showing the ability of RNPs prepared with Cas12i2 polypeptide and crRNA to edit the HAO1 gene in HEK293 cells. The darker grey bars represent target sequences with complete homology to both rhesus macaque (Macaca mulatta) and cynomolgus macaque (Macaca fascicularis) sequences.

Fig. 2 is a graph showing the ability of RNPs prepared with Cas12i2 polypeptide and crRNA to edit the HAO1 gene in HepG2 cells.

Fig. 3 is a graph showing the ability of RNPs prepared with Cas12i2 polypeptide and crRNA to edit HAO1 genes in primary hepatocytes.

Fig. 4 is a graph showing knockdown of HAO1 mRNA in primary human hepatocytes with Cas12i2 polypeptide and HAO1-targeting crRNA.

FIG. 5A is a graph showing % indels induced in HepG2 cells by HAO1-targeting crRNA and the variant Cas12i2 polypeptide of SEQ ID NO: 924 or SEQ ID NO: 927. FIG. 5B shows the size (left) and start position (right) of indels induced in HepG2 cells by the variant Cas12i2 of SEQ ID NO: 924 and the HAO1-targeting RNA guide E1T3 (SEQ ID NO: 968).

FIG. 6 is a graph showing % indels induced by the chemically modified HAO1-targeting crRNAs of SEQ ID NO: 1091 and SEQ ID NO: 1092 and the variant Cas12i2 mRNA of SEQ ID NO: 1089 or SEQ ID NO: 1090.

FIG. 7A shows plots depicting tagmentation-based tag integration site sequencing (TTISS) reads for the variant Cas12i2 of SEQ ID NO: 924 and the HAO1-targeting RNA guides E2T5 (SEQ ID NO: 989), E1T2 (SEQ ID NO: 967), E1T3 (SEQ ID NO: 968), and E2T10 (SEQ ID NO: 994). The black wedge and centered number represent the fraction of on-target TTISS reads in the sample. Each grey wedge represents a unique off-target site identified by TTISS. The size of each grey wedge represents the fraction of TTISS reads mapped to a given off-target site. FIG. 7B shows plots depicting TTISS reads for two replicates of the variant Cas12i2 of SEQ ID NO: 927 and the HAO1-targeting RNA guides E2T5 (SEQ ID NO: 989), E1T2 (SEQ ID NO: 967), and E1T3 (SEQ ID NO: 968). The black wedge and centered number represent the fraction of on-target TTISS reads in the sample. Each grey wedge represents a unique off-target site identified by TTISS. The size of each grey wedge represents the fraction of TTISS reads mapped to a given off-target site.

FIG. 8 is a Western blot showing knockdown of HAO1 protein after electroporation of primary human hepatocytes with the variant Cas12i2 of SEQ ID NO: 924 and RNA guide E2T5 (SEQ ID NO: 989).

Detailed Description

The present disclosure relates to a system for gene editing of a hydroxyacid oxidase 1 (HAO1) gene (also known as the glycolate oxidase gene), comprising (i) a Cas12i polypeptide or a first nucleic acid encoding the Cas12i polypeptide, and (ii) an RNA guide or a second nucleic acid encoding the RNA guide, wherein the RNA guide comprises a spacer sequence specific for a target sequence within the HAO1 gene, the target sequence being adjacent to a protospacer adjacent motif (PAM) that comprises the motif 5'-TTN-3' and is located 5' of the target sequence. The present disclosure also provides pharmaceutical compositions or kits comprising such systems, and uses thereof. Also disclosed herein are a method for editing a HAO1 gene in a cell; a cell so produced that comprises a disrupted HAO1 gene; a method of treating primary hyperoxaluria (PH) in a subject; and an RNA guide comprising (i) a spacer specific for a target sequence in a HAO1 gene, wherein the target sequence is adjacent to a PAM that comprises the motif 5'-TTN-3' and is located 5' of the target sequence, and (ii) a direct repeat sequence, and uses thereof.

The Cas12i polypeptide used in the gene editing systems disclosed herein may be a Cas12i2 polypeptide, e.g., a wild-type Cas12i2 polypeptide disclosed herein or a variant thereof. In some examples, the Cas12i2 polypeptide includes an amino acid sequence that is at least 95% identical to SEQ ID NO: 922 and includes one or more mutations relative to SEQ ID NO: 922. In other examples, the Cas12i polypeptide can be a Cas12i4 polypeptide, which is also disclosed herein.

Definitions

The present disclosure will be described with respect to particular embodiments and with reference to certain drawings, but the disclosure is not limited thereto and is limited only by the claims. Unless otherwise indicated, the terms set forth below are to be understood in their ordinary sense.

As used herein, the term "activity" refers to biological activity. In some embodiments, the activity comprises an enzymatic activity, e.g., the catalytic ability of a Cas12i polypeptide. For example, the activity may comprise nuclease activity.

As used herein, the term "HAO1" refers to "glycolate oxidase 1", which is also referred to as "hydroxyacid oxidase 1". HAO1 is a peroxisomal protein expressed primarily in the liver and pancreas, and its activity includes the oxidation of glycolate and 2-hydroxy fatty acids. SEQ ID NO: 928, as shown herein, provides an example of a HAO1 gene sequence.

As used herein, the term "Cas12i polypeptide" (also referred to herein as Cas12i) refers to a polypeptide that binds to a target sequence on a target nucleic acid specified by an RNA guide, wherein the polypeptide has at least some amino acid sequence homology to a wild-type Cas12i polypeptide. In some embodiments, the Cas12i polypeptide comprises at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any of SEQ ID NOs: 1-5 and 11-18 of U.S. Patent No. 10,808,245, which is incorporated by reference for the subject matter and purposes cited herein. In some embodiments, the Cas12i polypeptide comprises at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any of SEQ ID NOs: 8, 2, 11, and 9 of the present application. In some embodiments, the Cas12i polypeptide of the present disclosure is a Cas12i2 polypeptide as described in WO/2021/202800, the relevant disclosure of which is incorporated by reference for the subject matter and purposes cited herein. In some embodiments, the Cas12i polypeptide cleaves the target nucleic acid (e.g., as a nick or double-strand break).

As used herein, the term "adjacent" refers to a nucleotide or amino acid sequence that is very close to another nucleotide or amino acid sequence. In some embodiments, a nucleotide sequence is adjacent (i.e., immediately adjacent) to another nucleotide sequence if no nucleotide separates the two sequences. In some embodiments, a nucleotide sequence is adjacent to another nucleotide sequence if a small number of nucleotides separates the two sequences (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides). In some embodiments, a first sequence is adjacent to a second sequence if the two sequences are separated by about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 nucleotides. In some embodiments, a first sequence is adjacent to a second sequence if the two sequences are separated by at most 2 nucleotides, at most 5 nucleotides, at most 8 nucleotides, at most 10 nucleotides, at most 12 nucleotides, or at most 15 nucleotides. In some embodiments, a first sequence is adjacent to a second sequence if the two sequences are 2-5 nucleotides, 4-6 nucleotides, 4-8 nucleotides, 4-10 nucleotides, 6-8 nucleotides, 6-10 nucleotides, 6-12 nucleotides, 8-10 nucleotides, 8-12 nucleotides, 10-15 nucleotides, or 12-15 nucleotides apart.

As used herein, the term "complex" refers to a combination of two or more molecules. In some embodiments, a complex includes a polypeptide and a nucleic acid molecule that interact (e.g., bind, contact, adhere) with each other. For example, the term "complex" may refer to a combination of an RNA guide and a polypeptide (e.g., a Cas12i polypeptide). Alternatively, the term "complex" may refer to a combination of an RNA guide, a polypeptide, and the complementary region of a target sequence. In another example, the term "complex" can refer to a combination of a HAO1-targeting RNA guide and a Cas12i polypeptide.

As used herein, the term "protospacer adjacent motif" or "PAM" refers to a DNA sequence adjacent to a target sequence (e.g., a HAO1 target sequence) to which a complex including an RNA guide (e.g., a HAO1-targeting RNA guide) and a Cas12i polypeptide binds. In a double-stranded DNA molecule, the strand containing the PAM motif is referred to as the "PAM strand" and the complementary strand is referred to as the "non-PAM strand". The RNA guide binds to a site in the non-PAM strand that is complementary to a target sequence disclosed herein.

In some embodiments, the PAM strand is a coding (e.g., sense) strand. In other embodiments, the PAM strand is a non-coding strand (e.g., an antisense strand). Since the RNA guide binds to the non-PAM strand by base pairing, the non-PAM strand is also referred to as the target strand, while the PAM strand is also referred to as the non-target strand.

As used herein, the term "target sequence" refers to a DNA fragment adjacent to a PAM motif (on the PAM strand). The complementary region of the target sequence is on the non-PAM strand. The target sequence may be immediately adjacent to the PAM motif. Alternatively, the target sequence and the PAM may be separated by a short sequence segment (e.g., up to 5 nucleotides, e.g., up to 4, 3, 2, or 1 nucleotides). Depending on the CRISPR nuclease that recognizes the PAM motif, as known in the art, the target sequence may be located 3' of the PAM motif or 5' of the PAM motif. For example, the target sequence is located 3' of the PAM motif for a Cas12i polypeptide (e.g., a Cas12i2 polypeptide, such as those disclosed herein). In some embodiments, the target sequence is a sequence within a HAO1 gene sequence, including but not limited to the sequence set forth in SEQ ID NO: 928.

As used herein, the term "spacer" or "spacer sequence" refers to the portion of an RNA guide that is the RNA equivalent of the target sequence (a DNA sequence). The spacer contains a sequence capable of binding to the non-PAM strand by base pairing at the site complementary to the target sequence (in the PAM strand). Such a spacer is also referred to as being specific for the target sequence. In some cases, the spacer may be at least 75% (e.g., at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99%) identical to the target sequence, except for the RNA-DNA sequence difference. In some cases, the spacer may be 100% identical to the target sequence, except for the RNA-DNA sequence difference.
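The identity thresholds in this definition compare an RNA spacer with a DNA target while disregarding the U-versus-T difference. A minimal sketch of such a comparison follows. It assumes an ungapped, position-by-position comparison of equal-length sequences, which is a simplification chosen for illustration and is not the BLAST-based calculation the document defines for percent identity; the function name is hypothetical.

```python
def spacer_target_identity(spacer_rna, target_dna):
    """Percent identity between an RNA spacer and a DNA target,
    treating U in the spacer as equivalent to T in the target."""
    if len(spacer_rna) != len(target_dna):
        raise ValueError("this sketch only compares equal-length sequences")
    spacer_as_dna = spacer_rna.upper().replace("U", "T")
    target = target_dna.upper()
    matches = sum(1 for s, t in zip(spacer_as_dna, target) if s == t)
    return 100.0 * matches / len(target)

# Spacer of SEQ ID NO: 1096 against the target of SEQ ID NO: 1047.
perfect = spacer_target_identity("CGGAGCAUCCUUGGAUACAG", "CGGAGCATCCTTGGATACAG")

# Hypothetical spacer with a single mismatch at the last position.
one_mismatch = spacer_target_identity("CGGAGCAUCCUUGGAUACAC", "CGGAGCATCCTTGGATACAG")
print(perfect, one_mismatch)
```

Under this simplification, a 20-nucleotide spacer needs at least 15 matching positions to meet the 75% floor.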

As used herein, the term "RNA guide" or "RNA guide sequence" refers to any RNA molecule or modified RNA molecule that facilitates targeting of a polypeptide described herein (e.g., a Cas12i polypeptide) to a target sequence (e.g., a sequence of the HAO1 gene). For example, an RNA guide may be a molecule designed to be complementary to a particular nucleic acid sequence (a target sequence, such as a target sequence within the HAO1 gene). The RNA guide may include a spacer sequence and a direct repeat (DR) sequence. In some cases, the RNA guide may be a modified RNA molecule comprising one or more deoxyribonucleotides, e.g., within a DNA-binding sequence in the RNA guide that binds to a sequence complementary to the target sequence. In some examples, the DNA-binding sequence may contain a DNA sequence or a DNA/RNA hybrid sequence. The terms CRISPR RNA (crRNA), pre-crRNA, and mature crRNA are also used herein to refer to RNA guides.

As used herein, the term "complementary" refers to a first polynucleotide (e.g., a spacer sequence of an RNA guide) that has a degree of complementarity to a second polynucleotide (e.g., a complementary sequence of a target sequence) such that the first polynucleotide and the second polynucleotide can form a double-stranded complex by base pairing to allow an effector polypeptide complexed with the first polynucleotide to act on (e.g., cleave) the second polynucleotide. In some embodiments, the first polynucleotide may be substantially complementary to the second polynucleotide, i.e., at least about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementary to the second polynucleotide. In some embodiments, the first polynucleotide is fully complementary to the second polynucleotide, i.e., has 100% complementarity to the second polynucleotide.
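The complementarity described here can be made concrete for a spacer and its binding site on the non-PAM strand. The sketch below derives the non-PAM-strand site as the reverse complement of a target sequence and scores antiparallel RNA-DNA base pairing. Counting only Watson-Crick pairs (no G-U wobble) and the function names are assumptions made for illustration.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# RNA base (spacer) paired with DNA base (non-PAM strand); Watson-Crick only.
RNA_DNA_PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}

def non_pam_strand_site(target_dna):
    """Reverse complement of the target sequence, written 5' to 3'."""
    return target_dna.upper().translate(DNA_COMPLEMENT)[::-1]

def percent_complementarity(spacer_rna, site_dna):
    """Percent of spacer positions that pair with the site when the two
    strands are aligned antiparallel (spacer 5'->3' against site 3'->5')."""
    if len(spacer_rna) != len(site_dna):
        raise ValueError("this sketch only compares equal-length sequences")
    site_3_to_5 = site_dna.upper()[::-1]
    paired = sum(1 for r, d in zip(spacer_rna.upper(), site_3_to_5)
                 if (r, d) in RNA_DNA_PAIRS)
    return 100.0 * paired / len(spacer_rna)

# Target of SEQ ID NO: 1047 and spacer of SEQ ID NO: 1096.
site = non_pam_strand_site("CGGAGCATCCTTGGATACAG")
score = percent_complementarity("CGGAGCAUCCUUGGAUACAG", site)
print(site, score)
```

A spacer that is 100% identical to the target (apart from U for T) is, by construction, fully complementary to the corresponding site on the non-PAM strand.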

The "percent identity" (also known as sequence identity) of two nucleic acid sequences or two amino acid sequences is determined using the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-77, 1993. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul et al., J. Mol. Biol. 215:403-10, 1990. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12, to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3, to obtain amino acid sequences homologous to the protein molecules of interest. Where gaps exist between two sequences, Gapped BLAST can be utilized as described in Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.

As used herein, the term "editing" refers to the introduction of one or more modifications into a target nucleic acid, e.g., within the HAO1 gene. The edits may be one or more substitutions, one or more insertions, one or more deletions, or a combination thereof. As used herein, the term "substitution" refers to the replacement of one or more nucleotides with a different one or more nucleotides relative to a reference sequence. As used herein, the term "insertion" refers to the gain of one or more nucleotides in a nucleic acid sequence relative to a reference sequence. As used herein, the term "deletion" refers to the loss of one or more nucleotides in a nucleic acid sequence relative to a reference sequence.

No specific procedure is implied in how to prepare sequences comprising deletions. For example, sequences comprising deletions may be synthesized directly from individual nucleotides. In other embodiments, the deletion is made by providing and then altering the reference sequence. The nucleic acid sequence may be in the genome of the organism. The nucleic acid sequence may be in a cell. The nucleic acid sequence may be a DNA sequence. Deletions may be frame shift mutations or non-frame shift mutations. Deletions as described herein refer to deletions of up to several kilobases.

As used herein, the terms "upstream" and "downstream" refer to relative positions within a single nucleic acid (e.g., DNA) sequence in a nucleic acid molecule. "Upstream" and "downstream" are defined relative to the 5' to 3' direction in which RNA transcription occurs. A first sequence is upstream of a second sequence when the 3' end of the first sequence occurs before the 5' end of the second sequence. A first sequence is downstream of a second sequence when the 5' end of the first sequence occurs after the 3' end of the second sequence. In some embodiments, the 5'-NTTN-3' or 5'-TTN-3' sequence is located upstream of an indel described herein, and the Cas12i-induced indel is located downstream of the 5'-NTTN-3' or 5'-TTN-3' sequence.

I. Gene editing system

In some aspects, the present disclosure provides a gene editing system comprising an RNA guide targeting a HAO1 gene. Such gene editing systems can be used to edit HAO1 target genes, e.g., disrupt HAO1 genes.

Hydroxy acid oxidase 1 (HAO 1, also known as glycolate oxidase [ GOX or GO ]) converts glycolate to glyoxylate. It has been suggested that inhibition of HAO1 in individuals with PH1 will block glyoxylate formation and that excess glycolic acid will be excreted through urine. The idea of treating PH1 by inhibition of HAO1 is further supported by the fact that some individuals with aberrant splice variants of HAO1 have no symptoms of glycoluria, and thus increased urinary glycolysis, without significant renal pathology. Thus, inhibition of HAO1 expression will block production of glyoxylic acid and in turn its metabolite oxalic acid. Thus, the HAO1 gene-targeted gene editing systems disclosed herein may be used to treat Primary Hyperoxaluria (PH) in a subject in need of treatment.

In some embodiments, the RNA guide is composed of a direct repeat component and a spacer component. In some embodiments, the RNA guide binds to the Cas12i polypeptide. In some embodiments, the spacer component is specific for a HAO1 target sequence, wherein the HAO1 target sequence is adjacent to a 5'-NTTN-3' or 5'-TTN-3' PAM sequence as described herein. In the case of a double-stranded target, the RNA guide binds to the first strand of the target (i.e., the non-PAM strand), and the PAM sequence as described herein is present in the second, complementary strand (i.e., the PAM strand).

In some embodiments, the present disclosure provides a composition comprising a complex, wherein the complex comprises an RNA guide that targets HAO1. In some embodiments, the disclosure includes a complex comprising an RNA guide and a Cas12i polypeptide. In some embodiments, the RNA guide and the Cas12i polypeptide are bound to each other in a molar ratio of about 1:1. In some embodiments, the complex comprising the RNA guide and the Cas12i polypeptide binds to a complementary region of a target sequence within the HAO1 gene. In some embodiments, the complex comprising the HAO1-targeting RNA guide and the Cas12i polypeptide binds to the complementary region of the target sequence within the HAO1 gene at a molar ratio of about 1:1. In some embodiments, the complex has an enzymatic activity, such as a nuclease activity, that can cleave the HAO1 target sequence and/or the complementary sequence. The RNA guide, the Cas12i polypeptide, and the complementary region of the HAO1 target sequence, either alone or together, do not occur naturally. In some embodiments, the RNA guide in the complex comprises the direct repeat sequences and/or spacer sequences described herein. In some embodiments, the sequence of the RNA guide has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity) to the sequence of any one of SEQ ID NOs 967-1023. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 967-1023.

In some embodiments, the disclosure described herein includes compositions comprising an RNA guide as described herein and/or an RNA encoding a Cas12i polypeptide as described herein. In some embodiments, the RNA guide and the RNA encoding the Cas12i polypeptide are included together within the same composition. In some embodiments, the RNA guide and the RNA encoding the Cas12i polypeptide are included in separate compositions. In some embodiments, the RNA guide comprises the direct repeat sequences and/or spacer sequences described herein. In some embodiments, the sequence of the RNA guide has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity) to the sequence of any one of SEQ ID NOs 967-1023. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 967-1023.

The gene editing systems disclosed herein provide advantages over other known nuclease systems. Cas12i polypeptides are smaller than other nucleases. For example, Cas12i2 is 1,054 amino acids in length, whereas Streptococcus pyogenes Cas9 (SpCas9) is 1,368 amino acids in length, Streptococcus thermophilus Cas9 (StCas9) is 1,128 amino acids in length, FnCpf1 is 1,300 amino acids in length, AsCpf1 is 1,307 amino acids in length, and LbCpf1 is 1,246 amino acids in length. Cas12i RNA guides, which do not require a trans-activating CRISPR RNA (tracrRNA), are also smaller than Cas9 RNA guides. The smaller Cas12i polypeptide and RNA guide sizes facilitate delivery. Compositions comprising a Cas12i polypeptide also exhibit reduced off-target activity compared to compositions comprising a SpCas9 polypeptide. See PCT/US2021/025257, which is incorporated by reference in its entirety. Furthermore, the indels induced by compositions comprising a Cas12i polypeptide differ from the indels induced by compositions comprising a SpCas9 polypeptide. For example, SpCas9 polypeptides primarily induce insertions and deletions of 1 nucleotide in length, whereas Cas12i polypeptides induce larger deletions, which can be beneficial for disrupting a larger portion of a gene such as HAO1.
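The size comparison above is simple arithmetic, sketched below. The amino acid lengths are copied from the preceding paragraph. The helper names and the estimate of three nucleotides per residue plus one stop codon are assumptions added for illustration, and they ignore promoters, tags, and other elements of a real expression construct.

```python
# Lengths in amino acids, exactly as stated in the text above.
LENGTHS_AA = {
    "Cas12i2": 1054,
    "SpCas9": 1368,
    "StCas9": 1128,
    "FnCpf1": 1300,
    "AsCpf1": 1307,
    "LbCpf1": 1246,
}


def coding_length_nt(length_aa: int) -> int:
    """Minimal open reading frame: 3 nt per residue plus one stop codon (assumption)."""
    return 3 * length_aa + 3


def size_difference(reference: str = "Cas12i2") -> dict:
    """Extra residues of each nuclease relative to the reference nuclease."""
    ref = LENGTHS_AA[reference]
    return {name: aa - ref for name, aa in LENGTHS_AA.items() if name != reference}


for name, extra in size_difference().items():
    print(f"{name}: {extra} more residues than Cas12i2 (about {3 * extra} nt of coding sequence)")
```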

Also provided herein are systems for gene editing of a hydroxy acid oxidase 1 (HAO1) gene, the systems comprising (i) a Cas12i polypeptide (e.g., a Cas12i2 polypeptide) or a first nucleic acid encoding the Cas12i polypeptide (e.g., a Cas12i2 polypeptide comprising an amino acid sequence at least 95% identical to SEQ ID NO: 922, which may comprise one or more mutations relative to SEQ ID NO: 922); and (ii) an RNA guide or a second nucleic acid encoding the RNA guide, wherein the RNA guide comprises a spacer sequence specific for a target sequence within a HAO1 gene (e.g., within exon 1 or exon 2 of the HAO1 gene), the target sequence being adjacent to a protospacer adjacent motif (PAM) comprising the motif 5'-TTN-3' (5'-NTTN-3'), which is located 5' of the target sequence.
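The PAM-and-target arrangement described above can be sketched as a scan of the PAM strand for a 5'-NTTN-3' motif followed immediately by a target sequence. This is a minimal illustration only: the function names, the 20-nucleotide default target length, and the demonstration sequence are assumptions (the sequence is made up and is not HAO1), and the sketch searches a single strand rather than both strands of a locus. The spacer is written here with the same bases as the PAM-strand target (U in place of T), on the reading that the RNA guide binds the non-PAM strand.

```python
import re


def find_targets(pam_strand: str, target_len: int = 20):
    """Return (position, PAM, target) for each 5'-NTTN-3' PAM followed by a full-length target.

    A lookahead is used so that overlapping PAM sites are all reported.
    """
    seq = pam_strand.upper()
    pattern = re.compile(rf"(?=([ACGT]TT[ACGT])([ACGT]{{{target_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def spacer_for(target: str) -> str:
    """RNA spacer with the same base sequence as the PAM-strand target (U for T)."""
    return target.upper().replace("T", "U")


# Made-up sequence for illustration; it is not taken from HAO1.
demo = "GCATTCACGTACGTAGCTAGCTAGGATCCG"
for pos, pam, target in find_targets(demo):
    print(pos, pam, target, spacer_for(target))
```

A PAM that sits too close to the 3' end of the input to be followed by a full-length target is skipped rather than reported with a truncated target.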

RNA guide

In some embodiments, the gene editing systems described herein include RNA guides that target the HAO1 gene, e.g., exon 1 or exon 2 of the HAO1 gene. In some embodiments, the gene editing systems described herein may include two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) RNA guides that target HAO1.

The RNA guide can direct the Cas12i polypeptide comprised in the gene editing system as described herein to the HAO1 target sequence. Two or more RNA guides can direct two or more separate Cas12i polypeptides (e.g., Cas12i polypeptides having the same or different sequences) as described herein to two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more) HAO1 target sequences.

Those skilled in the art who read the following examples of specific classes of RNA guides will understand that in some embodiments, the RNA guides are HAO1 target specific. That is, in some embodiments, the RNA guide specifically binds to one or more HAO1 target sequences (e.g., within a cell) and does not bind to non-targeting sequences (e.g., non-specific DNA or random sequences within the same cell).

In some embodiments, the RNA guide comprises a spacer sequence followed by a direct repeat sequence, referring to the sequences in the 5' to 3' direction. In some embodiments, the RNA guide comprises a first direct repeat sequence followed by a spacer sequence and a second direct repeat sequence, referring to the sequences in the 5' to 3' direction. In some embodiments, the first direct repeat and the second direct repeat of such an RNA guide are identical. In some embodiments, the first direct repeat and the second direct repeat of such an RNA guide are different.

In some embodiments, the spacer sequence and the direct repeat sequence of the RNA guide are present within the same RNA molecule. In some embodiments, the spacer and the direct repeat sequence are directly linked to each other. In some embodiments, a short linker is present between the spacer and the direct repeat sequence, e.g., an RNA linker of 1, 2, or 3 nucleotides in length. In some embodiments, the spacer sequence and the direct repeat sequence of the RNA guide are present in separate molecules that are linked to each other by base pairing interactions.
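As a rough illustration of the single-molecule arrangements described above, the sketch below joins guide components in a caller-supplied 5' to 3' order, optionally with a short linker between them. The function name and both sequences are placeholders invented for this sketch; they do not correspond to any SEQ ID NO in this disclosure.

```python
RNA_BASES = set("ACGU")


def join_components(components, linker=""):
    """Join guide components on one RNA molecule, listed 5' to 3'.

    `linker` is an optional short RNA linker (at most 3 nt) placed between
    adjacent components; an empty linker means they are directly linked.
    """
    parts = [c.upper() for c in components]
    linker = linker.upper()
    if len(linker) > 3:
        raise ValueError("linker longer than 3 nucleotides")
    for seq in parts + [linker]:
        if not set(seq) <= RNA_BASES:
            raise ValueError(f"non-RNA character in {seq!r}")
    return linker.join(parts)


# Placeholder sequences, invented for this sketch (not from any SEQ ID NO).
DIRECT_REPEAT = "GGAAUUCCGGAAUUCCGGAA"
SPACER = "ACGUACGUAGCUAGCUAGGA"

two_part = join_components([SPACER, DIRECT_REPEAT])
three_part = join_components([DIRECT_REPEAT, SPACER, DIRECT_REPEAT])
with_linker = join_components([SPACER, DIRECT_REPEAT], linker="AU")
print(len(two_part), len(three_part), len(with_linker))
```

Passing the components as an ordered list keeps the function neutral about which arrangement is used. The embodiment in which the spacer and direct repeat sit on separate, base-paired molecules is not modeled here.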

Additional information about exemplary direct repeat sequences and spacer components of the RNA guide is provided below.

(i) Direct repeat sequence

In some embodiments, the RNA guide comprises a direct repeat sequence. In some embodiments, the direct repeat of the RNA guide has a length of between 12-100, 13-75, 14-50, or 15-40 nucleotides (e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 nucleotides).

In some embodiments, the direct repeat sequence is a sequence of table 1 or a portion of a sequence of table 1. The direct repeat sequence may comprise nucleotide 1 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 2 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 3 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may comprise nucleotide 4 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 5 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 6 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 7 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 8 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 9 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may comprise nucleotide 10 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 11 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 12 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotide 13 to nucleotide 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may include nucleotides 14 to 36 of any of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8.

The direct repeat sequence may include nucleotide 1 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 2 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 3 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 4 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 5 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 6 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 7 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 8 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 9 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 10 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 11 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may include nucleotide 12 to nucleotide 34 of SEQ ID NO. 9. In some embodiments, the direct repeat sequence is shown in SEQ ID NO. 10. In some embodiments, the direct repeat sequence comprises a portion of the sequence set forth in SEQ ID NO. 10.
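The "nucleotide k to nucleotide 36" portions listed above follow a single pattern: successive 5' truncations of a full-length direct repeat. The sketch below enumerates that pattern. Because the sequences of SEQ ID NOs 1-9 are not reproduced in this passage, it borrows the 36-nucleotide sequence printed for SEQ ID NO:959 in Table 3 purely as a worked input; applying this numbering to that sequence is an assumption made for illustration, and the function name is hypothetical.

```python
def five_prime_truncations(full_repeat: str, starts) -> dict:
    """Map each 1-based start position k to 'nucleotide k through the last nucleotide'."""
    return {k: full_repeat[k - 1:] for k in starts}


# SEQ ID NO:959 as printed in Table 3 (36 nt), used only as an example input.
repeat = "GUUGGAAUGACUAAUUUUUGUGCCCACCGUUGGCAC"

portions = five_prime_truncations(repeat, range(1, 15))  # k = 1 through 14
for k, portion in portions.items():
    print(f"nucleotide {k} to nucleotide {len(repeat)}: {portion} ({len(portion)} nt)")
```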

In some embodiments, the direct repeat sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 1 or a portion of a sequence of table 1. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 1, 2, 3, 4, 5, 6, 7 or 8.

The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 2 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 3 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 4 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 5 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 6 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 7 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 8 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 9 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 10 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 11 to nucleotide 34 of SEQ ID NO. 9. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 12 to nucleotide 34 of SEQ ID NO. 9. In some embodiments, the direct repeat sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity) to SEQ ID NO. 10. In some embodiments, the direct repeat sequence has at least 90% identity to a portion of the sequence set forth in SEQ ID NO. 10.

In some embodiments, a composition comprising a Cas12i2 polypeptide and an RNA guide comprising the direct repeat sequence of SEQ ID NO: 10 and a spacer of 20 nucleotides in length is capable of introducing an indel into the HAO1 target sequence. See, e.g., Example 1, wherein indels are measured at forty-four HAO1 target sequences after delivery of the RNA guide and the Cas12i2 polypeptide of SEQ ID NO: 924 by RNP into HEK293T cells; Example 2, wherein indels are measured at eleven HAO1 target sequences after delivery of the RNA guide and the Cas12i2 polypeptide of SEQ ID NO: 924 by RNP into HepG2 cells; and Example 3, wherein indels are measured at five HAO1 target sequences after delivery of the RNA guide and the Cas12i2 polypeptide of SEQ ID NO: 924 by RNP into primary hepatocytes.

In some embodiments, the direct repeat sequence is at least 90% identical to the reverse complement of any one of SEQ ID NOs 1-10 (see Table 1). In some embodiments, the direct repeat is the reverse complement of any one of SEQ ID NOs 1-10.
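For readers less familiar with the term, the reverse complement of an RNA sequence can be computed as sketched below. The function name is an assumption, and the input is a short made-up sequence rather than any of SEQ ID NOs 1-10.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Complement each base (A<->U, C<->G), then reverse, so the result also reads 5'->3'."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


print(reverse_complement_rna("AUGC"))  # GCAU
```

Applying the function twice returns the original sequence, which is a convenient sanity check.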

Table 1. Cas12i2 direct repeat sequences

In some embodiments, the direct repeat sequence is a sequence of table 2 or a portion of a sequence of table 2. The direct repeat may comprise nucleotide 1 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 2 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may include nucleotide 3 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 4 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 5 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may include nucleotide 6 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may include nucleotide 7 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 8 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may include nucleotide 9 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 10 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. 
The direct repeat may include nucleotide 11 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 12 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 13 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953. The direct repeat may comprise nucleotide 14 to nucleotide 36 of any of SEQ ID NOs 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952 or 953.

In some embodiments, the direct repeat sequence has at least 95% identity (e.g., at least 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 2 or a portion of a sequence of table 2. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 95% identity to a sequence comprising nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 936-953.

In some embodiments, the direct repeat sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 2 or a portion of a sequence of table 2. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 936-953. The direct repeat sequence may have at least 90% identity to a sequence comprising nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 936-953.

In some embodiments, the direct repeat sequence is at least 90% identical to the reverse complement of any one of SEQ ID NOs 936-953. In some embodiments, the direct repeat sequence is at least 95% identical to the reverse complement of any one of SEQ ID NOs 936-953. In some embodiments, the direct repeat sequence is the reverse complement of any one of SEQ ID NOs 936-953.

In some embodiments, the direct repeat sequence is at least 90% identical to SEQ ID NO. 954 or a portion of SEQ ID NO. 954. In some embodiments, the direct repeat sequence is at least 95% identical to SEQ ID NO. 954 or a portion of SEQ ID NO. 954. In some embodiments, the direct repeat sequence is 100% identical to SEQ ID NO. 954 or a portion of SEQ ID NO. 954.

Table 2. Cas12i4 direct repeat sequences

In some embodiments, the direct repeat sequence is a sequence of table 3 or a portion of a sequence of table 3. In some embodiments, the direct repeat sequence has at least 95% identity (e.g., at least 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 3 or a portion of a sequence of table 3. In some embodiments, the direct repeat sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 3 or a portion of a sequence of table 3. In some embodiments, the direct repeat sequence is at least 90% identical to the reverse complement of any one of SEQ ID NOs 959-961. In some embodiments, the direct repeat sequence is at least 95% identical to the reverse complement of any one of SEQ ID NOs 959-961. In some embodiments, the direct repeat sequence is the reverse complement of any one of SEQ ID NOs 959-961.

Table 3. Cas12i1 direct repeat sequences

Sequence identifier	Direct repeat sequence
SEQ ID NO:959	GUUGGAAUGACUAAUUUUUGUGCCCACCGUUGGCAC
SEQ ID NO:960	AAUUUUUGUGCCCAUCGUUGGCAC
SEQ ID NO:961	AUUUUUGUGCCCAUCGUUGGCAC

In some embodiments, the direct repeat sequence is a sequence of table 4 or a portion of a sequence of table 4. In some embodiments, the direct repeat sequence has at least 95% identity (e.g., at least 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 4 or a portion of a sequence of table 4. In some embodiments, the direct repeat sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 4 or a portion of a sequence of table 4. In some embodiments, the direct repeat sequence is at least 90% identical to the reverse complement of any one of SEQ ID NOs 962-964. In some embodiments, the direct repeat sequence is at least 95% identical to the reverse complement of any one of SEQ ID NOs 962-964. In some embodiments, the direct repeat sequence is the reverse complement of any one of SEQ ID NOs 962-964.

Table 4. Cas12i3 direct repeat sequences

Sequence identifier	Direct repeat sequence
SEQ ID NO:962	CUAGCAAUGACCUAAUAGUGUGUCCUUAGUUGACAU
SEQ ID NO:963	CCUACAAUACCUAAGAAAUCCGUCCUAAGUUGACGG
SEQ ID NO:964	AUAGUGUGUCCUUAGUUGACAU

In some embodiments, the direct repeat described herein comprises uracil (U). In some embodiments, the direct repeat described herein comprises thymine (T). In some embodiments, the direct repeat sequences according to tables 1-4 include sequences comprising thymine in one or more positions indicated as uracil in tables 1-4.
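The uracil/thymine equivalence stated above can be checked position by position. The sketch below is one hedged way to express it: the function name is an assumption, and the reference used is the sequence printed for SEQ ID NO:964 in Table 4.

```python
def matches_allowing_t_for_u(candidate: str, reference: str) -> bool:
    """True if `candidate` equals `reference` except that any U in the reference may appear as T."""
    candidate, reference = candidate.upper(), reference.upper()
    if len(candidate) != len(reference):
        return False
    return all(c == r or (r == "U" and c == "T") for c, r in zip(candidate, reference))


reference = "AUAGUGUGUCCUUAGUUGACAU"         # SEQ ID NO:964 as printed in Table 4
all_t = reference.replace("U", "T")          # thymine at every uracil position
one_t = "ATAGUGUGUCCUUAGUUGACAU"             # thymine at one uracil position only

print(matches_allowing_t_for_u(all_t, reference))
print(matches_allowing_t_for_u(one_t, reference))
```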

(ii) Spacer sequence

In some embodiments, the RNA guide comprises a DNA targeting or spacer sequence. In some embodiments, the spacer sequence of the RNA guide is between 12-100, 13-75, 14-50, or 15-30 nucleotides (e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides) in length and is complementary to the non-PAM strand sequence. In some embodiments, the spacer sequence is designed to be complementary to a particular DNA strand, e.g., a DNA strand of a genomic locus.

In some embodiments, the RNA guide spacer sequence is substantially identical to the complementary strand of the target sequence. In some embodiments, the RNA guide comprises a sequence (e.g., a spacer sequence) having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity to a complementary strand of a reference nucleic acid sequence, e.g., a target sequence. The percent identity between two such nucleic acids can be determined manually by examining the two optimally aligned nucleic acid sequences or by a software program or algorithm (e.g., BLAST, ALIGN, CLUSTAL) using standard parameters.
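As a simplified illustration of a percent identity calculation, the sketch below compares two sequences of equal length position by position, treating U and T as equivalent. This is an assumption-laden shortcut: tools such as BLAST or CLUSTAL first compute an alignment that may include gaps, which this sketch does not attempt. The function name and the example sequences are made up.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences (U and T treated alike)."""
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Made-up 20-nt sequences differing at one position.
spacer = "ACGUACGUAGCUAGCUAGGA"
reference = "ACGTACGTAGCTAGCTAGGC"
print(percent_identity(spacer, reference))  # 95.0
```

With a 20-nucleotide spacer, each mismatch lowers the identity by 5 percentage points, so the 90% threshold tolerates at most two mismatches under this ungapped definition.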

In some embodiments, the RNA guide comprises a spacer sequence that is between 12-100, 13-75, 14-50, or 15-30 nucleotides in length (e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides) and is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to a region on the non-PAM strand that is complementary to the target sequence. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target DNA sequence. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target genomic sequence. In some embodiments, the RNA guide comprises a sequence, e.g., an RNA sequence, that is up to 50 nucleotides in length and is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to a region on the non-PAM strand that is complementary to the target sequence. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target DNA sequence. In some embodiments, the RNA guide comprises a sequence that is at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% complementary to the target genomic sequence.

In some embodiments, the spacer sequence is a sequence of table 5 or a portion of a sequence of table 5. It will be appreciated that "SEQ ID NOs 466-920" is to be read as equivalent to a list of SEQ ID NOs 466 through 920 in which every intermediate number appears, i.e., each integer from 466 to 920 inclusive (466, 467, 468, and so on through 918, 919, and 920).

The spacer sequence may comprise nucleotide 1 to nucleotide 16 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 17 of any one of SEQ ID NOs 466 to 920. The spacer sequence may include nucleotide 1 to nucleotide 18 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 19 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 20 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 21 of any one of SEQ ID NOs 466 to 920. The spacer sequence may include nucleotide 1 to nucleotide 22 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 23 of any one of SEQ ID NOs 466 to 920. The spacer sequence may include nucleotide 1 to nucleotide 24 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 25 of any one of SEQ ID NOs 466 to 920. The spacer sequence may include nucleotide 1 to nucleotide 26 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 27 of any one of SEQ ID NOs 466 to 920. The spacer sequence may include nucleotide 1 to nucleotide 28 of any one of SEQ ID NOs 466 to 920. The spacer sequence may comprise nucleotide 1 to nucleotide 29 of any one of SEQ ID NOs 466 to 920. The spacer sequence may include nucleotide 1 to nucleotide 30 of any one of SEQ ID NOs 466 to 920.

In some embodiments, the spacer sequence has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity) to a sequence of table 5 or a portion of a sequence of table 5. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 16 of any one of SEQ ID NOs 466 to 920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 17 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 18 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 19 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 20 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 21 of any one of SEQ ID NOs 466 to 920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 22 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 23 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 24 of any one of SEQ ID NOs 466-920.

The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 25 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 26 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 27 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 28 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 29 of any one of SEQ ID NOs 466-920. The spacer sequence may have at least 90% identity to a sequence comprising nucleotide 1 to nucleotide 30 of any one of SEQ ID NOs 466-920.
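By way of illustration only, the "nucleotide 1 to nucleotide N" portions and the 90% identity threshold described above can be sketched in Python as follows. The function names and the example sequences are hypothetical and are not sequences of Table 5. The sketch assumes a simple ungapped, position-by-position comparison of equal-length sequences, which is only one of several ways in which percent identity may be determined.

```python
def spacer_prefix(spacer: str, length: int) -> str:
    """Return nucleotide 1 through nucleotide `length` (1-based, inclusive)."""
    if not 16 <= length <= 30:
        raise ValueError("length must be between 16 and 30")
    if length > len(spacer):
        raise ValueError("spacer is shorter than the requested length")
    return spacer[:length]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences (assumption)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 30-nucleotide spacer; NOT a sequence of Table 5.
reference = "ACGUACGUACGUACGUACGUACGUACGUAC"
# Hypothetical candidate that differs from the 20-nucleotide portion at position 20.
candidate = "ACGUACGUACGUACGUACGA"

prefix20 = spacer_prefix(reference, 20)
identity = percent_identity(prefix20, candidate)
meets_threshold = identity >= 90.0
```

In this sketch, a single mismatch over 20 nucleotides gives 95% identity, which satisfies the "at least 90%" condition. Sequences of unequal length would require an alignment step that is not shown here.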

TABLE 5 target and spacer sequences


* The 5'-TTN-3' 3-nucleotide PAM motif is shown in bold.

The present disclosure encompasses all combinations of the direct repeat sequences and spacers listed above, consistent with the disclosure herein.

In some embodiments, the spacer sequences described herein include uracil (U). In some embodiments, the spacer sequences described herein include thymine (T). In some embodiments, the spacer sequence according to table 5 includes a sequence comprising thymine in one or more positions indicated as uracil in table 5.
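As a non-limiting sketch, the substitution of thymine for uracil at one or more positions can be expressed in Python as shown below. The function name and the example sequence are hypothetical and are not taken from Table 5.

```python
from typing import Iterable, Optional


def with_thymine_at(spacer: str, positions: Optional[Iterable[int]] = None) -> str:
    """Replace U with T at the given 1-based positions.

    If `positions` is None, every uracil in the sequence is replaced.
    """
    chars = list(spacer)
    explicit = positions is not None
    targets = positions if explicit else range(1, len(chars) + 1)
    for pos in targets:
        if not 1 <= pos <= len(chars):
            raise ValueError(f"position {pos} is outside the sequence")
        if chars[pos - 1] == "U":
            chars[pos - 1] = "T"
        elif explicit:
            # An explicitly requested position must be shown as uracil.
            raise ValueError(f"position {pos} is not uracil")
    return "".join(chars)


# Hypothetical 8-nucleotide example; NOT a sequence of Table 5.
example = "GUACUUAG"
partial = with_thymine_at(example, [2, 6])  # thymine at two chosen positions
full = with_thymine_at(example)             # thymine at every uracil position
```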

(iii) Exemplary RNA guide

The present disclosure provides RNA guides (e.g., as shown in table 5 above) that include any and all combinations of the direct repeat sequences and spacers described herein. In some embodiments, the sequence of the RNA guide has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity) to the sequence of any one of SEQ ID NOs 967-1023. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 967-1023.

In some embodiments, exemplary RNA guides provided herein may include a spacer sequence of any one of SEQ ID NOs 1093-1097. In one example, the RNA guide may include a spacer of SEQ ID NO 1096.

Any of the exemplary RNA guides disclosed herein can include the direct repeat sequence of any one of SEQ ID NOs 1-10 or a fragment thereof that is at least 23 nucleotides in length. In one example, the direct repeat sequence may include SEQ ID NO: 10.

In specific examples, the RNA guide provided herein can include the nucleotide sequence of SEQ ID NOs 967, 968, 988, 989, or 994. In one example, the RNA guide provided herein includes the nucleotide sequence of SEQ ID NO: 989.

(iv) Modifications

The RNA guide may comprise one or more covalent modifications relative to a reference sequence, in particular the parent polyribonucleotide; such modified RNA guides are included within the scope of the present disclosure.

Exemplary modifications may include any modification to a sugar, nucleobase, internucleoside linkage (e.g., to a linked phosphate/phosphodiester linkage/to a phosphodiester backbone), and any combination thereof. Some exemplary modifications provided herein are described in detail below.

The RNA guide can comprise any available modification, such as to a sugar, nucleobase, or internucleoside linkage (e.g., to a linking phosphate, to a phosphodiester linkage, to a phosphodiester backbone). One or more atoms of a pyrimidine nucleobase may be replaced or substituted with an optionally substituted amino group, an optionally substituted thiol, an optionally substituted alkyl group (e.g., methyl or ethyl), or a halo group (e.g., chloro or fluoro). In certain embodiments, a modification (e.g., one or more modifications) is present in each of the sugar and the internucleoside linkage. The modification may be a modification of ribonucleic acid (RNA) to deoxyribonucleic acid (DNA), threose nucleic acid (TNA), glycol nucleic acid (GNA), peptide nucleic acid (PNA), locked nucleic acid (LNA), or hybrids thereof. Additional modifications are described herein.

In some embodiments, the modification may comprise a chemical modification or a cell-induced modification. For example, some non-limiting examples of intracellular RNA modifications are described by Lewis and Pan in "RNA modifications and structures cooperate to guide RNA-protein interactions", Nat Reviews Mol Cell Biol, 2017, 18:202-210.

Different sugar modifications, nucleotide modifications, and/or internucleoside linkages (e.g., backbone structures) may be present at different positions in the sequence. One of ordinary skill in the art will appreciate that nucleotide analogs or other modifications may be located at any position of the sequence, provided that the function of the sequence is not substantially reduced. The sequence may comprise about 1% to about 100% modified nucleotides (relative to the total nucleotide content, or relative to one or more types of nucleotides, i.e., any one or more of A, G, U, or C) or any intermediate percentage (e.g., 1% to 20%, 1% to 25%, 1% to 50%, 1% to 60%, 1% to 70%, 1% to 80%, 1% to 90%, 1% to 95%, 10% to 20%, 10% to 25%, 10% to 50%, 10% to 60%, 10% to 70%, 10% to 80%, 10% to 90%, 10% to 95%, 10% to 100%, 20% to 25%, 20% to 50%, 20% to 60%, 20% to 70%, 20% to 80%, 20% to 90%, 20% to 95%, 20% to 100%, 50% to 60%, 50% to 70%, 50% to 80%, 50% to 95%, 50% to 100%, 70% to 80%, 70% to 90%, 70% to 95%, 70% to 100%, 80% to 90%, 80% to 95%, and 95% to 100%).

In some embodiments, sugar modifications (e.g., at the 2' position or the 4' position) or substitution of the sugar at one or more ribonucleotides of the sequence may, like backbone modifications, comprise a modification or substitution of a phosphodiester linkage. Specific examples of sequences include, but are not limited to, sequences comprising a modified backbone or no natural internucleoside linkages, such as internucleoside modifications, including modification or substitution of the phosphodiester linkages. In addition, sequences having modified backbones include those that do not have a phosphorus atom in the backbone. For the purposes of this application, and as sometimes referenced in the art, modified RNAs that do not have a phosphorus atom in their internucleoside backbone can also be considered oligonucleosides. In particular embodiments, the sequence will comprise ribonucleotides with a phosphorus atom in their internucleoside backbone.

Modified sequence backbones may comprise, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates such as 3'-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates such as 3'-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity, wherein adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'. Various salts, mixed salts, and free acid forms are also included. In some embodiments, the sequence may be negatively or positively charged.

Modified nucleotides that may be incorporated into the sequence may be modified on the internucleoside linkage (e.g., the phosphate backbone). Herein, the phrases "phosphate" and "phosphodiester" are used interchangeably in the context of a polynucleotide backbone. Backbone phosphate groups may be modified by replacing one or more of the oxygen atoms with a different substituent. Further, modified nucleosides and nucleotides can comprise the wholesale replacement of an unmodified phosphate moiety with another internucleoside linkage as described herein. Examples of modified phosphate groups include, but are not limited to, phosphorothioates, phosphoroselenates, boranophosphates, boranophosphate esters, hydrogen phosphonates, phosphoramidates, phosphorodiamidates, alkyl or aryl phosphonates, and phosphotriesters. Phosphorodithioates have both non-linking oxygens replaced by sulfur. The phosphate linker can also be modified by replacing a linking oxygen with nitrogen (bridged phosphoramidates), sulfur (bridged phosphorothioates), or carbon (bridged methylene-phosphonates).

The α-thio substituted phosphate moiety imparts stability to RNA and DNA polymers through the unnatural phosphorothioate backbone linkages. Phosphorothioate DNA and RNA have increased nuclease resistance and therefore a longer half-life in a cellular environment.

In particular embodiments, the modified nucleoside comprises an α-thio-nucleoside (e.g., 5'-O-(1-phosphorothioate)-adenosine, 5'-O-(1-phosphorothioate)-cytidine (α-thio-cytidine), 5'-O-(1-phosphorothioate)-guanosine, 5'-O-(1-phosphorothioate)-uridine, or 5'-O-(1-phosphorothioate)-pseudouridine).

Other internucleoside linkages, including internucleoside linkages that do not contain a phosphorus atom, that may be employed in accordance with the present disclosure are described herein.

In some embodiments, the sequence may comprise one or more cytotoxic nucleosides. For example, cytotoxic nucleosides can be incorporated into the sequence, such as a bifunctional modification. Cytotoxic nucleosides may include, but are not limited to, adenosine arabinoside, 5-azacytidine, 4'-thio-aracytidine, cyclopentenylcytosine, cladribine, clofarabine, cytarabine, cytosine arabinoside, 1-(2-C-cyano-2-deoxy-β-D-arabino-pentofuranosyl)-cytosine, decitabine, 5-fluorouracil, fludarabine, floxuridine, gemcitabine, a combination of tegafur and uracil, tegafur ((RS)-5-fluoro-1-(tetrahydrofuran-2-yl)pyrimidine-2,4(1H,3H)-dione), troxacitabine, tezacitabine, 2'-deoxy-2'-methylidenecytidine (DMDC), and 6-mercaptopurine. Further examples include fludarabine phosphate, N4-behenoyl-1-β-D-arabinofuranosylcytosine, N4-octadecyl-1-β-D-arabinofuranosylcytosine, N4-palmitoyl-1-(2-C-cyano-2-deoxy-β-D-arabino-pentofuranosyl)cytosine, and P-4055 (cytarabine 5'-elaidic acid ester).

In some embodiments, the sequence comprises one or more post-transcriptional modifications (e.g., capping, cleavage, polyadenylation, splicing, poly-A sequence, methylation, acylation, phosphorylation, methylation of lysine and arginine residues, acetylation, and nitrosylation of thiol groups and tyrosine residues, etc.). The one or more post-transcriptional modifications may be any post-transcriptional modification, such as any of the more than one hundred different nucleoside modifications that have been identified in RNA (Rozenski, J., Crain, P., and McCloskey, J. (1999). The RNA Modification Database: 1999 update. Nucl Acids Res 27: 196-197). In some embodiments, the first isolated nucleic acid comprises messenger RNA (mRNA). In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-uridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxy-uridine, 3-methyl-uridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyl-uridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydro-pseudouridine, 2-thio-2-methoxy-uridine, 2-methoxy-4-thio-uridine, and 2-methoxy-4-thio-pseudouridine.
In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: 5-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine, N4-acetylcytidine, 5-formylcytidine, N4-methylcytidine, 5-hydroxymethylcytidine, 1-methyl-pseudoisocytidine, pyrrolo-pseudoisocytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, and 4-methyl-iso-cytidine. In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonyl-adenosine, 2-methylthio-N6-threonyl-adenosine, N6-dimethyladenosine, 7-methyladenosine, 2-methylthio-adenine, and 2-methoxy-adenine. In some embodiments, the mRNA comprises at least one nucleoside selected from the group consisting of: inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methyl-inosine, 6-methoxy-guanosine, 1-methylguanosine, N2-dimethylguanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, and N2,N2-dimethyl-6-thio-guanosine.

The sequence may or may not be uniformly modified along the entire length of the molecule. For example, one or more or all types of nucleotides (e.g., naturally occurring nucleotides, purines or pyrimidines, or any one or more or all of A, G, U, C, I, pU) may or may not be uniformly modified in the sequence, or in a given predetermined sequence region thereof. In some embodiments, the sequence comprises pseudouridine. In some embodiments, the sequence comprises inosine, which may help the immune system characterize the sequence as endogenous rather than viral RNA. The incorporation of inosine may also mediate improved RNA stability/reduced degradation. See, e.g., Yu, Z. et al. (2015) RNA editing by ADAR1 marks dsRNA as "self". Cell Res. 25, 1283-1284, which is incorporated by reference in its entirety.

In some embodiments, one or more of the nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification. In some embodiments, each of the first three nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification. In some embodiments, each of the last four nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification. In some embodiments, each of the first to last, second to last, and third to last nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification, and wherein the last nucleotide of the RNA guide is not modified. In some embodiments, each of the first three nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification, and each of the first to last, second to last, and third to last nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification.
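One possible reading of the end-modification pattern described above can be sketched in Python as follows. This is an illustrative assumption only: the sketch treats "the first to last, second to last, and third to last nucleotides" as the three nucleotides immediately preceding the final nucleotide, with the final nucleotide left unmodified. The function name and the guide length of 10 are hypothetical.

```python
from typing import List


def modification_mask(length: int, first: int = 3, before_last: int = 3) -> List[bool]:
    """Return one flag per nucleotide; True marks a 2'-O-methyl phosphorothioate position.

    Assumed pattern: the first `first` nucleotides and the `before_last`
    nucleotides immediately preceding the final nucleotide are modified,
    and the final nucleotide is unmodified.
    """
    if length < first + before_last + 1:
        raise ValueError("guide is too short for this pattern")
    mask = [False] * length
    for i in range(first):
        mask[i] = True
    for i in range(length - 1 - before_last, length - 1):
        mask[i] = True
    return mask


# Hypothetical 10-nucleotide guide, used only to keep the example short.
mask = modification_mask(10)
percent_modified = 100.0 * sum(mask) / len(mask)
```

For a guide of realistic length the same mask gives a much lower percentage of modified nucleotides, since only six positions are marked regardless of length.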

When the gene editing systems disclosed herein include a nucleic acid, e.g., an mRNA molecule, disclosed herein that encodes a Cas12i polypeptide, such nucleic acid molecule may contain any of the modifications disclosed herein, if applicable.

Cas12i Polypeptides

In some embodiments, the compositions or systems of the present disclosure comprise Cas12i polypeptides as described in WO/2019/178427, the relevant disclosure of which is incorporated by reference for the subject matter and purposes cited herein.

In some embodiments, the systems disclosed herein for gene editing comprise a Cas12i2 polypeptide described herein (e.g., including SEQ ID NO:922 and/or a polypeptide encoded by SEQ ID NO: 921). In some embodiments, the Cas12i2 polypeptide includes at least one RuvC domain.

The nucleic acid sequence encoding the Cas12i2 polypeptides described herein may be substantially identical to a reference nucleic acid sequence, e.g., SEQ ID NO: 921. In some embodiments, the Cas12i2 polypeptide is encoded by a nucleic acid comprising a sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity to a reference nucleic acid sequence, e.g., SEQ ID NO: 921. The percent identity between two such nucleic acids can be determined manually by inspection of the two optimally aligned nucleic acid sequences or by using software programs or algorithms (e.g., BLAST, ALIGN, CLUSTAL) with standard parameters. One indication that two nucleic acid sequences are substantially identical is that one nucleic acid molecule hybridizes to the complement of the other under stringent conditions of temperature and ionic strength (e.g., within a range of medium to high stringency). See, e.g., Tijssen, "Hybridization with Nucleic Acid Probes. Part I. Theory and Nucleic Acid Preparation" (Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24).

In some embodiments, the Cas12i2 polypeptide is encoded by a nucleic acid sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% or more sequence identity, but not 100% sequence identity to a reference nucleic acid sequence, e.g., SEQ ID No. 921.

In some embodiments, Cas12i2 polypeptides of the disclosure include polypeptide sequences that are at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 922.

In some embodiments, the disclosure describes Cas12i2 polypeptides having a certain degree of amino acid sequence identity to one or more reference polypeptides, e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99%, but not 100% sequence identity to the amino acid sequence of SEQ ID NO: 922. Homology or identity can be determined by amino acid sequence alignment, for example, using BLAST, ALIGN, or CLUSTAL, among other programs, as described herein.

Also provided are Cas12i2 polypeptides of the present disclosure having enzymatic activity, e.g., nuclease or endonuclease activity, and when aligned using any of the foregoing alignment methods, comprise an amino acid sequence that differs from the amino acid sequence of SEQ ID NO 922 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 amino acid residues.

In some examples, the Cas12i2 polypeptide may contain one or more mutations relative to SEQ ID NO: 922, e.g., at positions D581, G624, F626, P868, I926, V1030, E1035, S1046, or any combination thereof. In some cases, the one or more mutations are amino acid substitutions, e.g., D581R, G624R, F626R, P868T, I926R, V1030G, E1035R, S1046G, or a combination thereof.

In some examples, the Cas12i2 polypeptide contains mutations at positions D581, D911, I926, and V1030. Such Cas12i2 polypeptides may contain the amino acid substitutions D581R, D911R, I926R, and V1030G (e.g., SEQ ID NO: 923). In some examples, the Cas12i2 polypeptide contains mutations at positions D581, I926, and V1030. Such Cas12i2 polypeptides may contain the amino acid substitutions D581R, I926R, and V1030G (e.g., SEQ ID NO: 924). In some examples, the Cas12i2 polypeptide may contain mutations at positions D581, I926, V1030, and S1046. Such Cas12i2 polypeptides may contain the amino acid substitutions D581R, I926R, V1030G, and S1046G (e.g., SEQ ID NO: 925). In some examples, the Cas12i2 polypeptide may contain mutations at positions D581, G624, F626, I926, V1030, E1035, and S1046. Such Cas12i2 polypeptides may contain the amino acid substitutions D581R, G624R, F626R, I926R, V1030G, E1035R, and S1046G (e.g., SEQ ID NO: 926). In some examples, the Cas12i2 polypeptide may contain mutations at positions D581, G624, F626, P868, I926, V1030, E1035, and S1046. Such Cas12i2 polypeptides may contain the amino acid substitutions D581R, G624R, F626R, P868T, I926R, V1030G, E1035R, and S1046G (e.g., SEQ ID NO: 927).
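The substitution notation used above (parent residue, 1-based position, replacement residue, e.g., D581R) can be illustrated with the following Python sketch. The function name and the 12-residue parent sequence are hypothetical; the toy sequence is not SEQ ID NO: 922, and the positions used are chosen only so that the example is short.

```python
import re
from typing import Iterable


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Apply substitutions written as <parent residue><position><new residue>."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        parent_res, pos, new_res = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        # Confirm that the parent sequence carries the stated residue.
        if residues[pos - 1] != parent_res:
            raise ValueError(f"expected {parent_res} at position {pos}")
        residues[pos - 1] = new_res
    return "".join(residues)


# Hypothetical toy parent sequence; NOT SEQ ID NO: 922.
parent = "MDKGFPIVESAA"
variant = apply_substitutions(parent, ["D2R", "I7R", "V8G"])

# Number of residues at which the variant differs from the parent.
differences = sum(a != b for a, b in zip(parent, variant))
```

The final count corresponds to the "differs by N amino acid residues" language used for the polypeptides herein, under the simplifying assumption that the two sequences are of equal length and already aligned.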

In some embodiments, Cas12i2 polypeptides of the disclosure include polypeptide sequences that are at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 923, SEQ ID NO: 924, SEQ ID NO: 925, SEQ ID NO: 926, or SEQ ID NO: 927. In some embodiments, a Cas12i2 polypeptide having at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 923, SEQ ID NO: 924, SEQ ID NO: 925, SEQ ID NO: 926, or SEQ ID NO: 927 maintains the amino acid changes (or at least 1, 2, 3, etc. of these changes) that distinguish the polypeptide from its corresponding parent/reference sequence.

In some embodiments, the disclosure describes Cas12i2 polypeptides having a particular degree of amino acid sequence identity to one or more reference polypeptides, e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99%, but not 100%, sequence identity to the amino acid sequence of SEQ ID No. 923, SEQ ID No. 924, SEQ ID No. 925, SEQ ID No. 926, or SEQ ID No. 927. Homology or identity can be determined by amino acid sequence alignment, for example, using BLAST, ALIGN, or CLUSTAL, among other programs, as described herein.

Cas12i2 polypeptides of the present disclosure having enzymatic activity, e.g., nuclease or endonuclease activity, are also provided and when aligned using any of the foregoing alignment methods, comprise an amino acid sequence that differs from the amino acid sequence of SEQ ID No. 923, SEQ ID No. 924, SEQ ID No. 925, SEQ ID No. 926, or SEQ ID No. 927 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 amino acid residues.

In some embodiments, the compositions of the present disclosure comprise a Cas12i4 polypeptide described herein (e.g., including SEQ ID NO:956 and/or a polypeptide encoded by SEQ ID NO: 955). In some embodiments, the Cas12i4 polypeptide includes at least one RuvC domain.

The nucleic acid sequence encoding a Cas12i4 polypeptide described herein may be substantially identical to a reference nucleic acid sequence, e.g., SEQ ID No. 955. In some embodiments, the Cas12i4 polypeptide is encoded by a nucleic acid comprising a sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% or at least about 99.5% sequence identity to a reference nucleic acid sequence, e.g., SEQ ID NO: 955. The percent identity between two such nucleic acids can be determined manually by examining the two optimally aligned nucleic acid sequences or by a software program or algorithm (e.g., BLAST, ALIGN, CLUSTAL) using standard parameters. One indication that two nucleic acid sequences are substantially identical is that the nucleic acid molecule hybridizes to the complement of the other under stringent temperature and ionic strength conditions (e.g., in the medium to high stringency range).

In some embodiments, the Cas12i4 polypeptide is encoded by a nucleic acid sequence having at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% or more sequence identity, but not 100% sequence identity to a reference nucleic acid sequence, e.g., SEQ ID NO: 955.

In some embodiments, Cas12i4 polypeptides of the disclosure include polypeptide sequences having at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 956.

In some embodiments, the disclosure describes Cas12i4 polypeptides having a certain degree of amino acid sequence identity to one or more reference polypeptides, e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99%, but not 100% sequence identity to the amino acid sequence of SEQ ID NO: 956. Homology or identity can be determined by amino acid sequence alignment, for example, using BLAST, ALIGN, or CLUSTAL, among other programs, as described herein.

Cas12i4 polypeptides of the present disclosure having enzymatic activity, e.g., nuclease or endonuclease activity, are also provided and when aligned using any of the foregoing alignment methods, comprise an amino acid sequence that differs from the amino acid sequence of SEQ ID NO 956 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 amino acid residues.

In some embodiments, the Cas12i4 polypeptide comprises a polypeptide having the sequence of SEQ ID NO:957 or SEQ ID NO: 958.

In some embodiments, Cas12i4 polypeptides of the disclosure include polypeptide sequences that are at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 957 or SEQ ID NO: 958. In some embodiments, a Cas12i4 polypeptide having at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 957 or SEQ ID NO: 958 retains the amino acid changes (or at least 1, 2, 3, etc. of these changes) that distinguish it from its corresponding parent/reference sequence.

In some embodiments, the disclosure describes Cas12i4 polypeptides having a certain degree of amino acid sequence identity to one or more reference polypeptides, e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99%, but not 100% sequence identity to the amino acid sequence of SEQ ID NO:957 or SEQ ID NO: 958. Homology or identity can be determined by amino acid sequence alignment, for example, using BLAST, ALIGN, or CLUSTAL, among other programs, as described herein.

Cas12i4 polypeptides of the present disclosure having enzymatic activity, e.g., nuclease or endonuclease activity, are also provided and when aligned using any of the foregoing alignment methods, comprise amino acid sequences that differ from the amino acid sequence of SEQ ID NO:957 or SEQ ID NO:958 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 amino acid residues.

In some embodiments, the compositions of the present disclosure comprise a Cas12i1 polypeptide described herein (e.g., a polypeptide comprising SEQ ID NO: 965). In some embodiments, the Cas12i1 polypeptide includes at least one RuvC domain.

In some embodiments, Cas12i1 polypeptides of the disclosure include polypeptide sequences having at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO: 965.

In some embodiments, the disclosure describes Cas12i1 polypeptides having a certain degree of amino acid sequence identity to one or more reference polypeptides, e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99%, but not 100% sequence identity to the amino acid sequence of SEQ ID No. 965. Homology or identity can be determined by amino acid sequence alignment, for example, using BLAST, ALIGN, or CLUSTAL, among other programs, as described herein.

Also provided are Cas12i1 polypeptides of the present disclosure having enzymatic activity, e.g., nuclease or endonuclease activity, and when aligned using any of the foregoing alignment methods, comprise an amino acid sequence that differs from the amino acid sequence of SEQ ID NO 965 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 amino acid residues.

In some embodiments, the compositions of the present disclosure comprise a Cas12i3 polypeptide described herein (e.g., a polypeptide comprising SEQ ID NO: 966). In some embodiments, the Cas12i3 polypeptide includes at least one RuvC domain.

In some embodiments, Cas12i3 polypeptides of the disclosure include polypeptide sequences having at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 966.

In some embodiments, the disclosure describes Cas12i3 polypeptides having a certain degree of amino acid sequence identity to one or more reference polypeptides, e.g., at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or even at least 99%, but not 100% sequence identity to the amino acid sequence of SEQ ID NO: 966. Homology or identity can be determined by amino acid sequence alignment, for example, using BLAST, ALIGN, or CLUSTAL, among other programs, as described herein.

Also provided are Cas12i3 polypeptides of the present disclosure having enzymatic activity, e.g., nuclease or endonuclease activity, that, when aligned using any of the foregoing alignment methods, comprise an amino acid sequence that differs from the amino acid sequence of SEQ ID NO: 966 by 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, or 0 amino acid residues.

Although the changes described herein may be one or more amino acid changes, changes to the Cas12i polypeptide may also be more substantial, such as fusion of polypeptides as amino-terminal and/or carboxy-terminal extensions. For example, the Cas12i polypeptide may contain additional peptides, e.g., one or more peptides. Examples of additional peptides include epitope peptides for tagging, such as polyhistidine tags (His-tags), Myc, and FLAG. In some embodiments, Cas12i polypeptides described herein can be fused to a detectable moiety, such as a fluorescent protein (e.g., green fluorescent protein (GFP) or yellow fluorescent protein (YFP)).

In some embodiments, the Cas12i polypeptide includes at least one (e.g., two, three, four, five, six, or more) Nuclear Localization Signal (NLS). In some embodiments, the Cas12i polypeptide includes at least one (e.g., two, three, four, five, six, or more) Nuclear Export Signal (NES). In some embodiments, the Cas12i polypeptide comprises at least one (e.g., two, three, four, five, six, or more) NLS and at least one (e.g., two, three, four, five, six, or more) NES.

In some embodiments, Cas12i polypeptides described herein may be self-inactivating. See Epstein et al., "Engineering a Self-Inactivating CRISPR System for AAV Vectors," Mol. Ther., 24 (2016): S50, which is incorporated by reference in its entirety.

In some embodiments, the nucleotide sequence encoding a Cas12i polypeptide described herein may be codon optimized for a particular host cell or organism. For example, the nucleic acid can be codon optimized for any non-human eukaryote, including mice, rats, rabbits, dogs, livestock, or non-human primates. Codon usage tables are readily available, for example, at the "Codon Usage Database" at www.kazusa.or.jp/codon, and these tables can be adapted in a number of ways. See Nakamura et al., Nucleic Acids Research 28:292 (2000), which is incorporated herein by reference in its entirety. Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, PA). In some examples, a nucleic acid encoding a Cas12i polypeptide as disclosed herein, such as a Cas12i2 polypeptide, can be an mRNA molecule, which can be codon optimized.
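To make the idea of codon optimization from a usage table concrete, the following Python sketch back-translates a protein sequence by picking the most frequent codon for each amino acid. This is only one naive strategy, assumed here for illustration; it is not the method of any particular tool, and practical algorithms typically also weigh factors such as GC content and repeats. The table `TOY_USAGE` contains placeholder frequencies invented for this example, not values from a real codon usage database.

```python
# Toy codon-usage table: amino acid -> {codon: relative frequency}.
# The numbers are placeholders for illustration only, NOT real usage data.
TOY_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.40, "AAG": 0.60},
    "L": {"CTG": 0.40, "CTC": 0.20, "TTG": 0.15,
          "CTT": 0.10, "TTA": 0.08, "CTA": 0.07},
    "*": {"TGA": 0.50, "TAA": 0.30, "TAG": 0.20},
}


def back_translate(protein: str, usage: dict) -> str:
    """Return a DNA coding sequence using the most frequent codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in usage:
            raise KeyError(f"no codon usage entry for residue {residue!r}")
        table = usage[residue]
        # Pick the codon with the highest relative frequency for this residue.
        codons.append(max(table, key=table.get))
    return "".join(codons)


print(back_translate("MKL*", TOY_USAGE))
```

Swapping in a host-specific table changes the output without touching the function, which mirrors how a single protein sequence can be re-encoded for different host cells.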

Exemplary Cas12i polypeptide sequences and corresponding nucleotide sequences are listed in Table 6.

Table 6. Cas12i and HAO1 sequences

In some embodiments, the gene editing systems disclosed herein can include a Cas12i polypeptide as disclosed herein. In other embodiments, the gene editing system can include a nucleic acid encoding a Cas12i polypeptide. For example, the gene editing system can include a vector encoding a Cas12i polypeptide (e.g., a viral vector, such as an AAV vector, e.g., AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAVrh10, AAV11, or AAV12). Alternatively, the gene editing system may comprise an mRNA molecule encoding a Cas12i polypeptide. In some cases, the mRNA molecule may be codon optimized.

II. Preparation of Gene Editing System Components

The present disclosure provides methods for producing components of the gene editing systems disclosed herein, e.g., methods for producing RNA guides, methods for producing Cas12i polypeptides, and methods for complexing RNA guides and Cas12i polypeptides.

A. RNA guide

In some embodiments, the RNA guide is prepared by in vitro transcription of a DNA template. Thus, for example, in some embodiments, the RNA guide is generated by in vitro transcription of a DNA template encoding the RNA guide using an upstream promoter sequence (e.g., a T7 polymerase promoter sequence). In some embodiments, the DNA template encodes multiple RNA guides, or the in vitro transcription reaction comprises multiple different DNA templates, each encoding a different RNA guide. In some embodiments, the RNA guide is prepared using chemical synthesis methods. In some embodiments, the RNA guide is prepared by expressing the RNA guide sequence in cells transfected with a plasmid comprising a sequence encoding the RNA guide. In some embodiments, the plasmid encodes a plurality of different RNA guides. In some embodiments, a plurality of different plasmids each encoding a different RNA guide are transfected into the cell. In some embodiments, the RNA guide is expressed by a plasmid encoding the RNA guide and also encoding the Cas12i polypeptide. In some embodiments, the RNA guide is expressed by a plasmid that expresses the RNA guide but does not express the Cas12i polypeptide. In some embodiments, the RNA guide is purchased from a commercial vendor. In some embodiments, for example, as described above, RNA guides are synthesized using one or more modified nucleotides.
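As a rough illustration of the in vitro transcription route described above, the following Python sketch assembles the sense strand of a DNA template by placing a T7 promoter upstream of the DNA equivalent of an RNA guide. The promoter string is the commonly used minimal T7 promoter, in which transcription starts at the final G. The overlap rule for guides that already begin with G, the function name, and the example guide sequence are assumptions for this sketch and are not taken from the disclosure.

```python
# Minimal T7 promoter; transcription begins at the final G (+1 position).
T7_PROMOTER = "TAATACGACTCACTATAG"


def ivt_template(guide_rna: str) -> str:
    """Return the sense-strand DNA template for transcribing `guide_rna`.

    Assumed design choice: if the guide already starts with G, that G is
    overlapped with the promoter's +1 G; otherwise the transcript carries
    one extra 5' G contributed by the promoter.
    """
    rna = guide_rna.upper()
    if not rna or set(rna) - set("ACGU"):
        raise ValueError("guide must be a non-empty RNA sequence (A, C, G, U)")
    dna = rna.replace("U", "T")
    if dna.startswith("G"):
        return T7_PROMOTER[:-1] + dna
    return T7_PROMOTER + dna


# Hypothetical guide sequence, used only to show the string handling.
print(ivt_template("ACGUACGUAC"))
```

A template encoding several RNA guides, or a pool of templates each encoding a different guide, could be generated by calling the same function once per guide sequence.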

B. Cas12i polypeptides

In some embodiments, Cas12i polypeptides of the present disclosure can be prepared by (a) culturing a bacterium that produces a Cas12i polypeptide of the present disclosure, isolating the Cas12i polypeptide, optionally purifying the Cas12i polypeptide, and complexing the Cas12i polypeptide with an RNA guide. Cas12i polypeptides may also be prepared by (b) known genetic engineering techniques, in particular, by isolating a gene encoding the Cas12i polypeptide of the present disclosure from bacteria, constructing a recombinant expression vector, and then transferring the vector into a suitable host cell that expresses the RNA guide, so that the recombinant protein is expressed and complexed with the RNA guide in the host cell. Alternatively, Cas12i polypeptides may be prepared by (c) an in vitro coupled transcription-translation system and then complexed with an RNA guide.

In some embodiments, a host cell is used to express the Cas12i polypeptide. The host cell is not particularly limited, and various known cells may be preferably used. Specific examples of the host cell include bacteria such as E. coli, yeasts (budding yeast, Saccharomyces cerevisiae, and fission yeast, Schizosaccharomyces pombe), nematodes (Caenorhabditis elegans), Xenopus laevis oocytes, and animal cells (e.g., CHO cells, COS cells, and HEK293 cells). The method for transferring the expression vector described above into host cells, i.e., the transformation method, is not particularly limited, and known methods such as electroporation, the calcium phosphate method, the liposome method, and the DEAE-dextran method may be used.

After transformation of the host with the expression vector, the host cell can be cultured, bred, or propagated for production of the Cas12i polypeptide. After expressing the Cas12i polypeptide, the host cells can be harvested and the Cas12i polypeptide can be purified from culture or the like according to conventional methods (e.g., filtration, centrifugation, cell disruption, gel filtration chromatography, ion exchange chromatography, etc.).

In some embodiments, the method for Cas12i polypeptide expression comprises translating at least 5 amino acids, at least 10 amino acids, at least 15 amino acids, at least 20 amino acids, at least 50 amino acids, at least 100 amino acids, at least 150 amino acids, at least 200 amino acids, at least 250 amino acids, at least 300 amino acids, at least 400 amino acids, at least 500 amino acids, at least 600 amino acids, at least 700 amino acids, at least 800 amino acids, at least 900 amino acids, or at least 1000 amino acids of the Cas12i polypeptide. In some embodiments, the method for protein expression comprises translating about 5 amino acids, about 10 amino acids, about 15 amino acids, about 20 amino acids, about 50 amino acids, about 100 amino acids, about 150 amino acids, about 200 amino acids, about 250 amino acids, about 300 amino acids, about 400 amino acids, about 500 amino acids, about 600 amino acids, about 700 amino acids, about 800 amino acids, about 900 amino acids, about 1000 amino acids, or more of the Cas12i polypeptide.

Various methods can be used to determine the production level of a Cas12i polypeptide in a host cell. Such methods include, but are not limited to, methods that use polyclonal or monoclonal antibodies specific for the Cas12i polypeptide or for a marker tag as described elsewhere herein. Exemplary methods include, but are not limited to, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescent immunoassay (FIA), and fluorescence-activated cell sorting (FACS). These and other assays are well known in the art (see, e.g., Maddox et al., J. Exp. Med. 158:1211 [1983]).

The present disclosure provides methods of expressing a Cas12i polypeptide in vivo in a cell, the methods comprising providing a host cell with a polynucleotide encoding a Cas12i polypeptide, wherein the polynucleotide encodes the Cas12i polypeptide, expressing the Cas12i polypeptide in the cell, and obtaining the Cas12i polypeptide from the cell.

The present disclosure further provides methods of expressing a Cas12i polypeptide in vivo in a cell, the methods comprising providing to a host cell a polyribonucleotide encoding the Cas12i polypeptide, and expressing the Cas12i polypeptide in the cell. In some embodiments, the polyribonucleotide encoding the Cas12i polypeptide is delivered to the cell with an RNA guide, and once expressed in the cell, the Cas12i polypeptide and the RNA guide form a complex. In some embodiments, the polyribonucleotide encoding the Cas12i polypeptide and the RNA guide are delivered to the cell within a single composition. In some embodiments, the polyribonucleotide encoding the Cas12i polypeptide and the RNA guide are included in separate compositions. In some embodiments, the host cell is present in a subject, e.g., a human patient.

C. Complexing

In some embodiments, the HAO1-targeting RNA guide is complexed with a Cas12i polypeptide to form a ribonucleoprotein. In some embodiments, complexing of the RNA guide and the Cas12i polypeptide occurs at a temperature of about any of the following: 20 ℃, 21 ℃, 22 ℃, 23 ℃, 24 ℃, 25 ℃, 26 ℃, 27 ℃, 28 ℃, 29 ℃, 30 ℃, 31 ℃, 32 ℃, 33 ℃, 34 ℃, 35 ℃, 36 ℃, 37 ℃, 38 ℃, 39 ℃, 40 ℃, 41 ℃, 42 ℃, 43 ℃, 44 ℃, 45 ℃, 50 ℃, or 55 ℃. In some embodiments, the RNA guide does not dissociate from the Cas12i polypeptide at about 37 ℃ over an incubation period of at least about any one of the following: 10 minutes, 15 minutes, 20 minutes, 25 minutes, 30 minutes, 35 minutes, 40 minutes, 45 minutes, 50 minutes, 55 minutes, 1 hour, 2 hours, 3 hours, 4 hours, or more.

In some embodiments, the RNA guide and Cas12i polypeptide are complexed in a complexing buffer. In some embodiments, the Cas12i polypeptide is stored in a buffer that is replaced with a complexing buffer to form a complex with the RNA guide. In some embodiments, the Cas12i polypeptide is stored in a complexing buffer.

In some embodiments, the pH of the complexing buffer is in the range of about 7.3 to 8.6. In certain embodiments, the pH of the complexing buffer is about 7.3, about 7.4, about 7.5, about 7.6, about 7.7, about 7.8, about 7.9, about 8.0, about 8.1, about 8.2, about 8.3, about 8.4, about 8.5, or about 8.6.

In some embodiments, the Cas12i polypeptide may be overexpressed in a host cell and complexed with an RNA guide prior to purification as described herein. In some embodiments, mRNA or DNA encoding the Cas12i polypeptide is introduced into the cell such that the Cas12i polypeptide is expressed in the cell. In some embodiments, the RNA guide is also introduced into the cell from a single mRNA or DNA construct, either simultaneously, separately or sequentially, thereby forming ribonucleoprotein complexes in the cell.

III. Methods of Gene Editing

The present disclosure also provides methods of modifying a target site within a HAO1 gene. In some embodiments, the method comprises introducing into a cell a HAO1-targeting RNA guide and a Cas12i polypeptide. The HAO1-targeting RNA guide and the Cas12i polypeptide may be introduced into the cell as a ribonucleoprotein complex. The HAO1-targeting RNA guide and the Cas12i polypeptide may be introduced on a nucleic acid vector. The Cas12i polypeptide may be introduced as an mRNA. The RNA guide may be introduced directly into the cell. In some embodiments, the compositions described herein are delivered to a cell/tissue/liver/human to reduce HAO1 in the cell/tissue/liver/human. In some embodiments, the compositions described herein are delivered to a cell/tissue/liver/human to reduce oxalate production by the cell/tissue/liver/human. In some embodiments, the compositions described herein are delivered to a cell/tissue/liver/human to correct calcium oxalate crystal deposition in the cell/tissue/liver/human. In some embodiments, the compositions described herein are delivered to a person suffering from primary hyperoxaluria.

Any of the gene editing systems disclosed herein may be used to genetically engineer HAO1 genes. The gene editing system can include an RNA guide and a Cas12i2 polypeptide. The RNA guide comprises a spacer sequence specific for a target sequence in the HAO1 gene, e.g. specific for a region in exon 1 or exon 2 of the HAO1 gene.

A. Target sequence

In some embodiments, the RNA guide disclosed herein is designed to be complementary to a target sequence adjacent to a 5'-TTN-3' PAM sequence or a 5'-NTTN-3' PAM sequence.

In some embodiments, the target sequence is within the HAO1 gene or the locus of the HAO1 gene (e.g., in exon 1 or exon 2), to which the RNA guide can bind by base pairing. In some embodiments, the cell has only one copy of the target sequence. In some embodiments, the cell has more than one copy, such as at least about any of 2, 3, 4, 5, 10, 100, or more copies of the target sequence.

In some embodiments, the HAO1 gene is a mammalian gene. In some embodiments, the HAO1 gene is a human gene. For example, in some embodiments, the target sequence is within the sequence of SEQ ID NO: 928 (or the reverse complement thereof). In some embodiments, the target sequence is within an exon of the HAO1 gene shown in SEQ ID NO: 928, e.g., within the sequence of SEQ ID NO: 929, 930, 931, 932, 933, 934, or 935 (or the reverse complement thereof). The target sequences within the exon regions of the HAO1 gene of SEQ ID NO: 928 are shown in Table 5. In some embodiments, the target sequence is within an intron of the HAO1 gene shown in SEQ ID NO: 928 (or the reverse complement thereof). In some embodiments, the target sequence is within a variant (e.g., a polymorphic variant) of the HAO1 gene sequence set forth in SEQ ID NO: 928 (or the reverse complement thereof). In some embodiments, the HAO1 gene sequence is a homolog of the sequence shown in SEQ ID NO: 928 (or the reverse complement thereof). For example, in some embodiments, the HAO1 gene sequence is a non-human HAO1 sequence. In some embodiments, the HAO1 gene sequence is the coding sequence shown in SEQ ID NO: 1024 (or the reverse complement thereof). In some embodiments, the HAO1 gene sequence is a homolog of the coding sequence set forth in SEQ ID NO: 1024 (or the reverse complement thereof).

In some embodiments, the target sequence is adjacent to a 5'-TTN-3' PAM sequence or a 5'-NTTN-3' PAM sequence, where N is any nucleotide. The 5'-NTTN-3' sequence may be immediately adjacent to the target sequence or, for example, within a small number (e.g., 1, 2, 3, 4, or 5) of nucleotides of the target sequence. In some embodiments, the 5'-NTTN-3' sequence is 5'-NTTY-3', 5'-NTTC-3', 5'-NTTT-3', 5'-NTTA-3', 5'-NTTB-3', 5'-NTTG-3', 5'-CTTY-3', 5'-DTTR-3', 5'-CTTR-3', 5'-DTTT-3', 5'-ATTN-3', or 5'-GTTN-3', where Y is C or T, B is any nucleotide other than A, D is any nucleotide other than C, and R is A or G. In some embodiments, the 5'-NTTN-3' sequence is 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3'. The PAM sequence may be located 5' to the target sequence.
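The degenerate PAM notation above (N, Y, B, D, R) can be expanded mechanically. The following Python sketch converts a degenerate pattern into a regular expression, tests individual four-nucleotide sequences against it, and scans one DNA strand for PAM sites with the target taken immediately 3' of the PAM. The function names, the default spacer length of 20, and the example sequence are assumptions for illustration; the sketch does not handle the case where the PAM is separated from the target by a few nucleotides.

```python
import re

# Degenerate nucleotide codes used in the text, expanded to regex classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any nucleotide
    "Y": "[CT]",    # C or T
    "R": "[AG]",    # A or G
    "B": "[CGT]",   # any nucleotide other than A
    "D": "[AGT]",   # any nucleotide other than C
}


def pam_regex(pattern: str) -> re.Pattern:
    """Compile a degenerate PAM pattern such as 'NTTN' into a regex."""
    return re.compile("".join(IUPAC[ch] for ch in pattern.upper()))


def matches_pam(pam: str, pattern: str) -> bool:
    """True if the concrete sequence `pam` fits the degenerate `pattern`."""
    return pam_regex(pattern).fullmatch(pam.upper()) is not None


def find_pam_sites(seq: str, pattern: str = "NTTN", spacer_len: int = 20):
    """List (pam_start, pam, target) with the target immediately 3' of the PAM."""
    seq = seq.upper()
    rx = pam_regex(pattern)
    k = len(pattern)
    sites = []
    # Slide one position at a time so overlapping PAMs are all reported.
    for i in range(len(seq) - k - spacer_len + 1):
        if rx.fullmatch(seq[i:i + k]):
            sites.append((i, seq[i:i + k], seq[i + k:i + k + spacer_len]))
    return sites


# Hypothetical sequence; a short spacer length keeps the example readable.
print(find_pam_sites("GGCTTCACGTACGTAA", "NTTN", spacer_len=4))
```

Scanning only one strand is a simplification; a fuller search would repeat the scan on the reverse complement of the input.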

In some embodiments, the RNA guide is designed to bind to a first strand (i.e., the non-PAM strand) of a double-stranded target nucleic acid, and the 5'-NTTN-3' PAM sequence is present in the second, complementary strand (i.e., the PAM strand). In some embodiments, the RNA guide binds to a region on the non-PAM strand that is complementary to a target sequence on the PAM strand and that is adjacent to a 5'-NAAN-3' sequence.
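The strand relationship described above can be checked with a reverse complement: a 5'-NTTN-3' PAM on the PAM strand reads as 5'-NAAN-3' on the non-PAM strand, and the region bound by the RNA guide is the reverse complement of the target sequence. The following Python sketch shows this with hypothetical sequences; the function names are assumptions for illustration.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def non_pam_strand_view(pam: str, target: str):
    """Return (binding_region, pam_complement) as read on the non-PAM strand."""
    return reverse_complement(target), reverse_complement(pam)


# Hypothetical PAM and target on the PAM strand.
region, pam_rc = non_pam_strand_view("CTTC", "ACGTGA")
print(region, pam_rc)
```

Here the PAM 5'-CTTC-3' appears as 5'-GAAG-3' on the non-PAM strand, which fits the 5'-NAAN-3' pattern.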

In some embodiments, the target sequence is present in a cell. In some embodiments, the target sequence is present in the nucleus. In some embodiments, the target sequence is endogenous to the cell. In some embodiments, the target sequence is genomic DNA. In some embodiments, the target sequence is chromosomal DNA. In some embodiments, the target sequence is a protein-encoding gene or a functional region thereof, such as a coding region, or a regulatory element, such as a promoter, enhancer, 5 'or 3' untranslated region, or the like.

In some embodiments, the target sequence is present in an accessible region of the target sequence. In some embodiments, the target sequence is located in an exon of the target gene. In some embodiments, the target sequence spans an exon-intron junction of the target gene. In some embodiments, the target sequence is present in a non-coding region, such as a regulatory region of a gene.

B. Gene editing

In some embodiments, the Cas12i polypeptide has enzymatic activity (e.g., nuclease activity). In some embodiments, the Cas12i polypeptide induces one or more DNA double-strand breaks in the cell. In some embodiments, the Cas12i polypeptide induces one or more DNA single-strand breaks in the cell. In some embodiments, the Cas12i polypeptide induces one or more DNA nicks in the cell. In some embodiments, DNA breaks and/or nicks result in the formation of one or more indels (e.g., one or more deletions).

In some embodiments, the RNA guides disclosed herein form a complex with the Cas12i polypeptide and guide the Cas12i polypeptide to a target sequence adjacent to the 5'-NTTN-3' sequence. In some embodiments, the complex induces a deletion (e.g., a nucleotide deletion or a DNA deletion) adjacent to the 5'-NTTN-3' sequence. In some embodiments, the complex induces a deletion adjacent to a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the complex induces a deletion adjacent to the T/C-rich sequence.

In some embodiments, the deletion is downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion is downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion is downstream of the T/C-rich sequence.

In some embodiments, the deletion alters the expression of the HAO1 gene. In some embodiments, the deletion alters the function of the HAO1 gene. In some embodiments, the deletion inactivates the HAO1 gene. In some embodiments, the deletion is a frameshift deletion. In some embodiments, the deletion is a non-frameshift deletion. In some embodiments, the deletion results in cytotoxicity or cell death (e.g., apoptosis).
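The distinction between frameshift and non-frameshift deletions comes down to whether the number of deleted nucleotides is a multiple of three. A minimal Python sketch follows; it assumes the deletion lies entirely within coding sequence, and the function name is hypothetical.

```python
def classify_deletion(deleted_nt: int) -> str:
    """Classify a coding-sequence deletion by its length in nucleotides.

    Assumption: the whole deletion lies within the coding sequence, so the
    reading frame is preserved only when whole codons (multiples of 3) are lost.
    """
    if deleted_nt <= 0:
        raise ValueError("deletion length must be a positive integer")
    return "non-frameshift" if deleted_nt % 3 == 0 else "frameshift"


for length in (1, 6, 14):
    print(length, classify_deletion(length))
```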

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-rich sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the T/C-rich sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-rich sequence.

In some embodiments, the deletion terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-rich sequence.

In some embodiments, the deletion terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the T/C-rich sequence.

In some embodiments, the deletion terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-rich sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-rich sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-enriched sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-enriched sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 5 to about 15 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-enriched sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the T/C-enriched sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the T/C-enriched sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletions begin within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminate within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletions begin within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) and terminate within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 5 to about 10 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 nucleotides) downstream of the T/C-enriched sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-enriched sequence and terminates within about 20 to about 30 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-enriched sequence and terminates within about 20 to about 25 nucleotides (e.g., about 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) of the T/C-enriched sequence.

In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the 5'-NTTN-3' sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of a 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3' sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of that sequence. In some embodiments, the deletion begins within about 10 to about 15 nucleotides (e.g., about 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides) downstream of the T/C-enriched sequence and terminates within about 25 to about 30 nucleotides (e.g., about 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, or 33 nucleotides) downstream of the T/C-enriched sequence.

In some embodiments, the length of the deletion is up to about 40 nucleotides (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides). In some embodiments, the length of the deletion is between about 4 nucleotides and about 40 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides). In some embodiments, the length of the deletion is between about 4 nucleotides and about 25 nucleotides (e.g., about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides). In some embodiments, the length of the deletion is between about 10 nucleotides and about 25 nucleotides (e.g., about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 nucleotides). In some embodiments, the length of the deletion is between about 10 nucleotides and about 15 nucleotides (e.g., about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17 nucleotides).
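
The positional and length ranges above can be illustrated with a minimal Python sketch. It enumerates the sixteen 5'-NTTN-3' motifs, locates them in a sequence, and checks whether a deletion's start and end, expressed as offsets downstream of a motif, fall within a chosen pair of windows. The function names, the toy sequence, the coordinate convention (0-based inclusive indices, with offset 1 being the nucleotide immediately 3' of the motif), and the strict window bounds are assumptions made for illustration only; the disclosure itself uses "about" ranges and does not prescribe any particular computation.

```python
import re
from itertools import product

# All sixteen expansions of 5'-NTTN-3', where N is A, T, G, or C.
NTTN_MOTIFS = [f"{a}TT{b}" for a, b in product("ATGC", repeat=2)]

def find_nttn_sites(seq):
    """Return 0-based start indices of every NTTN motif (overlaps included)."""
    # A lookahead is used so that overlapping motifs (e.g., in TTTTT) are all reported.
    return [m.start() for m in re.finditer(r"(?=[ACGT]TT[ACGT])", seq.upper())]

def downstream_offsets(motif_start, del_start, del_end):
    """Express the first and last deleted positions as offsets downstream of a motif.

    Assumed convention: del_start and del_end are 0-based, inclusive indices,
    and offset 1 is the nucleotide immediately 3' of the 4-nt motif.
    """
    first_downstream = motif_start + 4  # the motif is 4 nt long
    return (del_start - first_downstream + 1, del_end - first_downstream + 1)

def in_window(offsets, start_range=(5, 15), end_range=(20, 30)):
    """True if the start and end offsets fall inside the given inclusive ranges."""
    start, end = offsets
    return (start_range[0] <= start <= start_range[1]
            and end_range[0] <= end <= end_range[1])

def deletion_length(del_start, del_end):
    """Number of deleted nucleotides for inclusive 0-based coordinates."""
    return del_end - del_start + 1

# Toy example (hypothetical sequence): one ATTG motif at index 2,
# with a deletion covering indices 15 through 30.
toy = "CC" + "ATTG" + "A" * 40
sites = find_nttn_sites(toy)
offsets = downstream_offsets(sites[0], 15, 30)
print(sites, offsets, in_window(offsets), deletion_length(15, 30))
```

Narrower windows, such as a start within 5 to 10 nucleotides and an end within 25 to 30 nucleotides, can be tested by passing different `start_range` and `end_range` values.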

In some embodiments, the methods described herein are used to engineer cells comprising deletions as described herein in the HAO1 gene. In some embodiments, the methods are implemented using a complex comprising a Cas12i enzyme as described herein and an RNA guide comprising a direct repeat sequence and a spacer as described herein. In some embodiments, the sequence of the RNA guide has at least 90% identity (e.g., at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity) to the sequence of any one of SEQ ID NOs 967-1023. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 967-1023.
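
As a rough illustration of the "at least 90% identity" criterion, the sketch below computes an ungapped, position-by-position percent identity between a candidate RNA guide and a reference sequence. This is a simplifying assumption: identity is usually determined after sequence alignment, and the disclosure does not specify this calculation. The function names and the 20-nucleotide toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(query, reference):
    """Ungapped percent identity between two sequences.

    Assumption: positions are compared one-to-one from the 5' end and the
    longer length is the denominator, so a length difference counts as
    mismatches. Alignment-based tools may score differently.
    """
    if not query or not reference:
        raise ValueError("both sequences must be non-empty")
    q, r = query.upper(), reference.upper()
    matches = sum(a == b for a, b in zip(q, r))
    return 100.0 * matches / max(len(q), len(r))

def meets_identity(query, reference, threshold=90.0):
    """True if the ungapped identity is at or above the threshold (in percent)."""
    return percent_identity(query, reference) >= threshold

# Hypothetical toy sequences, for illustration only.
reference = "ACGUACGUACGUACGUACGU"
variant = "ACGUACGUACGUACGUACGA"  # differs from the reference at the last position
print(percent_identity(variant, reference), meets_identity(variant, reference))
```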

In some embodiments, the HAO1-targeting RNA guide is encoded in a plasmid. In some embodiments, the HAO1-targeting RNA guide is a synthetic or purified RNA. In some embodiments, the Cas12i polypeptide is encoded in a plasmid. In some embodiments, the Cas12i polypeptide is encoded by a synthetic or purified RNA.

C. Delivery

The components of any of the gene editing systems disclosed herein can be formulated, e.g., with a carrier such as a polymeric carrier or a liposome, and delivered to a cell (e.g., prokaryotic, eukaryotic, plant, mammalian, etc.) by known methods. Such methods include, but are not limited to, transfection (e.g., lipid-mediated, cationic polymer, calcium phosphate, or dendrimer transfection); electroporation or other membrane disruption methods (e.g., nucleofection); viral delivery (e.g., lentivirus, retrovirus, adenovirus, or adeno-associated virus (AAV)); microinjection; microprojectile bombardment ("gene gun"); FuGENE; direct sonic loading; cell squeezing; optical transfection; protoplast fusion; impalefection; magnetofection; exosome-mediated transfer; lipid nanoparticle-mediated transfer; and any combination thereof.

In some embodiments, the methods comprise delivering one or more nucleic acids (e.g., nucleic acids encoding a Cas12i polypeptide, an RNA guide, a donor DNA, etc.), one or more transcripts thereof, and/or a preformed RNA guide/Cas12i polypeptide complex to a cell, where a ternary complex is formed. In some embodiments, the RNA guide and the RNA encoding the Cas12i polypeptide are delivered together in a single composition. In some embodiments, the RNA guide and the RNA encoding the Cas12i polypeptide are delivered in separate compositions. In some embodiments, the RNA guide and the RNA encoding the Cas12i polypeptide delivered in separate compositions are delivered using the same delivery technique. In some embodiments, the RNA guide and the RNA encoding the Cas12i polypeptide delivered in separate compositions are delivered using different delivery techniques. Exemplary intracellular delivery methods include, but are not limited to: viruses, such as AAV, or virus-like agents; chemical-based transfection methods, such as those using calcium phosphate, dendrimers, liposomes, lipid nanoparticles, or cationic polymers (e.g., DEAE-dextran or polyethylenimine); non-chemical methods, such as microinjection, electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, bacterial conjugation, and delivery of plasmids or transposons; particle-based methods, such as using a gene gun, magnetofection or magnet-assisted transfection, and particle bombardment; and hybrid methods, such as nucleofection. In some embodiments, the lipid nanoparticle comprises an mRNA encoding a Cas12i polypeptide, an RNA guide, or both an mRNA encoding a Cas12i polypeptide and an RNA guide. In some embodiments, the mRNA encoding the Cas12i polypeptide is a transcript of the nucleotide sequence set forth in SEQ ID NO: 921 or SEQ ID NO: 955, or a variant thereof.
In some embodiments, the present application further provides cells produced by such methods, as well as organisms (e.g., animals, plants, or fungi) comprising or produced by such cells.

D. Genetically modified cells

Any of the gene editing systems disclosed herein can be delivered to a variety of cells. In some embodiments, the cell is an isolated cell. In some embodiments, the cell is in a cell culture or a co-culture of two or more cell types. In some embodiments, the cell is ex vivo. In some embodiments, the cells are obtained from a living organism and maintained in a cell culture. In some embodiments, the cell is a unicellular organism.

In some embodiments, the cell is a prokaryotic cell. In some embodiments, the cell is a bacterial cell or is derived from a bacterial cell. In some embodiments, the cell is an archaeal cell or is derived from an archaeal cell.

In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a plant cell or is derived from a plant cell. In some embodiments, the cell is a fungal cell or is derived from a fungal cell. In some embodiments, the cell is an animal cell or is derived from an animal cell. In some embodiments, the cell is an invertebrate cell or is derived from an invertebrate cell. In some embodiments, the cell is a vertebrate cell or is derived from a vertebrate cell. In some embodiments, the cell is a mammalian cell or is derived from a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cell is a zebrafish cell. In some embodiments, the cell is a rodent cell. In some embodiments, the cell is a synthetic cell, sometimes referred to as an artificial cell.

In some embodiments, the cell is derived from a cell line. A variety of cell lines for tissue culture are known in the art. Examples of cell lines include, but are not limited to, 293T, MF7, K562, HeLa, CHO, and transgenic variants thereof. Cell lines may be obtained from a variety of sources known to those skilled in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, VA)). In some embodiments, the cell is an immortal or immortalized cell.

In some embodiments, the cell is a primary cell. In some embodiments, the cell is a stem cell, such as a totipotent stem cell (e.g., an omnipotent stem cell), a pluripotent stem cell, a multipotent stem cell, an oligopotent stem cell, or a unipotent stem cell. In some embodiments, the cell is an induced pluripotent stem cell (iPSC) or is derived from an iPSC. In some embodiments, the cell is a differentiated cell. For example, in some embodiments, the differentiated cell is a liver cell (e.g., a hepatocyte), a biliary cell (e.g., a cholangiocyte), a stellate cell, a Kupffer cell, a liver sinusoidal endothelial cell, a muscle cell (e.g., a myocyte), a fat cell (e.g., an adipocyte), a bone cell (e.g., an osteoblast, an osteocyte, or an osteoclast), a blood cell (e.g., a monocyte, a lymphocyte, a neutrophil, an eosinophil, a basophil, a macrophage, a red blood cell, or a platelet), a neural cell (e.g., a neuron), an epithelial cell, an immune cell (e.g., a lymphocyte, a neutrophil, a monocyte, or a macrophage), a fibroblast, or a sex cell. In some embodiments, the cell is a terminally differentiated cell. For example, in some embodiments, the terminally differentiated cell is a neuronal cell, an adipocyte, a cardiomyocyte, a skeletal muscle cell, an epidermal cell, or an intestinal cell. In some embodiments, the cell is an immune cell. In some embodiments, the immune cell is a T cell. In some embodiments, the immune cell is a B cell. In some embodiments, the immune cell is a natural killer (NK) cell. In some embodiments, the immune cell is a tumor-infiltrating lymphocyte (TIL). In some embodiments, the cell is a mammalian cell, e.g., a human cell or a murine cell. In some embodiments, the murine cell is derived from a wild-type mouse, an immunosuppressed mouse, or a disease-specific mouse model. In some embodiments, the cell is a cell within a living tissue, organ, or organism.

Any genetically modified cell produced using any of the gene editing systems disclosed herein is also within the scope of the present disclosure. Such modified cells may include a disrupted HAO1 gene.

The compositions, vectors, nucleic acids, RNA guides, and cells disclosed herein can be used in therapy, including in methods of treating a disease or condition in a subject. Any suitable delivery or administration method known in the art can be used to deliver the compositions, vectors, nucleic acids, RNA guides, and cells disclosed herein. Such methods may involve contacting the target sequence with a composition, vector, nucleic acid, or RNA guide disclosed herein. Such methods may involve methods of editing HAO1 sequences as disclosed herein. In some embodiments, cells engineered using the RNA guides disclosed herein are used for ex vivo gene therapy.

IV. Therapeutic applications

Any gene editing system or modified cells produced using such gene editing systems as disclosed herein may be used to treat diseases associated with the HAO1 gene, such as Primary Hyperoxaluria (PH). In some embodiments, PH is PH1, PH2, or PH3. In a specific example, the target disease is PH1.

The gene editing systems, pharmaceutical compositions or kits comprising such gene editing systems, and any RNA guide disclosed herein can be used to treat Primary Hyperoxaluria (PH) in a subject. PH is a rare genetic disorder affecting subjects of all ages, from infants to the elderly. PH comprises three subtypes, each involving genetic defects that alter the expression of a different protein. PH1 involves alanine-glyoxylate aminotransferase (AGT/AGT1), PH2 involves glyoxylate/hydroxypyruvate reductase (GR/HPR), and PH3 involves 4-hydroxy-2-oxoglutarate aldolase (HOGA).

In PH1, excess oxalate can also bind calcium to form calcium oxalate in the kidneys and other organs. The deposition of calcium oxalate can cause widespread calcification of the kidney (nephrocalcinosis) or the formation of kidney stones and bladder stones (urolithiasis), and can lead to kidney damage. Common renal complications in PH1 include blood in the urine (hematuria), urinary tract infections, kidney damage, and end-stage renal disease (ESRD). Over time, the kidneys of patients with PH1 may begin to fail, and the level of oxalate in the blood may rise. Deposition of oxalate in tissues throughout the body (systemic oxalosis) can occur due to high blood oxalate levels and can lead to bone, skin, and eye complications. Patients with PH1 typically suffer kidney failure at an early age, with dialysis or dual kidney/liver transplantation being the only treatment options.

In some embodiments, provided herein are methods for treating a target disease disclosed herein (e.g., PH, such as PH 1), comprising administering to a subject in need of treatment (e.g., a human patient) any of the gene editing systems disclosed herein. The gene editing system may be delivered to a specific tissue or a specific type of cell that requires gene editing. The gene editing system may include an LNP that encompasses one or more of the components, one or more vectors (e.g., viral vectors) that encode one or more of the components, or a combination thereof. The components of the gene editing system may be formulated to form a pharmaceutical composition, which may further include one or more pharmaceutically acceptable carriers.

In some embodiments, modified cells produced using any of the gene editing systems disclosed herein can be administered to a subject (e.g., a human patient) in need of treatment. The modified cells may include substitutions, insertions, and/or deletions as described herein. In some examples, the modified cells can comprise cell lines modified by CRISPR nucleases, reverse transcriptase polypeptides, and editing template RNAs (e.g., RNA guide and RT donor RNAs). In some cases, the modified cells may be a heterogeneous population comprising cells with different types of gene edits. Alternatively, the modified cells may comprise a substantially homogeneous population of cells (e.g., at least 80% of the cells in the entire population) comprising one particular gene edit in the HAO1 gene. In some examples, the cells may be suspended in a suitable medium.

In some embodiments, provided herein are compositions comprising a gene editing system or components thereof. Such a composition may be a pharmaceutical composition. Useful pharmaceutical compositions may be prepared, packaged, or sold in formulations suitable for oral, rectal, vaginal, parenteral, topical, pulmonary, intranasal, intralesional, buccal, ocular, intravenous, intra-organ, or another route of administration. The pharmaceutical compositions of the present disclosure may be prepared, packaged, or sold in bulk, as a single unit dose, or as a plurality of single unit doses. As used herein, a "unit dose" is a discrete amount of a pharmaceutical composition (e.g., a gene editing system or component thereof) that is either the amount to be administered to a subject or a convenient fraction of such a dose, e.g., one half or one third of such a dose.

In some embodiments, a pharmaceutical composition comprising a gene editing system as described herein or components thereof may be administered to a subject in need thereof, e.g., a subject suffering from a liver disease associated with the HAO1 gene. In some cases, the gene editing system or components thereof may be delivered to a particular cell or tissue (e.g., to a liver cell), where the gene editing system may act to genetically modify HAO1 genes in such cells.

Formulations of pharmaceutical compositions suitable for parenteral administration may include the active agent (e.g., a gene editing system or component thereof or modified cells) in combination with a pharmaceutically acceptable carrier, such as sterile water or sterile isotonic saline. Such formulations may be prepared, packaged or sold in a form suitable for bolus administration or continuous administration. Some injectable formulations may be prepared, packaged or sold in unit dosage forms, such as in ampoules or in multi-dose containers containing a preservative. Some formulations for parenteral administration include, but are not limited to, suspensions, solutions, emulsions in oily or aqueous vehicles, pastes, and implantable sustained release or biodegradable formulations. Some formulations may further include one or more additional ingredients including, but not limited to, suspending agents, stabilizing agents, or dispersing agents.

The pharmaceutical composition may be in the form of a sterile injectable aqueous or oleaginous suspension or solution. Such suspensions or solutions may be formulated according to known techniques and may include additional ingredients in addition to the cells, such as dispersing agents, wetting agents, or suspending agents as described herein. Such sterile injectable formulations may be prepared using non-toxic parenterally acceptable diluents or solvents, such as water or saline. Other acceptable diluents and solvents include, but are not limited to, Ringer's solution, isotonic sodium chloride solution, and fixed oils, such as synthetic mono- or diglycerides. Other parenterally administrable formulations that may be useful include those that may include cells in packaged form, in liposome formulations, or as a component of a biodegradable polymer system. Some compositions for sustained release or implantation may include pharmaceutically acceptable polymers or hydrophobic materials, such as emulsions, ion exchange resins, sparingly soluble polymers, or sparingly soluble salts.

V. Kits and uses thereof

The present disclosure also provides kits that can be used, for example, to carry out the methods of genetically modifying the HAO1 gene described herein. In some embodiments, the kit comprises an RNA guide and a Cas12i polypeptide. In some embodiments, the kit comprises a polynucleotide encoding such a Cas12i polypeptide, and optionally the polynucleotide is included within a vector, e.g., as described herein. The Cas12i polypeptide and RNA guide (e.g., as a ribonucleoprotein) may be packaged in the same vial or other vessel within a kit or system, or may be packaged in separate vials or other vessels, the contents of which may be mixed prior to use. The kit may optionally further comprise buffers and/or instructions for use of the RNA guide and Cas12i polypeptide.

In some embodiments, the kit may be used for research purposes. For example, in some embodiments, the kit may be used to study gene function.

All references and publications cited herein are hereby incorporated by reference.

Further embodiments

Additional embodiments are provided below, which are also within the scope of the present disclosure.

Embodiment 1: A composition comprising an RNA guide, wherein the RNA guide comprises (i) a spacer sequence that is substantially complementary or fully complementary to a region on a non-PAM strand (the complement of a target sequence) within a HAO1 gene, and (ii) a direct repeat sequence; wherein the target sequence is adjacent to a Protospacer Adjacent Motif (PAM) comprising the sequence 5'-NTTN-3'.

In embodiment 1, the target sequence may be within exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, or exon 7 of the HAO1 gene. In some examples, the HAO1 gene comprises the sequence of SEQ ID NO: 928, the reverse complement of SEQ ID NO: 928, a variant of SEQ ID NO: 928, or the reverse complement of a variant of SEQ ID NO: 928.

In embodiment 1, the spacer sequence may include: (a) nucleotide 1 to nucleotide 16 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (b) nucleotide 1 to nucleotide 17 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (c) nucleotide 1 to nucleotide 18 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (d) nucleotide 1 to nucleotide 19 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (e) nucleotide 1 to nucleotide 20 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (f) nucleotide 1 to nucleotide 21 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (g) nucleotide 1 to nucleotide 22 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (h) nucleotide 1 to nucleotide 23 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (i) nucleotide 1 to nucleotide 24 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (j) nucleotide 1 to nucleotide 25 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (k) nucleotide 1 to nucleotide 26 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (l) nucleotide 1 to nucleotide 27 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (m) nucleotide 1 to nucleotide 28 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (n) nucleotide 1 to nucleotide 29 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; or (o) nucleotide 1 to nucleotide 30 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920.
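
As a rough illustration of the "at least 90% identical" criterion, the following Python sketch computes an ungapped, position-by-position percent identity between two sequences of equal length. This is a simplifying assumption: percent identity is commonly determined with a sequence alignment algorithm, and this section does not specify the method. The function names and the 20-nucleotide sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Simplifying assumption: positions are compared one-to-one with no
    alignment step, so insertions and deletions are not handled.
    """
    if len(a) != len(b):
        raise ValueError("this sketch compares equal-length sequences only")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity_threshold(candidate: str, reference: str,
                             threshold: float = 90.0) -> bool:
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical 20-nt sequences, not taken from the sequence listing.
reference = "ACGUACGUACGUACGUACGU"
candidate = "ACGUACGUACGUACGUACGA"  # differs at the last position only

print(percent_identity(candidate, reference))          # 19 of 20 positions match
print(meets_identity_threshold(candidate, reference))
```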

In any of the compositions according to embodiment 1, the spacer sequence may comprise: (a) nucleotide 1 to nucleotide 16 of any one of SEQ ID NOs: 466-920; (b) nucleotide 1 to nucleotide 17 of any one of SEQ ID NOs: 466-920; (c) nucleotide 1 to nucleotide 18 of any one of SEQ ID NOs: 466-920; (d) nucleotide 1 to nucleotide 19 of any one of SEQ ID NOs: 466-920; (e) nucleotide 1 to nucleotide 20 of any one of SEQ ID NOs: 466-920; (f) nucleotide 1 to nucleotide 21 of any one of SEQ ID NOs: 466-920; (g) nucleotide 1 to nucleotide 22 of any one of SEQ ID NOs: 466-920; (h) nucleotide 1 to nucleotide 23 of any one of SEQ ID NOs: 466-920; (i) nucleotide 1 to nucleotide 24 of any one of SEQ ID NOs: 466-920; (j) nucleotide 1 to nucleotide 25 of any one of SEQ ID NOs: 466-920; (k) nucleotide 1 to nucleotide 26 of any one of SEQ ID NOs: 466-920; (l) nucleotide 1 to nucleotide 27 of any one of SEQ ID NOs: 466-920; (m) nucleotide 1 to nucleotide 28 of any one of SEQ ID NOs: 466-920; (n) nucleotide 1 to nucleotide 29 of any one of SEQ ID NOs: 466-920; or (o) nucleotide 1 to nucleotide 30 of any one of SEQ ID NOs: 466-920.
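
Options (a) through (o) above all take the first n nucleotides of a spacer, with n running from 16 to 30. A minimal Python sketch of that enumeration follows; the function name and the placeholder sequence are hypothetical and do not correspond to any entry in the sequence listing.

```python
from typing import Dict


def spacer_prefixes(spacer: str, min_len: int = 16,
                    max_len: int = 30) -> Dict[int, str]:
    """Map each length n to nucleotides 1..n of the spacer (1-based, inclusive).

    Lengths longer than the supplied spacer are skipped.
    """
    upper = min(max_len, len(spacer))
    return {n: spacer[:n] for n in range(min_len, upper + 1)}


# Hypothetical 30-nt placeholder, not a sequence from the listing.
placeholder_spacer = "ACGU" * 7 + "AC"

prefixes = spacer_prefixes(placeholder_spacer)
for n, seq in prefixes.items():
    print(n, seq)
```

With a full 30-nucleotide input the dictionary has 15 entries, one for each of options (a) through (o).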

In any of the compositions according to embodiment 1, the direct repeat sequence may include: (a) nucleotide 1 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (b) nucleotide 2 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (c) nucleotide 3 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (d) nucleotide 4 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (e) nucleotide 5 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (f) nucleotide 6 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (g) nucleotide 7 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (h) nucleotide 8 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (i) nucleotide 9 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (j) nucleotide 10 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (k) nucleotide 11 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (l) nucleotide 12 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (m) nucleotide 13 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (n) nucleotide 14 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (o) nucleotide 1 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (p) nucleotide 2 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (q) nucleotide 3 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (r) nucleotide 4 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (s) nucleotide 5 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (t) nucleotide 6 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (u) nucleotide 7 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (v) nucleotide 8 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (w) nucleotide 9 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (x) nucleotide 10 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (y) nucleotide 11 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (z) nucleotide 12 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; or (aa) a sequence that is at least 90% identical to the sequence of SEQ ID NO: 10, or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (b) nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (c) nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (d) nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (e) nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (f) nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (g) nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (h) nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (i) nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (j) nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (k) nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (l) nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (m) nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (n) nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs: 1-8; (o) nucleotide 1 to nucleotide 34 of SEQ ID NO: 9; (p) nucleotide 2 to nucleotide 34 of SEQ ID NO: 9; (q) nucleotide 3 to nucleotide 34 of SEQ ID NO: 9; (r) nucleotide 4 to nucleotide 34 of SEQ ID NO: 9; (s) nucleotide 5 to nucleotide 34 of SEQ ID NO: 9; (t) nucleotide 6 to nucleotide 34 of SEQ ID NO: 9; (u) nucleotide 7 to nucleotide 34 of SEQ ID NO: 9; (v) nucleotide 8 to nucleotide 34 of SEQ ID NO: 9; (w) nucleotide 9 to nucleotide 34 of SEQ ID NO: 9; (x) nucleotide 10 to nucleotide 34 of SEQ ID NO: 9; (y) nucleotide 11 to nucleotide 34 of SEQ ID NO: 9; (z) nucleotide 12 to nucleotide 34 of SEQ ID NO: 9; or (aa) SEQ ID NO: 10 or a portion thereof.
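
The direct repeat options above keep the end of the sequence fixed and move the start position: nucleotide k to nucleotide 36 for k from 1 to 14, and nucleotide k to nucleotide 34 for k from 1 to 12. The following Python sketch enumerates those ranges under the assumption that positions are 1-based and inclusive. The function name and placeholder sequences are hypothetical and are not the direct repeats of the sequence listing.

```python
from typing import Dict


def direct_repeat_suffixes(direct_repeat: str, max_start: int) -> Dict[int, str]:
    """Map each start position k (1-based) to nucleotides k..end of the repeat."""
    if max_start > len(direct_repeat):
        raise ValueError("max_start exceeds the length of the direct repeat")
    return {k: direct_repeat[k - 1:] for k in range(1, max_start + 1)}


# Hypothetical placeholders standing in for a 36-nt and a 34-nt direct repeat.
placeholder_36 = "ACGU" * 9
placeholder_34 = placeholder_36[:34]

suffixes_36 = direct_repeat_suffixes(placeholder_36, max_start=14)
suffixes_34 = direct_repeat_suffixes(placeholder_34, max_start=12)

# The shortest option in each family is 23 nucleotides long:
# 36 - 14 + 1 = 23 and 34 - 12 + 1 = 23.
print(len(suffixes_36[14]), len(suffixes_34[12]))
```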

In some examples, the direct repeat sequence comprises: (a) Nucleotide 1 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (b) Nucleotide 2 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (c) Nucleotide 3 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (d) Nucleotide 4 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (e) Nucleotide 5 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (f) Nucleotide 6 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (g) Nucleotide 7 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (h) Nucleotide 8 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (i) Nucleotide 9 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (j) Nucleotide 10 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (k) Nucleotide 11 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (l) Nucleotide 12 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOS.936-953; (m) nucleotide 13 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs 936-953; (n) nucleotide 14 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs 936-953; or (o) a sequence which is at least 90% identical to the sequence of SEQ ID NO. 954, or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (b) nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (c) nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (d) nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (e) nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (f) nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (g) nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (h) nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (i) nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (j) nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (k) nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (l) nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (m) nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs: 936-953; (n) nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs: 936-953; or (o) SEQ ID NO: 954 or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) Nucleotide 1 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (b) Nucleotide 2 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (c) Nucleotide 3 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (d) Nucleotide 4 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (e) Nucleotide 5 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (f) Nucleotide 6 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (g) Nucleotide 7 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (h) Nucleotide 8 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (i) Nucleotide 9 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (j) Nucleotide 10 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (k) Nucleotide 11 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (l) Nucleotide 12 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (m) nucleotide 13 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (n) nucleotide 14 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; or (o) a sequence which is at least 90% identical to the sequence of SEQ ID No. 960 or SEQ ID No. 961, or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) nucleotide 1 to nucleotide 36 of SEQ ID NO. 959; (b) nucleotide 2 to nucleotide 36 of SEQ ID NO. 959; (c) nucleotide 3 to nucleotide 36 of SEQ ID NO. 959; (d) nucleotide 4 to nucleotide 36 of SEQ ID NO. 959; (e) nucleotide 5 to nucleotide 36 of SEQ ID NO. 959; (f) nucleotide 6 to nucleotide 36 of SEQ ID NO: 959; (g) nucleotide 7 to nucleotide 36 of SEQ ID NO. 959; (h) nucleotide 8 to nucleotide 36 of SEQ ID NO. 959; (i) nucleotide 9 to nucleotide 36 of SEQ ID NO 959; (j) nucleotide 10 to nucleotide 36 of SEQ ID NO 959; (k) nucleotide 11 to nucleotide 36 of SEQ ID NO: 959; (l) nucleotide 12 to nucleotide 36 of SEQ ID NO. 959; (m) nucleotide 13 to nucleotide 36 of SEQ ID NO. 959; (n) nucleotide 14 to nucleotide 36 of SEQ ID NO. 959; or (o) SEQ ID NO 960 or SEQ ID NO 961 or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) nucleotide 1 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (b) nucleotide 2 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (c) nucleotide 3 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (d) nucleotide 4 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (e) nucleotide 5 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (f) nucleotide 6 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (g) nucleotide 7 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (h) nucleotide 8 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (i) nucleotide 9 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (j) nucleotide 10 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (k) nucleotide 11 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (l) nucleotide 12 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (m) nucleotide 13 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (n) nucleotide 14 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; (o) nucleotide 15 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of SEQ ID NO: 962 or SEQ ID NO: 963; or (p) a sequence that is at least 90% identical to the sequence of SEQ ID NO: 964, or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) nucleotide 1 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (b) nucleotide 2 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (c) nucleotide 3 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (d) nucleotide 4 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (e) nucleotide 5 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (f) nucleotide 6 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (g) nucleotide 7 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (h) nucleotide 8 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (i) nucleotide 9 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (j) nucleotide 10 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (k) nucleotide 11 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (l) nucleotide 12 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (m) nucleotide 13 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (n) nucleotide 14 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; (o) nucleotide 15 to nucleotide 36 of SEQ ID NO: 962 or SEQ ID NO: 963; or (p) SEQ ID NO: 964 or a portion thereof.

In some examples, the spacer sequence is substantially complementary or fully complementary to the complement of the sequence of any one of SEQ ID NOS: 11-465.
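
Because the spacer pairs with the complement of the target sequence (the non-PAM strand), a fully complementary spacer reads the same as the target sequence itself, written in RNA bases. The following Python sketch works through that relationship explicitly. The function names and the example sequence are hypothetical and are not taken from the sequence listing.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def fully_complementary_spacer(target_dna: str) -> str:
    """RNA spacer that base-pairs perfectly with the complement of the target.

    Step 1: the non-PAM strand is the reverse complement of the target.
    Step 2: the spacer is the RNA reverse complement of that strand.
    """
    non_pam_strand = reverse_complement(target_dna)
    return non_pam_strand.translate(DNA_TO_RNA_COMPLEMENT)[::-1]


# Hypothetical target sequence, not taken from the sequence listing.
target = "ATTGCCGTAGGCTAACGT"
spacer = fully_complementary_spacer(target)

print(spacer)
# The two-step derivation collapses to a simple T -> U substitution.
print(spacer == target.replace("T", "U"))
```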

In any of the compositions according to embodiment 1, the PAM may include the sequence 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTG-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3'.
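
The sixteen PAM sequences listed above are the full expansion of the degenerate motif 5'-NTTN-3', where N is any of the four DNA bases. A short Python sketch of that expansion is given below; the function name is hypothetical, and the base order "ATGC" is chosen only so that the output follows the order of the list above.

```python
from itertools import product
from typing import List


def expand_pam(pattern: str = "NTTN", bases: str = "ATGC") -> List[str]:
    """Expand a degenerate motif in which N stands for any of the given bases."""
    choices = [bases if symbol == "N" else symbol for symbol in pattern]
    return ["".join(combo) for combo in product(*choices)]


pams = expand_pam()
print(len(pams))
print(", ".join(pams))
```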

In some examples, the target sequence is immediately adjacent to the PAM sequence.
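
As an illustration of a target sequence lying immediately adjacent to a PAM, the following Python sketch scans one DNA strand for every 5'-NTTN-3' motif and reports the nucleotides that directly follow it. Two points are assumptions rather than statements from this section: that the PAM sits immediately 5' of the target sequence on the scanned strand, and the default target length of 20 nucleotides, which is only one of the lengths from 16 to 30 contemplated for the spacer. The function name and the input sequence are hypothetical.

```python
import re
from typing import List, Tuple


def find_candidate_sites(dna: str, target_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (0-based PAM start, PAM, adjacent target) for each NTTN motif.

    Assumption: the target sequence starts immediately 3' of the PAM on the
    scanned strand. Only the supplied strand is scanned; pass the reverse
    complement separately to cover the opposite strand.
    """
    # A lookahead is used so that overlapping PAM positions are all reported.
    pattern = re.compile(rf"(?=([ACGT]TT[ACGT])([ACGT]{{{target_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


# Hypothetical input: two flanking bases, one PAM, a 20-nt target, two more bases.
example = "GG" + "CTTG" + "ACGTACGTACGTACGTACGT" + "CC"

for start, pam, target in find_candidate_sites(example):
    print(start, pam, target)
```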

In some examples, the RNA guide has a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs 967-1023.

In some examples, the RNA guide has the sequence of any one of SEQ ID NOs 967-1023.

Embodiment 2: The composition of embodiment 1 may further comprise a Cas12i polypeptide or a polynucleotide encoding a Cas12i polypeptide. The Cas12i polypeptide may be one of the following: (a) a Cas12i2 polypeptide comprising a sequence that is at least 90% identical to the sequence of SEQ ID NO: 922, SEQ ID NO: 923, SEQ ID NO: 924, SEQ ID NO: 925, SEQ ID NO: 926, or SEQ ID NO: 927; (b) a Cas12i4 polypeptide comprising a sequence that is at least 90% identical to the sequence of SEQ ID NO: 956, SEQ ID NO: 957, or SEQ ID NO: 958; (c) a Cas12i1 polypeptide comprising a sequence that is at least 90% identical to the sequence of SEQ ID NO: 965; or (d) a Cas12i3 polypeptide comprising a sequence that is at least 90% identical to the sequence of SEQ ID NO: 966.

In a specific example, the Cas12i polypeptide is: (a) A Cas12i2 polypeptide, said Cas12i2 polypeptide comprising the sequence of SEQ ID No. 922, SEQ ID No. 923, SEQ ID No. 924, SEQ ID No. 925, SEQ ID No. 926, or SEQ ID No. 927; (b) A Cas12i4 polypeptide, said Cas12i4 polypeptide comprising the sequence of SEQ ID No. 956, SEQ ID No. 957 or SEQ ID No. 958; (c) A Cas12i1 polypeptide, said Cas12i1 polypeptide comprising the sequence of SEQ ID No. 965; or (d) a Cas12i3 polypeptide, the Cas12i3 polypeptide comprising the sequence of SEQ ID No. 966.

In any of the compositions according to embodiment 2, the RNA guide and the Cas12i polypeptide can form a ribonucleoprotein complex. In some examples, the ribonucleoprotein complex binds to a target nucleic acid. In some examples, the composition is present within a cell.

In any of the compositions according to embodiment 2, the RNA guide and the Cas12i polypeptide may be encoded in a vector, e.g., an expression vector. In some examples, the RNA guide and the Cas12i polypeptide are encoded in a single vector. In other examples, the RNA guide is encoded in a first vector and the Cas12i polypeptide is encoded in a second vector.

Embodiment 3: A vector system comprising one or more vectors encoding an RNA guide disclosed herein and a Cas12i polypeptide. In some examples, the vector system comprises a first vector encoding an RNA guide disclosed herein and a second vector encoding a Cas12i polypeptide. The vector may be an expression vector.

Embodiment 4: A composition comprising an RNA guide and a Cas12i polypeptide, wherein the RNA guide comprises (i) a spacer sequence that is substantially complementary or fully complementary to a region on a non-PAM strand (the complement of a target sequence) within a HAO1 gene, and (ii) a direct repeat sequence.

In some examples, the target sequence is within exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, or exon 7 of the HAO1 gene, which may include the sequence of SEQ ID NO:928, the reverse complement of SEQ ID NO:928, the variant of the sequence of SEQ ID NO:928, or the reverse complement of the variant of SEQ ID NO: 928.

In some examples, the spacer sequence comprises: (a) nucleotide 1 to nucleotide 16 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (b) nucleotide 1 to nucleotide 17 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (c) nucleotide 1 to nucleotide 18 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (d) nucleotide 1 to nucleotide 19 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (e) nucleotide 1 to nucleotide 20 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (f) nucleotide 1 to nucleotide 21 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (g) nucleotide 1 to nucleotide 22 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (h) nucleotide 1 to nucleotide 23 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (i) nucleotide 1 to nucleotide 24 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (j) nucleotide 1 to nucleotide 25 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (k) nucleotide 1 to nucleotide 26 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (l) nucleotide 1 to nucleotide 27 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (m) nucleotide 1 to nucleotide 28 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; (n) nucleotide 1 to nucleotide 29 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920; or (o) nucleotide 1 to nucleotide 30 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 466-920.

In some examples, the spacer sequence comprises: (a) Nucleotide 1 to nucleotide 16 of any one of SEQ ID NOs 466 to 920; (b) Nucleotide 1 to nucleotide 17 of any one of SEQ ID NOs 466 to 920; (c) Nucleotide 1 to nucleotide 18 of any one of SEQ ID NOs 466 to 920; (d) Nucleotide 1 to nucleotide 19 of any one of SEQ ID NOs 466 to 920; (e) Nucleotide 1 to nucleotide 20 of any one of SEQ ID NOs 466 to 920; (f) Nucleotide 1 to nucleotide 21 of any one of SEQ ID NOs 466 to 920; (g) Nucleotide 1 to nucleotide 22 of any one of SEQ ID NOs 466 to 920; (h) Nucleotide 1 to nucleotide 23 of any one of SEQ ID NOs 466 to 920; (i) Nucleotide 1 to nucleotide 24 of any one of SEQ ID NOs 466 to 920; (j) Nucleotide 1 to nucleotide 25 of any one of SEQ ID NOs 466 to 920; (k) Nucleotide 1 to nucleotide 26 of any one of SEQ ID NOs 466 to 920; (l) Nucleotide 1 to nucleotide 27 of any one of SEQ ID NOs 466 to 920; (m) nucleotide 1 to nucleotide 28 of any one of SEQ ID NOs 466 to 920; (n) nucleotide 1 to nucleotide 29 of any one of SEQ ID NOs 466 to 920; or (o) nucleotide 1 to nucleotide 30 of any one of SEQ ID NOs 466 to 920.

In some examples, the direct repeat sequence comprises: (a) nucleotide 1 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (b) nucleotide 2 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (c) nucleotide 3 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (d) nucleotide 4 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (e) nucleotide 5 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (f) nucleotide 6 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (g) nucleotide 7 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (h) nucleotide 8 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (i) nucleotide 9 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (j) nucleotide 10 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (k) nucleotide 11 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (l) nucleotide 12 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (m) nucleotide 13 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (n) nucleotide 14 to nucleotide 36 of a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs: 1-8; (o) nucleotide 1 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (p) nucleotide 2 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (q) nucleotide 3 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (r) nucleotide 4 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (s) nucleotide 5 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (t) nucleotide 6 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (u) nucleotide 7 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (v) nucleotide 8 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (w) nucleotide 9 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (x) nucleotide 10 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (y) nucleotide 11 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; (z) nucleotide 12 to nucleotide 34 of a sequence at least 90% identical to the sequence of SEQ ID NO: 9; or (aa) a sequence that is at least 90% identical to the sequence of SEQ ID NO: 10, or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) Nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (b) Nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (c) Nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (d) Nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (e) Nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (f) Nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (g) Nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (h) Nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (i) Nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (j) Nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (k) Nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (l) Nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (m) nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (n) nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 1 to 8; (o) nucleotide 1 to nucleotide 34 of SEQ ID NO. 9; (p) nucleotide 2 to nucleotide 34 of SEQ ID NO 9; (q) nucleotide 3 to nucleotide 34 of SEQ ID NO 9; (r) nucleotide 4 to nucleotide 34 of SEQ ID NO 9; (s) nucleotide 5 to nucleotide 34 of SEQ ID NO 9; (t) nucleotide 6 to nucleotide 34 of SEQ ID NO 9; (u) nucleotide 7 to nucleotide 34 of SEQ ID NO 9; (v) nucleotide 8 to nucleotide 34 of SEQ ID NO 9; (w) nucleotide 9 to nucleotide 34 of SEQ ID NO 9; (x) nucleotide 10 to nucleotide 34 of SEQ ID NO 9; (y) nucleotide 11 to nucleotide 34 of SEQ ID NO 9; (z) nucleotide 12 to nucleotide 34 of SEQ ID NO 9; or (aa) SEQ ID NO. 10 or a portion thereof.

In some examples, the direct repeat sequence comprises: (a) Nucleotide 1 to nucleotide 36 of any one of SEQ ID NOs 936-953; (b) Nucleotide 2 to nucleotide 36 of any one of SEQ ID NOs 936-953; (c) Nucleotide 3 to nucleotide 36 of any one of SEQ ID NOs 936-953; (d) Nucleotide 4 to nucleotide 36 of any one of SEQ ID NOs 936-953; (e) Nucleotide 5 to nucleotide 36 of any one of SEQ ID NOs 936-953; (f) Nucleotide 6 to nucleotide 36 of any one of SEQ ID NOs 936-953; (g) Nucleotide 7 to nucleotide 36 of any one of SEQ ID NOs 936-953; (h) Nucleotide 8 to nucleotide 36 of any one of SEQ ID NOs 936-953; (i) Nucleotide 9 to nucleotide 36 of any one of SEQ ID NOs 936-953; (j) Nucleotide 10 to nucleotide 36 of any one of SEQ ID NOs 936-953; (k) Nucleotide 11 to nucleotide 36 of any one of SEQ ID NOs 936-953; (l) Nucleotide 12 to nucleotide 36 of any one of SEQ ID NOs 936-953; (m) nucleotide 13 to nucleotide 36 of any one of SEQ ID NOs 936-953; (n) nucleotide 14 to nucleotide 36 of any one of SEQ ID NOs 936-953; or (o) SEQ ID NO. 954 or a portion thereof.

In some embodiments, the direct repeat sequence comprises: (a) Nucleotide 1 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (b) Nucleotide 2 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (c) Nucleotide 3 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (d) Nucleotide 4 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (e) Nucleotide 5 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (f) Nucleotide 6 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (g) Nucleotide 7 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (h) Nucleotide 8 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (i) Nucleotide 9 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (j) Nucleotide 10 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (k) Nucleotide 11 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (l) Nucleotide 12 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (m) nucleotide 13 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; (n) nucleotide 14 to nucleotide 36 of a sequence at least 90% identical to SEQ ID NO 959; or (o) a sequence which is at least 90% identical to the sequence of SEQ ID No. 960 or SEQ ID No. 961, or a portion thereof.

In any of the compositions according to example 4, the spacer sequence may be substantially complementary or fully complementary to the complement of the sequence of any one of SEQ ID NOs 11-465.

In some examples, the target sequence is adjacent to a Protospacer Adjacent Motif (PAM) comprising the sequence 5'-NTTN-3'. In some examples, the PAM includes the sequences 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3'.

In some examples, the target sequence is immediately adjacent to the PAM sequence. In some examples, the target sequence is within 1, 2, 3, 4, or 5 nucleotides of the PAM sequence.

In any of the compositions according to example 4, the Cas12i polypeptide is: (a) A Cas12i2 polypeptide, said Cas12i2 polypeptide comprising a sequence that is at least 90% identical to the sequence of SEQ ID No. 922, SEQ ID No. 923, SEQ ID No. 924, SEQ ID No. 925, SEQ ID No. 926, or SEQ ID No. 927; (b) A Cas12i4 polypeptide, said Cas12i4 polypeptide comprising a sequence at least 90% identical to the sequence of SEQ ID No. 956, SEQ ID No. 957 or SEQ ID No. 958; (c) A Cas12i1 polypeptide, said Cas12i1 polypeptide comprising a sequence at least 90% identical to the sequence of SEQ ID No. 965; or (d) a Cas12i3 polypeptide, the Cas12i3 polypeptide comprising a sequence at least 90% identical to the sequence of SEQ ID No. 966.

In some examples, the Cas12i polypeptide is: (a) A Cas12i2 polypeptide, said Cas12i2 polypeptide comprising the sequence of SEQ ID No. 922, SEQ ID No. 923, SEQ ID No. 924, SEQ ID No. 925, SEQ ID No. 926, or SEQ ID No. 927; (b) A Cas12i4 polypeptide, said Cas12i4 polypeptide comprising the sequence of SEQ ID No. 956, SEQ ID No. 957 or SEQ ID No. 958; (c) A Cas12i1 polypeptide, said Cas12i1 polypeptide comprising the sequence of SEQ ID No. 965; or (d) a Cas12i3 polypeptide, the Cas12i3 polypeptide comprising the sequence of SEQ ID No. 966.

In any of the compositions according to example 4, the RNA guide and the Cas12i polypeptide can form a ribonucleoprotein complex. In some examples, the ribonucleoprotein complex binds to a target nucleic acid.

In any of the compositions according to example 4, the composition may be present intracellularly.

In any of the compositions according to example 4, the RNA guide and the Cas12i polypeptide may be encoded in a vector, e.g., an expression vector. In some examples, the RNA guide and the Cas12i polypeptide are encoded in a single vector. In other examples, the RNA guide is encoded in a first vector and the Cas12i polypeptide is encoded in a second vector.

Example 5: a vector system comprising one or more vectors encoding an RNA guide disclosed herein and a Cas12i polypeptide. In some examples, the vector system comprises a first vector encoding an RNA guide disclosed herein and a second vector encoding a Cas12i polypeptide. In some embodiments, the vector is an expression vector.

Example 6: an RNA guide comprising (i) a spacer sequence that is substantially complementary or fully complementary to a region on a non-PAM strand (the complement of the target sequence) within the HAO1 gene, and (ii) a direct repeat sequence.

In any of the RNA guides according to example 6, the spacer sequence may be substantially complementary or fully complementary to the complement of the sequence of any one of SEQ ID NOs 11-465.
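Because the spacer is complementary to the complement of the target sequence, a fully complementary spacer reads, 5' to 3', as the RNA form of the target sequence itself. The sketch below illustrates that relationship for the fully complementary case only. The function names and the short test sequence are hypothetical, and "substantially complementary" spacers with mismatches are not modeled.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# RNA base -> the DNA base it pairs with (Watson-Crick only).
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def spacer_for_target(target_dna: str) -> str:
    """RNA spacer fully complementary to the complement of the target sequence."""
    return target_dna.upper().replace("T", "U")


def pairs_with_non_pam_strand(spacer_rna: str, target_dna: str) -> bool:
    """Check antiparallel Watson-Crick pairing of the spacer with the non-PAM strand."""
    non_pam = reverse_complement(target_dna)  # 5' to 3'
    if len(spacer_rna) != len(non_pam):
        return False
    # Spacer position i (5' to 3') pairs with non-PAM position i counted from its 3' end.
    return all(
        RNA_TO_DNA_PARTNER.get(s) == d
        for s, d in zip(spacer_rna.upper(), reversed(non_pam))
    )


target = "ATTGCAGGCT"  # hypothetical placeholder, not from the sequence listing
spacer = spacer_for_target(target)
print(spacer, pairs_with_non_pam_strand(spacer, target))
```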

In any of the RNA guides according to example 6, the target sequence may be adjacent to a Protospacer Adjacent Motif (PAM) comprising the sequence 5'-NTTN-3', where N is any nucleotide. In some examples, the PAM includes the sequences 5'-ATTA-3', 5'-ATTT-3', 5'-ATTG-3', 5'-ATTC-3', 5'-TTTA-3', 5'-TTTT-3', 5'-TTTG-3', 5'-TTTC-3', 5'-GTTA-3', 5'-GTTT-3', 5'-GTTC-3', 5'-CTTA-3', 5'-CTTT-3', 5'-CTTG-3', or 5'-CTTC-3'.

In some examples, the target sequence is immediately adjacent to the PAM sequence. In other examples, the target sequence is within 1, 2, 3, 4, or 5 nucleotides of the PAM sequence.

In some examples, the RNA guide has a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs 967-1023. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 967-1023.

Example 7: a nucleic acid encoding an RNA guide as described herein.

Example 8: a vector comprising an RNA guide as described herein.

Example 9: a cell comprising a composition, RNA guide, nucleic acid or vector as described herein. In some examples, the cell is a eukaryotic cell, an animal cell, a mammalian cell, a human cell, a primary cell, a cell line, a stem cell, or a liver cell.

Example 10: a kit comprising a composition, RNA guide, nucleic acid or vector as described herein.

Example 11: a method of editing a HAO1 sequence, the method comprising contacting the HAO1 sequence with a composition or RNA guide as described herein. In some examples, the method is performed in vitro. In other examples, the method is practiced ex vivo.

In some examples, the HAO1 sequence is in a cell.

In some examples, the composition or the RNA guide induces a deletion in the HAO1 sequence. In some examples, the deletion is adjacent to a 5'-NTTN-3' sequence, where N is any nucleotide. In some embodiments, the deletion is downstream of the 5'-NTTN-3' sequence. In some embodiments, the deletion is up to about 40 nucleotides in length. In some cases, the deletion is about 4 nucleotides to 40 nucleotides in length, about 4 nucleotides to 25 nucleotides in length, about 10 nucleotides to 25 nucleotides in length, or about 10 nucleotides to 15 nucleotides in length.

In some examples, the deletion begins within about 5 nucleotides to about 15 nucleotides, about 5 nucleotides to about 10 nucleotides, or about 10 nucleotides to about 15 nucleotides of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 15 nucleotides, about 5 nucleotides to about 10 nucleotides, or about 10 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion terminates within about 20 nucleotides to about 30 nucleotides, about 20 nucleotides to about 25 nucleotides, or about 25 nucleotides to about 30 nucleotides of the 5'-NTTN-3' sequence.

In some examples, the deletion terminates within about 20 nucleotides to about 30 nucleotides, about 20 nucleotides to about 25 nucleotides, or about 25 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 20 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 20 nucleotides to about 25 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 25 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 10 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 20 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 10 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 20 nucleotides to about 25 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 5 nucleotides to about 10 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 25 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 10 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 20 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 10 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 20 nucleotides to about 25 nucleotides downstream of the 5'-NTTN-3' sequence.

In some examples, the deletion begins within about 10 nucleotides to about 15 nucleotides downstream of the 5'-NTTN-3' sequence and terminates within about 25 nucleotides to about 30 nucleotides downstream of the 5'-NTTN-3' sequence.
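The start and end windows recited above combine into nine start/end pairings. As an illustrative sketch only, a deletion described by its start and end offsets downstream of the 5'-NTTN-3' sequence can be checked against those pairings as follows. The class and function names are hypothetical, and the word "about" is simplified here to exact inclusive bounds.

```python
from dataclasses import dataclass

# Windows are (min, max) offsets in nucleotides downstream of the PAM, inclusive.
START_WINDOWS = [(5, 15), (5, 10), (10, 15)]
END_WINDOWS = [(20, 30), (20, 25), (25, 30)]


@dataclass(frozen=True)
class Deletion:
    start_offset: int  # where the deletion begins, in nt downstream of the PAM
    end_offset: int    # where the deletion terminates, in nt downstream of the PAM


def matching_windows(deletion: Deletion) -> list:
    """Return every (start_window, end_window) pairing that contains the deletion."""
    return [
        (start, end)
        for start in START_WINDOWS
        for end in END_WINDOWS
        if start[0] <= deletion.start_offset <= start[1]
        and end[0] <= deletion.end_offset <= end[1]
    ]


example = Deletion(start_offset=8, end_offset=22)
for start_window, end_window in matching_windows(example):
    print(start_window, end_window)
```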

In some examples, the 5'-NTTN-3' sequence is 5'-CTTT-3', 5'-CTTC-3', 5'-GTTT-3', 5'-GTTC-3', 5'-TTTC-3', 5'-GTTA-3', or 5'-GTTG-3'.

In some examples, the deletion overlaps with a mutation in the HAO1 sequence. In some cases, the deletion overlaps with an insertion in the HAO1 sequence. In some cases, the deletion lacks repeat sequence extension other than the HAO1 sequence or a portion thereof. In some cases, the deletion disrupts one or both alleles of the HAO1 sequence.

In any of the compositions, RNA guides, nucleic acids, vectors, cells, kits, or methods according to any of embodiments 1-10 described herein, the RNA guide can comprise the sequence of any of SEQ ID NOs 967-1023.

Example 12: a method of treating Primary Hyperoxaluria (PH) in a subject, the PH optionally being PH1, PH2 or PH3, the method comprising administering to the subject any composition, RNA guide, or cell as described herein.

In any of the compositions, RNA guides, cells, kits, or methods described herein, the RNA guide and/or a polyribonucleotide encoding the Cas12i polypeptide can be included within a lipid nanoparticle. In some examples, the RNA guide and the polyribonucleotide encoding the Cas12i polypeptide are included within the same lipid nanoparticle. In other examples, the RNA guide and the polyribonucleotide encoding the Cas12i polypeptide are included within separate lipid nanoparticles.

Example 13: an RNA guide comprising (i) a spacer sequence that is complementary to a target site within the HAO1 gene (the target site being on the non-PAM strand and complementary to the target sequence), and (ii) a direct repeat sequence, wherein the target sequence is any one of SEQ ID NOs 1047, 1026 or 1025 or the reverse complement thereof.

In some examples, the RNA guide has a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs 989, 968 or 967. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 989, 968 or 967.

In some examples, each of the first three nucleotides of the RNA guide includes a 2'-O-methyl phosphorothioate modification.

In some examples, each of the last four nucleotides of the RNA guide includes a 2'-O-methyl phosphorothioate modification.

In some examples, each of the first to last, second to last, and third to last nucleotides of the RNA guide comprises a 2'-O-methyl phosphorothioate modification, and the last nucleotide of the RNA guide is not modified.

In some examples, the RNA guide has a sequence that is at least 90% identical to the sequence of any one of SEQ ID NOs 1082-1087. In some embodiments, the RNA guide has the sequence of any one of SEQ ID NOs 1082-1087.

In some embodiments, the HAO1-targeting RNA guide comprises a sequence at least 90% identical to any one of SEQ ID NOs 1082-1087. In some embodiments, the HAO1-targeting RNA guide comprises any one of SEQ ID NOs 1082-1087. In some embodiments, a HAO1-targeting RNA guide comprising a sequence at least 90% identical to SEQ ID NO: 1083 or SEQ ID NO: 1084 binds by base pairing to the complementary region of the HAO1 target sequence of SEQ ID NO: 1047. In some embodiments, the HAO1-targeting RNA guide of SEQ ID NO: 1083 or SEQ ID NO: 1084 binds by base pairing to the complementary region of the HAO1 target sequence of SEQ ID NO: 1047. In some embodiments, a HAO1-targeting RNA guide comprising a sequence at least 90% identical to SEQ ID NO: 1085 or SEQ ID NO: 1086 binds by base pairing to the complementary region of the HAO1 target sequence of SEQ ID NO: 1026. In some embodiments, the HAO1-targeting RNA guide of SEQ ID NO: 1085 or SEQ ID NO: 1086 binds by base pairing to the complementary region of the HAO1 target sequence of SEQ ID NO: 1026. In some embodiments, a HAO1-targeting RNA guide comprising a sequence at least 90% identical to SEQ ID NO: 1087 or SEQ ID NO: 2293 binds by base pairing to the complementary region of the HAO1 target sequence of SEQ ID NO: 1025. In some embodiments, the HAO1-targeting RNA guide of SEQ ID NO: 1087 or SEQ ID NO: 2293 binds by base pairing to the complementary region of the HAO1 target sequence of SEQ ID NO: 1025.

Example 14: a nucleic acid encoding an RNA guide according to example 13 as described herein.

Example 15: a vector comprising a nucleic acid according to example 14 as described herein.

Example 16: a vector system comprising one or more vectors encoding (i) the RNA guide of example 13 as described herein, and (ii) a Cas12i polypeptide. In some examples, the vector system comprises a first vector encoding the RNA guide and a second vector encoding the Cas12i polypeptide.

Example 17: a cell comprising an RNA guide, nucleic acid, vector or vector system according to examples 13 to 16 as described herein. In some examples, the cell is a eukaryotic cell, an animal cell, a mammalian cell, a human cell, a primary cell, a cell line, a stem cell, or a T cell.

Example 18: a kit comprising an RNA guide, nucleic acid, vector or vector system as described herein according to examples 13 to 16.

Example 19: a method of editing a HAO1 sequence, the method comprising contacting the HAO1 sequence with an RNA guide according to example 13 as described herein. In some examples, the HAO1 sequence is in a cell.

In some examples, the RNA guide induces indels (e.g., insertions or deletions) in the HAO1 sequence.

Example 20: a method of treating Primary Hyperoxaluria (PH) in a subject, the PH optionally being PH1, PH2 or PH3, the method comprising administering to the subject an RNA guide according to example 13 as described herein.

General technique

Practice of the present disclosure will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, such as: Molecular Cloning: A Laboratory Manual, second edition (Sambrook et al., 1989), Cold Spring Harbor Press; Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Methods in Molecular Biology, Humana Press; Cell Biology: A Laboratory Notebook (J.E. Cellis, ed., 1989), Academic Press; Animal Cell Culture (R.I. Freshney, ed., 1987); Introduction to Cell and Tissue Culture (J.P. Mather and P.E. Roberts, 1998), Plenum Press; Cell and Tissue Culture: Laboratory Procedures (A. Doyle, J.B. Griffiths, and D.G. Newell, eds., 1993-8), J. Wiley and Sons; Methods in Enzymology (Academic Press, Inc.); Handbook of Experimental Immunology (D.M. Weir and C.C. Blackwell, eds.); Gene Transfer Vectors for Mammalian Cells (J.M. Miller and M.P. Calos, eds., 1987); Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987); PCR: The Polymerase Chain Reaction (Mullis et al., eds., 1994); Current Protocols in Immunology (J.E. Coligan et al., eds., 1991); Short Protocols in Molecular Biology (John Wiley and Sons, 1999); Immunobiology (C.A. Janeway and P. Travers, 1997); Antibodies (P. Finch, 1997); Antibodies: A Practical Approach (D. Catty, ed., IRL Press, 1988-1989); Monoclonal Antibodies: A Practical Approach (P. Shepherd and C. Dean, eds., Oxford University Press, 2000); Using Antibodies: A Laboratory Manual (E. Harlow and D. Lane, Cold Spring Harbor Laboratory Press, 1999); The Antibodies (M. Zanetti and J.D. Capra, eds., Harwood Academic Publishers, 1995); DNA Cloning: A Practical Approach, Volumes I and II (D.N. Glover, ed., 1985); Nucleic Acid Hybridization (B.D. Hames and S.J. Higgins, eds., 1985); Transcription and Translation (B.D. Hames and S.J. Higgins, eds., 1984); Animal Cell Culture (R.I. Freshney, ed., 1986); Immobilized Cells and Enzymes (IRL Press, 1986); Perbal, A Practical Guide to Molecular Cloning (1984); and Ausubel et al. (eds.).

Without further elaboration, it is believed that one skilled in the art can, based on the preceding description, utilize the present invention to its fullest extent. Accordingly, the following specific examples should be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever. All publications cited herein are incorporated by reference for the purpose or subject matter for which they are referred to herein.

Examples

The following examples are provided to further illustrate some embodiments of the invention but are not intended to limit its scope. It will be understood that, given their exemplary nature, other procedures, methods, or techniques known to those skilled in the art may alternatively be used.

Example 1-Cas12i2 mediated editing of HAO1 target sites in HEK293T cells

This example describes genome editing of HAO1 genes using Cas12i2 introduced into HEK293T cells.

Cas12i2 RNA guides (crRNAs) were designed and ordered from Integrated DNA Technologies (IDT). For initial guide screening in HEK293T cells, target sequences were designed by tiling the coding exons of HAO1 for 5'-NTTN-3' PAM sequences, and spacer sequences were then designed for the 20-bp target sequence downstream of each PAM sequence. The HAO1-targeting RNA guide sequences are shown in Table 7. In the figures, "E#T#" may also be written as "exon # target #".

TABLE 7 crRNA sequence of HAO1


* The 3' three nucleotides represent the 5'-TTN-3' motif.
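The guide design approach of this example (tiling an exon for 5'-NTTN-3' PAM sequences and taking the 20 nucleotides immediately downstream as the target sequence) can be sketched as follows. This is a simplified illustration rather than the pipeline actually used: the function name and test sequence are hypothetical, only the strand supplied is scanned, and a full design would also scan the reverse complement.

```python
import re

# A lookahead lets overlapping PAMs (for example, within a run of T's) all be reported.
PAM_AND_TARGET = re.compile(r"(?=([ACGT]TT[ACGT])([ACGT]{20}))")


def find_targets(exon_seq: str) -> list:
    """Return (pam_start, pam, target) for each NTTN PAM followed by 20 nt."""
    seq = exon_seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in PAM_AND_TARGET.finditer(seq)]


# Hypothetical sequence, not taken from HAO1.
example_exon = "GGCTTCACGTACGTACGTACGTACGTGG"
for position, pam, target in find_targets(example_exon):
    print(position, pam, target)
```

The corresponding spacer is then the RNA form of each reported target sequence.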

Cas12i2 RNP complexation reactions were performed by mixing purified Cas12i2 polypeptide of SEQ ID NO: 924 (400 μM) with HAO1-targeting crRNA (1 mM in 250 mM NaCl) at a 1:1 volume ratio (Cas12i2:crRNA), corresponding to a 2.5:1 crRNA:Cas12i2 molar ratio. Complexes were incubated on ice for 30-60 minutes.
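The stated 2.5:1 molar ratio follows directly from the stock concentrations and the 1:1 volume ratio. A short worked check is given below; the function names are illustrative only, and the post-mixing concentration assumes that volumes are additive.

```python
def molar_ratio(conc_a_uM: float, vol_a_uL: float, conc_b_uM: float, vol_b_uL: float) -> float:
    """Moles of component A per mole of component B after mixing."""
    return (conc_a_uM * vol_a_uL) / (conc_b_uM * vol_b_uL)


def mixed_concentration_uM(conc_uM: float, vol_uL: float, total_vol_uL: float) -> float:
    """Concentration of one component after dilution into the combined volume."""
    return conc_uM * vol_uL / total_vol_uL


# Stocks from the text: crRNA at 1 mM (1000 uM), Cas12i2 at 400 uM, equal volumes.
crrna_to_cas = molar_ratio(1000.0, 1.0, 400.0, 1.0)
cas_in_complex = mixed_concentration_uM(400.0, 1.0, 2.0)

print(f"crRNA:Cas12i2 molar ratio = {crrna_to_cas}:1")
print(f"Cas12i2 in the complexing mix = {cas_in_complex} uM")
```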

HEK293T cells were harvested using TRYPLE™ (recombinant cell-dissociation enzymes; Thermo Fisher) and counted. Cells were washed once with PBS and resuspended in SF buffer + supplement (SF Cell Line 4D-NUCLEOFECTOR™ X Kit S; Lonza #V4XC-2032) at a concentration of 16,480 cells/μL. Resuspended cells were dispensed at 3×10⁵ cells/reaction into Lonza 16-well Nucleocuvette strips. Complexed Cas12i2 RNP was added to each reaction at a final concentration of 10 μM (Cas12i2), and transfection enhancer oligos were then added at a final concentration of 4 μM. The final volume of each electroporation reaction was 20 μL. A non-targeting guide was used as a negative control.

The strips were electroporated using an electroporation device (program CM-130, Lonza 4D-NUCLEOFECTOR™). Immediately after electroporation, 80 μL of pre-warmed DMEM + 10% FBS was added to each well and mixed gently by pipetting. For each technical replicate plate, 10 μL (30,000 cells) of diluted nucleofected cells were plated into a pre-warmed 96-well plate with wells containing 100 μL DMEM + 10% FBS. The editing plates were incubated for 3 days at 37°C with 5% CO₂.
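The cell numbers quoted above are mutually consistent, as the following arithmetic sketch shows. The function names are illustrative, and the calculation assumes no cell loss during electroporation or pipetting.

```python
def resuspension_volume_uL(cells_per_reaction: float, cells_per_uL: float) -> float:
    """Volume of cell suspension that carries one reaction's worth of cells."""
    return cells_per_reaction / cells_per_uL


def cells_plated(cells_per_reaction: float, reaction_uL: float,
                 medium_uL: float, plated_uL: float) -> float:
    """Cells transferred when an aliquot of the diluted reaction is plated."""
    density = cells_per_reaction / (reaction_uL + medium_uL)  # cells per uL
    return density * plated_uL


# Values from the text: 3x10^5 cells at 16,480 cells/uL in a 20 uL reaction,
# diluted with 80 uL of medium, with 10 uL plated per technical replicate.
hek_volume = resuspension_volume_uL(3e5, 16_480)
hek_plated = cells_plated(3e5, 20, 80, 10)

print(f"cell suspension per reaction: {hek_volume:.1f} uL")
print(f"cells plated per well: {hek_plated:.0f}")
```

About 18.2 μL of cell suspension carries 3×10⁵ cells, which leaves roughly 1.8 μL of the 20 μL reaction for the RNP and enhancer.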

After 3 days, wells were harvested using TRYPLE™ (recombinant cell-dissociation enzymes; Thermo Fisher) and transferred to a 96-well TWIN.TEC PCR plate (Eppendorf). The medium was flicked off, and the cells were resuspended in 20 μL QUICKEXTRACT™ (DNA extraction buffer; Lucigen). Samples were then cycled in a PCR machine at 65°C for 15 minutes, 68°C for 15 minutes, and 98°C for 10 minutes. Samples were then frozen at -20°C.

Samples for next-generation sequencing (NGS) were prepared by multiple rounds of PCR. The first round (PCR I) was used to amplify the genomic regions flanking the target site and to add NGS adapters. The second round (PCR II) was used to add NGS indexes. Reactions were then pooled, purified by column purification, and quantified on a fluorometer (Qubit). Sequencing was performed with a 150-cycle NEXTSEQ™ v2.5 mid or high output kit (Illumina) on an NGS instrument (NEXTSEQ™ 550; Illumina).

For NGS analysis, the indel mapping function used a sample's fastq file, the amplicon reference sequence, and the forward primer sequence. For each read, a k-mer scanning algorithm was used to calculate the edit operations (match, mismatch, insertion, deletion) between the read and the reference sequence. To remove the small amount of primer dimer present in some samples, the first 30 nt of each read was required to match the reference, and reads in which more than half of the mapped nucleotides were mismatches were also filtered out. Up to 50,000 reads passing those filters were used for analysis, and reads were counted as indel reads if they contained an insertion or deletion. The % indels was calculated as the number of indel-containing reads divided by the number of reads analyzed (reads passing the filters, up to 50,000). The QC standard for the minimum number of reads passing the filters was 10,000.
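The filtering and % indel calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code actually used: the k-mer scanning alignment itself is not implemented, so each read is assumed to arrive with precomputed edit-operation counts, and the `ReadAlignment` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MAX_READS = 50_000      # at most this many filtered reads are analyzed
MIN_READS_QC = 10_000   # minimum filtered reads required to pass QC
PREFIX_LEN = 30         # leading nucleotides that must match the reference


@dataclass(frozen=True)
class ReadAlignment:
    sequence: str
    n_mapped: int       # nucleotides mapped to the reference
    n_mismatch: int     # mapped nucleotides that are mismatches
    n_insertions: int
    n_deletions: int


def passes_filters(read: ReadAlignment, reference: str) -> bool:
    """Apply the primer-dimer prefix filter and the mismatch filter."""
    prefix_ok = read.sequence[:PREFIX_LEN] == reference[:PREFIX_LEN]
    mismatch_ok = read.n_mismatch * 2 <= read.n_mapped  # not more than half
    return prefix_ok and mismatch_ok


def percent_indel(reads: list, reference: str) -> Optional[float]:
    """Return % indels, or None when too few reads pass the filters."""
    kept = [r for r in reads if passes_filters(r, reference)][:MAX_READS]
    if len(kept) < MIN_READS_QC:
        return None
    indel_reads = sum(1 for r in kept if r.n_insertions or r.n_deletions)
    return 100.0 * indel_reads / len(kept)
```

A read carrying both an insertion and a deletion is counted once, since the quantity reported is the fraction of reads containing any indel.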

FIG. 1 shows HAO1 indels in HEK293T cells following RNP delivery. Error bars represent the average of three technical replicates across one biological replicate. Following delivery, indels were detected within and/or adjacent to each of the HAO1 target sites with each of the RNA guides. Delivery of E1T2, E1T3, E1T6, E1T7, E1T13, E1T17, E2T4, E2T5, E2T9, E2T10, E3T6, E3T19, E3T22, and E3T28 resulted in indels in over 70% of NGS reads. Thus, HAO1-targeting RNA guides induced indels in exon 1, exon 2, and exon 3 in HEK293T cells.

Thus, this example demonstrates that HAO1 can be targeted by Cas12i2 RNP in mammalian cells, such as HEK293T cells.

Example 2-Cas12i2 mediated editing of HAO1 target sites in HepG2 cells

This example describes genome editing of HAO1 genes using Cas12i2 introduced into HepG2 cells by RNP.

RNP complexation reactions were performed with various RNA guides of Table 7, as described in Example 1. HepG2 cells were harvested using TRYPLE™ (recombinant cell-dissociation enzymes; Thermo Fisher) and counted. Cells were washed once with PBS and resuspended in SF buffer + supplement (SF Cell Line 4D-NUCLEOFECTOR™ X Kit S; Lonza #V4XC-2032) at a concentration of 13,889 cells/μL. Resuspended cells were dispensed at 2.5×10⁵ cells/reaction into Lonza 16-well Nucleocuvette strips. Complexed Cas12i2 RNP was added to each reaction at a final concentration of 20 μM (Cas12i2), without transfection enhancer oligos. The final volume of each electroporation reaction was 20 μL. A non-targeting guide was used as a negative control.

The strips were electroporated using an electroporation device (program DJ-100, Lonza 4D-NUCLEOFECTOR™). Immediately after electroporation, 80 μL of pre-warmed EMEM + 10% FBS was added to each well and mixed gently by pipetting. For each technical replicate plate, 10 μL (25,000 cells) of diluted nucleofected cells were plated into a pre-warmed 96-well plate with wells containing 100 μL EMEM + 10% FBS. The editing plates were incubated for 3 days at 37°C with 5% CO₂.

After 3 days, wells were harvested using TRYPLE™ (recombinant cell-dissociation enzymes; Thermo Fisher) and transferred to a 96-well TWIN.TEC PCR plate (Eppendorf). The medium was flicked off, and the cells were resuspended in 20 μL QUICKEXTRACT™ (DNA extraction buffer; Lucigen). Samples were then cycled in a PCR machine at 65°C for 15 minutes, 68°C for 15 minutes, and 98°C for 10 minutes. Samples were then frozen at -20°C and analyzed by NGS as described in Example 1.

FIG. 2 shows HAO1 indels in HepG2 cells following RNP delivery. Error bars represent the average of three technical replicates across one biological replicate. Following delivery, indels were detected within and/or adjacent to each of the HAO1 target sites with each of the RNA guides. Thus, HAO1-targeting RNA guides induced indels in exon 1, exon 2, and exon 3 in HepG2 cells.

Thus, this example demonstrates that HAO1 can be targeted by Cas12i2 RNP in mammalian cells, such as HepG2 cells.

Example 3-Cas12i2 mediated editing of HAO1 target sites in primary hepatocytes

This example describes genome editing of HAO1 genes using Cas12i2 introduced into primary hepatocytes by RNP.

RNP complexation reactions were performed with RNA guides of Table 7, as described in Example 1. Primary hepatocytes from human donors were thawed rapidly from liquid nitrogen in a 37°C water bath. Cells were added to pre-warmed hepatocyte recovery medium (Thermo Fisher, CM7000) and centrifuged at 100 g for 10 minutes. The cell pellet was resuspended in an appropriate volume of hepatocyte plating medium (Williams' Medium E, Thermo Fisher A1217601, supplemented with Hepatocyte Plating Supplement Pack (serum-containing), Thermo Fisher CM3000). Cells were counted by trypan blue viability staining using a disposable hemocytometer (Fisher Scientific, 22-600-100). The cells were then washed in PBS and resuspended in P3 buffer + supplement (P3 Primary Cell 4D-NUCLEOFECTOR™ X Kit; Lonza, VXP-3032) at a concentration of about 7,500 cells/μL. Resuspended cells were dispensed at 150,000 cells/reaction into Lonza 16-well Nucleocuvette strips, or at 500,000 cells/reaction into single Lonza Nucleocuvettes for mRNA readout. Complexed Cas12i2 RNP was added to each reaction at a final concentration of 20 μM (Cas12i2), and transfection enhancer oligos were then added at a final concentration of 4 μM. The final volume of each electroporation reaction was 20 μL (16-well Nucleocuvette strip format) or 100 μL (single Nucleocuvette format). A non-targeting guide was used as a negative control.

The strips were electroporated using the DS-150 program, while the single Nucleocuvettes were electroporated using the CA137 program (Lonza 4D-NUCLEOFECTOR™). Immediately after electroporation, pre-warmed hepatocyte plating medium was added to each well and mixed very gently by pipetting. For each technical replicate plate, the entire suspension of diluted nucleofected cells was plated into pre-warmed collagen-coated 96-well or 24-well plates (Thermo Fisher) with wells containing hepatocyte plating medium. The cells were then incubated in a 37 °C incubator. After the cells had been allowed to adhere for 4 hours, the medium was replaced with hepatocyte maintenance medium (Williams' Medium E, Thermo Fisher A1217601, supplemented with the Williams' E Medium cell maintenance cocktail, Thermo Fisher CM4000). The hepatocyte maintenance medium was replaced with fresh medium after 2 days.

At 4-5 days post RNP electroporation, the medium was aspirated and the cells were harvested by shaking (500 rpm) in a 37 °C incubator with 2 mg/mL collagenase IV (Thermo Fisher, 17104019) dissolved in PBS containing Ca/Mg. After the cells had detached from the plate, they were transferred to 96-well TWIN.TEC PCR plates (Eppendorf) and centrifuged. The medium was flicked off, and the cells used for the NGS readout were resuspended in 20 µL QUICKEXTRACT™ (DNA extraction buffer; Lucigen). The samples were then cycled in a PCR machine at 65 °C for 15 minutes, 68 °C for 15 minutes, and 98 °C for 10 minutes, and analyzed by NGS as described in Example 1.

For the mRNA readout, cell pellets were frozen at -80 °C and subsequently resuspended in lysis buffer, and DNA/RNA was extracted using the RNeasy kit (Qiagen) according to the manufacturer's instructions. DNA extracted from the samples was analyzed by NGS. The quantity and purity of the isolated RNA were checked using a NanoDrop, and cDNA was then synthesized using the 5x iScript reverse transcription reaction mix (Bio-Rad Laboratories) according to the manufacturer's instructions. The cDNA template was diluted appropriately to fall within the linear range of the subsequent analysis. A 20 µL droplet digital PCR (ddPCR; Bio-Rad Laboratories) reaction was set up with the diluted cDNA using target-specific primers for HAO1, ATTGTGCACTGTCAGATCTTGGAAACGGCCAAAGGATTTTTCCTCACCAATGTCTTGTCGATGACTTTCACATTCTGGCACCCACTCAGAGCCATGGCCAACCGGAATTCTTCCTTTAGTAT (SEQ ID NO: 1088), and 2x ddPCR Supermix for Probes (No dUTP) (Bio-Rad Laboratories) according to the manufacturer's instructions. Droplets were generated from the reaction using an automated droplet generator (Bio-Rad Laboratories) as recommended by the manufacturer. The plate containing the generated droplets was sealed using a PX1 PCR plate sealer (Bio-Rad Laboratories) and PCR amplified on a C1000 Touch thermal cycler (Bio-Rad Laboratories) under the manufacturer's recommended conditions. The PCR-amplified droplets were read on a QX200 droplet reader (Bio-Rad Laboratories), and the acquired data were analyzed using QX Manager version 1.2 (Bio-Rad Laboratories) to determine the absolute copy number of the target mRNA present in each reaction.
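Absolute quantification by droplet digital PCR rests on Poisson statistics: if a fraction p of the droplets is positive, the mean number of target copies per droplet is -ln(1 - p). The Python sketch below illustrates that general calculation only; it is not the analysis software's implementation, the function name is hypothetical, and the nominal droplet volume of 0.85 nL and the droplet counts in the usage example are assumptions for illustration.

```python
import math


def copies_per_ul(positive: int, total: int, droplet_volume_nl: float = 0.85) -> float:
    """Estimate target concentration (copies/uL of reaction) from droplet counts.

    positive: number of droplets scored positive for the target
    total: number of accepted droplets
    droplet_volume_nl: assumed nominal droplet volume in nanoliters
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    # Poisson correction: mean copies per droplet from the negative fraction.
    copies_per_droplet = -math.log(1 - positive / total)
    # Convert droplet volume from nL to uL before dividing.
    return copies_per_droplet / (droplet_volume_nl * 1e-3)


# Hypothetical counts, for illustration only.
concentration = copies_per_ul(positive=5_000, total=15_000)
copies_in_reaction = concentration * 20  # 20 uL reaction, as in the text
print(f"{concentration:.1f} copies/uL, {copies_in_reaction:.0f} copies per reaction")
```

The correction matters most when many droplets are positive, because a positive droplet may hold more than one template molecule.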

As shown in FIG. 3, each RNA guide tested induced indels within and/or near the HAO1 target sites. Indels were not induced with the non-targeting control. Thus, HAO1-targeting RNA guides induced indels in primary hepatocytes. The indels were then correlated with the mRNA level of each target to determine whether the indels resulted in mRNA knockdown and subsequent protein knockdown. FIG. 4 shows the % mRNA knockdown of HAO1 in edited cells compared to unedited control cells. Although a higher percentage of NGS reads contained indels with HAO1 E2T5 (SEQ ID NO: 989) than with HAO1 E2T4 (SEQ ID NO: 988), HAO1 E2T4 resulted in greater HAO1 mRNA knockdown.
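Percent mRNA knockdown relative to a control is conventionally one minus the ratio of edited to control transcript levels. The Python sketch below shows that calculation under the assumption that both inputs are absolute copy numbers measured the same way; the function name and the example values are hypothetical, and any normalization to a reference gene (not described in this example) is omitted.

```python
def percent_knockdown(edited_copies: float, control_copies: float) -> float:
    """Percent reduction of a transcript in edited cells relative to control cells."""
    if control_copies <= 0:
        raise ValueError("control copy number must be positive")
    return (1 - edited_copies / control_copies) * 100


# Hypothetical copy numbers, for illustration only.
print(percent_knockdown(edited_copies=250, control_copies=1_000))
```

A guide with a lower indel percentage can still give a larger knockdown by this measure, for example if its indels more often disrupt the transcript, which is consistent with the E2T4 versus E2T5 observation above.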

Thus, this example demonstrates that HAO1 can be targeted by Cas12i2 RNP in mammalian cells, such as primary human hepatocytes.

Example 4-editing of HAO1 target sites in HepG2 cells with Cas12i2 variants

This example describes indel assessment at HAO1 targets using Cas12i2 variants introduced into HepG2 cells by transient transfection.

The Cas12i2 variants of SEQ ID NO: 924 and SEQ ID NO: 927 were individually cloned into the pcDNA3.1 backbone (Invitrogen). Nucleic acids encoding RNA guides E1T2 (SEQ ID NO: 967), E1T3 (SEQ ID NO: 968), E2T4 (SEQ ID NO: 988), E2T5 (SEQ ID NO: 989), and E2T10 (SEQ ID NO: 994) were cloned into the pUC19 backbone (New England Biolabs). The plasmids were then maxi-prepped and diluted.
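Each of the RNA guides named above consists of a direct repeat followed by a 20-nucleotide spacer. As a minimal Python sketch (the dictionary and function names are hypothetical), the guide sequences recited in the claims can be reproduced by transcribing each DNA target sequence to RNA and appending it to the direct repeat of SEQ ID NO: 10. The pairing of guide names with target SEQ ID NOs is inferred here from sequence identity with the claimed guides.

```python
DIRECT_REPEAT = "AGAAAUCCGUCUUUCAUUGACGG"  # SEQ ID NO: 10, as recited in the claims

# DNA target sequences as recited in the claims (name pairing inferred).
TARGETS = {
    "E1T2": "CAAAGTCTATATATGACTAT",   # SEQ ID NO: 1025
    "E1T3": "GGAAGTACTGATTTAGCATG",   # SEQ ID NO: 1026
    "E2T4": "TAGATGGAAGCTGTATCCAA",   # SEQ ID NO: 1046
    "E2T5": "CGGAGCATCCTTGGATACAG",   # SEQ ID NO: 1047
    "E2T10": "AGGACAGAGGGTCAGCATGC",  # SEQ ID NO: 1052
}


def spacer_from_target(target_dna: str) -> str:
    """Write the DNA target sequence as RNA (T replaced by U)."""
    return target_dna.upper().replace("T", "U")


def build_guide(target_dna: str, direct_repeat: str = DIRECT_REPEAT) -> str:
    """Assemble an RNA guide as direct repeat followed by spacer (5' to 3')."""
    return direct_repeat + spacer_from_target(target_dna)


guides = {name: build_guide(seq) for name, seq in TARGETS.items()}
for name, guide in guides.items():
    print(name, len(guide), guide)
```

Each assembled guide is 43 nucleotides long (a 23-nucleotide direct repeat plus a 20-nucleotide spacer).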

HepG2 cells were harvested using TRYPLE™ (recombinant cell-dissociation enzymes; Thermo Fisher) and counted. Cells were washed once with PBS and resuspended in SF buffer + supplement (SF Cell Line 4D-NUCLEOFECTOR™ X Kit S; Lonza #V4XC-2032).

Approximately 16 hours prior to transfection, 25,000 HepG2 cells in EMEM/10% FBS were plated into each well of a 96-well plate. On the day of transfection, the cells were 70-90% confluent. For each well to be transfected, a mixture of LIPOFECTAMINE™ 3000 and Opti-MEM was prepared and then incubated at room temperature for 5 minutes (solution 1). After incubation, the LIPOFECTAMINE™:Opti-MEM mixture was added to a separate mixture containing the nuclease plasmid, the RNA guide plasmid, and P3000 reagent (solution 2). For the negative controls, the crRNA was not included in solution 2. Solution 1 and solution 2 were mixed by pipetting up and down and then incubated at room temperature for 15 minutes. After incubation, the solution 1 and solution 2 mixture was added dropwise to each well of the 96-well plate containing the cells.

After 3 days, the wells were harvested using TRYPLE™ (recombinant cell-dissociation enzymes; Thermo Fisher) and the cells were transferred to 96-well TWIN.TEC PCR plates (Eppendorf). The medium was flicked off and the cells were resuspended in 20 µL QUICKEXTRACT™ (DNA extraction buffer; Lucigen). The samples were then cycled in a PCR machine at 65 °C for 15 minutes, 68 °C for 15 minutes, and 98 °C for 10 minutes. The samples were then frozen at -20 °C and analyzed by NGS as described in Example 1.

As shown in FIG. 5A, comparable indel activity was observed with the two Cas12i2 variants for E1T2, E1T3, E2T4, E2T5, and E2T10. FIG. 5B shows the indel size frequency (left) and the indel start position relative to the PAM (right) for E1T3 and the variant Cas12i2 of SEQ ID NO: 924. As shown in the left panel, the deletions ranged in size from 1 nucleotide to about 40 nucleotides. Most deletions were about 6 nucleotides to about 27 nucleotides in length. In the right panel, the target sequence is shown starting at position 0 and ending at position 20. Indels started between about 10 nucleotides and about 35 nucleotides downstream of the PAM sequence. Most indels started near the end of the target sequence, e.g., about 18 nucleotides to about 25 nucleotides downstream of the PAM sequence.
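The two quantities plotted in FIG. 5B, deletion size and deletion start position relative to the PAM, can be read off a gapped pairwise alignment of a sequencing read against the reference. The Python sketch below is illustrative only and is not the analysis pipeline used in the examples; the function name, the toy sequences, and the convention that position 0 is the first target base after the PAM are assumptions.

```python
from typing import Optional, Tuple


def first_deletion(ref_aln: str, read_aln: str, target_start: int) -> Optional[Tuple[int, int]]:
    """Return (start, size) of the first deletion in a gapped alignment, or None.

    ref_aln / read_aln: aligned strings of equal length, '-' marking gaps.
    target_start: index in the ungapped reference of the first base after the PAM.
    start is counted in reference bases downstream of the PAM (0 = first target base).
    """
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_pos = 0          # position in the ungapped reference
    start = None
    size = 0
    for ref_base, read_base in zip(ref_aln, read_aln):
        if ref_base != "-" and read_base == "-":
            if start is None:
                start = ref_pos - target_start
            size += 1
        elif start is not None:
            break        # the first deletion has ended
        if ref_base != "-":
            ref_pos += 1
    return None if start is None else (start, size)


# Toy alignment: a TTG PAM followed by target bases, with a 3-base deletion.
result = first_deletion("TTGACGTACGTAC", "TTGACGT---TAC", target_start=3)
print(result)
```

Applied to every read in a sample, tallies of the returned sizes and start positions would give histograms of the kind described for the left and right panels.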

Thus, this example demonstrates that HAO1 is capable of being targeted by a variety of Cas12i2 polypeptides.

Example 5 editing of HAO1 in primary human hepatocytes using Cas12i2 mRNA constructs

This example describes indel assessment at HAO1 target sites following delivery of Cas12i2 mRNA and chemically modified RNA guides targeting HAO1.

mRNA sequences corresponding to the variant Cas12i2 sequence of SEQ ID NO: 924 and the variant Cas12i2 sequence of SEQ ID NO: 927 were synthesized by Aldevron with 1-pseudo-U modification and using CLEANCAP® Reagent AG (TriLink Biotechnologies). The Cas12i2 mRNA sequences shown in Table 8 further include a C-terminal NLS.

TABLE 8 Cas12i2 mRNA sequences

The Cas12i2 RNA guides were designed and ordered from Integrated DNA Technologies (IDT) with 3' end-modified phosphorothioated 2'-O-methyl bases or with 5' and 3' end-modified phosphorothioated 2'-O-methyl bases, as specified in Table 9. Each variant Cas12i2 mRNA was mixed with crRNA at a volume ratio of 1:1 (Cas12i2:crRNA) (1050:1 crRNA:Cas12i2 molar ratio). The mRNA and crRNA were mixed immediately prior to electroporation. Primary human hepatocytes were cultured and electroporated as described in Example 3.

TABLE 9 chemically modified RNA guide sequences

FIG. 6 shows editing of the HAO1 target site by the variant Cas12i2 mRNAs and the 3' end-modified E2T5 (SEQ ID NO: 1091) or the 5' and 3' end-modified E2T5 (SEQ ID NO: 1092). Indels were introduced into the HAO1 target site after electroporation of the Cas12i2 mRNA of SEQ ID NO: 1089 or SEQ ID NO: 1090 and the RNA guide of SEQ ID NO: 1091 or SEQ ID NO: 1092. After electroporation of the Cas12i2 mRNA of SEQ ID NO: 1090 and the RNA guide of SEQ ID NO: 1091 or SEQ ID NO: 1092, approximately 50% of NGS reads contained indels. A statistically significantly higher % indels was observed with the variant Cas12i2 mRNA of SEQ ID NO: 1090 than with the variant Cas12i2 mRNA of SEQ ID NO: 1089. No statistically significant difference was observed between the 5' and 3' end-modified and the 3' end-only modified versions of RNA guide E2T5.

Thus, this example demonstrates that HAO1 can be targeted in mammalian cells by Cas12i2 mRNA constructs and chemically modified RNA guides.

Example 6-Off-target analysis of Cas12i2 and RNA guides targeting HAO1

This example describes on-target versus off-target assessment of Cas12i2 variants and HAO1-targeting RNA guides.

HEK293T cells were transfected with a plasmid encoding either the variant Cas12i2 of SEQ ID NO: 924 or the variant Cas12i2 of SEQ ID NO: 927 and plasmids encoding E2T5 (SEQ ID NO: 989), E1T2 (SEQ ID NO: 967), E1T3 (SEQ ID NO: 968), and E2T10 (SEQ ID NO: 994) according to the method described in Example 16 of PCT/US21/25257. The tagmentation-based tag integration site sequencing (TTISS) method described in Example 16 of PCT/US21/25257 was then performed.

FIGS. 7A and 7B show plots depicting on-target and off-target TTISS reads. The black wedge and the centered number represent the fraction of on-target TTISS reads in the sample. Each gray wedge represents a unique off-target site identified by TTISS. The size of each gray wedge represents the fraction of TTISS reads mapped to a given off-target site. FIG. 7A shows the TTISS reads for the variant Cas12i2 of SEQ ID NO: 924, and FIG. 7B shows the TTISS reads for the variant Cas12i2 of SEQ ID NO: 927.

As shown in FIG. 7A, the variant Cas12i2 of SEQ ID NO: 924 paired with E2T5 showed a low likelihood of off-target editing, as 100% of TTISS reads mapped to the on-target site. No TTISS reads mapped to potential off-target sites. E1T2 also showed a low likelihood of off-target editing. For E1T2, 98% of TTISS reads mapped to the on-target site, and two potential off-target sites together accounted for the remaining 2% of TTISS reads. For E1T3, 95% of TTISS reads mapped to the on-target site, and two potential off-target sites together accounted for the remaining 5% of TTISS reads. E2T10 demonstrated a higher likelihood of off-target editing by the TTISS method. For E2T10, only 65% of TTISS reads mapped to the on-target site, and four potential off-target sites together accounted for the remaining 35% of TTISS reads. A single potential off-target site accounted for most of the off-target TTISS reads for E2T10.

As shown in FIG. 7B, the variant Cas12i2 of SEQ ID NO: 927 paired with E2T5 showed a low likelihood of off-target editing, as 100% of TTISS reads mapped to the on-target site. No TTISS reads mapped to potential off-target sites. The variant Cas12i2 of SEQ ID NO: 927 paired with E1T2 or E1T3 also showed a low likelihood of off-target editing. For E1T2, 100% of TTISS reads in replicate 1 and 96% of TTISS reads in replicate 2 mapped to the on-target site; two potential off-target sites accounted for the remaining 4% of TTISS reads in replicate 2. For E1T3, 100% of TTISS reads in replicate 1 and 92% of TTISS reads in replicate 2 mapped to the on-target site; two potential off-target sites accounted for the remaining 8% of TTISS reads in replicate 2.
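The percentages reported for FIGS. 7A and 7B are fractions of TTISS reads assigned to each site. As a minimal Python sketch of that bookkeeping, the function below converts per-site read counts into fractions. The function name and the read counts are hypothetical; the counts were chosen only so that the totals reproduce the 65% on-target and 35% off-target split described for E2T10 with the variant of SEQ ID NO: 924.

```python
from typing import Dict


def site_fractions(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of all mapped reads assigned to each site."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads to summarize")
    return {site: count / total for site, count in read_counts.items()}


# Hypothetical read counts, for illustration only.
counts = {
    "on_target": 650,
    "off_target_1": 230,
    "off_target_2": 60,
    "off_target_3": 40,
    "off_target_4": 20,
}
fractions = site_fractions(counts)
off_target_total = sum(v for k, v in fractions.items() if k != "on_target")
print(f"on-target: {fractions['on_target']:.0%}, off-target: {off_target_total:.0%}")
```

Comparing fractions rather than raw counts keeps guides and replicates with different sequencing depths on the same scale.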

Thus, this example shows that compositions comprising Cas12i2 and HAO1-targeting RNA guides exhibit different off-target activity profiles.

Example 7-Knockdown of HAO1 protein with Cas12i2 and RNA guide targeting HAO1

This example describes the use of western blotting to identify knockdown of HAO1 protein using the variant Cas12i2 of SEQ ID NO: 924 and an RNA guide targeting HAO1.

Primary hepatocytes from human donors were thawed rapidly from liquid nitrogen in a 37 °C water bath. Cells were added to pre-warmed hepatocyte recovery medium (Thermo Fisher, CM7000) and centrifuged at 100 g for 10 minutes. The cell pellet was resuspended in an appropriate volume of hepatocyte plating medium (Williams' Medium E, Thermo Fisher A1217601, supplemented with the hepatocyte plating supplement pack (serum-containing), Thermo Fisher CM3000). Cells were counted for viability with trypan blue using an INCYTO disposable hemocytometer (Fisher Scientific, 22-600-100). The cells were then washed in PBS and resuspended in P3 buffer + supplement (Lonza, VXP-3032) at a concentration of about 5,000 cells/µL. The resuspended cells were dispensed into Lonza electroporation cuvettes at 500,000 cells/reaction.

For the RNP reaction, E2T5 (SEQ ID NO: 989) was used as the RNA guide targeting HAO1. RNP was added to each reaction at a final concentration of 20 µM (Cas12i2), followed by the transfection enhancer oligomer at a final concentration of 4 µM. Non-electroporated cells and cells electroporated without cargo were used as negative controls.

The strips were electroporated using an electroporation device (program CA137, Lonza 4D-NUCLEOFECTOR™). Immediately after electroporation, pre-warmed hepatocyte plating medium was added to each well and mixed very gently by pipetting. For each technical replicate plate, 500,000 diluted nucleofected cells were plated into a pre-warmed collagen-coated 24-well plate (Thermo Fisher) with wells containing hepatocyte plating medium. The cells were then incubated at 37 °C. After the cells had been allowed to adhere for 24 hours, the medium was replaced with hepatocyte maintenance medium (Williams' Medium E, Thermo Fisher A1217601, supplemented with the Williams' E Medium cell maintenance cocktail, Thermo Fisher CM4000). The hepatocyte maintenance medium was replaced with fresh medium every 48 hours.

16 days after RNP electroporation, the medium was aspirated and the cells were gently washed with PBS. The cells were then lysed on ice for 30 minutes in RIPA lysis and extraction buffer (Thermo Fisher 89901) + 1x protease inhibitor (Thermo Fisher 78440), with the samples mixed every 5 minutes. Cell lysates were quantified using the Pierce BCA protein assay kit (Thermo Fisher 23227). 15 µg of total protein per sample was prepared for SDS-PAGE in 1x Laemmli sample buffer (Bio-Rad 1610747) with 100 mM DTT and then heated at 95 °C for 10 minutes. Samples were run on a 4-15% TGX gel (Bio-Rad 5671084) at 200 V for 45 minutes. Samples were transferred to a 0.2 µm nitrocellulose membrane (Bio-Rad 1704159) using the Trans-Blot Turbo system. The membrane was blocked in Intercept TBS blocking buffer (LI-COR 927-60001) for 30 minutes at room temperature. The blots were then incubated overnight at 4 °C in blocking buffer with primary anti-HAO1 antibody (GeneTex GTX 81144) diluted 1:1000 and primary anti-vinculin antibody (Sigma, V9131) diluted 1:2500. The blots were washed three times with TBST (Thermo Fisher 28360) for 5 minutes each and then incubated for 1 hour at room temperature with IR680 anti-mouse (Thermo Fisher PI 35518) and IR800 anti-rabbit (Thermo Fisher PISA 535571) secondary antibodies diluted 1:12500 in TBST. The blots were then washed three times with TBST for 5 minutes each and visualized on a LI-COR Odyssey CLx.

Knockdown of HAO1 protein was observed in primary human hepatocytes on day 7 post-editing with Cas12i2 RNP targeting the HAO1 gene with E2T5 (lanes 1-3 of FIG. 8). No HAO1 knockdown was observed for the buffer-only controls (lanes 4-7).

Thus, this example shows that HAO1 protein levels decrease after editing with Cas12i2 and RNA guides targeting HAO 1.

OTHER EMBODIMENTS

All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.

From the foregoing description, one skilled in the art can easily ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Accordingly, other embodiments are within the scope of the following claims.

Equivalents

Although a number of embodiments of the invention have been described and illustrated herein, various other means and/or structures for performing the functions described herein and/or obtaining one or more of these results and/or advantages will be apparent to those of ordinary skill in the art, and each such variation and/or modification is deemed to be within the scope of the embodiments of the invention described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the present invention is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, embodiments of the invention may be practiced otherwise than as specifically described and claimed. Embodiments of the invention of the present disclosure relate to each individual feature, system, article, material, kit, and/or method described herein. In addition, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, any combination of two or more such features, systems, articles, materials, kits, and/or methods is included within the scope of the present disclosure.

All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.

All references, patents and patent applications disclosed herein are incorporated by reference with respect to the subject matter to which each is cited, and in some cases, may encompass the entire document.

The indefinite articles "a" and "an" as used herein in the specification and claims should be understood to mean "at least one" unless explicitly stated to the contrary.

As used herein in the specification and claims, the phrase "and/or" should be understood to mean "either or both" of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to "A and/or B," when used in conjunction with open-ended language such as "comprising," can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.

As used herein in the specification and claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicating the contrary, such as "only one of" or "exactly one of," or, when used in the claims, "consisting of," will refer to the inclusion of exactly one element of a number or list of elements. In general, the term "or" as used herein shall only be interpreted as indicating exclusive alternatives (i.e., "one or the other but not both") when preceded by terms of exclusivity, such as "either," "one of," "only one of," or "exactly one of." "Consisting essentially of," when used in the claims, shall have its ordinary meaning as used in the field of patent law.

As used herein in the specification and claims, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B," or, equivalently, "at least one of A and/or B") can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.

It should also be understood that in any method claimed herein that comprises more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited, unless explicitly stated to the contrary.

Claims

1. A gene editing system for gene editing of a hydroxy acid oxidase 1 (HAO 1) gene, the gene editing system comprising

(i) A Cas12i2 polypeptide or a first nucleic acid encoding the Cas12i2 polypeptide, wherein the Cas12i2 polypeptide comprises an amino acid sequence that is at least 95% identical to SEQ ID No. 922 and comprises one or more mutations relative to SEQ ID No. 922;

(ii) An RNA guide or a second nucleic acid encoding the RNA guide, wherein the RNA guide comprises a spacer sequence specific for a target sequence within the HAO1 gene, the target sequence being adjacent to a Protospacer Adjacent Motif (PAM) comprising a motif 5' -TTN-3' located 5' to the target sequence.

2. The gene editing system of claim 1, wherein the one or more mutations in the Cas12i2 polypeptide are located at positions D581, G624, F626, P868, I926, V1030, E1035, and/or S1046 of SEQ ID NO: 922.

3. The gene editing system of claim 1 or claim 2, wherein the one or more mutations is an amino acid substitution, optionally D581R, G624R, F626R, P868T, I926R, V1030G, E1035R, S1046G or a combination thereof.

4. The gene editing system of claim 3, wherein the Cas12i2 polypeptide comprises:

(i) Mutations at positions D581, D911, I926 and V1030, optionally amino acid substitutions of D581R, D911R, I926R and V1030G;

(ii) Mutations at positions D581, I926 and V1030, optionally amino acid substitutions of D581R, I926R and V1030G;

(iii) Mutations at positions D581, I926, V1030 and S1046, optionally amino acid substitutions of D581R, I926R, V1030G and S1046G;

(iv) Mutations at positions D581, G624, F626, I926, V1030, E1035 and S1046, optionally amino acid substitutions of D581R, G624R, F626R, I926R, V1030G, E1035R and S1046G; or

(v) Mutations at positions D581, G624, F626, P868, I926, V1030, E1035 and S1046, optionally amino acid substitutions of D581R, G624R, F626R, P868T, I926R, V1030G, E1035R and S1046G.

5. The gene editing system of claim 1, wherein the Cas12i2 polypeptide comprises the amino acid sequence of SEQ ID No. 923, 924, 925, 926, or 927, optionally wherein the Cas12i2 polypeptide comprises the amino acid sequence of SEQ ID No. 924 or 927.

6. The gene editing system of any one of claims 1 to 5, comprising the first nucleic acid encoding the Cas12i2 polypeptide.

7. The gene editing system of claim 6 wherein the first nucleic acid is messenger RNA (mRNA).

8. The gene editing system of claim 7, wherein the first nucleic acid is contained in a viral vector, optionally an adeno-associated virus (AAV) vector.

9. The gene editing system of any of claims 1 to 8, wherein the target sequence is within exon 1 or exon 2 of the HAO1 gene.

10. The gene editing system of claim 9, wherein the target sequence comprises:

(i) 5'-CAAAGTCTATATATGACTAT-3' (SEQ ID NO: 1025);

(ii) 5'-GGAAGTACTGATTTAGCATG-3' (SEQ ID NO: 1026);

(iii) 5'-TAGATGGAAGCTGTATCCAA-3' (SEQ ID NO: 1046);

(iv) 5'-CGGAGCATCCTTGGATACAG-3' (SEQ ID NO: 1047); or

(v) 5'-AGGACAGAGGGTCAGCATGC-3' (SEQ ID NO: 1052).

11. The system of claim 10, wherein the spacer sequence comprises:

(i) 5'-CAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 1093);

(ii) 5'-GGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 1094);

(iii) 5'-UAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 1095);

(iv) 5'-CGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 1096); or

(v) 5'-AGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 1097).

12. The gene editing system of any of claims 1 to 11, wherein the spacer sequence is 20-30 nucleotides in length, optionally wherein the spacer is 20 nucleotides in length.

13. The gene editing system of any of claims 1 to 12, wherein the RNA guide comprises the spacer sequence and a direct repeat sequence.

14. The gene editing system of claim 13 wherein the direct repeat sequence is 23-36 nucleotides in length.

15. The gene editing system of claim 14 wherein the direct repeat sequence is at least 90% identical to any one of SEQ ID NOs 1-10 or fragments thereof of at least 23 nucleotides in length.

16. The gene editing system of claim 15 wherein the direct repeat is any one of SEQ ID NOs 1-10 or a fragment thereof of at least 23 nucleotides in length.

17. The gene editing system of claim 16 wherein the direct repeat is 5'-AGAAAUCCGUCUUUCAUUGACGG-3' (SEQ ID NO: 10).

18. The gene editing system of claim 1, wherein the RNA guide comprises the nucleotide sequence of:

(i) 5'-AGAAAUCCGUCUUUCAUUGACGGCAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 967);

(ii) 5'-AGAAAUCCGUCUUUCAUUGACGGGGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 968);

(iii) 5'-AGAAAUCCGUCUUUCAUUGACGGUAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 988);

(iv) 5'-AGAAAUCCGUCUUUCAUUGACGGCGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 989); or

(v) 5'-AGAAAUCCGUCUUUCAUUGACGGAGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 994).

19. The gene editing system of any of claims 1 to 18, wherein the system comprises the second nucleic acid encoding the RNA guide.

20. The gene editing system of claim 19 wherein the nucleic acid encoding the RNA guide is located in a viral vector.

21. The gene editing system of any of claims 7 to 20, wherein the viral vector comprises both the first nucleic acid encoding the Cas12i2 polypeptide and the second nucleic acid encoding the RNA guide.

22. The gene editing system of any one of claims 1 to 20, wherein the system comprises the first nucleic acid encoding the Cas12i2 polypeptide in a first vector, and wherein the system comprises the second nucleic acid encoding the RNA guide in a second vector; optionally wherein the first vector and/or the second vector is a viral vector.

23. The gene editing system of claim 22 wherein the first vector and the second vector are the same vector.

24. The gene editing system of any of claims 1 to 23, wherein the system comprises one or more Lipid Nanoparticles (LNPs) that encompass (i), (ii), or both.

25. The gene editing system of claim 24, wherein the system comprises the LNP encompassing (i), and wherein the system comprises a viral vector comprising the second nucleic acid encoding the RNA guide; optionally wherein the viral vector is an AAV vector.

26. The gene editing system of claim 24, wherein the system comprises the LNP comprising (ii), and wherein the system comprises a viral vector comprising the first nucleic acid encoding Cas12i2 polypeptide; optionally wherein the viral vector is an AAV vector.

27. A gene editing system for gene editing of a hydroxy acid oxidase 1 (HAO 1) gene, the gene editing system comprising

(i) A Cas12i polypeptide or a first nucleic acid encoding the Cas12i polypeptide, optionally wherein the Cas12i polypeptide is a Cas12i2 polypeptide;

(ii) An RNA guide or a second nucleic acid encoding the RNA guide, wherein the RNA guide comprises a spacer sequence specific for a target sequence within exon 1 or exon 2 of the HAO1 gene, the target sequence being adjacent to a Protospacer Adjacent Motif (PAM) comprising a motif of 5' -TTN-3' located 5' to the target sequence.

28. The gene editing system of claim 27, wherein the target sequence comprises:

(i) 5'-CAAAGTCTATATATGACTAT-3' (SEQ ID NO: 1025);

(ii) 5'-GGAAGTACTGATTTAGCATG-3' (SEQ ID NO: 1026);

(iii) 5'-TAGATGGAAGCTGTATCCAA-3' (SEQ ID NO: 1046);

(iv) 5'-CGGAGCATCCTTGGATACAG-3' (SEQ ID NO: 1047); or

(v) 5'-AGGACAGAGGGTCAGCATGC-3' (SEQ ID NO: 1052).

29. The gene editing system of claim 27, wherein the spacer sequence comprises:

(i) 5'-CAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 1093);

(ii) 5'-GGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 1094);

(iii) 5'-UAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 1095);

(iv) 5'-CGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 1096); or

(v) 5'-AGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 1097).

30. The gene editing system of any of claims 27 to 29, comprising the first nucleic acid encoding the Cas12i polypeptide.

31. The gene editing system of claim 30 wherein the first nucleic acid is messenger RNA (mRNA).

32. The gene editing system of claim 30, wherein the first nucleic acid is comprised in a viral vector, optionally an adeno-associated virus (AAV) vector.

33. The gene editing system of any of claims 27 to 32, wherein the spacer is 20-30 nucleotides in length, optionally wherein the spacer is 20 nucleotides in length.

34. The gene editing system of any of claims 27 to 33, wherein the RNA guide comprises the spacer sequence and a direct repeat sequence.

35. The gene editing system of claim 34 wherein the direct repeat sequence is 23-36 nucleotides in length.

36. The gene editing system of claim 35 wherein the direct repeat sequence is at least 90% identical to any one of SEQ ID NOs 1-10 or fragments thereof of at least 23 nucleotides in length.

37. The gene editing system of claim 36 wherein the direct repeat is any of SEQ ID NOs 1-10 or fragments thereof of at least 23 nucleotides in length.

38. The gene editing system of claim 37 wherein the direct repeat is 5'-AGAAAUCCGUCUUUCAUUGACGG-3' (SEQ ID NO: 10).

39. The gene editing system of claim 34 wherein the RNA guide comprises the nucleotide sequence of:

(i) 5'-AGAAAUCCGUCUUUCAUUGACGGCAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 967);

(ii) 5'-AGAAAUCCGUCUUUCAUUGACGGGGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 968);

(iii) 5'-AGAAAUCCGUCUUUCAUUGACGGUAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 988);

(iv) 5'-AGAAAUCCGUCUUUCAUUGACGGCGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 989); or

(v) 5'-AGAAAUCCGUCUUUCAUUGACGGAGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 994).

40. The gene editing system of any of claims 27 to 39 wherein the system comprises the second nucleic acid encoding the RNA guide.

41. The gene editing system of claim 40 wherein the nucleic acid encoding the RNA guide is located in a viral vector.

42. The gene editing system of any of claims 32 to 41 wherein the viral vector comprises both the first nucleic acid encoding the Cas12i2 polypeptide and the second nucleic acid encoding the RNA guide.

43. The gene editing system of any of claims 27 to 42, wherein the system comprises the first nucleic acid encoding the Cas12i2 polypeptide in a first vector, and wherein the system comprises the second nucleic acid encoding the RNA guide in a second vector.

44. The gene editing system of any of claims 27 to 43, wherein the system comprises one or more lipid nanoparticles (LNPs) that encompass (i), (ii), or both.

45. The gene editing system of claim 44 wherein the system comprises the LNP encompassing (i), and wherein the system comprises a viral vector comprising the second nucleic acid encoding the RNA guide; optionally wherein the viral vector is an AAV vector.

46. The gene editing system of claim 44 wherein the system comprises the LNP comprising (ii), and wherein the system comprises a viral vector comprising the first nucleic acid encoding a Cas12i2 polypeptide; optionally wherein the viral vector is an AAV vector.

47. A pharmaceutical composition comprising the gene editing system of any one of claims 1 to 46.

48. A kit comprising elements (i) and (ii) of the gene editing system of any one of claims 1 to 46.

49. A method for editing a hydroxy acid oxidase 1 (HAO1) gene in a cell, the method comprising contacting a host cell with the gene editing system of any one of claims 1 to 46 to edit the HAO1 gene in the host cell.

50. The method of claim 49, wherein the host cell is cultured in vitro.

51. The method of claim 49, wherein the contacting step is performed by administering the system for editing the HAO1 gene to a subject comprising the host cell.

52. A cell comprising a disrupted hydroxy acid oxidase 1 (HAO1) gene, wherein the cell is optionally produced by contacting a host cell with the gene editing system of any one of claims 1 to 46 to edit the HAO1 gene in the host cell, thereby disrupting the HAO1 gene.

53. A method for treating Primary Hyperoxaluria (PH) in a subject, the method comprising administering to a subject in need thereof the gene editing system of any one of claims 1 to 46 or the cell of claim 52 for editing a hydroxy acid oxidase 1 (HAO1) gene.

54. The method of claim 53, wherein the subject is a human patient having PH, optionally PH1, PH2, or PH3.

55. The method of claim 54, wherein the PH is PH1.

56. An RNA guide comprising (i) a spacer sequence specific for a target sequence in a hydroxy acid oxidase 1 (HAO1) gene, wherein the target sequence is adjacent to a protospacer adjacent motif (PAM) comprising the motif 5'-TTN-3', the PAM being located 5' of the target sequence; and (ii) a direct repeat sequence.
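The PAM relationship recited in claim 56 can be illustrated with a minimal Python sketch that scans one strand of a DNA string for a 5'-TTN-3' motif and reports the 20 nucleotides immediately 3' of it as a candidate target sequence. The function name `find_ttn_targets`, the single-strand scan, and the toy input are assumptions made for illustration; the flanking bases in the toy input are invented and are not taken from the HAO1 gene.

```python
# Illustrative sketch only: a single-strand scan for a 5'-TTN-3' PAM
# located immediately 5' of a candidate target sequence.
def find_ttn_targets(dna, spacer_len=20):
    """Return (pam_index, pam, target) tuples for each TTN PAM found in dna."""
    dna = dna.upper()
    hits = []
    # The last usable PAM start leaves room for the PAM (3 nt) and the target.
    for i in range(len(dna) - 3 - spacer_len + 1):
        if dna[i:i + 2] == "TT" and dna[i + 2] in "ACGT":
            pam = dna[i:i + 3]
            target = dna[i + 3:i + 3 + spacer_len]
            hits.append((i, pam, target))
    return hits


# Toy input: invented flanking bases around the target of SEQ ID NO: 1025.
TOY_SEQUENCE = "GCTTC" + "CAAAGTCTATATATGACTAT" + "GA"

for index, pam, target in find_ttn_targets(TOY_SEQUENCE):
    print(index, pam, target)
```

A fuller analysis would also scan the reverse complement strand and apply any additional selection criteria; those steps are omitted here.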

57. The RNA guide of claim 56, wherein the spacer is 20-30 nucleotides in length, optionally 20 nucleotides in length.

58. The RNA guide of claim 56 or claim 57, wherein the direct repeat is 23-36 nucleotides in length, optionally 23 nucleotides in length.

59. The RNA guide of any one of claims 56 to 58, wherein the target sequence is within exon 1 or exon 2 of the HAO1 gene.

60. The RNA guide of claim 59, wherein the target sequence comprises:

(i) 5'-CAAAGTCTATATATGACTAT-3' (SEQ ID NO: 1025);

(ii) 5'-GGAAGTACTGATTTAGCATG-3' (SEQ ID NO: 1026);

(iii) 5'-TAGATGGAAGCTGTATCCAA-3' (SEQ ID NO: 1046);

(iv) 5'-CGGAGCATCCTTGGATACAG-3' (SEQ ID NO: 1047); or

(v) 5'-AGGACAGAGGGTCAGCATGC-3' (SEQ ID NO: 1052).

61. The RNA guide of claim 60, wherein the spacer sequence comprises:

(i) 5'-CAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 1093);

(ii) 5'-GGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 1094);

(iii) 5'-UAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 1095);

(iv) 5'-CGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 1096); or

(v) 5'-AGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 1097).
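The spacer sequences of claim 61 correspond to the DNA target sequences of claim 60 with each T written as U. The following minimal Python sketch illustrates that correspondence. The sequences are copied from the two claims; the helper name `dna_to_rna` and the pairing of entries by list position, (i) with (i) and so on, are assumptions made for illustration.

```python
# Illustrative sketch only. Sequences are copied from claims 60 and 61;
# pairing by list position is an assumption.
TARGETS = {  # DNA target sequences, keyed by SEQ ID NO (claim 60)
    1025: "CAAAGTCTATATATGACTAT",
    1026: "GGAAGTACTGATTTAGCATG",
    1046: "TAGATGGAAGCTGTATCCAA",
    1047: "CGGAGCATCCTTGGATACAG",
    1052: "AGGACAGAGGGTCAGCATGC",
}

SPACERS = {  # RNA spacer sequences, keyed by SEQ ID NO (claim 61)
    1093: "CAAAGUCUAUAUAUGACUAU",
    1094: "GGAAGUACUGAUUUAGCAUG",
    1095: "UAGAUGGAAGCUGUAUCCAA",
    1096: "CGGAGCAUCCUUGGAUACAG",
    1097: "AGGACAGAGGGUCAGCAUGC",
}

# Entries (i) to (v) of claim 60 paired with entries (i) to (v) of claim 61.
PAIRS = [(1025, 1093), (1026, 1094), (1046, 1095), (1047, 1096), (1052, 1097)]


def dna_to_rna(dna):
    """Write a DNA sequence in RNA letters by replacing T with U."""
    return dna.upper().replace("T", "U")


for target_id, spacer_id in PAIRS:
    matches = dna_to_rna(TARGETS[target_id]) == SPACERS[spacer_id]
    print(target_id, spacer_id, matches)
```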

62. The RNA guide of any one of claims 56 to 61, wherein the direct repeat sequence is at least 90% identical to any one of SEQ ID NOs: 1-10 or a fragment thereof that is at least 23 nucleotides in length.

63. The RNA guide of claim 62, wherein the direct repeat is any one of SEQ ID NOs 1-10 or a fragment thereof of at least 23 nucleotides in length.

64. The RNA guide of claim 63, wherein the direct repeat is 5'-AGAAAUCCGUCUUUCAUUGACGG-3' (SEQ ID NO: 10).

65. The RNA guide of claim 56, comprising the nucleotide sequence of:

(i) 5'-AGAAAUCCGUCUUUCAUUGACGGCAAAGUCUAUAUAUGACUAU-3' (SEQ ID NO: 967);

(ii) 5'-AGAAAUCCGUCUUUCAUUGACGGGGAAGUACUGAUUUAGCAUG-3' (SEQ ID NO: 968);

(iii) 5'-AGAAAUCCGUCUUUCAUUGACGGUAGAUGGAAGCUGUAUCCAA-3' (SEQ ID NO: 988);

(iv) 5'-AGAAAUCCGUCUUUCAUUGACGGCGGAGCAUCCUUGGAUACAG-3' (SEQ ID NO: 989); or

(v) 5'-AGAAAUCCGUCUUUCAUUGACGGAGGACAGAGGGUCAGCAUGC-3' (SEQ ID NO: 994).