WO2022221278A1 - Compositions and methods comprising hybrid promoters - Google Patents

Compositions and methods comprising hybrid promoters

Info

Publication number
WO2022221278A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
seq
rna
vector
protein
Application number
PCT/US2022/024419
Other languages
English (en)
Inventor
Greg NACHTRAB
Original Assignee
Locanabio, Inc.
Application filed by Locanabio, Inc.
Publication of WO2022221278A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • A61K48/0066 Manipulation of the nucleic acid to modify its expression pattern, e.g. enhance its duration of expression, achieved by the presence of particular introns in the delivered nucleic acid
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/15 Vector systems having a special element relevant for transcription chimeric enhancer/promoter combination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/42 Vector systems having a special element relevant for transcription being an intron or intervening sequence for splicing and/or stability of RNA

Definitions

  • the disclosure is directed to molecular biology, gene therapy, and compositions and methods for modifying expression and activity of RNA molecules.
  • hybrid promoters comprising a modified intron, such as a UBB (Ubiquitin B) modified intron, were engineered.
  • the hybrid promoters disclosed herein are engineered to be significantly smaller in size than commonly used ubiquitous promoters - which typically range from 900 base pairs to 2 kilobases - and are capable of driving higher levels of expression than the standard commonly used promoters.
  • Such shortened hybrid promoters also allow for a vector capacity which accommodates larger therapeutic transgenes in gene therapy products.
  • the disclosure provides gene therapy compositions and methods comprising hybrid promoters of a significantly smaller size which are capable of driving high levels of sustained expression.
  • RNA-targeting gene therapy compositions and systems comprising such hybrid promoters, are provided herein.
  • compositions and methods comprising hybrid promoter sequences of significantly reduced sizes for sustained and robust levels of transgene expression.
  • hybrid promoters disclosed herein comprise core promoter sequences and modified UBB intron sequences. In another embodiment, hybrid promoters disclosed herein comprise modified enhancer sequences, core promoter sequences and modified UBB intron sequences.
  • a nucleic acid molecule comprising a hybrid promoter sequence, wherein the hybrid promoter comprises a promoter sequence in operable linkage with a UBB intron sequence.
  • a UBB intron sequence is from the 5’ non-coding flank of the human UBB gene.
  • the UBB intron sequence is a modified UBB intron sequence.
  • the modified UBB intron sequence is SEQ ID NO: 448.
  • the nucleic acid molecule comprising the hybrid promoter sequence further comprises an enhancer sequence in operable linkage with the core promoter sequence.
  • the enhancer sequence is a CMV enhancer sequence.
  • the CMV enhancer sequence is SEQ ID NO: 449.
  • the nucleic acid molecule comprising the hybrid promoter comprises a core promoter sequence which is human EF-1 alpha (EFS) promoter sequence.
  • the EFS promoter sequence is SEQ ID NO: 447.
  • the nucleic acid molecule comprising the hybrid promoter is an eCMV-EFS(core)-UBB(intron) sequence of SEQ ID NO: 450. In another embodiment, the nucleic acid molecule comprising the hybrid promoter is an EFS(core)-UBB(intron) sequence of SEQ ID NO: 451.
  • Disclosed herein is a vector comprising a hybrid promoter sequence comprising a promoter sequence in operable linkage with a UBB intron sequence.
  • a vector comprising a hybrid promoter sequence comprising a core promoter sequence in operable linkage with a UBB intron sequence.
  • the hybrid promoter sequence is operably linked to a nucleic acid sequence or NOI encoding a transgene. In one embodiment, the hybrid promoter sequence is operably linked to a nucleic acid sequence or NOI encoding a therapeutic transgene.
  • the vector is a viral vector.
  • the viral vector is an adeno-associated viral (AAV) vector.
  • the AAV vector is AAV2, AAV8, AAVrh.8, AAV9, AAVrh10, or AAVrh.74.
  • the AAV vector comprises AAV2, AAV8, AAVrh.8, AAV9, AAVrh10, or AAVrh.74 capsid proteins.
  • the AAV vector is an AAV engineered capsid vector.
  • the vector is selected from the group consisting of: adeno-associated virus (AAV), retrovirus, lentivirus, adenovirus, nanoparticle, micelle, liposome, lipoplex, polymersome, polyplex, and dendrimer.
  • the therapeutic transgene in the vector comprises a nucleic acid sequence or NOI encoding an RNA-binding protein.
  • the RNA-binding protein is selected from the group consisting of CRISPR/Cas protein, PUF protein, and PUMBY protein.
  • the RNA-binding protein is a CRISPR/Cas protein, and wherein the nucleic acid sequence further encodes a single guide RNA which is capable of hybridizing to i) the CRISPR/Cas protein and ii) a target RNA sequence of interest.
  • the CRISPR/Cas protein is catalytically inactive.
  • the CRISPR/Cas protein is a Cas13d protein.
  • the RNA-binding protein is fused to an effector protein.
  • the effector protein is an endonuclease.
  • the endonuclease is a nuclease domain of a ZC3H12A zinc-finger endonuclease.
  • the ZC3H12A zinc finger nuclease comprises SEQ ID NO: 358 or SEQ ID NO: 359.
  • the RNA-binding protein is a PUF or PUMBY protein.
  • the PUF or PUMBY protein is a human PUF or PUMBY protein.
  • the PUF or PUMBY protein is fused to a nuclease domain of a ZC3H12A zinc-finger endonuclease.
  • the therapeutic transgene comprises one or more signal sequences selected from the group consisting of a nuclear localization sequence (NLS), and a nuclear export sequence (NES).
  • hybrid promoter comprising the nucleic acid sequence set forth in SEQ ID NO: 450 or SEQ ID NO: 451
  • a cell comprising a vector comprising the hybrid promoter comprising a core promoter sequence in operable linkage with a UBB intron sequence.
  • a method of expressing a transgene in a cell comprising contacting a cell with any of the preceding hybrid promoter sequences in operable linkage with a transgene of interest or NOI.
  • FIG. 1 shows EFS core promoter analysis of GC, TATA and TCT regions of EFS.
  • FIG. 2 shows insertion rationale of 5’ UTR introns: SV40 intron, modified SV40 intron, MVM intron, UBB intron and EF1 alpha intron.
  • FIG. 3 shows results of a luciferase assay which demonstrates exemplary embodiments of the hybrid promoters disclosed herein. Results show the EFS(core) promoter as compared to the control (tCAG) and other EFS test constructs.
  • FIG. 4 shows results of a luciferase assay which demonstrates exemplary embodiments of the hybrid promoters disclosed herein. Results show the EFS(core) compared to EFS(core)-UBB(intron).
  • FIG. 5 shows results of a luciferase assay which demonstrates exemplary embodiments of the hybrid promoters disclosed herein. Results show the EFS(core) compared to EFS(core)-UBB(intron) and eCMV-EFS(core)-UBB(intron).
  • FIG. 6 shows a Bio-Rad ChemiDoc image demonstrating EFS(core)-UBB(intron)-driven expression of a GOI (gene of interest) as compared to controls (tCAG driving expression of GOI).
  • FIG. 7 shows RNA-FISH results demonstrating expression of a gene of interest (GOI) in brain tissue using the hybrid promoter EFS(core)-UBB(intron) as compared to the control tCAG.
  • GOI is a PUF-E17 construct.
  • the disclosure provides gene therapy compositions comprising strong ubiquitous hybrid promoters of a size smaller than most common promoters.
  • compositions comprising nucleic acid molecules, and vectors comprising hybrid promoters.
  • a “promoter” is a control sequence that is a region of a polynucleotide sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules may bind such as RNA polymerase and other transcription factors.
  • a “hybrid promoter” is a promoter comprised of two or more genetic elements.
  • the two or more genetic elements are heterologous (i.e. derived from different genetic loci).
  • hybrid promoter disclosed herein is operably linked to a nucleic acid sequence of interest (NOI).
  • the terminology “operably linked” is intended to mean that the hybrid promoter is linked to an NOI in a manner permitting expression of the nucleotide sequence in, for example, a host cell when the vector is introduced into (or in contact with) the host cell.
  • An NOI includes, without limitation, any nucleotide sequence or transgene capable of being delivered by a vector.
  • NOIs can be synthetic, derived from naturally occurring DNA or RNA, codon optimized, recombinant RNA/DNA, cDNA, partial genomic DNA, and/or combinations thereof.
  • the NOI can be a coding region or partial coding region, but need not be a coding region.
  • An NOI can be RNA/DNA in a sense or anti-sense orientation.
  • NOIs are also referred to herein, without limitation, as transgenes, heterologous sequences, genes, or therapeutic genes.
  • An NOI may also encode a POI (protein of interest), a partial POI, a mutated version or variant of a POI.
  • a POI may be analogous to or correspond to a wild-type protein.
  • a POI may also be a fusion protein or nucleoprotein complex such as a CRISPR/Cas nucleoprotein complex.
  • Adeno-associated virus (AAV) packaging capacity is only 4.8 kilobases (kb). Therefore, to maximize the potential of AAV as a delivery vector, short hybrid promoters ranging from about 371 base pairs (bp) to about 675 bp were generated. These promoters are significantly shorter than commonly used ubiquitous promoters, which range from about 900 bp to about 2,000 bp (or 2 kb). These shorter hybrid promoters have been engineered to drive higher levels of expression than the standard promoters from which they are derived.
  • EFS short intron-less EF-1 alpha
  • SEQ ID NO: 453 standard ubiquitous promoter
  • the standard EFS promoter, while smaller in size (about 250 bp) than other standard ubiquitous promoters (which are typically about 1 kb or longer), is weak and in many cases does not drive sufficient expression for a therapeutic effect.
  • Short hybrid promoters afford the packaging of larger transgenes than transgenes driven by the larger standard promoters such as EFS, while still providing robust and sustained expression of the transgene insert (or NOI). Accordingly, these short hybrid promoters permit the packaging of larger transgenes due to the smaller sized hybrid promoters disclosed herein.
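The size argument above is simple arithmetic and can be sketched as follows. This is a minimal illustration, not part of the disclosure: the function name `remaining_capacity` and the dictionary labels are hypothetical, and the sizes are the approximate figures quoted in the text. The remainder reported here is everything other than the promoter, so it must still hold the transgene and any other cassette elements.

```python
# Approximate sizes in base pairs, taken from the figures quoted in the text.
AAV_CAPACITY_BP = 4800

PROMOTER_SIZES_BP = {
    "shortest hybrid promoter": 371,
    "longest hybrid promoter": 675,
    "common ubiquitous promoter (low end)": 900,
    "common ubiquitous promoter (high end)": 2000,
}


def remaining_capacity(promoter_bp: int, capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Return the base pairs left in the vector once the promoter is placed."""
    if promoter_bp < 0:
        raise ValueError("promoter size cannot be negative")
    if promoter_bp > capacity_bp:
        raise ValueError("promoter exceeds packaging capacity")
    return capacity_bp - promoter_bp


for label, size_bp in PROMOTER_SIZES_BP.items():
    print(f"{label}: {size_bp} bp used, {remaining_capacity(size_bp)} bp remaining")
```

Under these figures, replacing a 2 kb promoter with the longest hybrid promoter frees 1,325 bp for the rest of the cassette.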
  • the invention disclosed herein is comprised of the following three primary embodiments used alone or in various combinations: 1) a core promoter region, 2) a modified intron, and/or 3) a modified enhancer region.
  • the invention disclosed herein provides 1) a core promoter region, 2) a Ubiquitin B (UBB) modified intron, and/or 3) a modified enhancer region.
  • the invention disclosed herein provides 1) an EFS core promoter, 2) a Ubiquitin B (UBB) modified intron, and/or 3) a modified enhancer region.
  • the invention disclosed herein provides 1) an EFS core promoter, 2) a Ubiquitin B (UBB) modified intron, and/or 3) a modified CMV enhancer region.
  • the invention disclosed herein provides 1) an EFS core promoter comprising a core promoter sequence comprising or consisting of SEQ ID NO: 447, 2) a Ubiquitin B (UBB) modified intron comprising or consisting of SEQ ID NO: 448, and/or 3) a modified CMV enhancer region comprising a modified enhancer sequence comprising or consisting of SEQ ID NO: 449.
  • the hybrid promoter disclosed herein comprises from 5’ to 3’ eCMV-EFS(core)-UBB(intron).
  • the hybrid promoter comprising from 5’ to 3’ eCMV-EFS(core)-UBB(intron) is SEQ ID NO: 450 and is about 675 bp in length.
  • the hybrid promoter from 5’ to 3’ eCMV-EFS(core)-UBB(intron) is operably linked to an NOI or transgene.
  • the hybrid promoter disclosed herein comprises from 5’ to 3’ EFS(core)-UBB(intron).
  • the hybrid promoter comprising from 5’ to 3’ EFS(core)-UBB(intron) is SEQ ID NO: 451 and is about 353 bp in length.
  • the hybrid promoter comprising from 5’ to 3’ EFS(core)-UBB(intron) is operably linked to an NOI or transgene.
  • the hybrid promoter elements are not in direct operable linkage.
  • the hybrid promoter elements and transgene from 5’ to 3’ comprises EFS(core)-UBB(intron)-transgene-eCMV.
  • the hybrid promoter elements and transgene from 5’ to 3’ comprises EFS(core)-transgene-UBB(intron), wherein the UBB(intron) is in the 3’UTR of the transgene.
  • the hybrid promoter elements and transgene from 5’ to 3’ comprises eCMV-EFS(core)-transgene-UBB(intron), wherein the UBB(intron) is in the 3’UTR of the transgene.
  • the hybrid promoter elements and transgene from 5’ to 3’ comprises EFS(core)-transgene-UBB(intron)-eCMV, wherein the UBB(intron) is in the 3’UTR of the transgene.
  • the invention disclosed herein provides a hybrid promoter comprising a core EFS promoter sequence which has improved strength and increased expression relative to the standard EFS promoter.
  • the invention disclosed herein provides a hybrid promoter comprising a core EFS promoter sequence which has improved responsiveness and therefore a larger dynamic range than the standard EFS promoter.
  • the invention disclosed herein provides a hybrid promoter comprising a core EFS promoter sequence which has a smaller size than the standard EFS promoter.
  • a hybrid promoter comprising a core EFS promoter sequence and a modified UBB intron sequence is under 400 bp, is approximately 3 times stronger than a standard EFS promoter and has the capability to increase expression with increased dose.
  • a hybrid promoter comprising a core EFS promoter sequence, a modified UBB intron sequence, and a modified enhancer sequence is under 700 bp, is approximately 5 times stronger than a standard EFS promoter, and has the capability to increase expression with increased dose.
  • a “core promoter region” as disclosed herein is a minimal region comprising a promoter sequence or “core promoter sequence” which is derived from a corresponding standard promoter sequence which retains the capability of recruiting RNA Polymerase II for RNA transcription.
  • a core promoter region is identified in the standard EFS promoter.
  • the EFS core promoter region is about 212 bp and is a nucleic acid sequence comprising or consisting of SEQ ID NO: 447.
  • the EFS core promoter disclosed herein recruits RNA Polymerase II for RNA transcription.
  • the core promoter sequence is optimized or modified in the initiator, DPE (downstream promoter element), TATA box and/or GC box regions.
  • the standard EFS TATA box sequence which is TATATAAG is mutated to TATAAAG or TATATAAAG. These mutated TATA sequences incorporate higher affinity sites.
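The TATA box substitution described above can be sketched as a string operation. This is an illustration only: the function name `swap_tata_box` and the flanking bases in `toy_promoter` are made up for the example and are not SEQ ID NO: 447; only the three TATA sequences come from the text.

```python
STANDARD_TATA = "TATATAAG"                  # standard EFS TATA box, per the text
MUTANT_TATAS = ("TATAAAG", "TATATAAAG")     # the two mutated versions, per the text


def swap_tata_box(promoter: str, mutant: str) -> str:
    """Replace the single standard TATA box in `promoter` with `mutant`."""
    promoter = promoter.upper()
    if mutant not in MUTANT_TATAS:
        raise ValueError("mutant is not one of the sequences named in the text")
    if promoter.count(STANDARD_TATA) != 1:
        raise ValueError("expected exactly one standard TATA box")
    return promoter.replace(STANDARD_TATA, mutant)


# Hypothetical flanking sequence, used only to make the example runnable.
toy_promoter = "GGGCGGCC" + STANDARD_TATA + "GCAGTCTT"

for mutant in MUTANT_TATAS:
    edited = swap_tata_box(toy_promoter, mutant)
    print(mutant, edited, len(edited) - len(toy_promoter))
```

The first mutant shortens the promoter by one base and the second lengthens it by one, so either change has a negligible effect on overall promoter size.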
  • a core promoter region is identified in a standard (wild-type) ubiquitous promoter of interest.
  • a core promoter is identified in a standard ubiquitous promoter of interest selected from the group consisting of EFS, tCAG, CBA, CBh, CMV, UBC, GRK1 and PGK.
  • a “modified intron” or “modified intron sequence” as disclosed herein is derived from an intron present in an abundantly expressed gene and altered via truncation and/or mutation in such a manner so as to increase the altered intron’s capability to enhance and/or sustain transcription and nuclear export of a transcript.
  • a modified intron sequence is derived from a human UBB (Ubiquitin B) gene.
  • the UBB gene comprises a first intron which is 717 bp in length and which contains 2 ATG and 9 alternative start codons.
  • a modified UBB intron comprises mutations in both ATG and all 9 alternative start codons.
  • a modified UBB intron comprises mutations in the 2 ATG sequences only. This modified intron comprising 2 ATG mutations preserves more of its transcription factor binding sites.
  • a modified intron is a truncated and altered version of the first 717 bp intron of the human UBB (Ubiquitin B) gene.
  • the UBB modified intron is about 130 bp in length and is a nucleic acid sequence comprising or consisting of SEQ ID NO: 448.
  • UBB is abundantly expressed and the first intron of the UBB gene is known to be a key regulator of its expression.
  • the modified intron is added to the 5’ UTR region of the hybrid promoter.
  • the modified intron is a truncated sequence of the corresponding intron.
  • the modified intron is a mutated sequence of the corresponding intron.
  • the modified intron is a truncated and mutated sequence of the corresponding intron.
  • a modified intron is derived from an intron of a gene which is abundantly expressed and/or the intron is known to be a key regulator of the expression of the corresponding gene of the intron.
  • a modified intron as disclosed herein is selected from the group consisting of UBB, SV40 (Simian Virus 40), modified SV40, MVM (Minute Virus of Mice), EF1 alpha, and UBC (Ubiquitin C).
  • a modified intron comprises mutations of start and alternative start codons.
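Locating and mutating ATG triplets in an intron, as described above, can be sketched as follows. The text does not state which substitutions were made or which triplets were treated as alternative start codons, so the `ATC` replacement, the function names, and the toy sequence are all assumptions chosen only for illustration; the sequence is not SEQ ID NO: 448.

```python
import re


def find_codon(seq: str, codon: str = "ATG") -> list[int]:
    """Return 0-based start positions of every occurrence of `codon`."""
    pattern = re.compile(f"(?={re.escape(codon.upper())})")
    return [m.start() for m in pattern.finditer(seq.upper())]


def mutate_codon(seq: str, codon: str = "ATG", replacement: str = "ATC") -> str:
    """Replace every `codon` with a same-length `replacement` (assumed choice)."""
    if len(replacement) != len(codon):
        raise ValueError("replacement must keep the intron length unchanged")
    mutated = seq.upper().replace(codon.upper(), replacement.upper())
    if find_codon(mutated, codon):
        raise ValueError("replacement re-created the codon; pick another")
    return mutated


# Hypothetical intron fragment containing two ATG triplets.
toy_intron = "GTAAGATGCCTTATGGCAG"

print(find_codon(toy_intron))
print(mutate_codon(toy_intron))
```

A same-length substitution is used here so that the only change is removal of the start codon; which base to change in practice would depend on the transcription factor binding sites the intron needs to keep.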
  • An “enhancer sequence” as disclosed herein is an enhancer region derived from enhancers known to aid in the recruitment of RNA Polymerase II.
  • an enhancer sequence is a regulatory element that increases the expression of a target sequence.
  • a “modified enhancer region” or “modified enhancer sequence” as disclosed herein is an enhancer region derived from enhancers known to aid in the recruitment of RNA Polymerase II and altered via truncation and/or mutation in such a manner so as to improve recruitment of RNA Polymerase II as compared to an unmodified enhancer region.
  • a modified enhancer is a truncated version of the CMV enhancer region.
  • the CMV modified enhancer is truncated at the 5’ end of the CMV enhancer region and is about 300 bp in length.
  • the CMV modified enhancer sequence is a nucleotide sequence comprising or consisting of SEQ ID NO: 449.
  • a modified enhancer sequence is added to a core promoter sequence to form a hybrid promoter and/or it is added to the hybrid promoter comprising a core promoter sequence and a modified intron sequence.
  • the modified enhancer region is a putative enhancer around the UBB gene.
  • an RNA-targeting gene therapy construct comprises a hybrid promoter in operable linkage with a nucleic acid sequence encoding an RNA-targeting Cas13d system or a catalytically inactive (dead) Cas13d (dCas13d) system.
  • an RNA-targeting gene therapy construct comprising a hybrid promoter disclosed herein in operable linkage with a nucleic acid sequence encoding a PUF-E17 (PUF fused to a nuclease domain of the ZC3H12A zinc finger endonuclease also referred to herein as E17) or PUMBY-E17.
  • an RNA-targeting gene therapy construct comprises a hybrid promoter in operable linkage with a nucleic acid sequence encoding an RNA-targeting PUF or PUMBY protein.
  • a gene therapy construct comprising an EFS(core) disclosed herein is in operable linkage with a gene of interest.
  • an RNA-targeting gene therapy construct comprises an EFS(core) promoter disclosed herein in operable linkage with a nucleic acid sequence encoding an RNA-targeting Cas13d system or a catalytically inactive (dead) Cas13d (dCas13d) system.
  • an RNA-targeting gene therapy construct comprising an EFS(core) promoter disclosed herein in operable linkage with a nucleic acid sequence encoding a PUF-E17 (PUF fused to a nuclease domain of the ZC3H12A zinc finger endonuclease also referred to herein as E17) or PUMBY-E17.
  • a gene therapy construct comprising an EFS(core)-UBB(intron) disclosed herein is in operable linkage with a gene of interest.
  • an RNA-targeting gene therapy construct comprises an EFS(core)-UBB(intron) promoter disclosed herein in operable linkage with a nucleic acid sequence encoding an RNA-targeting Cas13d system or a catalytically inactive (dead) Cas13d (dCas13d) system.
  • an RNA-targeting gene therapy construct comprising an EFS(core)-UBB(intron) promoter disclosed herein in operable linkage with a nucleic acid sequence encoding a PUF-E17 (PUF fused to a nuclease domain of the ZC3H12A zinc finger endonuclease also referred to herein as E17) or PUMBY-E17.
  • a gene therapy construct comprising an eCMV-EFS(core)-UBB(intron) promoter disclosed herein is in operable linkage with a gene of interest.
  • an RNA-targeting gene therapy construct comprises an eCMV-EFS(core)-UBB(intron) promoter disclosed herein in operable linkage with a nucleic acid sequence encoding an RNA-targeting Cas13d system or a catalytically inactive (dead) Cas13d (dCas13d) system.
  • an RNA-targeting gene therapy construct comprising an eCMV-EFS(core)-UBB(intron) promoter disclosed herein in operable linkage with a nucleic acid sequence encoding a PUF-E17 (PUF fused to a nuclease domain of the ZC3H12A zinc finger endonuclease also referred to herein as E17) or PUMBY-E17.
  • NOIs used in operable linkage with the hybrid promoters disclosed herein can be selected from any gene or transgene of interest.
  • an NOI or transgene comprising Guide RNAs for RNA-Guided RNA-Binding Proteins
  • an NOI or transgene comprises a guide RNA.
  • Guide RNAs (gRNAs) of the disclosure may comprise a spacer sequence and a direct repeat (DR) sequence.
  • a guide RNA is a single guide RNA (sgRNA) comprising a contiguous spacer sequence and DR sequence. In some embodiments, the spacer sequence and the DR sequence are not contiguous. In some embodiments, the gRNA comprises a DR sequence.
  • DR sequences refer to the repetitive sequences in the CRISPR locus (naturally-occurring in a bacterial genome or plasmid) that are interspersed with the spacer sequences. It is well known that one would be able to infer the DR sequence of a corresponding (or cognate) Cas protein if the sequence of the associated CRISPR locus is known.
  • a guide RNA comprises a direct repeat (DR) sequence and a spacer sequence.
  • a sequence encoding a guide RNA or single guide RNA of the disclosure comprises or consists of a spacer sequence and a DR sequence, that are separated by a linker sequence.
  • the linker sequence may comprise or consist of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or any number of nucleotides (nt) in between.
  • the linker sequence may comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or any number of nucleotides in between.
  • the DR sequence is a Cas13d DR sequence.
  • the gRNA that hybridizes with the one or more target RNA molecules in a Cas13d-mediated manner includes one or more direct repeat (DR) sequences, one or more spacer sequences, such as, e.g., one or more sequences comprising an array of DR-spacer-DR-spacer.
  • a plurality of gRNAs are generated from a single array, wherein each gRNA can be different, for example target different RNAs or target multiple regions of a single RNA, or combinations thereof.
  • an isolated gRNA includes one or more direct repeat sequences, such as an unprocessed (e.g., about 36 nt) or processed DR (e.g., about 30 nt).
  • a gRNA can further include one or more spacer sequences specific for (e.g., is complementary to) the target RNA.
  • multiple Pol III promoters can be used to drive multiple gRNAs, spacers and/or DRs.
  • a guide array comprises a DR (about 36nt)-spacer (about 30nt)-DR (about 36nt)-spacer (about 30nt).
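The DR-spacer-DR-spacer layout above can be sketched as a simple concatenation. This is an illustration under stated assumptions: the function name `build_guide_array` is hypothetical, the DR is an all-`N` placeholder rather than a real Cas13d direct repeat, the spacers are toy sequences, and the "about 36 nt" and "about 30 nt" lengths are enforced exactly for simplicity.

```python
DR_LEN_NT = 36       # unprocessed direct repeat, "about 36 nt" in the text
SPACER_LEN_NT = 30   # spacer, "about 30 nt" in the text


def build_guide_array(dr: str, spacers: list[str]) -> str:
    """Concatenate DR-spacer units in order: DR-spacer-DR-spacer-..."""
    if len(dr) != DR_LEN_NT:
        raise ValueError(f"DR must be {DR_LEN_NT} nt in this sketch")
    for spacer in spacers:
        if len(spacer) != SPACER_LEN_NT:
            raise ValueError(f"each spacer must be {SPACER_LEN_NT} nt in this sketch")
    return "".join(dr + spacer for spacer in spacers)


placeholder_dr = "N" * DR_LEN_NT                          # not a real DR sequence
toy_spacers = ["A" * SPACER_LEN_NT, "C" * SPACER_LEN_NT]  # not real spacers

array = build_guide_array(placeholder_dr, toy_spacers)
print(len(array))  # 132 nt for two DR-spacer units
```

Each additional DR-spacer unit adds 66 nt under these lengths, which is one way to estimate how much vector capacity a multi-guide array consumes.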
  • Guide RNAs (gRNAs) of the disclosure may comprise non-naturally occurring nucleotides.
  • a guide RNA of the disclosure or a sequence encoding the guide RNA comprises or consists of modified or synthetic RNA nucleotides.
  • Exemplary modified RNA nucleotides include, but are not limited to, pseudouridine (Ψ), dihydrouridine (D), inosine (I), 7-methylguanosine (m7G), hypoxanthine, xanthine, xanthosine, 7-methylguanine, 5,6-dihydrouracil, 5-methylcytosine, 5-methylcytidine, 5-hydroxymethylcytosine, isoguanine, and isocytosine.
  • Guide RNAs (gRNAs) of the disclosure may bind modified RNA within a target sequence.
  • guide RNAs (gRNAs) of the disclosure may bind modified or mutated (e.g., pathogenic) RNA.
  • exemplary epigenetically or post-transcriptionally modified RNA include, but are not limited to, 2’-O-methylation (2’-OMe) (2’-O-methylation occurs on the oxygen of the free 2’-OH of the ribose moiety), N6-methyladenosine (m6A), and 5-methylcytosine (m5C).
  • a guide RNA of the disclosure comprises at least one sequence encoding a non-coding C/D box small nucleolar RNA (snoRNA) sequence.
  • the snoRNA sequence comprises at least one sequence that is complementary to the target RNA, wherein the target sequence of the RNA molecule comprises at least one 2’-OMe.
  • the snoRNA sequence comprises at least one sequence that is complementary to the target RNA, wherein the at least one sequence that is complementary to the target RNA comprises a box C motif (RUGAUGA) and a box D motif (CUGA).
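The box C and box D motifs named above can be located with a pattern search, reading R as the standard ambiguity code for a purine (A or G). This is a minimal sketch: the function name `find_box_motifs` and the toy RNA are hypothetical, and a sequence match alone does not establish that a transcript functions as a C/D box snoRNA.

```python
import re

# Lookaheads are used so overlapping matches are also reported.
BOX_C = re.compile(r"(?=[AG]UGAUGA)")   # RUGAUGA, with R = A or G
BOX_D = re.compile(r"(?=CUGA)")


def find_box_motifs(rna: str) -> dict[str, list[int]]:
    """Return 0-based start positions of box C and box D motifs in `rna`."""
    rna = rna.upper().replace("T", "U")   # accept DNA-style input as well
    return {
        "box_C": [m.start() for m in BOX_C.finditer(rna)],
        "box_D": [m.start() for m in BOX_D.finditer(rna)],
    }


toy_rna = "GGAUGAUGACCCCCUGAGG"   # made-up sequence for illustration
print(find_box_motifs(toy_rna))
```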
  • Spacer sequences of the disclosure bind to the target sequence of an RNA molecule. In some embodiments, spacer sequences of the disclosure bind to pathogenic target RNA.
  • In some embodiments of the compositions of the disclosure, the sequence comprising the gRNA further comprises a spacer sequence that specifically binds to the target RNA sequence. In some embodiments, the spacer sequence has at least 50%, 55%, 60%,
  • the spacer sequence has 100% complementarity to the target RNA sequence.
  • the spacer sequence comprises or consists of 20 nucleotides. In some embodiments, the spacer sequence comprises or consists of 21 nucleotides, 22 nucleotides, 23 nucleotides, 24 nucleotides, 25 nucleotides, 26 nucleotides, 27 nucleotides, 28 nucleotides, or 29 nucleotides. In some embodiments, the spacer sequence comprises or consists of 26 nucleotides.
  • the spacer sequence is non-processed and comprises or consists of 30 nucleotides. In some embodiments the non-processed spacer sequence comprises or consists of 30-36 nucleotides.
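The percent complementarity between a spacer and its target site can be computed as sketched below. This assumes strict Watson-Crick pairing (G-U wobble pairs are counted as mismatches), equal-length ungapped sequences, and antiparallel alignment; the function names and the toy target are hypothetical and are not taken from the disclosure.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    rna = rna.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def percent_complementarity(spacer: str, target_site: str) -> float:
    """Percent of spacer positions that pair with the antiparallel target site."""
    spacer = spacer.upper().replace("T", "U")
    target_site = target_site.upper().replace("T", "U")
    if not spacer or len(spacer) != len(target_site):
        raise ValueError("spacer and target site must be non-empty and equal length")
    # The spacer read 5'->3' pairs with the target read 3'->5'.
    paired = sum(
        COMPLEMENT.get(s) == t for s, t in zip(spacer, reversed(target_site))
    )
    return 100.0 * paired / len(spacer)


toy_target = "AUGGCUAGCUAGGAUCCGAU"            # 20-nt made-up target site
perfect_spacer = reverse_complement(toy_target)
one_mismatch = "G" + perfect_spacer[1:]         # first base changed from A to G

print(percent_complementarity(perfect_spacer, toy_target))
print(percent_complementarity(one_mismatch, toy_target))
```

For a 20-nt spacer each mismatch costs 5 percentage points, so the lower thresholds listed in the text correspond to a substantial number of unpaired positions.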
  • DR sequences of the disclosure bind the Cas polypeptide of the disclosure. Upon binding of the spacer sequence of the gRNA to the target RNA sequence, the Cas protein bound to the DR sequence of the gRNA is positioned at the target RNA sequence.
  • a DR sequence having sufficient complementarity to its cognate Cas protein, or nucleic acid thereof, binds selectively to the target nucleic acid sequence of the Cas protein and has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or any percentage identity in between to the sequence.
  • a sequence having sufficient complementarity has 100% identity.
  • DR sequences of the disclosure comprise a secondary structure or a tertiary structure. Exemplary secondary structures include, but are not limited to, a helix, a stem loop, a bulge, a tetraloop and a pseudoknot.
  • Exemplary tertiary structures include, but are not limited to, an A-form of a helix, a B-form of a helix, and a Z-form of a helix.
  • Exemplary tertiary structures include, but are not limited to, a twisted or helicized stem loop.
  • Exemplary tertiary structures include, but are not limited to, a twisted or helicized pseudoknot.
  • DR sequences of the disclosure comprise at least one secondary structure or at least one tertiary structure.
  • DR sequences of the disclosure comprise one or more secondary structure(s) or one or more tertiary structure(s).
  • a guide RNA or a portion thereof selectively binds to a tetraloop motif in an RNA molecule of the disclosure.
  • a target sequence of an RNA molecule comprises a tetraloop motif.
  • the tetraloop motif is a “GNRA” motif comprising or consisting of one or more of the sequences of GAAA, GUGA, GCAA or GAGA.
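As a rough illustration of locating the four tetraloop sequences listed above (GAAA, GUGA, GCAA, GAGA) in a target RNA, the sketch below scans a sequence for every occurrence, including overlapping ones. The function name is hypothetical. The scan is purely sequence-based, which is an assumption: it does not check whether a hit actually caps a base-paired stem, as a structural tetraloop would.

```python
import re

# The four tetraloop sequences named in the text.
TETRALOOPS = ("GAAA", "GUGA", "GCAA", "GAGA")

# A lookahead lets the scan report overlapping occurrences.
_TETRALOOP_PATTERN = re.compile("(?=(" + "|".join(TETRALOOPS) + "))")


def find_tetraloop_motifs(rna: str) -> list[tuple[int, str]]:
    """Return (0-based start index, motif) for each listed tetraloop sequence.

    Hypothetical helper: sequence match only, no secondary-structure check.
    DNA-style input is accepted by converting T to U.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in _TETRALOOP_PATTERN.finditer(rna)]


if __name__ == "__main__":
    print(find_tetraloop_motifs("CCGAAAGG"))
```

Candidate positions found this way would still need a structure prediction or experimental check before being treated as tetraloops.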
  • a guide RNA or a portion thereof that binds to a target sequence of an RNA molecule hybridizes to the target sequence of the RNA molecule.
  • a guide RNA or a portion thereof that binds to a first RNA binding protein or to a second RNA binding protein covalently binds to the first RNA binding protein or to the second RNA binding protein.
  • a guide RNA or a portion thereof that binds to a first RNA binding protein or to a second RNA binding protein non-covalently binds to the first RNA binding protein or to the second RNA binding protein.
  • a guide RNA or a portion thereof comprises or consists of between 10 and 100 nucleotides, inclusive of the endpoints.
  • a spacer sequence of the disclosure comprises or consists of between 10 and 30 nucleotides, inclusive of the endpoints.
  • a spacer sequence of the disclosure comprises or consists of 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 nucleotides.
  • the spacer sequence of the disclosure comprises or consists of 20 nucleotides.
  • the spacer sequence of the disclosure comprises or consists of 21 nucleotides.
  • the spacer sequence of the disclosure comprises or consists of 26 nucleotides.
  • an unprocessed guide RNA is 36 nt of DR followed by 30-32 nt of spacer.
  • the guide RNA is processed (truncated/modified) by Cas13d itself or other RNases into the shorter "mature" form.
  • an unprocessed guide sequence is about, or at least about 30, 35, 40, 45, 50, 55, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, or more nucleotides (nt) in length.
  • a processed guide sequence is about 44 to 60 nt (such as 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nt).
  • an unprocessed spacer is about 28-32 nt long (such as 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nt) while the mature
  • spacer can be about 10 to 30 nt, 10 to 25 nt, 14 to 25 nt, 20 to 22 nt, or 14-30 nt (such as 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
  • an unprocessed DR is about 36 nt (such as 30,
  • a DR sequence is truncated by 1-10 nucleotides (such as 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides) at, e.g., the 5’ end in order to be expressed as mature pre-processed guide RNAs.
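The guide RNA layout described above (a 36 nt direct repeat followed by a 30-32 nt spacer in the unprocessed form, with an optional 1-10 nt truncation of the DR at, e.g., the 5' end) can be sketched as a small data structure. All names are hypothetical, the sequences used are placeholders rather than sequences of the disclosure, and the strict length checks reflect only the one unprocessed embodiment stated in the text.

```python
from dataclasses import dataclass

DR_LENGTH = 36                 # unprocessed direct repeat length, per the text
SPACER_LENGTHS = range(30, 33) # unprocessed spacer of 30-32 nt, per the text


@dataclass(frozen=True)
class UnprocessedGuide:
    """Hypothetical container: direct repeat (DR) followed by spacer, 5'->3'."""
    direct_repeat: str
    spacer: str

    def __post_init__(self) -> None:
        if len(self.direct_repeat) != DR_LENGTH:
            raise ValueError(f"direct repeat must be {DR_LENGTH} nt")
        if len(self.spacer) not in SPACER_LENGTHS:
            raise ValueError("spacer must be 30-32 nt")

    @property
    def sequence(self) -> str:
        # DR comes first, then the spacer.
        return self.direct_repeat + self.spacer


def truncate_dr_5prime(direct_repeat: str, n: int) -> str:
    """Remove n nucleotides (1-10) from the 5' end of a direct repeat."""
    if not 1 <= n <= 10:
        raise ValueError("truncation must be between 1 and 10 nucleotides")
    return direct_repeat[n:]


if __name__ == "__main__":
    # Placeholder sequences only.
    guide = UnprocessedGuide(direct_repeat="A" * 36, spacer="C" * 30)
    print(len(guide.sequence))
    print(len(truncate_dr_5prime(guide.direct_repeat, 6)))
```

Validating lengths at construction keeps a malformed guide from propagating into downstream design steps. The other length ranges recited in the text would need looser checks.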
  • a guide RNA or a portion thereof does not comprise a nuclear localization sequence (NLS).
  • a guide RNA or a portion thereof comprises a sequence complementary to a protospacer flanking sequence (PFS).
  • the first RNA binding protein may comprise a sequence isolated or derived from a Cas13 protein.
  • the first RNA binding protein may comprise a sequence encoding a Cas13 protein or an RNA-binding portion thereof.
  • the guide RNA or a portion thereof does not comprise a sequence complementary to a PFS.
  • guide RNA sequence of the disclosure comprises a promoter sequence to drive expression of the guide RNA.
  • a vector comprising a guide RNA sequence of the disclosure comprises a promoter sequence to drive expression of the guide RNA.
  • the promoter to drive expression of the guide RNA is a constitutive promoter.
  • the promoter sequence is an inducible promoter.
  • the promoter sequence is a tissue-specific and/or cell-type specific promoter.
  • the promoter is a hybrid or a recombinant promoter.
  • the promoter is a promoter capable of expressing the guide RNA in a mammalian cell. In some embodiments, the promoter is a promoter capable of expressing the guide RNA in a human cell. In some embodiments, the promoter is a promoter capable of expressing the guide RNA and restricting the guide RNA to the nucleus of the cell. In some embodiments, the promoter is a human RNA polymerase promoter or a sequence isolated or derived from a sequence encoding a human RNA polymerase promoter. In some embodiments, the promoter is a U6 promoter or a sequence isolated or derived from a sequence encoding a U6 promoter.
  • the U6 promoter is a human U6 promoter. In some embodiments, the promoter is a human tRNA promoter or a sequence isolated or derived from a sequence encoding a human tRNA promoter. In some embodiments, the promoter is a human valine tRNA promoter or a sequence isolated or derived from a sequence encoding a human valine tRNA promoter.
  • a promoter to drive expression of the guide RNA further comprises a regulatory element.
  • a vector comprising a promoter sequence to drive expression of the guide RNA further comprises a regulatory element.
  • a regulatory element enhances expression of the guide RNA.
  • Exemplary regulatory elements include, but are not limited to, an enhancer element, an intron, an exon, or a combination thereof.
  • a vector of the disclosure comprises one or more of a sequence encoding a guide RNA, a promoter sequence to drive expression of the guide RNA and a sequence encoding a regulatory element. In some embodiments of the compositions of the disclosure, the vector further comprises a sequence encoding a fusion protein of the disclosure.
  • gRNAs correspond to target RNA molecules and an RNA-guided RNA binding protein.
  • the gRNAs correspond to an RNA-guided RNA binding fusion protein, wherein the fusion protein comprises first and second RNA binding proteins.
  • the first RNA-binding protein in the fusion protein is a deactivated RNA-binding protein, e.g., a deactivated Cas or catalytically dead Cas protein.
  • the sequence encoding the first RNA binding protein is positioned 5’ of the sequence encoding the second RNA binding protein.
  • the sequence encoding the first RNA binding protein is positioned 3’ of the sequence encoding the second RNA binding protein.
  • the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of selectively binding an RNA molecule and not binding a DNA molecule, a mammalian DNA molecule or any DNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule and inducing a break in the RNA molecule.
  • the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule, inducing a break in the RNA molecule, and not binding a DNA molecule, a mammalian DNA molecule or any DNA molecule. In some embodiments, the sequence encoding the first RNA binding protein comprises a sequence isolated or derived from a protein capable of binding an RNA molecule, inducing a break in the RNA molecule, and neither binding nor inducing a break in a DNA molecule, a mammalian DNA molecule or any DNA molecule.
  • the sequence encoding the first RNA-guided RNA binding protein comprises a sequence isolated or derived from a protein with no DNA nuclease activity.
  • the sequence encoding the RNA-guided RNA binding protein disclosed herein comprises a sequence isolated or derived from a CRISPR Cas protein.
  • the CRISPR Cas protein is not a Type II CRISPR Cas protein.
  • the CRISPR Cas protein is not a Cas9 protein.
  • the Cas9 protein is engineered to target RNA (RCas9).
  • the sequence encoding the RNA-guided RNA binding protein comprises a Type VI CRISPR Cas protein or portion thereof.
  • the Type VI CRISPR Cas protein comprises a Cas13 protein or portion thereof.
  • Exemplary Cas13 proteins of the disclosure may be isolated or derived from any species, including, but not limited to, a bacterium or an archaeon.
  • Exemplary Cas13 proteins of the disclosure may be isolated or derived from any species, including, but not limited to, Leptotrichia wadei, Listeria seeligeri serovar 1/2b (strain ATCC 35967 / DSM 20751 / CIP 100100 / SLCC 3954), Lachnospiraceae bacterium, Clostridium aminophilum DSM 10710, Carnobacterium gallinarum DSM 4847, Paludibacter propionicigenes WB4, Listeria weihenstephanensis FSL R9-0317, bacterium FSL M6-0635 (Listeria newyorkensis), Leptotrichia wadei F0279, Rhodobacter capsulatus SB 1003, Rhodobacter capsulatus R121, Rhodobacter capsulatus DE442 and Corynebacterium ulcerans.
  • Exemplary Cas13 proteins of the disclosure may be DNA nuclease inactivated.
  • Exemplary Cas13 proteins of the disclosure include, but are not limited to, Cas13a, Cas13b, Cas13c, Cas13d and orthologs thereof.
  • Exemplary Cas13b proteins of the disclosure include, but are not limited to, subtypes 1 and 2 referred to herein as Csx27 and Csx28, respectively.
  • Exemplary Cas13a proteins include, but are not limited to:
  • Exemplary wild type Cas13a proteins of the disclosure may comprise or consist of the amino acid sequence of SEQ ID NO: 323.
  • Exemplary Cas13b proteins include, but are not limited to:
  • Exemplary wild type Bergeyella zoohelcum ATCC 43767 Cas13b (BzCas13b) proteins of the disclosure may comprise or consist of the amino acid sequence of SEQ ID NO: 1.
  • the sequence encoding the RNA binding protein comprises a sequence isolated or derived from a Cas13d protein.
  • Cas13d is an effector of the type VI-D CRISPR-Cas systems.
  • the Cas13d protein is an RNA-guided RNA endonuclease enzyme that can cut or bind RNA.
  • the Cas13d protein can include one or more higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains.
  • the Cas13d protein can include either a wild-type or mutated HEPN domain.
  • the Cas13d protein includes a mutated HEPN domain that cannot cut RNA but can process guide RNA. In some embodiments, the Cas13d protein does not require a protospacer flanking sequence. Also see WO Publication No. WO2019/040664 and US2019/0062724, which are incorporated herein by reference in their entirety, for further examples and sequences of Cas13d protein, without limitation.
  • Cas13d sequences of the disclosure include without limitation SEQ ID NOS: 1-296 of WO 2019/040664, so numbered herein and included herewith.
  • SEQ ID NO: 1 is an exemplary Cas13d sequence from Eubacterium siraeum containing a HEPN site.
  • SEQ ID NO: 2 is an exemplary Cas13d sequence from Eubacterium siraeum containing a mutated HEPN site.
  • SEQ ID NO: 3 is an exemplary Cas13d sequence from uncultured Ruminococcus sp. containing a HEPN site.
  • SEQ ID NO: 4 is an exemplary Cas13d sequence from uncultured Ruminococcus sp. containing a mutated HEPN site.
  • SEQ ID NO: 5 is an exemplary Cas13d sequence from Gut_metagenome_contig2791000549.
  • SEQ ID NO: 6 is an exemplary Cas13d sequence from Gut_metagenome_contig855000317.
  • SEQ ID NO: 7 is an exemplary Cas13d sequence from Gut_metagenome_contig3389000027.
  • SEQ ID NO: 8 is an exemplary Cas13d sequence from Gut_metagenome_contig8061000170.
  • SEQ ID NO: 9 is an exemplary Cas13d sequence from
  • SEQ ID NO: 10 is an exemplary Cas13d sequence from Gut_metagenome_contig9549000591.
  • SEQ ID NO: 11 is an exemplary Cas13d sequence from Gut_metagenome_contig71000500.
  • SEQ ID NO: 12 is an exemplary Cas13d sequence from human gut metagenome.
  • SEQ ID NO: 13 is an exemplary Cas13d sequence from Gut_metagenome_contig3915000357.
  • SEQ ID NO: 14 is an exemplary Cas13d sequence from Gut_metagenome_contig4719000173.
  • SEQ ID NO: 15 is an exemplary Cas13d sequence from Gut_metagenome_contig6929000468.
  • SEQ ID NO: 16 is an exemplary Cas13d sequence from Gut_metagenome_contig7367000486.
  • SEQ ID NO: 17 is an exemplary Cas13d sequence from Gut_metagenome_contig7930000403.
  • SEQ ID NO: 18 is an exemplary Cas13d sequence from Gut_metagenome_contig993000527.
  • SEQ ID NO: 19 is an exemplary Cas13d sequence from Gut_metagenome_contig6552000639.
  • SEQ ID NO: 20 is an exemplary Cas13d sequence from Gut_metagenome_contig11932000246.
  • SEQ ID NO: 21 is an exemplary Cas13d sequence from Gut_metagenome_contig12963000286.
  • SEQ ID NO: 22 is an exemplary Cas13d sequence from Gut_metagenome_contig2952000470.
  • SEQ ID NO: 23 is an exemplary Cas13d sequence from Gut_metagenome_contig451000394.
  • SEQ ID NO: 24 is an exemplary Cas13d sequence from Eubacterium_siraeum_DSM_15702.
  • SEQ ID NO: 25 is an exemplary Cas13d sequence from gut_metagenome_P19E0k2120140920,_c369000003.
  • SEQ ID NO: 26 is an exemplary Cas13d sequence from Gut_metagenome_contig7593000362.
  • SEQ ID NO: 27 is an exemplary Cas13d sequence from Gut_metagenome_contig12619000055.
  • SEQ ID NO: 28 is an exemplary Cas13d sequence from Gut_metagenome_contig1405000151.
  • SEQ ID NO: 29 is an exemplary Cas13d sequence from Chicken_gut_metagenome_c298474.
  • SEQ ID NO: 30 is an exemplary Cas13d sequence from Gut_metagenome_contig1516000227.
  • SEQ ID NO: 31 is an exemplary Cas13d sequence from Gut_metagenome_contig1838000319.
  • SEQ ID NO: 32 is an exemplary Cas13d sequence from Gut_metagenome_contig13123000268.
  • SEQ ID NO: 33 is an exemplary Cas13d sequence from Gut_metagenome_contig5294000434.
  • SEQ ID NO: 34 is an exemplary Cas13d sequence from Gut_metagenome_contig6415000192.
  • SEQ ID NO: 35 is an exemplary Cas13d sequence from
  • SEQ ID NO: 36 is an exemplary Cas13d sequence from Gut_metagenome_contig9118000041.
  • SEQ ID NO: 37 is an exemplary Cas13d sequence from Activated_sludge_metagenome_transcript_124486.
  • SEQ ID NO: 38 is an exemplary Cas13d sequence from Gut_metagenome_contig1322000437.
  • SEQ ID NO: 39 is an exemplary Cas13d sequence from Gut_metagenome_contig4582000531.
  • SEQ ID NO: 40 is an exemplary Cas13d sequence from
  • SEQ ID NO: 41 is an exemplary Cas13d sequence from Gut_metagenome_contig1709000510.
  • SEQ ID NO: 42 is an exemplary Cas13d sequence from M24_(LSQX01212483_Anaerobic_digester_metagenome) with a HEPN domain.
  • SEQ ID NO: 43 is an exemplary Cas13d sequence from Gut_metagenome_contig3833000494.
  • SEQ ID NO: 44 is an exemplary Cas13d sequence from Activated_sludge_metagenome_transcript_117355.
  • SEQ ID NO: 45 is an exemplary Cas13d sequence from
  • SEQ ID NO: 46 is an exemplary Cas13d sequence from Gut_metagenome_contig338000322 from sheep gut metagenome.
  • SEQ ID NO: 47 is an exemplary Cas13d sequence from human gut metagenome.
  • SEQ ID NO: 48 is an exemplary Cas13d sequence from Gut_metagenome_contig9530000097.
  • SEQ ID NO: 49 is an exemplary Cas13d sequence from Gut_metagenome_contig1750000258.
  • SEQ ID NO: 50 is an exemplary Cas13d sequence from Gut_metagenome_contig5377000274.
  • SEQ ID NO: 51 is an exemplary Cas13d sequence from gut_metagenome_P19E0k2120140920_c248000089.
  • SEQ ID NO: 52 is an exemplary Cas13d sequence from Gut_metagenome_contig11400000031.
  • SEQ ID NO: 53 is an exemplary Cas13d sequence from Gut_metagenome_contig7940000191.
  • SEQ ID NO: 54 is an exemplary Cas13d sequence from Gut_metagenome_contig6049000251.
  • SEQ ID NO: 55 is an exemplary Cas13d sequence from Gut_metagenome_contig1137000500.
  • SEQ ID NO: 56 is an exemplary Cas13d sequence from Gut_metagenome_contig9368000105.
  • SEQ ID NO: 57 is an exemplary Cas13d sequence from Gut_metagenome_contig546000275.
  • SEQ ID NO: 58 is an exemplary Cas13d sequence from Gut_metagenome_contig7216000573.
  • SEQ ID NO: 59 is an exemplary Cas13d sequence from Gut_metagenome_contig4806000409.
  • SEQ ID NO: 60 is an exemplary Cas13d sequence from Gut_metagenome_contig10762000480.
  • SEQ ID NO: 61 is an exemplary Cas13d sequence from Gut_metagenome_contig4114000374.
  • SEQ ID NO: 62 is an exemplary Cas13d sequence from Ruminococcus flavefaciens FD1.
  • SEQ ID NO: 63 is an exemplary Cas13d sequence from Gut_metagenome_contig7093000170.
  • SEQ ID NO: 64 is an exemplary Cas13d sequence from Gut_metagenome_contig11113000384.
  • SEQ ID NO: 65 is an exemplary Cas13d sequence from Gut_metagenome_contig6403000259.
  • SEQ ID NO: 66 is an exemplary Cas13d sequence from Gut_metagenome_contig6193000124.
  • SEQ ID NO: 67 is an exemplary Cas13d sequence from Gut_metagenome_contig721000619.
  • SEQ ID NO: 68 is an exemplary Cas13d sequence from Gut_metagenome_contig1666000270.
  • SEQ ID NO: 69 is an exemplary Cas13d sequence from Gut_metagenome_contig2002000411.
  • SEQ ID NO: 70 is an exemplary Cas13d sequence from Ruminococcus albus.
  • SEQ ID NO: 71 is an exemplary Cas13d sequence from Gut_metagenome_contig13552000311.
  • SEQ ID NO: 72 is an exemplary Cas13d sequence from Gut_metagenome_contig10037000527.
  • SEQ ID NO: 73 is an exemplary Cas13d sequence from Gut_metagenome_contig238000329.
  • SEQ ID NO: 74 is an exemplary Cas13d sequence from Gut_metagenome_contig2643000492.
  • SEQ ID NO: 75 is an exemplary Cas13d sequence from Gut_metagenome_contig874000057.
  • SEQ ID NO: 76 is an exemplary Cas13d sequence from Gut_metagenome_contig4781000489.
  • SEQ ID NO: 77 is an exemplary Cas13d sequence from Gut_metagenome_contig12144000352.
  • SEQ ID NO: 78 is an exemplary Cas13d sequence from Gut_metagenome_contig5590000448.
  • SEQ ID NO: 79 is an exemplary Cas13d sequence from Gut_metagenome_contig9269000031.
  • SEQ ID NO: 80 is an exemplary Cas13d sequence from Gut_metagenome_contig8537000520.
  • SEQ ID NO: 81 is an exemplary Cas13d sequence from Gut_metagenome_contig1845000130.
  • SEQ ID NO: 82 is an exemplary Cas13d sequence from gut_metagenome_P13E0k2120140920_c3000072.
  • SEQ ID NO: 83 is an exemplary Cas13d sequence from gut_metagenome_P1E0k2120140920_c1000078.
  • SEQ ID NO: 84 is an exemplary Cas13d sequence from Gut_metagenome_contig12990000099.
  • SEQ ID NO: 85 is an exemplary Cas13d sequence from Gut_metagenome_contig525000349.
  • SEQ ID NO: 86 is an exemplary Cas13d sequence from Gut_metagenome_contig7229000302.
  • SEQ ID NO: 87 is an exemplary Cas13d sequence from
  • SEQ ID NO: 88 is an exemplary Cas13d sequence from Gut_metagenome_contig7030000469.
  • SEQ ID NO: 89 is an exemplary Cas13d sequence from Gut_metagenome_contig5149000068.
  • SEQ ID NO: 90 is an exemplary Cas13d sequence from Gut_metagenome_contig400200045.
  • SEQ ID NO: 91 is an exemplary Cas13d sequence from Gut_metagenome_contig10420000446.
  • SEQ ID NO: 92 is an exemplary Cas13d sequence from new_flavefaciens_strain_XPD3002 (CasRx).
  • SEQ ID NO: 93 is an exemplary Cas13d sequence from M26_Gut_metagenome_contig698000307.
  • SEQ ID NO: 94 is an exemplary Cas13d sequence from M36_Uncultured_Eubacterium_sp_TS28_c40956.
  • SEQ ID NO: 95 is an exemplary Cas13d sequence from M12_gut_metagenome_P25C0k2120140920_c134000066.
  • SEQ ID NO: 96 is an exemplary Cas13d sequence from human gut metagenome.
  • SEQ ID NO: 97 is an exemplary Cas13d sequence from M10_gut_metagenome_P25C90k2120140920_c28000041.
  • SEQ ID NO: 98 is an exemplary Cas13d sequence from M11_gut_metagenome_P25C7k2120140920_c4078000105.
  • SEQ ID NO: 99 is an exemplary Cas13d sequence from gut_metagenome_P25C0k2120140920_c32000045.
  • SEQ ID NO: 100 is an exemplary Cas13d sequence from M13_gut_metagenome_P23C7k2120140920_c3000067.
  • SEQ ID NO: 101 is an exemplary Cas13d sequence from M5_gut_metagenome_P18E90k2120140920.
  • SEQ ID NO: 102 is an exemplary Cas13d sequence from M21_gut_metagenome_P18E0k2120140920.
  • SEQ ID NO: 103 is an exemplary Cas13d sequence from M7_gut_metagenome_P38C7k2120140920_c4841000003.
  • SEQ ID NO: 104 is an exemplary Cas13d sequence from Ruminococcus bicirculans.
  • SEQ ID NO: 105 is an exemplary Cas13d sequence.
  • SEQ ID NO: 106 is an exemplary Cas13d consensus sequence.
  • SEQ ID NO: 107 is an exemplary Cas13d sequence from M18_gut_metagenome_P22E0k2120140920_c3395000078.
  • SEQ ID NO: 108 is an exemplary Cas13d sequence from M17_gut_metagenome_P22E90k2120140920_c114.
  • SEQ ID NO: 109 is an exemplary Cas13d sequence from Ruminococcus_sp_CAG57.
  • SEQ ID NO: 110 is an exemplary Cas13d sequence from gut_metagenome_P11E90k2120140920_c43000123.
  • SEQ ID NO: 111 is an exemplary Cas13d sequence from M6_gut_metagenome_P13E90k2120140920_c7000009.
  • SEQ ID NO: 112 is an exemplary Cas13d sequence from M19_gut_metagenome_P17E90k2120140920.
  • SEQ ID NO: 113 is an exemplary Cas13d sequence from gut_metagenome_P17E0k2120140920,_c87000043.
  • SEQ ID NO: 114 is an exemplary human codon-optimized Eubacterium siraeum Cas13d nucleic acid sequence.
  • SEQ ID NO: 115 is an exemplary human codon-optimized Eubacterium siraeum Cas13d nucleic acid sequence with a mutant HEPN domain.
  • SEQ ID NO: 116 is an exemplary human codon-optimized Eubacterium siraeum Cas13d nucleic acid sequence with N-terminal NLS.
  • SEQ ID NO: 117 is an exemplary human codon-optimized Eubacterium siraeum Cas13d nucleic acid sequence with N- and C-terminal NLS tags.
  • SEQ ID NO: 118 is an exemplary human codon-optimized uncultured Ruminococcus sp. Cas13d nucleic acid sequence.
  • SEQ ID NO: 119 is an exemplary human codon-optimized uncultured Ruminococcus sp. Cas13d nucleic acid sequence with a mutant HEPN domain.
  • SEQ ID NO: 120 is an exemplary human codon-optimized uncultured Ruminococcus sp. Cas13d nucleic acid sequence with N-terminal NLS.
  • SEQ ID NO: 121 is an exemplary human codon-optimized uncultured Ruminococcus sp. Cas13d nucleic acid sequence with N- and C-terminal NLS tags.
  • SEQ ID NO: 122 is an exemplary human codon-optimized uncultured Ruminococcus flavefaciens FD1 Cas13d nucleic acid sequence.
  • SEQ ID NO: 123 is an exemplary human codon-optimized uncultured Ruminococcus flavefaciens FD1 Cas13d nucleic acid sequence with mutated HEPN domain.
  • SEQ ID NO: 124 is an exemplary Cas13d nucleic acid sequence from Ruminococcus bicirculans.
  • SEQ ID NO: 125 is an exemplary Cas13d nucleic acid sequence from Eubacterium siraeum.
  • SEQ ID NO: 126 is an exemplary Cas13d nucleic acid sequence from Ruminococcus flavefaciens FD1.
  • SEQ ID NO: 127 is an exemplary Cas13d nucleic acid sequence from Ruminococcus albus.
  • SEQ ID NO: 128 is an exemplary Cas13d nucleic acid sequence from Ruminococcus flavefaciens XPD.
  • SEQ ID NO: 129 is an exemplary consensus DR nucleic acid sequence for E. siraeum Cas13d.
  • SEQ ID NO: 130 is an exemplary consensus DR nucleic acid sequence for Rum. sp. Cas13d.
  • SEQ ID NO: 131 is an exemplary consensus DR nucleic acid sequence for Rum. flavefaciens strain XPD3002 Cas13d (CasRx).
  • SEQ ID NOS: 132-137 are exemplary consensus DR nucleic acid sequences.
  • SEQ ID NO: 138 is an exemplary 50% consensus sequence for seven full-length Cas13d orthologues.
  • SEQ ID NO: 139 is an exemplary Cas13d nucleic acid sequence from Gut metagenome P1E0.
  • SEQ ID NO: 140 is an exemplary Cas13d nucleic acid sequence from Anaerobic digester.
  • SEQ ID NO: 141 is an exemplary Cas13d nucleic acid sequence from Ruminococcus sp. CAG:57.
  • SEQ ID NO: 142 is an exemplary human codon-optimized uncultured Gut metagenome P1E0 Cas13d nucleic acid sequence.
  • SEQ ID NO: 143 is an exemplary human codon-optimized Anaerobic Digester Cas13d nucleic acid sequence.
  • SEQ ID NO: 144 is an exemplary human codon-optimized Ruminococcus flavefaciens XPD Cas13d nucleic acid sequence.
  • SEQ ID NO: 145 is an exemplary human codon-optimized Ruminococcus albus Cas13d nucleic acid sequence.
  • SEQ ID NO: 146 is an exemplary processing of the Ruminococcus sp. CAG:57 CRISPR array.
  • SEQ ID NO: 147 is an exemplary Cas13d protein sequence from contig emb
  • SEQ ID NO: 148 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 147).
  • SEQ ID NO: 149 is an exemplary Cas13d protein sequence from contig tpg
  • SEQ ID NOS: 150-152 are exemplary consensus DR nucleic acid sequences (go with SEQ ID NO: 149).
  • SEQ ID NO: 153 is an exemplary Cas13d protein sequence from contig tpg
  • SEQ ID NO: 154 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 153).
  • SEQ ID NO: 155 is an exemplary Cas13d protein sequence from contig OGZC01000639.1 (human gut metagenome assembly).
  • SEQ ID NOS: 156-157 are exemplary consensus DR nucleic acid sequences (go with SEQ ID NO: 155).
  • SEQ ID NO: 158 is an exemplary Cas13d protein sequence from contig emb
  • SEQ ID NO: 159 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 158).
  • SEQ ID NO: 160 is an exemplary Cas13d protein sequence from contig emb
  • SEQ ID NO: 161 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 160).
  • SEQ ID NO: 162 is an exemplary Cas13d protein sequence from contig emb|OGDF01008514.1.
  • SEQ ID NO: 163 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 162).
  • SEQ ID NO: 164 is an exemplary Cas13d protein sequence from contig emb
  • SEQ ID NO: 165 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 164).
  • SEQ ID NO: 166 is an exemplary Cas13d protein sequence from contig NFIR01000008.1 (Eubacterium sp. An3, from chicken gut metagenome).
  • SEQ ID NO: 167 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 166).
  • SEQ ID NO: 168 is an exemplary Cas13d protein sequence from contig NFLV01000009.1 (Eubacterium sp. An11, from chicken gut metagenome).
  • SEQ ID NO: 169 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 168).
  • SEQ ID NOS: 171-174 are exemplary Cas13d motif sequences.
  • SEQ ID NO: 175 is an exemplary Cas13d protein sequence from contig OJMM01002900 human gut metagenome sequence.
  • SEQ ID NO: 176 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 175).
  • SEQ ID NO: 177 is an exemplary Cas13d protein sequence from contig
  • SEQ ID NO: 178 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 177).
  • SEQ ID NO: 179 is an exemplary Cas13d protein sequence from contig OIZX01000427.1.
  • SEQ ID NO: 180 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 179).
  • SEQ ID NO: 181 is an exemplary Cas13d protein sequence from contig emb
  • SEQ ID NO: 182 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 181).
  • SEQ ID NO: 183 is an exemplary Cas13d protein sequence from contig OCTW011587266.1.
  • SEQ ID NO: 184 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 183).
  • SEQ ID NO: 185 is an exemplary Cas13d protein sequence from contig emb|OGNF01009141.1.
  • SEQ ID NO: 186 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 185).
  • SEQ ID NO: 187 is an exemplary Cas13d protein sequence from contig emb
  • SEQ ID NO: 188 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 187).
  • SEQ ID NO: 189 is an exemplary Cas13d protein sequence from contig e-k87_11092736.
  • SEQ ID NOS: 190-193 are exemplary consensus DR nucleic acid sequences (go with SEQ ID NO: 189).
  • SEQ ID NO: 194 is an exemplary Cas13d sequence from Gut_metagenome_contig6893000291.
  • SEQ ID NOS: 195-197 are exemplary Cas13d motif sequences.
  • SEQ ID NO: 198 is an exemplary Cas13d protein sequence from Ga0224415_10007274.
  • SEQ ID NO: 199 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 198).
  • SEQ ID NO: 200 is an exemplary Cas13d protein sequence from EMGJ0003641.
  • SEQ ID NO: 201 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 200).
  • SEQ ID NO: 202 is an exemplary Cas13d protein sequence from Ga0129306_1000735.
  • SEQ ID NO: 203 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 202).
  • SEQ ID NO: 204 is an exemplary Cas13d protein sequence from Ga0129317_1008067.
  • SEQ ID NO: 205 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 204).
  • SEQ ID NO: 206 is an exemplary Cas13d protein sequence from Ga0224415_10048792.
  • SEQ ID NO: 207 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 206).
  • SEQ ID NO: 208 is an exemplary Cas13d protein sequence from 160582958_gene49834.
  • SEQ ID NO: 209 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 208).
  • SEQ ID NO: 210 is an exemplary Cas13d protein sequence from 250twins_35838_GL0110300.
  • SEQ ID NO: 211 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 210).
  • SEQ ID NO: 212 is an exemplary Cas13d protein sequence from 250twins_36050_GL0158985.
  • SEQ ID NO: 213 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 212).
  • SEQ ID NO: 214 is an exemplary Cas13d protein sequence from 31009_GL0034153.
  • SEQ ID NO: 215 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 214).
  • SEQ ID NO: 216 is an exemplary Cas13d protein sequence from 530373_GL0023589.
  • SEQ ID NO: 217 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 216).
  • SEQ ID NO: 218 is an exemplary Cas13d protein sequence from BMZ-11B_GL0037771.
  • SEQ ID NO: 219 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 218).
  • SEQ ID NO: 220 is an exemplary Cas13d protein sequence from BMZ-11B_GL0037915.
  • SEQ ID NO: 221 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 220).
  • SEQ ID NO: 222 is an exemplary Cas13d protein sequence from BMZ-11B_GL0069617.
  • SEQ ID NO: 223 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 222).
  • SEQ ID NO: 224 is an exemplary Cas13d protein sequence from DLF014_GL0011914.
  • SEQ ID NO: 225 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 224).
  • SEQ ID NO: 226 is an exemplary Cas13d protein sequence from EYZ-362B_GL0088915.
  • SEQ ID NOS: 227-228 are exemplary consensus DR nucleic acid sequences (go with SEQ ID NO: 226).
  • SEQ ID NO: 229 is an exemplary Cas13d protein sequence from Ga0099364_10024192.
  • SEQ ID NO: 230 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 229).
  • SEQ ID NO: 231 is an exemplary Cas13d protein sequence from Ga0187910_10006931.
  • SEQ ID NO: 232 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 231).
  • SEQ ID NO: 233 is an exemplary Cas13d protein sequence from Ga0187910_10015336.
  • SEQ ID NO: 234 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 233).
  • SEQ ID NO: 235 is an exemplary Cas13d protein sequence from Ga0187910_10040531.
  • SEQ ID NO: 236 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 235).
  • SEQ ID NO: 237 is an exemplary Cas13d protein sequence from Ga0187911_10069260.
  • SEQ ID NO: 238 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 237).
  • SEQ ID NO: 239 is an exemplary Cas13d protein sequence from MH0288_GL0082219.
  • SEQ ID NO: 240 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 239).
  • SEQ ID NO: 241 is an exemplary Cas13d protein sequence from 02.UC29-0_GL0096317.
  • SEQ ID NO: 242 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 241).
  • SEQ ID NO: 243 is an exemplary Cas13d protein sequence from PIG-014_GL0226364.
  • SEQ ID NO: 244 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 243).
  • SEQ ID NO: 245 is an exemplary Cas13d protein sequence from PIG-018_GL0023397.
  • SEQ ID NO: 246 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 245).
  • SEQ ID NO: 247 is an exemplary Cas13d protein sequence from PIG-025_GL0099734.
  • SEQ ID NO: 248 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 247).
  • SEQ ID NO: 249 is an exemplary Cas13d protein sequence from PIG-028_GL0185479.
  • SEQ ID NO: 250 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 249).
  • SEQ ID NO: 251 is an exemplary Cas13d protein sequence from Ga0224422_10645759.
  • SEQ ID NO: 252 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 251).
  • SEQ ID NO: 253 is an exemplary Cas13d protein sequence from ODAI chimera.
  • SEQ ID NO: 254 is an exemplary consensus DR nucleic acid sequence (goes with SEQ ID NO: 253).
  • SEQ ID NO: 255 is an HEPN motif.
  • SEQ ID NOs: 256 and 257 are exemplary Casl3d nuclear localization signal amino acid and nucleic acid sequences, respectively.
  • SEQ ID NOs: 258 and 260 are exemplary SV40 large T antigen nuclear localization signal amino acid and nucleic acid sequences, respectively.
  • SEQ ID NO: 259 is a dCas9 target sequence.
  • SEQ ID NO: 261 is an artificial Eubacterium siraeum nCas1 array targeting ccdB.
  • SEQ ID NO: 262 is a full 36 nt direct repeat.
  • SEQ ID NOs: 263-266 are spacer sequences.
  • SEQ ID NO: 267 is an artificial uncultured Ruminococcus sp. nCas1 array targeting ccdB.
  • SEQ ID NO: 268 is a full 36 nt direct repeat.
  • SEQ ID NOs: 269-272 are spacer sequences.
  • SEQ ID NO: 273 is a ccdB target RNA sequence.
  • SEQ ID NOs: 274-277 are spacer sequences.
  • SEQ ID NO: 278 is a mutated Cas13d sequence, NLS-Ga_0531(trunc)-NLS-HA. This mutant has a deletion of the non-conserved N-terminus.
  • SEQ ID NO: 279 is a mutated Cas13d sequence, NES-Ga_0531(trunc)-NES-HA. This mutant has a deletion of the non-conserved N-terminus.
  • SEQ ID NO: 280 is a full-length Cas13d sequence, NLS-RfxCas13d-NLS-HA.
  • SEQ ID NO: 281 is a mutated Cas13d sequence, NLS-RfxCas13d(del5)-NLS-HA. This mutant has a deletion of amino acids 558-587.
  • SEQ ID NO: 282 is a mutated Cas13d sequence, NLS-RfxCas13d(del5.12)-NLS-HA. This mutant has a deletion of amino acids 558-587 and 953-966.
  • SEQ ID NO: 283 is a mutated Cas13d sequence, NLS-RfxCas13d(del5.13)-NLS-HA. This mutant has a deletion of amino acids 376-392 and 558-587.
  • SEQ ID NO: 284 is a mutated Cas13d sequence, NLS-RfxCas13d(del5.12+5.13)-NLS-HA. This mutant has a deletion of amino acids 376-392, 558-587, and 953-966.
  • SEQ ID NO: 285 is a mutated Cas13d sequence, NLS-RfxCas13d(del13)-NLS-HA. This mutant has a deletion of amino acids 376-392.
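The deletion mutants listed above can be illustrated with a minimal sketch. It assumes that the stated amino acid positions are 1-based, inclusive, and refer to the numbering of the undeleted parent sequence; the document does not state the reference numbering explicitly. The function name `delete_ranges`, the labels `DEL5`, `DEL12`, and `DEL13`, and the 1000-residue placeholder sequence are hypothetical and are used for illustration only.

```python
def delete_ranges(sequence: str, ranges: list[tuple[int, int]]) -> str:
    """Remove 1-based, inclusive residue ranges from a protein sequence.

    Assumes the ranges do not overlap and are numbered on the parent sequence.
    """
    for start, end in ranges:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} is outside the sequence")
    result = sequence
    # Delete from the C-terminal side first so that the coordinates of
    # the remaining (more N-terminal) ranges stay valid.
    for start, end in sorted(ranges, reverse=True):
        result = result[: start - 1] + result[end:]
    return result


# Residue ranges taken from the mutant descriptions above.
DEL5 = (558, 587)
DEL12 = (953, 966)
DEL13 = (376, 392)

# Placeholder parent sequence (NOT a real Cas13d sequence).
parent = "A" * 1000

variants = {
    "del5": delete_ranges(parent, [DEL5]),
    "del5.12": delete_ranges(parent, [DEL5, DEL12]),
    "del5.13": delete_ranges(parent, [DEL13, DEL5]),
    "del5.12+5.13": delete_ranges(parent, [DEL13, DEL5, DEL12]),
    "del13": delete_ranges(parent, [DEL13]),
}
```

Applying the deletions from the C-terminal side first avoids re-computing coordinates after each removal.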
  • SEQ ID NO: 286 is an effector sequence used to edit expression of ADAR2.
  • Amino acids 1 to 969 are dRfxCas13.
  • Amino acids 970 to 991 are an NLS sequence.
  • Amino acids 992 to 1378 are ADAR2DD.
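The domain boundaries given for SEQ ID NO: 286 can be checked with a short sketch. It assumes 1-based, inclusive coordinates; the names `DOMAINS`, `slice_domain`, `domain_lengths`, and `is_contiguous` are hypothetical helpers and not part of the disclosure.

```python
# Domain boundaries as stated for SEQ ID NO: 286 (1-based, inclusive).
DOMAINS = {
    "dRfxCas13": (1, 969),
    "NLS": (970, 991),
    "ADAR2DD": (992, 1378),
}


def slice_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a sequence."""
    return sequence[start - 1 : end]


def domain_lengths(domains: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Compute the length in residues of each annotated domain."""
    return {name: end - start + 1 for name, (start, end) in domains.items()}


def is_contiguous(domains: dict[str, tuple[int, int]]) -> bool:
    """Check that domains tile the sequence with no gaps or overlaps."""
    spans = sorted(domains.values())
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(spans, spans[1:]))


lengths = domain_lengths(DOMAINS)
```

Under these assumptions the three segments are 969, 22, and 387 residues long and together account for all 1378 positions.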
  • SEQ ID NO: 287 is an exemplary HIV NES protein sequence.
  • SEQ ID NOs: 288-291 are exemplary Cas13d motif sequences.
  • SEQ ID NO: 292 is Cas13d ortholog sequence MH_4866.
  • SEQ ID NO: 293 is an exemplary Cas13d protein sequence from 037_- _emblOIZ A01000315.11
  • SEQ ID NO: 294 is an exemplary Cas13d protein sequence from PIG-022_GL0026351.
  • SEQ ID NO: 295 is an exemplary Cas13d protein sequence from PIG-046_GL0077813.
  • SEQ ID NO: 296 is an exemplary Cas13d protein sequence from pig chimera.
  • SEQ ID NO: 297 is an exemplary nuclease-inactive or dead Cas13d (dCas13d) protein sequence from Ruminococcus flavefaciens XPD3002 (CasRx).
  • SEQ ID NO: 298 is an exemplary Cas13d protein sequence.
  • SEQ ID NO: 299 is an exemplary Cas13d protein sequence from (contig tpg
  • SEQ ID NO: 300 is an exemplary Cas13d direct repeat nucleotide sequence from Cas13d (contig tpg
  • Ruminococcus assembly UBA7013, from sheep gut metagenome (goes with SEQ ID NO: 299).
  • SEQ ID NO: 301 is an exemplary Cas13d protein contig emb
  • SEQ ID NO: 304 is an exemplary CasM protein from Ruminococcus sp., isolate 2789STDY5834971.
  • SEQ ID NO: 305 is an exemplary CasM protein from Ruminococcus bicirculans.
  • SEQ ID NO: 306 is an exemplary CasM protein from Ruminococcus sp., isolate
  • SEQ ID NO: 307 is an exemplary CasM protein from Ruminococcus sp. CAG:57.
  • SEQ ID NO: 308 is an exemplary CasM protein from Ruminococcus flavefaciens FD-1.
  • SEQ ID NO: 309 is an exemplary CasM protein from Ruminococcus albus strain KH2T6.
  • SEQ ID NO: 310 is an exemplary CasM protein from Ruminococcus flavefaciens strain XPD3002.
  • SEQ ID NO: 311 is an exemplary CasM protein from Ruminococcus sp., isolate 2789STDY5834894.
  • SEQ ID NO: 312 is an exemplary RtcB homolog.
  • SEQ ID NO: 313 is an exemplary WYL from Eubacterium siraeum + C-terminal NLS.
  • SEQ ID NO: 314 is an exemplary WYL from Ruminococcus sp. isolate 2789STDY5834971 + C-term NLS.
  • SEQ ID NO: 315 is an exemplary WYL from Ruminococcus bicirculans + C-term NLS.
  • SEQ ID NO: 316 is an exemplary WYL from Ruminococcus sp. isolate 2789STDY5608892 + C-term NLS.
  • SEQ ID NO: 317 is an exemplary WYL from Ruminococcus sp. CAG:57 + C-term NLS.
  • SEQ ID NO: 318 is an exemplary WYL from Ruminococcus flavefaciens FD-1 + C-term NLS.
  • SEQ ID NO: 319 is an exemplary WYL from Ruminococcus albus strain KH2T6 + C-term NLS.
  • SEQ ID NO: 320 is an exemplary WYL from Ruminococcus flavefaciens strain XPD3002 + C-term NLS.
  • SEQ ID NO: 321 is an exemplary RtcB from Eubacterium siraeum + C-term NLS.
  • SEQ ID NO: 322 is an exemplary direct repeat sequence of Ruminococcus flavefaciens XPD3002 Cas13d (CasRx).
  • Exemplary wild type Cas13d proteins of the disclosure may comprise or consist of the amino acid sequence SEQ ID NO: 92 or SEQ ID NO: 298 (Cas13d protein also known as CasRx).
  • An exemplary direct repeat sequence of Ruminococcus flavefaciens XPD3002 Cas13d comprises the nucleic acid sequence: AACCCCTACCAACTGGTCGGGGTTTGAAAC (SEQ ID NO: 302).
  • the compositions of the disclosure bind and destroy a target sequence of an RNA molecule comprising a pathogenic repeat sequence.
  • the target RNA comprises a sequence motif corresponding to a spacer sequence of the guide RNA corresponding to the RNA-guided RNA-binding protein.
  • one or more spacer sequences are used to target one or more target sequences.
  • multiple spacers are used to target multiple target RNAs.
  • Such target RNAs can be different target sites within the same RNA molecule or can be different target sites within different RNA molecules.
  • Spacer sequences can also target non-coding RNA.
  • multiple promoters e.g., Pol III promoters
  • the destruction of the target RNA(s) or target sequence motif(s) reduces expression of pathogenic CAG repeat RNA thereby treating CAG repeat disease such as HD or SCA1 and/or ameliorating one or more symptoms associated with CAG repeat diseases such as HD or SCA1.
  • destruction of target RNA reduces expression of pathogenic CUG repeat RNA thereby treating a disease such as myotonic dystrophy-1 (DM1).
  • destruction of target RNA reduces expression of pathogenic CCGGG and/or GGCCCC repeat RNAs thereby treating a disease such as ALS or FTD (Frontotemporal Dementia).
  • destruction of target RNA of mutant Rhodopsin (Rho) thereby reduces expression of pathogenic Rho and treats inherited retinal disease such as autosomal dominant retinitis pigmentosa (adRP).
  • the sequence motif of the target RNA is a signature of a disease or disorder.
  • a sequence motif of the disclosure may be isolated or derived from a sequence of foreign or exogenous sequence found in a genomic sequence, and therefore translated into an mRNA molecule of the disclosure or a sequence of foreign or exogenous sequence found in an RNA sequence of the disclosure.
  • a target sequence motif of the disclosure may comprise, consist of, be situated by, or be associated with a mutation in an endogenous sequence that causes a disease or disorder.
  • the mutation may comprise or consist of a sequence substitution, inversion, deletion, insertion, transposition, or any combination thereof.
  • a target sequence motif of the disclosure may comprise or consist of a repeated sequence.
  • the repeated sequence may be associated with a microsatellite instability (MSI). MSI at one or more loci results from impaired DNA mismatch repair mechanisms of a cell of the disclosure.
  • a hypervariable sequence of DNA may be transcribed into an mRNA of the disclosure comprising a target sequence comprising or consisting of the hypervariable sequence.
  • a target sequence motif of the disclosure may comprise or consist of a biomarker.
  • the biomarker may indicate a risk of developing a disease or disorder.
  • the biomarker may indicate a healthy gene (low or no determinable risk of developing a disease or disorder).
  • the biomarker may indicate an edited gene.
  • Exemplary biomarkers include, but are not limited to, single nucleotide polymorphisms (SNPs), sequence variations or mutations, epigenetic marks, splice acceptor sites, exogenous sequences, heterologous sequences, and any combination thereof.
  • a target sequence motif of the disclosure may comprise or consist of a secondary, tertiary or quaternary structure.
  • the secondary, tertiary or quaternary structure may be endogenous or naturally occurring.
  • the secondary, tertiary or quaternary structure may be induced or non-naturally occurring.
  • the secondary, tertiary or quaternary structure may be encoded by an endogenous, exogenous, or heterologous sequence.
  • a target sequence of an RNA molecule comprises or consists of between 2 and 100 nucleotides or nucleic acid bases, inclusive of the endpoints. In some embodiments, the target sequence of an RNA molecule comprises or consists of between 2 and 50 nucleotides or nucleic acid bases, inclusive of the endpoints. In some embodiments, the target sequence of an RNA molecule comprises or consists of between 2 and 20 nucleotides or nucleic acid bases, inclusive of the endpoints. In some embodiments, the target sequence of an RNA molecule comprises or consists of between 20 and 30 nucleotides or nucleic acid bases, inclusive of the endpoints. In some embodiments, the target sequence of an RNA molecule comprises or consists of about 26 nucleotides or nucleic acid bases.
  • a target sequence of an RNA molecule is continuous. In some embodiments, the target sequence of an RNA molecule is discontinuous.
  • the target sequence of an RNA molecule may comprise or consist of one or more nucleotides or nucleic acid bases that are not contiguous because one or more intermittent nucleotides are positioned in between the nucleotides of the target sequence.
  • a target sequence of an RNA molecule is naturally occurring. In some embodiments, the target sequence of an RNA molecule is non-naturally occurring.
  • Exemplary non-naturally occurring target sequences may comprise or consist of sequence variations or mutations, chimeric sequences, exogenous sequences, heterologous sequences, recombinant sequences, sequences comprising a modified or synthetic nucleotide or any combination thereof.
  • a target sequence of an RNA molecule binds to a guide RNA of the disclosure. In some embodiments of the compositions and methods of the disclosure, one or more target sequences of an RNA molecule bind to one or more guide RNA spacer sequences of the disclosure.
  • a target sequence of an RNA molecule binds to a first RNA binding protein of the disclosure.
  • compositions of the disclosure comprise a gRNA comprising a spacer sequence that specifically binds to a target toxic RNA repeat or non-repeat sequence.
  • the spacer which binds the target RNA repeat or non-repeat sequence comprises or consists of about 20-30 nucleotides.
  • a gRNA comprises one or more spacer sequences.
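The relationship between a target RNA sequence and a spacer that binds it can be sketched as follows. This is a minimal illustration that assumes the spacer is the exact reverse complement of a contiguous window of the target RNA and uses a default length of 26 nucleotides, within the 20-30 nucleotide range stated above. The function names and the CAG-repeat example target are hypothetical; the sketch does not reflect any particular spacer sequence of the disclosure.

```python
# Watson-Crick complement table for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(sequence: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return sequence.upper().translate(RNA_COMPLEMENT)[::-1]


def design_spacer(target_rna: str, start: int, length: int = 26) -> str:
    """Return a spacer complementary to target_rna[start:start+length].

    `start` is a 0-based offset into the target; `length` is an assumed
    default chosen from the 20-30 nt range.
    """
    window = target_rna[start : start + length]
    if len(window) < length:
        raise ValueError("target window is shorter than the requested spacer")
    return reverse_complement_rna(window)


# Hypothetical target: ten CAG repeats (30 nt), written 5'->3'.
target = "CAG" * 10
spacer = design_spacer(target, start=0)
```

Taking the reverse complement of the spacer recovers the targeted window, which is a simple consistency check on the design.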
  • the compositions of the disclosure comprise a second RNA binding protein which comprises or consists of a nuclease or endonuclease domain.
  • the second RNA-binding protein is an effector protein.
  • the second RNA binding protein binds RNA in a manner in which it associates with RNA.
  • the second RNA binding protein associates with RNA in a manner in which it cleaves RNA.
  • the second RNA-binding protein is fused to a first RNA-binding protein which is a PUF, PUMBY, or PPR-based protein.
  • the second RNA-binding protein is fused to a first RNA-binding protein which is a catalytically deactivated Cas-based (dCas-based) protein.
  • the second RNA binding protein comprises or consists of an RNAse.
  • the second RNA binding protein comprises or consists of an RNAse1.
  • the RNAse1 protein comprises or consists of SEQ ID NO:
  • the second RNA binding protein comprises or consists of an RNAse4.
  • the RNAse4 protein comprises or consists of SEQ ID NO:
  • the second RNA binding protein comprises or consists of an RNAse6.
  • the RNAse6 protein comprises or consists of SEQ ID NO:
  • the second RNA binding protein comprises or consists of an RNAse7.
  • the RNAse7 protein comprises or consists of SEQ ID NO:
  • the second RNA binding protein comprises or consists of an RNAse8.
  • the RNAse8 protein comprises or consists of SEQ ID NO:
  • the second RNA binding protein comprises or consists of an RNAse2.
  • the RNAse2 protein comprises or consists of SEQ ID NO:
  • the second RNA binding protein comprises or consists of an RNAse6PL.
  • the RNAse6PL protein comprises or consists of SEQ ID NO: 331.
  • the second RNA binding protein comprises or consists of an RNAseL.
  • the RNAseL protein comprises or consists of SEQ ID NO: 332.
  • the second RNA binding protein comprises or consists of an RNAseT2.
  • the RNAseT2 protein comprises or consists of SEQ ID NO: 333.
  • the second RNA binding protein comprises or consists of an RNAse11.
  • the RNAse11 protein comprises or consists of SEQ ID NO: 334.
  • the second RNA binding protein comprises or consists of an RNAseT2-like.
  • the RNAseT2-like protein comprises or consists of SEQ ID NO: 335.
  • the second RNA binding protein comprises or consists of a mutated RNAse.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(K41R)) polypeptide.
  • the Rnase1(K41R) polypeptide comprises or consists of SEQ ID NO: 336.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(K41R, D121E)) polypeptide.
  • the Rnase1 (Rnase1(K41R, D121E)) polypeptide comprises or consists of SEQ ID NO: 337.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(K41R, D121E, H119N)) polypeptide.
  • the Rnase1 (Rnase1(K41R, D121E, H119N)) polypeptide comprises or consists of SEQ ID NO: 338.
  • the second RNA binding protein comprises or consists of a mutated Rnase1.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(H119N)) polypeptide.
  • the Rnase1 (Rnase1(H119N)) polypeptide comprises or consists of SEQ ID NO: 339.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N)) polypeptide.
  • the Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N)) polypeptide comprises or consists of SEQ ID NO: 340.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N)) polypeptide.
  • the Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N, K41R, D121E)) polypeptide comprises or consists of SEQ ID NO: 341.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N)) polypeptide.
  • the Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D)) polypeptide comprises or consists of SEQ ID NO: 342.
  • the second RNA binding protein comprises or consists of a mutated Rnase1 (Rnase1(R39D, N67D, N88A, G89D, R91D, H119N, K41R, D121E)) polypeptide that comprises or consists of SEQ ID NO: 343.
  • the second RNA binding protein comprises or consists of a NOB1 polypeptide.
  • the NOB1 polypeptide comprises or consists of SEQ ID NO: 344.
  • the second RNA binding protein comprises or consists of an endonuclease.
  • the second RNA binding protein comprises or consists of an endonuclease V (ENDOV).
  • the ENDOV protein comprises or consists of SEQ ID NO: 345.
  • the second RNA binding protein comprises or consists of an endonuclease G (ENDOG).
  • ENDOG protein comprises or consists of SEQ ID NO: 346.
  • the second RNA binding protein comprises or consists of an endonuclease D1 (ENDOD1).
  • the ENDOD1 protein comprises or consists of SEQ ID NO: 347.
  • the second RNA binding protein comprises or consists of a Human flap endonuclease-1 (hFEN1).
  • the hFEN1 polypeptide comprises or consists of SEQ ID NO: 348.
  • the second RNA binding protein comprises or consists of a DNA repair endonuclease XPF (ERCC4) polypeptide.
  • ERCC4 polypeptide comprises or consists of SEQ ID NO: 349.
  • the second RNA binding protein comprises or consists of an Endonuclease III-like protein 1 (NTHL) polypeptide.
  • NTHL polypeptide comprises or consists of SEQ ID NO: 350.
  • the second RNA binding protein comprises or consists of a human Schlafen 14 (hSLFN14) polypeptide.
  • the hSLFN14 polypeptide comprises or consists of SEQ ID NO: 351.
  • the second RNA binding protein comprises or consists of a human beta-lactamase-like protein 2 (hLACTB2) polypeptide.
  • the hLACTB2 polypeptide comprises or consists of SEQ ID NO: 352.
  • the second RNA binding protein comprises or consists of an apurinic/apyrimidinic (AP) endodeoxyribonuclease (APEX) polypeptide.
  • the second RNA binding protein comprises or consists of an apurinic/apyrimidinic (AP) endodeoxyribonuclease (APEX2) polypeptide.
  • the APEX2 polypeptide comprises or consists of SEQ ID NO: 353.
  • the APEX2 polypeptide comprises or consists of SEQ ID NO: 354.
  • the second RNA binding protein comprises or consists of an apurinic or apyrimidinic site lyase (APEX1) polypeptide.
  • the APEX1 polypeptide comprises or consists of SEQ ID NO: 355.
  • the second RNA binding protein comprises or consists of an angiogenin (ANG) polypeptide.
  • ANG polypeptide comprises or consists of SEQ ID NO: 356.
  • the second RNA binding protein comprises or consists of a heat responsive protein 12 (HRSP12) polypeptide.
  • the HRSP12 polypeptide comprises or consists of SEQ ID NO: 357.
  • the second RNA binding protein comprises or consists of a Zinc Finger CCCH-Type Containing 12A (ZC3H12A) polypeptide.
  • ZC3H12A polypeptide is an endonuclease domain of the ZC3H12A polypeptide which comprises or consists of SEQ ID NO: 358, also referred to as E17 herein.
  • the ZC3H12A polypeptide comprises or consists of SEQ ID NO: 359.
  • the second RNA binding protein comprises or consists of a Reactive Intermediate Imine Deaminase A (RIDA) polypeptide.
  • the RIDA polypeptide comprises or consists of SEQ ID NO: 360.
  • the second RNA binding protein comprises or consists of a Phospholipase D Family Member 6 (PLD6) polypeptide.
  • the PLD6 polypeptide comprises or consists of SEQ ID NO: 361.
  • the second RNA binding protein comprises or consists of a mitochondrial ribonuclease P catalytic subunit (KIAA0391) polypeptide.
  • the KIAA0391 polypeptide comprises or consists of SEQ ID NO: 362.
  • the second RNA binding protein comprises or consists of an argonaute 2 (AGO2) polypeptide.
  • the AGO2 polypeptide comprises or consists of SEQ ID NO: 363.
  • the second RNA binding protein comprises or consists of a mitochondrial nuclease EXOG (EXOG) polypeptide.
  • the EXOG polypeptide comprises or consists of SEQ ID NO: 364.
  • the second RNA binding protein comprises or consists of a Zinc Finger CCCH-Type Containing 12D (ZC3H12D) polypeptide.
  • ZC3H12D polypeptide comprises or consists of SEQ ID NO: 365.
  • the second RNA binding protein comprises or consists of an endoplasmic reticulum to nucleus signaling 2 (ERN2) polypeptide.
  • ERN2 polypeptide comprises or consists of SEQ ID NO: 366.
  • the second RNA binding protein comprises or consists of a pelota mRNA surveillance and ribosome rescue factor (PELO) polypeptide.
  • the PELO polypeptide comprises or consists of SEQ ID NO: 367.
  • the second RNA binding protein comprises or consists of a YBEY metallopeptidase (YBEY) polypeptide.
  • the YBEY polypeptide comprises or consists of SEQ ID NO: 368.
  • the second RNA binding protein comprises or consists of a cleavage and polyadenylation specific factor 4 like (CPSF4L) polypeptide.
  • CPSF4L polypeptide comprises or consists of SEQ ID NO: 369.
  • the second RNA binding protein comprises or consists of an hCG_2002731 polypeptide.
  • the hCG_2002731 polypeptide comprises or consists of SEQ ID NO: 370.
  • the hCG_2002731 polypeptide comprises or consists of SEQ ID NO: 371.
  • the second RNA binding protein comprises or consists of an Excision Repair Cross-Complementation Group 1 (ERCC1) polypeptide.
  • the ERCC1 polypeptide comprises or consists of SEQ ID NO: 372.
  • the second RNA binding protein comprises or consists of a ras-related C3 botulinum toxin substrate 1 isoform (RAC1) polypeptide.
  • RAC1 polypeptide comprises or consists of SEQ ID NO: 373.
  • the second RNA binding protein comprises or consists of a Ribonuclease A A1 (RAA1) polypeptide.
  • RAA1 polypeptide comprises or consists of SEQ ID NO: 374.
  • the second RNA binding protein comprises or consists of a Ras Related Protein (RAB1) polypeptide.
  • RAB1 polypeptide comprises or consists of SEQ ID NO: 375.
  • the second RNA binding protein comprises or consists of a DNA Replication Helicase/Nuclease 2 (DNA2) polypeptide.
  • the DNA2 polypeptide comprises or consists of SEQ ID NO: 376.
  • the second RNA binding protein comprises or consists of a FLJ35220 polypeptide.
  • the FLJ35220 polypeptide comprises or consists of SEQ ID NO: 377.
  • the second RNA binding protein comprises or consists of a FLJ13173 polypeptide.
  • the FLJ13173 polypeptide comprises or consists of SEQ ID NO: 378.
  • the second RNA binding protein comprises or consists of Teneurin Transmembrane Protein (TENM) polypeptide. In some embodiments, the second RNA binding protein comprises or consists of Teneurin Transmembrane Protein 1 (TENM1) polypeptide. In some embodiments, the TENM1 polypeptide comprises or consists of SEQ ID NO: 379.
  • the second RNA binding protein comprises or consists of Teneurin Transmembrane Protein 2 (TENM2) polypeptide.
  • the TENM2 polypeptide comprises or consists of SEQ ID NO: 380.
  • the second RNA binding protein comprises or consists of a Ribonuclease Kappa (RNAseK) polypeptide.
  • the RNAseK polypeptide comprises or consists of SEQ ID NO: 381.
  • the second RNA binding protein comprises or consists of a transcription activator-like effector nuclease (TALEN) polypeptide or a nuclease domain thereof.
  • the TALEN polypeptide comprises or consists of SEQ ID NO: 382.
  • the TALEN polypeptide comprises or consists of SEQ ID NO: 383.
  • the second RNA binding protein comprises or consists of a zinc finger nuclease polypeptide or a nuclease domain thereof. In some embodiments, the second RNA binding protein comprises or consists of a ZNF638 polypeptide or a nuclease domain thereof. In some embodiments, the ZNF638 polypeptide comprises or consists of SEQ ID NO: 384.
  • the second RNA binding protein comprises or consists of a PIN domain derived from the human SMG6 protein, also commonly known as telomerase-binding protein EST1A isoform 3, NCBI Reference Sequence: NP_001243756.1.
  • the PIN from hSMG6 is used herein in the form of a Cas fusion protein and as an internal control, for example, and without limitation.
  • the composition further comprises (a) a sequence comprising a gRNA that specifically binds within an RNA molecule and (b) a sequence encoding a nuclease.
  • a nuclease comprises a sequence isolated or derived from a CRISPR/Cas protein.
  • a nuclease comprises a sequence isolated or derived from a TALEN or a nuclease domain thereof.
  • a nuclease comprises a sequence isolated or derived from a zinc finger nuclease or a nuclease domain thereof.
  • RNA-targeting Cas13d compositions are packaged as AAV unitary vectors.
  • an exemplary RNA-targeting Cas13d composition comprises from 5’ to 3’: a) a human U6 promoter, b) a Cas13d gRNA, wherein the gRNA comprises i) a direct repeat sequence and ii) an RNA targeting spacer sequence, c) a hybrid promoter as disclosed herein, a Kozak sequence, an SV40 NLS sequence, a linker sequence, a sequence encoding Cas13d, a linker sequence, an SV40 NLS sequence, a linker sequence, an HA tag sequence, and a BGH poly A sequence.
  • an exemplary RNA-targeting Cas13d composition comprises from 5’ to 3’: a) a human U6 promoter, b) a Cas13d gRNA, wherein the gRNA comprises i) a direct repeat sequence and ii) an RNA targeting spacer sequence, c) a hybrid promoter as disclosed herein, and d) a sequence encoding Cas13d.
  • an exemplary RNA-targeting Cas13d composition includes additional elements such as signal sequences, linkers, tags, and poly A sequences.
  • an RNA-targeting Cas13d composition comprises from 5’ to 3’: a) a human U6 promoter, b) a Cas13d gRNA, wherein the gRNA comprises i) a direct repeat sequence and ii) an RNA targeting spacer sequence, c) a hybrid promoter as disclosed herein, d) a Kozak sequence, e) an NLS sequence, f) a linker sequence, g) a sequence encoding Cas13d, h) a linker sequence, i) an NLS sequence, j) a linker sequence, k) a tag sequence, and l) a poly A sequence.
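The 5’ to 3’ element order recited above can be represented as an ordered list and used to assemble a construct from named parts. This is an illustrative sketch only: the element names, the helper `assemble_cassette`, and the bracketed placeholder strings are hypothetical, and the placement of the direct repeat before the spacer is assumed from the order in which the two gRNA components are listed.

```python
# Element order, 5' to 3', following items a) through l) above.
CASSETTE_ORDER = [
    "U6_promoter",
    "direct_repeat",   # gRNA component i)
    "spacer",          # gRNA component ii)
    "hybrid_promoter",
    "kozak",
    "NLS_1",
    "linker_1",
    "Cas13d_CDS",
    "linker_2",
    "NLS_2",
    "linker_3",
    "tag",
    "polyA",
]


def assemble_cassette(parts: dict[str, str]) -> str:
    """Concatenate the named parts in the fixed 5'-to-3' order."""
    missing = [name for name in CASSETTE_ORDER if name not in parts]
    if missing:
        raise KeyError(f"missing elements: {missing}")
    return "".join(parts[name] for name in CASSETTE_ORDER)


# Placeholder parts (labels, not real nucleotide sequences).
placeholder_parts = {name: f"[{name}]" for name in CASSETTE_ORDER}
cassette = assemble_cassette(placeholder_parts)
```

Keeping the order in a single list makes it straightforward to verify that every required element is present before a construct is assembled.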
  • NOIs or transgenes or GOIs such as nucleic acid sequences encoding RNA-targeting Cas13d proteins of the disclosure are codon optimized nucleic acid sequences.
  • the codon optimized sequence exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased translation in a human subject relative to a wild-type or non-codon optimized nucleic acid sequence.
  • a codon optimized nucleic acid sequence exhibits increased stability. In some aspects, a codon optimized nucleic acid sequence exhibits increased stability through increased resistance to hydrolysis. In some embodiments, the codon optimized sequence exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased stability relative to a wild-type or non-codon optimized nucleic acid sequence.
  • the codon optimized sequence exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased resistance to hydrolysis in a human subject relative to a wild-type or non-codon optimized nucleic acid sequence.
  • a codon optimized nucleic acid sequence can comprise no donor splice sites.
  • a codon optimized nucleic acid sequence can comprise no more than about one, or about two, or about three, or about four, or about five, or about six, or about seven, or about eight, or about nine, or about ten donor splice sites.
  • a codon optimized nucleic acid sequence comprises at least one, or at least two, or at least three, or at least four, or at least five, or at least six, or at least seven, or at least eight, or at least nine, or at least ten fewer donor splice sites as compared to a non-codon optimized nucleic acid sequence.
  • the removal of donor splice sites in the codon optimized nucleic acid sequence can unexpectedly and unpredictably increase expression of protein of interest in vivo, as cryptic splicing is prevented.
  • cryptic splicing may vary between different subjects, meaning that the expression level of a protein comprising donor splice sites may unpredictably vary between different subjects. Such unpredictability is unacceptable in the context of human therapy.
  • the codon optimized nucleic acid sequence which lacks donor splice sites unexpectedly and surprisingly allows for increased expression of the protein in human subjects and regularizes expression of the protein across different human subjects.
  • a codon optimized nucleic acid sequence can have a GC content that differs from the GC content of the non-codon optimized nucleic acid sequence encoding the RNA-targeting Cas13d protein. In some aspects, the GC content of a codon optimized nucleic acid sequence is more evenly distributed across the entire nucleic acid sequence, as compared to the non-codon optimized nucleic acid sequence.
  • the codon optimized nucleic acid sequence exhibits a more uniform melting temperature (“Tm”) across the length of the transcript.
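How evenly GC content is distributed along a sequence can be examined with a sliding-window calculation. The sketch below is one simple way to do so and is not a method described in the disclosure; the window and step sizes, the function names, and the two toy sequences are assumptions chosen for illustration.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a nucleic acid sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(sequence: str, window: int = 50, step: int = 10) -> list[float]:
    """GC fraction of each full-length window along the sequence."""
    return [
        gc_fraction(sequence[i : i + window])
        for i in range(0, len(sequence) - window + 1, step)
    ]


def gc_spread(sequence: str, window: int = 50, step: int = 10) -> float:
    """Difference between the highest and lowest windowed GC fraction."""
    values = windowed_gc(sequence, window, step)
    if not values:
        raise ValueError("sequence is shorter than the window")
    return max(values) - min(values)


# Two toy sequences with identical overall GC content (50%).
uneven = "G" * 50 + "A" * 50   # GC concentrated in one half
even = "GA" * 50               # GC spread uniformly
```

Both toy sequences have the same overall GC fraction, but only the second has a uniform windowed profile, which is the kind of difference the passage above describes.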
  • a codon optimized nucleic acid sequence can have fewer repressive microRNA target binding sites as compared to the non-codon optimized nucleic acid sequence.
  • a codon optimized nucleic acid sequence can have at least one, or at least two, or at least three, or at least four, or at least five, or at least six, or at least seven, or at least eight, or at least nine, or at least ten fewer repressive microRNA target binding sites as compared to the non-codon optimized nucleic acid sequence.
  • the codon optimized nucleic acid sequence unexpectedly exhibits increased expression in a human subject.
  • the composition comprises an NOI which is a nucleic acid sequence encoding a target RNA-binding fusion protein comprising (a) a sequence encoding a first RNA-binding polypeptide or portion thereof; and optionally (b) a sequence encoding a second RNA-binding polypeptide, wherein the first RNA-binding polypeptide binds a target RNA, and wherein the second RNA-binding polypeptide comprises RNA-nuclease activity.
  • a target RNA-binding fusion protein is an RNA-guided target RNA-binding fusion protein.
  • RNA-guided target RNA-binding fusion proteins comprise at least one RNA-binding polypeptide which corresponds to a gRNA which guides the RNA-binding polypeptide to target RNA.
  • RNA-guided target RNA-binding fusion proteins include without limitation, RNA-binding polypeptides which are CRISPR/Cas-based RNA-binding polypeptides or portions thereof.
  • a target RNA-binding fusion protein of the disclosure comprises a signal sequence.
  • a target RNA-binding fusion protein comprises one or more signal sequences.
  • the signal sequence(s) is a nuclear localization sequence (NLS), nuclear export signal (NES) or a combination thereof.
  • the tag sequence comprises a nuclear localization sequence (NLS).
  • the NLS sequence comprises a sequence listed in Table 8.
  • the NLS signal sequence is a human NLS.
  • the human NLS signal sequence is a human pRB-NLS or a human pRB-NLS (extended version).
  • the signal sequence comprises one or more NES sequences.
  • the one or more NES sequence comprises a sequence listed in Table 9.
  • a target RNA-binding fusion protein of the disclosure comprises a tag sequence.
  • the tag sequence is a FLAG tag.
  • the FLAG tag sequence is DYKDDDDK (SEQ ID NO: 426).
  • a target RNA-binding fusion protein comprises a linker sequence.
  • the linker sequence may comprise or consist of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or any number of amino acids in between.
  • the linker sequence comprises a linker sequence listed in Table 10. Table 10. Linker Sequences of the disclosure
  • the NOI is a nucleic acid encoding a target RNA-binding fusion protein which is not an RNA-guided target RNA-binding fusion protein and as such comprises at least one RNA-binding polypeptide which is capable of binding a target RNA without a corresponding gRNA sequence.
  • RNA-binding polypeptides include, without limitation, at least one RNA-binding protein or RNA-binding portion thereof which is a PUF (Pumilio and FBF homology family) protein. This type of RNA-binding polypeptide can be used instead of a gRNA-guided RNA-binding protein such as CRISPR/Cas.
  • the unique RNA recognition mode of PUF proteins (named for Drosophila Pumilio and C. elegans fem-3 binding factor), which are involved in mediating mRNA stability and translation, is well known in the art.
  • the PUF domain of human Pumilio1, also known in the art, binds tightly to cognate RNA sequences and its specificity can be modified. It contains eight PUF modules that recognize eight consecutive RNA bases with each module recognizing a single base. Since two amino acid side chains in each module recognize the Watson-Crick edge of the corresponding base and determine the specificity of that module, a PUF protein can be designed to specifically bind most 8- to 16-nt RNAs. Wang et al., Nat Methods. 2009; 6(11): 825-830. See also WO2012/068627 which is incorporated by reference herein in its entirety.
  • PumHD is a modified version of the WT Pumilio protein that exhibits programmable binding to arbitrary 8-base sequences of RNA.
  • Each of the eight units of PumHD can bind to all four RNA bases, and the RNA bases flanking the target sequence do not affect binding. See also the following for art-recognized RNA-binding rules of PUF design: Filipovska A, Razif MF, Nygard KK, & Rackham O. A universal code for RNA recognition by PUF proteins.
  • human PUM1 (1186 amino acids) contains an RNA-binding domain (RBD) in the C-terminus of the protein (also known as the Pumilio homology domain, PUM-HD; amino acid 828 to amino acid 1175), and PUFs are based on the RBD of human PUM1.
  • amino acids 12, 13, and 16 are important for RNA binding with 12 and 16 responsible for RNA base recognition.
  • Amino acid 13 stacks with RNA bases and can be modified to tune specificity and affinity.
  • the PUF design may maintain amino acid 13 as human PUM1’s native residue.
  • amino acid 13 for stacking
  • amino acid 13 will be engineered with an H and in other embodiments, will be engineered with a Y.
  • stacking residues may be modified to improve binding and specificity. Recognition occurs in reverse orientation as N- to C-terminal PUF recognizes 3’ to 5’ RNA. Accordingly, PUF engineering of 8 modules (8PUF), as known in the art, mimics a human protein.
  • An exemplary 8-mer RNA recognition (8PUF) would be designed as follows: R1′-R1-R2-R3-R4-R5-R6-R7-R8-R8′
  • an 8PUF is used as the RBD.
  • a variation of the 8PUF design is used to create a 14-mer RNA recognition (14PUF) RBD, 15-mer RNA recognition (15PUF) RBD, or a 16-mer RNA recognition (16PUF) RBD.
  • the PUF can be engineered to comprise a 4-mer, 5-mer, 6-mer, 7-mer, 8-mer, 9-mer, 10-mer, 11-mer, 12-mer, 13-mer, 14-mer, 15-mer, 16-mer, 24-mer, 30-mer, 36-mer, or any number of modules between. Shinoda et al., 2018; Criscuolo et al., 2020. Repeats 1-8 of wild type human PUM1 are provided herewith at SEQ ID NOS: 434-441, respectively.
  • the nucleic acid sequence encoding the PUF domain from human PUM1 is SEQ ID NO: 442 and the amino acid sequence of the PUF domain from human PUM1 amino acids 828-1176 is SEQ ID NO: 443. See also US Patent 9,580,714 which is incorporated herein in its entirety.
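By way of non-limiting illustration, the reverse-orientation rule described above (an N- to C-terminal PUF recognizing 3’ to 5’ RNA, one base per module) can be sketched as follows. The function name and the example target are hypothetical and chosen only for illustration; the sketch assumes that repeat R1 pairs with the 3’-most base of an 8-nt target and does not model the amino acid code at positions 12, 13, and 16.

```python
# Illustrative sketch only: assign each base of an 8-nt RNA target to one of
# the eight PUF repeats. Assumption: R1 (N-terminal) reads the 3'-most base.

def assign_repeats(target_rna: str) -> list[tuple[str, str]]:
    """Return (repeat, base) pairs for an 8-nt target written 5'->3'."""
    target = target_rna.upper().replace("T", "U")
    if len(target) != 8 or set(target) - set("ACGU"):
        raise ValueError("expected an 8-nt RNA sequence over A, C, G, U")
    # Reverse the target so that index 0 is the 3' end, which R1 recognizes.
    return [(f"R{i}", base) for i, base in enumerate(reversed(target), start=1)]


if __name__ == "__main__":
    for repeat, base in assign_repeats("UGUAUAUA"):  # hypothetical target
        print(repeat, "->", base)
```

A longer design (e.g., a 16PUF) would follow the same ordering with more modules; that extension is not shown here.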
  • the fusion protein comprises at least one RNA-binding protein or RNA-binding portion thereof which is a PUMBY (Pumilio-based assembly) protein.
  • The RNA-binding protein PumHD, which has been widely used in native and modified form for targeting RNA, has been engineered into a protein architecture designed to yield a set of four canonical protein modules, each of which targets one RNA base. These modules (i.e., Pumby, for Pumilio-based assembly) are concatenated in chains of varying composition and length, to bind desired target RNAs.
  • the first RNA binding protein comprises a Pumilio and FBF (PUF) protein.
  • the first RNA binding protein comprises a Pumilio-based assembly (PUMBY) protein.
  • the PUF or PUMBY RNA-binding proteins are fused with a nuclease domain such as E17.
  • at least one of the RNA-binding proteins or RNA-binding portions thereof is a PPR protein.
  • PPR proteins are proteins with pentatricopeptide repeat (PPR) motifs derived from plants.
  • PPR proteins are nuclear-encoded and act exclusively at the RNA level in organelles (chloroplasts and mitochondria), where they act specifically on RNA cleavage, translation, splicing, RNA editing, and RNA stability.
  • PPR proteins typically have a structure in which a PPR motif of 35 amino acids is repeated about 10 times contiguously. The combination of PPR motifs can be used for sequence-selective binding to RNA.
  • PPR proteins often comprise about 10 PPR motif repeat domains.
  • the fusion protein disclosed herein comprises a linker between the at least two RNA-binding polypeptides.
  • the linker is a peptide linker.
  • the linker is VDTANGS (SEQ ID NO: 401).
  • the peptide linker comprises one or more repeats of the tri-peptide GGS. In other embodiments, the linker is a non-peptide linker.
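As a non-limiting sketch of the linker arrangement described above, the following assembles a fusion of two polypeptides joined by a GGS-repeat linker. The FLAG tag (DYKDDDDK) and the VDTANGS linker are taken from the text; the two polypeptide sequences are short placeholders rather than real domains, and placing the tag at the C-terminus is an assumption made only for this example.

```python
# Illustrative sketch only: join two polypeptides with a peptide linker.
FLAG_TAG = "DYKDDDDK"       # SEQ ID NO: 426 in the text
EXAMPLE_LINKER = "VDTANGS"  # SEQ ID NO: 401 in the text


def ggs_linker(repeats: int) -> str:
    """Build a peptide linker from `repeats` copies of the tri-peptide GGS."""
    if repeats < 1:
        raise ValueError("at least one GGS repeat is required")
    return "GGS" * repeats


def assemble_fusion(first: str, second: str, linker: str, tag: str = "") -> str:
    """Concatenate first polypeptide, linker, second polypeptide, optional tag."""
    return first + linker + second + tag


if __name__ == "__main__":
    first_rbp = "MAAAA"  # placeholder for a first RNA-binding polypeptide
    nuclease = "CCCC"    # placeholder for a second polypeptide
    print(assemble_fusion(first_rbp, nuclease, ggs_linker(3), FLAG_TAG))
    print(assemble_fusion(first_rbp, nuclease, EXAMPLE_LINKER))
```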
  • the non-peptide linker comprises polyethylene glycol (PEG), polypropylene glycol (PPG), co-poly(ethylene/propylene) glycol, polyoxyethylene (POE), polyurethane, polyphosphazene, polysaccharides, dextran, polyvinyl alcohol, polyvinylpyrrolidones, polyvinyl ethyl ether, polyacrylamide, polyacrylate, polycyanoacrylates, lipid polymers, chitins, hyaluronic acid, heparin, or an alkyl linker.
  • the at least one RNA-binding protein does not require multimerization for RNA-binding activity.
  • the at least one RNA-binding protein is not a monomer of a multimer complex.
  • a multimer protein complex does not comprise the RNA binding protein.
  • the at least one RNA-binding protein selectively binds to a target sequence within the RNA molecule.
  • the at least one RNA-binding protein does not comprise an affinity for a second sequence within the RNA molecule.
  • the at least one RNA-binding protein does not comprise a high affinity for or selectively bind a second sequence within the RNA molecule.
  • the at least one RNA-binding protein comprises between 2 and 1300 amino acids, inclusive of the endpoints.
  • the at least one RNA-binding protein of the fusion proteins disclosed herein further comprises a sequence encoding a nuclear localization signal (NLS).
  • a nuclear localization signal (NLS) is positioned at the N-terminus of the RNA binding protein.
  • the at least one RNA-binding protein comprises an NLS at a C-terminus of the protein.
  • the at least one RNA-binding protein further comprises a first sequence encoding a first NLS and a second sequence encoding a second NLS.
  • the first NLS or the second NLS is positioned at the N-terminus of the RNA-binding protein.
  • the at least one RNA-binding protein comprises the first NLS or the second NLS at a C-terminus of the protein. In some embodiments, the at least one RNA-binding protein further comprises an NES (nuclear export signal) or other peptide tag or secretory signal. In one embodiment, the tag is a FLAG tag.
  • a fusion protein disclosed herein comprises the at least one RNA-binding protein as a first RNA-binding protein together with a second RNA-binding protein comprising or consisting of a nuclease domain.
  • the second RNA-binding polypeptide is operably configured to the first RNA-binding polypeptide at the C-terminus of the first RNA-binding polypeptide. In some embodiments, the second RNA-binding polypeptide is operably configured to the first RNA-binding polypeptide at the N-terminus of the first RNA-binding polypeptide.
  • an exemplary fusion protein is a PUF- or PUMBY-based first RNA-binding protein fused to a second RNA-binding protein which is a zinc-finger endonuclease known as ZC3H12A or a truncation thereof, as shown in SEQ ID NO: 358 (also termed E17 herein).
  • nucleic acid sequences encoding PUF proteins of the disclosure are codon optimized nucleic acid sequences.
  • the codon optimized sequence encoding a PUF protein exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased expression in a human subject relative to a wild-type or non-codon optimized nucleic acid sequence.
  • the codon optimized sequence encoding a PUF protein exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased translation in a human subject relative to a wild-type or non-codon optimized nucleic acid sequence.
  • a codon optimized nucleic acid sequence encoding a PUF protein exhibits increased stability. In some aspects, a codon optimized nucleic acid sequence encoding a PUF protein exhibits increased stability through increased resistance to hydrolysis. In some embodiments, the codon optimized sequence encoding a PUF protein exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased stability relative to a wild-type or non-codon optimized nucleic acid sequence.
  • the codon optimized sequence encoding a PUF protein exhibits at least 5%, at least 10%, at least 20%, at least 30%, at least 50%, at least 75%, at least 100%, at least 200%, at least 300%, at least 500%, or at least 1000% increased resistance to hydrolysis in a human subject relative to a wild-type or non-codon optimized nucleic acid sequence.
  • a codon optimized nucleic acid sequence encoding a PUF protein can comprise no donor splice sites. In some aspects, a codon optimized nucleic acid sequence encoding a PUF protein can comprise no more than about one, or about two, or about three, or about four, or about five, or about six, or about seven, or about eight, or about nine, or about ten donor splice sites.
  • a codon optimized nucleic acid sequence encoding a PUF protein comprises at least one, or at least two, or at least three, or at least four, or at least five, or at least six, or at least seven, or at least eight, or at least nine, or at least ten fewer donor splice sites as compared to a non-codon optimized nucleic acid sequence encoding the PUF protein.
  • the removal of donor splice sites in the codon optimized nucleic acid sequence can unexpectedly and unpredictably increase expression of the PUF protein in vivo, as cryptic splicing is prevented.
  • cryptic splicing may vary between different subjects, meaning that the expression level of the PUF protein comprising donor splice sites may unpredictably vary between different subjects.
  • a codon optimized nucleic acid sequence encoding a PUF protein can have a GC content that differs from the GC content of the non-codon optimized nucleic acid sequence encoding the PUF protein.
  • the GC content of a codon optimized nucleic acid sequence encoding a PUF protein is more evenly distributed across the entire nucleic acid sequence, as compared to the non-codon optimized nucleic acid sequence encoding the PUF protein.
  • the codon optimized nucleic acid sequence exhibits a more uniform melting temperature (“Tm”) across the length of the transcript.
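As a hedged illustration of what a more evenly distributed GC content could mean in practice, the sketch below computes GC fraction in sliding windows and reports the spread between the richest and poorest window. The window size, the spread metric, and both example sequences are assumptions made for illustration; they are not taken from the disclosure and are not real coding sequences.

```python
# Illustrative sketch only: compare how evenly GC content is distributed.

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int, step: int = 1) -> list[float]:
    """GC fraction of every window of length `window` along the sequence."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def gc_spread(seq: str, window: int) -> float:
    """Difference between the highest and lowest windowed GC fraction."""
    values = windowed_gc(seq, window)
    return max(values) - min(values)


if __name__ == "__main__":
    clustered = "GGGGCCCCAAAATTTT"  # made-up sequence, GC concentrated at one end
    even = "GACTGACTGACTGACT"       # made-up sequence, GC spread throughout
    for name, seq in [("clustered", clustered), ("even", even)]:
        print(name, "overall GC:", gc_fraction(seq), "spread:", gc_spread(seq, 4))
```

Both example sequences have the same overall GC fraction, so the spread value isolates the distribution rather than the total content.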
  • a codon optimized nucleic acid sequence encoding a PUF protein can have fewer repressive microRNA target binding sites as compared to the non-codon optimized nucleic acid sequence encoding the PUF protein.
  • a codon optimized nucleic acid sequence encoding a PUF protein can have at least one, or at least two, or at least three, or at least four, or at least five, or at least six, or at least seven, or at least eight, or at least nine, or at least ten fewer repressive microRNA target binding sites as compared to the non-codon optimized nucleic acid sequence encoding the PUF protein.
  • a vector comprises the hybrid promoters disclosed herein in operable linkage to an NOI or transgene.
  • the NOI is a guide RNA.
  • the vector comprises at least one guide RNA of the disclosure.
  • the vector comprises one or more guide RNA(s) of the disclosure.
  • the vector comprises two or more guide RNAs of the disclosure.
  • the vector comprises three guide RNAs.
  • the vector comprises four guide RNAs.
  • the vector comprises an NOI which further comprises a guided or non-guided RNA-binding protein of the disclosure.
  • the vector comprises an NOI which further comprises an RNA-binding fusion protein of the disclosure.
  • the fusion protein comprises a first RNA binding protein and a second RNA binding protein.
  • the hybrid promoters of the disclosure are in operable linkage with an NOI in a single or unitary vector.
  • the single vector comprises an NOI which is an RNA-guided RNA-binding system comprising an RNA-binding protein and a gRNA.
  • the single vector comprises the RNA-guided RNA-binding systems which are Cas13d RNA-guided RNA-binding systems or catalytically deactivated Cas13d (dCas13d) RNA-guided RNA-binding systems.
  • the single vector comprises the Cas13d RNA-guided RNA-binding systems which are CasRx or dCasRx RNA-guided RNA-binding systems.
  • the single vector comprises a non-guided RNA-binding system comprising a PUF or PUMBY-based protein fused with a nuclease domain from ZC3H12A, such as E17 (SEQ ID NO: 358).
  • the single vector comprises a dCas13d RNA-binding system fused with a nuclease domain from ZC3H12A, such as E17 (SEQ ID NO: 359).
  • a first vector comprises an NOI which is a guide RNA of the disclosure and a second vector comprises an NOI which is an RNA-binding protein or RNA-binding fusion protein of the disclosure.
  • the first vector comprises at least one guide RNA of the disclosure.
  • the first vector comprises one or more guide RNA(s) of the disclosure.
  • the first vector comprises two or more guide RNA(s) of the disclosure.
  • the fusion protein comprises a first RNA binding protein and a second RNA binding protein.
  • the first vector and the second vector are identical vectors or vector serotypes.
  • the first vector and the second vector are not identical vectors or vector serotypes.
  • the RNA-binding systems capable of targeting toxic CAG, CUG, GGCCCC, CCGGG, or GGCCC+CCGGGG RNA repeats are in a single vector.
  • the RNA-binding systems are capable of targeting a non-repeat RNA of interest.
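For illustration only, a target RNA containing a tract of tandem repeats such as CAG or CUG can be located computationally before a binding protein or gRNA is designed against it. The sketch below is a minimal example; the function name, the `min_copies` threshold, and the example transcript are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: locate tandem repeat tracts in an RNA sequence.
import re


def find_repeat_tracts(rna: str, unit: str, min_copies: int = 3) -> list[tuple[int, int]]:
    """Return (start index, copy number) for each tract of `unit` repeated
    at least `min_copies` times in tandem."""
    rna = rna.upper().replace("T", "U")
    unit = unit.upper().replace("T", "U")
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [(m.start(), len(m.group()) // len(unit)) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    transcript = "AUG" + "CAG" * 5 + "UUU" + "CAG" * 2  # made-up transcript
    print(find_repeat_tracts(transcript, "CAG"))
    print(find_repeat_tracts(transcript, "CUG"))
```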
  • the RNA-binding systems disclosed herein are in a single vector.
  • one type of vector is a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques.
  • another type of vector is a viral vector, e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses.
  • Viral vectors also include polynucleotides carried by a virus for transfection into a host cell.
  • the vector is a lentivirus (such as an integration-deficient lentiviral vector) or adeno-associated viral (AAV) vector.
  • Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.
  • vectors such as e.g., expression vectors
  • Common expression vectors are often in the form of plasmids.
  • recombinant expression vectors comprise a nucleic acid provided herein such as e.g., a guide RNA which can be expressed from a DNA sequence, and a nucleic acid encoding a Cas13d protein, in a form suitable for expression of a protein in a host cell.
  • Recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that are operatively linked to the nucleic acid sequence to be expressed.
  • operably linked is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence such as e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell. Certain embodiments of a vector depend on factors such as the choice of the host cell to be transformed, and the level of expression desired.
  • a vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein such as, e.g., CRISPR transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.
  • a vector of the disclosure is a viral vector.
  • the viral vector comprises a sequence isolated or derived from a retrovirus.
  • the viral vector comprises a sequence isolated or derived from a lentivirus.
  • the viral vector comprises a sequence isolated or derived from an adenovirus.
  • the viral vector comprises a sequence isolated or derived from an adeno-associated virus (AAV).
  • the viral vector is replication incompetent.
  • the viral vector is isolated or recombinant.
  • the viral vector is self complementary.
  • the viral vector comprises an inverted terminal repeat sequence or a capsid sequence that is isolated or derived from an AAV of serotype AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAV9, AAV10 (AAVrh10), AAV11 or AAV12.
  • the AAV serotype is AAVrh.74.
  • the AAV vector comprises a modified capsid.
  • the AAV vector is an AAV2-Tyr mutant vector.
  • the AAV vector comprises a capsid with a non-tyrosine amino acid at a position that corresponds to a surface-exposed tyrosine residue in position Tyr252, Tyr272, Tyr275, Tyr281, Tyr444, Tyr500, Tyr508, Tyr612, Tyr704,
  • the AAV vector comprises an engineered capsid.
  • AAV vectors comprising engineered capsids include without limitation, AAV2.7m8, AAV9.7m8, AAV22tYF, and AAV8 Y733F.
  • the engineered capsid is a ubiquitination resistant capsid.
  • the ubiquitination resistant capsid is an AAV2 capsid comprising tyrosine (Y) and serine (S) mutations.
  • the AAV2 capsid comprises Y, S and threonine (T) mutations.
  • the engineered AAV2 capsid includes, without limitation, AAV2 capsid mutants such as T455V, T491V, T550V, T659V, Y444+500+730F, and Y444+500+730F+T491V.
  • the viral vector is isolated or recombinant (rAAV).
  • the viral vector is self complementary (scAAV).
  • a vector of the disclosure is a non-viral vector.
  • the vector comprises or consists of a nanoparticle, a micelle, a liposome or lipoplex, a polymersome, a polyplex or a dendrimer.
  • the vector is an expression vector or recombinant expression system.
  • the term “recombinant expression system” refers to a genetic construct for the expression of certain genetic material formed by recombination.
  • an expression vector, viral vector or non-viral vector provided herein includes without limitation, an expression control element.
  • An “expression control element” as used herein refers to any sequence that regulates the expression of a coding sequence, such as a gene.
  • Exemplary expression control elements include but are not limited to promoters, enhancers, microRNAs, post-transcriptional regulatory elements, polyadenylation signal sequences, and introns. Expression control elements may be constitutive, inducible, repressible, or tissue-specific, for example.
  • a “promoter” is a control sequence that is a region of a polynucleotide sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules may bind such as RNA polymerase and other transcription factors. In some embodiments, expression control by a promoter is tissue-specific. In some embodiments, expression control by a promoter is constitutive or ubiquitous.
  • Non-limiting exemplary promoters include a Pol III promoter such as, e.g., U6 and H1 promoters and/or a Pol II promoter, e.g., SV40, CMV (optionally including the CMV enhancer), RSV (Rous Sarcoma Virus LTR promoter, optionally including the RSV enhancer), CBA (hybrid CMV enhancer/chicken β-actin), CAG (hybrid CMV enhancer fused to chicken β-actin), truncated CAG (tCAG), Cbh (hybrid CBA), EF-1a (human elongation factor alpha-1) or EFS (short intron-less EF-1 alpha), PGK (phosphoglycerate kinase), CEF (chicken embryo fibroblasts), UBC (ubiquitin C), GUSB (lysosomal enzyme beta-glucuronidase), UCOE (ubiquitous chromatin opening element), hAAT (alpha-1
  • An “enhancer” is a region of DNA that can be bound by activating proteins to increase the likelihood or frequency of transcription.
  • Non-limiting exemplary enhancers and posttranscriptional regulatory elements include the CMV enhancer, MCK enhancer, R-U5’ segment in LTR of HTLV-1, SV40 enhancer, the intron sequence between exons 2 and 3 of rabbit β-globin, and WPRE.
  • an expression vector, viral vector or non-viral vector includes without limitation, vector elements such as an IRES or 2A peptide sites for configuration of “multicistronic” or “polycistronic” or “bicistronic” or “tricistronic” constructs, i.e., having double or triple or multiple coding areas or exons, and as such will have the capability to express from mRNA two or more proteins from a single construct.
  • Multicistronic vectors simultaneously express two or more separate proteins from the same mRNA.
  • the two strategies most widely used for constructing multicistronic configurations are through the use of an IRES or a 2A self-cleaving site.
  • an “IRES” refers to an internal ribosome entry site or portion thereof of viral, prokaryotic, or eukaryotic origin which is used within polycistronic vector constructs.
  • an IRES is an RNA element that allows for translation initiation in a cap-independent manner.
  • “self-cleaving peptides” or “sequences encoding self-cleaving peptides” or “2A self-cleaving site” refer to linking sequences which are used within vector constructs to incorporate sites to promote ribosomal skipping and thus to generate two polypeptides from a single promoter; such self-cleaving peptides include, without limitation, T2A and P2A peptides or other sequences encoding the self-cleaving peptides.
  • exemplary vector configurations comprise a hybrid promoter disclosed herein driving the expression of an NOI.
  • the NOI is a nucleic acid encoding the RNA-targeting PUF-endonuclease fusion.
  • a vector configuration comprises a hybrid promoter disclosed herein driving expression of the RNA-guided Cas RNase RNA-binding protein, or dCas protein fusion, in operable linkage with a second promoter driving expression of a cognate gRNA.
  • the vector configurations can comprise linker(s), signal sequence(s), and/or tag(s).
  • the vector is a viral vector.
  • the vector is an adenoviral vector, an adeno-associated viral (AAV) vector, or a lentiviral vector.
  • the vector is a retroviral vector, an adenoviral/retroviral chimera vector, a herpes simplex viral I or II vector, a parvoviral vector, a reticuloendotheliosis viral vector, a polioviral vector, a papillomaviral vector, a vaccinia viral vector, or any hybrid or chimeric vector incorporating favorable aspects of two or more viral vectors.
  • the vector further comprises one or more expression control elements operably linked to the polynucleotide. In some embodiments, the vector further comprises one or more selectable markers. In some embodiments, the AAV vector has low toxicity. In some embodiments, the AAV vector does not incorporate into the host genome, thereby having a low probability of causing insertional mutagenesis. In some embodiments, the AAV vector can encode a range of total polynucleotides from 4.5 kb to 4.75 kb.
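As a rough, non-limiting sketch of how the packaging range stated above might be used when planning a vector configuration, the following sums the lengths of the elements in an expression cassette and compares the total against the upper end of that range (4.75 kb). All element names and lengths are hypothetical placeholders, not measurements of any sequence of the disclosure.

```python
# Illustrative sketch only: check a cassette against an assumed payload limit.
MAX_PAYLOAD_BP = 4750  # upper end of the 4.5 kb to 4.75 kb range in the text


def cassette_length(elements: dict[str, int]) -> int:
    """Total length in base pairs of all elements in the cassette."""
    return sum(elements.values())


def fits_payload(elements: dict[str, int], limit_bp: int = MAX_PAYLOAD_BP) -> bool:
    """True if the cassette does not exceed the payload limit."""
    return cassette_length(elements) <= limit_bp


if __name__ == "__main__":
    # Hypothetical element lengths in base pairs.
    single_noi = {"ITRs": 290, "hybrid promoter": 600, "NOI": 3000, "polyA": 200}
    with_grna = {**single_noi, "second promoter + gRNA": 800}
    for name, cassette in [("single NOI", single_noi), ("NOI + gRNA", with_grna)]:
        print(name, cassette_length(cassette), "bp, fits:", fits_payload(cassette))
```

In this made-up example the second configuration exceeds the limit, which is the kind of case where a two-vector arrangement might be considered instead.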
  • exemplary AAV vectors that may be used in any of the herein described compositions, systems, methods, and kits can include an AAV1 vector, a modified AAV1 vector, an AAV2 vector, a modified AAV2 vector, an AAV2-Tyr mutant vector, an AAV3 vector, a modified AAV3 vector, an AAV4 vector, a modified AAV4 vector, an AAV5 vector, a modified AAV5 vector, an AAV6 vector, a modified AAV6 vector, an AAV7 vector, a modified AAV7 vector, an AAV8 vector, an AAVrh8 vector, an AAV9 vector, an AAV.rh10 vector, a modified AAV.rh10 vector, an AAVrh.74 vector, an AAV.rh32/33 vector, a modified AAV.rh32/33 vector, an AAV.rh43 vector, a modified AAV.rh43 vector, an AAV.rh64R1 vector, and a modified AAV.rh
  • the lentiviral vector is an integrase-competent lentiviral vector (ICLV).
  • the lentiviral vector can refer to the transgene plasmid vector as well as the transgene plasmid vector in conjunction with related plasmids (e.g., a packaging plasmid, a rev expressing plasmid, an envelope plasmid) as well as a lentiviral-based particle capable of introducing exogenous nucleic acid into a cell through a viral or viral-like entry mechanism.
  • Lentiviral vectors are well-known in the art (see, e.g., Trono D.
  • exemplary lentiviral vectors that may be used in any of the herein described compositions, systems, methods, and kits can include a human immunodeficiency virus (HIV) 1 vector, a modified human immunodeficiency virus (HIV) 1 vector, a human immunodeficiency virus (HIV) 2 vector, a modified human immunodeficiency virus (HIV) 2 vector, a sooty mangabey simian immunodeficiency virus (SIVSM) vector, a modified sooty mangabey simian immunodeficiency virus (SIVSM) vector, an African green monkey simian immunodeficiency virus (SIVAGM) vector, a modified African green monkey simian immunodeficiency virus (SIVAGM) vector, an HIVdeficiency virus (HIV) 1 vector, a modified human immunodeficiency virus (HIV) 1 vector, a human immunodeficiency virus (HIV) 2 vector, a modified human immuno
  • nucleic acid sequences encoding the gene therapy compositions or RNA-targeting systems for use in gene transfer and expression techniques described herein. It should be understood, although not always explicitly stated that the sequences provided herein can be used to provide the expression product as well as substantially identical sequences that produce a protein that has the same biological properties. These “biologically equivalent” or “biologically active” or “equivalent” polypeptides are encoded by equivalent polynucleotides as described herein.
  • They may possess at least 60%, or alternatively, at least 65%, or alternatively, at least 70%, or alternatively, at least 75%, or alternatively, at least 80%, or alternatively at least 85%, or alternatively at least 90%, or alternatively at least 95% or alternatively at least 98%, identical primary amino acid sequence to the reference polypeptide when compared using sequence identity methods run under default conditions.
  • Specific polypeptide sequences are provided as examples of particular embodiments. Modifications to the sequences that substitute amino acids with alternate amino acids having similar charge are contemplated.
  • an equivalent polynucleotide is one that hybridizes under stringent conditions to the reference polynucleotide or its complement or in reference to a polypeptide, a polypeptide encoded by a polynucleotide that hybridizes to the reference encoding polynucleotide under stringent conditions or its complementary strand.
  • an equivalent polypeptide or protein is one that is expressed from an equivalent polynucleotide.
  • nucleic acid sequences e.g., polynucleotide sequences
  • exemplary Cas sequences such as e.g., a nucleic acid sequence encoding SEQ ID NO: 144 (Cas13d known as CasRx) or the amino acid sequence encoding SEQ ID NO: 298 (Cas13d known as CasRx), are codon optimized for expression in human cells. Codon optimization refers to the fact that different cells differ in their usage of particular codons. This codon bias corresponds to a bias in the relative abundance of particular tRNAs in the cell type.
  • nucleic acid sequences coding for, e.g., a Cas protein can be generated.
  • such a sequence is optimized for expression in a host or target cell, such as a host cell used to express the Cas protein or a cell in which the disclosed methods are practiced (such as in a mammalian cell, e.g., a human cell).
  • Codon preferences and codon usage tables for a particular species can be used to engineer isolated nucleic acid molecules encoding a Cas protein (such as one encoding a protein having at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to its corresponding wild-type protein) that takes advantage of the codon usage preferences of that particular species.
  • the Cas proteins disclosed herein can be designed to have codons that are preferentially used by a particular organism of interest.
  • a Cas nucleic acid sequence is optimized for expression in human cells, such as one having at least 70%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% sequence identity to its corresponding wild-type or originating nucleic acid sequence.
  • an isolated nucleic acid molecule encoding at least one Cas protein (which can be part of a vector) includes at least one Cas protein coding sequence that is codon optimized for expression in a eukaryotic cell, or at least one Cas protein coding sequence codon optimized for expression in a human cell.
  • such a codon optimized Cas coding sequence has at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to its corresponding wild-type or originating sequence.
  • a eukaryotic cell codon optimized nucleic acid sequence encodes a Cas protein having at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to its corresponding wild-type or originating protein.
  • clones containing functionally equivalent nucleic acids may be routinely generated, such as nucleic acids which differ in sequence but which encode the same Cas protein sequence.
  • Silent mutations in the coding sequence result from the degeneracy (i.e., redundancy) of the genetic code, whereby more than one codon can encode the same amino acid residue.
  • leucine can be encoded by CTT, CTC, CTA, CTG, TTA, or TTG; serine can be encoded by TCT, TCC, TCA, TCG, AGT, or AGC; asparagine can be encoded by AAT or AAC; aspartic acid can be encoded by GAT or GAC; cysteine can be encoded by TGT or TGC; alanine can be encoded by GCT, GCC, GCA, or GCG; glutamine can be encoded by CAA or CAG; tyrosine can be encoded by TAT or TAC; and isoleucine can be encoded by ATT, ATC, or ATA. Tables showing the standard genetic code can be found in various sources (see, for example, Stryer, 1988, Biochemistry, 3rd Edition, W.H. Freeman and Co., NY).
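To illustrate the degeneracy described above, the following sketch counts and enumerates the DNA sequences that encode the same short peptide. The codon assignments are those listed above for the nine named amino acids only; the function names and example peptides are hypothetical, and the sketch does not rank codons by usage in any species.

```python
# Illustrative sketch only: enumerate synonymous encodings of a peptide.
from itertools import product
from math import prod

# Codons for the amino acids listed in the text (one-letter codes).
SYNONYMOUS_CODONS = {
    "L": ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "N": ["AAT", "AAC"],
    "D": ["GAT", "GAC"],
    "C": ["TGT", "TGC"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "Q": ["CAA", "CAG"],
    "Y": ["TAT", "TAC"],
    "I": ["ATT", "ATC", "ATA"],
}


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding the peptide."""
    return prod(len(SYNONYMOUS_CODONS[aa]) for aa in peptide)


def enumerate_encodings(peptide: str) -> list[str]:
    """All DNA sequences encoding the peptide (practical for short peptides)."""
    choices = (SYNONYMOUS_CODONS[aa] for aa in peptide)
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    print(count_encodings("LSN"))
    print(enumerate_encodings("NY"))
```

Codon optimization selects one of these equivalent sequences, which is why the encoded protein sequence is unchanged.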
  • Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
  • the hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these.
  • a hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
  • Examples of stringent hybridization conditions include: incubation temperatures of about 25°C to about 37°C; hybridization buffer concentrations of about 6x SSC to about 10x SSC; formamide concentrations of about 0% to about 25%; and wash solutions from about 4x SSC to about 8x SSC.
  • Examples of moderate hybridization conditions include: incubation temperatures of about 40°C to about 50°C; buffer concentrations of about 9x SSC to about 2x SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5x SSC to about 2x SSC.
  • high stringency conditions include: incubation temperatures of about 55°C to about 68°C; buffer concentrations of about 1x SSC to about 0.1x SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1x SSC, 0.1x SSC, or deionized water.
  • hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes.
  • SSC is 0.15 M NaCl and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed.
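As a small worked example of the SSC definition above, the following Python sketch scales the stated 1x composition (0.15 M NaCl, 15 mM citrate) to the other strengths mentioned in the hybridization conditions. The function name and the returned dictionary keys are hypothetical.

```python
# 1x SSC composition as stated in the text.
SSC_1X_NACL_M = 0.15      # mol/L NaCl
SSC_1X_CITRATE_M = 0.015  # mol/L citrate (15 mM)


def ssc_concentrations(fold):
    """Return NaCl and citrate molarity for an SSC buffer of the given strength."""
    if fold < 0:
        raise ValueError("buffer strength cannot be negative")
    return {
        "NaCl_M": fold * SSC_1X_NACL_M,
        "citrate_M": fold * SSC_1X_CITRATE_M,
    }


# Strengths that appear in the hybridization conditions listed above.
for fold in (0.1, 1, 2, 6, 10):
    conc = ssc_concentrations(fold)
    print(f"{fold}x SSC: {conc['NaCl_M']:.3f} M NaCl, "
          f"{conc['citrate_M'] * 1000:.1f} mM citrate")
```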
  • Homology or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences. An “unrelated” or “non-homologous” sequence shares less than 40% identity, or alternatively less than 25% identity, with one of the sequences of the present invention.
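The position-by-position comparison described above can be sketched in Python as follows. This is an illustrative calculation only: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps), and it divides by the number of aligned columns, which is one of several conventions in use. The function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Inputs must be pre-aligned (equal length). Columns where both sequences
    have a gap are ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical aligned sequences differing at one of ten positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Under this convention, a pair of sequences would fall below the 40% or 25% thresholds mentioned above only when fewer than that fraction of aligned columns match.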
  • a cell of the disclosure is a prokaryotic cell.
  • a cell of the disclosure is a eukaryotic cell.
  • the cell is a mammalian cell.
  • the cell is a bovine, murine, feline, equine, porcine, canine, simian, or human cell.
  • the cell is a non-human mammalian cell such as a non-human primate cell.
  • a cell of the disclosure is a somatic cell. In some embodiments, a cell of the disclosure is a germline cell. In some embodiments, a germline cell of the disclosure is not a human cell.
  • a cell of the disclosure is a stem cell.
  • a cell of the disclosure is an embryonic stem cell.
  • an embryonic stem cell of the disclosure is not a human cell.
  • a cell of the disclosure is a multipotent stem cell or a pluripotent stem cell.
  • a cell of the disclosure is an adult stem cell.
  • a cell of the disclosure is an induced pluripotent stem cell (iPSC).
  • a cell of the disclosure is a hematopoietic stem cell (HSC).
  • a somatic cell of the disclosure is a neuronal cell.
  • a cell or cells of a patient treated with compositions disclosed herein include, without limitation, central nervous system (neurons), peripheral nervous system (neurons), peripheral motor neurons, and/or sensory neurons.
  • a neuronal cell is a glial cell.
  • a somatic cell of the disclosure is a fibroblast or an epithelial cell.
  • an epithelial cell of the disclosure forms a squamous cell epithelium, a cuboidal cell epithelium, a columnar cell epithelium, a stratified cell epithelium, a pseudostratified columnar cell epithelium or a transitional cell epithelium.
  • an epithelial cell of the disclosure forms a gland including, but not limited to, a pineal gland, a thymus gland, a pituitary gland, a thyroid gland, an adrenal gland, an apocrine gland, a holocrine gland, a merocrine gland, a serous gland, a mucous gland and a sebaceous gland.
  • an epithelial cell of the disclosure contacts an outer surface of an organ including, but not limited to, a lung, a spleen, a stomach, a pancreas, a bladder, an intestine, a kidney, a gallbladder, a liver, a larynx or a pharynx.
  • an epithelial cell of the disclosure contacts an outer surface of a blood vessel or a vein.
  • a somatic cell is an ocular cell.
  • An ocular cell includes, without limitation, corneal epithelial cells, keratocytes, retinal pigment epithelial (RPE) cells, lens epithelial cells, iris pigment epithelial cells, conjunctival fibroblasts, non-pigmented ciliary epithelial cells, trabecular meshwork cells, ocular choroid fibroblasts, conjunctival epithelial cells,
  • an ocular cell is a retinal cell or a corneal cell.
  • a retinal cell is a photoreceptor cell or a retinal pigment epithelial cell.
  • a retinal cell is a ganglion cell, an amacrine cell, a bipolar cell, a horizontal cell, a Müller glial cell, a rod cell, or a cone cell.
  • a somatic cell of the disclosure is a primary cell.
  • a somatic cell of the disclosure is a cultured cell.
  • a somatic cell of the disclosure is in vivo, in vitro, ex vivo or in situ.
  • a somatic cell of the disclosure is autologous or allogeneic.
  • the disclosure provides a method of expressing an NOI in a cell using the hybrid promoters disclosed herein.
  • the disclosure provides a method of modifying an activity of a protein encoded by an RNA molecule comprising contacting the composition of the disclosure and the RNA molecule under conditions suitable for binding of one or more of the guide RNA or the RNA-binding protein or the fusion protein (or a portion thereof) to the RNA molecule.
  • the disclosure provides a method of modifying the level of expression of an RNA molecule of the disclosure or a protein encoded by the RNA molecule comprising contacting the composition of the disclosure and a cell comprising the RNA molecule under conditions suitable for binding of one or more of the guide RNA or the RNA-binding protein or fusion protein (or a portion thereof) to the RNA molecule.
  • the cell is in vivo, in vitro, ex vivo or in situ.
  • the composition of the disclosure comprises a vector comprising a guide RNA of the disclosure and an RNA-binding protein or fusion protein of the disclosure.
  • the vector is an AAV.
  • the disclosure provides a method of modifying an activity of a protein encoded by an RNA molecule comprising contacting the composition of the disclosure and a cell comprising the RNA molecule under conditions suitable for binding of one or more of the guide RNA or the RNA-binding protein or fusion protein (or a portion thereof) to the RNA molecule.
  • the disclosure provides a method of modifying the level of expression of an RNA molecule of the disclosure or a protein encoded by the RNA molecule comprising contacting the composition of the disclosure and the RNA molecule under conditions suitable for RNA nuclease activity wherein the RNA-binding protein or fusion protein induces a break in the RNA molecule.
  • the disclosure provides a method of modifying an activity of a protein encoded by an RNA molecule comprising contacting the composition of the disclosure and the RNA molecule under conditions suitable for RNA nuclease activity wherein the RNA-binding protein or fusion protein induces a break in the RNA molecule.
  • the disclosure provides a method of modifying a level of expression of an RNA molecule of the disclosure or a protein encoded by the RNA molecule comprising contacting the composition of the disclosure and a cell comprising the RNA molecule under conditions suitable for RNA nuclease activity wherein the RNA-binding protein or fusion protein induces a break in the RNA molecule.
  • the cell is in vivo, in vitro, ex vivo or in situ.
  • the composition comprises a vector comprising a composition comprising a guide RNA of the disclosure and an RNA-binding fusion protein of the disclosure.
  • the vector is an AAV.
  • the disclosure provides a method of modifying an activity of a protein encoded by an RNA molecule comprising contacting the composition and a cell comprising the RNA molecule under conditions suitable for RNA nuclease activity wherein the RNA-binding protein or fusion protein induces a break in the RNA molecule.
  • the cell is in vivo, in vitro, ex vivo or in situ.
  • the composition comprises a vector comprising a composition comprising a guide RNA or a single guide RNA of the disclosure and a nucleic acid sequence encoding an RNA-binding protein or fusion protein of the disclosure.
  • the vector is an AAV.
  • the disclosure provides a method of treating a disease or disorder comprising administering to a subject a therapeutically effective amount of a composition of the disclosure.
  • the disclosure provides a method of treating a disease in a patient in need of such treatment comprising administering to the patient a therapeutically effective amount of a composition of the disclosure, wherein the composition comprises a vector comprising a guide RNA of the disclosure and a nucleic acid sequence encoding an RNA-binding protein or an RNA-binding protein fusion protein of the disclosure, wherein the composition modifies, reduces, destroys, knocks down or ablates a level of expression of a toxic repeat RNA (compared to the level of expression of a toxic repeat RNA treated with a non-targeting (NT) control or compared to no treatment).
  • the level of reduction is 1-fold or greater.
  • the level of reduction is 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold or 10-fold. In another embodiment, the level of reduction is 10-fold or greater. In another embodiment, the level of reduction is between 10-fold and 20-fold. In another embodiment, the level of reduction is 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, or 20-fold. In another embodiment, the gene therapy compositions disclosed herein when administered to a patient lead to 20%-100% destruction of the toxic repeat RNA.
  • the % elimination of the toxic repeat RNA is any of 20%-99%, 25%-99%, 50%-99%, 80%-99%, 90%-99%, 95%-99%. In one embodiment, the % elimination is 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%. In another embodiment, % elimination is complete elimination or 100% elimination of the toxic repeat RNA.
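The fold-reduction and percent-elimination figures above can be related by simple arithmetic. The Python sketch below assumes one common convention, which the text itself does not define: an N-fold reduction means the treated level equals the control level divided by N. Under a different convention the numbers would differ, so this is illustrative only, and the function names are hypothetical.

```python
def fold_to_percent_reduction(fold):
    """Percent reduction implied by an N-fold reduction (treated = control / N)."""
    if fold < 1:
        raise ValueError("fold reduction must be at least 1 under this convention")
    return 100.0 * (1.0 - 1.0 / fold)


def percent_reduction_to_fold(percent):
    """Fold reduction implied by a percent reduction below 100%."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100); 100% has no finite fold value")
    return 100.0 / (100.0 - percent)


for fold in (2, 5, 10, 20):
    print(f"{fold}-fold reduction -> {fold_to_percent_reduction(fold):.1f}% reduction")

for percent in (50, 90, 95, 99):
    print(f"{percent}% reduction -> {percent_reduction_to_fold(percent):.0f}-fold")
```

Complete (100%) elimination has no finite fold equivalent under this convention, which is why the second function rejects it.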
  • a subject of the disclosure has been diagnosed with a disease to be treated. In some embodiments, the subject of the disclosure presents at least one sign or symptom of a disorder or disease to be treated. In some embodiments, the subject of the disclosure presents at least one sign or symptom of a disease.
  • a subject of the disclosure is female. In some embodiments of the methods of the disclosure, a subject of the disclosure is male. In some embodiments, a subject of the disclosure has two XX or XY chromosomes. In some embodiments, a subject of the disclosure has two XX or XY chromosomes and a third chromosome, either an X or a Y.
  • a subject of the disclosure is a neonate, an infant, a child, an adult, a senior adult, or an elderly adult. In some embodiments of the methods of the disclosure, a subject of the disclosure is at least 1, 2, 3, 4,
  • a subject of the disclosure is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or 12 months old. In some embodiments of the methods of the disclosure, a subject of the disclosure is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or any number of years or partial years in between of age.
  • a subject of the disclosure is a mammal. In some embodiments, a subject of the disclosure is a non-human mammal.
  • a subject of the disclosure is a human.
  • a therapeutically effective amount comprises a single dose of a composition of the disclosure. In some embodiments, a therapeutically effective amount comprises at least one dose of a composition of the disclosure. In some embodiments, a therapeutically effective amount comprises one or more dose(s) of a composition of the disclosure.
  • a therapeutically effective amount eliminates a sign or symptom of the disease or disorder. In some embodiments, a therapeutically effective amount reduces a severity of a sign or symptom of the disease or disorder.
  • a therapeutically effective amount eliminates the disease or disorder.
  • a therapeutically effective amount prevents an onset of a disease or disorder. In some embodiments, a therapeutically effective amount delays the onset of a disease or disorder. In some embodiments, a therapeutically effective amount reduces the severity of a sign or symptom of the disease or disorder. In some embodiments, a therapeutically effective amount improves a prognosis for the subject.
  • a composition of the disclosure is administered to the subject via intracerebral administration. In some embodiments, the composition of the disclosure is administered to the subject by an intrastriatal route. In some embodiments, the composition of the disclosure is administered to the subject by a stereotaxic injection or an infusion. In some embodiments, the composition is administered to the brain. In some embodiments of the methods of the disclosure, a composition of the disclosure is administered to the subject locally.
  • compositions disclosed herein are formulated as pharmaceutical compositions.
  • pharmaceutical compositions for use as disclosed herein may comprise a protein(s) or a polynucleotide encoding the protein(s), optionally comprised in an AAV, which is optionally also immune orthogonal, in combination with one or more pharmaceutically or physiologically acceptable carriers, diluents or excipients.
  • compositions may comprise buffers such as neutral buffered saline, phosphate buffered saline and the like; carbohydrates such as glucose, mannose, sucrose or dextrans, mannitol; proteins; polypeptides or amino acids such as glycine; antioxidants; chelating agents such as EDTA or glutathione; adjuvants (e.g., aluminum hydroxide); and preservatives.
  • compositions of the disclosure may be formulated for routes of administration such as, e.g., oral, enteral, topical, transdermal, intranasal, and/or inhalation; and for routes of administration via injection or infusion such as, e.g., intravenous, intramuscular, subpial, intrathecal, intraparenchymal, intrastriatal, subcutaneous, intradermal, intraperitoneal, intratumoral, intraocular, and/or parenteral administration.
  • the compositions of the present disclosure are formulated for intracerebral or intrastriatal administration.
  • Promoter variants were cloned into a dual luciferase plasmid.
  • the EFS, tCAG, or novel synthetic promoters drive expression of Firefly luciferase and an SV40 promoter drives Renilla luciferase expression as a transfection control.
  • Transfection of HEK293 cells in 96 well plates was performed with five different doses of plasmid.
  • Day 1 - HEK293 cells were seeded in six 96-well poly-D-lysine white-walled clear-bottom luciferase plates (20,000 cells per well).
  • Day 2 - Master mixes for the guide plasmids at the top dose of 30 ng/well were made: 75 ul Opti-MEM I + 1.5 ug plasmid + 3 ul P3000; and 75 ul Opti-MEM I + 2.25 ul Lipofectamine 3000.
  • P3000 and Lipofectamine 3000 master mixes for each of the plasmid sets were combined, mixed and incubated for 5 minutes.
  • Each sensor mix was serially diluted: five 1:3 dilutions were made by serially pipetting 50 ul into 100 ul Opti-MEM I (0, 0.12, 0.37, 1.1, 3.3, 10 ng/ul). 1567 ul of complete media (DMEM + 10% FBS, without pen/strep) was added to each well and mixed. Plates were washed and refed with 150 ul of complete media without pen/strep. 50 ul of transfection solution was added per well in triplicate as specified in the cell treatment plate below. Day 4 - 48 hours post construct treatment, cells were washed and used for the luciferase assay (and/or plates were frozen at -80°C) and a luciferase assay was performed.
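The serial dilution described above can be checked with a short Python sketch. It assumes a starting concentration of 10 ng/ul (1.5 ug of plasmid in the 150 ul combined master mix) and a 1:3 step produced by transferring 50 ul into 100 ul of diluent; the function name is hypothetical.

```python
def serial_dilution(top_conc, transfer_ul, diluent_ul, steps):
    """Concentrations produced by repeatedly moving transfer_ul into diluent_ul."""
    if transfer_ul <= 0 or diluent_ul < 0:
        raise ValueError("volumes must be positive")
    factor = (transfer_ul + diluent_ul) / transfer_ul  # 50 into 100 gives 3
    series = [top_conc]
    for _ in range(steps):
        series.append(series[-1] / factor)
    return series


# 1.5 ug plasmid in 75 ul + 75 ul of master mix -> 10 ng/ul starting concentration.
top_conc_ng_per_ul = 1.5 * 1000 / (75 + 75)

series = serial_dilution(top_conc_ng_per_ul, transfer_ul=50, diluent_ul=100, steps=4)
print([round(c, 2) for c in series])
```

Four 1:3 steps from 10 ng/ul reproduce the non-zero concentrations listed in the protocol (3.3, 1.1, 0.37 and 0.12 ng/ul); the 0 ng/ul entry corresponds to a no-plasmid control rather than a dilution step.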
  • Various modified EFS promoters were developed, including introns inserted in the 5’UTR after the TCT motif of the EFS core promoter (FIG. 1 and FIG. 2).
  • the expression of the screened promoters (EFS(Core) or EFS(Core)+intron) was assessed by luciferase assay.
  • Fig. 3 demonstrates that the majority of modified EFS promoters yielded reduced expression relative to EFS(Core).
  • the addition of an intron in the 5’UTR of the EFS promoter significantly boosted expression, with the UBB intron being the best performer.
  • Results of the luciferase assay show the hybrid promoter EFS(core)-UBB(intron) is approximately 3 times stronger than the EFS(core) promoter and has the capability to increase expression with increased dose (FIG. 4).
  • the addition of an enhancer sequence from CMV further elevates expression from the EFSUBB promoter.
  • Results show the hybrid promoter eCMV-EFS(core)-UBB(intron) is approximately 5 times stronger than the EFS(core) promoter, and has the capability to increase expression with increased dose (FIG.
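Because the plasmid carries Renilla luciferase under an SV40 promoter as a transfection control, promoter strength in a dual luciferase assay is typically expressed as a Firefly/Renilla ratio relative to a reference promoter. The Python sketch below shows one way such a comparison could be computed. The function names are hypothetical, and the readings are invented placeholder values, not measurements from this document.

```python
from statistics import mean


def normalized_signal(firefly, renilla):
    """Mean per-well Firefly/Renilla ratio across replicate wells."""
    if len(firefly) != len(renilla) or not firefly:
        raise ValueError("need matching, non-empty replicate readings")
    return mean(f / r for f, r in zip(firefly, renilla))


def fold_over_reference(sample_ratio, reference_ratio):
    """Strength of a test promoter relative to a reference promoter."""
    if reference_ratio <= 0:
        raise ValueError("reference ratio must be positive")
    return sample_ratio / reference_ratio


# Placeholder triplicate readings in arbitrary luminescence units (illustrative only).
reference_ratio = normalized_signal([100.0, 110.0, 90.0], [50.0, 55.0, 45.0])
test_ratio = normalized_signal([400.0, 440.0, 360.0], [50.0, 55.0, 45.0])

print(fold_over_reference(test_ratio, reference_ratio))
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency, so the fold value reflects promoter activity rather than how much plasmid entered the cells.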
  • Example 2: Expression from EFS(core)-UBB(intron) after AAV transduction
  • Materials: CHO-Lec2 cells, AAVs encoding GOIs (Rh10 serotypes) driven by tCAG or EFSUBB promoter, Gapdh and GOI antibodies, Omega Homogenizer Columns and BCA assay, iBlot, Odyssey Western Reagents, and Biorad Chemidoc
  • Methods: Day 1 - CHO-Lec2 cells were seeded in 24 well plates. Day 2 - Cells were transduced with AAV at an MOI of 10^5. Day 5 - 72 hours post transduction, cells were trypsinized, pelleted, and washed.
  • the cell pellet was lysed by SDS addition, and spun through the Omega Homogenizer Column (HCR003). Protein extract was quantified by BCA assay. Equal amounts of protein were then loaded onto the SDS-PAGE gel. After running the gel, the protein was transferred to the membrane with the iBlot system. The membrane was removed and blocked in Odyssey Blocking Buffer for 30 minutes at RT. The block was discarded and the membrane was incubated in primary antibodies overnight. Day 6 - Blot was washed 3 times with TBST and incubated in secondary antibodies at RT for 1 hour. The blot was washed 3 times with TBST, and once with TBS. Image was taken on Biorad Chemidoc. The image (Fig. 6) shows EFS(core)-UBB(intron)-driven expression of the GOI as compared to controls (tCAG driving expression of the GOI).
  • Example 3: Expression from EFS(core)-UBB(intron) after AAV transduction in vivo
  • Methodology: Stereotaxic injections targeted the mouse striatum.
  • AP +-.7mm;
  • ML +/- 2.1mm;
  • DV -3.0mm.
  • 1E10 vector genomes per hemisphere of the AAVrh10 serotype expressing PUF-targeting CAG RNA (GOI) were injected.
  • Tissue collection and processing: Transcardial perfusion with 1x PBS; whole brain collected; left hemisphere embedded in OCT for cryosectioning; right hemisphere striatum micro-dissected and divided into three separate sections along the AP axis.
  • RNAscope for PUF-CAG used a custom-made P-CAG-E17 probe.
  • RNA fluorescence in situ hybridization (RNA-FISH) was performed.
  • RNA-FISH shows expression of GOI in brain tissue using the hybrid promoter EFS(core)-UBB(intron) as compared to the control tCAG.

Abstract

Disclosed are hybrid promoter sequences of significantly reduced size for prolonged and robust levels of transgene expression, including hybrid promoters derived from operably linking shortened core promoter sequences, truncated enhancer sequences and/or modified UBB intron sequences.
PCT/US2022/024419 2021-04-12 2022-04-12 Compositions and methods comprising hybrid promoters WO2022221278A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163173810P 2021-04-12 2021-04-12
US63/173,810 2021-04-12
US202163180432P 2021-04-27 2021-04-27
US63/180,432 2021-04-27

Publications (1)

Publication Number Publication Date
WO2022221278A1 true WO2022221278A1 (fr) 2022-10-20

Family

ID=81448682

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/024419 WO2022221278A1 (fr) 2021-04-12 2022-04-12 Compositions and methods comprising hybrid promoters

Country Status (1)

Country Link
WO (1) WO2022221278A1 (fr)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008124724A1 (fr) 2007-04-09 2008-10-16 University Of Florida Research Foundation, Inc. rAAV vector compositions comprising tyrosine-modified capsid proteins and methods of use thereof
US20110097798A1 (en) * 2008-06-30 2011-04-28 Atgcell Inc. Mammalian cell expression vectors and utilization
WO2012006827A1 (fr) 2010-07-14 2012-01-19 中兴通讯股份有限公司 Mobile terminal and method for unlocking a mobile terminal
US9580714B2 (en) 2010-11-24 2017-02-28 The University Of Western Australia Peptides for the specific binding of RNA targets
WO2012068627A1 (fr) 2010-11-24 2012-05-31 The University Of Western Australia Peptides for the specific binding of RNA targets
WO2013058404A1 (fr) 2011-10-21 2013-04-25 国立大学法人九州大学 Method for designing an RNA-binding protein using the PPR motif, and use thereof
US20160238593A1 (en) 2015-01-13 2016-08-18 Massachusetts Institute Of Technology Pumilio Domain-based Modular Protein Architecture for RNA Binding
WO2018183403A1 (fr) 2017-03-28 2018-10-04 Caribou Biosciences, Inc. CRISPR-associated (Cas) protein
WO2019006471A2 (fr) 2017-06-30 2019-01-03 Arbor Biotechnologies, Inc. Novel CRISPR RNA-targeting enzymes, systems, and uses thereof
US20190062724A1 (en) 2017-08-22 2019-02-28 Salk Institute For Biological Studies Rna targeting methods and compositions
WO2019040664A1 (fr) 2017-08-22 2019-02-28 Salk Institute For Biological Studies RNA targeting methods and compositions
WO2021007529A1 (fr) * 2019-07-10 2021-01-14 Locanabio, Inc. RNA-targeting knockdown and replacement compositions and methods of use
WO2022119974A1 (fr) * 2020-12-01 2022-06-09 Locanabio, Inc. RNA-targeting compositions and methods for treating CAG repeat diseases
WO2022119979A1 (fr) * 2020-12-01 2022-06-09 Locanabio, Inc. RNA-targeting compositions and methods for treating myotonic dystrophy type 1 (DM1)

Non-Patent Citations (19)

* Cited by examiner, † Cited by third party
Title
ABIL Z, DENARD CA, ZHAO H: "Modular assembly of designer PUF proteins for specific post-transcriptional regulation of endogenous RNA", JOURNAL OF BIOLOGICAL ENGINEERING, vol. 8, no. 1, pages 7, XP021179053, DOI: 10.1186/1754-1611-8-7
CHEONG, C. G., HALL, T. M., PNAS, vol. 103, 2006, pages 13635 - 13639
DONG, S. ET AL.: "Specific and modular binding code for cytosine recognition in Pumilio/FBF (PUF) RNA-binding domains", THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 286, 2011, pages 26732 - 26742, XP055064475, DOI: 10.1074/jbc.M111.244889
DURAND ET AL., VIRUSES, vol. 3, no. 2, 2011, pages 132 - 159
FATIMA BOSCH: "ESGCT, DGGT, GSZ, and ISCT 2009 Poster Presentations", HUMAN GENE THERAPY, vol. 20, 1 November 2009 (2009-11-01), pages 1417 - 1545, XP055053121, DOI: 10.1089/hum.2009.0926 *
FILIPOVSKA A, RACKHAM O.: "Modular recognition of nucleic acids by PUF, TALE and PPR proteins", MOLECULAR BIOSYSTEMS, vol. 8, no. 3, 2012, pages 699 - 708, XP055071307, DOI: 10.1039/c2mb05392f
FILIPOVSKA A, RAZIF MF, NYGARD KK, RACKHAM O.: "A universal code for RNA recognition by PUF proteins", NATURE CHEMICAL BIOLOGY, vol. 7, no. 7, 2011, pages 425 - 427
KATARZYNA ET AL., PNAS, vol. 113, no. 19, 2016, pages E2579 - E2588
KOH YY, WANG Y, QIU C, OPPERMAN L, GROSS L, TANAKA HALL TM, WICKENS M: "Stacking Interactions in PUF-RNA Complexes", RNA, vol. 17, no. 4, 2011, pages 718 - 727
KONERMANN ET AL., CELL, vol. 173, no. 3, 2018, pages 665 - 676
ROOPASHRI HOLEHONNUR ET AL: "The production of viral vectors designed to express large and difficult to express transgenes within neurons", MOLECULAR BRAIN, BIOMED CENTRAL LTD, LONDON UK, vol. 8, no. 1, 24 February 2015 (2015-02-24), pages 12, XP021218347, ISSN: 1756-6606, DOI: 10.1186/S13041-015-0100-7 *
SHINODA K, TSUJI S, FUTAKI S, IMANISHI M: "Nested PUF Proteins: Extending Target RNA Elements for Gene Regulation", CHEMBIOCHEM, vol. 19, no. 2, 2018, pages 171 - 176
STEVEN J. GRAY ET AL: "Optimizing Promoters for Recombinant Adeno-Associated Virus-Mediated Gene Expression in the Peripheral and Central Nervous System Using Self-Complementary Vectors", HUMAN GENE THERAPY, vol. 22, no. 9, 1 September 2011 (2011-09-01), pages 1143 - 1153, XP055198141, ISSN: 1043-0342, DOI: 10.1089/hum.2010.245 *
STRYER: "Biochemistry", 1988, W.H. FREEMAN AND CO.
TRONO D.: "Lentiviral vectors", 2002, SPRINGER-VERLAG
WANG ET AL., NAT METHODS., vol. 6, no. 11, 2009, pages 825 - 830
WANG, X. ET AL., CELL, vol. 110, 2002, pages 501 - 512
YAN ET AL., MOL CELL, vol. 70, no. 2, 2018, pages 327 - 339
ZHAO Y, MAO M, ZHANG W, WANG J, LI H, YANG Y, WANG Z, WU J.: "Expanding RNA binding specificity and affinity of engineered PUF domains", NUCLEIC ACIDS RESEARCH, vol. 46, no. 9, 2018, pages 4771 - 4782

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22720210

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22720210

Country of ref document: EP

Kind code of ref document: A1