WO2020092725A1 - Gene modulation with crispr system type i - Google Patents

Gene modulation with crispr system type i

Publication number
WO2020092725A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
effector molecule
nucleic acid
cas
cell
Application number
PCT/US2019/059098
Other languages
French (fr)
Inventor
Blake A. Wiedenheft
Sarah M. VANTREESE
Original Assignee
Montana State University
Application filed by Montana State University
Publication of WO2020092725A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Definitions

  • This disclosure relates to engineered, programmable, non-naturally occurring gene modulating systems, compositions of the system, and methods of carrying out genetic editing.
  • methods, systems, and compositions are described that utilize the Type I CRISPR system.
  • CRISPRs: clustered regularly interspaced short palindromic repeats
  • CRISPR loci consist of a series of short repeats separated by non-repetitive spacer sequences, the spacer sequences of which are acquired from foreign genetic elements such as viruses and plasmids. Transcription of CRISPR loci generates a library of CRISPR-derived RNAs (crRNAs), containing sequences complementary to previously encountered invading nucleic acids.
  • CRISPR-associated (Cas) proteins bind crRNAs, and the resultant ribonucleoprotein complex targets invading nucleic acids complementary to the crRNA guide. Targeted invading nucleic acids are degraded by cis- or trans-acting nucleases.
  • The CRISPR-associated complex for antiviral defense (Cascade) is a Type I-E system composed of 11 protein subunits and a CRISPR-derived RNA (crRNA). The complex relies on complementary pairing between the crRNA guide and a target nucleic acid sequence, which occurs over 32 nucleotides, or a portion thereof.
  • Type II systems, however, rely on a single protein (Cas9) and a 20-nucleotide sequence to recognize invading DNA. Due to its relative simplicity, the Cas9 system has been used for commercial and research purposes in genetic engineering. Off-target nuclease activity has been detected, and this may limit the use of these tools for certain applications.
  • Type I systems rely on a greater number of nucleotides for target DNA recognition, and employ a locking mechanism during target binding, which may be exploited as a gene modification device with enhanced specificity in target recognition compared to Cas9 systems.
  • the complexity of the Type I CRISPR complex, the multiple reading frames, and the delivery of these systems are hurdles to the use of Type I CRISPR complexes as a viable genome editing technology.
  • compositions and methods that utilize Type I CRISPR complexes to deliver one or more effector molecules to a target nucleic acid, for example a target nucleic acid in a cell.
  • the effector molecule includes a DNA-modulating function, such as transcriptional activation, transcriptional repression, and/or base editing functions.
  • the effector molecule is a detectable label or reporter, for example, for detecting the presence and/or quantity of a nucleic acid.
  • methods and systems that employ the complexes, for example, to modulate nucleic acid expression or sequence, or to detect or quantify a nucleic acid.
  • a system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence.
  • the Type I CRISPR-Cas complex includes at least one Cas protein linked to an effector molecule (for example, directly or indirectly linked).
  • the Cas protein that is linked to the effector molecule is not a Cse1 protein (also known as Cas8).
  • the effector molecule is linked to a Cas7 protein.
  • the Cas protein is covalently linked to the effector molecule.
  • the Type I CRISPR-Cas complex in some examples is a Type I-E CRISPR-Cascade complex (for example, including Cas8, Cse2, Cas7, Cas5, and Cas6 proteins).
  • the complex includes one or more Cas7 proteins (such as 1, 2, 3, 4, 5, 6, or more Cas7 proteins) linked to an effector molecule.
  • Cas7 proteins linked to an effector molecule include SEQ ID NOs: 2, 4, 6, 8, 10, and 12, which are encoded by the nucleic acid sequences of SEQ ID NOs: 1, 3, 5, 7, 9, and 11.
  • the Type I CRISPR-Cas complex is a Type I-F CRISPR-Csy complex (e.g., including Csy1 (Cas8), Csy2 (Cas5), Csy3 (Cas7), and Csy4 (Cas6) proteins) that includes one or more Csy3 (Cas7) proteins (such as 1, 2, 3, 4, 5, 6, or more Csy3 proteins) linked to an effector molecule.
  • Cas7 refers to the protein corresponding to the Cas7 protein in Type I-E CRISPR, though the protein may be referred to in other Type I systems by other nomenclature in some examples (see, e.g., Koonin et al, Curr. Opin. Microbiol. 37:67-78, 2017 for a summary of Type I CRISPR systems and nomenclature).
  • the effector molecule is covalently linked to the N-terminus or the C-terminus of the Cas protein (such as the N-terminus or C-terminus of a Cas7 protein).
  • the effector molecule may be directly or indirectly linked (for example, via a linker) to the Cas protein.
  • the linker is an amino acid linker.
  • the linker is streptavidin and biotin.
  • One or more of the Cas polypeptides in the complex may also include a nuclear localization signal (NLS) (for example,
  • the system may optionally include a Cas3 nuclease.
  • one or more effectors are tethered to the complex via the crRNA.
  • the 3’ hairpin of the crRNA is extended to include an RNA binding motif and the effector is linked (directly or indirectly) to a protein including an RNA binding domain.
  • the RNA binding motif is an MS2 hairpin that is bound by the MS2 coat protein, which is linked to an effector.
  • the RNA binding motif is a Type I repeat and the protein is a nuclease-inactivated Cas6 family protein linked to an effector.
  • the effector molecule includes a transcriptional activator (for example, one or more VP16 activation domains), a transcriptional repressor (for example, a Kruppel associated box (KRAB) repressor domain), or a base editor (for example, a cytidine deaminase or an adenosine deaminase).
  • the effector molecule is a reporter, such as a fluorescent protein, a fluorescent dye, or a quantum dot.
  • nucleic acids encoding a Cas7 protein linked to an effector molecule.
  • the nucleic acids encode a Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 1 and 3), a Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 5 and 7), and a Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 9 and 11).
  • the Cas7-effector molecule proteins include a Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 2 and 4), a Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 6 and 8), and a Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 10 and 12).
  • the disclosure includes a vector including a nucleic acid encoding a Cas protein covalently linked to an effector molecule, where the Cas protein is not Cse1.
  • the vector includes a nucleic acid encoding a Cas7 protein linked to an effector molecule.
  • Exemplary nucleic acids included in the vector are SEQ ID NOs: 1, 3, 5, 7, 9, and 11, which encode the proteins SEQ ID NOs: 2, 4, 6, 8, 10, and 12.
  • the cells may be prokaryotic or eukaryotic cells, including animal cells, plant cells, fungal cells, algal cells, or bacterial cells.
  • the present disclosure provides methods for modulating expression (for example, increasing or decreasing expression) of a target polynucleotide in a cell, which may be in vivo, ex vivo, or in vitro.
  • the present disclosure provides methods for altering the sequence of a target polynucleotide in a cell, for example, changing one or more nucleotide, for example from C to T (or G to A on the opposite strand) or from A to G (or T to C on the opposite strand) in a target nucleic acid.
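As a minimal illustration of the strand relationship described above (not part of the disclosure), the following Python sketch applies a C-to-T change on one strand and reports the corresponding G-to-A change on the opposite strand. The sequence, the edited position, and all function names are hypothetical placeholders chosen for illustration.

```python
# Illustrative sketch only: the sequence and names are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def edit_base(seq, index, new_base):
    """Return a copy of seq with the base at index replaced."""
    return seq[:index] + new_base + seq[index + 1:]


def opposite_index(length, index):
    """Position on the opposite strand that pairs with index."""
    return length - 1 - index


top = "GATCCAGT"                        # hypothetical target strand
edited_top = edit_base(top, 3, "T")     # C -> T at index 3
bottom_before = reverse_complement(top)
bottom_after = reverse_complement(edited_top)
paired = opposite_index(len(top), 3)

# The paired base on the opposite strand changes from G to A.
print(top, "->", edited_top)
print(bottom_before[paired], "->", bottom_after[paired])
```

The same helper functions cover the A-to-G case (T to C on the opposite strand) by changing the edited base.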
  • the present disclosure provides methods for detecting presence and/or quantity of a target polynucleotide in a cell.
  • the one or more delivery vehicles including Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or guide sequence and, optionally, repair template are administered to a cell or a subject.
  • FIGS. 1A and 1B are schematics showing the type I CRISPR system in E. coli.
  • FIG. 1A illustrates the native Type IE CRISPR-Cascade operon from Escherichia coli.
  • Five of the cas genes encode proteins that assemble in an unequal stoichiometry into a multi-subunit surveillance complex called the CRISPR-associated complex for anti-viral defense (Cascade).
  • the stoichiometry of each subunit is indicated above each arrow.
  • the CRISPR locus consists of a series of 29-nt repeats (diamonds) separated by 32-nt spacer (or guide) sequences (cylinders) (left panel).
  • the right panel illustrates Cascade subunit assembly.
  • FIG. 1B illustrates the native Type IF CRISPR-Csy operon (left) and subunit assembly (right).
  • FIG. 2 is a schematic diagram illustrating an embodiment of a Type I CRISPR system including an extension of the crRNA to include stem loop structures that are bound by a protein including an RNA binding domain (left panel) and an assembled complex showing binding of RNA binding domains (RBD) linked to an effector to the stem loop structure (right panel), providing multivalent display of the effector.
  • This embodiment is illustrated with Type IE CRISPR-Cascade, but is generally applicable to Type I CRISPR systems.
  • FIGS. 3A and 3B are schematic diagrams of exemplary vectors for expression of Cascade complexes.
  • FIG. 3A is a diagram of a vector including nucleic acids encoding Cse1 with an N-terminal nuclear localization signal (NLS), Cse2 with a Strep-tag II, Cas7 with a C-terminal effector, Cas5, and Cas6 with a C-terminal NLS.
  • FIG. 3B is a diagram of a vector including nucleic acids encoding Cas7 with a Strep-tag II, Cas5, and Cas6.
  • FIGS. 4A and 4B are schematic diagrams of alternative exemplary vectors for expression of Cascade complexes.
  • FIG. 4A is a diagram of a vector including nucleic acids encoding Cse1 with an N-terminal nuclear localization signal (NLS), Cse2 with a Strep-tag II, and Cas7 including a C-terminal effector.
  • FIG. 4B is a diagram of a vector including nucleic acids encoding Cas5 and Cas6 with a C-terminal NLS.
  • FIG. 5 illustrates size exclusion chromatography of Cascade containing NLS tags on Cas8 and/or on Cas6.
  • the complex containing an NLS tag on Cas8 also contains a VP64 (transcriptional activator) tethered to the C-terminus of Cas7 (“VP64”), or an emerald green fluorescent protein (emGFP) tethered to the C-terminus of Cas7 (“GFP”).
  • nucleic acid and amino acid sequences listed herein or in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases and amino acids, as defined in 37 C.F.R. § 1.822. In at least some cases, only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand.
  • SEQ ID NOs: 1 and 2 are nucleic acid and amino acid sequences, respectively, of Cas7 with N-terminally linked VP64.
  • SEQ ID NOs: 3 and 4 are nucleic acid and amino acid sequences, respectively, of Cas7 with C-terminally linked VP64.
  • SEQ ID NOs: 5 and 6 are nucleic acid and amino acid sequences, respectively, of Cas7 with an N-terminally linked Kruppel associated box (KRAB) repressor domain.
  • SEQ ID NOs: 7 and 8 are nucleic acid and amino acid sequences, respectively, of Cas7 with a C-terminally linked KRAB repressor domain.
  • SEQ ID NOs: 9 and 10 are nucleic acid and amino acid sequences, respectively, of Cas7 with N-terminally linked emerald green fluorescent protein (emGFP).
  • SEQ ID NOs: 11 and 12 are nucleic acid and amino acid sequences, respectively, of Cas7 with C-terminally linked emGFP.
  • SEQ ID NO: 13 is the nucleic acid sequence of vector pCDF NLS-Cse1, Strep II-tag-C3 cleavage-Cse2, Cas7-linker-VP64.
  • Nucleotides 1593-1625 encode a Strep-tag II sequence
  • nucleotides 1630-1655 encode a C3 cleavage sequence
  • nucleotides 1665-2147 encode Cse2
  • nucleotides 2160-3248 encode Cas7
  • nucleotides 3273-3422 encode VP64.
  • SEQ ID NO: 14 is the nucleic acid sequence of vector pET52 StrepII-Cas7, Cas5, Cas6.
  • Nucleotides 4966-4989 encode Strep-tag II sequence
  • nucleotides 4996-5019 encode a human rhinovirus 3C (HRV3C) cleavage site
  • nucleotides 5029-6120 encode Cas7
  • nucleotides 6123-6797 encode Cas5
  • nucleotides 6784-7383 encode Cas6 (overlaps with Cas5 sequence).
  • SEQ ID NO: 15 is the nucleic acid sequence of vector pET52 Cas5-Cas6-NLS. Nucleotides 4957-5631 encode Cas5, nucleotides 5621-6217 encode Cas6, and nucleotides 6218-6238 encode an NLS.
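The nucleotide ranges listed above for SEQ ID NO: 14 can be checked with simple arithmetic. The Python sketch below is an illustration only; it assumes the coordinates are 1-based and inclusive (the usual sequence-listing convention, but not stated in this text), and the dictionary and function names are hypothetical.

```python
# Feature coordinates for SEQ ID NO: 14 as listed in the text.
# Assumption: coordinates are 1-based and inclusive.
FEATURES_SEQ14 = {
    "Strep-tag II": (4966, 4989),
    "HRV3C site": (4996, 5019),
    "Cas7": (5029, 6120),
    "Cas5": (6123, 6797),
    "Cas6": (6784, 7383),
}


def span_length(span):
    """Number of nucleotides covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Number of nucleotides shared by two inclusive spans (0 if none)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for name, span in FEATURES_SEQ14.items():
    print(name, span_length(span), "nt")

# Cas5 and Cas6 share nucleotides, matching the "overlaps" note in the text.
print(overlap(FEATURES_SEQ14["Cas5"], FEATURES_SEQ14["Cas6"]), "nt shared")
```

Under that assumption the Cas5 and Cas6 spans share 14 nucleotides (6784-6797), consistent with the statement that the Cas6 sequence overlaps the Cas5 sequence.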
  • the type IE CRISPR-mediated immune system in E. coli K12 consists of eight cas genes and one CRISPR locus. Five of the cas genes encode proteins that assemble in an unequal stoichiometry into a multi-subunit surveillance complex called the CRISPR-associated complex for anti-viral defense (Cascade) (FIG. 1A). Cascade binds double-stranded DNA targets that contain a protospacer (sequence
  • Cse1 (or casA or Cas8)
  • cse2 (or casB)
  • cse4 (or casC or Cas7)
  • cas5 (or casD)
  • cse3 (or casE)
  • the CRISPR locus consists of a series of 29-nt repeats separated by 32-nt spacer (or guide) sequences; however, the repeat and spacer length may vary depending on the system (see, e.g., Luo et al, Nucleic Acids Research 44:7385-7394, 2016).
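The repeat-spacer layout described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the repeat and spacer sequences are synthetic placeholders (not E. coli sequences), the function names are hypothetical, and the simple string split assumes the repeat never occurs inside a spacer.

```python
# Synthetic placeholder sequences with the lengths given in the text.
REPEAT = "ACGT" * 7 + "A"                    # 29 nt
SPACERS = ["T" * 32, "C" * 16 + "T" * 16]    # 32 nt each


def build_array(repeat, spacers):
    """Assemble repeat-spacer-repeat-...-repeat."""
    return repeat + "".join(spacer + repeat for spacer in spacers)


def extract_spacers(array, repeat):
    """Recover spacers by splitting on the repeat.

    Assumes the repeat does not occur within any spacer.
    """
    return [part for part in array.split(repeat) if part]


array = build_array(REPEAT, SPACERS)
recovered = extract_spacers(array, REPEAT)

print(len(recovered), "spacers recovered")
# One repeat length plus one spacer length gives 61 nt.
print(len(REPEAT) + len(recovered[0]), "nt per repeat-spacer unit")
```

One 29-nt repeat plus one 32-nt spacer totals 61 nucleotides, the same length as the 61-nucleotide crRNA mentioned elsewhere in this text for Cascade.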
  • the present disclosure provides constructs and methods that permit assembly of a Cascade (or other Type I CRISPR) complex with one or more effector molecules.
  • Type I CRISPR complexes include multiple copies of some components (such as Cas7 (6 copies) and Cse2 (2 copies) in Cascade).
  • the complexes disclosed herein can include multiple effector molecules.
  • the complexes and methods described herein can increase efficiency of gene modulation (such as modifying gene expression or base editing), or increase signal strength in the case of reporters or tags.
  • the terms “effector” or “effector molecule” refer to molecules (such as proteins or protein domains) that can “effect” a desired function.
  • the effector molecule function is a DNA-modulating function, which includes transcriptional activation, transcriptional repression, base editing, and/or double-stranded break (DSB) repair functions.
  • the effector molecule function includes cell cycle control or regulation, small molecule delivery (e.g., to the nucleus), and/or dimerization functions.
  • the effector molecule function is as a label or reporter for the presence, quantity, and/or localization of a specific nucleic acid sequence (for example presence and/or quantity of a nucleic acid of interest in a cell).
  • the term “host cell” refers to any cell that contains a heterologous nucleic acid.
  • the heterologous nucleic acid can be a vector, such as a shuttle vector or an expression vector, or linear DNA template, or in vitro transcribed RNA.
  • the host cell is able to drive the expression of genes that are encoded on the vector.
  • the host cell supports the replication and propagation of the vector.
  • Host cells can be bacterial cells such as E. coli, animal cells, such as mammalian cells (e.g., human cells or mouse cells), or plant cells. When a suitable host cell is used to create a stably integrated cell line, that cell line can be used to create a complete transgenic organism.
  • Methods for delivering vectors or other nucleic acid molecules into bacterial cells include electroporation methods and transformation of E. coli cells that have been rendered competent by previous treatment with divalent cations such as CaCl₂.
  • Methods for delivering vectors, other nucleic acids (such as RNA), or ribonucleoproteins (RNPs) into mammalian or plant cells in culture include but are not limited to calcium phosphate precipitation, electroporation, lipid-based methods (liposomes or lipoplexes) such as TransfectamineTM (Life
  • cationic polymer transfections, for example using DEAE-dextran, direct nucleic acid injection, biolistic particle injection, and viral transduction using engineered viral carriers (termed “transduction,” using, e.g., engineered herpes simplex virus, adenovirus, adeno-associated virus, vaccinia virus, Sindbis virus), and sonoporation. Additional methods of transforming or transducing cells may also be selected.
  • the term “recombinant” in reference to a nucleic acid or polypeptide indicates that the material (e.g., a recombinant nucleic acid, gene, polynucleotide, polypeptide, etc.) has been altered by human intervention.
  • the arrangement of parts of a recombinant molecule is not a native configuration, or the primary sequence of the recombinant polynucleotide or polypeptide has in some way been manipulated.
  • a naturally occurring nucleotide sequence becomes a recombinant polynucleotide if it is removed from the native location from which it originated (e.g., a chromosome), or if it is transcribed from a recombinant DNA construct, or its native sequence is modified (e.g., by insertion, deletion, and/or alteration of one or more nucleotides).
  • a naturally occurring polypeptide sequence becomes a recombinant polypeptide if it is removed from the native location from which it originated, or if its native sequence is modified (e.g., insertion, deletion, and/or alteration of one or more amino acids).
  • a gene open reading frame is a recombinant molecule if that nucleotide sequence has been removed from its natural context and cloned into any type of nucleic acid vector (even if that ORF has the same nucleotide sequence as the naturally occurring gene) or PCR template.
  • the term “recombinant cell line” refers to any cell line containing a recombinant nucleic acid, that is to say, a nucleic acid that is not native to that host cell.
  • the terms “heterologous” or “exogenous” as applied to polynucleotides or polypeptides refer to molecules that have been rearranged or artificially supplied to a biological system and may not be in a native configuration (e.g., with respect to sequence, genomic position, or arrangement of parts) or are not native to that particular biological system. These terms indicate that the relevant material originated from a source other than the naturally occurring source, or refer to molecules having a non-natural or non-native configuration, genetic location, or arrangement of parts.
  • the terms “exogenous” and “heterologous” are sometimes used interchangeably with “recombinant.”
  • the terms “non-naturally occurring gene editing complex,” “engineered non-naturally occurring gene editing complex,” “non-naturally occurring complex,” and “non-naturally occurring CRISPR-Cascade complex” refer to gene editing complexes that do not occur in nature.
  • the CRISPR associated proteins are Cascade proteins.
  • the Type I CRISPR-Cas complexes used in the described methods and systems are concatenated or partially concatenated complexes, in which a plurality of subunits of the Type I CRISPR-Cas complex are tethered to each other, or one or more of the subunits of the Type I CRISPR-Cas complex are linked or tethered to a heterologous molecule (such as an effector molecule), or the stoichiometry of the Type I CRISPR-Cas complex is modified, and/or the nucleotides in the crRNA are modified.
  • the non-natural complexes are composed of CRISPR associated proteins, one or more effector molecules, and crRNA. Thus, these non-natural complexes are composed of CRISPR associated proteins and crRNA, but these proteins and/or crRNA have been modified or are in an arrangement that does not occur in nature, and which results from the manipulation that occurs during human engineering of the complex.
  • the terms “linker,” “linkage,” “tether,” “fused,” “joined,” and derivatives thereof refer to a means to connect subunits or to a connection between subunits. Accordingly, the terms include, but are not limited to, any compound, organic, inorganic, or a hybrid organic and inorganic compound, that connects, covalently or non-covalently, two subunits. “Linker,” “linkage,” “tether,” “fused,” and “joined” and derivatives thereof may be used interchangeably herein.
  • an effector molecule can be linked to a Cas protein (such as Cas7) in a Type I CRISPR-Cas complex.
  • the term “guide sequence” refers to an RNA sequence that is part of the CRISPR complex and recognizes a target nucleic acid sequence.
  • the guide sequences are presented as DNA sequences which encode for the RNA sequences.
  • target recognition can occur through non-covalent interactions, including hydrogen bonding, recognition of a structural motif, nucleic acid sequence recognition, base pairing, and the like, or any combination thereof. In other embodiments, target recognition can occur via covalent interactions.
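Of the recognition modes listed above, base pairing between an RNA guide and a DNA target strand is the easiest to illustrate. The Python sketch below counts Watson-Crick pairs between a guide and a target of equal length. It is a simplified illustration, not the disclosed method: the sequences and function name are hypothetical, only canonical pairs are scored (no G-U wobble), and protein contributions to recognition are ignored.

```python
# RNA guide base -> DNA base it pairs with (canonical pairs only).
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_paired_positions(guide_rna, target_dna):
    """Count canonical base pairs between a guide and a target strand.

    Both sequences are written 5' to 3'. Pairing is antiparallel, so the
    target is read in reverse when aligned against the guide.
    """
    if len(guide_rna) != len(target_dna):
        raise ValueError("guide and target must have the same length")
    return sum(
        PAIR[g] == t for g, t in zip(guide_rna, reversed(target_dna))
    )


guide = "AUGC"            # hypothetical 4-nt guide for brevity
perfect_target = "GCAT"   # fully complementary target strand
mismatch_target = "GCAA"  # one mismatch opposite the first guide base

print(count_paired_positions(guide, perfect_target))
print(count_paired_positions(guide, mismatch_target))
```

For a 32-nucleotide Type I-E guide, the same function would be called with 32-nucleotide sequences; a full match returns the guide length.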
  • the term “gene” generally refers to a combination of polynucleotide elements, that when operatively linked in either a native or recombinant manner, provide some product or function.
  • the term “gene” is to be interpreted broadly, and can encompass mRNA, cDNA, cRNA, and genomic DNA forms of a gene.
  • the term “gene” encompasses the transcribed sequences, including 5’ and 3’ untranslated regions (5’-UTR and 3’-UTR), exons, and introns. In some genes, the transcribed region will contain “open reading frames” that encode polypeptides.
  • a “gene” comprises only the coding sequences (e.g., an “open reading frame” or “coding region”) necessary for encoding a polypeptide.
  • genes do not encode a polypeptide, for example, ribosomal RNA (rRNA) genes and transfer RNA (tRNA) genes.
  • the term “gene” includes not only the transcribed sequences, but in addition, also includes non-transcribed regions including upstream and downstream regulatory regions, enhancers and promoters.
  • the term “gene” encompasses mRNA, cDNA, and genomic forms of a gene.
  • the genomic form or genomic clone of a gene includes the sequences of the transcribed mRNA as well as other non-transcribed sequences that lie outside of the transcript.
  • the regulatory regions that lie outside the mRNA transcription unit are termed 5’ or 3’ flanking sequences.
  • a functional genomic form of a gene typically contains regulatory elements necessary, and sometimes sufficient, for the regulation of transcription.
  • the term “promoter” is generally used to describe a DNA region, typically but not exclusively 5’ of the site of transcription initiation, sufficient to confer accurate transcription initiation.
  • a “promoter” also includes other cis-acting regulatory elements that are necessary for strong or elevated levels of transcription, or confer inducible transcription.
  • a promoter is constitutively active, while in alternative embodiments, the promoter is conditionally active (e.g., where transcription is initiated only under certain physiological conditions).
  • the term “regulatory element” refers to any cis-acting genetic element that controls some aspect of the expression of nucleic acid sequences.
  • the term “promoter” comprises essentially the minimal sequences required to initiate transcription.
  • the term “promoter” includes the sequences to start transcription, and in addition, also includes sequences that can upregulate or downregulate transcription, commonly termed “enhancer elements” and “repressor elements,” respectively.
  • DNA regulatory elements including promoters and enhancers, generally only function within a class of organisms.
  • regulatory elements from the bacterial genome generally do not function in eukaryotic organisms.
  • regulatory elements from more closely related organisms frequently show cross functionality.
  • DNA regulatory elements from a particular mammalian organism, such as human will most often function in other mammalian species, such as the mouse.
  • a “protein subunit,” “polypeptide subunit,” or “subunit” refers to a single protein molecule that assembles or co-assembles with other protein or RNA molecules to form a protein or ribonucleoprotein (RNP) complex.
  • Some naturally occurring proteins have a relatively small number of subunits and are therefore described as oligomeric, for example hemoglobin or DNA polymerase. Others may consist of a very large number of subunits and are therefore described as multimeric, for example microtubules and other cytoskeleton proteins.
  • the subunits of a multimeric protein may be identical, homologous or totally dissimilar.
  • the CRISPR-Cascade ribonucleoprotein complex includes 11 subunits, which assemble around a crRNA.
  • the 11 protein subunits of Cascade include Cse1 (Cas8) (1 subunit), Cse2 (2 subunits), Cas7 (6 subunits), Cas5 (1 subunit), and Cas6 (1 subunit), as well as a 61-nucleotide crRNA.
  • the CRISPR-Csy ribonucleoprotein complex is a ~350-kDa ribonucleoprotein complex composed of 9 subunits of four functionally essential Cas proteins (one Csy1, one Csy2, six Csy3, and one Csy4) and a 60-nt crRNA guide. These two ribonucleoprotein complexes are examples of crRNA-guided DNA binding machines that recruit a trans-acting nuclease, Cas3, for target degradation.
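The subunit counts stated above determine how many effector molecules a single complex could display if every copy of a given subunit carried one effector. The Python sketch below tallies this; it is an illustrative calculation only, and the dictionary and function names are hypothetical.

```python
# Subunit stoichiometries as stated in the text.
CASCADE = {"Cse1": 1, "Cse2": 2, "Cas7": 6, "Cas5": 1, "Cas6": 1}
CSY = {"Csy1": 1, "Csy2": 1, "Csy3": 6, "Csy4": 1}


def total_subunits(complex_):
    """Total number of protein subunits in the complex."""
    return sum(complex_.values())


def max_effectors(complex_, tagged):
    """Upper bound on effectors per complex, assuming one effector per
    copy of each tagged subunit (an idealized assumption)."""
    return sum(complex_[name] for name in tagged)


print(total_subunits(CASCADE), "Cascade subunits")
print(total_subunits(CSY), "Csy subunits")
print(max_effectors(CASCADE, ["Cas7"]), "effectors if every Cas7 is tagged")
print(max_effectors(CASCADE, ["Cas7", "Cse2"]), "if Cas7 and Cse2 are tagged")
```

The counts reproduce the 11-subunit and 9-subunit totals given in the text and show why tagging the six-copy backbone subunit (Cas7 or Csy3) gives the highest valency from a single fusion.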
  • Vectors generally comprise parts that mediate vector propagation and manipulation (e.g., one or more origin of replication, genes imparting drug or antibiotic resistance, a multiple cloning site, operably linked promoter/enhancer elements which enable the expression of a cloned gene, etc.). Vectors are generally recombinant nucleic acid molecules, often derived from bacteriophages or plant or animal viruses. Plasmids and cosmids refer to two such recombinant vectors.
  • the term “reporter” refers generally to a moiety, chemical compound, or other component that can be used to visualize, quantitate, or identify desired components of a system of interest.
  • Reporters are commonly, but not exclusively, genes that encode reporter proteins.
  • a “reporter gene” is a gene that, when expressed in a cell, allows visualization or identification of that cell, or permits quantitation of expression of a recombinant gene.
  • a reporter gene can encode a protein, for example, an enzyme whose activity can be quantitated, for example, chloramphenicol acetyltransferase (CAT) or firefly luciferase protein.
  • Reporters also include fluorescent proteins, for example, green fluorescent protein (GFP) or any of the recombinant variants of GFP, including enhanced GFP (EGFP), blue fluorescent proteins (BFP and derivatives), cyan fluorescent protein (CFP and other derivatives), yellow fluorescent protein (YFP and other derivatives) and red fluorescent protein (RFP and other derivatives). Reporters also include non-protein molecules, including fluorescent dyes and quantum dots.
  • a system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence.
  • the Type I CRISPR-Cas complex includes at least one Cas protein covalently linked to an effector molecule.
  • the Cas protein that is covalently linked to the effector molecule is not a Cse1 protein.
  • the Type I CRISPR-Cas complex in some examples is a Type IE CRISPR-Cascade complex (for example, including Cas8, Cse2, Cas7, Cas5, and Cas6 proteins).
  • the CRISPR-Cascade complex includes one or more Cas7 proteins (such as 1, 2, 3, 4, 5, or 6 Cas7 proteins) linked to an effector molecule.
  • the CRISPR-Cascade complex includes one or more Cse2 proteins (such as 1 or 2 Cse2 proteins) linked to an effector molecule.
  • the CRISPR-Cascade complex includes a Cas5 or a Cas6 protein linked to an effector molecule.
  • Exemplary Cas7 proteins linked to an effector molecule include SEQ ID NOs: 2, 4, 6, 8, 10, and 12, which are encoded by the nucleic acid sequences of SEQ ID NOs: 1, 3, 5, 7, 9, and 11.
  • the Type I CRISPR-Cas complex is a Type IF CRISPR-Csy complex (for example, including Csy1, Csy2, Csy3, and Csy4 proteins).
  • the CRISPR-Csy complex includes one or more Csy3 proteins (such as 1, 2, 3, 4, 5, or 6 Csy3 proteins) linked to an effector molecule.
  • the CRISPR-Csy complex includes a Csyl, Csy2, or Csy4 protein linked to an effector molecule.
  • the effector molecule is covalently linked to the N-terminus or the C- terminus of the Cas protein (such as the N-terminus or C-terminus of a Cas7 protein).
  • the effector molecule may be directly (for example, without an intervening linker) or indirectly linked (for example, via a linker) to the Cas protein.
  • the C-terminus of an effector molecule is directly linked (for example, by a peptide bond) to the N-terminus of a Cas protein.
  • the C-terminus of a Cas protein is directly linked (for example by a peptide bond) to the N-terminus of an effector molecule.
  • the effector molecule is not a protein (for example, is a fluorescent dye or quantum dot).
  • the effector may be directly linked to the Cas protein by a non-peptide bond (such as a thiol or amine bond).
  • a non-protein effector molecule may be linked to a Cas protein at any location, and is not limited to linkage at the N- or C-terminus.
  • the effector and Cas proteins are joined by a linker.
  • the linker is an amino acid linker (such as 1-100 amino acids, such as 1-20 amino acids, 10-30 amino acids, 20-40 amino acids, 30-50 amino acids, 40-60 amino acids, 50-70 amino acids, 60-80 amino acids, 70-90 amino acids, or 80-100 amino acids).
  • the amino acid linker may include 2 or more repeats of a particular linker sequence.
  • the linker is a cross-linker (such as a maleimide or succinimide linker) or other type of linker, such as a carbon chain. Exemplary crosslinking agents and techniques are described in Crosslinking Technology (Thermo Scientific, available at thermofisher.com/content/sfs/brochures/1602163-Crosslinking-Reagents-Handbook.pdf).
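The fusion arrangements described above (effector at the N- or C-terminus of the Cas protein, joined directly or through an amino acid linker of 1-100 residues that may contain repeated units) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_fusion`, the placeholder sequences, and the `GGGGS` linker unit are hypothetical and are not taken from the disclosure.

```python
def build_fusion(cas_seq, effector_seq, linker_unit="", repeats=0,
                 effector_at_n_terminus=True):
    """Return a fusion protein sequence in one-letter amino acid codes.

    An empty linker models a direct peptide-bond fusion; otherwise the
    linker is `linker_unit` repeated `repeats` times.
    """
    linker = linker_unit * repeats
    # The text describes amino acid linkers of 1-100 residues.
    if len(linker) > 100:
        raise ValueError("linker longer than 100 amino acids")
    if effector_at_n_terminus:
        # C-terminus of the effector joins the N-terminus of the Cas protein.
        return effector_seq + linker + cas_seq
    # C-terminus of the Cas protein joins the N-terminus of the effector.
    return cas_seq + linker + effector_seq


# Placeholder strings, not real protein sequences.
print(build_fusion("CCCC", "EEEE", linker_unit="GGGGS", repeats=2))
print(build_fusion("CCCC", "EEEE", effector_at_n_terminus=False))
```

Keeping the linker as a repeat count of a single unit mirrors the statement that the linker may include two or more repeats of a particular linker sequence.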
  • the effector molecule is linked to a Cas protein by a streptavidin-biotin linker.
  • the C-terminus of streptavidin is directly linked (for example, by a peptide bond) to the N-terminus of a Cas protein.
  • the C-terminus of a Cas protein is directly linked (for example by a peptide bond) to the N-terminus of streptavidin.
  • the Cas protein can also be indirectly linked to streptavidin, for example by a linker, as discussed above with respect to effector molecules.
  • the effector molecule is linked (directly or indirectly) to biotin and the Cas protein and the effector molecule are linked by the interaction of streptavidin and biotin.
  • one or more effectors are associated with the complex via the crRNA.
  • the 3’ hairpin of the crRNA is extended to include one or more (such as 1, 2, 3, 4, or more) RNA binding motifs and the effector is linked (directly or indirectly) to a protein including an RNA binding domain.
  • An exemplary embodiment is illustrated schematically in FIG. 2.
  • the RNA binding motif is an MS2 hairpin that is bound by the MS2 coat protein, which is linked to an effector.
  • the RNA binding motif is a type I repeat and the protein is a nuclease-inactivated Cas6 family protein linked to an effector.
  • One or more of the Cas polypeptides in the complex may also include a nuclear localization signal (NLS).
  • the system may optionally include a Cas3 nuclease.
  • the effector molecule includes a transcriptional activator (for example, one or more VP16 activation domains), a transcriptional repressor (for example, a Kruppel associated box (KRAB) repressor domain), or a base editor (for example, a cytidine deaminase or an adenosine deaminase).
  • the effector molecule is a reporter, such as a fluorescent protein, a fluorescent dye, or a quantum dot.
  • an effector molecule includes an oligonucleotide or peptide. Effector molecules are discussed in more detail in Section IV, below.
  • nucleic acids encoding a Cas7 protein linked to an effector molecule.
  • the nucleic acids encode a Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 1 and 3), a Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 5 and 7), and a Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 9 and 11).
  • the Cas7-effector molecule proteins include Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 2 and 4), Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 6 and 8), and Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 10 and 12).
  • the disclosure includes a vector including a nucleic acid encoding a Cas protein covalently linked to an effector molecule, where the Cas protein is not Cse1.
  • the vector includes a nucleic acid encoding a Cas7 protein linked to an effector molecule.
  • Exemplary nucleic acids included in the vector are SEQ ID NOs: 1, 3, 5, 7, 9, and 11, which encode the proteins SEQ ID NOs:
  • a vector includes a nucleic acid encoding a Cas7 protein linked to an effector molecule and one or more additional Type I CRISPR complex proteins, such as Cse1, Cse2, Cas5, and/or Cas6. Exemplary vectors include SEQ ID NOs: 13-15.
  • the cells may be prokaryotic or eukaryotic cells, including animal cells, plant cells, fungal cells, algal cells, or bacterial cells.
  • methods that include contacting a nucleic acid in a cell (such as genomic DNA) with a system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence.
  • the Type I CRISPR-Cas complex includes at least one Cas protein covalently linked to an effector molecule.
  • the Cas protein that is covalently linked to the effector molecule is not a Cse1 protein.
  • the method includes altering (for example, increasing or decreasing) expression of a nucleic acid in the cell or modifying a target nucleic acid sequence in the cell in vitro, ex vivo, or in vivo.
  • the method includes detecting and/or quantifying a target nucleic acid in a cell in vitro, ex vivo, or in vivo.
  • Although the modified systems are illustrated herein in the context of CRISPR-Cascade, the use of other similarly modified Type I CRISPR-Cas systems, including but not limited to Type IF CRISPR-Csy systems, is also contemplated herein.
  • an effector molecule may be linked to one or more Cas7 (Csy3) proteins in a CRISPR-Csy complex. It is expected that the CRISPR-Cas nucleic acid systems described herein are equally applicable for any Type I CRISPR-Cas protein complexes.
  • constructs that include an effector molecule tethered or linked to a Cascade component.
  • the Cas-effector component may be present or expressed as an individual protein or as part of a larger Cascade complex.
  • While the examples provided herein are in the context of Type IE CRISPR-Cascade and Cas7, it is contemplated that other Cascade proteins and/or any other Type I CRISPR system could be similarly adapted.
  • the constructs include VP64 linked to Cas7.
  • Exemplary VP64-Cas7 nucleic acid and amino acid sequences include or consist of any one of SEQ ID NOs: 1-4. In SEQ ID NO:
  • amino acids 1-53 are VP64, amino acids 54-65 are the linker, and amino acids 66-427 are Cas7.
  • amino acids 1-363 are Cas7, amino acids 364-371 are the linker, and amino acids 372-421 are VP64.
  • the constructs include a Kruppel associated box (KRAB) repressor domain linked to Cas7.
  • KRAB-Cas7 nucleic acid and amino acid sequences include or consist of any one of SEQ ID NOs: 5-8.
  • In SEQ ID NO: 5, nucleotides 1-198 encode KRAB, nucleotides 199-234 encode a linker, and nucleotides 235-1323 encode Cas7.
  • amino acids 1-66 are KRAB, 67-78 are the linker, and amino acids 79-440 are Cas7.
  • nucleotides 1-1089 encode Cas7, nucleotides 1090-1113 encode a linker, and nucleotides 1114-1311 encode KRAB.
  • amino acids 1-363 are Cas7, amino acids 364-371 are the linker, and amino acids 372-436 are KRAB.
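The nucleotide and amino acid coordinates listed for the KRAB-Cas7 constructs are related through the three-nucleotide codon. The following sketch shows the conversion for the SEQ ID NO: 5 coordinates quoted above. It assumes that the reading frame starts at nucleotide 1 and that ranges are 1-based and inclusive; the function name `nt_range_to_aa_range` is hypothetical.

```python
def nt_range_to_aa_range(nt_start, nt_end):
    """Map a 1-based, inclusive, in-frame nucleotide range to codon numbers."""
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range must start and end on codon boundaries")
    return ((nt_start - 1) // 3 + 1, nt_end // 3)


# Nucleotide coordinates quoted in the text for SEQ ID NO: 5.
segments = {"KRAB": (1, 198), "linker": (199, 234), "Cas7": (235, 1323)}
for name, (start, end) in segments.items():
    print(name, nt_range_to_aa_range(start, end))
```

The KRAB and linker segments map to codons 1-66 and 67-78, matching the amino acid coordinates in the text. The Cas7 segment maps to codons 79-441, one more than the stated 79-440; a plausible reading, which is an inference rather than something the text states, is that the final codon is a stop codon and is not counted as an amino acid.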
  • the constructs include emerald green fluorescent protein (emGFP) linked to Cas7.
  • emGFP-Cas7 nucleic acid and amino acid sequences include or consist of any one of SEQ ID NOs: 9-12.
  • nucleotides 1-717 encode emGFP and nucleotides 718-1806 encode Cas7.
  • amino acids 1-239 are emGFP and amino acids 240-601 are Cas7.
  • nucleotides 1-1089 encode Cas7, nucleotides 1090-1104 encode a linker, and nucleotides 1105-1821 encode emGFP.
  • amino acids 1-390 are Cas7, amino acids 391-396 are the linker, and amino acids 397-606 are emGFP.
  • the effector molecule-Cas7 nucleic acids have a nucleic acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • the effector molecule-Cas7 nucleic acids encode an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • the effector molecule-Cas7 protein has an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
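The percent-identity thresholds above can be illustrated with a small calculation. This section does not state how identity is computed, so the sketch below makes its own assumptions: the two sequences are already aligned to equal length, `-` marks a gap, and identity is the number of matching columns divided by the number of alignment columns. Other definitions (for example, dividing by the shorter sequence length) give different values. The names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a pairwise alignment of equal length.

    Columns where both sequences have a gap are ignored; a gap paired with
    a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-residue sequences that differ at a single position are 90% identical under this definition, so they would satisfy the 80%, 85%, and 90% thresholds but not the 95% threshold.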
  • compositions and methods include one or more effector molecules.
  • effector molecules have one or more DNA-modulating activities, such as proteins (or protein domains) that can modify expression or sequence of a target nucleic acid.
  • effector molecules include agents (such as a reporter) that can be detected, for example, to visualize or identify a cell including a target nucleic acid or that can be used to quantitate expression of a target nucleic acid.
  • effector molecules include proteins or protein domains that can modify expression of a target nucleic acid, such as transcriptional activators or repressors.
  • Transcriptional activators are molecules that increase gene expression. Transcriptional activators typically include DNA binding and transcriptional activation functions, which can be included in a single protein or separate proteins.
  • the DNA binding function of the transcriptional activator is provided by the Cascade complex and the transcription activation function is provided by the effector molecule (such as a transcription activator domain) tethered to a Cascade subunit.
  • An exemplary transcriptional activator disclosed herein is the VP16 activation domain from herpes simplex virus.
  • a multimer of four VP16 molecules is the effector molecule.
  • Other VP16 activation multimers can also be used, for example VP160, which includes 10 VP16 molecules.
  • a transcriptional activator includes a p65 activation domain, an RNA polymerase omega subunit, human heat shock factor 1, a viral RTA activation domain, a heat shock factor 1 (HSF1) activation domain, or a sigma factor (such as σ70 (RpoD)).
  • An additional alternative VP16-containing activator is VPR, which includes VP64, a p65 activation domain, and an RTA activation domain. Additional fusions, for example with a p65 activation domain or HSF1 activation domain, may also be used. Additional transcriptional activators or transcriptional activation domains can also be selected.
  • Transcriptional repressors are molecules that decrease or inhibit gene expression.
  • An exemplary transcriptional repressor is the Kruppel associated box (KRAB) repressor domain.
  • Other transcriptional repressors include REST, thyroid hormone receptors α and β, and repressor domains derived from Egr-1, Oct2A, and Dr1 (see, e.g., Thiel et al., Biological Chem. 382:891-902, 2001). Additional transcriptional repressors or transcriptional repressor domains can also be selected.
  • effector molecules include proteins that can modify a nucleic acid sequence, such as a protein with base editing activity.
  • Base editing is a technique that permits direct conversion of a specific nucleotide (for example, present in genomic DNA) to another nucleotide.
  • Exemplary base editors include cytidine deaminases (e.g., APOBEC1, AID, CDA1, and APOBEC3G) and adenosine deaminases (e.g., TadA) or variants thereof.
  • a uracil glycosylase inhibitor (UGI) is the effector.
  • UGI may also be included as an effector in combination with a base editor effector or as a fusion protein with a base editor (such as cytidine deaminase).
  • Base editors include those described in Komor et al. (Sci. Adv. 3:eaao4774, 2017), Gaudelli et al. (Nature 551:464-471, 2017) and Nishida et al. (Science 353:aaf8729, 2016).
  • the effector molecule is a reporter or detectable label.
  • the reporter can be used to visualize or localize a target nucleic acid, for example in a sample, cell, tissue, or organism.
  • a reporter can also be used to quantify (for example, quantitatively or semi-quantitatively) an amount of a target nucleic acid in a sample, cell, tissue, or organ.
  • Reporter effector molecules include fluorescent proteins, such as green fluorescent protein (GFP) or any of the recombinant variants of GFP, including enhanced GFP (EGFP) or emerald GFP (emGFP), blue fluorescent proteins (BFP and derivatives), cyan fluorescent protein (CFP and other derivatives), yellow fluorescent protein (YFP and other derivatives) and red fluorescent protein (RFP and other derivatives).
  • Other detectable agents include but are not limited to fluorescent dyes or quantum dots.
  • the reporter includes a Halo tag conjugated to a fluorescent dye (see, e.g., Deng et al., PNAS 112:11870-11875, 2015).
  • Exemplary fluorescent dyes include Alexa Fluor dye
  • multiple copies of an effector molecule can be recruited to the complex by linking a SunTag scaffold containing 10-24 copies of the short epitope GCN4 to the Cas polypeptide.
  • GCN4 recruits an effector molecule fused to the cognate scFv antibody, which is expressed from a separate plasmid.
  • This system amplifies the number of effector molecules that can be included in the complex, for example, increasing the intensity of the fluorescent signal in the case of a fluorescent protein effector molecule. See, e.g., Tanenbaum et al., Cell 159:635-646, 2014.
  • the effector is a component of the non-homologous end joining (NHEJ) repair pathway (e.g., LIG4, Ku70/80, or DNA-PKcs) or homology-dependent repair pathway (e.g., CtIP, Exo1, or RAD51).
  • the effector provides tethering of a single-stranded DNA oligo template for homology directed repair (HDR) to the CRISPR RNP for delivery to the site of a DNA break.
  • a DNA binding domain such as a FokI DNA binding domain or zinc finger protein DNA binding domain or TALE DNA binding domain is linked to a Cas protein.
  • the DNA binding domain of the FokI restriction enzyme is linked to a Cascade protein (such as Cas7 or Csy3), to allow for delivery of a donor DNA to the site of a DNA break (double-stranded or nicked) for increased homology-directed repair.
  • a ssODN donor would include a short double-stranded "handle" for binding to the FokI-Cascade fusion and targeted to a specific locus.
  • the effector provides cell cycle control, for example, using an N-terminal fragment of geminin linked to a Cas protein (such as Cas7) for expression during cell cycles when HDR is active.
  • the effector provides chemical control of activity requiring small-molecule ligands for protein stabilization or delivery to the nucleus.
  • the effector is dihydrofolate reductase (DHFR) or a DHFR-derived destabilization domain which can be regulated with addition of trimethoprim.
  • the effector is a destabilizing domain of the estrogen receptor (such as ER50), which can be regulated by CMP8 or 4-hydroxytamoxifen.
  • the effector includes the hormone binding domain of the estrogen receptor (ERT2), which can be regulated with addition of estrogen or an estrogen agonist (such as tamoxifen or 4-hydroxytamoxifen).
  • the effector is CRY2/CIB1 for photoinduced dimerization.
  • nucleic acid modulating systems including the Type I CRISPR-Cas complex systems including at least one linked effector molecule described herein
  • Representative delivery systems herein disclose methods and compositions containing viral and/or non-viral vectors to deliver nucleic acid editing systems, particularly, Type I CRISPR-Cas complex systems including at least one covalently linked effector molecule, and optionally an editing template to edit genes in cells. While gene editing is particularly useful in vivo, in some embodiments, the cell targeted for gene editing may be in vitro, ex vivo, or in vivo.
  • viral vectors or plasmids for gene expression can be used to deliver the Type I CRISPR complexes including at least one linked effector molecule disclosed herein.
  • nucleic acids encoding Type I CRISPR complex proteins, including at least one Cas protein linked to an effector molecule are delivered to a cell in one or more vectors, for example by transformation, transfection, or other known methods of delivery to cells.
  • virus-like particles (VLP) can be used to encapsulate ribonucleoprotein complexes.
  • recombinant expression can be used, and the ribonucleoprotein complexes disclosed herein can be purified and delivered to cells via electroporation or injection.
  • Delivery vehicles may be viral vectors or non-viral vectors, or RNA conjugates.
  • the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in one or more viral vectors or non-viral vectors (such as 1, 2, 3, or more vectors).
  • the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in a single vector.
  • the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in two vectors.
  • the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in three vectors.
  • the components of the CRISPR-Cas complex may be expressed such that one or more linked proteins are produced (for example, linked by an amino acid sequence) or may be expressed as individual open reading frames (for example, using a vector that results in production of polycistronic RNA).
  • Cas subunits can be included in a single vector or multiple vectors.
  • the nucleic acids encoding Type I CRISPR system proteins are included in two or more vectors, which in some examples can provide improved complex stability upon expression in a cell.
  • a first vector includes nucleic acids encoding one or more Type I CRISPR proteins and a second vector includes nucleic acids encoding one or more Type I CRISPR proteins.
  • One or more of the CRISPR proteins encoded by the first and second vector may be the same and/or one or more of the CRISPR proteins encoded by the first and second vector may be different.
  • a first vector includes nucleic acids encoding Cse1 (Cas8), Cse2, Cas7 linked to an effector, Cas5, and Cas6 and a second vector includes nucleic acids encoding Cas7, Cas5, and Cas6 (e.g., FIGS. 3A and 3B).
  • a first vector includes nucleic acids encoding Cse1 (Cas8), Cse2, and Cas7 linked to an effector and a second vector includes nucleic acids encoding Cas5 and Cas6 (e.g., FIGS. 4A and 4B).
  • Other combinations of protein expression vectors can be used, and can be tested, for example based on relative quantities of each subunit expressed.
  • Exemplary vectors include those provided herein as SEQ ID NOs: 13-15.
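The two-vector arrangements described above can be written down as simple data and checked for completeness. The sketch below is illustrative only: the subunit list follows the Type IE Cascade components named in the text, while the `"Cas7-effector"` naming convention and the function `check_vector_set` are assumptions made for this example, not part of the disclosure.

```python
# Subunits of the Type IE Cascade complex as named in the text.
REQUIRED_SUBUNITS = {"Cse1", "Cse2", "Cas7", "Cas5", "Cas6"}


def check_vector_set(vectors):
    """Return (missing_subunits, has_effector_fusion) for a set of vectors.

    `vectors` is a list of lists of open reading frame names; a name ending
    in "-effector" denotes a subunit linked to an effector molecule.
    """
    encoded = set()
    has_fusion = False
    for orfs in vectors:
        for orf in orfs:
            base, _, tag = orf.partition("-")
            encoded.add(base)
            if tag == "effector":
                has_fusion = True
    return REQUIRED_SUBUNITS - encoded, has_fusion


# Arrangement described with reference to FIGS. 3A and 3B.
layout_fig3 = [["Cse1", "Cse2", "Cas7-effector", "Cas5", "Cas6"],
               ["Cas7", "Cas5", "Cas6"]]
# Arrangement described with reference to FIGS. 4A and 4B.
layout_fig4 = [["Cse1", "Cse2", "Cas7-effector"], ["Cas5", "Cas6"]]

for layout in (layout_fig3, layout_fig4):
    print(check_vector_set(layout))
```

Both arrangements encode every subunit and include a Cas7-effector fusion. They differ in that the first also supplies unmodified Cas7, Cas5, and Cas6 from the second vector, which would change the relative quantities of each subunit expressed.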
  • the vector(s) may further include components for targeting the Cas subunits to the nucleus, such as one or more nuclear localization signals (NLS) linked to one or more of the Cas subunits.
  • an NLS is linked to Cse1 (Cas8), for example, linked to the N-terminus of Cse1.
  • an NLS is linked to Cas6, for example, linked to the C-terminus of Cas6.
  • An NLS can alternatively be linked to one or more other proteins in the complex and/or to the effector molecule or between the effector molecule and Cas7.
  • the vector may also include one or more tags for protein purification, such as streptavidin (e.g., Strep-tag II), maltose binding protein (MBP), 6xHistidine (6xHis), small ubiquitin like modifier (SUMO), or glutathione S transferase (GST), or a combination of two or more thereof (e.g., His-MBP).
  • the purification tag can be linked to the N- or C-terminus of any subunit of the complex.
  • streptavidin is linked to the N-terminus of Cse2. See, e.g., Brouns et al., Science 321:960-964, 2008.
  • the guide sequence and the CRISPR-Cas complex with at least one linked effector molecule are provided in the same type of delivery vehicle, wherein the delivery vehicle is a viral vector or a non-viral vector.
  • the guide sequence is provided in a viral vector, and the CRISPR-Cas complex with at least one linked effector molecule is provided in non-viral vector(s).
  • the one or more guide sequences are provided in a non-viral vector and the CRISPR-Cas complex with at least one linked effector molecule is provided in viral vector(s).
  • the guide sequence is provided in an RNA conjugate.
  • Any vector system may be used, including, but not limited to, plasmid vectors, linear constructs, retroviral vectors, lentiviral vectors, adenovirus vectors, poxvirus vectors, herpesvirus vectors, and adeno-associated virus vectors, etc. See, also, U.S. Pat. Nos. 6,534,261; 6,607,882; 6,824,978; 6,933,113;
  • any of these vectors may comprise one or more CRISPR-Cas encoding sequences and/or additional nucleic acids as appropriate.
  • When CRISPR-Cas proteins and/or guide sequences as described herein, and additional DNAs as appropriate, are introduced into the cell, they may be carried on the same vector or on different vectors.
  • each vector may comprise a sequence encoding one or multiple components of the Type I CRISPR-Cas complexes, as desired.
  • Exemplary bacterial vectors for expression of the components are shown in FIGS. 3A-3B and 4A-4B and include SEQ ID NOs: 13-15.
  • nucleic acids encoding engineered Type I CRISPR-Cas complexes including at least one linked effector molecule into cells (e.g., bacterial, animal, plant, fungal, or algal cells) and target tissues and to co-introduce additional nucleotide sequences if desired.
  • Such methods can also be used to administer nucleic acids (e.g., encoding CRISPR-Cas complexes including at least one linked effector molecule or components thereof) to cells in vitro.
  • nucleic acids are administered for in vivo or ex vivo gene therapy uses.
  • Non-viral vector delivery systems include DNA plasmids, naked nucleic acid, and nucleic acid complexed with a delivery vehicle such as a liposome or polymer.
  • Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell.
  • the viral vector is selected from an adeno-associated virus (AAV), adenovirus, retrovirus, and lentivirus vector. While the viral vector may deliver any component of the systems described herein so long as it provides the desired profile for tissue presence or expression, in some embodiments the viral vector provides for expression of the guide sequence and optionally delivers a repair template. In some embodiments, the viral delivery system is adeno-associated virus (AAV) 2/8. However, in various embodiments other AAV serotypes are used, such as AAV1, AAV2, AAV4, AAV5, AAV6, and AAV8.
  • AAV6 is used when targeting airway epithelial cells
  • AAV7 is used when targeting skeletal muscle cells (similarly for AAV1 and AAV5)
  • AAV8 is used for hepatocytes.
  • AAV1 and AAV5 can be used for delivery to vascular endothelial cells.
  • most AAV serotypes show neuronal tropism, while AAV5 also transduces astrocytes.
  • hybrid AAV vectors are employed.
  • each serotype is administered only once to avoid immunogenicity. Thus, subsequent administrations employ different AAV serotypes. Additional viral vectors that can be employed are as described in US 8,697,359, which is hereby incorporated by reference in its entirety.
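The serotype choices and the once-per-serotype rule described above can be captured as a lookup table plus a small selection function. This is a sketch under stated assumptions: the target names and serotype assignments are taken from the text, but the ordering within each list is arbitrary rather than a ranking, and `SEROTYPES_BY_TARGET` and `next_serotype` are hypothetical names.

```python
# Serotype suggestions per target cell type, as listed in the text.
# The order within each list is arbitrary and does not imply preference.
SEROTYPES_BY_TARGET = {
    "airway epithelial cells": ["AAV6"],
    "skeletal muscle cells": ["AAV7", "AAV1", "AAV5"],
    "hepatocytes": ["AAV8"],
    "vascular endothelial cells": ["AAV1", "AAV5"],
    "astrocytes": ["AAV5"],
}


def next_serotype(target, already_used=()):
    """Return a listed serotype for `target` that has not been administered.

    Returns None when every listed serotype has already been used, since
    each serotype is administered only once to avoid immunogenicity.
    """
    for serotype in SEROTYPES_BY_TARGET.get(target, []):
        if serotype not in already_used:
            return serotype
    return None


print(next_serotype("skeletal muscle cells"))
print(next_serotype("skeletal muscle cells", already_used={"AAV7"}))
print(next_serotype("hepatocytes", already_used={"AAV8"}))
```

A `None` result signals that a repeat administration for that target would need a serotype outside the listed options, such as a hybrid AAV vector.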
  • the delivery system comprises a non-viral delivery vehicle.
  • the non-viral delivery vehicle is lipid-based.
  • the non-viral delivery vehicle is a polymer.
  • the non-viral delivery vehicle is biodegradable.
  • the non-viral delivery vehicle is a lipid encapsulation system and/or polymeric particle.
  • Methods of non-viral delivery of nucleic acids include electroporation, nucleofection, lipofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, mRNA, artificial virions, and agent-enhanced uptake of DNA.
  • Sonoporation using, e.g., the Sonitron 2000 system (Rich-Mar) can also be used for delivery of nucleic acids.
  • one or more nucleic acids are delivered as mRNA.
  • Use of capped mRNAs to increase translational efficiency and/or mRNA stability is included in some embodiments.
  • ARCA (anti-reverse cap analog) caps or variants thereof are used. See U.S. Pat. Nos. 7,074,596 and 8,153,773, incorporated by reference herein.
  • nucleic acid delivery systems include those provided by Lonza (Cologne, Germany), Amaxa Biosystems (Cologne, Germany), Maxcyte, Inc. (Rockville, Md.), BTX Molecular Delivery Systems (Holliston, Mass.) and Copernicus Therapeutics, Inc., (see for example U.S. Pat. No. 6,008,336).
  • Lipofection is described in, e.g., U.S. Pat. No. 5,049,386, U.S. Pat. No. 4,946,787, and U.S. Pat. No. 4,897,355, and lipofection reagents are sold commercially (e.g., TRANSFECTAM™,
  • Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424 and WO 91/16024. Delivery can be to cells (ex vivo administration) or target tissues (in vivo administration).
  • the delivery system includes lipid particles as described in Kanasty (Nat Mater. 12(11):967-77, 2013), which is hereby incorporated by reference.
  • the lipid-based vector is a lipid nanoparticle, which is a lipid particle between about 1 and about 100 nanometers in size.
  • the lipid-based vector is a lipid or liposome.
  • Liposomes are artificial spherical vesicles comprising a lipid bilayer.
  • the lipid-based vector is a small nucleic acid-lipid particle (SNALP).
  • SNALPs are small (less than 200 nm in diameter) lipid-based nanoparticles that encapsulate a nucleic acid.
  • the SNALP is useful for delivery of an RNA molecule such as crRNA.
  • SNALP formulations deliver nucleic acids to a particular tissue in a subject, such as the liver.
  • the guide sequence and/or Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof (or the RNA encoding the same) is delivered via polymeric vectors.
  • the polymeric vector is a polymer or polymersome.
  • Polymers encompass any long repeating chain of monomers and include, for example, linear polymers, branched polymers, dendrimers, and polysaccharides. Linear polymers include a single line of monomers, whereas branched polymers include side chains of monomers. Dendrimers are also branched molecules, which are arranged symmetrically around the core of the molecule.
  • Polysaccharides are polymeric carbohydrate molecules made up of long chains of monosaccharide units linked together.
  • Polymersomes are artificial vesicles made up of synthetic amphiphilic copolymers that form a vesicle membrane, and may have a hollow or aqueous core within the vesicle membrane.
  • RNA encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof.
  • Exemplary polymeric materials include poly(D,L-lactic acid-co-glycolic acid) (PLGA), poly(caprolactone) (PCL), ethylene vinyl acetate polymer (EVA), poly(lactic acid) (PLA), poly(L-lactic acid) (PLLA), poly(glycolic acid) (PGA), poly(L-lactic acid-co-glycolic acid) (PLLGA), poly(D,L-lactide) (PDLA), poly(L-lactide) (PLLA), PLGA-b-poly(ethylene glycol)-PLGA (PLGA-bPEG-PLGA), PLLA-bPEG-PLLA, PLGA-PEG-maleimide (PLGA-PEG-mal), poly(D,L-lactide-co-caprolactone), poly(D,L-lactide-co-caprolactone-co-glycolide), poly(D,L-lactide-co-PEO-co-D,L-lactide), poly(D,
  • hydroxypropylcellulose, carboxymethylcellulose
  • polymers of acrylic acids such as
  • poly(β-amino esters) (PBAE)
  • Polymer-based systems may also include cyclodextrin polymer (CDP)-based nanoparticles such as, for example, CDP-adamantane (AD)-PEG conjugates and CDP-AD-PEG-transferrin conjugates.
  • Exemplary polymeric particle systems for delivery of substances, including nucleic acids, include those described in US 5,543,158, US 6,007,845, US 6,254,890, US 6,998,115, US 7,727,969, US 7,427,394, US 8,323,698, US 8,071,082, US 8,105,652, US 2008/0268063, US 2009/0298710, US 2010/0303723, US 2011/0027172, US 2011/0065807, US 2012/0156135, US 2014/0093575, WO 2013/090861, each of which is hereby incorporated by reference in its entirety.
  • the delivery system is a layer-by-layer particle system including two or more layers.
  • the guide RNA and the nucleic acid(s) encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof are present in different layers within the layer-by-layer particle.
  • the guide RNA and the nucleic acid(s) encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof may be administered to a subject in a layer-by-layer particle system such that the release of the guide RNA and the nucleic acid(s) encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof from the particles can be controlled in a cell-specific and/or temporal fashion.
  • Layer-by-layer particle systems are disclosed, for example, in US 2014/0093575, incorporated herein by reference in its entirety.
  • the lipid-based delivery system includes a lipid encapsulation system.
  • the lipid encapsulation system can be designed to drive the desired tissue distribution and cellular entry properties, as well as to provide the requisite circulation time and biodegrading character.
  • the lipid encapsulation may involve reverse micelles and/or further comprise polymeric matrices, for example as described in US 8,193,334, which is hereby incorporated by reference.
  • the particle includes a lipophilic delivery compound to enhance delivery of the particle to tissues, including in a preferential manner. Such compounds are disclosed in US 2013/0158021, which is hereby incorporated by reference in its entirety.
  • Such compounds may generally include lipophilic groups and conjugated amino acids or peptides, including linear or cyclic peptides, and including isomers thereof.
  • An exemplary compound is referred to as cKK-E12, which can affect delivery to liver and kidney cells, for example.
  • the present disclosure can employ compounds of formulas (I), (II), (III), (IV), (V), and (VI) of US
  • Compounds can be engineered for targeting to various tissues, including but not limited to pancreas, spleen, liver, fat, kidneys, uterus/ovaries, muscle, heart, lungs, endothelial tissue, and thymus.
  • the lipid encapsulation comprises one or more of a phospholipid, cholesterol, polyethylene glycol (PEG)-lipid, and a lipophilic compound.
  • the lipophilic compound is C12-200, particularly in embodiments that target the liver (Love et al., PNAS 107(5):1864-1869, 2010 (erratum in PNAS 107(21), 2010), incorporated herein by reference in its entirety).
  • the lipophilic compound C12-200 is useful in embodiments that target fat tissue.
  • the lipopeptide is cKK-E12 (Dong et al., PNAS 111(11):3955-3960, 2014, incorporated herein by reference in its entirety).
  • the lipid encapsulation includes 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), cholesterol, C14-PEG2000, and cKK-E12, which provides for efficient in vivo gene modulation in liver tissue.
  • delivery particles may include additional components useful for enhancing the properties for in vivo nucleic acid delivery (including compounds disclosed in US 8,450,298 and US 2012/0251560, which are each hereby incorporated by reference).
  • the delivery vehicle may accumulate preferentially in certain tissues thereby providing a tissue targeting effect, but in some embodiments, the delivery vehicle further comprises at least one cell-targeting or tissue-targeting ligand and/or a tissue- specific promoter.
  • Functionalized particles including exemplary targeting ligands, are disclosed in US 2010/0303723 and 2012/0156135, which are hereby incorporated by reference in their entireties.
  • a delivery vehicle can be designed to drive the desired tissue distribution and cellular entry properties of the delivery systems disclosed herein, as well as to provide the requisite circulation time and biodegrading character.
  • lipid particles can employ amino lipids as disclosed in US 2011/0009641, which is hereby incorporated by reference.
  • the lipid or polymeric particles may have a size (e.g., an average size) in the range of about 50 nm to about 5 µm. In some embodiments, the particles are in the range of about 10 nm to about 100 µm, or about 20 nm to about 50 µm, or about 60 nm to about 5 µm, or about 70 nm to about 500 nm, or about 70 nm to about 200 nm, or about 50 nm to about 100 nm. Particles may be selected so as to avoid rapid clearance by the immune system. Particles may be spherical, or non-spherical in certain embodiments.
  • the non-viral delivery vehicle may be a peptide, such as cell-penetrating peptides or cellular internalization sequences.
  • Cell-penetrating peptides are small peptides that are capable of translocating across plasma membranes.
  • Exemplary cell-penetrating peptides include, but are not limited to, Antennapedia sequences, TAT, HIV-Tat, Penetratin, Antp-3A (Antp mutant), Buforin II, Transportan, MAP (model amphipathic peptide), K-FGF, Ku70, Prion, pVEC, Pep-1, SynB1, Pep-7, HN-1, BGSC (Bis-Guanidinium-Spermidine-Cholesterol), and BGTC (Bis-Guanidinium-Tren-Cholesterol).
  • the present disclosure provides plasmids for transgenic or transient expression of the Type I CRISPR-Cas proteins including at least one Cas protein linked to an effector molecule.
  • a plasmid encoding a chimeric Type I CRISPR-Cas protein includes in-frame sequences for protein fusions of one or more of the other proteins described herein, including, but not limited to, a Type I CRISPR-Cas protein, an effector molecule, optionally a linker, and optionally a nuclear localization sequence (NLS).
  • the plasmids and vectors encode the CRISPR-Cas protein(s) and effector molecule and also encode the guide RNA of the present invention.
  • one or more components of the engineered complex can be encoded in two or more distinct plasmids.
  • the plasmids can be used across multiple species. In other embodiments, the plasmids are tailored to the organism or type of cell being transformed. In some embodiments, the sequences of the nucleic acids are codon-optimized for expression in the organism whose genes are being targeted. Promoters providing adequate expression can be selected. In some embodiments, the plasmids for different species will require different promoters.
  • the plasmids and vectors are selectively expressed in the cells of interest.
  • the present application teaches the use of ectopic promoters, tissue-specific promoters, developmentally-regulated promoters, or inducible promoters.
  • the present disclosure also includes the use of terminator sequences.
  • a portion, or the entire complex(es) of the present technology, or the entire set of components of a Cascade-effector molecule complex can be delivered directly to cells (e.g., through microinjection).
  • the polypeptides and/or nucleic acids are expressed and purified.
  • the polypeptides are expressed via inducible or constitutive protein production systems such as a bacterial system, yeast system, plant cell system, or animal cell system.
  • the proteins and/or polypeptides may be purified via affinity tags or custom antibody purifications.
  • polynucleotides may be chemically synthesized.
  • the nucleic acids disclosed herein are transformed into a heterologous cell.
  • nucleic acids or plasmids disclosed herein can be transformed into cells through any known system.
  • cells may be transformed by particle bombardment, chemical transformation, Agrobacterium transformation, nano-spike transformation, and/or virus transformation.
  • the delivery vehicles may be administered to a subject by any method known in the art, including injection, optionally by direct injection to target tissues, specific target cells, and even to specific organelles within a single cell (e.g., the nucleus).
  • the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and, optionally, Cas3 and/or repair template are administered simultaneously in the same or in different delivery vehicles.
  • the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and, optionally, Cas3 and/or repair template are administered sequentially via separate delivery vehicles.
  • the guide sequence is administered 1-30 days (for example, 1, 3, 5, 7, 10, 14, or 30 days) prior to administration of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, such that the guide sequence accumulates in the target tissue prior to administration of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule.
  • the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered in a plurality of doses, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more doses.
  • the guide sequence and/or Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered over a time period of from one week to about six months, such as from about two to about ten doses within about two months, such as from three to five doses over about one month.
  • the guide sequence and, optionally, a repair template are provided in an AAV vector that is administered to the subject or cell prior to administration of a nanoparticle containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule.
  • the AAV vector comprising the guide sequence is administered 3, 4, 5, 6, 7, 8, 9, or 10 days prior to the administration of the nanoparticle, to allow expression of the guide sequence from the AAV vector.
  • the nanoparticle containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered multiple times, for example, once every 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 days.
  • the nanoparticle containing the guide sequence is administered for 1 month, 2 months, 3 months, 4 months, 6 months, 8 months, 10 months, 12 months, 18 months, 24 months, or longer. Since AAV expression can occur for 2 years or longer, in one embodiment, the expression of the guide sequence and, optionally, repair template, from the AAV vector and the continual administration of nanoparticles containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule provides efficient gene editing of the target sequence with reduced or absent off-target effects due to the transient expression of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule.
  • the repair template is delivered via an AAV vector, and is injected 3, 4, 5, 6, 7, 8, 9, or 10 days prior to the administration of nanoparticles containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or the guide sequence.
  • the nanoparticles may be administered multiple times, and for several months.
  • the repair template is expressed from the AAV vector in the cell for 2 years or longer, and the nanoparticles comprising the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or guide sequence are administered in multiple administrations over time in order to provide efficient modulation of the target sequence with reduced or absent off-target effects.
  • one or more guide sequences and, optionally, a repair template are provided in an AAV vector that is administered first, and the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule in a lipid-based delivery vehicle is subsequently administered in one or more doses.
  • the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered in a lipid-based delivery vehicle about 7 days and about 14 days after the administration of the one or more guide sequences in an AAV vector.
  • each of the components of the delivery systems provided herein (e.g., the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, guide sequence and, optionally, repair template)
  • the nanoparticles containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, guide sequence, and, optionally, repair template are administered at multiple time points, for example, every 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 days.
  • the nanoparticles separately comprising the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and the guide sequence are administered at different time points in order to enhance efficiency in a particular cell or for a particular disease type.
  • the administration of the delivery system is controlled so that expression of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is transient.
  • transient expression minimizes off-target effects, thereby increasing the safety and efficiency of the system disclosed herein.
  • expression of the system is controlled via selection of the delivery vehicles and/or promoters disclosed herein.
  • the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and optionally, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or repair template are administered to a subject or a cell at the same time, such as on the same delivery vehicle, and one or more component is under the control of an inducible promoter.
  • the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and repair template are each present on an AAV viral vector, and the guide sequence is under the control of an inducible promoter, for example, a small molecule-induced promoter such as tetracycline-inducible promoter.
  • components of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule are expressed 5-7 days following administration of the vector, after which the expression of the guide sequence is induced by one or more injections of the small molecule such as tetracycline.
  • the guide sequence expression can be induced at various time points in order to increase efficiency; for example guide sequence expression may be induced every day, or every 2 days, or every 3 days, or every 5 days, or every 10 days, or every 2 weeks, for at least 1 week or at least 2 weeks, or at least 3 weeks, or at least 4 weeks, or at least 5 weeks, or at least 6 weeks, or at least 7 weeks, or at least 8 weeks, or at least 10 weeks, or at least 11 weeks, or at least 12 weeks, or more.
  • component(s) of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule may be expressed from the AAV vector over time, and the guide sequence may be inducibly expressed by multiple injections of the inducing molecule over several days, weeks, or months.
  • the guide sequence can be expressed from the AAV vector over time, and components of the Type I CRISPR-Cas complex including at least one linked effector molecule may be inducibly expressed under control of an inducible promoter by multiple injections of the inducing molecule over several days, weeks, or months.
  • one or more guide sequences and, optionally, a repair template is delivered via an RNA conjugate, such as an RNA-GalNAc conjugate, and the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is delivered via a viral or non-viral vector, such as a nanoparticle.
  • the guide sequence and optionally repair template are attached to the nanoparticle comprising the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, such that the components are delivered to the target cell or tissue together.
  • the guide sequence, optional repair template, and Type I CRISPR-Cas proteins or complex including at least one linked effector molecule may be delivered to the target cell or tissue together, and expression of each component may be controlled by way of different promoters, including inducible promoters, as disclosed herein.
  • the present disclosure provides methods for modulating expression (for example, increasing or decreasing expression) of a target polynucleotide in a cell, which may be in vivo, ex vivo, or in vitro.
  • the present disclosure provides methods for altering the sequence of a target polynucleotide in a cell, for example, changing one or more nucleotide, for example from C to T (or G to A on the opposite strand) or from A to G (or T to C on the opposite strand) in a target nucleic acid.
  • the present disclosure provides methods for detecting presence and/or quantity of a target polynucleotide in a cell.
  • the one or more delivery vehicles including Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or guide sequence and, optionally, repair template are administered to a subject.
  • the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, guide sequence, and optionally, repair template are targeted to one or more target tissues in the subject.
  • the target tissue is liver, endothelial tissue, lung (including lung epithelium), kidney, fat, or muscle.
  • the one or more delivery vehicles comprise a viral vector (e.g., AAV) or a non-viral vector such as, for example, MD-1, 7C1, PBAE, C12-200, cKK-E12, or a conjugate such as a cholesterol conjugate or an RNA conjugate as disclosed herein.
  • the target tissue is liver, and one or more delivery vehicle is MD-1.
  • the target tissue is endothelial tissue, and one or more delivery vehicle is 7C1.
  • the target tissue is lung, and one or more delivery vehicle is PBAE or 7C1.
  • the target tissue is kidney, and one or more delivery vehicle is an RNA conjugate.
  • the target tissue is fat, and one or more delivery vehicle is C12-200.
  • the target tissue is muscle (e.g., skeletal muscle) and one or more delivery vehicle is a cholesterol conjugate.
  • the delivery vehicles may be administered to a subject by any method known in the art, including injection, optionally by direct injection to target tissues.
  • Nucleic acid modification can be monitored over time by, for example, periodic biopsy with PCR amplification and/or sequencing of the target region from genomic DNA, or by RT-PCR and/or sequencing of the expressed transcripts. Alternatively, nucleic acid modification can be monitored by detection of a reporter gene or reporter sequence. Alternatively, nucleic acid modification can be monitored by expression or activity of a corrected gene product or a therapeutic effect in the subject.
  • the subject is a human in need of therapeutic or prophylactic intervention.
  • the subject is an animal, including livestock, poultry, domesticated animal, or laboratory animal.
  • the subject is a mammal, such as a human, horse, cow, dog, cat, rodent, or pig.
  • the "subject" is a fungus or a plant, and the Type I CRISPR systems described herein are used to modulate the genome of these organisms.
  • the methods provided herein include obtaining a cell or population of cells from a subject and modifying expression and/or sequence of a target polynucleotide in the cell or cells ex vivo, using the systems, compositions, and/or methods disclosed herein.
  • the ex vivo-modified cell or cells may be re-introduced into the subject following ex vivo modification.
  • the present disclosure provides methods for treating a disease or disorder in a subject, comprising obtaining one or more cells from the subject, modifying one or more target nucleotide sequences in the cell ex vivo, and re-introducing the cell with the modified target nucleotide sequence back into the subject having the disease or disorder.
  • cells in which nucleotide sequence modification has occurred are expanded in vitro prior to reintroduction into the subject having the disease or disorder.
  • the cells are bone marrow cells.
  • the nucleic acid editing system and guide sequence and, optionally, repair template are administered to a cell in vitro.
  • At least one component of the delivery system (e.g., the guide sequence or the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule)
  • the target tissue which may be, for example, liver, heart, lung (including airway epithelial cells), skeletal muscle, CNS (e.g., nerve cells), endothelial cells, blood cells, bone marrow cells, blood cell precursor cells, stem cells, fat cells, or immune cells.
  • Tissue targeting or distribution can be controlled by selection and design of the viral vector, or in some embodiments is achieved by selection and design of lipid or polymeric particles.
  • the desired tissue targeting of the activity is provided by the combination of viral and non-viral delivery vehicles.
  • CRISPR-Cascade or other Type I complexes including at least one covalently linked effector molecule are recombinantly expressed and purified.
  • NLS-tagged Cas3 is recombinantly expressed and purified separately, or as tethered Cas proteins in the crRNA-guided surveillance complex.
  • the protein(s) and RNA e.g., sgRNA, crRNA
  • the purified complex is delivered to a cell.
  • the CRISPR-Cascade or other Type I complexes including at least one covalently linked effector molecule are injected into either the nucleus or the cytoplasm of a eukaryotic cell.
  • concentration of each protein or complex injected may be adjusted to limit toxicity and off-target effects.
  • Methods of microinjection into individual cells, or into subcellular organelles (such as the nucleus), are well known in the art; see for instance Microinjection (eds. Lacal, Perona & Feramisco), Birkhäuser Verlag, 1999, and Komarova et al., "Microinjection of Protein Samples," Chapter 5 in Live Cell Imaging (eds. Goldman & Spector), CSHL Press, 2005.
  • Microinjection devices are commercially available, for instance from Tritech Research (Los Angeles, CA).
  • Embodiment 1 is directed to a system comprising:
  • Type I CRISPR-Cas complex comprising a plurality of Cas proteins, wherein at least one of the Cas proteins is covalently linked to an effector molecule;
  • RNA having a sequence selected to recognize a target nucleotide sequence.
  • Embodiment 2 is directed to the system of embodiment 1, wherein the complex is a CRISPR-Cascade complex, and the plurality of Cas polypeptides comprises Cas8 (Cse1), Cse2, Cas7, Cas5, and Cas6, and the effector molecule is not linked to the Cas8 protein.
  • Embodiment 3 is directed to the system of embodiment 1 or embodiment 2, wherein the effector molecule is covalently linked to the N-terminus or C-terminus of the at least one Cas protein.
  • Embodiment 4 is directed to the system of any one of embodiments 1 to 3, wherein the effector molecule is directly linked to the at least one Cas protein.
  • Embodiment 5 is directed to the system of any one of embodiments 1 to 3, wherein the effector molecule is linked to the at least one Cas protein via a linker.
  • Embodiment 6 is directed to the system of embodiment 5, wherein the effector molecule is linked to the at least one Cas protein by an amino acid linker or wherein the effector molecule is linked to the at least one Cas protein by a streptavidin-biotin linker.
  • Embodiment 7 is directed to the system of any one of embodiments 1 to 6, wherein the effector molecule comprises a transcriptional activator, a transcriptional repressor, a base editor, a reporter, or a combination of two or more thereof.
  • Embodiment 8 is directed to the system of embodiment 7, wherein the effector molecule comprises a transcriptional activator comprising one or more VP16 domains, a p65 activation domain, an RNA polymerase omega subunit, a viral RTA activation domain, or a combination of two or more thereof.
  • Embodiment 9 is directed to the system of embodiment 8, wherein the effector molecule comprises four VP16 domains (VP64) or ten VP16 domains (VP160).
  • Embodiment 10 is directed to the system of embodiment 7, wherein the effector molecule comprises a transcriptional repressor comprising a Krüppel-associated box (KRAB) domain.
  • Embodiment 11 is directed to the system of embodiment 7, wherein the effector molecule comprises a base editor comprising a cytidine deaminase, an adenosine deaminase, a uridine glycosylase inhibitor, or a combination of two or more thereof.
  • Embodiment 12 is directed to the system of embodiment 7, wherein the effector molecule comprises a reporter comprising a fluorescent protein, a fluorescent dye, or quantum dots.
  • Embodiment 13 is directed to the system of embodiment 12, wherein the fluorescent protein comprises a green fluorescent protein, a blue fluorescent protein, a cyan fluorescent protein, a yellow fluorescent protein, a red fluorescent protein, a variant thereof, or a combination of any two or more thereof.
  • Embodiment 14 is directed to the system of any one of embodiments 2 to 13, wherein the effector molecule is covalently linked to the Cas7 protein.
  • Embodiment 15 is directed to the system of embodiment 14, wherein the effector molecule-Cas7 protein comprises an amino acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 16 is directed to the system of embodiment 15, wherein the effector molecule-Cas7 protein comprises the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 17 is directed to the system of embodiment 15, wherein the effector molecule-Cas7 protein is encoded by a nucleic acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • Embodiment 18 is directed to the system of embodiment 17, wherein the effector molecule-Cas7 protein is encoded by the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • Embodiment 19 is directed to the system of any one of embodiments 1 or 3 to 13, wherein the complex is a CRISPR-Csy complex and the plurality of Cas polypeptides comprises Csy1, Csy2, Csy3, and Csy4.
  • Embodiment 20 is directed to the system of embodiment 19, wherein the effector molecule is covalently linked to the Csy3 protein.
  • Embodiment 21 is directed to the system of any one of embodiments 1 to 20, wherein one or more of the Cas polypeptides comprises a nuclear localization signal.
  • Embodiment 22 is directed to the system of embodiment 21, wherein the complex is a CRISPR-Cascade complex and Cas8 and/or Cas6 comprise a nuclear localization signal.
  • Embodiment 23 is directed to the system of any one of embodiments 1 to 22, further comprising a Cas3 nuclease.
  • Embodiment 24 is directed to a vector comprising a nucleic acid encoding a Cas protein covalently linked to an effector molecule, wherein the Cas protein is not Cse1.
  • Embodiment 25 is directed to the vector of embodiment 24, wherein the Cas protein is Cas7.
  • Embodiment 26 is directed to the vector of embodiment 25, wherein the nucleic acid encoding the Cas protein covalently linked to an effector molecule comprises a nucleic acid with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • Embodiment 27 is directed to the vector of embodiment 26, wherein the nucleic acid encoding the Cas protein covalently linked to the effector molecule comprises or consists of the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • Embodiment 28 is directed to the vector of any one of embodiments 25 to 27, wherein the nucleic acid encodes a protein with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 29 is directed to the vector of embodiment 28, wherein the nucleic acid encodes a protein comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 30 is directed to the vector of any one of embodiments 24 to 29, comprising the nucleic acid sequence of SEQ ID NO: 13.
  • Embodiment 31 is directed to a nucleic acid encoding a Cas7 protein linked to an effector molecule.
  • Embodiment 32 is directed to the nucleic acid of embodiment 31, wherein the nucleic acid encodes a protein with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 33 is directed to the nucleic acid of embodiment 32, wherein the nucleic acid encodes a protein comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 34 is directed to the nucleic acid of embodiment 32 or embodiment 33, wherein the nucleic acid comprises a nucleic acid with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • Embodiment 35 is directed to the nucleic acid of embodiment 34, wherein the nucleic acid comprises or consists of the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
  • Embodiment 36 is directed to a protein comprising Cas7 covalently linked to an effector molecule, wherein the protein comprises at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 37 is directed to the protein of embodiment 36, wherein the protein comprises or consists of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
  • Embodiment 38 is directed to a cell comprising the system of any one of embodiments 1 to 23, or the vector of any one of embodiments 24 to 30, or the nucleic acid of any one of embodiments 31 to 35, or the protein of embodiment 36 or embodiment 37.
  • Embodiment 39 is directed to the cell of embodiment 38, wherein the cell is an animal cell, a plant cell, a fungal cell, an algal cell, or a bacterial cell.
  • Embodiment 40 is directed to a method, comprising:
  • Embodiment 41 is directed to the method of embodiment 40 wherein contacting the genomic DNA of the cell with the system comprises:
  • nucleic acids encoding components of the system in the cell; or a combination of two or more thereof.
  • Embodiment 42 is directed to the method of embodiment 40 or embodiment 41, wherein the method comprises altering expression of a nucleic acid in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a transcription activator effector molecule or a transcription repressor effector molecule into the cell.
  • Embodiment 43 is directed to the method of embodiment 40 or embodiment 41, wherein the method comprises modifying a target nucleic acid sequence in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a base editing effector molecule into the cell.
  • Embodiment 44 is directed to the method of embodiment 40 or embodiment 41, wherein the method comprises detecting and/or quantifying a target nucleic acid sequence in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a reporter effector molecule into the cell.
  • Embodiment 45 is directed to a method for treating or preventing a disease in a subject in need of treatment or prevention, comprising administering to the subject the system of any one of embodiments 1 to 23.
  • Example 1 is provided to illustrate certain particular features and/or embodiments. These examples should not be construed to limit the disclosure to the particular features or embodiments described.
  • Cascade complexes containing nuclear localization signal (NLS) tags on Cas8 and/or on Cas6 were constructed.
  • the complex containing an NLS tag on Cas8 also contained a VP64 (transcriptional activator) tethered to the C-terminus of Cas7, or an emerald green fluorescent protein (emGFP) tethered to the C-terminus of Cas7 (Green line).
  • the NLScascade WT (wild-type) complex also contained an NLS tag on Cas6. Recombinant expression of the tagged Cascade complex was performed using multiple expression vectors.
  • one vector contained NLS tagged Cas8, strep tagged-cse2, cas7 with a VP64 or emGFP tag, cas5 and cas6 NLS tagged (FIG. 3A).
  • Preliminary experiments with this vector and a vector with a CRISPR locus failed to produce Cascade.
  • this two vector system was complemented with a third plasmid encoding strep-Cas7, Cas5, and Cas6 (FIG. 3B). This resulted in stable Cascade complexes containing the tags on the appropriate subunits (FIG. 5).
  • the illustrated embodiments are only examples and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims.

Abstract

A system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence is provided. The Type I CRISPR-Cas complex includes at least one Cas protein covalently linked to an effector molecule. Vectors and nucleic acids encoding a Cas protein covalently linked to an effector molecule are also provided. Methods for modulating expression (for example, increasing or decreasing expression) of a target polynucleotide, altering the sequence of a target polynucleotide, and for detecting presence and/or quantity of a target polynucleotide in a cell using the systems and complexes are provided.

Description

GENE MODULATION WITH CRISPR SYSTEM TYPE I
CROSS REFERENCE TO RELATED APPLICATION
This application claims the benefit of U.S. Provisional Application No. 62/754,268, filed November 1, 2018, which is incorporated by reference herein in its entirety.
FIELD
This disclosure relates to engineered, programmable, non-naturally occurring gene modulating systems, compositions of the system, and methods of carrying out genetic editing. In particular, methods, systems, and compositions are described that utilize the Type I CRISPR system.
BACKGROUND
Clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated genes (cas) are essential components of nucleic acid-based adaptive immune systems that are widespread in bacteria and archaea. CRISPR loci consist of a series of short repeats separated by non-repetitive spacer sequences, the spacer sequences of which are acquired from foreign genetic elements such as viruses and plasmids. Transcription of CRISPR loci generates a library of CRISPR-derived RNAs (crRNAs), containing sequences complementary to previously encountered invading nucleic acids. CRISPR-associated (Cas) proteins bind crRNAs, and the resultant ribonucleoprotein complex targets invading nucleic acids complementary to the crRNA guide. Targeted invading nucleic acids are degraded by cis- or trans-acting nucleases.
Six main CRISPR system types (Types I to VI) and at least 32 distinct subtypes have been identified to date. All CRISPR systems use short CRISPR-derived RNAs (crRNAs) to target invading nucleic acid, and many of these nucleic acid targeting systems rely on sophisticated multi-subunit complexes. For example, the CRISPR-associated complex for antiviral defense (Cascade) is a Type I-E system composed of 11 protein subunits and a CRISPR-derived RNA (crRNA) complex that relies on complementary pairing between the crRNA-guide and a target nucleic acid sequence, which occurs over 32 nucleotides, or a portion thereof. Type II systems, however, rely on a single protein (Cas9) and a 20 nucleotide sequence in recognizing invading DNA. Due to its relative simplicity, the Cas9 system has been used for commercial and research purposes in genetic engineering. Off-target nuclease activity has been detected, and this may limit the use of these tools for certain applications.
The Type I systems rely on a greater number of nucleotides for target DNA recognition, and employ a locking mechanism during target binding, which may be exploited as a gene modification device with enhanced specificity in target recognition compared to Cas9 systems. However, the complexity of the Type I CRISPR complex, the multiple reading frames, and the delivery of these systems are hurdles to the use of Type I CRISPR complexes as a viable genome editing technology.
SUMMARY
Disclosed herein are compositions and methods that utilize Type I CRISPR complexes to deliver one or more effector molecules to a target nucleic acid, for example a target nucleic acid in a cell. In some embodiments, the effector molecule includes a DNA-modulating function, such as transcriptional activation, transcriptional repression, and/or base editing functions. In other embodiments, the effector molecule is a detectable label or reporter, for example, for detecting the presence and/or quantity of a nucleic acid. Also provided are methods and systems that employ the complexes, for example, to modulate nucleic acid expression or sequence, or to detect or quantify a nucleic acid.
Thus, there is provided in some embodiments, a system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence. The Type I CRISPR-Cas complex includes at least one Cas protein linked to an effector molecule (for example, directly or indirectly linked). In some examples, the Cas protein that is linked to the effector molecule is not a Cse1 protein (also known as Cas8). In particular embodiments, the effector molecule is linked to a Cas7 protein. In some examples, the Cas protein is covalently linked to the effector molecule.
The Type I CRISPR-Cas complex in some examples is a Type I-E CRISPR-Cascade complex (for example, including Cas8, Cse2, Cas7, Cas5, and Cas6 proteins). In particular examples, the complex includes one or more Cas7 proteins (such as 1, 2, 3, 4, 5, 6, or more Cas7 proteins) linked to an effector molecule. Exemplary Cas7 proteins linked to an effector molecule include SEQ ID NOs: 2, 4, 6, 8, 10, and 12, which are encoded by the nucleic acid sequences of SEQ ID NOs: 1, 3, 5, 7, 9, and 11. In other examples, the Type I CRISPR-Cas complex is a Type I-F CRISPR-Csy complex (e.g., including Csy1 (Cas8), Csy2 (Cas5), Csy3 (Cas7), and Csy4 (Cas6) proteins) that includes one or more Csy3 (Cas7) proteins (such as 1, 2, 3, 4, 5, 6, or more Csy3 proteins) linked to an effector molecule. As used herein, Cas7 refers to the protein corresponding to the Cas7 protein in Type I-E CRISPR, though the protein may be referred to in other Type I systems by other nomenclature in some examples (see, e.g., Koonin et al., Curr. Opin. Microbiol. 37:67-78, 2017 for a summary of Type I CRISPR systems and nomenclature).
In some embodiments, the effector molecule is covalently linked to the N-terminus or the C-terminus of the Cas protein (such as the N-terminus or C-terminus of a Cas7 protein). The effector molecule may be directly or indirectly linked (for example, via a linker) to the Cas protein. In some examples, the linker is an amino acid linker. In other embodiments, the linker is streptavidin and biotin. One or more of the Cas polypeptides in the complex may also include a nuclear localization signal (NLS) (for example, Cas8 and/or Cas6). In additional examples, the system may optionally include a Cas3 nuclease.
In other examples, one or more effectors are tethered to the complex via the crRNA. In one embodiment the 3’ hairpin of the crRNA is extended to include an RNA binding motif and the effector is linked (directly or indirectly) to a protein including an RNA binding domain. In some embodiments the RNA binding motif is an MS2 hairpin that is bound by the MS2 coat protein, which is linked to an effector. In other embodiments the RNA binding motif is a Type I repeat and the protein is a nuclease-inactivated Cas6 family protein linked to an effector.
In some embodiments, the effector molecule includes a transcriptional activator (for example, one or more VP16 activation domains), a transcriptional repressor (for example, a Kruppel associated box (KRAB) repressor domain), or a base editor (for example, a cytidine deaminase or an adenosine deaminase). In other embodiments, the effector molecule is a reporter, such as a fluorescent protein, a fluorescent dye, or a quantum dot.
Also disclosed are exemplary nucleic acids encoding a Cas7 protein linked to an effector molecule. The nucleic acids encode a Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 1 and 3), a Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 5 and 7), and a Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 9 and 11). The Cas7-effector molecule proteins include Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 2 and 4), a Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 6 and 8), and a Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 10 and 12).
In some embodiments, the disclosure includes a vector including a nucleic acid encoding a Cas protein covalently linked to an effector molecule, where the Cas protein is not Cse1. In some examples, the vector includes a nucleic acid encoding a Cas7 protein linked to an effector molecule. Exemplary nucleic acids included in the vector are SEQ ID NOs: 1, 3, 5, 7, 9, and 11, which encode the proteins SEQ ID NOs: 2, 4, 6, 8, 10, and 12.
Also provided are cells that include the disclosed systems, nucleic acids, proteins, and/or vectors. The cells may be prokaryotic or eukaryotic cells, including animal cells, plant cells, fungal cells, algal cells, or bacterial cells.
In some embodiments, the present disclosure provides methods for modulating expression (for example, increasing or decreasing expression) of a target polynucleotide in a cell, which may be in vivo, ex vivo, or in vitro. In other embodiments, the present disclosure provides methods for altering the sequence of a target polynucleotide in a cell, for example, changing one or more nucleotides, for example from C to T (or G to A on the opposite strand) or from A to G (or T to C on the opposite strand) in a target nucleic acid. In additional embodiments, the present disclosure provides methods for detecting presence and/or quantity of a target polynucleotide in a cell. In some examples, one or more delivery vehicles including Type I CRISPR-Cas proteins or a complex including at least one linked effector molecule and/or guide sequence and, optionally, a repair template, are administered to a cell or a subject.
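The strand relationship noted above (a C to T change on one strand reads as G to A on the opposite strand) can be illustrated with a minimal sketch. The 8-nucleotide sequence, the edited position, and the helper names are hypothetical and chosen only for illustration; they are not taken from the sequence listing.

```python
# Illustrative only: the sequence and edited position below are made up.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def edit_base(seq: str, index: int, new_base: str) -> str:
    """Return a copy of seq with the base at a 0-based index replaced."""
    return seq[:index] + new_base + seq[index + 1:]

top = "GATCCAGT"                     # hypothetical target strand
edited_top = edit_base(top, 3, "T")  # C at index 3 becomes T

bottom = reverse_complement(top)
edited_bottom = reverse_complement(edited_top)

# Collect every position at which the opposite strand changed.
diffs = [
    (i, before, after)
    for i, (before, after) in enumerate(zip(bottom, edited_bottom))
    if before != after
]
print(diffs)
```

In this toy case the single C to T change on the edited strand corresponds to a single G to A change on the opposite strand, at the mirrored position.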
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A and 1B are schematics showing the type I CRISPR system in E. coli. FIG. 1A illustrates the native Type IE CRISPR-Cascade operon from Escherichia coli. Five of the cas genes encode proteins that assemble in an unequal stoichiometry into a multi-subunit surveillance complex called the CRISPR-associated complex for anti-viral defense (Cascade). The stoichiometry of each subunit is indicated above each arrow. The CRISPR locus consists of a series of 29-nt repeats (diamonds) separated by 32-nt spacer (or guide) sequences (cylinders) (left panel). The right panel illustrates Cascade subunit assembly. FIG. 1B illustrates the native Type IF CRISPR-Csy operon (left) and subunit assembly (right).
FIG. 2 is a schematic diagram illustrating an embodiment of a Type I CRISPR system including an extension of the crRNA to include stem loop structures that are bound by a protein including an RNA binding domain (left panel) and an assembled complex showing binding of RNA binding domains (RBD) linked to an effector to the stem loop structure (right panel), providing multivalent display of the effector. This embodiment is illustrated with Type IE CRISPR-Cascade, but is generally applicable to Type I CRISPR systems.
FIGS. 3A and 3B are schematic diagrams of exemplary vectors for expression of Cascade complexes. FIG. 3A is a diagram of a vector including nucleic acids encoding Cse1 with an N-terminal nuclear localization signal (NLS), Cse2 with a Strep-tag II, Cas7 with a C-terminal effector, Cas5, and Cas6 with a C-terminal NLS. FIG. 3B is a diagram of a vector including nucleic acids encoding Cas7 with a Strep-tag II, Cas5, and Cas6.
FIGS. 4A and 4B are schematic diagrams of alternative exemplary vectors for expression of Cascade complexes. FIG. 4A is a diagram of a vector including nucleic acids encoding Cse1 with an N-terminal nuclear localization signal (NLS), Cse2 with a Strep-tag II, and Cas7 including a C-terminal effector. FIG. 4B is a diagram of a vector including nucleic acids encoding Cas5 and Cas6 with a C-terminal NLS.
FIG. 5 illustrates size exclusion chromatography of Cascade containing NLS tags on Cas8 and/or on Cas6. The complex containing an NLS tag on Cas8 also contains a VP64 (transcriptional activator) tethered to the C-terminus of Cas7 (“VP64”), or an emerald green fluorescent protein (emGFP) tethered to the C-terminus of Cas7 (“GFP”). The NLS-Cascade WT (wild-type) also contains an NLS tag on Cas6. Inset: SDS-PAGE of each sample after size exclusion chromatography.
SEQUENCE LISTING
Any nucleic acid and amino acid sequences listed herein or in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases and amino acids, as defined in 37 C.F.R. § 1.822. In at least some cases, only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand.
SEQ ID NOs: 1 and 2 are nucleic acid and amino acid sequences, respectively, of Cas7 with N-terminally linked VP64.
SEQ ID NOs: 3 and 4 are nucleic acid and amino acid sequences, respectively, of Cas7 with C-terminally linked VP64.
SEQ ID NOs: 5 and 6 are nucleic acid and amino acid sequences, respectively, of Cas7 with an N-terminally linked Kruppel associated box (KRAB) repressor domain.
SEQ ID NOs: 7 and 8 are nucleic acid and amino acid sequences, respectively, of Cas7 with a C-terminally linked KRAB repressor domain.
SEQ ID NOs: 9 and 10 are nucleic acid and amino acid sequences, respectively, of Cas7 with N-terminally linked emerald green fluorescent protein (emGFP).
SEQ ID NOs: 11 and 12 are nucleic acid and amino acid sequences, respectively, of Cas7 with C-terminally linked emGFP.
SEQ ID NO: 13 is the nucleic acid sequence of vector pCDF NLS-Cse1, Strep II-tag-C3 cleavage-Cse2, Cas7-linker-VP64. Nucleotides 1593-1625 encode a Strep-tag II sequence, nucleotides 1630-1655 encode a C3 cleavage sequence, nucleotides 1665-2147 encode Cse2, nucleotides 2160-3248 encode Cas7, and nucleotides 3273-3422 encode VP64.
SEQ ID NO: 14 is the nucleic acid sequence of vector pET52 StrepII-Cas7, Cas5, Cas6. Nucleotides 4966-4989 encode a Strep-tag II sequence, nucleotides 4996-5019 encode a human rhinovirus 3C (HRV3C) cleavage site, nucleotides 5029-6120 encode Cas7, nucleotides 6123-6797 encode Cas5, and nucleotides 6784-7383 encode Cas6 (overlaps with Cas5 sequence).
SEQ ID NO: 15 is the nucleic acid sequence of vector pET52 Cas5-Cas6-NLS. Nucleotides 4957-5631 encode Cas5, nucleotides 5621-6217 encode Cas6, and nucleotides 6218-6238 encode an NLS.
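As a reading aid for the vector annotations above, the following minimal sketch computes feature lengths, the spacing between adjacent features, and the overlap between features from the stated nucleotide coordinates. The coordinates are copied from the descriptions of SEQ ID NOs: 13 and 14; treating them as 1-based inclusive ranges is an assumption, and the helper names are hypothetical.

```python
# Coordinates copied from the SEQ ID NO: 13 description.
# Assumption: ranges are 1-based and inclusive.
SEQ13_FEATURES = {
    "Strep-tag II": (1593, 1625),
    "C3 cleavage": (1630, 1655),
    "Cse2": (1665, 2147),
    "Cas7": (2160, 3248),
    "VP64": (3273, 3422),
}

# Cas5 and Cas6 coordinates copied from the SEQ ID NO: 14 description.
SEQ14_CAS5 = (6123, 6797)
SEQ14_CAS6 = (6784, 7383)

def feature_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive range."""
    return end - start + 1

def gap_between(upstream: tuple, downstream: tuple) -> int:
    """Nucleotides lying strictly between two non-overlapping features."""
    return downstream[0] - upstream[1] - 1

def overlap(a: tuple, b: tuple) -> int:
    """Nucleotides shared by two inclusive ranges (0 if they do not overlap)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

lengths = {name: feature_length(*coords) for name, coords in SEQ13_FEATURES.items()}
linker_gap = gap_between(SEQ13_FEATURES["Cas7"], SEQ13_FEATURES["VP64"])
cas5_cas6_overlap = overlap(SEQ14_CAS5, SEQ14_CAS6)

for name, length in lengths.items():
    print(f"{name}: {length} nt")
print(f"Cas7 to VP64 gap: {linker_gap} nt")
print(f"Cas5/Cas6 overlap in SEQ ID NO: 14: {cas5_cas6_overlap} nt")
```

Under these assumptions, the space between the Cas7 and VP64 ranges in SEQ ID NO: 13 would be where the linker named in the vector description lies, although the listing does not annotate the linker coordinates explicitly.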
DETAILED DESCRIPTION
The type IE CRISPR-mediated immune system in E. coli K12 consists of eight cas genes and one CRISPR locus. Five of the cas genes encode proteins that assemble in an unequal stoichiometry into a multi-subunit surveillance complex called the CRISPR-associated complex for anti-viral defense (Cascade) (FIG. 1A). Cascade binds double-stranded DNA targets that contain a protospacer (sequence complementary to the crRNA-guide) and a Protospacer Adjacent Motif (PAM). Cse1 (or casA or Cas8), cse2 (or casB), cse4 (or casC or Cas7), cas5 (or casD) and cse3 (or casE) are members of large gene families referred to as cas8, cse2, cas7, cas5, and cas6, respectively. In some examples, the CRISPR locus consists of a series of 29-nt repeats separated by 32-nt spacer (or guide) sequences; however, the repeat and spacer length may vary depending on the system (see, e.g., Luo et al., Nucleic Acids Research 44:7385-7394, 2016).
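The repeat and spacer geometry described above can be sketched as follows. This is a toy illustration only: the placeholder sequences are not real repeats or spacers, the function names are hypothetical, and the sketch assumes an idealized array that begins and ends with a repeat, so that n spacers are flanked by n + 1 repeats.

```python
REPEAT_LEN = 29  # nt, as stated for the E. coli Type IE locus
SPACER_LEN = 32  # nt, as stated for the E. coli Type IE locus

def array_length(n_spacers: int) -> int:
    """Length of an idealized array with n spacers and n + 1 repeats."""
    return REPEAT_LEN * (n_spacers + 1) + SPACER_LEN * n_spacers

def extract_spacers(array: str, repeat_len: int = REPEAT_LEN,
                    spacer_len: int = SPACER_LEN) -> list:
    """Slice fixed-length spacers out of an idealized repeat-spacer array."""
    spacers = []
    pos = repeat_len  # skip the leading repeat
    # Each spacer must be followed by one complete repeat.
    while pos + spacer_len + repeat_len <= len(array):
        spacers.append(array[pos:pos + spacer_len])
        pos += spacer_len + repeat_len
    return spacers

# Placeholder sequences: "R" stands in for repeat positions.
repeat = "R" * REPEAT_LEN
toy_spacers = ["A" * SPACER_LEN, "C" * SPACER_LEN, "G" * SPACER_LEN]
toy_array = repeat + "".join(spacer + repeat for spacer in toy_spacers)

print(len(toy_array), array_length(3))
print(extract_spacers(toy_array) == toy_spacers)
```

Because repeat and spacer lengths vary between systems, both lengths are parameters rather than fixed values.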
The present disclosure provides constructs and methods that permit assembly of a Cascade (or other Type I CRISPR) complex with one or more effector molecules. Type I CRISPR complexes include multiple copies of some components (such as Cas7 (6 copies) and Cse2 (2 copies) in Cascade). Thus, in some embodiments, the complexes disclosed herein can include multiple effector molecules. By providing “multivalent” effector molecules, the complexes and methods described herein can increase efficiency of gene modulation (such as modifying gene expression or base editing), or increase signal strength in the case of reporters or tags.
I. Terms
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).
Unless otherwise explained, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The singular terms “a,” “an,” and “the” include plural referents unless context clearly indicates otherwise. Thus, for example, reference to “a cell” includes combinations of two or more cells, or entire cultures of cells; reference to “a polynucleotide” includes, as a practical matter, many copies of that polynucleotide; and reference to “a polypeptide” can include multiple copies of that polypeptide. Similarly, the word “or” is intended to include “and” unless the context clearly indicates otherwise. Hence “comprising A or B” means including A, or B, or A and B.
It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including explanations of terms, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
In order to facilitate review of the various embodiments of the invention, the following explanations of specific terms are provided:
As used herein, the terms “effector” or “effector molecule” refer to molecules (such as proteins or protein domains) that can “effect” a desired function. In some embodiments, the effector molecule function is a DNA-modulating function, which includes transcriptional activation, transcriptional repression, base editing, and/or double-stranded break (DSB) repair functions. In other embodiments, the effector molecule function includes cell cycle control or regulation, small molecule delivery (e.g., to the nucleus), and/or dimerization functions. In further embodiments, the effector molecule function is as a label or reporter for the presence, quantity, and/or localization of a specific nucleic acid sequence (for example presence and/or quantity of a nucleic acid of interest in a cell).
As used herein, the term “host cell” refers to any cell that contains a heterologous nucleic acid. The heterologous nucleic acid can be a vector, such as a shuttle vector or an expression vector, or linear DNA template, or in vitro transcribed RNA. In some aspects, the host cell is able to drive the expression of genes that are encoded on the vector. In some aspects, the host cell supports the replication and propagation of the vector. Host cells can be bacterial cells such as E. coli, animal cells, such as mammalian cells (e.g., human cells or mouse cells), or plant cells. When a suitable host cell is used to create a stably integrated cell line, that cell line can be used to create a complete transgenic organism. Methods for delivering vectors or other nucleic acid molecules into bacterial cells (termed “transformation”) such as Escherichia coli include electroporation methods and transformation of E. coli cells that have been rendered competent by previous treatment with divalent cations such as CaCl2. Methods for delivering vectors, other nucleic acids (such as RNA), or ribonucleoproteins (RNPs) into mammalian or plant cells in culture (termed “transfection”) include but are not limited to calcium phosphate precipitation, electroporation, lipid-based methods (liposomes or lipoplexes) such as Transfectamine™ (Life Technologies™) and TransFectin™ (Bio-Rad Laboratories), cationic polymer transfections, for example using DEAE-dextran, direct nucleic acid injection, biolistic particle injection, and viral transduction using engineered viral carriers (termed “transduction,” using e.g., engineered herpes simplex virus, adenovirus, adeno-associated virus, vaccinia virus, Sindbis virus), and sonoporation. Additional methods of transforming or transducing cells may also be selected.
As used herein, the term “recombinant” in reference to a nucleic acid or polypeptide indicates that the material (e.g., a recombinant nucleic acid, gene, polynucleotide, polypeptide, etc.) has been altered by human intervention. Generally, the arrangement of parts of a recombinant molecule is not a native configuration, or the primary sequence of the recombinant polynucleotide or polypeptide has in some way been manipulated. A naturally occurring nucleotide sequence becomes a recombinant polynucleotide if it is removed from the native location from which it originated (e.g., a chromosome), or if it is transcribed from a recombinant DNA construct, or its native sequence is modified (e.g., by insertion, deletion, and/or alteration of one or more nucleotides). A naturally occurring polypeptide sequence becomes a recombinant polypeptide if it is removed from the native location from which it originated, or if its native sequence is modified (e.g., insertion, deletion, and/or alteration of one or more amino acids). A gene open reading frame (ORF) is a recombinant molecule if that nucleotide sequence has been removed from its natural context and cloned into any type of nucleic acid vector (even if that ORF has the same nucleotide sequence as the naturally occurring gene) or PCR template. In some embodiments, the term “recombinant cell line” refers to any cell line containing a recombinant nucleic acid, that is to say, a nucleic acid that is not native to that host cell.
As used herein, the terms “heterologous” or “exogenous” as applied to polynucleotides or polypeptides refers to molecules that have been rearranged or artificially supplied to a biological system and may not be in a native configuration (e.g., with respect to sequence, genomic position, or arrangement of parts) or are not native to that particular biological system. These terms indicate that the relevant material originated from a source other than the naturally occurring source or refers to molecules having a non-natural or non-native configuration, genetic location, or arrangement of parts. The terms “exogenous” and “heterologous” are sometimes used interchangeably with “recombinant.”
As used herein, the terms “non-naturally occurring gene editing complex,” “engineered non-naturally occurring gene editing complex,” “non-naturally occurring complex,” and “non-naturally occurring CRISPR-Cascade complex” refer to gene editing complexes that do not occur in nature. In some embodiments, the CRISPR associated proteins are Cascade proteins. In some embodiments, the Type I CRISPR-Cas complexes used in the described methods and systems are concatenated or partially concatenated complexes, in which a plurality of subunits of the Type I CRISPR-Cas complex are tethered to each other, or one or more of the subunits of the Type I CRISPR-Cas complex are linked or tethered to a heterologous molecule (such as an effector molecule), or the stoichiometry of the Type I CRISPR-Cas complex is modified, and/or the nucleotides in the crRNA are modified. In additional embodiments, the non-natural complexes are composed of CRISPR associated proteins, one or more effector molecules, and crRNA. Thus, these non-natural complexes are composed of CRISPR associated proteins and crRNA, but these proteins and/or crRNA have been modified or are in an arrangement that does not occur in nature, and which results from the manipulation that occurs during human engineering of the complex.
As used herein, the terms “linker,” “linkage,” “tether,” “fused,” “joined,” and derivatives thereof refer to a means to connect subunits or to a connection between subunits. Accordingly, the terms include, but are not limited to, any compound, organic, inorganic, or a hybrid organic and inorganic compound, that connects, covalently or non-covalently, two subunits. “Linker,” “linkage,” “tether,” “fused,” and “joined” and derivatives thereof may be used interchangeably herein. By way of example, an effector molecule can be linked to a Cas protein (such as Cas7) in a Type I CRISPR-Cas complex.
As used herein, the term “guide sequence” refers to an RNA sequence that is part of the CRISPR complex and recognizes a target nucleic acid sequence. In some embodiments, the guide sequences are presented as DNA sequences which encode for the RNA sequences. In some embodiments, target recognition can occur through non-covalent interactions, including hydrogen bonding, recognition of a structural motif, nucleic acid sequence recognition, base pairing, the like, or any combination thereof. In other embodiments, target recognition can occur via covalent interactions.
As used herein, the term “gene” generally refers to a combination of polynucleotide elements, that when operatively linked in either a native or recombinant manner, provide some product or function. The term “gene” is to be interpreted broadly, and can encompass mRNA, cDNA, cRNA, and genomic DNA forms of a gene. In some uses, the term “gene” encompasses the transcribed sequences, including 5’ and 3’ untranslated regions (5’-UTR and 3’-UTR), exons, and introns. In some genes, the transcribed region will contain “open reading frames” that encode polypeptides. In some uses of the term, a “gene” comprises only the coding sequences (e.g., an “open reading frame” or “coding region”) necessary for encoding a polypeptide. In some aspects, genes do not encode a polypeptide, for example, ribosomal RNA (rRNA) genes and transfer RNA (tRNA) genes. In some aspects, the term “gene” includes not only the transcribed sequences, but in addition, also includes non-transcribed regions including upstream and downstream regulatory regions, enhancers and promoters. The term “gene” encompasses mRNA, cDNA, and genomic forms of a gene.
In some aspects, the genomic form or genomic clone of a gene includes the sequences of the transcribed mRNA as well as other non-transcribed sequences that lie outside of the transcript. The regulatory regions that lie outside the mRNA transcription unit are termed 5’ or 3’ flanking sequences. A functional genomic form of a gene typically contains regulatory elements necessary, and sometimes sufficient, for the regulation of transcription. The term “promoter” is generally used to describe a DNA region, typically but not exclusively 5’ of the site of transcription initiation, sufficient to confer accurate transcription initiation. In some aspects, a “promoter” also includes other cis-acting regulatory elements that are necessary for strong or elevated levels of transcription, or confer inducible transcription. In some embodiments, a promoter is constitutively active, while in alternative embodiments, the promoter is conditionally active (e.g., where transcription is initiated only under certain physiological conditions).
Generally, the term “regulatory element” refers to any cis-acting genetic element that controls some aspect of the expression of nucleic acid sequences. In some uses, the term “promoter” comprises essentially the minimal sequences required to initiate transcription. In some uses, the term “promoter” includes the sequences to start transcription, and in addition, also includes sequences that can upregulate or downregulate transcription, commonly termed “enhancer elements” and “repressor elements,” respectively.
Specific DNA regulatory elements, including promoters and enhancers, generally only function within a class of organisms. For example, regulatory elements from the bacterial genome generally do not function in eukaryotic organisms. However, regulatory elements from more closely related organisms frequently show cross functionality. For example, DNA regulatory elements from a particular mammalian organism, such as human, will most often function in other mammalian species, such as the mouse.
Furthermore, in designing recombinant genes that will function across many species, there are consensus sequences for many types of regulatory elements that are known to function across species, e.g., in all mammalian cells, including mouse host cells and human host cells.
As used herein, a “protein subunit,” “polypeptide subunit,” or “subunit” refers to a single protein molecule that assembles or co-assembles with other protein or RNA molecules to form a protein or ribonucleoprotein (RNP) complex. Some naturally occurring proteins have a relatively small number of subunits and are therefore described as oligomeric, for example hemoglobin or DNA polymerase. Others may consist of a very large number of subunits and are therefore described as multimeric, for example microtubules and other cytoskeleton proteins. The subunits of a multimeric protein may be identical, homologous or totally dissimilar. For example, the CRISPR-Cascade ribonucleoprotein complex includes 11 subunits, which assemble around a crRNA. In some embodiments, the 11 protein subunits of Cascade include Cse1 (Cas8) (1 subunit), Cse2 (2 subunits), Cas7 (6 subunits), Cas5 (1 subunit), and Cas6 (1 subunit), as well as a 61-nucleotide crRNA. Similarly, the CRISPR-Csy ribonucleoprotein complex is a ~350-kDa ribonucleoprotein complex composed of 9 subunits of four functionally essential Cas proteins (one Csy1, one Csy2, six Csy3, and one Csy4) and a 60-nt crRNA-guide. These two ribonucleoprotein complexes are examples of crRNA-guided DNA binding machines that recruit a trans-acting nuclease, Cas3, for target degradation.
As used herein, the terms “vector,” “vehicle,” “construct,” “template,” and “plasmid” are used in reference to any recombinant polynucleotide molecule that can be propagated and used to transfer nucleic acid segment(s) from one organism to another. Vectors generally comprise parts that mediate vector propagation and manipulation (e.g., one or more origin of replication, genes imparting drug or antibiotic resistance, a multiple cloning site, operably linked promoter/enhancer elements which enable the expression of a cloned gene, etc.). Vectors are generally recombinant nucleic acid molecules, often derived from bacteriophages or plant or animal viruses. Plasmids and cosmids refer to two such recombinant vectors.
As used herein, the terms “reporter,” “tag,” “marker,” and “label” refer generally to a moiety, chemical compound, or other component that can be used to visualize, quantitate, or identify desired components of a system of interest. Reporters are commonly, but not exclusively, genes that encode reporter proteins. For example, a “reporter gene” is a gene that, when expressed in a cell, allows visualization or identification of that cell, or permits quantitation of expression of a recombinant gene. For example, a reporter gene can encode a protein, for example, an enzyme whose activity can be quantitated, for example, chloramphenicol acetyltransferase (CAT) or firefly luciferase protein. Reporters also include fluorescent proteins, for example, green fluorescent protein (GFP) or any of the recombinant variants of GFP, including enhanced GFP (EGFP), blue fluorescent proteins (BFP and derivatives), cyan fluorescent protein (CFP and other derivatives), yellow fluorescent protein (YFP and other derivatives) and red fluorescent protein (RFP and other derivatives). Reporters also include non-protein molecules, including fluorescent dyes and quantum dots.
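The subunit counts given above can be tallied in a short sketch, which also shows how many effector molecules a complex could display when particular subunits are tagged. The copy numbers are those stated in this definition; the function names are hypothetical, and the assumption that every copy of a tagged subunit carries exactly one effector is made only for illustration.

```python
# Copy numbers per complex, as stated in the definition of "subunit".
CASCADE = {"Cse1 (Cas8)": 1, "Cse2": 2, "Cas7": 6, "Cas5": 1, "Cas6": 1}
CSY = {"Csy1": 1, "Csy2": 1, "Csy3": 6, "Csy4": 1}

def total_subunits(complex_stoichiometry: dict) -> int:
    """Total number of protein subunits in one complex."""
    return sum(complex_stoichiometry.values())

def max_effectors(complex_stoichiometry: dict, tagged: list) -> int:
    """Upper bound on displayed effectors.

    Assumption: every copy of each tagged subunit carries one effector.
    """
    return sum(complex_stoichiometry[name] for name in tagged)

print(total_subunits(CASCADE))                   # Cascade protein subunits
print(total_subunits(CSY))                       # Csy protein subunits
print(max_effectors(CASCADE, ["Cas7"]))          # Cas7-only tagging
print(max_effectors(CASCADE, ["Cas7", "Cse2"]))  # Cas7 and Cse2 tagging
```

Tagging the highest-copy subunit (Cas7 in Cascade, Csy3 in Csy) gives the largest number of effectors per complex for a single fusion construct, which is the sense in which the complexes can provide multivalent display.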
II. Overview of Several Embodiments
Provided herein in some embodiments is a system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence. The Type I CRISPR-Cas complex includes at least one Cas protein covalently linked to an effector molecule. In some examples, the Cas protein that is covalently linked to the effector molecule is not a Csel protein.
The Type I CRISPR-Cas complex in some examples is a Type IE CRISPR-Cascade complex (for example, including Cas8, Cse2, Cas7, Cas5, and Cas6 proteins). In particular examples, the CRISPR- Cascade complex includes one or more Cas7 proteins (such as 1, 2, 3, 4, 5, or 6 Cas7 proteins) linked to an effector molecule. In other examples, the CRISPR-Cascade complex includes one or more Cse2 proteins (such as 1 or 2 Cse2 proteins) linked to an effector molecule. In still further examples, the CRISPR-Cascade complex includes a Cas5 or a Cas6 protein linked to an effector molecule. Exemplary Cas7 proteins linked to an effector molecule include SEQ ID NOs: 2, 4, 6, 8, 10, and 12, which are encoded by the nucleic acid sequences of SEQ ID NOs: 1, 3, 5, 7, 9, and 11.
In other examples, the Type I CRISPR-Cas complex is a Type IF CRISPR-Csy complex (for example, including Csy1, Csy2, Csy3, and Csy4 proteins). In particular examples, the CRISPR-Csy complex includes one or more Csy3 proteins (such as 1, 2, 3, 4, 5, or 6 Csy3 proteins) linked to an effector molecule. In other examples, the CRISPR-Csy complex includes a Csy1, Csy2, or Csy4 protein linked to an effector molecule.
In some embodiments, the effector molecule is covalently linked to the N-terminus or the C-terminus of the Cas protein (such as the N-terminus or C-terminus of a Cas7 protein). The effector molecule may be directly (for example, without an intervening linker) or indirectly linked (for example, via a linker) to the Cas protein. Thus, in some examples, the C-terminus of an effector molecule is directly linked (for example, by a peptide bond) to the N-terminus of a Cas protein. In other examples, the C-terminus of a Cas protein is directly linked (for example, by a peptide bond) to the N-terminus of an effector molecule. If the effector molecule is not a protein (for example, is a fluorescent dye or quantum dot), the effector may be directly linked to the Cas protein by a non-peptide bond (such as a thiol or amine bond). A non-protein effector molecule may be linked to a Cas protein at any location, and is not limited to linkage at the N- or C-terminus.
In other examples, the effector and Cas proteins are joined by a linker. In some examples, the linker is an amino acid linker (such as 1-100 amino acids, such as 1-20 amino acids, 10-30 amino acids, 20-40 amino acids, 30-50 amino acids, 40-60 amino acids, 50-70 amino acids, 60-80 amino acids, 70-90 amino acids, or 80-100 amino acids). The amino acid linker may include 2 or more repeats of a particular linker sequence. In other examples, the linker is a cross-linker (such as a maleimide or succinimide linker) or other type of linker, such as a carbon chain. Exemplary crosslinking agents and techniques are described in Crosslinking Technology (Thermo Scientific, available at tools.thermofisher.com/content/sfs/brochures/1602163-Crosslinking-Reagents-Handbook.pdf).
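As a minimal sketch of the direct and linker-mediated fusion layouts described above, the following Python snippet assembles an effector-linker-Cas amino acid sequence with a repeated linker unit. The two protein sequences are short hypothetical placeholders, and the GGGGS linker unit and the function name are assumptions chosen for illustration; none of them is taken from the sequences disclosed herein.

```python
def build_fusion(n_term: str, c_term: str, linker_unit: str = "", repeats: int = 0) -> str:
    """Join two protein sequences, N-terminal partner first.

    With repeats == 0 the partners are directly linked (peptide bond only);
    otherwise `linker_unit` is repeated `repeats` times between them.
    """
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    linker = linker_unit * repeats
    if len(linker) > 100:
        # Upper bound of the 1-100 amino acid linker range given in the text.
        raise ValueError("linker longer than 100 amino acids")
    return n_term + linker + c_term


# Hypothetical placeholder sequences (not SEQ ID NOs from this disclosure).
EFFECTOR = "MDALDDFDLDML"
CAS = "MRSYLILRLAGP"

direct = build_fusion(EFFECTOR, CAS)                                   # effector-Cas, no linker
linked = build_fusion(EFFECTOR, CAS, linker_unit="GGGGS", repeats=2)   # two linker repeats
swapped = build_fusion(CAS, EFFECTOR, linker_unit="GGGGS", repeats=2)  # Cas-linker-effector
```

Swapping the first two arguments gives the opposite orientation, in which the C-terminus of the Cas protein is joined to the N-terminus of the effector.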
In other embodiments, the effector molecule is linked to a Cas protein by a streptavidin-biotin linker. In some examples, the C-terminus of streptavidin is directly linked (for example, by a peptide bond) to the N-terminus of a Cas protein. In other examples, the C-terminus of a Cas protein is directly linked (for example by a peptide bond) to the N-terminus of streptavidin. The Cas protein can also be indirectly linked to streptavidin, for example by a linker, as discussed above with respect to effector molecules. The effector molecule is linked (directly or indirectly) to biotin and the Cas protein and the effector molecule are linked by the interaction of streptavidin and biotin.
In other examples, one or more effectors are associated with the complex via the crRNA. In one embodiment the 3’ hairpin of the crRNA is extended to include one or more (such as 1, 2, 3, 4, or more) RNA binding motifs and the effector is linked (directly or indirectly) to a protein including an RNA binding domain. An exemplary embodiment is illustrated schematically in FIG. 2. In some embodiments the RNA binding motif is an MS2 hairpin that is bound by the MS2 coat protein, which is linked to an effector. In other embodiments the RNA binding motif is a type I repeat and the protein is a nuclease-inactivated Cas6 family protein linked to an effector.
One or more of the Cas polypeptides in the complex (for example, Cas8, Cas5, Cas7, and/or Cas6) may also include a nuclear localization signal (NLS). In additional examples, the system may optionally include a Cas3 nuclease.
In some embodiments, the effector molecule includes a transcriptional activator (for example, one or more VP16 activation domains), a transcriptional repressor (for example, a Kruppel associated box (KRAB) repressor domain), or a base editor (for example, a cytidine deaminase or an adenosine deaminase). In other embodiments, the effector molecule is a reporter, such as a fluorescent protein, a fluorescent dye, or a quantum dot. In still further examples, an effector molecule includes an oligonucleotide or peptide. Effector molecules are discussed in more detail in Section IV, below.
Also disclosed are exemplary nucleic acids encoding a Cas7 protein linked to an effector molecule. The nucleic acids encode a Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 1 and 3), a Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 5 and 7), and a Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 9 and 11). The Cas7-effector molecule proteins include Cas7 protein linked to VP64 (e.g., SEQ ID NOs: 2 and 4), Cas7 protein linked to KRAB (e.g., SEQ ID NOs: 6 and 8), and Cas7 protein linked to emerald green fluorescent protein (emGFP, e.g., SEQ ID NOs: 10 and 12).
In some embodiments, the disclosure includes a vector including a nucleic acid encoding a Cas protein covalently linked to an effector molecule, where the Cas protein is not Cse1. In some examples, the vector includes a nucleic acid encoding a Cas7 protein linked to an effector molecule. Exemplary nucleic acids included in the vector are SEQ ID NOs: 1, 3, 5, 7, 9, and 11, which encode the proteins SEQ ID NOs: 2, 4, 6, 8, 10, and 12. In some examples, a vector includes a nucleic acid encoding a Cas7 protein linked to an effector molecule and one or more additional Type I CRISPR complex proteins, such as Cse1, Cse2, Cas5, and/or Cas6. Exemplary vectors include SEQ ID NOs: 13-15.
Also provided are cells that include the disclosed systems, nucleic acids, proteins, and/or vectors. The cells may be prokaryotic or eukaryotic cells, including animal cells, plant cells, fungal cells, algal cells, or bacterial cells.
Provided in additional embodiments are methods that include contacting a nucleic acid in a cell (such as genomic DNA) with a system including a Type I CRISPR-Cas complex that includes a plurality of Cas proteins and guide RNA having a sequence selected to recognize a target nucleic acid sequence. The Type I CRISPR-Cas complex includes at least one Cas protein covalently linked to an effector molecule. In some examples, the Cas protein that is covalently linked to the effector molecule is not a Cse1 protein. In some examples, the method includes altering (for example, increasing or decreasing) expression of a nucleic acid in the cell or modifying a target nucleic acid sequence in the cell in vitro, ex vivo, or in vivo. In other examples, the method includes detecting and/or quantifying a target nucleic acid in a cell in vitro, ex vivo, or in vivo.
Though modified systems are illustrated herein in the context of CRISPR-Cascade, also contemplated herein is the use of other similarly modified Type I CRISPR-Cas systems, including but not limited to Type IF CRISPR-Csy systems. For example, in some embodiments, an effector molecule may be linked to one or more Cas7 (Csy3) proteins in a CRISPR-Csy complex. It is expected that the CRISPR-Cas nucleic acid systems described herein are equally applicable for any Type I CRISPR-Cas protein complexes.
III. Cascade/Effector Molecule Constructs
Disclosed herein are constructs that include an effector molecule tethered or linked to a Cascade component. The Cas-effector component may be present or expressed as an individual protein or as part of a larger Cascade complex. Although the examples provided herein are in the context of Type IE CRISPR-Cascade and Cas7, it is contemplated that other Cascade proteins and/or any other Type I CRISPR system could be similarly adapted.
In non-limiting examples, the constructs include VP64 linked to Cas7. Exemplary VP64-Cas7 nucleic acid and amino acid sequences include or consist of any one of SEQ ID NOs: 1-4. In SEQ ID NO: 1, nucleotides 1-159 encode VP64, nucleotides 160-195 encode a linker, and nucleotides 196-1284 encode Cas7. In SEQ ID NO: 2, amino acids 1-53 are VP64, amino acids 54-65 are the linker, and amino acids 66-427 are Cas7. In SEQ ID NO: 3, nucleotides 1-1089 encode Cas7, nucleotides 1090-1113 encode a linker, and nucleotides 1114-1266 encode VP64. In SEQ ID NO: 4, amino acids 1-363 are Cas7, amino acids 364-371 are the linker, and amino acids 372-421 are VP64.
In additional non-limiting examples, the constructs include a Kruppel associated box (KRAB) repressor domain linked to Cas7. Exemplary KRAB-Cas7 nucleic acid and amino acid sequences include or consist of any one of SEQ ID NOs: 5-8. In SEQ ID NO: 5, nucleotides 1-198 encode KRAB, nucleotides 199-234 encode a linker, and nucleotides 235-1323 encode Cas7. In SEQ ID NO: 6, amino acids 1-66 are KRAB, amino acids 67-78 are the linker, and amino acids 79-440 are Cas7. In SEQ ID NO: 7, nucleotides 1-1089 encode Cas7, nucleotides 1090-1113 encode a linker, and nucleotides 1114-1311 encode KRAB. In SEQ ID NO: 8, amino acids 1-363 are Cas7, amino acids 364-371 are the linker, and amino acids 372-436 are KRAB.
In additional non-limiting examples, the constructs include emerald green fluorescent protein (emGFP) linked to Cas7. Exemplary emGFP-Cas7 nucleic acid and amino acid sequences include or consist of any one of SEQ ID NOs: 9-12. In SEQ ID NO: 9, nucleotides 1-717 encode emGFP and nucleotides 718-1806 encode Cas7. In SEQ ID NO: 10, amino acids 1-239 are emGFP and amino acids 240-601 are Cas7. In SEQ ID NO: 11, nucleotides 1-1089 encode Cas7, nucleotides 1090-1104 encode a linker, and nucleotides 1105-1821 encode emGFP. In SEQ ID NO: 12, amino acids 1-390 are Cas7, amino acids 391-396 are the linker, and amino acids 397-606 are emGFP.
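The nucleotide and amino acid coordinates given for these constructs are related by simple codon arithmetic, which the following Python sketch illustrates using the segment boundaries stated for SEQ ID NO: 1. The function name is hypothetical, and the sketch assumes 1-based inclusive coordinates and an open reading frame that begins at nucleotide 1.

```python
def nt_to_aa(nt_start: int, nt_end: int) -> tuple[int, int]:
    """Map a 1-based, inclusive, in-frame nucleotide range to codon positions."""
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range must start and end on codon boundaries")
    return (nt_start - 1) // 3 + 1, nt_end // 3


# Segment boundaries as stated in the text for SEQ ID NO: 1.
segments = {"VP64": (1, 159), "linker": (160, 195), "Cas7": (196, 1284)}
codons = {name: nt_to_aa(*rng) for name, rng in segments.items()}
# codons["VP64"]   -> (1, 53)
# codons["linker"] -> (54, 65)
# codons["Cas7"]   -> (66, 428)
```

The VP64 and linker ranges match the amino acid positions stated for SEQ ID NO: 2. The Cas7 segment ends at codon 428, one past the stated amino acid 427, which would be consistent with the final codon being a stop codon; that reading is an assumption and is not stated in the text.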
In other examples, the effector molecule-Cas7 nucleic acids have a nucleic acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11. In some examples, the effector molecule-Cas7 nucleic acids encode an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12. In further examples, the effector molecule-Cas7 protein has an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
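As a rough illustration of how a percent identity threshold of this kind can be evaluated, the Python sketch below counts matching positions between two sequences that have already been aligned. It is a simplified assumption-laden example: the function name and the short sequences are hypothetical, columns containing a gap are skipped, and alignment programs may define the denominator differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical pre-aligned sequences differing at one of ten positions.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
meets_threshold = identity >= 90.0
```

In practice the pairwise alignment itself would be produced by a dedicated alignment tool before a calculation like this is applied.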
IV. Effector Molecules
The disclosed compositions and methods include one or more effector molecules. In some embodiments, effector molecules have one or more DNA-modulating activities, such as proteins (or protein domains) that can modify expression or sequence of a target nucleic acid. In other embodiments, effector molecules include agents (such as a reporter) that can be detected, for example, to visualize or identify a cell including a target nucleic acid or that can be used to quantitate expression of a target nucleic acid.
In some embodiments, effector molecules include proteins or protein domains that can modify expression of a target nucleic acid, such as transcriptional activators or repressors. Transcriptional activators are molecules that increase gene expression. Transcriptional activators typically include DNA binding and transcriptional activation functions, which can be included in a single protein or separate proteins. In the constructs and methods described herein, the DNA binding function of the transcriptional activator is provided by the Cascade complex and the transcription activation function is provided by the effector molecule (such as a transcription activator domain) tethered to a Cascade subunit. An exemplary transcriptional activator disclosed herein is the VP16 activation domain from herpes simplex virus. In some examples, a multimer of four VP16 molecules (referred to as VP64) is the effector molecule. Other VP16 activation multimers can also be used, for example VP160, which includes 10 VP16 molecules. In other examples, a transcriptional activator includes a p65 activation domain, an RNA polymerase omega subunit, human heat shock factor 1, a viral RTA activation domain, a heat shock factor 1 (HSF1) activation domain, or a sigma factor (such as σ70 (RpoD)). An additional alternative VP16-containing activator is VPR, which includes VP64, a p65 activation domain, and an RTA activation domain. Additional fusions, for example with a p65 activation domain or HSF1 activation domain, may also be used. Additional transcriptional activators or transcriptional activation domains can also be selected.
Transcriptional repressors are molecules that decrease or inhibit gene expression. An exemplary transcriptional repressor is the Kruppel associated box (KRAB) repressor domain. Other transcriptional repressors include REST, thyroid hormone receptors α and β, and repressor domains derived from Egr-1, Oct2A, and Dr1 (see, e.g., Thiel et al., Biological Chem. 382:891-902, 2001). Additional transcriptional repressors or transcriptional repressor domains can also be selected.
In other embodiments, effector molecules include proteins that can modify a nucleic acid sequence, such as a protein with base editing activity. Base editing is a technique that permits direct conversion of a specific nucleotide (for example, present in genomic DNA) to another nucleotide. Exemplary base editors include cytidine deaminases (e.g., APOBEC1, AID, CDA1, and APOBEC3G) and adenosine deaminases (e.g., TadA) or variants thereof. In other examples, uridine glycosylase inhibitor (UGI) is the effector. UGI may also be included as an effector in combination with a base editor effector or as a fusion protein with a base editor (such as cytidine deaminase). Base editors include those described in Komor et al. (Sci. Adv. 3:eaao4774, 2017), Gaudelli et al. (Nature 551:464-471, 2017) and Nishida et al. (Science 353:aaf8729, 2016).
In additional embodiments, the effector molecule is a reporter or detectable label. The reporter can be used to visualize or localize a target nucleic acid, for example in a sample, cell, tissue, or organism. In some examples, a reporter can also be used to quantify (for example, quantitatively or semi-quantitatively) an amount of a target nucleic acid in a sample, cell, tissue, or organ. Reporter effector molecules include fluorescent proteins, such as green fluorescent protein (GFP) or any of the recombinant variants of GFP, including enhanced GFP (EGFP) or emerald GFP (emGFP), blue fluorescent proteins (BFP and derivatives), cyan fluorescent protein (CFP and other derivatives), yellow fluorescent protein (YFP and other derivatives) and red fluorescent protein (RFP and other derivatives). Other detectable agents include but are not limited to fluorescent dyes or quantum dots. In some examples, the reporter includes a Halo tag conjugated to a fluorescent dye (see, e.g., Deng et al., PNAS 112:11870-11875, 2015). Exemplary fluorescent dyes include Alexa Fluor dyes, Cy3, Cy5, fluorescein, fluorescein isothiocyanate, DAPI, Hoechst dye, acridine orange, and Texas Red. Other fluorescent dyes can be selected.
In additional examples, multiple copies of an effector molecule can be recruited to the complex by linking a SunTag scaffold containing 10-24 copies of the short epitope GCN4 to the Cas polypeptide. GCN4 recruits an effector molecule fused to the cognate scFv antibody, which is expressed from a separate plasmid. This system amplifies the number of effector molecules which can be included in the complex, for example, increasing intensity of the fluorescent signal in the case of a fluorescent protein effector molecule. See, e.g., Tanenbaum et al., Cell 159:635-646, 2014.
In further examples, the effector is a component of the non-homologous end joining (NHEJ) repair pathway (e.g., LIG4, Ku70/80, or DNA-PKcs) or homology-dependent repair pathway (e.g., CtIP, Exo1, or RAD51).
In other examples, the effector provides tethering of a single-stranded DNA oligo template for homology directed repair (HDR) to the CRISPR RNP for delivery to the site of a DNA break. In some embodiments, a DNA binding domain (such as a FokI DNA binding domain, a zinc finger protein DNA binding domain, or a TALE DNA binding domain) is linked to a Cas protein. In one specific example, the DNA binding domain of the FokI restriction enzyme is linked to a Cascade protein (such as Cas7 or Csy3), to allow for delivery of a donor DNA to the site of a DNA break (double-stranded or nicked) for increased homology-directed repair. A ssODN donor would include a short double-stranded “handle” for binding to the FokI-Cascade fusion and targeted to a specific locus.
In other examples, the effector provides cell cycle control, for example, using an N-terminal fragment of geminin linked to a Cas protein (such as Cas7) for expression during cell cycles when HDR is active. See, e.g., Gutschner et al., Cell Reports 14:1555-1566, 2016.
In other examples, the effector provides chemical control of activity requiring small-molecule ligands for protein stabilization or delivery to the nucleus. In one example, the effector is dihydrofolate reductase (DHFR) or a DHFR-derived destabilization domain which can be regulated with addition of trimethoprim. In other examples, the effector is a destabilizing domain of the estrogen receptor (such as ER50), which can be regulated by CMP8 or 4-hydroxytamoxifen. In further examples, the effector includes the hormone binding domain of the estrogen receptor (ERT2), which can be regulated with addition of estrogen or an estrogen agonist (such as tamoxifen or 4-hydroxytamoxifen). In still further examples the effector is CRY2/CIB1 for photoinduced dimerization.
V. Delivery Systems
The efficient delivery of nucleic acid modulating systems, including the Type I CRISPR-Cas complex systems including at least one linked effector molecule described herein, provides for safer and more effective delivery systems, which are especially useful in the clinical setting. Representative delivery systems disclosed herein include methods and compositions containing viral and/or non-viral vectors to deliver nucleic acid editing systems, particularly Type I CRISPR-Cas complex systems including at least one covalently linked effector molecule, and optionally an editing template to edit genes in cells. While gene editing is particularly useful in vivo, in some embodiments, the cell targeted for gene editing may be in vitro, ex vivo, or in vivo.
In some embodiments, viral vectors or plasmids for gene expression can be used to deliver the Type I CRISPR complexes including at least one linked effector molecule disclosed herein. In some examples, nucleic acids encoding Type I CRISPR complex proteins, including at least one Cas protein linked to an effector molecule, are delivered to a cell in one or more vectors, for example by transformation, transfection, or other known methods of delivery to cells. In some examples, virus-like particles (VLP) can be used to encapsulate ribonucleoprotein complexes. In other examples, recombinant expression can be used, and the ribonucleoprotein complexes disclosed herein can be purified and delivered to cells via electroporation or injection.
Delivery vehicles may be viral vectors or non-viral vectors, or RNA conjugates. In some embodiments, the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in one or more viral vectors or non-viral vectors (such as 1, 2, 3, or more vectors). In some examples, the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in a single vector. In other examples, the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in two vectors. In further examples, the components of the CRISPR-Cas complex with at least one linked effector molecule are provided in three vectors. The components of the CRISPR-Cas complex may be expressed such that one or more linked proteins are produced (for example, linked by an amino acid sequence) or may be expressed as individual open reading frames (for example, using a vector that results in production of polycistronic RNA).
Cas subunits can be included in a single vector or multiple vectors. In some embodiments, the nucleic acids encoding Type I CRISPR system proteins are included in two or more vectors, which in some examples can provide improved complex stability upon expression in a cell. In some examples, a first vector includes nucleic acids encoding one or more Type I CRISPR proteins and a second vector includes nucleic acids encoding one or more Type I CRISPR proteins. One or more of the CRISPR proteins encoded by the first and second vector may be the same and/or one or more of the CRISPR proteins encoded by the first and second vector may be different. In one non-limiting example, a first vector includes nucleic acids encoding Cse1 (Cas8), Cse2, Cas7 linked to an effector, Cas5, and Cas6 and a second vector includes nucleic acids encoding Cas7, Cas5, and Cas6 (e.g., FIGS. 3A and 3B). In another non-limiting example, a first vector includes nucleic acids encoding Cse1 (Cas8), Cse2, and Cas7 linked to an effector and a second vector includes nucleic acids encoding Cas5 and Cas6 (e.g., FIGS. 4A and 4B). Other combinations of protein expression vectors can be used, and can be tested, for example based on relative quantities of each subunit expressed. Exemplary vectors include those provided herein as SEQ ID NOs: 13-15.
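The two-vector layouts described in this paragraph can be written down as a small data structure and checked for completeness, as in the following hedged Python sketch. The layout contents follow the two non-limiting examples in the text; the subunit spelling Cse1 (Cas8), the "-effector" suffix convention, and the function name are illustrative assumptions rather than part of the disclosure.

```python
# The five Type IE Cascade subunits named in the text.
REQUIRED = {"Cse1", "Cse2", "Cas7", "Cas5", "Cas6"}

# Each layout is a list of vectors; each vector is a list of encoded proteins.
layouts = {
    "FIGS. 3A-3B": [
        ["Cse1", "Cse2", "Cas7-effector", "Cas5", "Cas6"],
        ["Cas7", "Cas5", "Cas6"],
    ],
    "FIGS. 4A-4B": [
        ["Cse1", "Cse2", "Cas7-effector"],
        ["Cas5", "Cas6"],
    ],
}


def missing_subunits(vectors):
    """Return the required subunits that no vector in the layout encodes."""
    # "Cas7-effector" counts as Cas7: strip the hypothetical "-effector" suffix.
    encoded = {orf.split("-")[0] for vector in vectors for orf in vector}
    return REQUIRED - encoded


report = {name: missing_subunits(vectors) for name, vectors in layouts.items()}
```

An empty set for a layout indicates that every subunit is encoded by at least one vector; the check says nothing about relative expression levels, which the text notes may need to be tested.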
The vector(s) may further include components for targeting the Cas subunits to the nucleus, such as one or more nuclear localization signals (NLS) linked to one or more of the Cas subunits. In some examples, an NLS is linked to Cse1 (Cas8), for example, linked to the N-terminus of Cse1. In other examples, an NLS is linked to Cas6, for example, linked to the C-terminus of Cas6. An NLS can alternatively be linked to one or more other proteins in the complex and/or to the effector molecule or between the effector molecule and Cas7.
The vector may also include one or more tags for protein purification, such as streptavidin (e.g., Strep-tagII), maltose binding protein (MBP), 6xHistidine (6xHis), small ubiquitin like modifier (SUMO), or glutathione S transferase (GST), or a combination of two or more thereof (e.g., His-MBP). In some examples, the purification tag can be linked to the N- or C-terminus of any subunit of the complex. In one non-limiting example, streptavidin (e.g., Strep-tagII) is linked to the N-terminus of Cse2. See, e.g., Brouns et al., Science 321:960-964, 2008.
In some embodiments, the guide sequence and the CRISPR-Cas complex with at least one linked effector molecule are provided in the same type of delivery vehicle, wherein the delivery vehicle is a viral vector or a non-viral vector. In other embodiments, the guide sequence is provided in a viral vector, and the CRISPR-Cas complex with at least one linked effector molecule is provided in non-viral vector(s). In still other embodiments, the guide sequence is provided in a non-viral vector and the CRISPR-Cas complex with at least one linked effector molecule is provided in viral vector(s). In some embodiments, the guide sequence is provided in an RNA conjugate.
Any vector system may be used, including, but not limited to, plasmid vectors, linear constructs, retroviral vectors, lentiviral vectors, adenovirus vectors, poxvirus vectors, herpesvirus vectors, and adeno-associated virus vectors, etc. See, also, U.S. Pat. Nos. 6,534,261; 6,607,882; 6,824,978; 6,933,113; 6,979,539; 7,013,219; and 7,163,824, incorporated by reference herein in their entireties. Furthermore, any of these vectors may comprise one or more CRISPR-Cas encoding sequences and/or additional nucleic acids as appropriate. Thus, when one or more Type I CRISPR-Cas proteins and/or guide sequence as described herein are introduced into the cell, and additional DNAs as appropriate, they may be carried on the same vector or on different vectors. When multiple constructs are used, each vector may comprise a sequence encoding one or multiple components of the Type I CRISPR-Cas complexes, as desired. Exemplary bacterial vectors for expression of the components are shown in FIGS. 3A-3B and 4A-4B and include SEQ ID NOs: 13-15.
Conventional viral and non-viral based gene transfer methods can be used to introduce nucleic acids encoding engineered Type I CRISPR-Cas complexes including at least one linked effector molecule into cells (e.g., bacterial, animal, plant, fungal, or algal cells) and target tissues and to co-introduce additional nucleotide sequences if desired. Such methods can also be used to administer nucleic acids (e.g., encoding CRISPR-Cas complexes including at least one linked effector molecule or components thereof) to cells in vitro. In certain embodiments, nucleic acids are administered for in vivo or ex vivo gene therapy uses.
Non-viral vector delivery systems include DNA plasmids, naked nucleic acid, and nucleic acid complexed with a delivery vehicle such as a liposome or polymer. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell. For a review of gene therapy procedures, see Anderson, Science 256:808-813, 1992; Nabel & Felgner, TIBTECH 11:211-217, 1993; Mitani & Caskey, TIBTECH 11:162-166, 1993; Dillon, TIBTECH 11:167-175, 1993; Miller, Nature 357:455-460, 1992; Van Brunt, Biotechnology 6(10):1149-1154, 1988; Vigne, Restorative Neurology and Neuroscience 8:35-36, 1995; Kremer & Perricaudet, British Medical Bulletin 51(1):31-44, 1995; Haddada et al., in Current Topics in Microbiology and Immunology, Doerfler and Bohm (eds.) (1995); and Yu et al., Gene Therapy 1:13-26, 1994.
Viral Vectors
In some embodiments, the viral vector is selected from an adeno-associated virus (AAV), adenovirus, retrovirus, and lentivirus vector. While the viral vector may deliver any component of the systems described herein so long as it provides the desired profile for tissue presence or expression, in some embodiments the viral vector provides for expression of the guide sequence and optionally delivers a repair template. In some embodiments, the viral delivery system is adeno-associated virus (AAV) 2/8. However, in various embodiments other AAV serotypes are used, such as AAV1, AAV2, AAV4, AAV5, AAV6, and AAV8. In some embodiments, AAV6 is used when targeting airway epithelial cells, AAV7 is used when targeting skeletal muscle cells (similarly for AAV1 and AAV5), and AAV8 is used for hepatocytes. In some embodiments, AAV1 and AAV5 can be used for delivery to vascular endothelial cells. Further, most AAV serotypes show neuronal tropism, while AAV5 also transduces astrocytes. In some embodiments, hybrid AAV vectors are employed. In some embodiments, each serotype is administered only once to avoid immunogenicity. Thus, subsequent administrations employ different AAV serotypes. Additional viral vectors that can be employed are as described in US 8,697,359, which is hereby incorporated by reference in its entirety.
Non-Viral Vectors
In some embodiments, the delivery system comprises a non-viral delivery vehicle. In some aspects, the non-viral delivery vehicle is lipid-based. In other aspects, the non-viral delivery vehicle is a polymer. In some embodiments, the non-viral delivery vehicle is biodegradable. In embodiments, the non-viral delivery vehicle is a lipid encapsulation system and/or polymeric particle.
Methods of non-viral delivery of nucleic acids include electroporation, nucleofection, lipofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, mRNA, artificial virions, and agent-enhanced uptake of DNA. Sonoporation using, e.g., the Sonitron 2000 system (Rich-Mar) can also be used for delivery of nucleic acids. In one embodiment, one or more nucleic acids are delivered as mRNA. Use of capped mRNAs to increase translational efficiency and/or mRNA stability is included in some embodiments. In particular examples, ARCA (anti-reverse cap analog) caps or variants thereof are used. See U.S. Pat. Nos. 7,074,596 and 8,153,773, incorporated by reference herein.
Additional exemplary nucleic acid delivery systems include those provided by Lonza (Cologne, Germany), Amaxa Biosystems (Cologne, Germany), Maxcyte, Inc. (Rockville, Md.), BTX Molecular Delivery Systems (Holliston, Mass.) and Copernicus Therapeutics, Inc. (see, for example, U.S. Pat. No. 6,008,336). Lipofection is described in, e.g., U.S. Pat. No. 5,049,386, U.S. Pat. No. 4,946,787, and U.S. Pat. No. 4,897,355, and lipofection reagents are sold commercially (e.g., TRANSFECTAM™, LIPOFECTIN™, and LIPOFECTAMINE™ RNAiMAX). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424 and WO 91/16024. Delivery can be to cells (ex vivo administration) or target tissues (in vivo administration).
Lipid-Based and Polymeric Non-Viral Vectors
In certain embodiments, the delivery system includes lipid particles as described in Kanasty (Nat Mater. 12(11):967-77, 2013), which is hereby incorporated by reference. In some embodiments, the lipid-based vector is a lipid nanoparticle, which is a lipid particle between about 1 and about 100 nanometers in size.
In some embodiments, the lipid-based vector is a lipid or liposome. Liposomes are artificial spherical vesicles comprising a lipid bilayer.
In some embodiments, the lipid-based vector is a small nucleic acid-lipid particle (SNALP).
SNALPs are small (less than 200 nm in diameter) lipid-based nanoparticles that encapsulate a nucleic acid.
In some embodiments, the SNALP is useful for delivery of an RNA molecule such as crRNA. In some embodiments, SNALP formulations deliver nucleic acids to a particular tissue in a subject, such as the liver.
In some embodiments, the guide sequence and/or Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof (or the RNA encoding the same) is delivered via polymeric vectors. In some embodiments, the polymeric vector is a polymer or polymersome. Polymers encompass any long repeating chain of monomers and include, for example, linear polymers, branched polymers, dendrimers, and polysaccharides. Linear polymers include a single line of monomers, whereas branched polymers include side chains of monomers. Dendrimers are also branched molecules, which are arranged symmetrically around the core of the molecule. Polysaccharides are polymeric carbohydrate molecules, and are made up of long chains of monosaccharide units linked together. Polymersomes are artificial vesicles made up of synthetic amphiphilic copolymers that form a vesicle membrane, and may have a hollow or aqueous core within the vesicle membrane.
Various polymer-based systems can be adapted as a vehicle for administering RNA encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof.
Exemplary polymeric materials include poly(D,L-lactic acid-co-glycolic acid) (PLGA), poly(caprolactone) (PCL), ethylene vinyl acetate polymer (EVA), poly(lactic acid) (PLA), poly(L-lactic acid) (PLLA), poly(glycolic acid) (PGA), poly(L-lactic acid-co-glycolic acid) (PLLGA), poly(D,L-lactide) (PDLA), poly(L-lactide) (PLLA), PLGA-b-poly(ethylene glycol)-PLGA (PLGA-bPEG-PLGA), PLLA-bPEG-PLLA, PLGA-PEG-maleimide (PLGA-PEG-mal), poly(D,L-lactide-co-caprolactone), poly(D,L-lactide-co-caprolactone-co-glycolide), poly(D,L-lactide-co-PEO-co-D,L-lactide), poly(D,L-lactide-co-PPO-co-D,L-lactide), polyalkyl cyanoacrylate, polyurethane, poly-L-lysine (PLL), hydroxypropyl methacrylate (HPMA), polyethyleneglycol, poly-L-glutamic acid, poly(hydroxy acids), polyanhydrides, polyorthoesters, poly(ester amides), polyamides, poly(ester ethers), polycarbonates, polyalkylenes such as polyethylene and polypropylene, polyalkylene glycols such as poly(ethylene glycol) (PEG), polyalkylene oxides (PEO), polyalkylene terephthalates such as poly(ethylene terephthalate), polyvinyl alcohols (PVA), polyvinyl ethers, polyvinyl esters such as poly(vinyl acetate), polyvinyl halides such as poly(vinyl chloride) (PVC), polyvinylpyrrolidone, polysiloxanes, polystyrene (PS), polyurethanes, derivatized celluloses such as alkyl celluloses, hydroxyalkyl celluloses, cellulose ethers, cellulose esters, nitrocelluloses, hydroxypropylcellulose, carboxymethylcellulose, polymers of acrylic acids, such as poly(methyl(meth)acrylate) (PMMA), poly(ethyl(meth)acrylate), poly(butyl(meth)acrylate), poly(isobutyl(meth)acrylate), poly(hexyl(meth)acrylate), poly(isodecyl(meth)acrylate), poly(lauryl(meth)acrylate), poly(phenyl(meth)acrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), poly(octadecyl acrylate) (polyacrylic acids), and copolymers and mixtures thereof, polydioxanone and its copolymers, polyhydroxyalkanoates, poly(propylene fumarate), polyoxymethylene, poloxamers, poly(ortho)esters, poly(butyric acid), poly(valeric acid), poly(lactide-co-caprolactone), trimethylene carbonate, polyvinylpyrrolidone, polyorthoesters, polyphosphazenes, poly(β-amino esters) (PBAE), and polyphosphoesters, and blends and/or block copolymers of two or more such polymers.
Polymer-based systems may also include cyclodextrin polymer (CDP)-based nanoparticles such as, for example, CDP-adamantane (AD)-PEG conjugates and CDP-AD-PEG-transferrin conjugates.
Exemplary polymeric particle systems for delivery of substances, including nucleic acids, include those described in US 5,543,158, US 6,007,845, US 6,254,890, US 6,998,115, US 7,727,969, US 7,427,394, US 8,323,698, US 8,071,082, US 8,105,652, US 2008/0268063, US 2009/0298710, US 2010/0303723, US 2011/0027172, US 2011/0065807, US 2012/0156135, US 2014/0093575, and WO 2013/090861, each of which is hereby incorporated by reference in its entirety.
In one embodiment, the delivery system is a layer-by-layer particle system including two or more layers. In a further embodiment, the guide RNA and the nucleic acid(s) encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof are present in different layers within the layer-by-layer particle. In a yet further embodiment, the guide RNA and the nucleic acid(s) encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof may be administered to a subject in a layer-by-layer particle system such that the release of the guide RNA and the nucleic acid(s) encoding the Type I CRISPR-Cas system including at least one linked effector molecule or component(s) thereof from the particles can be controlled in a cell-specific and/or temporal fashion. Layer-by-layer particle systems are disclosed, for example, in US 2014/0093575, incorporated herein by reference in its entirety.
Lipid Encapsulation System Vectors
In some embodiments, the lipid-based delivery system includes a lipid encapsulation system. The lipid encapsulation system can be designed to drive the desired tissue distribution and cellular entry properties, as well as to provide the requisite circulation time and biodegrading character. The lipid encapsulation may involve reverse micelles and/or further comprise polymeric matrices, for example as described in US 8,193,334, which is hereby incorporated by reference. In some embodiments, the particle includes a lipophilic delivery compound to enhance delivery of the particle to tissues, including in a preferential manner. Such compounds are disclosed in US 2013/0158021, which is hereby incorporated by reference in its entirety. Such compounds may generally include lipophilic groups and conjugated amino acids or peptides, including linear or cyclic peptides, and including isomers thereof. An exemplary compound is referred to as cKK-E12, which can affect delivery to liver and kidney cells, for example. The present disclosure can employ compounds of formulas (I), (II), (III), (IV), (V), and (VI) of US 2013/0158021. Compounds can be engineered for targeting to various tissues, including but not limited to pancreas, spleen, liver, fat, kidneys, uterus/ovaries, muscle, heart, lungs, endothelial tissue, and thymus.
In some embodiments, the lipid encapsulation comprises one or more of a phospholipid, cholesterol, polyethylene glycol (PEG)-lipid, and a lipophilic compound. In some embodiments, the lipophilic compound is C12-200, particularly in embodiments that target the liver (Love et al., PNAS 107(5):1864-1869, 2010 (erratum in PNAS 107(21), 2010), incorporated herein by reference in its entirety). In other embodiments, the lipophilic compound C12-200 is useful in embodiments that target fat tissue. In still other embodiments, the lipopeptide is cKK-E12 (Dong et al., PNAS 111(11):3955-3960, 2014, incorporated herein by reference in its entirety).
In some embodiments, the lipid encapsulation includes 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), cholesterol, C14-PEG2000, and cKK-E12, which provides for efficient in vivo gene modulation in liver tissue.
Additional Components and Features of Non-viral Vectors
When used, delivery particles, whether lipid or polymeric or both, may include additional components useful for enhancing the properties for in vivo nucleic acid delivery (including compounds disclosed in US 8,450,298 and US 2012/0251560, which are each hereby incorporated by reference).
The delivery vehicle may accumulate preferentially in certain tissues thereby providing a tissue targeting effect, but in some embodiments, the delivery vehicle further comprises at least one cell-targeting or tissue-targeting ligand and/or a tissue-specific promoter. Functionalized particles, including exemplary targeting ligands, are disclosed in US 2010/0303723 and US 2012/0156135, which are hereby incorporated by reference in their entireties. A delivery vehicle can be designed to drive the desired tissue distribution and cellular entry properties of the delivery systems disclosed herein, as well as to provide the requisite circulation time and biodegrading character. For example, lipid particles can employ amino lipids as disclosed in US 2011/0009641, which is hereby incorporated by reference.
The lipid or polymeric particles may have a size (e.g., an average size) in the range of about 50 nm to about 5 μm. In some embodiments, the particles are in the range of about 10 nm to about 100 μm, or about 20 nm to about 50 μm, or about 60 nm to about 5 μm, or about 70 nm to about 500 nm, or about 70 nm to about 200 nm, or about 50 nm to about 100 nm. Particles may be selected so as to avoid rapid clearance by the immune system. Particles may be spherical, or non-spherical in certain embodiments.
In some embodiments, the non-viral delivery vehicle may be a peptide, such as cell-penetrating peptides or cellular internalization sequences. Cell-penetrating peptides are small peptides that are capable of translocating across plasma membranes. Exemplary cell-penetrating peptides include, but are not limited to, Antennapedia sequences, TAT, HIV-Tat, Penetratin, Antp-3A (Antp mutant), Buforin II, Transportan, MAP (model amphipathic peptide), K-FGF, Ku70, Prion, pVEC, Pep-1, SynB1, Pep-7, 1-IN-1, BGSC (Bis-Guanidinium-Spermidine-Cholesterol), and BGTC (Bis-Guanidinium-Tren-Cholesterol).
VI. Expression and Purification
In some embodiments, the present disclosure provides plasmids for transgenic or transient expression of the Type I CRISPR-Cas proteins including at least one Cas protein linked to an effector molecule. In some embodiments, a plasmid encoding a chimeric Type I CRISPR-Cas protein includes in-frame sequences for protein fusions of one or more of the other proteins described herein, including, but not limited to, a Type I CRISPR-Cas protein, an effector molecule, optionally a linker, and optionally a nuclear localization sequence (NLS).
In some embodiments the plasmids and vectors encode the CRISPR-Cas protein(s) and effector molecule and also encode the guide RNA of the present invention. In other embodiments, one or more components of the engineered complex can be encoded in two or more distinct plasmids.
In some embodiments the plasmids can be used across multiple species. In other embodiments, the plasmids are tailored to the organism or type of cell being transformed. In some embodiments, the sequences of the nucleic acids are codon-optimized for expression in the organism whose genes are being targeted. Promoters providing adequate expression can be selected. In some embodiments, the plasmids for different species will require different promoters.
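By way of illustration only, the codon optimization mentioned above can be sketched as a back-translation that picks, for each amino acid, the most frequent codon in a host-specific usage table. The function name `codon_optimize`, the table `EXAMPLE_USAGE`, and its frequency values are hypothetical placeholders and are not part of the present disclosure; an actual table would cover all amino acids and be derived from usage data for the organism being targeted.

```python
# Hypothetical, partial codon-usage table: residue -> {codon: relative frequency}.
# The numbers are illustrative placeholders, not measured values.
EXAMPLE_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.40, "AAG": 0.60},
    "L": {"CTG": 0.45, "CTC": 0.20, "TTA": 0.35},
    "*": {"TAA": 0.50, "TGA": 0.30, "TAG": 0.20},
}


def codon_optimize(protein, usage):
    """Back-translate a protein, using the most frequent codon for each residue."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise KeyError(f"no codon usage data for residue {residue!r}")
        table = usage[residue]
        # Pick the codon with the highest relative frequency in this host.
        codons.append(max(table, key=table.get))
    return "".join(codons)
```

Choosing the single most frequent codon is the simplest strategy; other approaches (for example, sampling codons in proportion to their frequency) may be preferred in practice and are not shown here.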
In some embodiments, the plasmids and vectors are selectively expressed in the cells of interest. Thus in some embodiments, the present application teaches the use of ectopic promoters, tissue-specific promoters, developmentally-regulated promoters, or inducible promoters. In some embodiments, the present disclosure also includes the use of terminator sequences.
In other embodiments, a portion, or the entire complex(es) of the present technology, or the entire set of components of a Cascade-effector molecule complex, can be delivered directly to cells (e.g., through microinjection). Thus in some embodiments, the polypeptides and/or nucleic acids are expressed and purified. In some embodiments, the polypeptides are expressed via inducible or constitutive protein production systems such as a bacterial system, yeast system, plant cell system, or animal cell system. In some embodiments, the proteins and/or polypeptides may be purified via affinity tags or custom antibody purifications. In other embodiments, polynucleotides may be chemically synthesized.
In some embodiments, the nucleic acids disclosed herein are transformed into a heterologous cell.
The nucleic acids or plasmids disclosed herein can be transformed into cells through any known system. For example, in some embodiments, cells may be transformed by particle bombardment, chemical transformation, Agrobacterium transformation, nano-spike transformation, and/or virus transformation.
VII. Delivery of Constructs or Complexes
The delivery vehicles (whether comprising conjugates, RNPs, viral or non-viral vectors, or a combination thereof) may be administered to a subject by any method known in the art, including injection, optionally by direct injection to target tissues, specific target cells, and even to specific organelles within a single cell (e.g., the nucleus). In some embodiments, the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and, optionally, Cas3 and/or repair template are administered simultaneously in the same or in different delivery vehicles. In other embodiments, the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and, optionally, Cas3 and/or repair template are administered sequentially via separate delivery vehicles. In some embodiments, the guide sequence is administered 1-30 days (for example, 1, 3, 5, 7, 10, 14, or 30 days) prior to administration of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, such that the guide sequence accumulates in the target tissue prior to administration of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule. In some embodiments, the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered in a plurality of doses, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more doses. In various embodiments, the guide sequence and/or Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered over a time period of from one week to about six months, such as from about two to about ten doses within about two months, such as from three to five doses over about one month.
In one embodiment, the guide sequence and, optionally, a repair template, are provided in an AAV vector that is administered to the subject or cell prior to administration of a nanoparticle containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule. In a further embodiment, the AAV vector comprising the guide sequence is administered 3, 4, 5, 6, 7, 8, 9, or 10 days prior to the administration of the nanoparticle, to allow expression of the guide sequence from the AAV vector. In a yet further embodiment, the nanoparticle containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered multiple times, for example, once every 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 days. In a still further embodiment, the nanoparticle containing the guide sequence is administered for 1 month, 2 months, 3 months, 4 months, 6 months, 8 months, 10 months, 12 months, 18 months, 24 months, or longer. Since AAV expression can occur for 2 years or longer, in one embodiment, the expression of the guide sequence and, optionally, repair template, from the AAV vector and the continual administration of nanoparticles containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule provides efficient gene editing of the target sequence with reduced or absent off-target effects due to the transient expression of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule.
In another embodiment, the repair template is delivered via an AAV vector, and is injected 3, 4, 5, 6, 7, 8, 9, or 10 days prior to the administration of nanoparticles containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or the guide sequence. As described above, the nanoparticles may be administered multiple times, and for several months. In such embodiments, the repair template is expressed from the AAV vector in the cell for 2 years or longer, and the nanoparticles comprising the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or guide sequence are administered in multiple administrations over time in order to provide efficient modulation of the target sequence with reduced or absent off-target effects.
In particular embodiments, one or more guide sequences and, optionally, a repair template, are provided in an AAV vector that is administered first, and Type I CRISPR-Cas proteins or a complex including at least one linked effector molecule in a lipid-based delivery vehicle is subsequently administered in one or more doses. In some embodiments, the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is administered in a lipid-based delivery vehicle about 7 days and about 14 days after the administration of the one or more guide sequences in an AAV vector.
In another embodiment, the components of the delivery systems provided herein (e.g., the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, guide sequence and, optionally, repair template) are each contained in the same or in different nanoparticles. In a further embodiment, the nanoparticles containing the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, guide sequence, and, optionally, repair template, are administered at multiple time points, for example, every 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 days. In another embodiment, the nanoparticles separately comprising the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and guide sequence are administered at different time points in order to enhance efficiency in a particular cell or for a particular disease type.
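As a non-limiting illustration of the staggered regimen just described, the sketch below computes calendar dates for an AAV-first administration followed by lipid-based doses about 7 and about 14 days later. The function name `dosing_schedule`, its parameters, and the example start date are hypothetical and serve only to show the arithmetic; the offsets default to the 7-day and 14-day values given above.

```python
from datetime import date, timedelta


def dosing_schedule(aav_day, offsets_days=(7, 14)):
    """Return (date, description) pairs for an AAV-first regimen.

    aav_day: date on which the AAV vector carrying the guide sequence(s)
             (and, optionally, a repair template) is administered.
    offsets_days: days after aav_day on which the lipid-based vehicle is dosed.
    """
    schedule = [(aav_day, "AAV vector: guide sequence(s), optional repair template")]
    for offset in offsets_days:
        schedule.append(
            (aav_day + timedelta(days=offset),
             "lipid-based vehicle: Type I CRISPR-Cas proteins or complex")
        )
    return schedule


if __name__ == "__main__":
    # Hypothetical start date, chosen only for the example.
    for day, what in dosing_schedule(date(2024, 1, 1)):
        print(day.isoformat(), what)
```

Other intervals described herein can be represented by passing a different `offsets_days` tuple.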
In some embodiments, the administration of the delivery system is controlled so that expression of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is transient. In some embodiments, such transient expression minimizes off-target effects, thereby increasing the safety and efficiency of the system disclosed herein. For example, expression of the system is controlled via selection of the delivery vehicles and/or promoters disclosed herein.
In another embodiment, the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and optionally, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or repair template, are administered to a subject or a cell at the same time, such as on the same delivery vehicle, and one or more component is under the control of an inducible promoter. As an example, in one embodiment, the guide sequence, Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, and repair template are each present on an AAV viral vector, and the guide sequence is under the control of an inducible promoter, for example, a small molecule-induced promoter such as tetracycline-inducible promoter. In a further embodiment, components of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule are expressed 5-7 days following administration of the vector, after which the expression of the guide sequence is induced by one or more injections of the small molecule such as tetracycline. The guide sequence expression can be induced at various time points in order to increase efficiency; for example guide sequence expression may be induced every day, or every 2 days, or every 3 days, or every 5 days, or every 10 days, or every 2 weeks, for at least 1 week or at least 2 weeks, or at least 3 weeks, or at least 4 weeks, or at least 5 weeks, or at least 6 weeks, or at least 7 weeks, or at least 8 weeks, or at least 10 weeks, or at least 11 weeks, or at least 12 weeks, or more. Thus, component(s) of the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule may be expressed from the AAV vector over time, and the guide sequence may be inducibly expressed by multiple injections of the inducing molecule over several days, weeks, or months. 
Similarly, the guide sequence can be expressed from the AAV vector over time, and components of the Type I CRISPR-Cas complex including at least one linked effector molecule may be inducibly expressed under control of an inducible promoter by multiple injections of the inducing molecule over several days, weeks, or months.
In another embodiment, one or more guide sequences and, optionally, a repair template, are delivered via an RNA conjugate, such as an RNA-GalNAc conjugate, and the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule is delivered via a viral or non-viral vector, such as a nanoparticle. In another embodiment, the guide sequence and optionally repair template are attached to the nanoparticle comprising the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, such that the components are delivered to the target cell or tissue together. In such embodiments, the guide sequence, optional repair template, and Type I CRISPR-Cas proteins or complex including at least one linked effector molecule may be delivered to the target cell or tissue together, and expression of each component may be controlled by way of different promoters, including inducible promoters, as disclosed herein.
In one aspect, the present disclosure provides methods for modulating expression (for example, increasing or decreasing expression) of a target polynucleotide in a cell, which may be in vivo, ex vivo, or in vitro. In other aspects, the present disclosure provides methods for altering the sequence of a target polynucleotide in a cell, for example, changing one or more nucleotide, for example from C to T (or G to A on the opposite strand) or from A to G (or T to C on the opposite strand) in a target nucleic acid. In additional aspects, the present disclosure provides methods for detecting presence and/or quantity of a target polynucleotide in a cell. In some embodiments, the one or more delivery vehicles including Type I CRISPR-Cas proteins or complex including at least one linked effector molecule and/or guide sequence and, optionally, repair template, are administered to a subject.
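The strand relationship noted above (a C to T change on one strand corresponds to a G to A change on the opposite strand) can be checked with a short sketch. The helper names `reverse_complement` and `edit_base` and the four-base example sequence are hypothetical illustrations and do not model any particular base editor.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def edit_base(strand, position, new_base):
    """Return a copy of strand with the base at a 0-based position replaced."""
    if not 0 <= position < len(strand):
        raise IndexError("position outside the strand")
    return strand[:position] + new_base + strand[position + 1:]


if __name__ == "__main__":
    top = "ATCG"                       # hypothetical target strand
    edited = edit_base(top, 2, "T")    # C to T on the edited strand
    print(top, "->", edited)
    # The same change, viewed on the opposite strand, is G to A.
    print(reverse_complement(top), "->", reverse_complement(edited))
```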
In further embodiments, the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule, guide sequence, and optionally, repair template, are targeted to one or more target tissues in the subject. For example, in one embodiment, the target tissue is liver, endothelial tissue, lung (including lung epithelium), kidney, fat, or muscle. In one embodiment, the one or more delivery vehicles comprise a viral vector (e.g., AAV) or a non-viral vector such as, for example, MD-1, 7C1, PBAE, C12-200, cKK-E12, or a conjugate such as a cholesterol conjugate or an RNA conjugate as disclosed herein. In one embodiment, the target tissue is liver, and one or more delivery vehicle is MD-1. In another embodiment, the target tissue is endothelial tissue, and one or more delivery vehicle is 7C1. In another embodiment, the target tissue is lung, and one or more delivery vehicle is PBAE or 7C1. In another embodiment, the target tissue is kidney, and one or more delivery vehicle is an RNA conjugate. In another embodiment, the target tissue is fat, and one or more delivery vehicle is C12-200. In another embodiment, the target tissue is muscle (e.g., skeletal muscle) and one or more delivery vehicle is a cholesterol conjugate.
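For convenience, the tissue and delivery vehicle pairings recited in the preceding paragraph can be summarized as a simple lookup table. The sketch below is illustrative only; the names `VEHICLES_BY_TISSUE` and `candidate_vehicles` are hypothetical, and the table records only the pairings stated above rather than an exhaustive or preferred set.

```python
# Pairings taken from the embodiments described above; keys are lower-case.
VEHICLES_BY_TISSUE = {
    "liver": ("MD-1",),
    "endothelial tissue": ("7C1",),
    "lung": ("PBAE", "7C1"),
    "kidney": ("RNA conjugate",),
    "fat": ("C12-200",),
    "muscle": ("cholesterol conjugate",),
}


def candidate_vehicles(tissue):
    """Return the delivery vehicles listed for a tissue, or an empty tuple."""
    return VEHICLES_BY_TISSUE.get(tissue.strip().lower(), ())


if __name__ == "__main__":
    for name in ("Liver", "lung", "brain"):
        print(name, "->", candidate_vehicles(name) or "no pairing listed")
```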
The delivery vehicles (whether viral vector or non-viral vector or RNA conjugate material) may be administered to a subject by any method known in the art, including injection, optionally by direct injection to target tissues. Nucleic acid modification can be monitored over time by, for example, periodic biopsy with PCR amplification and/or sequencing of the target region from genomic DNA, or by RT-PCR and/or sequencing of the expressed transcripts. Alternatively, nucleic acid modification can be monitored by detection of a reporter gene or reporter sequence. Alternatively, nucleic acid modification can be monitored by expression or activity of a corrected gene product or a therapeutic effect in the subject.
In some embodiments, the subject is a human in need of therapeutic or prophylactic intervention. Alternatively, the subject is an animal, including livestock, poultry, domesticated animal, or laboratory animal. In various embodiments, the subject is a mammal, such as a human, horse, cow, dog, cat, rodent, or pig. Also contemplated are embodiments where the “subject” is a fungus or a plant, and the Type I CRISPR systems described herein are used to modulate the genome of these organisms.
In some embodiments, the methods provided herein include obtaining a cell or population of cells from a subject and modifying expression and/or sequence of a target polynucleotide in the cell or cells ex vivo, using the systems, compositions, and/or methods disclosed herein. In further embodiments, the ex vivo-modified cell or cells may be re-introduced into the subject following ex vivo modification. Thus, the present disclosure provides methods for treating a disease or disorder in a subject, comprising obtaining one or more cells from the subject, modifying one or more target nucleotide sequences in the cell ex vivo, and reintroducing the cell with the modified target nucleotide sequence back into the subject having the disease or disorder. In some embodiments, cells in which nucleotide sequence modification has occurred are expanded in vitro prior to reintroduction into the subject having the disease or disorder. In one embodiment, the cells are bone marrow cells. In other embodiments, the nucleic acid editing system and guide sequence and, optionally, repair template, are administered to a cell in vitro.
In some embodiments, at least one component of the delivery system (e.g., the guide sequence or the Type I CRISPR-Cas proteins or complex including at least one linked effector molecule) accumulates in the target tissue, which may be, for example, liver, heart, lung (including airway epithelial cells), skeletal muscle, CNS (e.g., nerve cells), endothelial cells, blood cells, bone marrow cells, blood cell precursor cells, stem cells, fat cells, or immune cells. Tissue targeting or distribution can be controlled by selection and design of the viral vector, or in some embodiments is achieved by selection and design of lipid or polymeric particles. In some embodiments, the desired tissue targeting of the activity is provided by the combination of viral and non-viral delivery vehicles.
Also contemplated are methods of directly delivering fully assembled nucleoprotein complex(es), or subunits thereof, to a target cell or organelle within the cell (e.g., directly to the nucleus of a eukaryotic cell).
By way of example, CRISPR-Cascade or other Type I complexes including at least one covalently linked effector molecule are recombinantly expressed and purified. NLS-tagged Cas3 is recombinantly expressed and purified separately, or as tethered Cas proteins in the crRNA-guided surveillance complex. In some examples, the protein(s) and RNA (e.g., sgRNA, crRNA) are assembled as a complex in vitro and the purified complex is delivered to a cell.
The CRISPR-Cascade or other Type I complexes including at least one covalently linked effector molecule are injected into either the nucleus or the cytoplasm of a eukaryotic cell. The concentration of each protein or complex injected may be adjusted to limit toxicity and off-target effects. Methods of microinjection into individual cells, or into subcellular organelles (such as the nucleus) are well known in the art; see for instance Microinjection, (eds. Lacal, Perona & Feramisco), Birkhauser Verlag, 1999, and Komarova et al., “Microinjection of Protein Samples,” Chapter 5 in Live Cell Imaging (eds. Goldman & Spector), CSHL Press, 2005. Microinjection devices are commercially available, for instance from Tritech Research (Los Angeles, CA).
VIII. Embodiments of the Disclosure
In addition to, or as an alternative to the above, the following embodiments are described:
Embodiment 1 is directed to a system comprising:
a Type I CRISPR-Cas complex comprising a plurality of Cas proteins, wherein at least one of the Cas proteins is covalently linked to an effector molecule; and
a guide RNA having a sequence selected to recognize a target nucleotide sequence.
Embodiment 2 is directed to the system of embodiment 1, wherein the complex is a CRISPR-Cascade complex, and the plurality of Cas polypeptides comprises Cas8 (Cse1), Cse2, Cas7, Cas5, and Cas6, and the effector molecule is not linked to the Cas8 protein.
Embodiment 3 is directed to the system of embodiment 1 or embodiment 2, wherein the effector molecule is covalently linked to the N-terminus or C-terminus of the at least one Cas protein.
Embodiment 4 is directed to the system of any one of embodiments 1 to 3, wherein the effector molecule is directly linked to the at least one Cas protein.
Embodiment 5 is directed to the system of any one of embodiments 1 to 3, wherein the effector molecule is linked to the at least one Cas protein via a linker.
Embodiment 6 is directed to the system of embodiment 5, wherein the effector molecule is linked to the at least one Cas protein by an amino acid linker or wherein the effector molecule is linked to the at least one Cas protein by a streptavidin-biotin linker.
Embodiment 7 is directed to the system of any one of embodiments 1 to 6, wherein the effector molecule comprises a transcriptional activator, a transcriptional repressor, a base editor, a reporter, or a combination of two or more thereof.
Embodiment 8 is directed to the system of embodiment 7, wherein the effector molecule comprises a transcriptional activator comprising one or more VP16 domains, a p65 activation domain, an RNA polymerase omega subunit, a viral RTA activation domain, or a combination of two or more thereof.
Embodiment 9 is directed to the system of embodiment 8, wherein the effector molecule comprises four VP16 domains (VP64) or ten VP16 domains (VP160).
Embodiment 10 is directed to the system of embodiment 7, wherein the effector molecule comprises a transcriptional repressor comprising a Kruppel associated box (KRAB) domain.
Embodiment 11 is directed to the system of embodiment 7, wherein the effector molecule comprises a base editor comprising a cytidine deaminase, an adenosine deaminase, a uridine glycosylase inhibitor, or a combination of two or more thereof.
Embodiment 12 is directed to the system of embodiment 7, wherein the effector molecule comprises a reporter comprising a fluorescent protein, a fluorescent dye, or quantum dots.
Embodiment 13 is directed to the system of embodiment 12, wherein the fluorescent protein comprises a green fluorescent protein, a blue fluorescent protein, a cyan fluorescent protein, a yellow fluorescent protein, a red fluorescent protein, a variant thereof, or a combination of any two or more thereof.
Embodiment 14 is directed to the system of any one of embodiments 2 to 13, wherein the effector molecule is covalently linked to the Cas7 protein.
Embodiment 15 is directed to the system of embodiment 14, wherein the effector molecule-Cas7 protein comprises an amino acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 16 is directed to the system of embodiment 15, wherein the effector molecule-Cas7 protein comprises the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 17 is directed to the system of embodiment 15, wherein the effector molecule-Cas7 protein is encoded by a nucleic acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
Embodiment 18 is directed to the system of embodiment 17, wherein the effector molecule-Cas7 protein is encoded by the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
Embodiment 19 is directed to the system of any one of embodiments 1 or 3 to 13, wherein the complex is a CRISPR-Csy complex and the plurality of Cas polypeptides comprises Csy1, Csy2, Csy3, and Csy4.
Embodiment 20 is directed to the system of embodiment 19, wherein the effector molecule is covalently linked to the Csy3 protein.
Embodiment 21 is directed to the system of any one of embodiments 1 to 20, wherein one or more of the Cas polypeptides comprises a nuclear localization signal.
Embodiment 22 is directed to the system of embodiment 21, wherein the complex is a CRISPR-Cascade complex and Cas8 and/or Cas6 comprise a nuclear localization signal.
Embodiment 23 is directed to the system of any one of embodiments 1 to 22, further comprising a Cas3 nuclease.
Embodiment 24 is directed to a vector comprising a nucleic acid encoding a Cas protein covalently linked to an effector molecule, wherein the Cas protein is not Cse1.
Embodiment 25 is directed to the vector of embodiment 24, wherein the Cas protein is Cas7.
Embodiment 26 is directed to the vector of embodiment 25, wherein the nucleic acid encoding the Cas protein covalently linked to an effector molecule comprises a nucleic acid with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
Embodiment 27 is directed to the vector of embodiment 26, wherein the nucleic acid encoding the Cas protein covalently linked to the effector molecule comprises or consists of the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
Embodiment 28 is directed to the vector of any one of embodiments 25 to 27, wherein the nucleic acid encodes a protein with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 29 is directed to the vector of embodiment 28, wherein the nucleic acid encodes a protein comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 30 is directed to the vector of any one of embodiments 24 to 29, comprising the nucleic acid sequence of SEQ ID NO: 13.
Embodiment 31 is directed to a nucleic acid encoding a Cas7 protein linked to an effector molecule.
Embodiment 32 is directed to the nucleic acid of embodiment 31, wherein the nucleic acid encodes a protein with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 33 is directed to the nucleic acid of embodiment 32, wherein the nucleic acid encodes a protein comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 34 is directed to the nucleic acid of embodiment 32 or embodiment 33, wherein the nucleic acid comprises a nucleic acid with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
Embodiment 35 is directed to the nucleic acid of embodiment 34, wherein the nucleic acid comprises or consists of the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
Embodiment 36 is directed to a protein comprising Cas7 covalently linked to an effector molecule, wherein the protein comprises at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 37 is directed to the protein of embodiment 36, wherein the protein comprises or consists of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
Embodiment 38 is directed to a cell comprising the system of any one of embodiments 1 to 23, the vector of any one of embodiments 24 to 30, the nucleic acid of any one of embodiments 31 to 35, or the protein of embodiment 36 or embodiment 37.
Embodiment 39 is directed to the cell of embodiment 38, wherein the cell is an animal cell, a plant cell, a fungal cell, an algal cell, or a bacterial cell.
Embodiment 40 is directed to a method, comprising:
contacting genomic DNA in a cell with the system of any one of embodiments 1 to 23.
Embodiment 41 is directed to the method of embodiment 40, wherein contacting the genomic DNA of the cell with the system comprises:
introducing individual protein or nucleic acid components of the system into the cell;
introducing the system into the cell;
expressing one or more nucleic acids encoding components of the system in the cell; or
a combination of two or more thereof.
Embodiment 42 is directed to the method of embodiment 40 or embodiment 41, wherein the method comprises altering expression of a nucleic acid in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a transcription activator effector molecule or a transcription repressor effector molecule into the cell.
Embodiment 43 is directed to the method of embodiment 40 or embodiment 41, wherein the method comprises modifying a target nucleic acid sequence in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a base editing effector molecule into the cell.
Embodiment 44 is directed to the method of embodiment 40 or embodiment 41, wherein the method comprises detecting and/or quantifying a target nucleic acid sequence in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a reporter effector molecule into the cell.
Embodiment 45 is directed to a method for treating or preventing a disease in a subject in need of treatment or prevention, comprising administering to the subject the system of any one of embodiments 1 to 23.
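Several of the embodiments above recite "at least 90% sequence identity" to a listed SEQ ID NO. As an illustration only, percent identity between two sequences that have already been aligned could be computed along the lines of the following Python sketch. The function names, the toy sequences, and the choice of denominator (aligned columns, skipping columns that are gaps in both sequences) are assumptions made for this sketch and are not taken from this disclosure; any definition of sequence identity given in the specification governs.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: identity = identical columns / compared
    columns, where columns that are gaps in BOTH sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1  # identical residue in both sequences
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the two aligned sequences share at least `threshold` % identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences (not any SEQ ID NO from this document).
reference = "MKTAYIAKQR"
candidate = "MKTAYIAKQK"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate))
```

In practice the alignment itself would be produced by an established alignment tool before a comparison of this kind is made; the sketch covers only the counting step.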
EXAMPLES
The following examples are provided to illustrate certain particular features and/or embodiments. These examples should not be construed to limit the disclosure to the particular features or embodiments described.
Example 1
Assembly of Cascade with Effectors
Cascade complexes containing nuclear localization signal (NLS) tags on Cas8 and/or on Cas6 were constructed. The complex containing an NLS tag on Cas8 also contained either VP64 (a transcriptional activator) or emerald green fluorescent protein (emGFP) tethered to the C-terminus of Cas7 (Green line). The NLScascade WT (wild-type) complex also contained an NLS tag on Cas6. Recombinant expression of the tagged Cascade complex was performed using multiple expression vectors. In initial experiments, one vector contained NLS-tagged Cas8, strep-tagged cse2, cas7 with a VP64 or emGFP tag, cas5, and NLS-tagged cas6 (FIG. 3A). Preliminary experiments with this vector and a vector with a CRISPR locus failed to produce Cascade. To test whether the assembly failed due to incomplete expression of genes downstream of the large tags (e.g., VP64 or emGFP) on Cas7, this two-vector system was complemented with a third plasmid encoding strep-Cas7, Cas5, and Cas6 (FIG. 3B). This resulted in stable Cascade complexes containing the tags on the appropriate subunits (FIG. 5).
In view of the many possible embodiments to which the principles of the disclosure may be applied, it should be recognized that the illustrated embodiments are only examples and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims.
We therefore claim as our invention all that comes within the scope and spirit of these claims.

Claims

We claim:
1. A system comprising:
a Type I CRISPR-Cas complex comprising a plurality of Cas proteins, wherein at least one of the Cas proteins is covalently linked to an effector molecule; and
a guide RNA having a sequence selected to recognize a target nucleotide sequence.
2. The system of claim 1, wherein the complex is a CRISPR-Cascade complex, and the plurality of Cas polypeptides comprises Cas8 (Cse1), Cse2, Cas7, Cas5, and Cas6, and the effector molecule is not linked to the Cas8 protein.
3. The system of claim 2, wherein the effector molecule is covalently linked to the N-terminus or C-terminus of the at least one Cas protein.
4. The system of any one of claims 1 to 3, wherein the effector molecule is directly linked to the at least one Cas protein.
5. The system of any one of claims 1 to 3, wherein the effector molecule is linked to the at least one Cas protein via a linker.
6. The system of claim 5, wherein the effector molecule is linked to the at least one Cas protein by an amino acid linker or wherein the effector molecule is linked to the at least one Cas protein by a streptavidin-biotin linker.
7. The system of any one of claims 1 to 3, wherein the effector molecule comprises a transcriptional activator, a transcriptional repressor, a base editor, a reporter, or a combination of two or more thereof.
8. The system of claim 7, wherein the effector molecule comprises a transcriptional activator comprising one or more VP16 domains, a p65 activation domain, an RNA polymerase omega subunit, a viral RTA activation domain, or a combination of two or more thereof.
9. The system of claim 8, wherein the effector molecule comprises four VP16 domains (VP64) or ten VP16 domains (VP160).
10. The system of claim 7, wherein the effector molecule comprises a transcriptional repressor comprising a Kruppel associated box (KRAB) domain.
11. The system of claim 7, wherein the effector molecule comprises a base editor comprising a cytidine deaminase, an adenosine deaminase, a uridine glycosylase inhibitor, or a combination of two or more thereof.
12. The system of claim 7, wherein the effector molecule comprises a reporter comprising a fluorescent protein, a fluorescent dye, or quantum dots.
13. The system of claim 12, wherein the fluorescent protein comprises a green fluorescent protein, a blue fluorescent protein, a cyan fluorescent protein, a yellow fluorescent protein, a red fluorescent protein, a variant thereof, or a combination of any two or more thereof.
14. The system of claim 2, wherein the effector molecule is covalently linked to the Cas7 protein.
15. The system of claim 14, wherein the effector molecule-Cas7 protein comprises an amino acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
16. The system of claim 15, wherein the effector molecule-Cas7 protein comprises the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
17. The system of claim 15, wherein the effector molecule-Cas7 protein is encoded by a nucleic acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
18. The system of claim 17, wherein the effector molecule-Cas7 protein is encoded by the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
19. The system of claim 1 or claim 3, wherein the complex is a CRISPR-Csy complex and the plurality of Cas polypeptides comprises Csy1, Csy2, Csy3, and Csy4.
20. The system of claim 19, wherein the effector molecule is covalently linked to the Csy3 protein.
21. The system of any one of claims 1 to 3, wherein one or more of the Cas polypeptides comprises a nuclear localization signal.
22. The system of claim 21, wherein the complex is a CRISPR-Cascade complex and Cas8 and/or Cas6 comprise a nuclear localization signal.
23. The system of any one of claims 1 to 3, further comprising a Cas3 nuclease.
24. A vector comprising a nucleic acid encoding a Cas protein covalently linked to an effector molecule, wherein the Cas protein is not Cse1.
25. The vector of claim 24, wherein the Cas protein is Cas7.
26. The vector of claim 25, wherein the nucleic acid encoding the Cas protein covalently linked to an effector molecule comprises a nucleic acid with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
27. The vector of claim 26, wherein the nucleic acid encoding the Cas protein covalently linked to the effector molecule comprises or consists of the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
28. The vector of claim 25, wherein the nucleic acid encodes a protein with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
29. The vector of claim 28, wherein the nucleic acid encodes a protein comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
30. The vector of any one of claims 24 to 29, comprising the nucleic acid sequence of SEQ ID NO: 13.
31. A nucleic acid encoding a Cas7 protein linked to an effector molecule.
32. The nucleic acid of claim 31, wherein the nucleic acid encodes a protein with at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
33. The nucleic acid of claim 32, wherein the nucleic acid encodes a protein comprising or consisting of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
34. The nucleic acid of claim 32, wherein the nucleic acid comprises a nucleic acid with at least 90% sequence identity to any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
35. The nucleic acid of claim 34, wherein the nucleic acid comprises or consists of the nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, or 11.
36. A protein comprising Cas7 covalently linked to an effector molecule, wherein the protein comprises at least 90% sequence identity to any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
37. The protein of claim 36, wherein the protein comprises or consists of the amino acid sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, or 12.
38. A cell comprising the system of any one of claims 1 to 3 or the vector of any one of claims 24 to 29, or the nucleic acid of any one of claims 31 to 35 or the protein of claim 36 or claim 37.
39. The cell of claim 38, wherein the cell is an animal cell, a plant cell, a fungal cell, an algal cell, or a bacterial cell.
40. A method, comprising:
contacting genomic DNA in a cell with the system of any one of claims 1 to 3.
41. The method of claim 40, wherein contacting the genomic DNA of the cell with the system comprises:
introducing individual protein or nucleic acid components of the system into the cell;
introducing the system into the cell;
expressing one or more nucleic acids encoding components of the system in the cell; or
a combination of two or more thereof.
42. The method of claim 41, wherein the method comprises altering expression of a nucleic acid in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a transcription activator effector molecule or a transcription repressor effector molecule into the cell.
43. The method of claim 41, wherein the method comprises modifying a target nucleic acid sequence in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a base editing effector molecule into the cell.
44. The method of claim 41, wherein the method comprises detecting and/or quantifying a target nucleic acid sequence in the cell and the method comprises introducing a Type I CRISPR-Cas complex comprising a reporter effector molecule into the cell.
45. A method for treating or preventing a disease in a subject in need of treatment or prevention, comprising administering to the subject the system of any one of claims 1 to 3.
PCT/US2019/059098 2018-11-01 2019-10-31 Gene modulation with crispr system type i WO2020092725A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862754268P 2018-11-01 2018-11-01
US62/754,268 2018-11-01

Publications (1)

Publication Number Publication Date
WO2020092725A1 (en) 2020-05-07

Family

ID=70464243

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/059098 WO2020092725A1 (en) 2018-11-01 2019-10-31 Gene modulation with crispr system type i

Country Status (1)

Country Link
WO (1) WO2020092725A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014197748A2 (en) * 2013-06-05 2014-12-11 Duke University Rna-guided gene editing and gene regulation
WO2017066497A2 (en) * 2015-10-13 2017-04-20 Duke University Genome engineering with type i crispr systems in eukaryotic cells
US20170121693A1 (en) * 2015-10-23 2017-05-04 President And Fellows Of Harvard College Nucleobase editors and uses thereof
US20170204407A1 (en) * 2014-07-14 2017-07-20 The Regents Of The University Of California Crispr/cas transcriptional modulation
US20180119121A1 (en) * 2011-12-30 2018-05-03 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BIKARD ET AL.: "Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system", NUCLEIC ACIDS RESEARCH, vol. 41, no. 15, 12 June 2013 (2013-06-12), pages 7429 - 7437, XP055195374, DOI: 10.1093/nar/gkt520 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115595330A (en) * 2021-07-12 2023-01-13 中国科学院微生物研究所(Cn) CRISPR-Cas3 system and application thereof in aspect of plant virus resistance
WO2023004391A3 (en) * 2021-07-21 2023-03-02 Montana State University Nucleic acid detection using type iii crispr complex
US11814689B2 (en) 2021-07-21 2023-11-14 Montana State University Nucleic acid detection using type III CRISPR complex

Similar Documents

Publication Publication Date Title
JP6525971B2 (en) Protein-enriched microvesicles, methods of making and using protein-enriched microvesicles
JP7275043B2 (en) Enhanced hAT Family Transposon-Mediated Gene Transfer and Related Compositions, Systems and Methods
JP2023168355A (en) Methods for improved homologous recombination and compositions thereof
WO2017136520A1 (en) Mitochondrial genome editing and regulation
US11767528B2 (en) Targeted trans-splicing using CRISPR/Cas13
US20060252140A1 (en) Development of a transposon system for site-specific DNA integration in mammalian cells
AU2019244594B2 (en) Modified nucleic acid editing systems for tethering donor DNA
KR20190089175A (en) Compositions and methods for target nucleic acid modification
CA3068072A1 (en) Methods and compositions for assessing crispr/cas-mediated disruption or excision and crispr/cas-induced recombination with an exogenous donor nucleic acid in vivo
JP2024041866A (en) Enhanced hAT family transposon-mediated gene transfer and related compositions, systems, and methods
EP3943600A1 (en) Novel, non-naturally occurring crispr-cas nucleases for genome editing
CN110891419A (en) Evaluation of CRISPR/CAS-induced in vivo recombination with exogenous donor nucleic acids
WO2020092725A1 (en) Gene modulation with crispr system type i
CN113993994A (en) Polynucleotides, compositions and methods for polypeptide expression
WO2023235725A2 (en) Crispr-based therapeutics for c9orf72 repeat expansion disease
CN117355607A (en) Non-viral homology mediated end ligation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19878022; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19878022; Country of ref document: EP; Kind code of ref document: A1)