WO2010054108A2 - Cas6 polypeptides and methods of use - Google Patents

Cas6 polypeptides and methods of use

Info

Publication number
WO2010054108A2
Authority
WO
WIPO (PCT)
Prior art keywords
polynucleotide
polypeptide
cas6
amino acid
acid sequence
Prior art date
Application number
PCT/US2009/063432
Other languages
French (fr)
Other versions
WO2010054108A9 (en
WO2010054108A8 (en
WO2010054108A3 (en
Inventor
Rebecca M. Terns
Michael P. Terns
Jason Carte
Original Assignee
University Of Georgia Research Foundation, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University Of Georgia Research Foundation, Inc.
Priority to US13/127,764 (US9404098B2)
Publication of WO2010054108A2
Publication of WO2010054108A3
Publication of WO2010054108A9
Publication of WO2010054108A8

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay

Definitions

  • the RNAi defense response is mediated by short (~22-nucleotide [nt]) RNAs termed siRNAs.
  • the siRNAs are generated from invading viral RNAs by dsRNA-specific, RNase III-like endonucleases called Dicers (Jaskiewicz and Filipowicz, 2008. Curr. Top. Microbiol. Immunol. 320: 77-97).
  • siRNAs are assembled with host effector proteins and target them to corresponding viral target RNAs to effect viral gene silencing via RNA destruction or other mechanisms (Farazi et al., 2008. Development 135: 1201-1214; Girard and Hannon, 2008. Trends Cell Biol. 18: 136-148).
  • the pathway is proposed to arise from two evolutionarily and often physically linked gene loci: the CRISPR (clustered regularly interspaced short palindromic repeats) locus, which encodes RNA components of the system, and the cas (CRISPR-associated) locus, which encodes proteins (Jansen et al., 2002. Mol. Microbiol. 43: 1565-1575; Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Makarova et al., 2006. Biol. Direct 1: 7; Haft et al., 2005. PLoS Comput. Biol. 1: e60).
  • the individual Cas proteins do not share significant sequence similarity with protein components of the eukaryotic RNAi machinery, but have analogous predicted functions (e.g., RNA binding, nuclease, helicase, etc.) (Makarova et al., 2006. Biol. Direct 1: 7).
  • the effector RNAs of pRNAi are encoded in the host genome.
  • CRISPR loci encode short (typically ~30- to 35-nt) invader-derived sequences interspersed between short (typically ~30- to 35-nt) direct repeat sequences (Bolotin et al., 2005.
  • RNAs that contain the invader targeting sequences, and are termed guide RNAs or prokaryotic silencing RNAs (psiRNAs) based on their hypothesized role in the pathway (Makarova et al., 2006. Biol. Direct 1: 7; Hale et al., 2008. RNA, 14: 2572-2579).
  • RNA analysis indicates that CRISPR locus transcripts are cleaved within the repeat sequences to release ~60- to 70-nt RNA intermediates that contain individual invader targeting sequences and flanking repeat fragments (Fig. 1A; Tang et al., 2002. Proc. Natl. Acad. Sci. 99: 7536-7541; Tang et al., 2005. Mol. Microbiol. 55: 469-481; Lillestol et al., 2006. Archaea 2: 59-72; Brouns et al., 2008. Science 321: 960-964; Hale et al., 2008. RNA, 14: 2572-2579). In the archaeon Pyrococcus furiosus, these intermediate RNAs are further processed to abundant, stable ~35- to 45-nt mature psiRNAs (Hale et al., 2008. RNA, 14: 2572-2579).
  • the polynucleotides may include a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the amino acid sequence of the polypeptide and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, or the complement thereof.
  • the polynucleotides may include a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the nucleotide sequence of the isolated polynucleotide and the nucleotide sequence of SEQ ID NO:1 have at least 80% identity, or the complement thereof.
  • the polynucleotides may be enriched, isolated, or purified.
  • the polynucleotides may include a heterologous polynucleotide, such as a regulatory sequence, or a vector.
  • a polynucleotide, referred to herein as a target RNA polynucleotide, may include a Cas6 recognition domain, wherein the Cas6 recognition domain includes 5'-GTTACAATAAGA (SEQ ID NO:237), or the complement thereof.
  • the polynucleotide may include UNCNNUNNNNM4NNNNNNNNNNNNNNNNNN (SEQ ID NO: 192), UUACAAUANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN (SEQ ID NO: 193), GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO: 191), or the complements thereof.
  • the polynucleotide may include an operably linked regulatory sequence or a vector, and the polynucleotide may be RNA.
  • A polypeptide has Cas6 endoribonuclease activity, and the polypeptide includes an amino acid sequence, wherein the amino acid sequence and the amino acid sequence of SEQ ID NO:2 have at least 80% identity.
  • the polypeptides may further include a heterologous polypeptide.
  • a polypeptide may be enriched, isolated, or purified.
  • A genetically modified microbe may include a polynucleotide described herein or a polypeptide described herein.
  • the microbe may be, for instance, a bacterium, such as a gram positive or a gram negative microbe, for example, E. coli, or an archaeon, such as Haloferax volcanii.
  • compositions that include the polynucleotides, the polypeptides, and/or the genetically modified microbes described herein.
  • a composition may include a polypeptide having Cas6 activity, a target RNA polynucleotide, or the combination.
  • the methods may be used to cleave a nucleotide sequence.
  • the method may include incubating a target RNA polynucleotide with a polypeptide under conditions suitable for cleavage of the target RNA polynucleotide, wherein the target RNA polynucleotide includes a Cas6 recognition domain.
  • the polypeptide may be a Cas6 polypeptide from a microbe genome, for instance, the polypeptide includes an amino acid sequence having at least 80% identity with the amino acid sequence of SEQ ID NO:2, an amino acid sequence depicted in Figure 15, an amino acid sequence depicted in Figure 16, or an amino acid sequence depicted in Figure 17, and has Cas6 endoribonuclease activity.
  • the polypeptide cleaves the target RNA polynucleotide at a cleavage site.
  • the cleavage site may be located 5 to 20 nucleotides downstream of the Cas6 recognition domain.
  • the target RNA polynucleotide may include a Cas6 recognition domain.
  • the Cas6 recognition domain may be one that is present in a microbe genome, such as 5'-GTTACAATAAGA (SEQ ID NO:237).
  • the target RNA polynucleotide may include UNCNNUNNNNNNNNNNNNNNNNNNNN ⁇ (SEQ ID NO: 192), or UUACAAUANNNNNNNNNNNNNNNN>M ⁇ M ⁇ TON (SEQ ID NO:193), or GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO:191).
  • the methods may be in vivo or in vitro.
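As an illustration only, the relationship between a Cas6 recognition domain and the cleavage window described above can be sketched in Python. The function and variable names are hypothetical and not part of the disclosure. The sketch assumes that the recognition domain written as 5'-GTTACAATAAGA is matched in its RNA form (T read as U) and that "5 to 20 nucleotides downstream" is counted from the last nucleotide of the domain. It only locates candidate positions and makes no claim about whether cleavage occurs.

```python
# Hypothetical helper names; the motif is SEQ ID NO:237 as written in the text.
RECOGNITION_DOMAIN = "GTTACAATAAGA"


def to_rna(seq):
    """Upper-case a sequence and read T as U so DNA-style input matches RNA."""
    return seq.upper().replace("T", "U")


def candidate_cleavage_windows(target, domain=RECOGNITION_DOMAIN,
                               min_offset=5, max_offset=20):
    """Return (domain_start, first, last) tuples of 0-based indices.

    first..last are the nucleotides lying min_offset to max_offset positions
    downstream of the last nucleotide of each recognition domain (assumption:
    offsets are counted from the 3' end of the domain).
    """
    rna = to_rna(target)
    motif = to_rna(domain)
    windows = []
    start = rna.find(motif)
    while start != -1:
        domain_end = start + len(motif)  # index of the first downstream nucleotide
        first = domain_end + min_offset - 1
        last = min(domain_end + max_offset - 1, len(rna) - 1)
        if first <= last:  # skip domains too close to the 3' end
            windows.append((start, first, last))
        start = rna.find(motif, start + 1)
    return windows
```

A target that lacks the motif, or whose motif lies fewer than 5 nucleotides from the 3' end, yields an empty list.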
  • an "enriched" polynucleotide means that a polynucleotide constitutes a significantly higher fraction of the total DNA or RNA present in a mixture of interest than in cells from which the sequence was taken.
  • a person skilled in the art could enrich a polynucleotide by preferentially reducing the amount of other polynucleotides present, or preferentially increasing the amount of the specific polynucleotide, or both.
  • polynucleotide enrichment does not imply that there is no other DNA or RNA present, the term only indicates that the relative amount of the sequence of interest has been significantly increased.
  • an "enriched" polypeptide defines a specific amino acid sequence constituting a significantly higher fraction of the total of amino acids present in a mixture of interest than in cells from which the polypeptide was separated.
  • a person skilled in the art can preferentially reduce the amount of other amino acid sequences present, or preferentially increase the amount of specific amino acid sequences of interest, or both.
  • the term “enriched” does not imply that there are no other amino acid sequences present. Enriched simply means the relative amount of the sequence of interest has been significantly increased. The term “significant” indicates that the level of increase is useful to the person making such an increase.
  • the term also means an increase relative to other amino acids of at least 2 fold, or more preferably at least 5 to 10 fold, or even more.
  • the term also does not imply that there are no amino acid sequences from other sources.
  • Other amino acid sequences may, for example, include amino acid sequences from a host organism.
  • an "isolated" substance is one that has been removed from its natural environment, produced using recombinant techniques, or chemically or enzymatically synthesized.
  • a polypeptide or a polynucleotide can be isolated.
  • a substance may be purified, i.e., is at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other components with which it is naturally associated.
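The numerical thresholds in the definitions of "enriched" and "purified" can be restated as a short Python sketch. This is a minimal illustration with hypothetical function names. It assumes that enrichment is expressed as the ratio of the sequence's fraction in the mixture to its fraction in the source cells, and it treats 5 fold as the lower bound of the "5 to 10 fold" range.

```python
def fold_enrichment(fraction_in_mixture, fraction_in_source):
    """Ratio of a sequence's share of the mixture to its share in the source cells."""
    if not 0 < fraction_in_source <= 1:
        raise ValueError("fraction_in_source must be in (0, 1]")
    if not 0 <= fraction_in_mixture <= 1:
        raise ValueError("fraction_in_mixture must be in [0, 1]")
    return fraction_in_mixture / fraction_in_source


def enrichment_tier(fold):
    """Map a fold increase onto the tiers named in the text (5 fold is an assumed cutoff)."""
    if fold >= 5:
        return "at least 5 fold"
    if fold >= 2:
        return "at least 2 fold"
    return "below 2 fold"


def purity_tier(percent_free):
    """Map percent freedom from naturally associated components onto the stated tiers."""
    if not 0 <= percent_free <= 100:
        raise ValueError("percent_free must be between 0 and 100")
    if percent_free >= 90:
        return "at least 90% free"
    if percent_free >= 75:
        return "at least 75% free"
    if percent_free >= 60:
        return "at least 60% free"
    return "below 60% free"
```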
  • polypeptide refers broadly to a polymer of two or more amino acids joined together by peptide bonds.
  • polypeptide also includes molecules which contain more than one polypeptide joined by a disulfide bond, or complexes of polypeptides that are joined together, covalently or noncovalently, as multimers (e.g., dimers, tetramers).
  • peptide, oligopeptide, enzyme, and protein are all included within the definition of polypeptide and these terms are used interchangeably.
  • heterologous amino acids or “heterologous polypeptides” refer to amino acids that are not normally associated with a polypeptide in a wild-type cell.
  • heterologous polypeptides include, but are not limited to a tag useful for purification or a carrier polypeptide useful to increase immunogenicity of a polypeptide.
  • a polypeptide that includes heterologous polypeptides may be referred to as a fusion polypeptide.
  • polynucleotide refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides, and includes both double- and single-stranded RNA and DNA.
  • a polynucleotide can be obtained directly from a natural source, or can be prepared with the aid of recombinant, enzymatic, or chemical techniques.
  • a polynucleotide can be linear or circular in topology.
  • a polynucleotide may be, for example, a portion of a vector, such as an expression or cloning vector, or a fragment.
  • a polynucleotide may include nucleotide sequences having different functions, including, for instance, coding regions, and non-coding regions such as regulatory regions.
  • coding region and “coding sequence” are used interchangeably and refer to a nucleotide sequence that encodes a polypeptide and, when placed under the control of appropriate regulatory sequences expresses the encoded polypeptide.
  • the boundaries of a coding region are generally determined by a translation start codon at its 5' end and a translation stop codon at its 3' end.
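The boundary rule for a coding region (a start codon at the 5' end and an in-frame stop codon at the 3' end) can be sketched as follows. The function name is hypothetical, and the sketch assumes an ATG start codon and the three stop codons of the standard genetic code. Alternative start codons used by some microbes are not handled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def find_coding_region(dna, start_codon="ATG"):
    """Return (start, end) for the first start codon that has an in-frame stop.

    Indices are 0-based and end is exclusive, so dna[start:end] runs from the
    start codon through the stop codon. Returns None when no such region exists.
    """
    seq = dna.upper().replace("U", "T")  # accept RNA-style input as well
    start = seq.find(start_codon)
    while start != -1:
        # Walk codon by codon in the reading frame set by the start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = seq.find(start_codon, start + 1)
    return None
```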
  • a "regulatory sequence” is a nucleotide sequence that regulates expression of a coding sequence to which it is operably linked.
  • Non-limiting examples of regulatory sequences include promoters, enhancers, transcription initiation sites, translation start sites, translation stop sites, and transcription terminators.
  • operably linked refers to a juxtaposition of components such that they are in a relationship permitting them to function in their intended manner.
  • a regulatory sequence is "operably linked" to a coding region when it is joined in such a way that expression of the coding region is achieved under conditions compatible with the regulatory sequence.
  • a polynucleotide that includes a coding region may include heterologous nucleotides that flank one or both sides of the coding region.
  • heterologous nucleotides refer to nucleotides that are not normally present flanking a coding region that is present in a wild-type cell. For instance, a coding region present in a wild-type microbe and encoding a Cas6 polypeptide is flanked by homologous sequences, and any other nucleotide sequence flanking the coding region is considered to be heterologous. Examples of heterologous nucleotides include, but are not limited to regulatory sequences.
  • heterologous nucleotides are present in a polynucleotide disclosed herein through the use of standard genetic and/or recombinant methodologies well known to one skilled in the art.
  • a polynucleotide disclosed herein may be included in a suitable vector.
  • an "exogenous polynucleotide” refers to a polynucleotide that is not normally or naturally found in a microbe.
  • the term “endogenous polynucleotide” refers to a polynucleotide that is normally or naturally found in a microbe.
  • An “endogenous polynucleotide” is also referred to as a "native polynucleotide.”
  • identity refers to sequence similarity between two polypeptides or two polynucleotides.
  • sequence similarity between two polypeptides is determined by aligning the residues of the two polypeptides (e.g., a candidate amino acid sequence and a reference amino acid sequence, such as SEQ ID NO: 2) to optimize the number of identical amino acids along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of shared amino acids, although the amino acids in each sequence must nonetheless remain in their proper order.
  • sequence similarity is typically at least 80% identity, at least 81% identity, at least 82% identity, at least 83% identity, at least 84% identity, at least 85% identity, at least 86% identity, at least 87% identity, at least 88% identity, at least 89% identity, at least 90% identity, at least 91% identity, at least 92% identity, at least 93% identity, at least 94% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity, or at least 99% identity. Sequence similarity may be determined, for example, using sequence analysis techniques such as the BESTFIT or GAP algorithm in the GCG package (Madison WI), or the Blastp program of the BLAST 2 search algorithm, as described by Tatusova, et al.
  • sequence similarity between two amino acid sequences is determined using the Blastp program of the BLAST 2 search algorithm.
  • sequence similarity is referred to as "identities.”
  • sequence similarity between two polynucleotides is determined by aligning the residues of the two polynucleotides (e.g., a candidate nucleotide sequence and a reference nucleotide sequence, such as SEQ ID NO:1) to optimize the number of identical nucleotides along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of shared nucleotides, although the nucleotides in each sequence must nonetheless remain in their proper order.
  • sequence similarity is typically at least 80% identity, at least 81% identity, at least 82% identity, at least 83% identity, at least 84% identity, at least 85% identity, at least 86% identity, at least 87% identity, at least 88% identity, at least 89% identity, at least 90% identity, at least 91% identity, at least 92% identity, at least 93% identity, at least 94% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity, or at least 99% identity. Sequence similarity may be determined, for example, using sequence techniques such as GCG FastA (Genetics Computer Group, Madison, Wisconsin), Mac Vector 4.5 (Kodak/IBI software package) or other suitable sequence analysis programs or methods known in the art.
  • sequence similarity between two nucleotide sequences is determined using the Blastn program of the BLAST 2 search algorithm, as described by Tatusova, et al. (1999, FEMS Microbiol Lett., 174:247-250), and available through the World Wide Web, for instance at the internet site maintained by the National Center for Biotechnology Information, National Institutes of Health.
  • sequence similarity is referred to as "identities.”
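For illustration, the percent identity between a candidate and a reference sequence can be computed from an alignment that has already been made. The sketch below is not the Blastp or Blastn program named above; it is a simplified stand-in with hypothetical function names. It assumes the two sequences are supplied pre-aligned with "-" marking gaps, and that identity is counted over aligned columns, which is one common convention.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must be the same length, with '-' marking gaps. Columns that
    are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no residues")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the aligned pair reaches the threshold (80% is the lowest level listed)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The threshold argument can be raised to any of the higher levels listed above, for example 90 or 95.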
  • prokaryotic microbe and “microbe” are used interchangeably and refer to members of the domains Bacteria and Archaea.
  • genetically modified microbe refers to a microbe which has been altered “by the hand of man.”
  • a genetically modified microbe includes a microbe into which has been introduced an exogenous polynucleotide.
  • Genetically modified microbe also refers to a microbe that has been genetically manipulated such that endogenous nucleotides have been altered to include a mutation, such as a deletion, an insertion, a transition, a transversion, or a combination thereof. For instance, an endogenous coding region could be deleted. Such mutations may result in a polypeptide having a different amino acid sequence than was encoded by the endogenous polynucleotide.
  • Another example of a genetically modified microbe is one having an altered regulatory sequence, such as a promoter, to result in increased or decreased expression of an operably linked endogenous coding region.
  • Conditions that are "suitable” for an event to occur such as cleavage of a polynucleotide, or “suitable” conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event.
  • in vitro refers to an artificial environment and to processes or reactions that occur within an artificial environment. In vitro environments can consist of, but are not limited to, test tubes.
  • in vivo refers to the natural environment (e.g., a cell, including a genetically modified microbe) and to processes or reactions that occur within a natural environment.
  • the steps may be conducted in any feasible order. And, as appropriate, any combination of two or more steps may be conducted simultaneously.
  • FIG. 1 Cas6 is an endoribonuclease that cleaves CRISPR RNAs within repeat sequences.
  • A psiRNA biogenesis pathway model. The primary CRISPR transcript contains unique invader targeting or guide sequences (shaded blocks) flanked by direct repeat sequences (R). Cas6 catalyzes site-specific cleavage within each repeat, releasing individual invader targeting units. The Cas6 cleavage products undergo further processing to generate smaller mature psiRNA species.
  • B Purified recombinant PfCas6 expressed in E. coli. The sizes (in kilodaltons) of protein markers (M) are indicated.
  • RNAs (repeat-guide-repeat [R-g-R] or repeat alone [R], as diagrammed) were either uniformly or 5'-end-labeled and incubated in the absence (-) or presence (+) of PfCas6 protein (500 nM). Products were resolved by denaturing gel electrophoresis and visualized using a phosphorimager. The main cleavage products are indicated by a star or asterisk on the gel and in the diagram.
  • FIG. 2 PfCas6 cleavage of a CRISPR RNA containing two repeat-guide RNA units.
  • a uniformly radiolabeled substrate RNA containing two guide (invader targeting) sequences ( ⁇ ), two repeats (R) and a short (natural) 5' leader (L) sequence was incubated with 1 μM PfCas6 protein and samples were analyzed by denaturing gel electrophoresis at the indicated times.
  • the expected sizes and compositions of the RNA products are indicated, as are the sizes of the marker RNAs (M).
  • FIG. 3 Identification of the site of PfCas6 cleavage within the CRISPR repeat RNA.
  • A The site of PfCas6 cleavage within the CRISPR repeat RNA was mapped by incubating 5' end-labeled repeat RNA with PfCas6 nuclease and comparing the size of the 5' RNA cleavage product (arrow) with RNase T1 (T1) and alkaline hydrolysis (OH) sequence ladders.
  • B Potential secondary structure of P. furiosus repeat RNA with cleavage site indicated.
  • C Analysis of cleavage of wild-type and cleavage site mutant (AA to GG) repeat RNAs with increasing concentrations (0, 1, 50, 200, and 500 nM) of PfCas6.
  • D Native gel mobility shift analysis of wild-type and mutant repeat RNAs with increasing concentrations of PfCas6. The positions of the free (RNA) and protein-bound (RNP) RNAs are indicated. 5' and 3' cleavage products are indicated in both C and D. The sizes of RNA markers (M) are indicated in A and C.
  • FIG. 4 CRISPR repeat sequence requirements for PfCas6 binding.
  • A Detailed analysis of binding with a series of CRISPR-derived RNAs and mutants.
  • the left panel illustrates the RNAs tested, with repeat (R) and invader targeting ( ⁇ ) sequences, and PfCas6 cleavage site (dashed lines) indicated.
  • the shaded portion of j denotes an insertion
  • dashed block denotes an internal deletion
  • the shaded portions of **, e, f, and k denote substitutions (with complementary sequence).
  • DNA indicates a DNA repeat sequence substrate.
  • PfCas6 binding is summarized relative to binding to the 5' cleavage product (++++).
  • Corresponding RNA diagrams and data panels are designated with lowercase letters.
  • the right panels show gel mobility shift analysis of the indicated RNAs with increasing concentrations (0, 1, 50, 200, and 500 nM) of PfCas6. Substrates are uniformly radiolabeled except for those shown in panels a, b, c, and l, which are 5'-end-labeled. Data for the intact repeat (*) and cleavage site mutant (**) are shown in Figure 3D.
  • (B) PfCas6 interacts with the gel-purified 5' cleavage product.
  • the left panel shows the products of incubation of uniformly radiolabeled repeat RNA with (+) or without (-) PfCas6 (1 μM). The positions of the 5' and 3' cleavage products are indicated.
  • the right panel shows native gel mobility shift analysis of the gel-purified 5' and 3' PfCas6 cleavage products (from the left panel) with increasing concentrations (0, 1, 50, 200, and 500 nM) of PfCas6.
  • FIG. 5 Influence of temperature on the ability of PfCas6 to bind and cleave CRISPR repeat RNA.
  • Repeat RNA (uniformly radiolabeled) was incubated with (+) or without (-) 1 μM PfCas6 protein at the indicated temperatures and the products were resolved by electrophoresis on denaturing (A) or native (B) polyacrylamide gels to assess RNA cleavage or binding, respectively.
  • the positions of the 5' and 3' cleavage products are indicated.
  • the positions of the free (RNA) and protein-bound (RNP) RNAs are indicated in panel B. Based on the data shown in panel A, the RNPs in panel B include primarily the 5' cleavage product at higher temperatures and the intact repeat at lower temperatures.
  • FIG. 6 CRISPR repeat sequence requirements for PfCas6 cleavage.
  • the left panel illustrates the RNAs tested as in Figure 4.
  • PfCas6 cleavage is summarized relative to cleavage of the intact repeat RNA (++++).
  • PfCas6 binding is summarized from Figure 4.
  • Corresponding RNA diagrams and data panels are designated with lowercase letters.
  • the right panels show cleavage assays using uniformly radiolabeled repeat RNA with (+) or without (-) PfCas6 (500 nM). Data for the intact repeat (*) are shown at right and data for the cleavage site mutant (**) are shown in Figure 3C.
  • FIG. 7 Structural features of PfCas6.
  • the fold topology is illustrated with arrows (β-strands) and circles (α-helices).
  • the G-rich loop characteristic of RAMP proteins is designated " ⁇ l l” in A and "G-rich loop” in B and the predicted catalytic triad residues are labeled Tyr31 , His46, and Lys52 in B.
  • the electrostatic potential was computed using the GRASP2 program (Petrey and Honig, 2003. Methods Enzymol. 374: 492-509) and is shaded dark and light, for negative and positive potentials, respectively.
  • FIG. 8 Amino acid sequence alignment of Cas6 proteins.
  • the strictly conserved residues are the putative catalytic triad residues and the four glycine residues in the G-rich loop. β1, β2, etc., α1, α2, etc., and TT refer to predicted secondary structure elements: β-strand, α-helix, and β-turn, respectively.
  • Organisms and genes listed include: Pyrococcus furiosus DSM 3638 (gi_18977503), Pyrococcus abyssi GE5 (gi_14521345), Pyrococcus horikoshii OT3 (gi_14591070), Thermococcus kodakaraensis KOD1 (gi_57640399), Methanocaldococcus jannaschii DSM 2661 (gi_15668551), Pelodictyon phaeoclathratiforme BU-1 (gi_68548726), Archaeoglobus fulgidus DSM 4304 (gi_11497692), Chlorobium phaeobacteroides DSM 266 (gi_119357836), Candidatus Desulforudis audaxviator MP104C (gi_169831963), Prosthecochloris aestuarii DSM 271 (gi_6855202
  • Fusaro (gi_73667850), Methanosarcina acetivorans C2A (gi_20092472), Geobacillus thermodenitrificans NG80-2 (gi_138893955), Thermotoga maritima MSB8 (gi_15644558), Thermotoga sp. RQ2 (gi_170288802), sp.
  • WCH70 (gi_171325396), Desulfitobacterium hafniense DCB-2 (gi_109645858), Chlorobium limicola DSM 245 (gi_67917921), Desulfitobacterium hafniense Y51 (gi_124521532), Methanobrevibacter smithii ATCC 35061 (gi_148642230), Carboxydothermus hydrogenoformans Z-2901 (gi_78043250), Methanococcus voltae A3 (gi_163800065), Pelotomaculum thermopropionicum SI (gi_147678256), Methanosphaera stadtmanae DSM 3091 (gi_84489743), Clostridium thermocellum ATCC 27405 (gi_125974788), Candidatus Kuenenia stuttgartiensis (gi_91200631), Caldicellulosiruptor saccharolyticus DSM
  • FIG. 9 Catalytic features of PfCas6 cleavage activity.
  • Cleavage activity is not dependent on divalent metal ions. Uniformly radiolabeled repeat RNA was incubated with 1 μM PfCas6 in the absence (-) or presence (+) of 1.5 mM MgCl2 or 20 mM metal chelator EDTA as indicated.
  • B Analysis of the termini of PfCas6 cleavage products.
  • Products of RNA cleavage reactions performed with unlabeled repeat RNA substrates (initially containing hydroxyl groups at both the 5' and 3' termini) were radiolabeled at either their 5' ends (using 32P-ATP and polynucleotide kinase) or 3' ends (using 32pCp and RNA ligase).
  • the positions of the 5' and 3' cleavage products are indicated in A and B.
  • C The pattern of radiolabeling of the RNA cleavage products (B) indicates that PfCas6 cleaves on the 5' side of the phosphodiester bond, as is the case for other metal-independent ribonucleases. Cleavage likely generates 5' hydroxyl (OH) and 2', 3' cyclic phosphate (>P) RNA termini.
  • FIG. 10 Lead-induced and RNase A cleavage footprinting with CRISPR repeat RNA and PfCas6.
  • A 3' end-labeled CRISPR repeat RNA was incubated in the absence (RNA) or presence of increasing concentrations of PfCas6 (indicated in μM) and then subjected to RNase A cleavage (left panel) or lead-induced cleavage (right panel).
  • RNAs were separated on 15% denaturing (7 M urea) polyacrylamide gels. Size markers include 5' end-labeled RNA markers (M) and alkaline hydrolysis ladders (OH). Bars along the right side of each gel indicate strong protections.
  • FIG. 11 Cleavage activity of Cas6 mutants.
  • A Uniformly 32P-labeled CRISPR repeat RNA was incubated in the absence (-) or presence of increasing concentrations of wild type or mutant Cas6 (0.001, 0.05, and 0.5 μM) followed by separation on a 15% denaturing (7 M urea) polyacrylamide gel. The 5' and 3' cleavage products are indicated.
  • B Purified wild type (wt) and mutant Cas6 proteins (as indicated above) were separated by SDS-PAGE. Molecular weight markers are indicated in kDa.
  • FIG. 12 Substrate recognition by Cas6 mutants. Uniformly 32P-labeled CRISPR repeat RNA was incubated in the absence (-) or presence of increasing concentrations of wild type or mutant Cas6 (0.001, 0.05, 0.2, and 0.5 μM), and the ability of each protein to form a stable complex with the substrate RNA was assessed by native gel mobility shift analysis. The positions of the free (RNA) and bound (RNP) substrate RNA are indicated.
  • Figure 13 Native Cas6 cleaves CRISPR repeat RNA and associates with crRNAs.
  • Uniformly 32P-labeled CRISPR repeat RNA was incubated in the absence (RNA) or presence of recombinant Cas6 (rCas6), whole cell extract (WCE), or samples from immunoprecipitation reactions using anti-Cas6 antibodies (Pre, preimmune; Imm, immune; S, supernatant; P, pellet).
  • the RNAs were separated on a 15% denaturing (7 M urea) polyacrylamide gel along with 5' end-labeled RNA markers (M).
  • RNAs extracted from WCE, preimmune (Pre) and immune (Imm) supernatants (Sup, left panel), and pellets (Pel, right panel) from an immunoprecipitation using anti-Cas6 antibodies were separated on a 15% denaturing (7 M urea) polyacrylamide gel along with 5' end-labeled RNA markers (M).
  • FIG 14. The proposed catalytic mechanism of Cas6. Tyr31 acts as a general base and His46 as a general acid, while Lys52 stabilizes a predicted pentavalent intermediate. The cleavage products generated contain a 5' OH and likely 2' -3' cyclic phosphate.
  • Figure 15. Amino acid sequences of Cas6 polypeptides from Archaea.
  • FIG. 16 Amino acid sequences of Cas6 polypeptides from Bacteria. The alphanumeric code above each sequence is the UniProtKB/TrEMBL accession number.
  • FIG. 17 Amino acid sequences of Cas6 polypeptides from Cyanobacteria.
  • the alphanumeric code above each sequence is the UniProtKB/TrEMBL accession number.
  • FIG. 18 Amino acid sequences of a Cas6 polypeptide (SEQ ID NO:2) and a nucleotide sequence (SEQ ID NO: 1) encoding the polypeptide.
  • FIG. 19 Alignments between Cas6 polypeptide regions and domains of hidden Markov models present in the TIGRFAM database of protein families. Amino acids 44 to 236 or 95 to 238 of SEQ ID NO:2; domain present in TIGR01877 (SEQ ID NO:188); domain present in PR01881 (SEQ ID NO: 189).
  • polypeptides having endoribonuclease activity are provided herein.
  • a polypeptide having endoribonuclease activity as described below is referred to herein as a Cas6 polypeptide, and the endoribonuclease activity is referred to herein as Cas6 endoribonuclease activity.
  • Examples of Cas6 polypeptides are depicted at Genbank Accession No. AAL81255 (SEQ ID NO:2), Figure 15, Figure 16, and Figure 17.
  • Other examples of Cas6 polypeptides provided herein include those having sequence similarity with the amino acid sequence of SEQ ID NO:2, an amino acid sequence depicted in Figure 15, an amino acid sequence depicted in Figure 16, or an amino acid sequence depicted at Figure 17.
  • a Cas6 polypeptide having sequence similarity with the amino acid sequence depicted at SEQ ID NO:2, Figure 15, Figure 16, or Figure 17 has Cas6 endoribonuclease activity.
  • a Cas6 polypeptide may be enriched, isolated, or purified from a microbe having a CRISPR locus and the cas (CRISPR-associated) locus, such as, but not limited to, Pyrococcus furiosus, or may be produced using recombinant techniques, or chemically or enzymatically synthesized using routine methods.
  • a Cas6 polypeptide may be enriched, isolated, or purified from a microbe that does not have CRISPR loci.
  • amino acid sequence of a Cas6 polypeptide having sequence similarity to an amino acid sequence disclosed herein, such as SEQ ID NO:2, an amino acid sequence depicted in Figure 15, an amino acid sequence depicted in Figure 16, or an amino acid sequence depicted in Figure 17, may include conservative substitutions of amino acids present in an amino acid sequence.
  • a conservative substitution is typically the substitution of one amino acid for another that is a member of the same class, i.e., an amino acid belonging to a grouping of amino acids having a particular size or characteristic such as charge, hydrophobicity, and/or hydrophilicity.
  • conservative amino acid substitutions are defined to result from exchange of amino acid residues from within one of the following classes of residues: Class I: Gly, Ala, Val, Leu, and Ile (representing aliphatic side chains); Class II: Gly, Ala, Val, Leu, Ile, Ser, and Thr (representing aliphatic and aliphatic hydroxyl side chains); Class III: Tyr, Ser, and Thr (representing hydroxyl side chains); Class IV: Cys and Met (representing sulfur-containing side chains); Class V: Glu, Asp, Asn and Gln (carboxyl or amide group containing side chains); Class VI: His, Arg and Lys (representing basic side chains); Class VII: Gly, Ala, Pro, Trp, Tyr, Ile, Val, Leu, Phe and Met (representing hydrophobic side chains); Class VIII: Phe, Trp, and Tyr (representing aromatic side chains); and Class IX: Asn and Gln (representing amide side chains).
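  • The residue classes listed above can be encoded directly to test whether a proposed substitution stays within a class. The following is a minimal sketch, not part of the disclosure: the one-letter residue codes, the function names, and the rule that sharing any single class is sufficient are illustrative assumptions.

```python
# Residue classes from the text, written with one-letter amino acid codes.
CLASSES = {
    "I": set("GAVLI"),         # aliphatic
    "II": set("GAVLIST"),      # aliphatic and aliphatic hydroxyl
    "III": set("YST"),         # hydroxyl
    "IV": set("CM"),           # sulfur-containing
    "V": set("EDNQ"),          # carboxyl or amide group containing
    "VI": set("HRK"),          # basic
    "VII": set("GAPWYIVLFM"),  # hydrophobic
    "VIII": set("FWY"),        # aromatic
    "IX": set("NQ"),           # amide
}


def shared_classes(a, b):
    """Return the names of every class that contains both residues."""
    return [name for name, members in CLASSES.items() if a in members and b in members]


def is_conservative(a, b):
    """Assumption: a substitution counts as conservative when the original and
    replacement residues differ and share at least one class."""
    return a != b and bool(shared_classes(a, b))


if __name__ == "__main__":
    for pair in [("L", "I"), ("K", "E"), ("N", "Q")]:
        print(pair, is_conservative(*pair), shared_classes(*pair))
```

  • Because the classes overlap, a pair such as Leu/Ile falls into several classes at once, while a pair such as Lys/Glu shares none and is reported as non-conservative.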
  • Bowie et al. discloses that there are two main approaches for studying the tolerance of a polypeptide sequence to change. The first method relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second approach uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene and selects or screens to identify sequences that maintain functionality. As stated by the authors, these studies have revealed that proteins are surprisingly tolerant of amino acid substitutions.
  • a Cas6 polypeptide may include a GhGxxxxxGhG (SEQ ID NO:190) motif (where "h" indicates a hydrophobic amino acid) near the C-terminus.
  • An Arg or Lys may be, and often is, found within the central stretch of 5 amino acids (i.e. xxxxx).
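  • A search for the GhGxxxxxGhG motif can be sketched with a regular expression. This is an illustration only: the hydrophobic residue set is assumed to be Class VII above, the function name is hypothetical, and the example sequence in the usage section is invented rather than taken from any Cas6 polypeptide.

```python
import re

# Assumption: "h" is any residue of the hydrophobic class (Class VII) listed in the text.
HYDROPHOBIC = "GAPWYIVLFM"

# A lookahead is used so that overlapping candidate motifs are all reported.
MOTIF = re.compile(rf"(?=(G[{HYDROPHOBIC}]G(.{{5}})G[{HYDROPHOBIC}]G))")


def find_signature_motifs(seq):
    """Return a list of (start_index, motif_text, has_arg_or_lys) tuples.

    has_arg_or_lys reports whether the central five-residue stretch
    contains an Arg (R) or Lys (K).
    """
    hits = []
    for match in MOTIF.finditer(seq.upper()):
        motif_text, central = match.group(1), match.group(2)
        hits.append((match.start(), motif_text, any(r in "RK" for r in central)))
    return hits


if __name__ == "__main__":
    # Invented toy sequence, for demonstration only.
    print(find_signature_motifs("MKTGLGAAKAAGFGEE"))
```

  • A position near the C-terminus can then be checked by comparing the reported start index with the sequence length.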
  • a Cas6 polypeptide contains at least one residue - the His46 shown in Figure 8 - that may play a role in catalysis, or conservative substitution thereof.
  • a Cas6 polypeptide may contain other residues - the Tyr31 and Lys52 shown in Figure 8 - which may also play a role in catalysis, or conservative substitution thereof.
  • the residue(s) expected to play a role in catalysis may be located near the G-rich loop that contains the Cas6 signature motif in the 3D structure of the protein as described in Example 1 herein. Other areas that are conserved, as well as areas that are not conserved, are shown in Figure 8.
  • Cas6 polypeptides may include domains present in the TIGRFAM database at accession numbers TIGR01877 and PF01881, as shown in Figure 19.
  • the TIGRFAM database includes families of polypeptides for which function is conserved (Haft et al., Nucl.
  • Cas6 polypeptides include those present in prokaryotic microbes having a CRISPR locus and a cas locus. Examples include those depicted in Figure 15, Figure 16, and Figure 17. Cas6 polypeptides can be easily identified in any microbe that includes a CRISPR locus. A coding region encoding a Cas6 polypeptide is typically in a cas locus located in close proximity to a CRISPR locus. Haft et al. (2005, PLoS
  • Haft et al describe the coding region encoding Cas6 polypeptides as being found in association with at least four separate CRISPR/Cas subtypes (Tneap, Hmari, Apern, and Mtube), and as typically being the cas coding region located most distal to the CRISPR locus.
  • Cas6 polypeptides may be identified using the resources available at the JCVI Comprehensive Microbial Resource (http://cmr.jcvi.org/cgi-bin/CMR/CmrHomePage.cgi).
  • Cas6 polypeptides that are useful in the methods described herein can be identified by the skilled person using routine methods.
  • prokaryotic microbes with known whole genomic sequences containing coding regions expected to encode a Cas6 polypeptide include Thermotoga maritima MSB8, Campylobacter fetus subsp.
  • OS Type A Porphyromonas gingivalis W83, Bacteroides fragilis YCH46, Bacteroides fragilis NCTC9343, Aquifex aeolicus VF5, Rubrobacter xylanophilus DSM 9941, Mycobacterium tuberculosis H37Rv (lab strain), Mycobacterium tuberculosis CDC 1551, Mycobacterium bovis subsp.
  • Cas6 polypeptides are known to the skilled person, see, for instance, members of the COG1583 group of polypeptides (available at the Clusters of Orthologous Groups of proteins (COGs) web page through the National Center for Biotechnology Information internet site, see also Tatusov et al., 1997, Science, 278:631-637, and Tatusov et al. 2003, BMC Bioinformatics, 4(1):41), members of the InterPro family having accession number IPR010156, Makarova et al., (2002, Nuc. Acids Res., 30:482-496) and Haft et al. (2005, PLoS Comput.
  • a Cas6 polypeptide having Cas6 endoribonuclease activity is able to cleave a target RNA polynucleotide. Whether a polypeptide has Cas6 endoribonuclease activity can be determined by in vitro assays. An in vitro assay may be carried out by combining a suitable target RNA polynucleotide with a polypeptide expected to have Cas6 endoribonuclease activity. The characteristics of the target RNA polynucleotide may depend upon the amino acid sequence of the Cas6 polypeptide. Target RNA polynucleotides are described below.
  • the target RNA polynucleotide may be between 0.01 pmol and 0.1 pmol, such as 0.05 pmol, and the Cas6 polypeptide may be between 50 nM and 1 μM, such as 200 nM or 500 nM.
  • the polypeptide to be tested may be enriched, isolated, or purified.
  • the polypeptide may be from a whole cell extract, such as an S100 extract, or from an immunoprecipitation reaction.
  • the suitable target RNA polynucleotide and polypeptide may be incubated in a buffer such as HEPES-KOH at 15 mM to 25 mM, preferably 20 mM, and pH between 6.5 and 7.5, preferably 7.0.
  • the mixture may also include KCl at 240 mM to 260 mM, preferably 250 mM, DTT at 0.7 mM to 0.8 mM, preferably 0.75 mM, MgCl2 at 1.0 mM to 2.0 mM, preferably 1.5 mM, glycerol at 5% to 15%, preferably 10%, and additional RNA, such as E. coli tRNA at 5 μg per 20-μL reaction volume.
  • This may be incubated at a suitable temperature such as at least 30°C, at least 40°C, at least 50°C, at least 60°C, at least 70°C, at least 80°C, or at least 90°C, for at least 30 minutes.
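  • The amounts and concentrations above can be related by simple arithmetic. The sketch below converts the preferred target RNA amount into a molar concentration for a 20-μL reaction and computes how much of each component to pipette. The stock concentrations and function names are hypothetical assumptions chosen for illustration, and only the final concentrations come from the text.

```python
REACTION_UL = 20.0  # reaction volume mentioned in the text


def molar_conc_nM(amount_pmol, volume_uL):
    """pmol per microliter equals micromolar, so multiply by 1000 to get nM."""
    return amount_pmol / volume_uL * 1000.0


def stock_volume_uL(final_conc, stock_conc, reaction_uL=REACTION_UL):
    """Volume of stock needed, from C_stock * V_stock = C_final * V_reaction.
    Both concentrations must be in the same unit."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the final value")
    return final_conc * reaction_uL / stock_conc


# Preferred final concentrations from the text (mM).
FINAL_mM = {"HEPES-KOH pH 7.0": 20.0, "KCl": 250.0, "DTT": 0.75, "MgCl2": 1.5}
# Hypothetical stock concentrations (mM); these are NOT from the source.
STOCK_mM = {"HEPES-KOH pH 7.0": 1000.0, "KCl": 2000.0, "DTT": 100.0, "MgCl2": 100.0}

volumes = {name: stock_volume_uL(FINAL_mM[name], STOCK_mM[name]) for name in FINAL_mM}

rna_nM = molar_conc_nM(0.05, REACTION_UL)  # 0.05 pmol target RNA in 20 uL
molar_excess = 200.0 / rna_nM              # 200 nM Cas6 relative to the RNA

if __name__ == "__main__":
    for name, vol in volumes.items():
        print(f"{name}: {vol:.2f} uL")
    print(f"target RNA: {rna_nM:.2f} nM; Cas6 molar excess: {molar_excess:.0f}-fold")
```

  • Under these assumptions, 0.05 pmol of target RNA in 20 μL corresponds to about 2.5 nM, so 200 nM polypeptide is present in roughly 80-fold molar excess over the substrate.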
  • a portion of the mixture may be removed and resolved on a native polyacrylamide gel to measure binding of the polypeptide to the target RNA polynucleotide.
  • the polypeptide may be removed by extraction and the mixture resolved on a denaturing (7 M urea), 12% to 15% polyacrylamide gel. The presence of a band that runs at a molecular weight that is less than the original target RNA polynucleotide indicates the polypeptide is a Cas6 polypeptide.
  • a polynucleotide encoding a Cas6 polypeptide having Cas6 endoribonuclease activity is referred to herein as a Cas6 polynucleotide.
  • Cas6 polynucleotides may have a nucleotide sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO:2.
  • An example of the class of nucleotide sequences encoding such a polypeptide is the nucleotide sequence depicted at Genbank Accession No. AE010223 (SEQ ID NO:1).
  • a polynucleotide encoding a Cas6 polypeptide represented by SEQ ID NO:2 is not limited to the nucleotide sequence disclosed at SEQ ID NO:1, but also includes the class of polynucleotides encoding such polypeptides as a result of the degeneracy of the genetic code.
  • the naturally occurring nucleotide sequence SEQ ID NO:1 is but one member of the class of nucleotide sequences encoding a polypeptide having the amino acid sequence SEQ ID NO:2.
  • the class of nucleotide sequences encoding a selected polypeptide sequence is large but finite, and the nucleotide sequence of each member of the class may be readily determined by one skilled in the art by reference to the standard genetic code, wherein different nucleotide triplets (codons) are known to encode the same amino acid.
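  • The statement that the class of encoding sequences is large but finite can be made concrete by counting. The following sketch multiplies the number of synonymous codons for each residue under the standard genetic code. The function name and the example peptides are illustrative assumptions, and SEQ ID NO:2 itself is not reproduced here.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code
# (61 sense codons in total, plus 3 stop codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}
STOP_CODONS = 3


def count_encodings(peptide, include_stop=False):
    """Count the distinct nucleotide sequences that encode the given peptide.

    peptide: amino acid sequence in one-letter code.
    include_stop: if True, also count the choice among the three stop codons.
    """
    total = prod(CODON_COUNTS[residue] for residue in peptide.upper())
    return total * STOP_CODONS if include_stop else total


if __name__ == "__main__":
    print(count_encodings("MKL"))                     # 1 * 2 * 6
    print(count_encodings("MKL", include_stop=True))  # times 3 stop codons
```

  • The count grows multiplicatively with length, which is why the class for a full-length polypeptide is very large while remaining enumerable in principle.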
  • Examples of other Cas6 polynucleotides include those having a nucleotide sequence encoding a polypeptide having the amino acid sequence shown in Figure 15, 16, or 17.
  • a Cas6 polynucleotide may have sequence similarity with the nucleotide sequence of SEQ ID NO:1. Cas6 polynucleotides having sequence similarity with the nucleotide sequence of SEQ ID NO:1 encode a Cas6 polypeptide.
  • a Cas6 polynucleotide may be isolated from a microbe having CRISPR loci, such as, but not limited to, Pyrococcus furiosus, or may be produced using recombinant techniques, or chemically or enzymatically synthesized using routine methods.
  • a Cas6 polynucleotide may further include heterologous nucleotides flanking the open reading frame encoding the Cas6 polypeptide.
  • heterologous nucleotides may be at the 5' end of the coding region, at the 3' end of the coding region, or the combination thereof.
  • the number of heterologous nucleotides may be, for instance, at least 10, at least 100, or at least 1000.
  • the present invention also includes fragments of the polypeptides described herein, and the polynucleotides encoding such fragments.
  • the present invention includes fragments of SEQ ID NO:2, as well as fragments having structural similarity to SEQ ID NO:2.
  • a polypeptide fragment may include a sequence of at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, or at least 100 amino acid residues.
  • a polypeptide disclosed herein or a fragment thereof may be expressed as a fusion polypeptide that includes a polypeptide disclosed herein or a fragment thereof and an additional heterologous amino acid sequence.
  • the additional amino acid sequence may be useful for purification of the fusion polypeptide by affinity chromatography.
  • affinity purification moieties to proteins. Representative examples may be found in Hopp et al. (U.S. Pat. No. 4,703,004), Hopp et al. (U.S. Pat. No. 4,782,137), Sgarlato (U.S. Pat. No. 5,935,824), and Sharma (U.S. Pat. No. 5,594,115).
  • the additional amino acid sequence may be a carrier polypeptide.
  • the carrier polypeptide may be used to increase the immunogenicity of the fusion polypeptide to increase production of antibodies that specifically bind to a polypeptide of the invention.
  • the invention is not limited by the types of carrier polypeptides that may be used to create fusion polypeptides.
  • Examples of carrier polypeptides include, but are not limited to, keyhole limpet hemocyanin, bovine serum albumin, ovalbumin, mouse serum albumin, rabbit serum albumin, and the like.
  • a polynucleotide disclosed herein, such as a polynucleotide encoding a Cas6 polypeptide or a polynucleotide encoding a target RNA polynucleotide, may be present in a vector.
  • Target RNA polynucleotides are described below.
  • a vector is a replicating polynucleotide, such as a plasmid, phage, or cosmid, to which another polynucleotide may be attached so as to bring about the replication of the attached polynucleotide. Construction of vectors containing a polynucleotide of the invention employs standard ligation techniques known in the art.
  • a vector may provide for further cloning (amplification of the polynucleotide), i.e., a cloning vector, or for expression of the polynucleotide, i.e., an expression vector.
  • the term vector includes, but is not limited to, plasmid vectors, viral vectors, cosmid vectors, and artificial chromosome vectors.
  • viral vectors include, for instance, adenoviral vectors, adeno-associated viral vectors, lentiviral vectors, retroviral vectors, and herpes virus vectors.
  • a vector is capable of replication in a microbial host, for instance, a fungus, such as S. cerevisiae, or a prokaryotic bacterium, such as E. coli.
  • the vector is a plasmid.
  • suitable host cells for cloning or expressing the vectors herein include eukaryotic cells. Suitable eukaryotic cells include fungi, such as S. cerevisiae and P. pastoris. In other aspects, suitable host cells for cloning or expressing the vectors herein include prokaryotic cells. Suitable prokaryotic cells include bacteria, such as gram-negative microbes, for example, E. coli. Other suitable prokaryotic cells include archaea, such as Haloferax volcanii. Vectors may be introduced into a host cell using methods that are known and used routinely by the skilled person. For example, calcium phosphate precipitation, electroporation, heat shock, lipofection, microinjection, and viral-mediated nucleic acid transfer are common methods for introducing nucleic acids into host cells.
  • Polynucleotides of the present invention may be obtained from microbes, for instance, members of the genus Pyrococcus, such as P. furiosus, or produced in vitro or in vivo.
  • methods for in vitro synthesis include, but are not limited to, chemical synthesis with a conventional DNA/RNA synthesizer.
  • Commercial suppliers of synthetic polynucleotides and reagents for such synthesis are well known.
  • polypeptides of the present invention may be obtained from microbes, or produced in vitro or in vivo.
  • An expression vector may optionally include a promoter that results in expression of an operably linked coding region.
  • Promoters act as regulatory signals that bind RNA polymerase in a cell to initiate transcription of a downstream (3' direction) coding region.
  • Promoters present in prokaryotic microbes typically include two short sequences at -10 (often referred to as the Pribnow box, or the -10 element) and -35 positions (often referred to as the -35 element), or a short sequence at -30 (often referred to as a TATA box) located 5' from the transcription start site, for bacterial and archaeal organisms, respectively.
  • the promoter used may be a constitutive or an inducible promoter.
  • Target RNA polynucleotides of the present invention do not encode a polypeptide, and expression of a target RNA polynucleotide present in a vector results in a non-coding RNA.
  • a vector including a target RNA polynucleotide may also include a transcription-start signal and/or a transcription terminator operably linked to the target RNA polynucleotide, but a translation start signal and/or translation stop signal typically are not operably linked to a target RNA polynucleotide.
  • Promoters have been identified in many microbes and are known to the skilled person. Many computer algorithms have been developed to detect promoters in genomic sequences, and promoter prediction is a common element of many gene prediction methods. Thus, the skilled person can easily identify nucleotide sequences present in microbes that will function as promoters.
  • An expression vector may optionally include a ribosome binding site and a start site (e.g., the codon ATG) to initiate translation of the transcribed message to produce the polypeptide. It may also include a termination sequence to end translation. A termination sequence is typically a codon for which there exists no corresponding aminoacyl-tRNA, thus ending polypeptide synthesis.
  • the polynucleotide used to transform the host cell may optionally further include a transcription termination sequence.
  • a vector introduced into a host cell optionally includes one or more marker sequences, which typically encode a molecule that inactivates or otherwise detects or is detected by a compound in the growth medium.
  • a marker sequence may render the transformed cell resistant to a selective agent, such as an antibiotic, or it may confer compound-specific metabolism on the transformed cell.
  • examples of a marker sequence include, but are not limited to, sequences that confer resistance to kanamycin, ampicillin, chloramphenicol, tetracycline, streptomycin, and neomycin.
  • 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMG-CoA) is an enzyme used for archaeal membrane lipid biosynthesis.
  • a marker is a nutritional marker.
  • a nutritional marker is typically a coding region that, when mutated in a cell, confers on that cell a requirement for a particular compound. Cells containing such a mutation will not grow on defined medium that does not include the appropriate compound, and cells receiving a coding region that complements the mutation can grow on the defined medium in the absence of the compound.
  • nutritional markers include, but are not limited to, coding regions encoding polypeptides in biosynthetic pathways, such as nucleic acid biosynthesis (e.g., biosynthesis of uracil), amino acid biosynthesis (e.g., biosynthesis of histidine and tryptophan), vitamin biosynthesis (e.g., biosynthesis of thiamine), and the like.
  • Polypeptides useful in the methods described herein, such as the polypeptides described herein and other Cas6 polypeptides, may be obtained from a microbe that has a CRISPR locus. Examples of such microbes are listed above. Polypeptides and fragments thereof useful in the present invention may be produced using recombinant DNA techniques, such as an expression vector present in a cell. Such methods are routine and known in the art. The polypeptides and fragments thereof may also be synthesized in vitro, e.g., by solid phase peptide synthetic methods. The solid phase peptide synthetic methods are routine and known in the art.
  • a polypeptide obtained from a microbe having a CRISPR locus, produced using recombinant techniques, or by solid phase peptide synthetic methods may be further purified by routine methods, such as fractionation on immunoaffinity or ion-exchange columns, ethanol precipitation, reverse phase HPLC, chromatography on silica or on an anion-exchange resin such as DEAE, chromatofocusing, SDS-PAGE, ammonium sulfate precipitation, gel filtration using, for example, Sephadex G-75, or ligand affinity.
  • obtaining polypeptides includes conditions that minimize RNAse and proteinase activity
  • the present invention also includes genetically modified microbes that have a polynucleotide encoding a target RNA polynucleotide, a Cas6 polypeptide, or the combination.
  • a genetically modified microbe may exhibit production of an exogenous polynucleotide or an exogenous polypeptide disclosed herein, or increased production of an endogenous Cas6 polypeptide.
  • a polynucleotide encoding a target RNA polynucleotide or a Cas6 polypeptide disclosed herein may be present in the microbe as a vector or integrated into a chromosome.
  • microbes that can be genetically modified include, but are not limited to, eukaryotic cells, such as S. cerevisiae and P. pastoris, bacteria, such as gram-negative microbes, for example, E. coli, and archaea, such as Haloferax volcanii.
  • the methods include incubating a target RNA polynucleotide with a Cas6 polypeptide under conditions suitable for cleavage of the polynucleotide by the Cas6 polypeptide.
  • Restriction endonucleases recognize a specific nucleotide sequence (a recognition domain) of a target polynucleotide and cleave the target at a specific location which can be within the recognition domain or outside of the recognition domain.
  • a Cas6 polypeptide cleaves a target outside of the recognition domain, but unlike a restriction endonuclease, the nucleotide sequence to which different Cas6 polypeptides bind can vary.
  • Target polynucleotides described herein are not limited to those possessing a recognition domain with a specific nucleotide sequence.
  • the target polynucleotide may be RNA.
  • a target RNA polynucleotide has a Cas6 recognition domain, i.e., the site to which a Cas6 polypeptide binds, and a cleavage site, i.e., the site enzymatically cleaved by a Cas6 polypeptide. While the term target RNA polynucleotide suggests the nucleotides are ribonucleotides, polynucleotides described herein also include the corresponding deoxyribonucleotide sequence, and the RNA and DNA complements thereof.
  • a target RNA polynucleotide may be based on a nucleotide sequence from a CRISPR locus.
  • a CRISPR locus of a prokaryotic microbe includes, from 5' to 3', a repeat followed immediately by a spacer (referred to herein as a "repeat-spacer unit").
  • a CRISPR locus includes multiple repeat-spacer units. In a CRISPR locus, each repeat is nearly identical (Barrangou et al., U.S.
  • each spacer of a CRISPR locus is typically a different nucleotide sequence.
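  • The repeat-spacer organization described above can be illustrated by splitting an array sequence on its repeat. This is a simplified sketch: it assumes every repeat copy is exactly identical (the text notes that repeats are only nearly identical), and the function name and toy sequences are hypothetical.

```python
def repeat_spacer_units(array, repeat):
    """Split a CRISPR array into (repeat, spacer) units, read 5' to 3'.

    Simplifying assumption: each repeat occurrence matches `repeat` exactly.
    Sequence before the first repeat is ignored, and the spacer of the last
    unit is whatever follows the final repeat.
    """
    units = []
    pos = array.find(repeat)
    while pos != -1:
        spacer_start = pos + len(repeat)
        next_pos = array.find(repeat, spacer_start)
        spacer = array[spacer_start:] if next_pos == -1 else array[spacer_start:next_pos]
        units.append((repeat, spacer))
        pos = next_pos
    return units


if __name__ == "__main__":
    # Toy array with an invented 5-nt repeat and 4-nt spacers.
    toy_repeat = "GTTCC"
    toy_array = toy_repeat + "AAAT" + toy_repeat + "CGCG" + toy_repeat + "TTAA"
    for unit in repeat_spacer_units(toy_array, toy_repeat):
        print(unit)
```

  • Handling nearly identical repeats would require approximate matching, which is outside the scope of this sketch.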
  • the Cas6 endoribonuclease activity of a Cas6 polypeptide disclosed herein cleaves a repeat region derived from a CRISPR locus.
  • the location of the cleavage site is on the 5' side of the nucleotide located 10, 9, 8, 7, 6, or 5 nucleotides from the 3' end of the repeat.
  • the cleavage site is on the 5' side of the nucleotide located 8 nucleotides from the 3' end of the repeat.
  • the nucleotide sequence of a repeat present in a CRISPR locus can easily be identified in any microbe that includes a CRISPR locus.
  • the genomic sequences of many microbes are known, and the location of CRISPR loci in these microbes is often known, or can easily be located using routine bioinformatic methods known in the art.
  • Edgar BMC Bioinformatics, 2007, 8:18
  • Acids Res., 2007, 35(Web Server issue):W52-W57) describe a computer program which identifies CRISPRs from genomic sequences, extracts the repeat and spacer sequences, and constructs a database which is automatically updated monthly using newly released genome sequences.
  • the nucleotide sequence of a repeat in a CRISPR locus can be determined by the skilled person using routine methods.
  • a repeat present in Pyrococcus furiosus is GTTCCAATAAGACTAAAATAGA ↓ ATTGAAAG (SEQ ID NO:191), and the location of the site cleaved by a Cas6 polypeptide, such as SEQ ID NO:2, is shown by the arrow, i.e., 8 nucleotides from the 3' end of the repeat.
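  • The position of the cleavage site relative to the 3' end of the repeat can be expressed as a short calculation. The sketch below uses the 30-nucleotide Pyrococcus furiosus repeat given above, converts it to RNA, and splits it 8 nucleotides from the 3' end. The function names are illustrative assumptions.

```python
# P. furiosus repeat from the text, written as DNA without the cleavage marker.
REPEAT_DNA = "GTTCCAATAAGACTAAAATAGAATTGAAAG"


def to_rna(dna):
    """Convert a DNA sequence to the corresponding RNA sequence (T -> U)."""
    return dna.upper().replace("T", "U")


def cleave_repeat(repeat, offset_from_3prime=8):
    """Split a repeat on the 5' side of the nucleotide located
    `offset_from_3prime` nucleotides from its 3' end.

    Returns (upstream_fragment, downstream_fragment).
    """
    if not 0 < offset_from_3prime < len(repeat):
        raise ValueError("offset must fall inside the repeat")
    cut = len(repeat) - offset_from_3prime
    return repeat[:cut], repeat[cut:]


if __name__ == "__main__":
    upstream, downstream = cleave_repeat(to_rna(REPEAT_DNA))
    print(upstream, downstream)
```

  • Passing a different offset (for example 5 to 10) models the alternative cleavage positions mentioned in the text.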
  • a target RNA polynucleotide may include other nucleotide sequences downstream of the cleavage site, i.e., the nucleotides that correspond to the 3' end of a repeat present in a microbe and downstream of a cleavage site may be different relative to the nucleotides present in a repeat present in a microbe. It is expected that the nucleotides downstream of a cleavage site may include at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 substitutions when compared to the nucleotides present in a repeat present in a microbe.
  • a target RNA polynucleotide based on a repeat present in a microbe may include fewer than 8 nucleotides downstream of the cleavage site.
  • a target RNA polynucleotide based on a repeat present in a microbe may include at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 nucleotides downstream of the cleavage site.
  • one or both of the nucleotides flanking the cleavage site are the same as found in the wild-type microbe.
  • a target RNA polynucleotide based on a repeat obtained from a particular microbe may include other variations in nucleotide sequence relative to the repeat present in the microbe. Typically, such variations occur outside of the Cas6 recognition domain.
  • a Cas6 recognition domain is located near the 5' end of a repeat.
  • a Cas6 recognition domain includes the nucleotide beginning at position 1 (i.e., the nucleotide at the 5' end of the repeat) and extends to nucleotide 6, nucleotide 7, nucleotide 8, nucleotide 9, nucleotide 10, nucleotide 11, nucleotide 12, or nucleotide 13.
  • the Cas6 recognition domain of a target RNA polynucleotide may be defined relative to its distance from the cleavage site.
  • a Cas6 recognition domain includes nucleotides located 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and/or 21 or more nucleotides upstream of the cleavage site.
  • the size of a Cas6 recognition domain may span at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, or at least 8 nucleotides to no greater than 10, no greater than 11, no greater than 12 nucleotides, or no greater than 13 nucleotides.
  • a Cas6 recognition domain may include the nucleotides located 15, 18, and 20 nucleotides upstream of the cleavage site, and can be represented as UNCNNUNNNNNNNNNNNNNN
  • a Cas6 recognition domain includes the nucleotides located 14 to 21 nucleotides upstream of the cleavage site, and can be represented as
  • the nucleotide sequence between the Cas6 recognition domain and the cleavage site may vary from the sequence present in a wild-type repeat.
  • the Cas6 polypeptide used to cleave the target RNA polynucleotide is a Cas6 polypeptide present in that microbe (or a microbe with a similar CRISPR repeat sequence), or has sequence similarity to such a Cas6 polypeptide.
  • the Cas6 polypeptide is SEQ ID NO:2 or has sequence similarity to SEQ ID NO:2.
  • the Cas6 polypeptide is SEQ ID NO:3 or has sequence similarity to SEQ ID NO:3.
  • when a target RNA polynucleotide is based on a repeat identical or similar to that present in a microbe listed in Figure 15, Figure 16, or Figure 17, the Cas6 polypeptide is, or has sequence similarity to, a Cas6 polypeptide present in that microbe.
  • the Cas6 polypeptide may also be one present in a microbe with an identical or similar CRISPR repeat sequence as that in the target RNA polynucleotide.
  • RNA polynucleotide and Cas6 polypeptide can be used to result in cleavage of a target RNA polynucleotide.
  • a target RNA polynucleotide may include an additional polynucleotide at the 3' end, at the 5' end, or at both ends. If the target RNA polynucleotide is identical to a CRISPR repeat, the additional polynucleotide may be referred to as a heterologous polynucleotide.
  • This additional polynucleotide at the 3' end can be chosen by a skilled person and cleaved using the methods described herein. Thus, the skilled person can design a target RNA polynucleotide that will result in the production of an RNA with a predictable and known 5' end.
  • a target RNA polynucleotide may include at least 10, at least 50, or at least 100 additional nucleotides at the 3' end.
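  • The design of a target RNA polynucleotide that yields an RNA with a known 5' end can be sketched as follows. The example appends an additional polynucleotide to the 3' end of a repeat and predicts the downstream cleavage product. The function names and the additional sequence are hypothetical, and cleavage 8 nucleotides from the 3' end of the repeat is assumed.

```python
# P. furiosus repeat from the text, written as RNA.
REPEAT_RNA = "GUUCCAAUAAGACUAAAAUAGAAUUGAAAG"


def design_target(repeat_rna, additional_rna):
    """Build a target RNA: the repeat followed by an additional 3' polynucleotide."""
    return repeat_rna + additional_rna


def predicted_downstream_product(target, repeat_length, offset_from_3prime=8):
    """Predict the RNA released downstream of the cleavage site.

    Assumes cleavage on the 5' side of the nucleotide located
    `offset_from_3prime` nucleotides from the 3' end of the repeat, so the
    product starts with the last `offset_from_3prime` repeat nucleotides.
    """
    if not 0 < offset_from_3prime < repeat_length <= len(target):
        raise ValueError("cleavage site must lie inside the repeat portion of the target")
    return target[repeat_length - offset_from_3prime:]


if __name__ == "__main__":
    # Invented additional sequence, for demonstration only.
    target = design_target(REPEAT_RNA, "GGGAAACCC")
    print(predicted_downstream_product(target, len(REPEAT_RNA)))
```

  • Under these assumptions the product always begins with the same repeat-derived nucleotides, regardless of the additional sequence chosen.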
  • the methods may be in vitro or in vivo.
  • Practicing the method in vivo may include introducing a polynucleotide into a microbe.
  • the introduced polynucleotide may include the target RNA polynucleotide, or the introduced polynucleotide may encode the target RNA polynucleotide.
  • the microbe may be, but is not limited to, a genetically modified microbe.
  • An example of a genetically modified microbe for use in the methods includes one with an exogenous polynucleotide encoding a Cas6 polypeptide.
  • the method may be practiced at a suitable temperature such as at least 30°C, at least 40°C, at least 50°C, at least 60°C, at least 70°C, at least 80°C, or at least 90°C.
  • RNA polynucleotides that include a Cas6 recognition domain as described above.
  • the polynucleotide may be RNA, or may be DNA. If it is DNA it may be operably linked to a regulatory sequence, such as a promoter, and may be present in a vector.
  • the polynucleotide may include nucleotides downstream of the cleavage site to facilitate the ligation of a different polynucleotide downstream of the cleavage site.
  • nucleotides downstream of the cleavage site may include a restriction endonuclease site or a multiple cloning site.
  • kits may include one or more of the polynucleotides or polypeptides described herein.
  • a kit may include a target RNA polynucleotide or a DNA polynucleotide encoding a target RNA polynucleotide, a polynucleotide encoding a Cas6 polypeptide, a Cas6 polypeptide, or a combination thereof.
  • Kits may be used, for instance, for modifying a microbe to express a Cas6 polypeptide and/or a target RNA polynucleotide.
  • Kits may be used for in vitro cleavage of a target RNA polynucleotide.
  • the kit components are present in a suitable packaging material in an amount sufficient for at least one assay.
  • other reagents such as buffers and solutions needed to practice the invention are also included.
  • the phrase "packaging material" refers to one or more physical structures used to house the contents of the kit.
  • the packaging material is constructed by well known methods, preferably to provide a sterile, contaminant-free environment.
  • the packaging material has a label which indicates that the components can be used for methods as described herein.
  • the packaging material contains instructions indicating how the materials within the kit are employed.
  • the term "package" refers to a solid matrix or material such as glass, plastic, paper, foil, and the like, capable of holding within fixed limits a kit component.
  • a package can be a glass vial used to contain milligram quantities of a polypeptide or polynucleotide.
  • Instructions for use typically include a tangible expression describing the reagent concentration or at least one assay method parameter.
  • RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes.
  • CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses.
  • the CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs.
  • a Pyrococcus furiosus Cas6 was identified as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs.
  • Cas6 interacts with a specific sequence motif in the 5' region of the CRISPR repeat element and cleaves at a defined site within the 3' region of the repeat.
  • the 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins.
  • the predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes.
  • Purification of PF1131 protein for cleavage and RNA-binding assays.
  • N-terminal, 6x-histidine-tagged PF1131 protein (PfCas6 from P. furiosus DSM 3638 strain) was expressed in Escherichia coli BL21 codon+ (DE3, Invitrogen) cells harboring a pET24d plasmid containing the appropriate gene insert (obtained from Michael Adams, University of Georgia, Athens, GA). Protein expression was induced by growing the cells to an OD600 of 0.6 and adding isopropylthio-β-D-galactoside (IPTG) to a final concentration of 1 mM.
  • the cells were disrupted by sonication (Misonix Sonicator 3000) in buffer A (20 mM sodium phosphate [pH 7.0], 500 mM NaCl and 0.1 mM phenylmethylsulfonyl fluoride). The lysate was then cleared by centrifugation and the supernatant was incubated for 20 minutes at 70°C. This sample was centrifuged and the supernatant was applied to a Ni-NTA agarose column (Qiagen) that had been equilibrated with Buffer A. The protein was eluted from the column with Buffer A containing 350 mM imidazole. The purity of the protein was evaluated by SDS-PAGE and staining with Coomassie blue. Buffer exchange into 40 mM HEPES-KOH (pH 7.0), 500 mM KCl was carried out using Microcon PL-10 filter columns (Millipore). The protein concentration was determined by the BCA assay (Pierce).
  • Synthetic RNAs listed in Table 1 and the RNA size standards (Decade Markers) were purchased from Integrated DNA Technologies (IDT) and
  • RNAs were 5'-end-labeled with T4 polynucleotide kinase (Ambion) in a 20-µL reaction containing 20 pmol of RNA, 500 µCi of [γ-32P] ATP (3000 Ci/mmol; MP Biomedicals), and 20 U of T4 kinase.
  • the RNAs were separated by electrophoresis on denaturing (7 M urea) 15% polyacrylamide gels, and the appropriate RNA species were excised from the gel with a sterile razor blade guided by a brief autoradiographic exposure.
  • RNAs were eluted from the gel slices by end-over-end rotation in 400 µL of RNA elution buffer (500 mM NH4OAc, 0.1% SDS, 0.5 mM EDTA) for 12-14 h at 4°C.
  • the RNA was then extracted with phenol/chloroform/isoamyl alcohol (PCI, 25:24:1 at pH 5.2), and precipitated with 2.5 volumes of 100% ethanol in the presence of 0.3 M sodium acetate and 20 µg of glycogen after incubation for 1 hour at -20°C.
  • RNAs were generated by in vitro transcription using T7 RNA polymerase (Ambion) and uniformly labeled with [α-32P] UTP (700 Ci/mmol; MP Biomedicals) as described (Baker et al., 2005. Genes & Dev. 19: 1238-1248).
  • the templates used were either annealed DNA oligonucleotides or PCR products (see Tables 1, 2), both containing the T7 promoter sequence.
  • a typical reaction contained 200 ng of PCR product or annealed deoxyoligonucleotides, 1 mM DTT, 10 U SUPERase-IN RNase inhibitor (Ambion), 500 µM ATP, CTP, and GTP, 50 µM UTP, 30 µCi [α-32P] UTP, 1× transcription buffer (Ambion), and 40 U T7 RNA polymerase in a total volume of 20 µL.
  • oligos were either annealed directly (IVT) or were used as PCR primers to generate template DNA (PCR) for in vitro transcription reactions. Oligo sequences are listed in Table 1.
  • reaction conditions were used to assay the ability of PfCas6 protein to bind to and to cleave substrate RNAs. These reactions were initiated by incubating 0.05 pmol of 32P-radiolabeled RNAs (either uniformly or 5'-end-labeled) with up to 1 µM (as indicated in the figure legends) of PfCas6 protein in 20 mM HEPES-KOH (pH 7.0), 250 mM KCl, 0.75 mM DTT, 1.5 mM MgCl2, 5 µg of E. coli tRNA, and 10% glycerol in a 20-µL reaction volume for 30 minutes at 70°C.
  • RNA cleavage was assayed using the remaining half of the reaction by deproteinizing (PCI extraction and ethanol precipitation) the RNAs and separating them by electrophoresis on denaturing (7 M urea), 12%-15% polyacrylamide gels. Gels were dried and the radiolabeled RNAs visualized by phosphorimaging.
  • An RNA cleavage reaction was set up using 5' end labeled repeat RNA as described above. Alkaline hydrolysis and RNase T1 (0.1 U) ladders were generated as described previously (Youssef et al., 2007. Nucleic Acids Res. 35: 6196-6206). Following the reactions, the RNAs were extracted with PCI, ethanol precipitated, and separated by electrophoresis on large, denaturing (7 M urea), 15% polyacrylamide (19:1 acrylamide:bis) gels. The gels were dried and the RNAs visualized by phosphorimaging.
  • N-terminal polyhistidine-tagged wild-type and selenomethionine-labeled PF1131 protein was expressed in E. coli and purified from cell extract by heat denaturation and two chromatography steps.
  • the cells were disrupted by sonication in a buffer containing 25 mM sodium phosphate (pH 7.5), 5% (v/v) glycerol, 1 M NaCl, 5 mM β-mercaptoethanol (βME), and 0.2 mM phenylmethylsulfonyl fluoride. The cell lysate was heated for 15 minutes to 70°C before being pelleted.
  • the supernatant was then directly loaded at room temperature onto a Ni-NTA (Qiagen) column equilibrated with 25 mM sodium phosphate (pH 7.5), 5% (v/v) glycerol, 1 M NaCl, and 5 mM imidazole.
  • the column was washed with the loading buffer containing 25 mM imidazole and then the bound protein was eluted using the loading buffer containing 350 mM imidazole.
  • Both the wild-type and selenomethionine-labeled PF1131 protein were crystallized using vapor diffusion in a hanging drop at 30°C.
  • the droplets of PF1131 at 40 mg/mL were combined in equal volume with a well solution that contained 50 mM MES (pH 6.0), 30 mM MgCl2, and 15% (v/v) isopropanol.
  • the crystals formed in 1-5 days with a cubic shape and to a size of ~0.4 mm × 0.4 mm × 0.4 mm.
  • Crystals were soaked briefly in a cryo-protecting solution containing the mother liquor plus 20% (w/v) polyethylene glycol 4000 before being flash frozen in a nitrogen stream at 100 Kelvin.
  • the crystals of the native and selenomethionine-labeled PF1131 diffracted to dmin 1.8-2.2 Å at the Southeast Regional Collaborative Access Team (SER-CAT) beamline 22ID.
  • the space group of the crystals was determined to be P3₂21 and the cell dimensions are listed in Table 3.
  • a single wavelength data set was collected at the anomalous peak of selenium from a selenomethionine-labeled crystal.
  • the solvent content was calculated to be 54.9% if the crystal was assumed to contain one PF1131 in one asymmetric unit.
  • the structure of PF1131 was solved by a SAD phasing method using the automated crystallographic structure solution program SOLVE (Terwilliger and Berendzen, 1999. Acta Crystallogr. D Biol. Crystallogr. 55: 849-861).
  • the initial model traced by SOLVE was further improved by the program COOT (Emsley and Cowtan, 2004. Acta Crystallogr. D Biol. Crystallogr. 60: 2126-2132), followed by refinement using CNS (Brunger et al, 1998. Acta Crystallogr. D Biol. Crystallogr.
  • the psiRNAs, which are thought to be primary agents in prokaryotic genome defense, are derived from CRISPR RNA transcripts that consist of a series of individual invader targeting sequences separated by a common repeat sequence (Fig. 1A).
  • To identify the enzyme required for dicing CRISPR RNA transcripts and releasing the individual embedded psiRNAs, a number of recombinant P. furiosus Cas proteins were screened for the ability to cleave CRISPR repeat sequences.
  • a single protein was identified, Cas6 (PF1131), that cleaves specifically within the repeat sequence of radiolabeled substrate RNAs consisting of either a guide (invader targeting or "spacer") sequence flanked by two repeat sequences or the repeat sequence alone (Fig.
  • More than 40 CRISPR-associated genes have been identified; however, only a subset of the cas genes is found in any given genome, and no cas gene appears to be present in all organisms that possess the CRISPR-Cas system (Haft et al., 2005. PLoS Comput. Biol. 1:e60; Makarova et al., 2006. Biol. Direct 1:7).
  • Cas6 is among the most widely distributed Cas proteins and is found in both bacteria and archaea (Haft et al., 2005. PLoS Comput. Biol. 1:e60).
  • a distinct protein with similar activity was very recently reported in Escherichia coli (Brouns et al., 2008. Science 321: 960-964).
  • Cse3 (CRISPR-Cas system subtype E. coli) is also referred to as CasE.
  • Both Cas6 and Cse3 are members of the RAMP (repeat-associated mysterious protein) superfamily, as are a large number of the Cas proteins (Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Makarova et al., 2006. Biol. Direct 1: 7).
  • RAMP proteins contain G-rich loops and are predicted to be RNA-binding proteins (Makarova et al., 2002.
  • Cas6 is distinguished from the many other RAMP family members by a conserved sequence motif within the predicted C-terminal G-rich loop (consensus GhGxxxxxGhG, where h is hydrophobic and xxxxx has at least one lysine or arginine) (Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Haft et al., 2005. PLoS Comput. Biol. 1: e60). Nuclease activity was not predicted for Cas6 based on sequence analysis.
  • RNA sequence requirements of Cas6 binding and endonucleolytic cleavage were investigated.
  • Fig. 4A gel mobility shift assays with a series of RNAs. The results indicate that sequences in the 5' region of the CRISPR repeat are important for PfCas6 binding.
  • rapid cleavage prevents unambiguous observation of PfCas6 binding to the intact repeat (Fig. 3C, D), although binding can be observed with the cleavage site mutant (Fig. 3D) and at reduced temperatures where PfCas6 cleavage activity is inhibited (Fig. 5).
  • incubation of PfCas6 with the repeat RNA Fig.
  • Although nucleotides at the 5' end of the CRISPR repeat are sufficient for robust PfCas6 binding, cleavage appears to involve additional elements.
  • mutations that disrupt protein binding also eliminate cleavage activity (Fig. 6, panels d,e).
  • other mutations dramatically reduced cleavage efficiency without disrupting PfCas6 binding.
  • substitution of the two adenosines at the cleavage site disrupts cleavage but not binding (Fig. 3C, D).
  • substitution of the last 8 nucleotides of the repeat specifically disrupted cleavage (Fig. 6, panel f).
  • P. furiosus has seven CRISPR loci with five slightly varied repeat sequences, and the elements that we identified as most important for Cas6 recognition and cleavage map to the regions of greatest sequence conservation. Variation is observed at only one position within each of the first 12 and last 11 nucleotides of the P. furiosus repeat sequences, consistent with the importance of these two regions in Cas6 binding and cleavage. On the other hand, variation occurs at three positions between the binding and cleavage sites (positions 14, 16, and 19), suggesting that nucleotide identities are less important in this region.
  • PfCas6 contains a duplicated ferredoxin-like fold linked by an extended peptide (residues 118-123).
  • the close arrangement of the β-sheets of the two ferredoxin-like folds creates a well-formed central cleft (Fig. 7A).
  • the ferredoxin fold is a common protein fold also found in the structures of other RNA-binding proteins including the well-characterized RNA recognition motif (RRM), which primarily functions in ssRNA binding (Maris et al., 2005. FEBS J. 272: 2118-2131).
  • PfCas6 appears to exploit a distinct mechanism of base-specific ssRNA recognition. Most notably, PfCas6 lacks the prevalent aromatic and positive residues that characterize the β-sheets of RRMs (Maris et al., 2005. FEBS J. 272: 2118-2131). The central regions of both the front and back surfaces of PfCas6 display positive potential that coincides with regions of conserved amino acids (Fig. 7), suggesting that the composite surfaces formed by the tandem ferredoxin-like folds correspond to RNA-binding sites.
  • The structure of PfCas6 allows us to predict the site of catalysis and catalytic mechanism of the enzyme.
  • Several candidate catalytic residues are evident as strictly conserved residues in aligned Cas6 sequences (Fig. 8). These include Tyr31, His46, and Lys52, which cluster within 6 Å of each other and are found in close proximity to the G-rich loop that contains the Cas6 signature motif (Fig. 7B). These three residues may form a catalytic triad for RNA cleavage similar to that of the tRNA intron splicing endonuclease (Calvin and Li, 2008. Cell. Mol. Life Sci. 65: 1176-1185).
  • the G-rich loop is located immediately above the putative catalytic triad and may facilitate the placement of CRISPR repeat RNA substrates. Consistent with the corresponding predicted general acid-base catalytic mechanism (proposed for the splicing endonuclease) (Calvin and Li, 2008. Cell. Mol. Life Sci. 65: 1176-1185), PfCas6 does not require divalent metals and, like other metal-independent nucleases, cleaves on the 5' side of the phosphodiester bond, likely generating 5' hydroxyl (OH) and 2',3'-cyclic phosphate RNA end groups (Fig. 9). Finally, while binding of the enzyme occurs over a wide temperature range, PfCas6 cleavage activity is sharply temperature-dependent with significantly more activity at 70°C than 37°C (Fig. 5).
  • Cas6 plays a central role in the production of the psiRNAs in the emerging prokaryotic RNAi pathway.
  • Cas6 is a novel riboendonuclease. Through direct binding and cleavage of CRISPR repeat sequences, Cas6 dices long, single-stranded CRISPR primary transcripts into units that consist of an individual guide sequence flanked by a short (8-nt) repeat sequence at the 5' end and by the remaining repeat sequence at the 3' end of the RNA (Fig. 1A). Mature psiRNAs retain the short repeat-derived sequence established by Cas6 at their 5' ends in P.
  • Cas6 is evolutionarily, structurally, and catalytically distinct from the Dicer proteins that function in the release of individual RNAs that mediate gene silencing in eukaryotes (Hammond, 2005. FEBS Lett. 579: 5822-5829; Jaskiewicz and Filipowicz, 2008. Curr. Top. Microbiol. Immunol. 320: 77-97).
  • Cas6 is one of three different ferredoxin fold Cas proteins recently found to possess nuclease activity.
  • Cas2, another protein found in many of the prokaryotes that possess the CRISPR-Cas system, cleaves U-rich ssRNA (Beloglazova et al., 2008. J. Biol. Chem.
  • the mechanism of action of Cas6 seems to be distinct from that of Cas2, which appears to be a metal-dependent, hydrolytic enzyme (Beloglazova et al., 2008. J. Biol. Chem. 283: 20361-20371).
  • the role of Cas2 in the pRNAi pathway is currently unknown.
  • the E. coli Cse3 protein functions like Cas6 as a CRISPR repeat cleaving enzyme (Brouns et al., 2008. Science 321: 960-964). Cse3 also cleaves RNA in a divalent metal-independent manner (Brouns et al., 2008. Science 321: 960-964).
  • Cas6 substrate recognition was probed at single nucleotide resolution using RNA footprinting.
  • the results of this analysis confirm that sequence elements in the 5' region of the repeat are the primary determinants for recognition by Cas6 and that nucleotides 2-8 likely have direct contact with Cas6.
  • a critical role of the predicted catalytic triad was established and an acid/base catalytic mechanism involving these three amino acids is proposed.
  • native Cas6 was isolated from P. furiosus extract, was shown to cleave CRISPR repeat RNA, and was found to co-purify with several crRNA (CRISPR RNA) processing intermediates.
  • RNAs were 5' end labeled with T4 polynucleotide kinase (Applied Biosystems) and [γ-32P] ATP (7000 Ci/mmol; MP Biomedicals) as described in Example 1. End-labeling at the 3' end was performed with T4 RNA ligase (Promega) and [32P] pCp (2500 Ci/mmol; MP Biomedicals).
  • a typical reaction contained 10 pmol of RNA, 20 U T4 RNA ligase, 10 U SUPERase-IN™ RNase inhibitor (Applied Biosystems), 1× T4 RNA ligase buffer (Promega), 20% polyethylene glycol 3350, and ~12 pmol [32P] pCp.
  • the uniformly labeled CRISPR repeat RNA substrate was generated by in vitro transcription by T7 polymerase using annealed DNA oligos containing the T7 promoter sequence as a template (see Tables 4 and 5 for sequence information) in the presence of [α-32P] UTP (MP Biomedicals) and purified as described in Example 1. All radiolabeled RNAs were extracted with phenol/chloroform/isoamyl alcohol (PCI), precipitated with ethanol, and gel purified as described in Example 1.
  • RNA footprinting. Lead (II)-induced and RNase A cleavage were carried out essentially as described previously (Youssef et al., 2007. Nucleic Acids Res; 35:6196-206). Briefly, 0.1 pmol of 32P end-labeled RNA (either 5' or 3') was incubated in the absence (free RNA) or presence of increasing concentrations of Cas6 at 65-70°C for 30 minutes in buffer A (20 mM HEPES-KOH [pH 7.0], 500 mM KCl). Lead (II)-induced cleavage was initiated by the addition of 15 mM Pb(II) acetate (lead (II) acetate) prepared fresh in sterile water.
  • Reactions were carried out at room temperature for 10 minutes and were stopped by the addition of EDTA to a final concentration of 20 mM followed by PCI extraction and ethanol precipitation.
  • RNase A cleavage was initiated by the addition of 0.01 ng of RNase A (Applied Biosystems) and incubated at 37° C for 15 minutes. Reactions were stopped by PCI extraction followed by ethanol precipitation. Alkaline hydrolysis ladders (cleavage after each nucleotide) were generated as described previously (Youssef et al., 2007. Nucleic Acids Res; 35:6196-206).
  • RNA loading dye (10 M urea, 2 mM EDTA, 0.5% SDS, and 0.02% [w/v] each bromophenol blue and xylene cyanol) and separated on 38x30 cm 15% polyacrylamide (acrylamide:bis ratio 19:1) 7 M urea containing gels. The gels were dried and RNAs visualized by phosphor imaging.
  • RNA binding and cleavage reactions were carried out as described in Example 1. Briefly, 0.05 pmol of uniformly 32P-labeled RNA was incubated in the absence (free RNA) or presence of increasing concentrations of Cas6 (as indicated in figure legends) in buffer A for 30 minutes at 65-70°C. Half of each reaction was run on 8% native polyacrylamide gels to assess RNA binding by gel mobility shift analysis. RNA cleavage was assessed by separation of the RNAs on denaturing, 7 M urea containing, 15% polyacrylamide gels following PCI extraction and ethanol precipitation.
  • P. furiosus cells were lysed in 10 mL 50 mM Tris (pH 8.0) in the presence of 100 U RQ1 DNase (Promega) and 0.1 mM phenylmethanesulfonylfluoride (PMSF). The extract was then subjected to ultracentrifugation at 100,000g for 90 minutes. The resulting S100 was then stored at -80°C until use.
  • IgY was purified from the egg yolks by polyethylene glycol (PEG) precipitation as described previously (Polson et al., 1980. Immunol Commun; 9:475-93). Briefly, egg yolks were separated from the whites and washed with dH2O. In a typical purification, three egg yolks were punctured, combined, and then resuspended in approximately 250 mL of lysis buffer (10 mM Tris [pH 7.5], 100 mM NaCl). Polyethylene glycol (PEG) 8000 (Fisher Scientific) was then added to 3.5% w/v. The sample was then mixed by shaking and centrifuged at 10,000g for ten minutes.
  • the supernatant was then filtered through 100% cotton cheesecloth and then PEG 8000 was added to 9% w/v.
  • the sample was mixed by shaking and then centrifuged at 10,000g for ten minutes. The supernatant was removed and discarded. The pellet was resuspended in approximately 35 mL of lysis buffer by incubation at 4°C overnight. The PEG precipitation was then repeated and the pellet from the 9% w/v PEG step was resuspended in approximately 7 mL lysis buffer and stored at either 4°C or -80°C until use.
  • the protein concentration was determined by the BCA assay (Pierce).
  • Immunoprecipitation of Cas6 from P. furiosus extract. Immunoprecipitations (IP) were performed using anti-Cas6 IgY antibodies conjugated to CarboLink™ coupling gel (Pierce). Coupling was performed according to the manufacturer's protocol and was verified by absorbance readings.
  • a P. furiosus S100 cell extract was pre-cleared in a reaction containing ~8 mg of total protein, ~550 µg of non-immune IgY coupled CarboLink™ resin, 1× Complete™ Mini protease inhibitor (Roche), 50 U SUPERase-IN™ RNase inhibitor, and brought up to a total volume of 1 mL with IPP-300 (10 mM Tris [pH 8.0], 300 mM NaCl, 0.05% Igepal).
  • the pre-clearing reaction was incubated at room temperature for two hours with end-over-end rotation. The sample was then centrifuged at 3000g for 2 minutes and the supernatant was split between preimmune and immune IP reactions.
  • a typical IP reaction contained 500 µL of pre-cleared cell extract (~4 mg total protein), 1× Complete™ Mini protease inhibitor, 50 U SUPERase-IN™ RNase inhibitor, 270-550 µg antibody (either preimmune or immune) coupled resin, and brought up to a total volume of 1 mL with IPP-300.
  • the reactions were incubated at room temperature for 2 hours with end-over-end rotation and then washed four times with IPP-300. The pellets were resuspended in an equal volume of buffer A and stored at 4°C for later analysis.
  • RNAs were extracted from immunoprecipitation samples (both immune and preimmune) and WCE using TRIzol™ LS Reagent (Invitrogen) according to manufacturer's recommendations. Northern blots were performed essentially as described previously (Hale et al., 2008. RNA; 14:2572-9). Briefly, RNAs were separated on a 15% polyacrylamide, 7 M urea containing gel (Criterion, Bio-Rad) then transferred onto Zeta-Probe™ nylon membranes (Bio-Rad) using a Trans-Blot SD Semi-Dry Cell™ (Bio-Rad).
  • the membranes were then baked at 80°C for one hour before prehybridization in a ProBlot hybridization oven (LabNet) at 42°C for one hour. Prehybridization and hybridization were performed in Oligo-UltraHyb™ (Applied Biosystems). Hybridization was initiated by adding 5' end-labeled probe to the prehybridization buffer, and hybridization was carried out at 42°C overnight. Following hybridization, the membrane was washed twice with 2X SSC (30 mM sodium citrate [pH 7.0], 300 mM NaCl) with 0.5% SDS for 30 minutes at 42°C. RNAs were then visualized by phosphor imaging. Results
  • Mapping the Cas6/CRISPR repeat RNA interaction.
  • the 5' region of CRISPR repeat RNA plays a role in recognition by Cas6. Substitution or deletion of these nucleotides prevented detectable binding.
  • an RNA consisting of nucleotides 1-12 of the repeat displayed a binding affinity comparable to that of the full length repeat RNA. Sequence elements in the middle and 3' regions of the repeat did not appear to be important for binding given that Cas6 binding was insensitive to deletions or substitutions in these regions of the repeat RNA.
  • RNA footprinting was performed with radioactively labeled (either 3' or 5' end-labeled) CRISPR repeat RNA and recombinant PfCas6 protein.
  • Lead (II) acetate cleaves single-stranded and tertiary interactions
  • RNase A cleaves after unpaired Cs and Us
  • a strong protection was observed in the 5' region of the repeat with both lead (II) acetate and RNase A using 3' end-labeled repeat RNA (Fig. 10A and 10C).
  • nucleotides 2-8 were protected from lead induced cleavage in a Cas6 concentration dependent manner.
  • a similar protection profile was observed with RNase A, with cleavage products at nucleotides 3, 5, and 8 becoming less susceptible to degradation in the presence of Cas6.
  • No protection was observed with either lead (II) acetate or RNase A within the 3' region of the repeat using 5' end labeled repeat (Fig. 10B and 10C).
  • Similar results were obtained when RNase T1 was used as a cleavage reagent for RNA footprinting.
  • RNA mutagenesis revealed that sequence elements in the 5' region of the repeat are the primary determinants for recognition by Cas6. Despite weak potential for the 5' and 3' regions of repeat RNA to base-pair, consistent with predictions made by in silico analysis (Kunin et al., 2007. Genome Biol; 8:R61), repeats from P. furiosus appear to be mostly unstructured in solution (RNA alone in Fig. 10A and 10B).
  • Cas6 proteins containing single amino acid substitutions (Y31A, Y31F, H46A, H46Q, K52A, and K52E) were expressed in and purified from Escherichia coli (Fig. 11B) and assessed for their ability to cleave radiolabeled CRISPR repeat RNA. Mutation of any of the three triad amino acids led to a significant decrease or complete loss of cleavage activity relative to wild type Cas6 (Fig. 11A). Cleavage of the repeat was abolished in the Y31A, H46A, and H46Q mutants, indicating that Tyr31 and His46 likely play a critical role in catalysis.
  • Native Cas6 cleaves CRISPR repeat RNA and associates with crRNA intermediates.
  • polyclonal antibodies were raised against recombinant Cas6 and used to immunoprecipitate the native protein from a P. furiosus extract.
  • the immunoprecipitation samples, along with whole cell extract (WCE), were then tested for Cas6 cleavage activity by incubation with uniformly labeled repeat RNA (Fig. 13A).
  • the CRISPR repeat RNA was cleaved into the same products generated by recombinant Cas6 with no other cleavage products evident.
  • Cas6 cleavage activity was present in the immune, but not preimmune pellet following immunoprecipitation, indicating that the cleavage activity observed in WCE was carried out by native Cas6.
  • the cleavage activity observed in P. furiosus extract was found to be divalent metal ion independent, as was shown to be the case for recombinant Cas6.
  • RNA-cleaving endoribonuclease Cse3 from E. coli
  • RNP ribonucleoprotein complex
  • Cas6 co-purifies with several crRNA species including the 2X and 1X intermediates (Hale et al., 2008. RNA; 14:2572-9), which correspond to cleavage products generated by Cas6 cleavage of the CRISPR primary transcript (Fig. 13B).
  • Cas6 also weakly co-purified with mature crRNAs (Fig. 14B).
  • the biogenesis of mature crRNAs is critical to CRISPR-Cas mediated resistance to genome invaders.
  • the initial processing step, endonucleolytic cleavage of the primary transcript within the repeat region, is performed by Cas6 in P. furiosus. This cleavage results in a crRNA intermediate that retains eight nucleotides of the repeat at the 5' end and ~22 nucleotides of the next repeat at the 3' end.
  • this crRNA intermediate is then processed at the 3' end to yield two mature crRNA species that retain the eight nucleotide, repeat derived "tag" that we propose serves as a recognition sequence for other Cas proteins (Hale et al., 2008. RNA; 14:2572-9).
  • Cse3 is a divalent metal ion independent endoribonuclease that cleaves CRISPR repeat RNAs within the 3' region of the repeat. Although the sequences of the repeat RNAs differ, Cse3 cleavage of E. coli derived CRISPR repeat RNAs and Cas6 cleavage of P. furiosus repeat RNAs both occur eight nucleotides upstream of the 3' end of the repeat.
  • This cleavage generates an eight nucleotide tag which is retained on the mature crRNAs in both E. coli and P. furiosus (Brouns et al., 2008. Science; 321:960-964; Hale et al., 2008. RNA; 14:2572-9).
  • the presence of these eight nucleotides may be a universal feature of crRNAs, serving as a recognition sequence for effector Cas proteins that rely on the spacer sequence to guide the complex to invading mobile genetic elements.
  • Cse3 from E. coli was found to be a component of a large RNP containing a number of other Cas proteins as well as mature crRNAs (Brouns et al., 2008. Science; 321:960-964).
  • Cas6 appears to associate with several crRNA species, including its predicted cleavage product, the 1X crRNA intermediate (Fig. 13B). Because it remains bound to its cleavage product, Cas6 may influence the 3' end processing of the 1X crRNA intermediate. Cas6 is not likely to be a structural component of the invader-targeting effector complex because the protein only weakly associates with mature crRNAs (Fig. 13B).
  • The structure of Cse3 from Thermus thermophilus revealed an overall architecture very similar to that shown in the structure of Cas6 from P. furiosus (Ebihara et al., 2006. Protein Sci; 15:1494-1499). That is, Cse3 is composed of duplicated ferredoxin folds separated by a central cleft which contains a conserved Gly-rich loop (Ebihara et al., 2006. Protein Sci; 15:1494-1499). Located adjacent to this loop is an invariant His residue that was shown to be required for Cse3 cleavage activity (Brouns et al., 2008. Science; 321:960-964; Ebihara et al., 2006. Protein Sci; 15:1494-1499).
  • Cas6 mediated cleavage of CRISPR repeat RNA was shown to require the highly conserved His46 residue, which is also located adjacent to the conserved Gly-rich loop characteristic of Cas6 proteins.
  • Cse3 lacks conserved Tyr and Lys residues that were shown in the current study to be important for Cas6 cleavage activity. Therefore it seems that despite the aforementioned similarities between the two, Cas6 and Cse3 likely employ distinct catalytic mechanisms.
  • Proposed Cas6 cleavage mechanism. We propose a general acid/base catalytic mechanism for Cas6 based on similar active site architecture and reaction characteristics to the archaeal tRNA splicing endonuclease.
  • Cas6 employs a distinct method of substrate recognition and cleavage. Initially, Cas6 binds to sequence elements in the 5' region of CRISPR repeat RNA and then cleavage occurs site specifically at a location outside the binding site.
  • One model for how this might occur is that Cas6 binding to the repeat RNA induces a conformational change in the RNA, possibly involving base pairing between the 5' and 3' palindromic regions of the repeat, resulting in the proper positioning of the scissile phosphate bond in the active site.
  • the RNA may wrap itself around Cas6 through a series of weak contacts to position the cleavage site in the active site. If this were the case, the weak interactions that occur outside the primary binding site could not be detected by the techniques used in this study. Further studies are required to determine the molecular mechanisms and dynamics that allow substrate recognition and catalysis to occur in distinct regions of both the protein and substrate.
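
The processing step described above (Cas6 cleaving within each repeat, eight nucleotides upstream of the repeat's 3' end, so that each intermediate carries an 8-nt repeat-derived tag at its 5' end and the rest of the next repeat at its 3' end) can be sketched in Python. This is only an illustrative model under stated assumptions, not the method of the source: the function name `dice_transcript`, the 30-nt repeat, and the spacer sequences are hypothetical, and the model ignores binding requirements and later 3' end trimming.

```python
def dice_transcript(repeat, spacers, tag_len=8):
    """Model Cas6-like dicing of a repeat-spacer array.

    Builds repeat + (spacer + repeat)... and cuts inside every repeat,
    tag_len nucleotides upstream of that repeat's 3' end.
    Returns the fragments in 5'-to-3' order.
    """
    if not 0 < tag_len < len(repeat):
        raise ValueError("tag_len must be shorter than the repeat")

    transcript = repeat
    cut_sites = [len(repeat) - tag_len]          # cut in the first repeat
    for spacer in spacers:
        transcript += spacer + repeat
        cut_sites.append(len(transcript) - tag_len)  # cut in each later repeat

    bounds = [0] + cut_sites + [len(transcript)]
    return [transcript[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical sequences chosen only for demonstration.
    REPEAT = "GUUCCAAUAAGACUAAAAUAGAAUUGAAAG"   # assumed 30-nt repeat
    SPACERS = ["ACGUACGUACGUACGUACGUACGUACGUACGUACGUA",
               "UUGGCCAAUUGGCCAAUUGGCCAAUUGGCCAAUUGGC"]
    for fragment in dice_transcript(REPEAT, SPACERS):
        print(len(fragment), fragment)
```

In this model the internal fragments correspond to the intermediates described in the text: an 8-nt tag, one spacer, and the remaining 22 nt of the following repeat.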

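The Cas6 signature motif described above (consensus GhGxxxxxGhG, where h is hydrophobic and xxxxx has at least one lysine or arginine) can be written as a regular expression. The Python sketch below is illustrative only: the set of residues treated as hydrophobic, the function name `find_g_rich_motifs`, and the example sequence are assumptions and do not come from the source.

```python
import re

# Assumption: residues counted as hydrophobic ("h"); the source does not list them.
HYDROPHOBIC = "AVILMFWYC"

# G-h-G, five residues with at least one K or R, then G-h-G.
# The inner lookahead requires a K/R within the five "x" positions;
# the outer lookahead lets overlapping candidates be reported.
MOTIF = re.compile(
    rf"(?=(G[{HYDROPHOBIC}]G(?=[A-Z]{{0,4}}[KR])[A-Z]{{5}}G[{HYDROPHOBIC}]G))"
)


def find_g_rich_motifs(protein_seq):
    """Return (1-based start position, matched 11-residue motif) pairs."""
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the pattern.
    print(find_g_rich_motifs("MSTGLGAKSTEGFGQQ"))
```

A match shows only that a sequence fits the consensus as modeled here; it does not by itself establish Cas6 identity or activity.
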
Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Medicinal Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Abstract

Provided herein are methods for cleaving a target RNA polynucleotide. The target RNA polynucleotide includes a Cas6 recognition domain and a cleavage site, and may be based on a repeat from a CRISPR locus. The methods may be practiced in vivo or in vitro. Also provided are polypeptides that have Cas6 endoribonuclease activity in the presence of a target RNA polynucleotide, and methods for using the polypeptides.

Description

CAS6 POLYPEPTIDES AND METHODS OF USE
CONTINUING APPLICATION DATA
This application claims the benefit of U.S. Provisional Application Serial No. 61/112,040, filed November 6, 2008, which is incorporated by reference herein.
GOVERNMENT FUNDING
The present invention was made with government support under Grant No. R01 GM54682, awarded by the NIH. The Government has certain rights in this invention.
BACKGROUND
All genomes are potential targets of invasion by molecular parasites such as viruses and transposable elements, and organisms have evolved RNA-directed defense mechanisms to cope with the constant threat of genome invaders (Farazi et al., 2008. Development 135: 1201-1214; Girard and Hannon, 2008. Trends Cell Biol. 18:136-148). The well-known subpathway of RNA silencing referred to as RNAi functions in defense against viruses in eukaryotes (Ding and Voinnet, 2007. Cell 130: 413-426). The RNAi defense response is mediated by short (~22-nucleotide [nt]) RNAs termed siRNAs. The siRNAs are generated from invading viral RNAs by dsRNA-specific, RNase III-like endonucleases called Dicers (Jaskiewicz and Filipowicz, 2008. Curr. Top. Microbiol. Immunol. 320: 77-97). The mature siRNAs are assembled with host effector proteins and target them to corresponding viral target RNAs to effect viral gene silencing via RNA destruction or other mechanisms (Farazi et al., 2008. Development 135: 1201-1214; Girard and Hannon, 2008. Trends Cell Biol. 18:136-148).
Compelling evidence has recently emerged for the existence of an RNA-mediated genome defense pathway in archaea and numerous bacteria that has been hypothesized to parallel the eukaryotic RNAi pathway (for reviews, see Godde and Bickerton, 2006. J. Mol. Evol. 62: 718-729; Lillestol et al., 2006. Archaea 2: 59-72; Makarova et al., 2006. Biol. Direct 1: 7; Sorek et al., 2008. Nat. Rev. Microbiol. 6: 181-186). Known as the CRISPR-Cas system or prokaryotic RNAi (pRNAi), the pathway is proposed to arise from two evolutionarily and often physically linked gene loci: the CRISPR (clustered regularly interspaced short palindromic repeats) locus, which encodes RNA components of the system, and the cas (CRISPR-associated) locus, which encodes proteins (Jansen et al., 2002. Mol. Microbiol. 43: 1565-1575; Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Makarova et al., 2006. Biol. Direct 1: 7; Haft et al., 2005. PLoS Comput. Biol. 1: e60). The individual Cas proteins do not share significant sequence similarity with protein components of the eukaryotic RNAi machinery, but have analogous predicted functions (e.g., RNA binding, nuclease, helicase, etc.) (Makarova et al., 2006. Biol. Direct 1: 7). Unlike the siRNAs of the eukaryotic RNAi system, the effector RNAs of pRNAi are encoded in the host genome. CRISPR loci encode short (typically ~30- to 35-nt) invader-derived sequences interspersed between short (typically ~30- to 35-nt) direct repeat sequences (Bolotin et al., 2005. Microbiology 151: 2551-2561; Mojica et al., 2005. J. Mol. Evol. 60: 174-182; Pourcel et al., 2005. Microbiology 151: 653-663; Godde and Bickerton, 2006. J. Mol. Evol. 62: 718-729; Lillestol et al., 2006. Archaea 2: 59-72; Makarova et al., 2006. Biol. Direct 1: 7; Horvath et al., 2008. J. Bacteriol. 190: 1401-1412; Sorek et al., 2008. Nat. Rev. Microbiol. 6: 181-186).
Recent studies have provided clear experimental evidence that correlates the presence of virus-specific CRISPR sequences with viral immunity (Barrangou et al., 2007. Science 315: 1709-1712; Brouns et al., 2008. Science 321: 960-964; Deveau et al., 2008. J. Bacteriol. 190: 1390-1400). Furthermore, viral infection has been shown to result in the appearance of new corresponding CRISPR elements in surviving strains (Barrangou et al., 2007. Science 315: 1709-1712; Deveau et al., 2008. J. Bacteriol. 190: 1390-1400). This rapidly adapting CRISPR-based immunity acts within natural microbial populations to promote host cell fitness and to influence microbial ecology (Andersson and Banfield, 2008. Science 320: 1047-1050; Tyson and Banfield, 2008. Microbiol. 10: 200-207). The primary products of the CRISPR loci appear to be short RNAs that contain the invader targeting sequences, and are termed guide RNAs or prokaryotic silencing RNAs (psiRNAs) based on their hypothesized role in the pathway (Makarova et al., 2006. Biol. Direct 1: 7; Hale et al., 2008. RNA, 14: 2572-2579). RNA analysis indicates that CRISPR locus transcripts are cleaved within the repeat sequences to release ~60- to 70-nt RNA intermediates that contain individual invader targeting sequences and flanking repeat fragments (Fig. 1A; Tang et al., 2002. Proc. Natl. Acad. Sci. 99: 7536-7541; Tang et al., 2005. Mol. Microbiol. 55: 469-481; Lillestol et al., 2006. Archaea 2: 59-72; Brouns et al., 2008. Science 321: 960-964; Hale et al., 2008. RNA, 14: 2572-2579). In the archaeon Pyrococcus furiosus, these intermediate RNAs are further processed to abundant, stable ~35- to 45-nt mature psiRNAs (Hale et al., 2008. RNA, 14: 2572-2579).
SUMMARY OF THE INVENTION
Provided herein are polynucleotides. The polynucleotides may include a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the amino acid sequence of the polypeptide and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, or the complement thereof. The polynucleotides may include a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the nucleotide sequence of the isolated polynucleotide and the nucleotide sequence of SEQ ID NO:1 have at least 80% identity, or the complement thereof. The polynucleotides may be enriched, isolated, or purified. The polynucleotides may include a heterologous polynucleotide, such as a regulatory sequence, or a vector.
In another aspect, a polynucleotide, referred to herein as a target RNA polynucleotide, may include a Cas6 recognition domain, wherein the Cas6 recognition domain includes 5'-GTTACAATAAGA (SEQ ID NO:237), or the complement thereof. For instance, the polynucleotide may include UNCNNUNNNNM4NNNNNNNNNNNNNNNN (SEQ ID NO:192), UUACAAUANNNNNNNNNNNNNNNNNNNNN (SEQ ID NO:193), GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO:191), or the complements thereof. The polynucleotide may include an operably linked regulatory sequence or a vector, and the polynucleotide may be RNA.
Also provided herein are polypeptides. A polypeptide has Cas6 endoribonuclease activity, and the polypeptide includes an amino acid sequence, wherein the amino acid sequence and the amino acid sequence of SEQ ID NO:2 have at least 80% identity. The polypeptides may further include a heterologous polypeptide. A polypeptide may be enriched, isolated, or purified.
Further provided herein are genetically modified microbes. A genetically modified microbe may include a polynucleotide described herein or a polypeptide described herein. The microbe may be, for instance, a bacterium, such as a gram positive or a gram negative microbe, for example, E. coli, or an archaeon, such as Haloferax volcanii.
Also provided herein are compositions that include the polynucleotides, the polypeptides, and/or the genetically modified microbes described herein. For instance, a composition may include a polypeptide having Cas6 activity, a target RNA polynucleotide, or the combination.
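As a rough illustration of the Cas6 recognition domain described in this summary, the following minimal Python sketch scans a sequence for SEQ ID NO:237 and reports where a cleavage site could fall if it lies 5 to 20 nucleotides downstream of the domain, as the method description states. The function names, the treatment of N as "any of A, C, G, U", and the convention that the first nucleotide after the domain counts as position 1 downstream are all assumptions made for illustration; they are not taken from the application.

```python
import re

RECOGNITION_DOMAIN = "GTTACAATAAGA"  # SEQ ID NO:237 as written in the text


def to_rna(sequence):
    """Upper-case a sequence and write it in RNA letters (T becomes U)."""
    return sequence.upper().replace("T", "U")


def find_recognition_domains(sequence, domain=RECOGNITION_DOMAIN):
    """Return 0-based start indices of every occurrence of the domain.

    An 'N' in the domain matches any of A, C, G, or U (an assumption).
    """
    pattern = "".join(
        "[ACGU]" if base == "N" else re.escape(base) for base in to_rna(domain)
    )
    # A lookahead reports overlapping occurrences as well.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", to_rna(sequence))]


def candidate_cleavage_window(domain_start, domain_length, sequence_length,
                              min_offset=5, max_offset=20):
    """Return (first, last) 0-based indices lying min_offset..max_offset
    nucleotides downstream of the domain, clipped to the sequence end,
    or None if the sequence is too short.

    Assumption: the nucleotide immediately after the domain is 1 downstream.
    """
    first_after = domain_start + domain_length
    lo = first_after + min_offset - 1
    hi = min(first_after + max_offset - 1, sequence_length - 1)
    return (lo, hi) if lo <= hi else None


if __name__ == "__main__":
    # Hypothetical test transcript: 4 nt of leader, the domain, 25 nt of tail.
    transcript = "AAAA" + RECOGNITION_DOMAIN + "C" * 25
    for start in find_recognition_domains(transcript):
        window = candidate_cleavage_window(
            start, len(RECOGNITION_DOMAIN), len(transcript)
        )
        print(start, window)
```

The sketch only locates a sequence motif; whether a given polypeptide actually cleaves a given RNA has to be determined experimentally.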
Provided herein are methods for using the polynucleotides, polypeptides, and genetically modified microbes described herein. In one aspect, the methods may be used to cleave a nucleotide sequence. The method may include incubating a target RNA polynucleotide with a polypeptide under conditions suitable for cleavage of the target RNA polynucleotide, wherein the target RNA polynucleotide includes a Cas6 recognition domain. The polypeptide may be a Cas6 polypeptide from a microbe genome, for instance, the polypeptide includes an amino acid sequence having at least 80% identity with the amino acid sequence of SEQ ID NO:2, an amino acid sequence depicted in Figure 1, an amino acid sequence depicted in Figure 2, or an amino acid sequence depicted in Figure 3, and has Cas6 endoribonuclease activity. The polypeptide cleaves the target RNA polynucleotide at a cleavage site. The cleavage site may be located 5 to 20 nucleotides downstream of the Cas6 recognition domain. The target RNA polynucleotide may include a Cas6 recognition domain. The Cas6 recognition domain may be one that is present in a microbe genome, such as 5'-GTTACAATAAGA (SEQ ID NO:237). The target RNA polynucleotide may include UNCNNUNNNNNNNNNNNNNNNNNNNN^ (SEQ ID NO:192), or UUACAAUANNNNNNNNNNNNNN>M^M^TON (SEQ ID NO:193), or GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO:191). The methods may be in vivo or in vitro.
As used herein, an "enriched" polynucleotide means that a polynucleotide constitutes a significantly higher fraction of the total DNA or RNA present in a mixture of interest than in cells from which the sequence was taken. A person skilled in the art could enrich a polynucleotide by preferentially reducing the amount of other polynucleotides present, or preferentially increasing the amount of the specific polynucleotide, or both.
However, polynucleotide enrichment does not imply that there is no other DNA or RNA present, the term only indicates that the relative amount of the sequence of interest has been significantly increased. The term "significantly" qualifies "increased" to indicate that the level of increase is useful to the person using the polynucleotide, and generally means an increase relative to other nucleic acids of at least 2 fold, or more preferably at least 5 to 10 fold or more. The term also does not imply that there is no polynucleotide from other sources. Other polynucleotides may, for example, include DNA from a bacterial genome, or a cloning vector. As used herein, an "enriched" polypeptide defines a specific amino acid sequence constituting a significantly higher fraction of the total of amino acids present in a mixture of interest than in cells from which the polypeptide was separated. A person skilled in the art can preferentially reduce the amount of other amino acid sequences present, or preferentially increase the amount of specific amino acid sequences of interest, or both. However, the term "enriched" does not imply that there are no other amino acid sequences present. Enriched simply means the relative amount of the sequence of interest has been significantly increased. The term "significant" indicates that the level of increase is useful to the person making such an increase. The term also means an increase relative to other amino acids of at least 2 fold, or more preferably at least 5 to 10 fold, or even more. The term also does not imply that there are no amino acid sequences from other sources. Other amino acid sequences may, for example, include amino acid sequences from a host organism. As used herein, an "isolated" substance is one that has been removed from its natural environment, produced using recombinant techniques, or chemically or enzymatically synthesized. For instance, a polypeptide or a polynucleotide can be isolated. 
A substance may be purified, i.e., is at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other components with which it is naturally associated.
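The "at least 2 fold" and "5 to 10 fold" enrichment thresholds above amount to simple arithmetic on fractions. The following minimal Python sketch shows one way to compute it, assuming that enrichment is measured as the change in the target's fraction of the total material; the function names and the example quantities are hypothetical and not taken from the application.

```python
def fold_enrichment(target_before, total_before, target_after, total_after):
    """Return how many-fold the target's fraction of the mixture increased.

    Quantities may be in any unit (mass, moles, read counts) as long as the
    same unit is used for the target and the total within each mixture.
    """
    if target_before <= 0 or total_before <= 0 or total_after <= 0:
        raise ValueError("starting target amount and both totals must be positive")
    if target_after < 0:
        raise ValueError("target amount cannot be negative")
    if target_before > total_before or target_after > total_after:
        raise ValueError("target amount cannot exceed the total")
    # (target_after / total_after) / (target_before / total_before), rearranged
    # so that only one division is performed.
    return (target_after * total_before) / (total_after * target_before)


def is_enriched(fold, minimum_fold=2.0):
    """Apply the 'at least 2 fold' threshold from the definition above."""
    return fold >= minimum_fold


if __name__ == "__main__":
    # Hypothetical example: the target is 1 part in 1000 in the source cells
    # and 1 part in 100 after removing other nucleic acids.
    fold = fold_enrichment(1, 1000, 1, 100)
    print(fold, is_enriched(fold))
```

Either route the text describes (reducing other material or increasing the target) raises the same ratio, which is why a single function covers both.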
As used herein, the term "polypeptide" refers broadly to a polymer of two or more amino acids joined together by peptide bonds. The term "polypeptide" also includes molecules which contain more than one polypeptide joined by a disulfide bond, or complexes of polypeptides that are joined together, covalently or noncovalently, as multimers (e.g., dimers, tetramers). Thus, the terms peptide, oligopeptide, enzyme, and protein are all included within the definition of polypeptide and these terms are used interchangeably. It should be understood that these terms do not connote a specific length of a polymer of amino acids, nor are they intended to imply or distinguish whether the polypeptide is produced using recombinant techniques, chemical or enzymatic synthesis, or is naturally occurring. As used herein, "heterologous amino acids" or "heterologous polypeptides" refer to amino acids that are not normally associated with a polypeptide in a wild-type cell. Examples of heterologous polypeptides include, but are not limited to a tag useful for purification or a carrier polypeptide useful to increase immunogenicity of a polypeptide. A polypeptide that includes heterologous polypeptides may be referred to as a fusion polypeptide.
As used herein, the term "polynucleotide" refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides, and includes both double- and single-stranded RNA and DNA. A polynucleotide can be obtained directly from a natural source, or can be prepared with the aid of recombinant, enzymatic, or chemical techniques. A polynucleotide can be linear or circular in topology. A polynucleotide may be, for example, a portion of a vector, such as an expression or cloning vector, or a fragment. A polynucleotide may include nucleotide sequences having different functions, including, for instance, coding regions, and non-coding regions such as regulatory regions. As used herein, the terms "coding region" and "coding sequence" are used interchangeably and refer to a nucleotide sequence that encodes a polypeptide and, when placed under the control of appropriate regulatory sequences expresses the encoded polypeptide. The boundaries of a coding region are generally determined by a translation start codon at its 5' end and a translation stop codon at its 3' end. A "regulatory sequence" is a nucleotide sequence that regulates expression of a coding sequence to which it is operably linked. Non-limiting examples of regulatory sequences include promoters, enhancers, transcription initiation sites, translation start sites, translation stop sites, and transcription terminators. The term "operably linked" refers to a juxtaposition of components such that they are in a relationship permitting them to function in their intended manner. A regulatory sequence is "operably linked" to a coding region when it is joined in such a way that expression of the coding region is achieved under conditions compatible with the regulatory sequence.
A polynucleotide that includes a coding region may include heterologous nucleotides that flank one or both sides of the coding region. As used herein, "heterologous nucleotides" refer to nucleotides that are not normally present flanking a coding region that is present in a wild-type cell. For instance, a coding region present in a wild-type microbe and encoding a Cas6 polypeptide is flanked by homologous sequences, and any other nucleotide sequence flanking the coding region is considered to be heterologous. Examples of heterologous nucleotides include, but are not limited to regulatory sequences. Typically, heterologous nucleotides are present in a polynucleotide disclosed herein through the use of standard genetic and/or recombinant methodologies well known to one skilled in the art. A polynucleotide disclosed herein may be included in a suitable vector.
As used herein, an "exogenous polynucleotide" refers to a polynucleotide that is not normally or naturally found in a microbe. As used herein, the term "endogenous polynucleotide" refers to a polynucleotide that is normally or naturally found in a microbe. An "endogenous polynucleotide" is also referred to as a "native polynucleotide."
As used herein, "identity" refers to sequence similarity between two polypeptides or two polynucleotides. The sequence similarity between two polypeptides is determined by aligning the residues of the two polypeptides (e.g., a candidate amino acid sequence and a reference amino acid sequence, such as SEQ ID NO: 2) to optimize the number of identical amino acids along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of shared amino acids, although the amino acids in each sequence must nonetheless remain in their proper order. The sequence similarity is typically at least 80% identity, at least 81% identity, at least 82% identity, at least 83% identity, at least 84% identity, at least 85% identity, at least 86% identity, at least 87% identity, at least 88% identity, at least 89% identity, at least 90% identity, at least 91% identity, at least 92% identity, at least 93% identity, at least 94% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity, or at least 99% identity. Sequence similarity may be determined, for example, using sequence analysis techniques such as the BESTFIT or GAP algorithm in the GCG package (Madison WI), or the Blastp program of the BLAST 2 search algorithm, as described by Tatusova, et al. (FEMS Microbiol Lett 1999, 174:247-250), and available through the World Wide Web, for instance at the internet site maintained by the National Center for Biotechnology Information, National Institutes of Health. Preferably, sequence similarity between two amino acid sequences is determined using the Blastp program of the BLAST 2 search algorithm. Preferably, the default values for all BLAST 2 search parameters are used, including matrix = BLOSUM62; open gap penalty = 11, extension gap penalty = 1, gap x_dropoff = 50, expect = 10, wordsize = 3, and optionally, filter on. 
In the comparison of two amino acid sequences using the BLAST search algorithm, sequence similarity is referred to as "identities."
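The alignment-based definition of identity above can be illustrated with a small Python sketch. It is not the Blastp program the text prefers: it simply finds the largest number of identical residues that can be matched in order when gaps are permitted (a longest-common-subsequence computation) and divides by the length of the longer sequence. The choice of denominator, the function names, and the toy peptides are assumptions made for illustration.

```python
def max_identical_residues(seq_a, seq_b):
    """Largest number of identical residues alignable in order, gaps allowed."""
    previous = [0] * (len(seq_b) + 1)
    for residue_a in seq_a:
        current = [0]
        for j, residue_b in enumerate(seq_b, start=1):
            if residue_a == residue_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def percent_identity(candidate, reference):
    """Percent identity relative to the longer of the two sequences."""
    if not candidate or not reference:
        raise ValueError("both sequences must be non-empty")
    identical = max_identical_residues(candidate.upper(), reference.upper())
    return 100.0 * identical / max(len(candidate), len(reference))


def meets_identity_threshold(candidate, reference, threshold=80.0):
    """True if the candidate reaches the stated minimum percent identity."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    # Toy peptides, not SEQ ID NO:2.
    print(percent_identity("MKAY", "MKTAY"))
    print(meets_identity_threshold("MKAY", "MKTAY"))
```

Because this sketch applies no gap penalties or substitution matrix, its values can differ from the "identities" reported by BLAST; it is meant only to make the definition concrete.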
The sequence similarity between two polynucleotides is determined by aligning the residues of the two polynucleotides (e.g., a candidate nucleotide sequence and a reference nucleotide sequence, such as SEQ ID NO:1) to optimize the number of identical nucleotides along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of shared nucleotides, although the nucleotides in each sequence must nonetheless remain in their proper order. The sequence similarity is typically at least 80% identity, at least 81% identity, at least 82% identity, at least 83% identity, at least 84% identity, at least 85% identity, at least 86% identity, at least 87% identity, at least 88% identity, at least 89% identity, at least 90% identity, at least 91% identity, at least 92% identity, at least 93% identity, at least 94% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity, or at least 99% identity. Sequence similarity may be determined, for example, using sequence analysis techniques such as GCG FastA (Genetics Computer Group, Madison, Wisconsin), MacVector 4.5 (Kodak/IBI software package) or other suitable sequence analysis programs or methods known in the art. Preferably, sequence similarity between two nucleotide sequences is determined using the Blastn program of the BLAST 2 search algorithm, as described by Tatusova, et al. (1999, FEMS Microbiol Lett., 174:247-250), and available through the World Wide Web, for instance at the internet site maintained by the National Center for Biotechnology Information, National Institutes of Health. Preferably, the default values for all BLAST 2 search parameters are used, including reward for match = 1, penalty for mismatch = -2, open gap penalty = 5, extension gap penalty = 2, gap x_dropoff = 50, expect = 10, wordsize = 11, and optionally, filter on.
In the comparison of two nucleotide sequences using the BLAST search algorithm, sequence similarity is referred to as "identities."
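To show how the nucleotide scoring parameters listed above combine, the following Python sketch scores an alignment that has already been made, using reward for match = 1, penalty for mismatch = -2, open gap penalty = 5, and extension gap penalty = 2. It does not perform the Blastn search itself. The function name, the use of "-" for gaps, and the assumption that a gap of length k costs the open penalty plus k times the extension penalty are illustrative choices, not statements from the application.

```python
def score_alignment(aligned_a, aligned_b, match=1, mismatch=-2,
                    gap_open=5, gap_extend=2):
    """Return (identities, raw_score) for two equal-length aligned strings.

    Gaps are written as '-'. Each run of gaps in one sequence is charged
    gap_open once plus gap_extend per gapped position (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identities = 0
    score = 0
    gap_in = None  # which sequence the current gap run is in, if any
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" and base_b == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if base_a == "-" or base_b == "-":
            side = "a" if base_a == "-" else "b"
            if gap_in != side:
                score -= gap_open  # a new gap run starts here
            score -= gap_extend
            gap_in = side
        else:
            gap_in = None
            if base_a == base_b:
                identities += 1
                score += match
            else:
                score += mismatch
    return identities, score


if __name__ == "__main__":
    # The two 12-nt sequences differ at a single position.
    print(score_alignment("GTTCCAATAAGA", "GTTACAATAAGA"))
    # A hypothetical alignment containing a one-nucleotide gap.
    print(score_alignment("GTTCCAAT", "GTT-CAAT"))
```

The identities count returned here is what a percent-identity threshold such as 80% would be computed from, once a denominator convention has been chosen.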
As used herein "prokaryotic microbe" and "microbe" are used interchangeably and refer to members of the domains Bacteria and Archaea.
As used herein, "genetically modified microbe" refers to a microbe which has been altered "by the hand of man." A genetically modified microbe includes a microbe into which has been introduced an exogenous polynucleotide. Genetically modified microbe also refers to a microbe that has been genetically manipulated such that endogenous nucleotides have been altered to include a mutation, such as a deletion, an insertion, a transition, a transversion, or a combination thereof. For instance, an endogenous coding region could be deleted. Such mutations may result in a polypeptide having a different amino acid sequence than was encoded by the endogenous polynucleotide. Another example of a genetically modified microbe is one having an altered regulatory sequence, such as a promoter, to result in increased or decreased expression of an operably linked endogenous coding region.
Conditions that are "suitable" for an event to occur, such as cleavage of a polynucleotide, or "suitable" conditions are conditions that do not prevent such events from occurring. Thus, these conditions permit, enhance, facilitate, and/or are conducive to the event.
As used herein, "in vitro" refers to an artificial environment and to processes or reactions that occur within an artificial environment. In vitro environments can consist of, but are not limited to, test tubes. The term "in vivo" refers to the natural environment (e.g., a cell, including a genetically modified microbe) and to processes or reactions that occur within a natural environment.
The term "and/or" means one or all of the listed elements or a combination of any two or more of the listed elements.
The words "preferred" and "preferably" refer to embodiments of the invention that may afford certain benefits, under certain circumstances. However, other embodiments may also be preferred, under the same or other circumstances. Furthermore, the recitation of one or more preferred embodiments does not imply that other embodiments are not useful, and is not intended to exclude other embodiments from the scope of the invention.
The terms "comprises" and variations thereof do not have a limiting meaning where these terms appear in the description and claims.
Unless otherwise specified, "a," "an," "the," and "at least one" are used interchangeably and mean one or more than one.
Also herein, the recitations of numerical ranges by endpoints include all numbers subsumed within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, 5, etc.).
For any method disclosed herein that includes discrete steps, the steps may be conducted in any feasible order. And, as appropriate, any combination of two or more steps may be conducted simultaneously.
The above summary of the present invention is not intended to describe each disclosed embodiment or every implementation of the present invention. The description that follows more particularly exemplifies illustrative embodiments. In several places throughout the application, guidance is provided through lists of examples, which examples can be used in various combinations. In each instance, the recited list serves only as a representative group and should not be interpreted as an exclusive list.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1. Cas6 is an endoribonuclease that cleaves CRISPR RNAs within repeat sequences. (A) psiRNA biogenesis pathway model. The primary CRISPR transcript contains unique invader targeting or guide sequences (shaded blocks) flanked by direct repeat sequences (R). Cas6 catalyzes site-specific cleavage within each repeat, releasing individual invader targeting units. The Cas6 cleavage products undergo further processing to generate smaller mature psiRNA species. (B) Purified recombinant PfCas6 expressed in E. coli. The sizes (in kilodaltons) of protein markers (M) are indicated. (C) Radiolabeled RNAs (repeat-guide-repeat [R-g-R] or repeat alone [R], as diagrammed) were either uniformly or 5'-end-labeled and incubated in the absence (-) or presence (+) of PfCas6 protein (500 nM). Products were resolved by denaturing gel electrophoresis and visualized using a phosphorimager. The main cleavage products are indicated by a star or asterisk on the gel and in the diagram.
Figure 2. PfCas6 cleavage of a CRISPR RNA containing two repeat-guide RNA units. A uniformly radiolabeled substrate RNA containing two guide (invader targeting) sequences (ψ), two repeats (R) and a short (natural) 5' leader (L) sequence was incubated with 1 μM PfCas6 protein and samples were analyzed by denaturing gel electrophoresis at the indicated times. The expected sizes and compositions of the RNA products (based on site-specific cleavage within each repeat) are indicated, as are the sizes of the marker RNAs (M).
Figure 3. Identification of the site of PfCas6 cleavage within the CRISPR repeat RNA. (A) The site of PfCas6 cleavage within the CRISPR repeat RNA was mapped by incubating 5' end-labeled repeat RNA with PfCas6 nuclease and comparing the size of the 5' RNA cleavage product (arrow) with RNase T1 (T1) and alkaline hydrolysis (OH) sequence ladders. (B) Potential secondary structure of P. furiosus repeat RNA with cleavage site indicated. (C) Analysis of cleavage of wild-type and cleavage site mutant (AA to GG) repeat RNAs with increasing concentrations (0, 1, 50, 200, and 500 nM) of PfCas6. (D) Native gel mobility shift analysis of wild-type and mutant repeat RNAs with increasing concentrations of PfCas6. The positions of the free (RNA) and protein-bound (RNP) RNAs are indicated. 5' and 3' cleavage products are indicated in both C and D. The sizes of RNA markers (M) are indicated in A and C.
Figure 4. CRISPR repeat sequence requirements for PfCas6 binding. (A) Detailed analysis of binding with a series of CRISPR-derived RNAs and mutants. The left panel illustrates the RNAs tested, with repeat (R) and invader targeting (ψ) sequences, and PfCas6 cleavage site (dashed lines) indicated. The shaded portion of j denotes an insertion, dashed block denotes an internal deletion, and the shaded portions of **, e, f, and k denote substitutions (with complementary sequence). DNA indicates a DNA repeat sequence substrate. PfCas6 binding is summarized relative to binding to the 5' cleavage product (++++). Corresponding RNA diagrams and data panels are designated with lowercase letters. The right panels show gel mobility shift analysis of the indicated RNAs with increasing concentrations (0, 1, 50, 200, and 500 nM) of PfCas6. Substrates are uniformly radiolabeled except for those shown in panels a, b, c, and l, which are 5'-end-labeled. Data for the intact repeat (*) and cleavage site mutant (**) are shown in Figure 3D. (B) PfCas6 interacts with the gel-purified 5' cleavage product. The left panel shows the products of incubation of uniformly radiolabeled repeat RNA with (+) or without (-) PfCas6 (1 μM). The positions of the 5' and 3' cleavage products are indicated. The right panel shows native gel mobility shift analysis of the gel-purified 5' and 3' PfCas6 cleavage products (from the left panel) with increasing concentrations (0, 1, 50, 200, and 500 nM) of PfCas6. The positions of free (RNA) and protein-bound RNA (RNP) are indicated. (C) Model summarizing the minimal PfCas6-binding site within the CRISPR repeat RNA relative to the cleavage site.
Figure 5. Influence of temperature on the ability of PfCas6 to bind and cleave CRISPR repeat RNA. Repeat RNA (uniformly radiolabeled) was incubated with (+) or without (-) 1 μM PfCas6 protein at the indicated temperatures and the products were resolved by electrophoresis on denaturing (A) or native (B) polyacrylamide gels to assess RNA cleavage or binding, respectively. The positions of the 5' and 3' cleavage products are indicated. The positions of the free (RNA) and protein-bound (RNP) RNAs are indicated in panel B. Based on the data shown in panel A, the RNPs in panel B include primarily the 5' cleavage product at higher temperatures and the intact repeat at lower temperatures.
Figure 6. CRISPR repeat sequence requirements for PfCas6 cleavage. Detailed analysis of cleavage with a series of CRISPR-derived RNAs and mutants. The left panel illustrates the RNAs tested as in Figure 4. PfCas6 cleavage is summarized relative to cleavage of the intact repeat RNA (++++). PfCas6 binding is summarized from Figure 4. Corresponding RNA diagrams and data panels are designated with lowercase letters. The right panels show cleavage assays using uniformly radiolabeled repeat RNA with (+) or without (-) PfCas6 (500 nM). Data for the intact repeat (*) is shown on right and data for the cleavage site mutant (**) is shown in Figure 3C.
Figure 7. Structural features of PfCas6. Front (A) and back (B) views of the structure of PfCas6 represented in ribbon diagrams (left) and shaded electrostatic surface potential (right). In the center, the fold topology is illustrated with arrows (β-strands) and circles (α-helices). In the ribbon diagrams, the G-rich loop characteristic of RAMP proteins is designated "β11" in A and "G-rich loop" in B and the predicted catalytic triad residues are labeled Tyr31, His46, and Lys52 in B. The electrostatic potential was computed using the GRASP2 program (Petrey and Honig 2003. Methods Enzymol. 374: 492-509) and is shaded dark and light, for negative and positive potentials, respectively.
Figure 8. Amino acid sequence alignment of Cas6 proteins. PSI-BLAST of PF1131 amino acid sequence against the non-redundant protein database (nr) at NCBI yielded 151 protein sequences that have E-values of less than 10^-4. It was immediately clear that many organisms contain more than one Cas6-related sequence. These fell into two distinctive classes: one that includes the conserved triad residues (like PfCas6) and one that does not. We aligned 42 Cas6 homologs that appear to belong to the first class and have E-values of less than 10^-23. In this alignment, the strictly conserved residues are the putative catalytic triad residues and the four glycine residues in the G-rich loop. β1, β2, etc., α1, α2, etc., and TT refer to predicted secondary structure elements: β-strand, α-helix, and β-turn, respectively. Organisms and genes listed include: Pyrococcus furiosus DSM 3638 (gi_18977503), Pyrococcus abyssi GE5 (gi_14521345), Pyrococcus horikoshii OT3 (gi_14591070), Thermococcus kodakaraensis KOD1 (gi_57640399), Methanocaldococcus jannaschii DSM 2661 (gi_15668551), Pelodictyon phaeoclathratiforme BU-1 (gi_68548726), Archaeoglobus fulgidus DSM 4304 (gi_11497692), Chlorobium phaeobacteroides DSM 266 (gi_119357836), Candidatus Desulforudis audaxviator MP104C (gi_169831963), Prosthecochloris aestuarii DSM 271 (gi_68552024), Desulfotomaculum reducens MI-1 (gi_134298408), Thermoanaerobacter tengcongensis MB4 (gi_20809008), Methanosarcina barkeri str. Fusaro (gi_73667850), Methanosarcina acetivorans C2A (gi_20092472), Geobacillus thermodenitrificans NG80-2 (gi_138893955), Thermotoga maritima MSB8 (gi_15644558), Thermotoga sp. RQ2 (gi_170288802),
Figure imgf000015_0001
sp. 128-5-R1-1 (gi_163782737), Thermoanaerobacter tengcongensis MB4 (gi_20809011), Methanococcoides burtonii DSM 6242 (gi_91773105), Thermotoga petrophila RKU-1 (gi_148270229), Geobacillus sp. WCH70 (gi_171325396), Desulfitobacterium hafniense DCB-2 (gi_109645858), Chlorobium limicola DSM 245 (gi_67917921), Desulfitobacterium hafniense Y51 (gi_124521532), Methanobrevibacter smithii ATCC 35061 (gi_148642230), Carboxydothermus hydrogenoformans Z-2901 (gi_78043250), Methanococcus voltae A3 (gi_163800065), Pelotomaculum thermopropionicum SI (gi_147678256), Methanosphaera stadtmanae DSM 3091 (gi_84489743), Clostridium thermocellum ATCC 27405 (gi_125974788), Candidatus Kuenenia stuttgartiensis (gi_91200631), Caldicellulosiruptor saccharolyticus DSM 8903 (gi_146296147), Carboxydothermus hydrogenoformans Z-2901 (gi_78044781), Thermoanaerobacter pseudethanolicus ATCC 33223 (gi_167036552), Rubrobacter xylanophilus DSM 9941 (gi_108803123), Fervidobacterium nodosum Rt17-B1 (gi_154250072), Petrotoga mobilis SJ95 (gi_160903200), Victivallis vadensis ATCC BAA-548 (gi_150384465), Microscilla marina ATCC 23134 (gi_124008802), Clostridium difficile QCD-32g58 (gi_145953632).
Figure 9. Catalytic features of PfCas6 cleavage activity. (A) Cleavage activity is not dependent on divalent metal ions. Uniformly radiolabeled repeat RNA was incubated with 1 μM PfCas6 in the absence (-) or presence (+) of 1.5 mM MgCl2 or 20 mM metal chelator EDTA as indicated. (B) Analysis of the termini of PfCas6 cleavage products. The products of cleavage reactions performed with unlabeled repeat RNA substrates (initially containing hydroxyl groups at both the 5' and 3' termini) were radiolabeled at either their 5' ends (using 32P-ATP and polynucleotide kinase) or 3' ends (using 32pCp and RNA ligase). The positions of the 5' and 3' cleavage products are indicated in A and B. (C) The pattern of radiolabeling of the RNA cleavage products (B) indicates that PfCas6 cleaves on the 5' side of the phosphodiester bond, as is the case for other metal-independent ribonucleases. Cleavage likely generates 5' hydroxyl (OH) and 2', 3' cyclic phosphate (>P) RNA termini.
Figure 10. Lead-induced and RNase A cleavage footprinting with CRISPR repeat RNA and PfCas6. (A) 3' end-labeled CRISPR repeat RNA was incubated in the absence (RNA) or presence of increasing concentrations of PfCas6 (indicated in μM) and then subjected to RNase A cleavage (left panel) or lead-induced cleavage (right panel). RNAs were separated on 15% denaturing (7 M urea) polyacrylamide gels. Size markers include 5' end-labeled RNA markers (M) and alkaline hydrolysis ladders (OH). Bars along the right side of each gel indicate strong protections. (B) 5' end-labeled CRISPR repeat RNA was used for lead-induced and RNase A cleavage as was done in (A). A summary of cleavage protections is displayed to the right of each gel. (C) A summary of cleavage protection is shown. The Cas6 cleavage site is indicated by an asterisk (*), and the nucleotides protected from cleavage are shown by the bars above and below the sequence.
Figure 11. Cleavage activity of Cas6 mutants. (A) Uniformly 32P-labeled CRISPR repeat RNA was incubated in the absence (-) or presence of increasing concentrations of wild type or mutant Cas6 (0.001, 0.05, and 0.5 μM) followed by separation on a 15% denaturing (7 M urea) polyacrylamide gel. The 5' and 3' cleavage products are indicated. (B) Purified wild type (wt) and mutant Cas6 proteins (as indicated above) were separated by SDS-PAGE. Molecular weight markers are indicated in kDa.
Figure 12. Substrate recognition by Cas6 mutants. Uniformly 32P-labeled CRISPR repeat RNA was incubated in the absence (-) or presence of increasing concentrations of wild type or mutant Cas6 (0.001, 0.05, 0.2, and 0.5 μM), and the proteins were then assessed for their ability to form a stable complex with the substrate RNA by native gel mobility shift analysis. The positions of the free (RNA) and bound (RNP) substrate RNA are indicated.
Figure 13. Native Cas6 cleaves CRISPR repeat RNA and associates with crRNAs. (A) Uniformly 32P-labeled CRISPR repeat RNA was incubated in the absence (RNA) or presence of recombinant Cas6 (rCas6), whole cell extract (WCE), or samples from immunoprecipitation reactions using anti-Cas6 antibodies (Pre, preimmune; Imm, immune; S, supernatant; P, pellet). The RNAs were separated on a 15% denaturing (7 M urea) polyacrylamide gel along with 5' end-labeled RNA markers (M). (B) Northern blot analysis of Cas6 immunoprecipitation. RNAs extracted from WCE, preimmune (Pre) and immune (Imm) supernatants (Sup, left panel), and pellets (Pel, right panel) from an immunoprecipitation using anti-Cas6 antibodies were separated on a 15% denaturing (7 M urea) polyacrylamide gel along with 5' end-labeled RNA markers (M). A 5' end-labeled DNA oligonucleotide that was antisense to crRNA spacer 6.01 from P. furiosus was used as a probe. The positions of the 2X intermediate, 1X intermediate, and mature crRNAs are indicated.
Figure 14. The proposed catalytic mechanism of Cas6. Tyr31 acts as a general base and His46 as a general acid, while Lys52 stabilizes a predicted pentavalent intermediate. The cleavage products generated contain a 5' OH and likely a 2',3'-cyclic phosphate.
Figure 15. Amino acid sequences of Cas6 polypeptides from Archaea. The alphanumeric code above each sequence is the UniProtKB/TrEMBL accession number.
Figure 16. Amino acid sequences of Cas6 polypeptides from Bacteria. The alphanumeric code above each sequence is the UniProtKB/TrEMBL accession number.
Figure 17. Amino acid sequences of Cas6 polypeptides from Cyanobacteria. The alphanumeric code above each sequence is the UniProtKB/TrEMBL accession number.
Figure 18. Amino acid sequences of a Cas6 polypeptide (SEQ ID NO:2) and a nucleotide sequence (SEQ ID NO: 1) encoding the polypeptide.
Figure 19. Alignments between Cas6 polypeptide regions and domains of hidden Markov models present in the TIGRFAM database of protein families. Shown are amino acids 44 to 236 or 95 to 238 of SEQ ID NO:2, the domain present in TIGR01877 (SEQ ID NO:188), and the domain present in PF01881 (SEQ ID NO:189).
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
Polypeptides
Provided herein are polypeptides having endoribonuclease activity. A polypeptide having endoribonuclease activity as described below is referred to herein as a Cas6 polypeptide, and the endoribonuclease activity is referred to herein as Cas6 endoribonuclease activity. Examples of Cas6 polypeptides are depicted at Genbank Accession No. AAL81255 (SEQ ID NO:2), Figure 15, Figure 16, and Figure 17. Other examples of Cas6 polypeptides provided herein include those having sequence similarity with the amino acid sequence of SEQ ID NO:2, an amino acid sequence depicted in Figure 15, an amino acid sequence depicted in Figure 16, or an amino acid sequence depicted in Figure 17. A Cas6 polypeptide having sequence similarity with the amino acid sequence depicted at SEQ ID NO:2, Figure 15, Figure 16, or Figure 17 has Cas6 endoribonuclease activity. A Cas6 polypeptide may be enriched, isolated, or purified from a microbe having a CRISPR locus and the cas (CRISPR-associated) locus, such as, but not limited to, Pyrococcus furiosus, or may be produced using recombinant techniques, or chemically or enzymatically synthesized using routine methods. In some aspects, a Cas6 polypeptide may be enriched, isolated, or purified from a microbe that does not have CRISPR loci. The amino acid sequence of a Cas6 polypeptide having sequence similarity to an amino acid sequence disclosed herein, such as SEQ ID NO:2, an amino acid sequence depicted in Figure 15, an amino acid sequence depicted in Figure 16, or an amino acid sequence depicted in Figure 17, may include conservative substitutions of amino acids present in an amino acid sequence. A conservative substitution is typically the substitution of one amino acid for another that is a member of the same class.
For example, it is well known in the art of protein biochemistry that an amino acid belonging to a grouping of amino acids having a particular size or characteristic (such as charge, hydrophobicity, and/or hydrophilicity) may generally be substituted for another amino acid without substantially altering the secondary and/or tertiary structure of a polypeptide. For the purposes of this invention, conservative amino acid substitutions are defined to result from exchange of amino acid residues from within one of the following classes of residues: Class I: Gly, Ala, Val, Leu, and Ile (representing aliphatic side chains); Class II: Gly, Ala, Val, Leu, Ile, Ser, and Thr (representing aliphatic and aliphatic hydroxyl side chains); Class III: Tyr, Ser, and Thr (representing hydroxyl side chains); Class IV: Cys and Met (representing sulfur-containing side chains); Class V: Glu, Asp, Asn and Gln (carboxyl or amide group containing side chains); Class VI: His, Arg and Lys (representing basic side chains); Class VII: Gly, Ala, Pro, Trp, Tyr, Ile, Val, Leu, Phe and Met (representing hydrophobic side chains); Class VIII: Phe, Trp, and Tyr (representing aromatic side chains); and Class IX: Asn and Gln (representing amide side chains). The classes are not limited to naturally occurring amino acids, but also include artificial amino acids, such as beta or gamma amino acids and those containing non-natural side chains, and/or other similar monomers such as hydroxy acids.
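The class-based definition above can be sketched in code. The following is a minimal illustration, not part of the disclosure: it encodes Classes I through IX with standard three-letter codes and reports which classes two residues share. The function names `shared_classes` and `is_conservative` are hypothetical, and only the naturally occurring residues named in the classes are covered.

```python
# Conservative-substitution classes as listed in the text (three-letter codes).
CLASSES = {
    "I": {"Gly", "Ala", "Val", "Leu", "Ile"},
    "II": {"Gly", "Ala", "Val", "Leu", "Ile", "Ser", "Thr"},
    "III": {"Tyr", "Ser", "Thr"},
    "IV": {"Cys", "Met"},
    "V": {"Glu", "Asp", "Asn", "Gln"},
    "VI": {"His", "Arg", "Lys"},
    "VII": {"Gly", "Ala", "Pro", "Trp", "Tyr", "Ile", "Val", "Leu", "Phe", "Met"},
    "VIII": {"Phe", "Trp", "Tyr"},
    "IX": {"Asn", "Gln"},
}


def shared_classes(a: str, b: str) -> list[str]:
    """Return the names of every class that contains both residues."""
    return [name for name, members in CLASSES.items() if a in members and b in members]


def is_conservative(a: str, b: str) -> bool:
    """A substitution is conservative if the two distinct residues share a class."""
    return a != b and bool(shared_classes(a, b))


if __name__ == "__main__":
    for pair in [("Leu", "Ile"), ("Ser", "Thr"), ("Asp", "Lys")]:
        print(pair, is_conservative(*pair), shared_classes(*pair))
```

Because the classes overlap, a pair such as Leu/Ile falls in several classes at once; under the definition in the text, membership in any one shared class is sufficient.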
Guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie et al. (1990, Science, 247:1306-1310), wherein the authors indicate proteins are surprisingly tolerant of amino acid substitutions. For example, Bowie et al. disclose that there are two main approaches for studying the tolerance of a polypeptide sequence to change. The first method relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second approach uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene and selects or screens to identify sequences that maintain functionality. As stated by the authors, these studies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which changes are likely to be permissive at a certain position of the protein. For example, most buried amino acid residues require non-polar side chains, whereas few features of surface side chains are generally conserved. Other such phenotypically silent substitutions are described in Bowie et al, and the references cited therein.
A Cas6 polypeptide may include a GhGxxxxxGhG (SEQ ID NO:190) motif (where "h" indicates a hydrophobic amino acid) near the C-terminus. An Arg or Lys may be, and often is, found within the central stretch of 5 amino acids (i.e., xxxxx). A Cas6 polypeptide contains at least one residue (the His46 shown in Figure 8) that may play a role in catalysis, or a conservative substitution thereof. A Cas6 polypeptide may contain other residues (the Tyr31 and Lys52 shown in Figure 8) which may also play a role in catalysis, or conservative substitutions thereof. The residue(s) expected to play a role in catalysis may be located near the G-rich loop that contains the Cas6 signature motif in the 3D structure of the protein as described in Example 1 herein. Other areas that are conserved, as well as areas that are not conserved, are shown in Figure 8. Cas6 polypeptides may include domains present in the TIGRFAM database at accession numbers TIGR01877 and PF01881, as shown in Figure 19. The TIGRFAM database includes families of polypeptides for which function is conserved (Haft et al., Nucl. Acids Res., 2003, 31:371-373, Bateman and Haft, 2002, Briefings Bioinformatics, 3:236-245, and Haft et al., 2005, PLoS Computational Biol., 1(6):e60). Other examples of Cas6 polypeptides provided herein include those present in prokaryotic microbes having a CRISPR locus and a cas locus. Examples include those depicted in Figure 15, Figure 16, and Figure 17. Cas6 polypeptides can be easily identified in any microbe that includes a CRISPR locus. A coding region encoding a Cas6 polypeptide is typically in a cas locus located in close proximity to a CRISPR locus. Haft et al. (2005, PLoS Computational Biol., 1(6):e60) review the Cas protein family, and created rules for the identification of specific subtypes of the CRISPR/Cas system. Haft et al. describe the coding region encoding Cas6 polypeptides as being found in association with at least four separate CRISPR/Cas subtypes (Tneap, Hmari, Apern, and Mtube), and as typically being the cas coding region located most distal to the CRISPR locus. Cas6 polypeptides may be identified using the resources available at the JCVI Comprehensive Microbial Resource (http://cmr.jcvi.org/cgi-bin/CMR/CmrHomePage.cgi). For instance, running a genome property search against all available genomes for the genome property CRISPR Regions {Guild} results in a list of microbes that are predicted to include a Cas6 polypeptide. Thus, Cas6 polypeptides that are useful in the methods described herein can be identified by the skilled person using routine methods. Examples of prokaryotic microbes with known whole genomic sequences containing coding regions expected to encode a Cas6 polypeptide include Thermotoga maritima MSB8, Campylobacter fetus subsp. fetus 82-40, Fusobacterium nucleatum ATCC 25586, Streptococcus thermophilus LMG 18311, Thermoanaerobacter tengcongensis MB4(T), Moorella thermoacetica ATCC 39073, Desulfitobacterium hafniense Y51, Clostridium tetani E88, Clostridium perfringens SM101, Clostridium difficile QCD-32g58, Clostridium botulinum Hall A Sanger, Clostridium botulinum F Langeland, Clostridium botulinum B1 strain Okra, Clostridium botulinum A3 strain Loch Maree, Clostridium botulinum A Hall, Clostridium botulinum A ATCC 19397,
Carboxydothermus hydrogenoformans Z-2901, Staphylococcus epidermidis RP62A, Thermus thermophilus HB8, Thermus thermophilus HB27, Nostoc sp. PCC 7120, Anabaena variabilis ATCC 29413, Synechococcus sp. OS Type B prime, Synechococcus sp. OS Type A, Porphyromonas gingivalis W83, Bacteroides fragilis YCH46, Bacteroides fragilis NCTC9343, Aquifex aeolicus VF5, Rubrobacter xylanophilus DSM 9941, Mycobacterium tuberculosis H37Rv (lab strain), Mycobacterium tuberculosis CDC 1551, Mycobacterium bovis subsp. bovis AF2122/97, Frankia alni ACN 14a, Thermoplasma volcanium GSS1, Picrophilus torridus DSM 9790, Thermococcus kodakarensis KOD1, Pyrococcus horikoshii shinkaj OT3, Pyrococcus furiosus DSM 3638, Pyrococcus abyssi GE5, Methanosarcina barkeri Fusaro, Methanosarcina acetivorans C2A, Methanococcoides burtonii DSM 6242, Methanococcus jannaschii DSM2661, Methanobacterium thermoautotrophicum delta H, Haloarcula marismortui ATCC 43049, Archaeoglobus fulgidus DSM4304, Pyrobaculum aerophilum IM2, Sulfolobus tokodaii strain 7, Sulfolobus solfataricus P2, Sulfolobus acidocaldarius DSM 639, and Aeropyrum pernix K1. Other examples of Cas6 polypeptides are known to the skilled person; see, for instance, members of the COG1583 group of polypeptides (available at the Clusters of Orthologous Groups of proteins (COGs) web page through the National Center for Biotechnology Information internet site; see also Tatusov et al., 1997, Science, 278:631-637, and Tatusov et al., 2003, BMC Bioinformatics, 4(1):41), members of the InterPro family having accession number IPR010156, Makarova et al. (2002, Nucl. Acids Res., 30:482-496), and Haft et al. (2005, PLoS Comput. Biol., 1(6):e60, 474-483). A Cas6 polypeptide having Cas6 endoribonuclease activity is able to cleave a target RNA polynucleotide. Whether a polypeptide has Cas6 endoribonuclease activity can be determined by in vitro assays.
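As a rough illustration of how the GhGxxxxxGhG signature motif described above might be searched for in a candidate sequence, the sketch below compiles the motif into a regular expression. Two things are assumptions rather than statements from the text: "h" is taken to mean any residue in the hydrophobic class (Class VII: Gly, Ala, Pro, Trp, Tyr, Ile, Val, Leu, Phe, Met), and the example sequence is an invented toy string, not a real Cas6 polypeptide. The function name `find_g_rich_motif` is hypothetical.

```python
import re

# Assumption: "h" = Class VII hydrophobic residues, in one-letter code.
HYDROPHOBIC = "GAPWYIVLFM"

# G-h-G, any five residues, then G-h-G.
MOTIF = re.compile(rf"G[{HYDROPHOBIC}]G.{{5}}G[{HYDROPHOBIC}]G")


def find_g_rich_motif(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "MKTAAGLGSRKAEGFGVL"  # invented sequence for illustration only
    for start, hit in find_g_rich_motif(toy):
        # Also report how far the match ends from the C-terminus.
        print(start, hit, len(toy) - (start + len(hit)))
```

A hit on its own would not establish that a polypeptide is a Cas6 polypeptide; the text ties that determination to endoribonuclease activity in an in vitro assay.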
An in vitro assay may be carried out by combining a suitable target RNA polynucleotide with a polypeptide expected to have Cas6 endoribonuclease activity. The characteristics of the target RNA polynucleotide may depend upon the amino acid sequence of the Cas6 polypeptide. Target RNA polynucleotides are described below. The target RNA polynucleotide may be between 0.01 pmol and 0.1 pmol, such as 0.05 pmol, and the Cas6 polypeptide may be between 50 nM and 1 μM, such as 200 nM or 500 nM. The polypeptide to be tested may be enriched, isolated, or purified. For instance, the polypeptide may be from a whole cell extract, such as an S100 extract, or from an immunoprecipitation reaction. The suitable target RNA polynucleotide and polypeptide may be incubated in a buffer such as HEPES-KOH at 15 mM to 25 mM, preferably 20 mM, and pH between 6.5 and 7.5, preferably 7.0. The mixture may also include KCl at 240 mM to 260 mM, preferably 250 mM, DTT at 0.7 mM to 0.8 mM, preferably 0.75 mM, MgCl2 at 1.0 mM to 2.0 mM, preferably 1.5 mM, glycerol at 5% to 15%, preferably 10%, and additional RNA, such as E. coli tRNA at 5 μg per 20-μL reaction volume. This may be incubated at a suitable temperature such as at least 30°C, at least 40°C, at least 50°C, at least 60°C, at least 70°C, at least 80°C, or at least 90°C, for at least 30 minutes. A portion of the mixture may be removed and resolved on a native polyacrylamide gel to measure binding of the polypeptide to the target RNA polynucleotide. To measure cleavage, the polypeptide may be removed by extraction and the mixture resolved on a denaturing (7 M urea), 12% to 15% polyacrylamide gel. The presence of a band that runs at a molecular weight that is less than that of the original target RNA polynucleotide indicates the polypeptide is a Cas6 polypeptide.
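The arithmetic behind setting up such a reaction can be sketched as follows. This is an illustration under stated assumptions only: the final concentrations are the preferred values given above, the 20-μL volume is taken from the tRNA amount quoted per reaction, and every stock concentration is hypothetical (chosen for convenience, not taken from the text). The helper names `stock_volumes` and `enzyme_pmol` are likewise invented.

```python
REACTION_UL = 20.0  # reaction volume implied by "5 ug tRNA per 20-uL reaction"

# name: (preferred final concentration from the text, HYPOTHETICAL stock concentration)
# Both values in each pair use the same unit, shown in the name.
COMPONENTS = {
    "HEPES-KOH pH 7.0 (mM)": (20.0, 1000.0),
    "KCl (mM)": (250.0, 2000.0),
    "DTT (mM)": (0.75, 100.0),
    "MgCl2 (mM)": (1.5, 100.0),
    "glycerol (%)": (10.0, 50.0),
}


def stock_volumes(reaction_ul: float = REACTION_UL) -> dict[str, float]:
    """Volume of each stock (uL) from C1*V1 = C2*V2."""
    return {name: reaction_ul * final / stock for name, (final, stock) in COMPONENTS.items()}


def enzyme_pmol(conc_nm: float, reaction_ul: float = REACTION_UL) -> float:
    """Amount of polypeptide in the reaction: nM x uL = fmol, so divide by 1000 for pmol."""
    return conc_nm * reaction_ul / 1000.0


if __name__ == "__main__":
    for name, vol in stock_volumes().items():
        print(f"{name}: {vol:.2f} uL")
    # Molar excess of 500 nM polypeptide over 0.05 pmol of target RNA.
    print(enzyme_pmol(500.0) / 0.05)
```

Under these assumptions, 500 nM polypeptide in 20 μL corresponds to 10 pmol, a 200-fold molar excess over 0.05 pmol of target RNA.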
Polynucleotides
Also provided herein are enriched, optionally isolated, polynucleotides encoding a Cas6 polypeptide. A polynucleotide encoding a Cas6 polypeptide having Cas6 endoribonuclease activity is referred to herein as a Cas6 polynucleotide. Cas6 polynucleotides may have a nucleotide sequence encoding a polypeptide having the amino acid sequence shown in SEQ ID NO:2. An example of the class of nucleotide sequences encoding such a polypeptide is the nucleotide sequence depicted at Genbank Accession No. AE010223 (SEQ ID NO:1). It should be understood that a polynucleotide encoding a Cas6 polypeptide represented by SEQ ID NO:2 is not limited to the nucleotide sequence disclosed at SEQ ID NO:1, but also includes the class of polynucleotides encoding such polypeptides as a result of the degeneracy of the genetic code. For example, the naturally occurring nucleotide sequence SEQ ID NO:1 is but one member of the class of nucleotide sequences encoding a polypeptide having the amino acid sequence SEQ ID NO:2. The class of nucleotide sequences encoding a selected polypeptide sequence is large but finite, and the nucleotide sequence of each member of the class may be readily determined by one skilled in the art by reference to the standard genetic code, wherein different nucleotide triplets (codons) are known to encode the same amino acid. Examples of other Cas6 polynucleotides include those having a nucleotide sequence encoding a polypeptide having the amino acid sequence shown in Figure 15, 16, or 17.
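To make "large but finite" concrete, the size of the class of coding sequences for a given polypeptide can be computed from the standard genetic code as the product of the number of synonymous codons at each position. The sketch below is illustrative only; it builds the standard codon table from the conventional TCAG ordering, ignores the stop codon, and uses short invented peptides rather than SEQ ID NO:2. The function name `count_encodings` is hypothetical.

```python
from collections import Counter
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (TCAG order).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of codons per amino acid ("*" marks stop codons).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def count_encodings(protein: str) -> int:
    """Number of distinct nucleotide sequences encoding `protein` (stop codon excluded)."""
    protein = protein.upper()
    unknown = set(protein) - (set(CODONS_PER_AA) - {"*"})
    if unknown:
        raise ValueError(f"unrecognized residues: {sorted(unknown)}")
    return prod(CODONS_PER_AA[aa] for aa in protein)


if __name__ == "__main__":
    print(count_encodings("MLK"))  # Met (1 codon) x Leu (6) x Lys (2)
```

The count grows multiplicatively with length, which is why the class for a full-length polypeptide is very large while remaining enumerable in principle.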
A Cas6 polynucleotide may have sequence similarity with the nucleotide sequence of SEQ ID NO:1. Cas6 polynucleotides having sequence similarity with the nucleotide sequence of SEQ ID NO:1 encode a Cas6 polypeptide. A Cas6 polynucleotide may be isolated from a microbe having CRISPR loci, such as, but not limited to, Pyrococcus furiosus, or may be produced using recombinant techniques, or chemically or enzymatically synthesized using routine methods. A Cas6 polynucleotide may further include heterologous nucleotides flanking the open reading frame encoding the Cas6 polypeptide. Typically, heterologous nucleotides may be at the 5' end of the coding region, at the 3' end of the coding region, or the combination thereof. The number of heterologous nucleotides may be, for instance, at least 10, at least 100, or at least 1000.
The present invention also includes fragments of the polypeptides described herein, and the polynucleotides encoding such fragments. For instance, the present invention includes fragments of SEQ ID NO:2, as well as fragments having structural similarity to SEQ ID NO:2. A polypeptide fragment may include a sequence of at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, or at least 100 amino acid residues.
A polypeptide disclosed herein or a fragment thereof may be expressed as a fusion polypeptide that includes a polypeptide disclosed herein or a fragment thereof and an additional heterologous amino acid sequence. For instance, the additional amino acid sequence may be useful for purification of the fusion polypeptide by affinity chromatography. Various methods are available for the addition of such affinity purification moieties to proteins. Representative examples may be found in Hopp et al. (U.S. Pat. No. 4,703,004), Hopp et al. (U.S. Pat. No. 4,782,137), Sgarlato (U.S. Pat. No. 5,935,824), and Sharma (U.S. Pat. No. 5,594,115). In another example, the additional amino acid sequence may be a carrier polypeptide. The carrier polypeptide may be used to increase the immunogenicity of the fusion polypeptide to increase production of antibodies that specifically bind to a polypeptide of the invention. The invention is not limited by the types of carrier polypeptides that may be used to create fusion polypeptides. Examples of carrier polypeptides include, but are not limited to, keyhole limpet hemocyanin, bovine serum albumin, ovalbumin, mouse serum albumin, rabbit serum albumin, and the like.
A polynucleotide disclosed herein, such as a polynucleotide encoding a Cas6 polypeptide or a polynucleotide encoding a target RNA polynucleotide, may be present in a vector. Target RNA polynucleotides are described below. A vector is a replicating polynucleotide, such as a plasmid, phage, or cosmid, to which another polynucleotide may be attached so as to bring about the replication of the attached polynucleotide. Construction of vectors containing a polynucleotide of the invention employs standard ligation techniques known in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989). A vector may provide for further cloning (amplification of the polynucleotide), i.e., a cloning vector, or for expression of the polynucleotide, i.e., an expression vector. The term vector includes, but is not limited to, plasmid vectors, viral vectors, cosmid vectors, and artificial chromosome vectors. Examples of viral vectors include, for instance, adenoviral vectors, adeno-associated viral vectors, lentiviral vectors, retroviral vectors, and herpes virus vectors. Typically, a vector is capable of replication in a microbial host, for instance, a fungus, such as S. cerevisiae, or a prokaryotic bacterium, such as E. coli. Preferably the vector is a plasmid.
Selection of a vector depends upon a variety of desired characteristics in the resulting construct, such as a selection marker, vector replication rate, and the like. In some aspects, suitable host cells for cloning or expressing the vectors herein include eukaryotic cells. Suitable eukaryotic cells include fungi, such as S. cerevisiae and P. pastoris. In other aspects, suitable host cells for cloning or expressing the vectors herein include prokaryotic cells. Suitable prokaryotic cells include bacteria, such as gram-negative microbes, for example, E. coli. Other suitable prokaryotic cells include archaea, such as Haloferax volcanii. Vectors may be introduced into a host cell using methods that are known and used routinely by the skilled person. For example, calcium phosphate precipitation, electroporation, heat shock, lipofection, microinjection, and viral-mediated nucleic acid transfer are common methods for introducing nucleic acids into host cells.
Polynucleotides of the present invention may be obtained from microbes, for instance, members of the genus Pyrococcus, such as P. furiosus, or produced in vitro or in vivo. For instance, methods for in vitro synthesis include, but are not limited to, chemical synthesis with a conventional DNA/RNA synthesizer. Commercial suppliers of synthetic polynucleotides and reagents for such synthesis are well known. Likewise, polypeptides of the present invention may be obtained from microbes, or produced in vitro or in vivo.
An expression vector may optionally include a promoter that results in expression of an operably linked coding region. Promoters act as regulatory signals that bind RNA polymerase in a cell to initiate transcription of a downstream (3' direction) coding region. Promoters present in prokaryotic microbes typically include two short sequences at -10 (often referred to as the Pribnow box, or the -10 element) and -35 positions (often referred to as the -35 element), or a short sequence at -30 (often referred to as a TATA box) located 5' from the transcription start site, for bacterial and archaeal organisms, respectively. The promoter used may be a constitutive or an inducible promoter. It may be, but need not be, heterologous with respect to a host cell. Target RNA polynucleotides of the present invention do not encode a polypeptide, and expression of a target RNA polynucleotide present in a vector results in a non-coding RNA. Thus, a vector including a target RNA polynucleotide may also include a transcription-start signal and/or a transcription terminator operably linked to the target RNA polynucleotide, but a translation start signal and/or translation stop signal typically are not operably linked to a target RNA polynucleotide. Promoters have been identified in many microbes and are known to the skilled person. Many computer algorithms have been developed to detect promoters in genomic sequences, and promoter prediction is a common element of many gene prediction methods. Thus, the skilled person can easily identify nucleotide sequences present in microbes that will function as promoters.
An expression vector may optionally include a ribosome binding site and a start site (e.g., the codon ATG) to initiate translation of the transcribed message to produce the polypeptide. It may also include a termination sequence to end translation. A termination sequence is typically a codon for which there exists no corresponding aminoacyl-tRNA, thus ending polypeptide synthesis. The polynucleotide used to transform the host cell may optionally further include a transcription termination sequence.
A vector introduced into a host cell optionally includes one or more marker sequences, which typically encode a molecule that inactivates or otherwise detects or is detected by a compound in the growth medium. For example, the inclusion of a marker sequence may render the transformed cell resistant to a selective agent, such as an antibiotic, or it may confer compound-specific metabolism on the transformed cell. Examples of a marker sequence include, but are not limited to, sequences that confer resistance to kanamycin, ampicillin, chloramphenicol, tetracycline, streptomycin, and neomycin. Another example of a marker that renders a cell resistant to a selective agent is 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMG-CoA reductase), an enzyme used for archaeal membrane lipid biosynthesis (Matsumi et al., J. Bacteriol., 2007, 189:2683-2691). Certain statins, such as mevinolin and its analog simvastatin, inhibit HMG-CoA reductase activity, and overexpression of HMG-CoA reductase can confer resistance to mevinolin and/or simvastatin. Yet another example of a marker is a nutritional marker. A nutritional marker is typically a coding region that, when mutated in a cell, confers on that cell a requirement for a particular compound. Cells containing such a mutation will not grow on defined medium that does not include the appropriate compound, and cells receiving a coding region that complements the mutation can grow on the defined medium in the absence of the compound. Examples of nutritional markers include, but are not limited to, coding regions encoding polypeptides in biosynthetic pathways, such as nucleic acid biosynthesis (e.g., biosynthesis of uracil), amino acid biosynthesis (e.g., biosynthesis of histidine and tryptophan), vitamin biosynthesis (e.g., biosynthesis of thiamine), and the like.
Polypeptides useful in the methods described herein, such as the polypeptides described herein and other Cas6 polypeptides, may be obtained from a microbe that has a CRISPR locus. Examples of such microbes are listed above. Polypeptides and fragments thereof useful in the present invention may be produced using recombinant DNA techniques, such as an expression vector present in a cell. Such methods are routine and known in the art. The polypeptides and fragments thereof may also be synthesized in vitro, e.g., by solid phase peptide synthetic methods. The solid phase peptide synthetic methods are routine and known in the art. A polypeptide obtained from a microbe having a CRISPR locus, produced using recombinant techniques, or by solid phase peptide synthetic methods may be further purified by routine methods, such as fractionation on immunoaffinity or ion-exchange columns, ethanol precipitation, reverse phase HPLC, chromatography on silica or on an anion-exchange resin such as DEAE, chromatofocusing, SDS-PAGE, ammonium sulfate precipitation, gel filtration using, for example, Sephadex G-75, or ligand affinity. Typically, obtaining polypeptides includes conditions that minimize RNase and proteinase activity, such as by including RNase inhibitors and protease inhibitors.
Genetically modified microbes
The present invention also includes genetically modified microbes that have a polynucleotide encoding a target RNA polynucleotide, a Cas6 polypeptide, or the combination. Compared to a control microbe that is not genetically modified according to the present invention, a genetically modified microbe may exhibit production of an exogenous polynucleotide or an exogenous polypeptide disclosed herein, or increased production of an endogenous Cas6 polypeptide. A polynucleotide encoding a target RNA polynucleotide or a Cas6 polypeptide disclosed herein may be present in the microbe as a vector or integrated into a chromosome. Examples of microbes that can be genetically modified include, but are not limited to, eukaryotic cells, such as S. cerevisiae and P. pastoris, bacteria, such as gram-negative microbes, for example, E. coli, and archaea, such as Haloferax volcanii.
Methods of Use
Also provided herein are methods for cleaving a polynucleotide. The methods include incubating a target RNA polynucleotide with a Cas6 polypeptide under conditions suitable for cleavage of the polynucleotide by the Cas6 polypeptide. Restriction endonucleases recognize a specific nucleotide sequence (a recognition domain) of a target polynucleotide and cleave the target at a specific location which can be within the recognition domain or outside of the recognition domain. A Cas6 polypeptide cleaves a target outside of the recognition domain, but unlike a restriction endonuclease, the nucleotide sequence to which different Cas6 polypeptides bind can vary. Target polynucleotides described herein are not limited to those possessing a recognition domain with a specific nucleotide sequence. Moreover, unlike restriction endonucleases known in the art, the target polynucleotide may be RNA.
A target RNA polynucleotide has a Cas6 recognition domain, i.e., the site to which a Cas6 polypeptide binds, and a cleavage site, i.e., the site enzymatically cleaved by a Cas6 polypeptide. While the term target RNA polynucleotide suggests the nucleotides are ribonucleotides, polynucleotides described herein also include the corresponding deoxyribonucleotide sequence, and the RNA and DNA complements thereof. It should be understood that the sequences disclosed herein as DNA sequences can be converted from a DNA sequence to an RNA sequence by replacing each thymidine nucleotide with a uracil nucleotide. In one aspect, a target RNA polynucleotide may be based on a nucleotide sequence from a CRISPR locus. A CRISPR locus of a prokaryotic microbe includes, from 5' to 3', a repeat followed immediately by a spacer (referred to herein as a "repeat-spacer unit"). Typically, a CRISPR locus includes multiple repeat-spacer units. In a CRISPR locus, each repeat is nearly identical (Barrangou et al., U.S. Published Patent Application 2008/0124725), and is typically 30 to 35 nucleotides in length. In contrast to the repeats, each spacer of a CRISPR locus is typically a different nucleotide sequence. The Cas6 endoribonuclease activity of a Cas6 polypeptide disclosed herein cleaves a repeat region derived from a CRISPR locus. The location of the cleavage site is on the 5' side of the nucleotide located 10, 9, 8, 7, 6, or 5 nucleotides from the 3' end of the repeat. In some aspects, the cleavage site is on the 5' side of the nucleotide located 8 nucleotides from the 3' end of the repeat.
The nucleotide sequence of a repeat present in a CRISPR locus can easily be identified in any microbe that includes a CRISPR locus. For instance, the genomic sequences of many microbes are known, and the location of CRISPR loci in these microbes is often known, or can easily be located using routine bioinformatic methods known in the art. For instance, Edgar (BMC Bioinformatics, 2007, 8:18) describes a computer program specifically designed for the identification and analysis of CRISPR repeats, and includes a list of predicted repeats based on 346 prokaryotic genomes (see Edgar, Supplementary Table 1). Grissa et al. (BMC Bioinformatics, 2007, 8:172, and Nucl. Acids Res., 2007, 35(Web Server issue):W52-W57) describe a computer program which identifies CRISPRs from genomic sequences, extracts the repeat and spacer sequences, and constructs a database which is automatically updated monthly using newly released genome sequences. Thus, the nucleotide sequence of a repeat in a CRISPR locus can be determined by the skilled person using routine methods. For example, a repeat present in Pyrococcus furiosus is GTTCCAATAAGACTAAAATAGA ↓ ATTGAAAG (SEQ ID NO:191), and the location of the site cleaved by a Cas6 polypeptide, such as SEQ ID NO:2, is shown by the arrow, i.e., 8 nucleotides from the 3' end of the repeat.
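The position of the cleavage site in the P. furiosus repeat can be worked through in a few lines. The sketch below is only an illustration of the coordinates described above: it converts the DNA form of the repeat to RNA by replacing T with U and splits it at a site a given number of nucleotides from the 3' end (8 by default). The function names `to_rna` and `cleave_repeat` are hypothetical, and the code models the position of the cut only, not the chemistry of the resulting termini.

```python
# P. furiosus CRISPR repeat as given in the text (DNA form, 30 nucleotides).
PF_REPEAT_DNA = "GTTCCAATAAGACTAAAATAGAATTGAAAG"


def to_rna(dna: str) -> str:
    """Convert a DNA sequence to RNA by replacing each T with U."""
    return dna.upper().replace("T", "U")


def cleave_repeat(repeat_rna: str, nt_from_3prime: int = 8) -> tuple[str, str]:
    """Split a repeat on the 5' side of the nucleotide `nt_from_3prime` from its 3' end.

    Returns (5' product, 3' product).
    """
    if not 0 < nt_from_3prime < len(repeat_rna):
        raise ValueError("cleavage site must fall inside the repeat")
    cut = len(repeat_rna) - nt_from_3prime
    return repeat_rna[:cut], repeat_rna[cut:]


if __name__ == "__main__":
    five_prime, three_prime = cleave_repeat(to_rna(PF_REPEAT_DNA))
    print(five_prime, three_prime)
```

For this 30-nucleotide repeat the default setting yields a 22-nucleotide 5' product and an 8-nucleotide 3' product; passing 5 through 10 as `nt_from_3prime` covers the other cleavage positions mentioned in the text.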
In another aspect, a target RNA polynucleotide may include other nucleotide sequences downstream of the cleavage site, i.e., the nucleotides that correspond to the 3' end of a repeat present in a microbe and downstream of a cleavage site may be different relative to the nucleotides present in a repeat present in a microbe. It is expected that the nucleotides downstream of a cleavage site may include at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 substitutions when compared to the nucleotides present in a repeat present in a microbe. A target RNA polynucleotide based on a repeat present in a microbe may include fewer than 8 nucleotides downstream of the cleavage site. For instance, a target RNA polynucleotide based on a repeat present in a microbe may include at least 1, at least 2, at least 3, at least 4, at least 5, or at least 6 nucleotides downstream of the cleavage site. Optionally and preferably, one or both of the nucleotides flanking the cleavage site are the same as found in the wild-type microbe.
In some aspects, a target RNA polynucleotide based on a repeat obtained from a particular microbe may include other variations in nucleotide sequence relative to the repeat present in the microbe. Typically, such variations occur outside of the Cas6 recognition domain. A Cas6 recognition domain is located near the 5' end of a repeat. In one aspect, a Cas6 recognition domain includes the nucleotide beginning at position 1 (i.e., the nucleotide at the 5' end of the repeat) and extends to nucleotide 6, nucleotide 7, nucleotide 8, nucleotide 9, nucleotide 10, nucleotide 11, nucleotide 12, or nucleotide 13. The Cas6 recognition domain of a target RNA polynucleotide may be defined relative to its distance from the cleavage site. For instance, a Cas6 recognition domain includes nucleotides located 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, and/or 21 or more nucleotides upstream of the cleavage site. The size of a Cas6 recognition domain may span at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, or at least 8 nucleotides to no greater than 10, no greater than 11, no greater than 12 nucleotides, or no greater than 13 nucleotides. For instance, when the Cas6 polypeptide is SEQ ID NO:2 or has sequence similarity with SEQ ID NO:2, a Cas6 recognition domain may include the nucleotides located 15, 18, and 20 nucleotides upstream of the cleavage site, and can be represented as UNCNNUNNNNNNNNNNNNNN↓NNNNNNNN (SEQ ID NO:192), where the arrow refers to the cleavage site, and one or both of the nucleotides flanking the cleavage site is A. Preferably, when the Cas6 polypeptide is SEQ ID NO:2 or has sequence similarity with SEQ ID NO:2, a Cas6 recognition domain includes the nucleotides located 14 to 21 nucleotides upstream of the cleavage site, and can be represented as
UUACAAUANNNNNNNNNNNNN↓NNNNNNNN (SEQ ID NO:193), where the arrow refers to the cleavage site, and one or both of the nucleotides flanking the cleavage site is A. Thus, for a target RNA polynucleotide that is based on a repeat present in a CRISPR locus, the nucleotide sequence between the Cas6 recognition domain and the cleavage site may vary from the sequence present in a wild-type repeat.
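As a non-limiting sketch, a candidate target RNA polynucleotide can be compared against a representation such as SEQ ID NO:193 by treating each N as any ribonucleotide. The helper names below, the fixed downstream length of 8 nucleotides, and the example sequences are hypothetical and are introduced only for illustration; the flanking-adenosine check follows the statement that one or both nucleotides flanking the cleavage site is A.

```python
import re

# SEQ ID NO:193 as written in the text: 8 defined nucleotides followed by
# 13 N upstream of the cleavage site, then 8 N downstream of it.
UPSTREAM_PATTERN = "UUACAAUA" + "N" * 13
DOWNSTREAM_LENGTH = 8  # assumption for this sketch


def pattern_to_regex(pattern: str) -> str:
    """Translate a pattern of A/C/G/U/N into a regular expression (N = any base)."""
    return "".join("[ACGU]" if base == "N" else base for base in pattern)


def matches_target(rna: str, require_flanking_a: bool = True) -> bool:
    """Return True if `rna` has exactly the layout of the pattern.

    With `require_flanking_a`, at least one of the two nucleotides
    flanking the cleavage site must be A.
    """
    regex = pattern_to_regex(UPSTREAM_PATTERN) + "[ACGU]{%d}" % DOWNSTREAM_LENGTH
    if re.fullmatch(regex, rna) is None:
        return False
    cut = len(UPSTREAM_PATTERN)          # index of first downstream nucleotide
    flanking = rna[cut - 1] + rna[cut]   # nucleotides on either side of the site
    return (not require_flanking_a) or "A" in flanking


# Hypothetical candidates built for illustration only.
candidate = "UUACAAUA" + "GACUAAAAUAGA" + "A" + "AUUGAAAG"
no_flanking_a = "UUACAAUA" + "GACUAAAAUAGA" + "G" + "GUUGAAAG"
wrong_domain = "GG" + candidate[2:]

print(matches_target(candidate))
print(matches_target(no_flanking_a))
print(matches_target(wrong_domain))
```

A match in this sketch indicates only agreement with the written representation; it does not by itself establish that a given Cas6 polypeptide will cleave the polynucleotide.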
Typically, when a target RNA polynucleotide is based on a repeat obtained from a particular microbe, the Cas6 polypeptide used to cleave the target RNA polynucleotide is a Cas6 polypeptide present in that microbe (or a microbe with a similar CRISPR repeat sequence), or has sequence similarity to such a Cas6 polypeptide. Thus, when a target RNA polynucleotide is based on a repeat identical or similar to that present in Pyrococcus furiosus, the Cas6 polypeptide is SEQ ID NO:2 or has sequence similarity to SEQ ID NO:2. When a target RNA polynucleotide is based on a repeat identical or similar to that present in Korarchaeum cryptofilum, the Cas6 polypeptide is SEQ ID NO:3 or has sequence similarity to SEQ ID NO:3. Likewise, when a target RNA polynucleotide is based on a repeat identical or similar to that present in a microbe listed in Figure 15, Figure 16, or Figure 17, the Cas6 polypeptide is, or has sequence similarity to, a Cas6 polypeptide present in that microbe. The Cas6 polypeptide may also be one present in a microbe with an identical or similar CRISPR repeat sequence as that in the target RNA polynucleotide. Identifying nucleotide sequences encoding Cas6 polypeptides is described above. In view of the present disclosure, the skilled person now knows which target RNA polynucleotide and Cas6 polypeptide can be used to result in cleavage of a target RNA polynucleotide.
A target RNA polynucleotide may include an additional polynucleotide at the 3' end, at the 5' end, or at both ends. If the target RNA polynucleotide is identical to a CRISPR repeat, the additional polynucleotide may be referred to as a heterologous polynucleotide. This additional polynucleotide at the 3' end can be chosen by a skilled person and cleaved using the methods described herein. Thus, the skilled person can design a target RNA polynucleotide that will result in the production of an RNA with a predictable and known 5' end. It is expected that there is no upper limit on the number of nucleotides that may be added to the 3' end of a repeat. For instance, a target RNA polynucleotide may include at least 10, at least 50, or at least 100 additional nucleotides at the 3' end.
The methods may be in vitro or in vivo. Practicing the method in vivo may include introducing a polynucleotide into a microbe. The introduced polynucleotide may include the target RNA polynucleotide, or the introduced polynucleotide may encode the target RNA polynucleotide. The microbe may be, but is not limited to, a genetically modified microbe. An example of a genetically modified microbe for use in the methods includes one with an exogenous polynucleotide encoding a Cas6 polypeptide. The method may be practiced at a suitable temperature such as at least 30°C, at least 40°C, at least 50°C, at least 60°C, at least 70°C, at least 80°C, or at least 90°C.
Also provided herein are target RNA polynucleotides that include a Cas6 recognition domain as described above. The polynucleotide may be RNA, or may be DNA. If it is DNA it may be operably linked to a regulatory sequence, such as a promoter, and may be present in a vector. Optionally, the polynucleotide may include nucleotides downstream of the cleavage site to facilitate the ligation of a different polynucleotide downstream of the cleavage site. For instance, nucleotides downstream of the cleavage site may include a restriction endonuclease site or a multiple cloning site.
The present invention also provides kits. A kit may include one or more of the polynucleotides or polypeptides described herein. For instance, a kit may include a target RNA polynucleotide or a DNA polynucleotide encoding a target RNA polynucleotide, a polynucleotide encoding a Cas6 polypeptide, a Cas6 polypeptide, or a combination thereof. Kits may be used, for instance, for modifying a microbe to express a Cas6 polypeptide and/or a target RNA polynucleotide. Kits may be used for in vitro cleavage of a target RNA polynucleotide. The kit components are present in a suitable packaging material in an amount sufficient for at least one assay. Optionally, other reagents such as buffers and solutions needed to practice the invention are also included.
Instructions for use of the packaged polypeptide and/or polynucleotide are also typically included.
As used herein, the phrase "packaging material" refers to one or more physical structures used to house the contents of the kit. The packaging material is constructed by well known methods, preferably to provide a sterile, contaminant-free environment. The packaging material has a label which indicates that the components can be used for methods as described herein. In addition, the packaging material contains instructions indicating how the materials within the kit are employed. As used herein, the term "package" refers to a solid matrix or material such as glass, plastic, paper, foil, and the like, capable of holding within fixed limits a kit component. Thus, for example, a package can be a glass vial used to contain milligram quantities of a polypeptide or polynucleotide. "Instructions for use" typically include a tangible expression describing the reagent concentration or at least one assay method parameter.
The present invention is illustrated by the following examples. It is to be understood that the particular examples, materials, amounts, and procedures are to be interpreted broadly in accordance with the scope and spirit of the invention as set forth herein.
Example 1
An RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes. CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses. The CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs. Here a Pyrococcus furiosus Cas6 was identified as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs. Cas6 interacts with a specific sequence motif in the 5' region of the CRISPR repeat element and cleaves at a defined site within the 3' region of the repeat. The 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins. The predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes. Our findings indicate that Cas6 functions in the generation of CRISPR-derived guide RNAs in numerous bacteria and archaea.
Materials and methods
Purification of PF1131 protein for cleavage and RNA-binding assays. N-terminal, 6x-histidine-tagged PF1131 protein (PfCas6 from P. furiosus DSM 3638 strain) was expressed in Escherichia coli BL21 codon + (DE3, Invitrogen) cells harboring a pET24d plasmid containing the appropriate gene insert (obtained from Michael Adams, University of Georgia, Athens, GA). Protein expression was induced by growing the cells to an OD600 of 0.6 and adding isopropylthio-β-D-galactoside (IPTG) to a final concentration of 1 mM. The cells were disrupted by sonication (Misonix Sonicator 3000) in buffer A (20 mM sodium phosphate [pH 7.0], 500 mM NaCl and 0.1 mM phenylmethylsulfonyl fluoride). The lysate was then cleared by centrifugation and the supernatant was incubated for 20 minutes at 70°C. This sample was centrifuged and the supernatant was applied to a Ni-NTA agarose column (Qiagen) that had been equilibrated with Buffer A. The protein was eluted from the column with Buffer A containing 350 mM imidazole. The purity of the protein was evaluated by SDS-PAGE and staining with coomassie blue. Buffer exchange into 40 mM HEPES-KOH (pH 7.0), 500 mM KCl was carried out using Microcon PL-10 filter columns (Millipore). The protein concentration was determined by the BCA assay (Pierce).
Generation of RNA substrates.
Synthetic RNAs (listed in Table 1) and the RNA size standards (Decade Markers) were purchased from Integrated DNA Technologies (IDT) and Ambion, respectively. These RNAs were 5'-end-labeled with T4 polynucleotide kinase (Ambion) in a 20-μL reaction containing 20 pmol of RNA, 500 μCi of [γ-32P] ATP (3000 Ci/mmol; MP Biomedicals), and 20 U of T4 kinase. The RNAs were separated by electrophoresis on denaturing (7 M urea) 15% polyacrylamide gels, and the appropriate RNA species were excised from the gel with a sterile razor blade guided by a brief autoradiographic exposure. The RNAs were eluted from the gel slices by end-over-end rotation in 400 μL of RNA elution buffer (500 mM NH4OAc, 0.1% SDS, 0.5 mM EDTA) for 12-14 h at 4°C. The RNA was then extracted with phenol/chloroform/isoamyl alcohol (PCI, 25:24:1 at pH 5.2), and precipitated with 2.5 volumes of 100% ethanol in the presence of 0.3 M sodium acetate and 20 μg of glycogen after incubation for 1 hour at -20°C.
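For orientation only, the molar amount of labeled ATP implied by the stated activity and specific activity can be estimated as follows. The function name is hypothetical, and the calculation assumes the nominal specific activity applies at the time of use (radioactive decay is ignored).

```python
def pmol_from_activity(activity_uci: float, specific_activity_ci_per_mmol: float) -> float:
    """Picomoles of nucleotide for a given activity (µCi) and
    specific activity (Ci/mmol), ignoring radioactive decay."""
    activity_ci = activity_uci * 1e-6            # µCi -> Ci
    mmol = activity_ci / specific_activity_ci_per_mmol
    return mmol * 1e9                            # mmol -> pmol


# Values from the 5'-end-labeling reaction described above.
atp_pmol = pmol_from_activity(500.0, 3000.0)
rna_pmol = 20.0
print(round(atp_pmol, 1), round(atp_pmol / rna_pmol, 1))
```

On these figures the labeled ATP is in several-fold molar excess over the 20 pmol of RNA 5' ends.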
Table 1. Oligonucleotides used in this study
Figure imgf000036_0001
Figure imgf000037_0001
All other RNAs were generated by in vitro transcription using T7 RNA polymerase (Ambion) and uniformly labeled with [α-32P] UTP (700 Ci/mmol; MP Biomedicals) as described (Baker et al., 2005. Genes & Dev. 19: 1238-1248). The templates used were either annealed DNA oligonucleotides or PCR products (see Tables 1, 2), both containing the T7 promoter sequence. A typical reaction contained 200 ng of PCR product or annealed deoxyoligonucleotides, 1 mM DTT, 10 U SUPERase-IN RNase inhibitor (Ambion), 500 μM ATP, CTP, and GTP, 50 μM UTP, 30 μCi [α-32P] UTP, 1× transcription buffer (Ambion), and 40 U T7 RNA polymerase in a total volume of 20 μL.
Table 2. Combinations of deoxyoligonucleotides used to generate RNAs in this study
Figure imgf000038_0001
The oligos were either annealed directly (IVT) or were used as PCR primers to generate template DNA (PCR) for in vitro transcription reactions. Oligo sequences are listed in Table 1.
RNA-binding and cleavage reactions
Typically, identical reaction conditions were used to assay the ability of PfCas6 protein to bind to and to cleave substrate RNAs. These reactions were initiated by incubating 0.05 pmol of 32P-radiolabeled RNAs (either uniformly or 5'-end-labeled) with up to 1 μM (as indicated in the figure legends) of PfCas6 protein in 20 mM HEPES-KOH (pH 7.0), 250 mM KCl, 0.75 mM DTT, 1.5 mM MgCl2, 5 μg of E. coli tRNA, and 10% glycerol in a 20-μL reaction volume for 30 minutes at 70°C. Half of the reactions were directly run on native 8% polyacrylamide gels to assay RNA binding by gel mobility shift essentially as described (Baker et al., 2005. Genes & Dev. 19: 1238-1248). RNA cleavage was assayed using the remaining half of the reaction by deproteinizing (PCI extraction and ethanol precipitation) the RNAs and separating them by electrophoresis on denaturing (7 M urea), 12%-15% polyacrylamide gels. Gels were dried and the radiolabeled RNAs visualized by phosphorimaging.
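The stated amounts can be put on a common scale with a short calculation. This is a sketch with hypothetical helper names; it uses only the 0.05 pmol of RNA, the 20-μL volume, and the upper protein concentration of 1 μM given above.

```python
def concentration_nm(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in nM from an amount in pmol and a volume in µL
    (pmol/µL equals µM; multiply by 1000 for nM)."""
    return amount_pmol / volume_ul * 1000.0


def amount_pmol(concentration_um: float, volume_ul: float) -> float:
    """Amount in pmol from a concentration in µM and a volume in µL."""
    return concentration_um * volume_ul


rna_nm = concentration_nm(0.05, 20.0)       # substrate RNA concentration
protein_pmol = amount_pmol(1.0, 20.0)       # PfCas6 at the 1 µM upper limit
molar_excess = protein_pmol / 0.05          # protein relative to RNA
print(rna_nm, protein_pmol, molar_excess)
```

At the upper protein concentration the enzyme is therefore present in large molar excess over the radiolabeled substrate.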
Cleavage site mapping.
In order to map the site of RNA cleavage by Cas6, a standard cleavage reaction was set up using 5' end labeled repeat RNA as described above. Alkaline hydrolysis and RNase T1 (0.1 U) ladders were generated as described previously (Youssef et al., 2007. Nucleic Acids Res. 35: 6196-6206). Following the reactions, the RNAs were extracted with PCI, ethanol precipitated, and separated by electrophoresis on large, denaturing (7 M urea), 15% polyacrylamide (19:1 acrylamide:bis) gels. The gels were dried and the RNAs visualized by phosphorimaging.
Purification of PfCas6 for structure determination.
N-terminal polyhistidine-tagged wild-type and selenomethionine-labeled PF1131 protein was expressed in E. coli and purified from cell extract by heat denaturation and two chromatography steps. The cells were disrupted by sonication in a buffer containing 25 mM sodium phosphate (pH 7.5), 5% (v/v) glycerol, 1 M NaCl, 5 mM β-mercaptoethanol (βME), and 0.2 mM phenylmethylsulfonyl fluoride. The cell lysate was heated for 15 minutes to 70°C before being pelleted. The supernatant was then directly loaded at room temperature onto a Ni-NTA (Qiagen) column equilibrated with 25 mM sodium phosphate (pH 7.5), 5% (v/v) glycerol, 1 M NaCl, and 5 mM imidazole. The column was washed with the loading buffer containing 25 mM imidazole and then the bound protein was eluted using the loading buffer containing 350 mM imidazole. Fractions containing PF1131 were pooled and loaded onto a Superdex 200 (Hiload 26/60, Pharmacia) size-exclusion column equilibrated with 20 mM Tris-HCl (pH 7.4), 500 mM KCl, 5% glycerol, 0.5 mM ethylenediaminetetraacetic acid (EDTA), and 5 mM βME. The fractions corresponding to PF1131 were pooled and concentrated to 100 mg/mL for crystallization.
Crystallization of PF1131 and selenomethionine-labeled PF1131.
Both the wild-type and selenomethionine-labeled PF1131 protein were crystallized using vapor diffusion in a hanging drop at 30°C. The droplets of PF1131 at 40 mg/mL were combined in equal volume with a well solution that contained 50 mM MES (pH 6.0), 30 mM MgCl2, and 15% (v/v) isopropanol. The crystals formed in 1-5 days with a cubic shape and to a size of ~0.4 mm × 0.4 mm × 0.4 mm.
Data collection and structure determination. Crystals were soaked briefly in a cryo-protecting solution containing the mother liquor plus 20% (w/v) polyethylene glycol 4000 before being flash frozen in a nitrogen stream at 100 Kelvin. The crystals of the native and selenomethionine-labeled PF1131 diffracted to d_min = 1.8-2.2 Å at the Southeast Regional Collaborative Access Team (SER-CAT) beamline 22ID. The space group of the crystals was determined to be P3221 and the cell dimensions are listed in Table 3. A single wavelength data set was collected at the anomalous peak of selenium from a selenomethionine-labeled crystal. The solvent content was calculated to be 54.9% if the crystal was assumed to contain one PF1131 in one asymmetric unit. The structure of PF1131 was solved by a SAD phasing method using the automated crystallographic structure solution program SOLVE (Terwilliger and Berendzen, 1999. Acta Crystallogr. D Biol. Crystallogr. 55: 849-861). The initial model traced by SOLVE was further improved by the program COOT (Emsley and Cowtan, 2004. Acta Crystallogr. D Biol. Crystallogr. 60: 2126-2132), followed by refinement using CNS (Brunger et al., 1998. Acta Crystallogr. D Biol. Crystallogr. 54: 905-921) and REFMAC5 (Murshudov et al., 1997. Acta Crystallogr. D Biol. Crystallogr. 53: 240-255) to Rwork/Rfree of 23.6/27.3. The quality of the structure model was checked by PROCHECK (Laskowski et al., 1993. J. Appl. Crystallogr. 26: 283-291) and was found to be of satisfactory stereochemical properties.
Table 3. Data collection and refinement statistics (values in parentheses refer to those of the highest resolution shell)
Crystal information
  space group: P3221
  unit cell parameters (Å/°): a/c/γ = 84.745/81.679/120
SAD data
  wavelength (Å): 0.97925
  resolution range (Å): 50.0-2.25 (2.33-2.25)
  number of unique reflections: 16705
  redundancy: 20.8 (17.7)
  completeness (%): 99.7 (99.7)
  I/σ(I): 93.6 (8.1)
Figure imgf000041_0001
Refinement data and statistics
  resolution range (Å): 50.0-1.8 (1.86-1.80)
  number of unique reflections: 30102 (1923)
  redundancy: 16.0 (5.1)
  completeness (%): 94.5 (61.0)
  I/σ(I): 78.9 (2.4)
Figure imgf000041_0002
  Rwork (%): 23.6 (35.8)
  Rfree (%): 27.3 (40.1)
Model information
  number of amino acids: 232
  number of protein atoms: 1951
  number of waters: 35
R.M.S.D. of the model
  bond length (Å): 0.007
  bond angle (°): 1.041
Ramachandran plot
  residues in most favored region: 183 [92.9%]
  residues in additionally allowed region: 13 [6.6%]
  residues in generously allowed region: 1 [0.5%]
  residues in disallowed region: 0 [0%]
Results
The psiRNAs, which are thought to be primary agents in prokaryotic genome defense, are derived from CRISPR RNA transcripts that consist of a series of individual invader targeting sequences separated by a common repeat sequence (Fig. 1A). To identify the enzyme required for dicing CRISPR RNA transcripts and releasing the individual embedded psiRNAs, a number of recombinant P. furiosus Cas proteins were screened for the ability to cleave CRISPR repeat sequences. A single protein was identified, Cas6 (PF1131), that cleaves specifically within the repeat sequence of radiolabeled substrate RNAs consisting of either a guide (invader targeting or "spacer") sequence flanked by two repeat sequences or the repeat sequence alone (Fig. 1B,C). Examination of the cleavage products generated from uniformly labeled and 5'-end-labeled RNA substrates indicates that cleavage occurs ~20-25 nt from the 5' end of the repeat. Cleavage also occurs within each repeat of an extended substrate RNA containing two guide sequences and flanking repeats (Fig. 2).
More than 40 CRISPR-associated genes have been identified; however, only a subset of the cas genes is found in any given genome, and no cas gene appears to be present in all organisms that possess the CRISPR-Cas system (Haft et al., 2005. PLoS Comput. Biol. 1: e60; Makarova et al., 2006. Biol. Direct 1: 7). Cas6 is among the most widely distributed Cas proteins and is found in both bacteria and archaea (Haft et al., 2005. PLoS Comput. Biol. 1: e60). A distinct protein with similar activity was very recently reported in Escherichia coli (Brouns et al., 2008. Science 321: 960-964). This protein, Cse3 (CRISPR-Cas system subtype E. coli, also referred to as CasE), is found in some bacteria that lack Cas6 (Haft et al., 2005. PLoS Comput. Biol. 1: e60). Both Cas6 and Cse3 are members of the RAMP (repeat-associated mysterious protein) superfamily, as are a large number of the Cas proteins (Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Makarova et al., 2006. Biol. Direct 1: 7). RAMP proteins contain G-rich loops and are predicted to be RNA-binding proteins (Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Makarova et al., 2006. Biol. Direct 1: 7). Cas6 is distinguished from the many other RAMP family members by a conserved sequence motif within the predicted C-terminal G-rich loop (consensus GhGxxxxxGhG, where h is hydrophobic and xxxxx has at least one lysine or arginine) (Makarova et al., 2002. Nucleic Acids Res. 30: 482-496; Haft et al., 2005. PLoS Comput. Biol. 1: e60). Nuclease activity was not predicted for Cas6 based on sequence analysis.
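As an illustrative sketch, the consensus GhGxxxxxGhG can be written as a regular expression and used to scan an amino acid sequence. The set of residues treated as hydrophobic is an assumption of this sketch (the text does not enumerate them), and the function name and test sequences are hypothetical.

```python
import re

# Assumption: residues counted as hydrophobic ("h"); illustrative only.
HYDROPHOBIC = "AVILMFWYC"

# GhGxxxxxGhG, where the five x residues include at least one K or R.
# The lookahead requires a K or R somewhere within the five x positions.
SIGNATURE = re.compile(
    rf"G[{HYDROPHOBIC}]G(?=[A-Z]{{0,4}}[KR])[A-Z]{{5}}G[{HYDROPHOBIC}]G"
)


def find_signature(protein: str):
    """Return (start index, matched text) for each non-overlapping motif match."""
    return [(m.start(), m.group(0)) for m in SIGNATURE.finditer(protein.upper())]


# Hypothetical sequences, not taken from any Cas6 polypeptide.
with_motif = "MSTGLGAKSSFGFGKV"
without_basic = "MSTGLGASSSFGFGKV"   # no K/R among the five x residues

print(find_signature(with_motif))
print(find_signature(without_basic))
```

A stricter or looser definition of "hydrophobic" can be substituted by editing the residue set.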
To determine the precise PfCas6 cleavage site within the CRISPR repeat sequence, 5'-end-labeled repeat RNA was incubated with the purified enzyme and the 5' cleavage product was mapped relative to RNase T1 (cuts after guanosines) and alkaline hydrolysis (cuts after each nucleotide) cleavage products (Fig. 3A). A 22-nt 5' cleavage product was identified indicating that cleavage occurs between adenosine 22 and adenosine 23 of the 30-nt repeat sequence (Fig. 3A,B). The resulting 5' end generated by PfCas6 is the same as that observed in mature psiRNA species isolated from P. furiosus cells. Mutation of the two nucleotides spanning the cleavage site (AA to GG) drastically reduced the cleavage activity of PfCas6 (Fig. 3C) without preventing binding of the enzyme to the RNA (assayed by RNA gel mobility shift; Fig. 3D). The site of cleavage is at a junction within a potential stem-loop structure that may form by base-pairing between weakly palindromic sequences commonly found at the 5' and 3' termini of CRISPR repeat sequences (Fig. 3B; Godde and Bickerton, 2006. J. Mol. Evol. 62: 718-729; Kunin et al., 2007. Genome Biol. 8: R61).
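The logic of the mapping experiment can be sketched computationally: for a 5'-end-labeled RNA, each ladder band corresponds to a labeled fragment ending at a cut position. The sketch below assumes the substrate is the P. furiosus repeat of SEQ ID NO:191 (the sequences in Table 1 are not reproduced here), and the function name is hypothetical.

```python
def five_prime_fragment_lengths(rna: str, cut_after: str) -> list:
    """Lengths of 5'-end-labeled fragments produced when cleavage occurs
    3' of each residue in `cut_after` ("G" for RNase T1, "ACGU" for
    alkaline hydrolysis)."""
    return [i + 1 for i, base in enumerate(rna) if base in cut_after]


# Assumed substrate: P. furiosus repeat (SEQ ID NO:191) as RNA.
repeat_rna = "GUUCCAAUAAGACUAAAAUAGAAUUGAAAG"

t1_ladder = five_prime_fragment_lengths(repeat_rna, "G")
alkaline_ladder = five_prime_fragment_lengths(repeat_rna, "ACGU")

cas6_product = 22  # length of the 5' cleavage product reported in the text
below = max(n for n in t1_ladder if n <= cas6_product)
above = min(n for n in t1_ladder if n >= cas6_product)
flanking = repeat_rna[cas6_product - 1] + repeat_rna[cas6_product]

print(t1_ladder, (below, above), flanking)
```

Under this assumption the 22-nt product migrates between the RNase T1 bands of 21 and 26 nucleotides, and the two residues flanking the cut are both adenosines, in agreement with the mapping described above.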
We next investigated the RNA sequence requirements of Cas6 binding and endonucleolytic cleavage. To identify the RNA-binding determinants, we performed gel mobility shift assays with a series of RNAs (Fig. 4A). The results indicate that sequences in the 5' region of the CRISPR repeat are important for PfCas6 binding. Under normal assay conditions, rapid cleavage prevents unambiguous observation of PfCas6 binding to the intact repeat (Fig. 3C,D), although binding can be observed with the cleavage site mutant (Fig. 3D) and at reduced temperatures where PfCas6 cleavage activity is inhibited (Fig. 5). However, incubation of PfCas6 with the repeat RNA (Fig. 3D) or with a guide sequence flanked by two repeat sequences (Fig. 4A, panel a) under conditions compatible with cleavage reveals interaction of the protein with the 5' cleavage product generated during incubation. PfCas6 also interacts with the gel-purified 5' cleavage product, but not with the 3' cleavage product (Fig. 4B). Furthermore, we found that PfCas6 binds each tested RNA that contains the repeat sequences found upstream of the cleavage site (i.e., the first 22 nt of the repeat) (Fig. 4A, panels c,f,g), but not an RNA that contains only the downstream region (last 8 nt) of the repeat (Fig. 4A, panel b).
Further analysis indicates that the first 12 nucleotides of the 5' region of the CRISPR repeat play a critical role in Cas6 binding. PfCas6 binds to an RNA comprised of the first 12 nucleotides of the repeat with similar affinity as the 5' cleavage product (Fig. 4A, panel h). Furthermore, protein binding is abolished by substitution or deletion of the first eight nucleotides of the repeat (Fig. 4A, panels d,e). In addition, substitution, insertion or deletion in the region of nucleotides 9-12 appears to have slightly reduced interaction (Fig. 4A, panels i,j,k). No binding was observed with a DNA repeat sequence (Fig. 4A, panel l). Taken together, the results indicate that PfCas6 requires sequence and/or structure information present within the first 12 nucleotides of the CRISPR repeat RNA for stable interaction (Fig. 4C).
While nucleotides at the 5' end of the CRISPR repeat are sufficient for robust PfCas6 binding, cleavage appears to involve additional elements. As expected, mutations that disrupt protein binding also eliminate cleavage activity (Fig. 6, panels d,e). However, other mutations dramatically reduced cleavage efficiency without disrupting PfCas6 binding. As indicated above, substitution of the two adenosines at the cleavage site disrupts cleavage but not binding (Fig. 3C,D). In addition, substitution of the last 8 nucleotides of the repeat specifically disrupted cleavage (Fig. 6, panel f). PfCas6 cleavage activity was also significantly reduced by small (4-nt) insertions or deletions between the PfCas6-binding site and cleavage site (Fig. 6, panels i,j). Substitution of 6 nt between the binding and cleavage sites also disrupted cleavage (Fig. 6, panel k). No cleavage activity was observed with a DNA repeat sequence (Fig. 6, panel l). These results suggest that cleavage depends upon sequence elements along the length of the repeat and perhaps upon the distance between the binding and cleavage sites, and are consistent with a requirement for a specific RNA fold such as the predicted hairpin structure (Fig. 3B; Godde and Bickerton, 2006. J. Mol. Evol. 62: 718-729; Kunin et al., 2007. Genome Biol. 8: R61).
P. furiosus has seven CRISPR loci with five slightly varied repeat sequences, and the elements that we identified as most important for Cas6 recognition and cleavage map to the regions of greatest sequence conservation. Variation is observed at only one position within each of the first 12 and last 11 nucleotides of the P. furiosus repeat sequences, consistent with the importance of these two regions in Cas6 binding and cleavage. On the other hand, variation occurs at three positions between the binding and cleavage sites (positions 14, 16, and 19), suggesting that nucleotide identities are less important in this region.
To gain a more detailed understanding of PfCas6, we obtained a crystal structure of the protein at 1.8 Å resolution (Fig. 6; see Table 3 for structure determination details). PfCas6 contains a duplicated ferredoxin-like fold linked by an extended peptide (residues 118-123). The close arrangement of the β-sheets of the two ferredoxin-like folds creates a well-formed central cleft (Fig. 7A). The ferredoxin fold is a common protein fold also found in the structures of other RNA-binding proteins including the well-characterized RNA recognition motif (RRM), which primarily functions in ssRNA binding (Maris et al., 2005. FEBS J. 272: 2118-2131). However, PfCas6 appears to exploit a distinct mechanism of base-specific ssRNA recognition. Most notably, PfCas6 lacks the prevalent aromatic and positive residues that characterize the β-sheets of RRMs (Maris et al., 2005. FEBS J. 272: 2118-2131). The central regions of both the front and back surfaces of PfCas6 display positive potential that coincides with regions of conserved amino acids (Fig. 7) suggesting that the composite surfaces formed by the tandem ferredoxin-like folds correspond to RNA-binding sites.
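As a rough consistency check on the structure determination details, the reported solvent content can be related to the unit cell in Table 3 through the widely used Matthews approximation (solvent fraction ≈ 1 − 1.23/Vm, with Vm in Å³/Da). The approximation, the value of six asymmetric units per cell for space group P3221, and all variable names are assumptions of this sketch and are not stated in the text.

```python
import math

# Unit cell parameters from Table 3 (trigonal cell: a = b, gamma = 120 degrees).
A_EDGE = 84.745          # Å
C_EDGE = 81.679          # Å
GAMMA_DEG = 120.0
ASU_PER_CELL = 6         # assumption: general positions in P3221
SOLVENT_FRACTION = 0.549 # reported solvent content (54.9%)

# Cell volume for a hexagonal-axes cell: a * a * c * sin(gamma).
cell_volume = A_EDGE * A_EDGE * C_EDGE * math.sin(math.radians(GAMMA_DEG))

# Matthews approximation rearranged for the Matthews coefficient Vm (Å^3/Da).
matthews_vm = 1.23 / (1.0 - SOLVENT_FRACTION)

# Protein mass per asymmetric unit implied by the cell and solvent content.
implied_mass_da = cell_volume / (ASU_PER_CELL * matthews_vm)

print(round(cell_volume), round(matthews_vm, 2), round(implied_mass_da))
```

The implied mass per asymmetric unit is an estimate only; it depends on the constant 1.23 and on the assumed number of asymmetric units per cell.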
The structure of PfCas6 allows us to predict the site of catalysis and catalytic mechanism of the enzyme. Several candidate catalytic residues are evident as strictly conserved residues in aligned Cas6 sequences (Fig. 8). These include Tyr31, His46, and Lys52, which cluster within 6 Å of each other and are found in close proximity to the G-rich loop that contains the Cas6 signature motif (Fig. 7B). These three residues may form a catalytic triad for RNA cleavage similar to that of the tRNA intron splicing endonuclease (Calvin and Li, 2008. Cell. Mol. Life Sci. 65: 1176-1185). The G-rich loop is located immediately above the putative catalytic triad and may facilitate the placement of CRISPR repeat RNA substrates. Consistent with the corresponding predicted general acid-base catalytic mechanism (proposed for the splicing endonuclease) (Calvin and Li, 2008. Cell. Mol. Life Sci. 65: 1176-1185), PfCas6 does not require divalent metals and like other metal-independent nucleases cleaves on the 5' side of the phosphodiester bond, likely generating 5' hydroxyl (OH) and 2',3' cyclic phosphate RNA end groups (Fig. 9). Finally, while binding of the enzyme occurs over a wide temperature range, PfCas6 cleavage activity is sharply temperature-dependent with significantly more activity at 70°C than 37°C (Fig. 5).
Discussion
The results presented here indicate that Cas6 plays a central role in the production of the psiRNAs in the emerging prokaryotic RNAi pathway. Cas6 is a novel riboendonuclease. Through direct binding and cleavage of CRISPR repeat sequences, Cas6 dices long, single-stranded CRISPR primary transcripts into units that consist of an individual guide sequence flanked by a short (8-nt) repeat sequence at the 5' end and by the remaining repeat sequence at the 3' end of the RNA (Fig. 1A). Mature psiRNAs retain the short repeat-derived sequence established by Cas6 at their 5' ends in P. furiosus, which we speculate functions as a psiRNA identity tag that allows recognition of the guide RNAs by components of the pRNAi machinery. A repeat sequence of the same length was observed on the 5' ends of RNAs associated with E. coli Cse3, indicating that this may indeed be a generally conserved feature (Brouns et al., 2008. Science 321: 960-964). The 3' ends of Cas6 cleavage products appear to be further processed since mature psiRNAs lack repeat sequences at their 3' termini in P. furiosus. Because Cas6 remains bound to the CRISPR repeat sequences at the 3' end of the cleavage product (Figs. 3, 4B), Cas6 could influence the subsequent 3' end processing of the RNA. Additional studies may reveal if Cas6 is also an important component of pRNAi effector complexes (serving to couple biogenesis and function), as is the case for eukaryotic Dicer enzymes (Jaskiewicz and Filipowicz, 2008. Curr. Top. Microbiol. Immunol. 320: 77-97).
Cas6 is evolutionarily, structurally, and catalytically distinct from the Dicer proteins that function in the release of individual RNAs that mediate gene silencing in eukaryotes (Hammond, 2005. FEBS Lett. 579: 5822-5829; Jaskiewicz and Filipowicz, 2008. Curr. Top. Microbiol. Immunol. 320: 77-97). However, Cas6 is one of three different ferredoxin fold Cas proteins recently found to possess nuclease activity. Cas2, another protein found in many of the prokaryotes that possess the CRISPR-Cas system, cleaves U-rich ssRNA (Beloglazova et al., 2008. J. Biol. Chem. 283: 20361-20371). The mechanism of action of Cas6 seems to be distinct from that of Cas2, which appears to be a metal-dependent, hydrolytic enzyme (Beloglazova et al., 2008. J. Biol. Chem. 283: 20361-20371). The role of Cas2 in the pRNAi pathway is currently unknown. The E. coli Cse3 protein functions like Cas6 as a CRISPR repeat cleaving enzyme (Brouns et al., 2008. Science 321: 960-964). Cse3 also cleaves RNA in a divalent metal-independent manner (Brouns et al., 2008. Science 321: 960-964). The substrate RNA recognition requirements and the precise cleavage site have not yet been defined for Cse3. Interestingly, despite the lack of significant sequence homology, the Cas6 and Cse3 proteins appear to adopt similar structures to perform a common function in psiRNA biogenesis. Moreover, some bacteria with the CRISPR-Cas system do not appear to contain either a cas6 or a cse3 gene, suggesting that there is another Cas6 functional homolog among the Cas proteins, and illustrating the diversity of the CRISPR-Cas systems present in prokaryotes.
Example 2
Cas6 substrate recognition was probed at single nucleotide resolution using RNA footprinting. The results of this analysis confirm that sequence elements in the 5' region of the repeat are the primary determinants for recognition by Cas6 and that nucleotides 2-8 likely have direct contact with Cas6. Also, through mutational analysis, a critical role of the predicted catalytic triad was established and an acid/base catalytic mechanism involving these three amino acids is proposed. Finally, native Cas6 was isolated from P. furiosus extract, was shown to cleave CRISPR repeat RNA, and was found to co-purify with several crRNA (CRISPR RNA) processing intermediates.
Materials and methods
Expression and purification of PfCas6 and mutants. Primers to generate site-specific mutants were designed and ordered from Eurofins MWG Operon (listed in Table 4). Mutant cas6 genes were generated from a pET24d plasmid containing the PF1131 (Cas6 from P. furiosus) insert using QuikChange™ site-directed mutagenesis (Stratagene). The DNA sequences were confirmed by sequencing. N-terminal, 6x histidine-tagged proteins were expressed in E. coli BL21 codon + (DE3, Invitrogen) and purified to homogeneity as described in Example 1.
Table 4. Oligonucleotides used in this study.
[Table 4 is reproduced as an image in the original publication.]
Generation of radiolabeled RNAs. The synthetic RNA (see Table 4 for sequence) and RNA size standards (Decade™ markers) used in this study were purchased from Integrated DNA Technologies (IDT) and Applied Biosystems, respectively. The northern probe used in this study (see Table 4, #15 for sequence) was purchased from Eurofins MWG Operon. RNAs were 5' end-labeled with T4 polynucleotide kinase (Applied Biosystems) and [γ-32P] ATP (7000 Ci/mmol; MP Biomedicals) as described in Example 1. End-labeling at the 3' end was performed with T4 RNA ligase (Promega) and [α-32P] pCp (2500 Ci/mmol; MP Biomedicals). A typical reaction contained 10 pmol of RNA, 20 U T4 RNA ligase, 10 U SUPERase-IN™ RNase inhibitor (Applied Biosystems), 1X T4 RNA ligase buffer (Promega), 20% polyethylene glycol 3350, and ~12 pmol [α-32P] pCp. The uniformly labeled CRISPR repeat RNA substrate was generated by in vitro transcription by T7 polymerase using annealed DNA oligos containing the T7 promoter sequence as a template (see Tables 4 and 5 for sequence information) in the presence of [α-32P] UTP (MP Biomedicals) and purified as described in Example 1. All radiolabeled RNAs were extracted with phenol/chloroform/isoamyl alcohol (PCI), precipitated with ethanol, and gel purified as described in Example 1.
Table 5. Combinations of deoxynucleotides used in this study. The oligos were either used to generate site-directed mutant PfCas6 constructs (PCR) or annealed directly and used as templates for in vitro transcription (IVT).
[Table 5 is reproduced as an image in the original publication.]
RNA footprinting. Lead (II)-induced and RNase A cleavage reactions were carried out essentially as described previously (Youssef et al., 2007. Nucleic Acids Res; 35:6196-206). Briefly, 0.1 pmol of 32P end-labeled RNA (either 5' or 3') was incubated in the absence (free RNA) or presence of increasing concentrations of Cas6 at 65-70° C for 30 minutes in buffer A (20 mM HEPES-KOH [pH 7.0], 500 mM KCl). Lead (II)-induced cleavage was initiated by the addition of 15 mM Pb(II) acetate (lead (II) acetate) prepared fresh in sterile water. Reactions were carried out at room temperature for 10 minutes and were stopped by the addition of EDTA to a final concentration of 20 mM followed by PCI extraction and ethanol precipitation. RNase A cleavage was initiated by the addition of 0.01 ng of RNase A (Applied Biosystems), and the reactions were incubated at 37° C for 15 minutes. Reactions were stopped by PCI extraction followed by ethanol precipitation. Alkaline hydrolysis ladders (cleavage after each nucleotide) were generated as described previously (Youssef et al., 2007. Nucleic Acids Res; 35:6196-206). In each case, precipitated RNAs were resuspended in RNA loading dye (10 M urea, 2 mM EDTA, 0.5% SDS, and 0.02% [w/v] each bromophenol blue and xylene cyanol) and separated on 38x30 cm 15% polyacrylamide (acrylamide:bis ratio 19:1), 7 M urea-containing gels. The gels were dried and RNAs visualized by phosphor imaging.
RNA-binding and cleavage reactions. RNA binding and cleavage reactions were carried out as described in Example 1. Briefly, 0.05 pmol of uniformly 32P-labeled RNA was incubated in the absence (free RNA) or presence of increasing concentrations of Cas6 (as indicated in figure legends) in buffer A for 30 minutes at 65-70° C. Half of each reaction was run on 8% native polyacrylamide gels to assess RNA binding by gel mobility shift analysis. RNA cleavage was assessed by separation of the RNAs on denaturing, 7 M urea-containing, 15% polyacrylamide gels following PCI extraction and ethanol precipitation. For analysis of native Cas6 cleavage activity, either ~40 μg of whole cell extract (WCE) or supernatant from an immunoprecipitation reaction (see below) was incubated with 0.05 pmol of uniformly 32P-labeled RNA for 30 minutes at 70° C. Alternatively, 10 μL of resin from an immunoprecipitation reaction (see below) was added. In this case, samples were mixed every five minutes by pipetting up and down during the 30 minute incubation at 70° C. The gels were dried and RNAs visualized by phosphor imaging. Quantitation of cleavage was performed using ImageQuant™ TL software (GE Life Sciences).
Preparation of P. furiosus cell extract. Four grams of P. furiosus cells were lysed in 10 mL 50 mM Tris (pH 8.0) in the presence of 100 U RQ1 DNase (Promega) and 0.1 mM phenylmethanesulfonylfluoride (PMSF). The extract was then subjected to ultracentrifugation at 100,000g for 90 minutes. The resulting S100 was then stored at -80° C until use.
Preparation of polyclonal antibodies against PfCas6 in Gallus gallus. Specific antibodies against PfCas6 were raised in egg-laying hens (Gallus gallus). For immunization, three injections of 200 μg of 6x histidine-tagged PfCas6 in buffer B (20 mM sodium phosphate [pH 7.0], 500 mM NaCl) were given, with two weeks between injections. For the initial injection, 500 μL of antigen (0.4 mg/mL) was emulsified in 200 μL of Freund's complete adjuvant prior to injection in the breast muscle. For the two booster injections, 500 μL of antigen (0.4 mg/mL) was emulsified in 200 μL of Freund's incomplete adjuvant prior to injection. One week following the final injection, immune eggs were collected daily for three months.
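The dose stated above can be checked against the stated antigen volume and concentration. The short sketch below is only an arithmetic check; the variable names are illustrative and are not part of the original protocol.

```python
# Values taken from the immunization protocol described above.
antigen_volume_ul = 500        # antigen volume per injection, in microliters
antigen_conc_mg_per_ml = 0.4   # antigen concentration, in mg/mL
injections = 3                 # one initial injection plus two boosters

# 1 mg/mL equals 1 ug/uL, so volume (uL) x concentration (mg/mL) gives ug.
dose_ug = antigen_volume_ul * antigen_conc_mg_per_ml
total_ug = injections * dose_ug

print(f"antigen per injection: {dose_ug:.0f} ug")
print(f"antigen per hen over the schedule: {total_ug:.0f} ug")
```

The per-injection value agrees with the 200 μg stated in the protocol.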
IgY was purified from the egg yolks by polyethylene glycol (PEG) precipitation as described previously (Polson et al., 1980. Immunol Commun; 9:475-93). Briefly, egg yolks were separated from the whites and washed with dH2O. In a typical purification, three egg yolks were punctured, combined, and then resuspended in approximately 250 mL of lysis buffer (10 mM Tris [pH 7.5], 100 mM NaCl). Polyethylene glycol (PEG) 8000 (Fisher Scientific) was then added to 3.5% w/v. The sample was then mixed by shaking and centrifuged at 10,000g for ten minutes. The supernatant was filtered through 100% cotton cheesecloth, and PEG 8000 was then added at 9% w/v. The sample was mixed by shaking and then centrifuged at 10,000g for ten minutes. The supernatant was removed and discarded. The pellet was resuspended in approximately 35 mL of lysis buffer by incubation at 4° C overnight. The PEG precipitation was then repeated and the pellet from the 9% w/v PEG step was resuspended in approximately 7 mL of lysis buffer and stored at either 4° C or -80° C until use. The protein concentration was determined by the BCA assay (Pierce).
Immunoprecipitation of Cas6 from P. furiosus extract. Immunoprecipitations (IP) were performed using anti-Cas6 IgY antibodies conjugated to CarboLink™ coupling gel (Pierce). Coupling was performed according to the manufacturer's protocol and was verified by A280 absorbance readings.
A P. furiosus S100 cell extract was pre-cleared in a reaction containing ~8 mg of total protein, ~550 μg of non-immune IgY coupled CarboLink™ resin, 1X Complete™ Mini protease inhibitor (Roche), and 50 U SUPERase-IN™ RNase inhibitor, brought up to a total volume of 1 mL with IPP-300 (10 mM Tris [pH 8.0], 300 mM NaCl, 0.05% Igepal). The pre-clearing reaction was incubated at room temperature for two hours with end-over-end rotation. The sample was then centrifuged at 3000g for 2 minutes and the supernatant was split between preimmune and immune IP reactions. A typical IP reaction contained 500 μL of pre-cleared cell extract (~4 mg total protein), 1X Complete™ Mini protease inhibitor, 50 U SUPERase-IN™ RNase inhibitor, and 270-550 μg antibody (either preimmune or immune) coupled resin, brought up to a total volume of 1 mL with IPP-300. The reactions were incubated at room temperature for 2 hours with end-over-end rotation and then washed four times with IPP-300. The pellets were resuspended in an equal volume of buffer A and stored at 4° C for later analysis.
Northern analysis. For northern blot analysis, RNAs were extracted from immunoprecipitation samples (both immune and preimmune) and WCE using TRIzol™ LS Reagent (Invitrogen) according to the manufacturer's recommendations. Northern blots were performed essentially as described previously (Hale et al., 2008. RNA; 14:2572-9). Briefly, RNAs were separated on a 15% polyacrylamide, 7 M urea-containing gel (Criterion™, Bio-Rad) then transferred onto Zeta-Probe™ nylon membranes (Bio-Rad) using a Trans-Blot SD Semi-Dry Cell™ (Bio-Rad). The membranes were then baked at 80°C for one hour before prehybridization in a ProBlot hybridization oven (LabNet) at 42°C for one hour. Prehybridization and hybridization were performed in Oligo-UltraHyb™ (Applied Biosystems). Hybridization was initiated by adding 5' end-labeled probe to the prehybridization buffer, and hybridization was carried out at 42°C overnight. Following hybridization, the membrane was washed twice with 2X SSC (30 mM sodium citrate [pH 7.0], 300 mM NaCl) with 0.5% SDS for 30 minutes at 42°C. RNAs were then visualized by phosphor imaging.
Results
Mapping the Cas6/CRISPR repeat RNA interaction. As discussed in Example 1, the 5' region of CRISPR repeat RNA plays a role in recognition by Cas6. Substitution or deletion of these nucleotides prevented detectable binding. Additionally, an RNA consisting of nucleotides 1-12 of the repeat displayed a binding affinity comparable to that of the full length repeat RNA. Sequence elements in the middle and 3' regions of the repeat did not appear to be important for binding given that Cas6 binding was insensitive to deletions or substitutions in these regions of the repeat RNA. In order to gain a more detailed understanding of Cas6 recognition of CRISPR repeat RNA, RNA footprinting was performed with radioactively labeled (either 3' or 5' end-labeled) CRISPR repeat RNA and recombinant PfCas6 protein. Lead (II) acetate (cleaves single-stranded regions and tertiary interactions) and RNase A (cleaves after unpaired Cs and Us) were chosen as probing reagents. A strong protection was observed in the 5' region of the repeat with both lead (II) acetate and RNase A using 3' end-labeled repeat RNA (Fig. 10A and 10C). Specifically, nucleotides 2-8 were protected from lead-induced cleavage in a Cas6 concentration-dependent manner. A similar protection profile was observed with RNase A, with cleavage products at nucleotides 3, 5, and 8 becoming less susceptible to degradation in the presence of Cas6. No protection was observed with either lead (II) acetate or RNase A within the 3' region of the repeat using 5' end-labeled repeat (Fig. 10B and 10C). Similar results were obtained when RNase T1 was used as a cleavage reagent for RNA footprinting. These findings are in good agreement with previous results in which RNA mutagenesis revealed that sequence elements in the 5' region of the repeat are the primary determinants for recognition by Cas6. Despite weak potential for the 5' and 3' regions of repeat RNA to base-pair, consistent with predictions made by in silico analysis (Kunin et al., 2007. Genome Biol; 8:R61), repeats from P. furiosus appear to be mostly unstructured in solution (RNA alone in Fig. 10A and 10B).
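The relationship between the RNase A specificity and the reported footprint can be illustrated with a short sketch. It assumes that the repeat RNA substrate corresponds to the sequence recited in the claims as SEQ ID NO:191 (given there as DNA and written here as RNA); the substrate sequence used in the experiment is listed in Table 4 and is not reproduced in this text, so this correspondence is an assumption. The variable names are illustrative only.

```python
# Repeat sequence as recited in the claims (SEQ ID NO:191, DNA alphabet).
# Assumption: the footprinting substrate is the RNA form of this sequence.
REPEAT_DNA = "GTTCCAATAAGACTAAAATAGAATTGAAAG"
repeat_rna = REPEAT_DNA.replace("T", "U")

# RNase A cleaves after unpaired pyrimidines (C and U).
# Collect their 1-based positions along the repeat.
pyrimidine_positions = [
    i for i, nt in enumerate(repeat_rna, start=1) if nt in "CU"
]

# Footprint reported in the text: nucleotides 2-8 are protected by Cas6.
footprint = range(2, 9)
sites_in_footprint = [p for p in pyrimidine_positions if p in footprint]
sites_outside = [p for p in pyrimidine_positions if p not in footprint]

print("potential RNase A sites inside the footprint:", sites_in_footprint)
print("potential RNase A sites outside the footprint:", sites_outside)
```

Under this assumption, the positions at which RNase A protection was reported (nucleotides 3, 5, and 8) are all pyrimidines that fall within the 2-8 footprint.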
Mutational analysis of a putative catalytic amino acid triad. As described in Example 1, cleavage of CRISPR RNA repeats by Cas6 has been predicted to involve a conserved Tyr, His, Lys triad. This prediction was based both on the high degree of conservation of these amino acids and the observation that in the crystal structure these amino acids clustered in close proximity to one another in a similar configuration to that observed in the archaeal tRNA splicing endonuclease (Haft et al., 2005. PLoS Comput Biol; 1:e60; Calvin and Li, 2008. Cell Mol Life Sci; 65:1176-85). In order to determine whether these amino acids are required for catalysis, Cas6 proteins containing single amino acid substitutions (Y31A, Y31F, H46A, H46Q, K52A, and K52E) were expressed in and purified from Escherichia coli (Fig. 11B) and assessed for their ability to cleave radiolabeled CRISPR repeat RNA. Mutation of any of the three triad amino acids led to a significant decrease or complete loss of cleavage activity relative to wild type Cas6 (Fig. 11A). Cleavage of the repeat was abolished in the Y31A, H46A, and H46Q mutants, indicating that Tyr31 and His46 likely play a critical role in catalysis. In the K52A and K52E mutants, cleavage was reduced >40 and >150 fold at the highest concentration tested (500 nM), respectively, suggesting a key role for this residue in catalysis. Significant cleavage activity was retained in the Y31F mutant, with only ~2-fold reduction in cleavage compared to the wild type. Similar results were obtained when residues of the catalytic triad of the tRNA splicing endonuclease from Archaeoglobus fulgidus and Thermoplasma acidophilum were mutated (Calvin et al., 2008. Biochemistry; 47:13659-65; Kim et al., 2007. J Bacteriol; 189:8339-46).
Next, we tested whether the loss or reduction of cleavage observed in each of the Cas6 mutants was due to an inability of the mutants to bind to the substrate RNA. To this end, native gel mobility shift assays were performed with each Cas6 mutant and radiolabeled CRISPR repeat RNA (Fig. 12). The ability of Cas6 to bind CRISPR repeat RNA was largely unaffected by mutations in the proposed catalytic triad. Each of the six Cas6 mutants was able to form a stable complex with CRISPR repeat RNA with a binding affinity similar to that of wild type Cas6 (Fig. 12). Thus Tyr31, His46, and Lys52 are required for efficient cleavage of the repeat and likely play direct roles in catalysis.
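A fold reduction of the kind reported above can be derived from the quantified fraction of substrate cleaved. The following is a minimal sketch of that calculation; the function name and the numerical inputs are hypothetical and are not data from this study.

```python
def fold_reduction(wild_type_fraction, mutant_fraction):
    """Return the fold reduction in cleavage relative to wild type.

    Both arguments are fractions of substrate cleaved (0 to 1), measured at
    the same protein concentration. Returns infinity when the mutant shows
    no detectable cleavage.
    """
    if not 0 < wild_type_fraction <= 1:
        raise ValueError("wild_type_fraction must be in (0, 1]")
    if not 0 <= mutant_fraction <= 1:
        raise ValueError("mutant_fraction must be in [0, 1]")
    if mutant_fraction == 0:
        return float("inf")
    return wild_type_fraction / mutant_fraction


# Hypothetical fractions cleaved, chosen only to illustrate the arithmetic.
hypothetical_wild_type = 0.90
hypothetical_mutant = 0.45
example = fold_reduction(hypothetical_wild_type, hypothetical_mutant)
print(f"fold reduction: {example:.1f}")
```

A value reported as ">40 fold" would correspond to a mutant signal at or below the detection limit used as the denominator, which is why such values are given as lower bounds.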
Native Cas6 cleaves CRISPR repeat RNA and associates with crRNA intermediates. In order to determine whether native Cas6 behaves similarly to the recombinant protein, polyclonal antibodies were raised against recombinant Cas6 and used to immunoprecipitate the native protein from a P. furiosus extract. The immunoprecipitation samples, along with whole cell extract (WCE), were then tested for Cas6 cleavage activity by incubation with uniformly labeled repeat RNA (Fig. 13A). Remarkably, in the WCE, the CRISPR repeat RNA was cleaved into the same products generated by recombinant Cas6 with no other cleavage products evident. Additionally, Cas6 cleavage activity was present in the immune, but not the preimmune, pellet following immunoprecipitation, indicating that the cleavage activity observed in WCE was carried out by native Cas6. The cleavage activity observed in P. furiosus extract was found to be divalent metal ion independent, as was shown to be the case for recombinant Cas6.
It has been shown that a similar CRISPR repeat RNA-cleaving endoribonuclease, Cse3 from E. coli, is found as part of a large ribonucleoprotein complex (RNP) that contains both Cas proteins and crRNAs (Brouns et al., 2008. Science; 321:960-964). In order to determine whether Cas6 was also part of an RNP, RNAs were extracted from immunoprecipitation samples and probed by northern blot analysis for a crRNA spacer. Cas6 co-purifies with several crRNA species including the 2X and 1X intermediates (Hale et al., 2008. RNA; 14:2572-9), which correspond to cleavage products generated by Cas6 cleavage of the CRISPR primary transcript (Fig. 13B). Cas6 also weakly co-purified with mature crRNAs (Fig. 13B).
Discussion
The biogenesis of mature crRNAs is critical to CRISPR-Cas mediated resistance to genome invaders. The initial processing step, endonucleolytic cleavage of the primary transcript within the repeat region, is performed by Cas6 in P. furiosus. This cleavage results in a crRNA intermediate that retains eight nucleotides of the repeat at the 5' end and ~22 nucleotides of the next repeat at the 3' end. In P. furiosus, it appears that this crRNA intermediate is then processed at the 3' end to yield two mature crRNA species that retain the eight nucleotide, repeat-derived "tag" that we propose serves as a recognition sequence for other Cas proteins (Hale et al., 2008. RNA; 14:2572-9). In an E. coli K12 strain, Cse3, a ribonuclease that performs a function similar to that of Cas6, was shown to be required for crRNA biogenesis (Brouns et al., 2008. Science; 321:960-964). Cse3 is a divalent metal ion independent endoribonuclease that cleaves CRISPR repeat RNAs within the 3' region of the repeat. Although the sequences of the repeat RNAs differ, both Cse3 cleavage of E. coli derived CRISPR repeat RNAs and Cas6 cleavage of P. furiosus repeat RNAs occur eight nucleotides upstream of the 3' end of the repeat. This cleavage generates an eight nucleotide tag which is retained on the mature crRNAs in both E. coli and P. furiosus (Brouns et al., 2008. Science; 321:960-964; Hale et al., 2008. RNA; 14:2572-9). The presence of these eight nucleotides may be a universal feature of crRNAs, serving as a recognition sequence for effector Cas proteins that rely on the spacer sequence to guide the complex to invading mobile genetic elements.
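The cleavage pattern described above can be sketched in a few lines of code. The sketch assumes the P. furiosus repeat recited in the claims as SEQ ID NO:191 (written here as RNA) and places the cut eight nucleotides upstream of the 3' end of each repeat, as stated in the text. The function name and the spacer sequences are hypothetical placeholders, and the sketch models only the position of the cut, not recognition or catalysis.

```python
# Repeat from the claims (SEQ ID NO:191), converted from DNA to RNA alphabet.
REPEAT = "GTTCCAATAAGACTAAAATAGAATTGAAAG".replace("T", "U")
TAG_LENGTH = 8  # cut site: eight nucleotides upstream of the repeat 3' end


def dice_primary_transcript(spacers):
    """Return the fragments produced by one cut within every repeat.

    The modeled transcript is repeat-spacer-repeat-...-repeat. Each internal
    fragment (a "1X intermediate") carries the 8-nt tag at its 5' end, one
    spacer, and the remaining 22 nt of the next repeat at its 3' end.
    """
    upstream_part = REPEAT[:-TAG_LENGTH]  # first 22 nt of the repeat
    tag = REPEAT[-TAG_LENGTH:]            # last 8 nt of the repeat
    fragments = [upstream_part]           # piece 5' of the first cut
    for spacer in spacers:
        fragments.append(tag + spacer + upstream_part)
    fragments.append(tag)                 # piece 3' of the last cut
    return fragments


# Hypothetical spacer sequences used only for illustration.
hypothetical_spacers = ["ACGUACGUACGUACGU", "GGGAAACCCUUUGGGA"]
fragments = dice_primary_transcript(hypothetical_spacers)
for fragment in fragments[1:-1]:
    print(fragment[:TAG_LENGTH], fragment[TAG_LENGTH:])
```

Each internal fragment begins with the eight nucleotide tag and ends with 22 nucleotides of the following repeat, matching the crRNA intermediate described in the text.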
Cse3 from E. coli was found to be a component of a large RNP containing a number of other Cas proteins as well as mature crRNAs (Brouns et al., 2008. Science; 321:960-964). In the present study we have shown that Cas6 appears to associate with several crRNA species, including its predicted cleavage product, the 1X crRNA intermediate (Fig. 13B). Because it remains bound to its cleavage product, Cas6 may influence the 3' end processing of the 1X crRNA intermediate. Cas6 is not likely to be a structural component of the invader-targeting effector complex because the protein only weakly associates with mature crRNAs (Fig. 13B).
The structure of Cse3 from Thermus thermophilus revealed an overall architecture very similar to that shown in the structure of Cas6 from P. furiosus (Ebihara et al., 2006. Protein Sci; 15:1494-1499). That is, Cse3 is composed of duplicated ferredoxin folds separated by a central cleft which contains a conserved Gly-rich loop (Ebihara et al., 2006. Protein Sci; 15:1494-1499). Located adjacent to this loop is an invariant His residue that was shown to be required for Cse3 cleavage activity (Brouns et al., 2008. Science; 321:960-964; Ebihara et al., 2006. Protein Sci; 15:1494-1499). In our study, Cas6 mediated cleavage of CRISPR repeat RNA was shown to require the highly conserved His46 residue, which is also located adjacent to the conserved Gly-rich loop characteristic of Cas6 proteins. Cse3, however, lacks conserved Tyr and Lys residues that were shown in the current study to be important for Cas6 cleavage activity. Therefore it seems that despite the aforementioned similarities between the two, Cas6 and Cse3 likely employ distinct catalytic mechanisms.
Proposed Cas6 cleavage mechanism. We propose a general acid/base catalytic mechanism for Cas6 based on similar active site architecture and reaction characteristics to the archaeal tRNA splicing endonuclease. In this proposed mechanism, a proton is abstracted from the 2' hydroxyl of the ribose ring by the hydroxyl group of Tyr31 (Fig. 14). The ability of the Y31F mutant to support significant cleavage activity is consistent with previous studies involving archaeal tRNA splicing endonuclease (Calvin et al., 2008. Biochemistry; 47:13659-65; Kim et al., 2007. J Bacteriol; 189:8339-46). It has been proposed that the stereochemistry of the catalytic Tyr, rather than its hydroxyl group, may account for its role in cleavage. In addition to its function as a general base, Tyr31 may also be required for proper substrate positioning in the active site.
Removal of a proton from the 2' hydroxyl of the ribose ring leads to nucleophilic attack by the 2' oxygen of the ribose on the phosphate backbone, resulting in a pentavalent transition state whose negative charge is stabilized by the positive charge of the amine group of Lys52. Cleavage of the scissile phosphate bond is facilitated by proton donation from the imidazole ring of His46. This would result in cleavage products with a 5' hydroxyl and a 2'-3' cyclic phosphate, consistent with previous findings (Example 1 and Calvin and Li, 2008. Cell Mol Life Sci; 65:1176-85).
Cas6 employs a distinct method of substrate recognition and cleavage. Initially, Cas6 binds to sequence elements in the 5' region of CRISPR repeat RNA and then cleavage occurs site specifically at a location outside the binding site. One model for how this might occur is that Cas6 binding to the repeat RNA induces a conformational change in the RNA, possibly involving base pairing between the 5' and 3' palindromic regions of the repeat, resulting in the proper positioning of the scissile phosphate bond in the active site. Alternatively, following substrate recognition by Cas6, the RNA may wrap itself around Cas6 through a series of weak contacts to position the cleavage site in the active site. If this were the case, the weak interactions that occur outside the primary binding site could not be detected by the techniques used in this study. Further studies are required to determine the molecular mechanisms and dynamics that allow substrate recognition and catalysis to occur in distinct regions of both the protein and substrate.
The complete disclosure of all patents, patent applications, and publications, and electronically available material (including, for instance, nucleotide sequence submissions in, e.g., GenBank and RefSeq, and amino acid sequence submissions in, e.g., SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq) cited herein are incorporated by reference in their entirety. Supplementary materials referenced in publications (such as supplementary tables, supplementary figures, supplementary materials and methods, and/or supplementary experimental data) are likewise incorporated by reference in their entirety. In the event that any inconsistency exists between the disclosure of the present application and the disclosure(s) of any document incorporated herein by reference, the disclosure of the present application shall govern. The foregoing detailed description and examples have been given for clarity of understanding only. No unnecessary limitations are to be understood therefrom. The invention is not limited to the exact details shown and described, for variations obvious to one skilled in the art will be included within the invention defined by the claims.
Unless otherwise indicated, all numbers expressing quantities of components, molecular weights, and so forth used in the specification and claims are to be understood as being modified in all instances by the term "about." Accordingly, unless otherwise indicated to the contrary, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention. At the very least, and not as an attempt to limit the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques.
Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. All numerical values, however, inherently contain a range necessarily resulting from the standard deviation found in their respective testing measurements.
All headings are for the convenience of the reader and should not be used to limit the meaning of the text that follows the heading, unless so specified.

Claims

What is claimed is:
1. A polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the amino acid sequence of the polypeptide and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, or (b) the full complement of the nucleotide sequence of (a), and a heterologous polynucleotide.
2. A polynucleotide comprising: (a) a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the nucleotide sequence of the isolated polynucleotide and the nucleotide sequence of SEQ ID NO:1 have at least 80% identity, or (b) the full complement of the nucleotide sequence of (a), and a heterologous polynucleotide.
4. The polynucleotide of claim 1 or 2 wherein the heterologous polynucleotide comprises a regulatory sequence.
5. The polynucleotide of claim 1 or 2 wherein the heterologous polynucleotide comprises a vector.
6. A genetically modified microbe comprising an exogenous polynucleotide, wherein the exogenous polynucleotide is the polynucleotide of claim 1 or 2.
7. A genetically modified microbe comprising an exogenous polynucleotide, wherein the exogenous polynucleotide comprises a nucleotide sequence encoding a polypeptide having Cas6 endoribonuclease activity, wherein the amino acid sequence of the polypeptide and the amino acid sequence of SEQ ID NO:2 have at least 80% identity.
8. An enriched polypeptide having Cas6 endoribonuclease activity, wherein the polypeptide comprises an amino acid sequence, wherein the amino acid sequence and the amino acid sequence of SEQ ID NO:2 have at least 80% identity.
9. A polypeptide having Cas6 endoribonuclease activity, wherein the polypeptide comprises an amino acid sequence, wherein the amino acid sequence and the amino acid sequence of SEQ ID NO:2 have at least 80% identity, and wherein the polypeptide further comprises a heterologous polypeptide.
10. A genetically modified microbe comprising an exogenous polypeptide, wherein the exogenous polypeptide is the polypeptide of claim 8 or 9.
11. The genetically modified microbe of claim 6, 7, or 10 wherein the microbe is E. coli.
12. A composition comprising the polypeptide of claim 8 or 9.
13. The composition of claim 12 wherein the polypeptide is isolated.
14. The composition of claim 12 further comprising a target RNA polynucleotide.
15. The composition of claim 14 wherein the target RNA polynucleotide comprises IMCNNUNNNNNNNNNNNNNNNNNNNNNN (SEQ ID NO: 192).
16. The composition of claim 14 wherein the target RNA polynucleotide comprises UUACAAUANNNNNNNNNNNNNNNNNNNNN (SEQ ID NO: 193).
17. The composition of claim 14 wherein the target RNA polynucleotide comprises GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO:191).
18. A method for cleaving a nucleotide sequence comprising: incubating a target RNA polynucleotide with a polypeptide under conditions suitable for cleavage of the target RNA polynucleotide, wherein the target RNA polynucleotide comprises a Cas6 recognition domain; wherein the polypeptide comprises an amino acid sequence having at least 80% identity with the amino acid sequence of SEQ ID NO:2, an amino acid sequence depicted in Figure 1, an amino acid sequence depicted in Figure 2, or an amino acid sequence depicted in Figure 3, and has Cas6 endoribonuclease activity; and wherein the polypeptide cleaves the target RNA polynucleotide, the cleavage site located 5 to 20 nucleotides downstream of the Cas6 recognition domain.
19. The method of claim 18 wherein the target RNA polynucleotide comprises GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO: 191), and wherein the polypeptide comprises an amino acid sequence having at least 80% identity with the amino acid sequence of SEQ ID NO:2.
20. A method for cleaving a nucleotide sequence comprising: incubating a target RNA polynucleotide with a polypeptide under conditions suitable for cleavage of the target RNA polynucleotide, wherein the target RNA polynucleotide comprises a Cas6 recognition domain present in a prokaryotic genome; wherein the polypeptide comprises an amino acid sequence of a Cas6 polypeptide from the prokaryotic genome and has Cas6 endoribonuclease activity; and wherein the polypeptide cleaves the target RNA polynucleotide, the cleavage site located 5 to 20 nucleotides downstream of the Cas6 endoribonuclease domain.
21. The method of claim 19 or 20 wherein the method is in vivo.
22. The method of claim 19 or 20 wherein the method is in vitro.
23. An isolated polynucleotide comprising a Cas6 recognition domain, wherein the Cas6 recognition domain comprises 5'-GTTACAATAAGA (SEQ ID NO:237), or the complement thereof.
24. The isolated polynucleotide of claim 23 wherein the isolated polynucleotide comprises GTTCCAATAAGACTAAAATAGAATTGAAAG (SEQ ID NO: 191), or the complement thereof.
25. The isolated polynucleotide of claim 23 wherein the isolated polynucleotide further comprises an operably linked regulatory sequence.
26. The isolated polynucleotide of claim 23 wherein the isolated polynucleotide further comprises a vector.
27. The isolated polynucleotide of claim 23 wherein the isolated polynucleotide is RNA.
28. A kit comprising packaging materials, and a polypeptide having Cas6 endoribonuclease activity, wherein the polypeptide comprises an amino acid sequence, wherein the amino acid sequence and the amino acid sequence of SEQ ID NO: 2 have at least 80% identity.
PCT/US2009/063432 2008-11-06 2009-11-05 Cas6 polypeptides and methods of use WO2010054108A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/127,764 US9404098B2 (en) 2008-11-06 2009-11-05 Method for cleaving a target RNA using a Cas6 polypeptide

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11204008P 2008-11-06 2008-11-06
US61/112,040 2008-11-06

Publications (4)

Publication Number Publication Date
WO2010054108A2 true WO2010054108A2 (en) 2010-05-14
WO2010054108A3 WO2010054108A3 (en) 2010-09-16
WO2010054108A9 WO2010054108A9 (en) 2010-12-29
WO2010054108A8 WO2010054108A8 (en) 2011-05-05

Family

ID=42153554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/063432 WO2010054108A2 (en) 2008-11-06 2009-11-05 Cas6 polypeptides and methods of use

Country Status (2)

Country Link
US (1) US9404098B2 (en)
WO (1) WO2010054108A2 (en)

WO2017040348A1 (en) 2015-08-28 2017-03-09 The General Hospital Corporation Engineered crispr-cas9 nucleases
WO2017132552A1 (en) 2016-01-27 2017-08-03 Oncorus, Inc. Oncolytic viral vectors and uses thereof
US9834791B2 (en) 2013-11-07 2017-12-05 Editas Medicine, Inc. CRISPR-related methods and compositions with governing gRNAS
US9888673B2 (en) 2014-12-10 2018-02-13 Regents Of The University Of Minnesota Genetically modified cells, tissues, and organs for treating disease
US9926546B2 (en) 2015-08-28 2018-03-27 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
US9938521B2 (en) 2014-03-10 2018-04-10 Editas Medicine, Inc. CRISPR/CAS-related methods and compositions for treating Leber's congenital amaurosis 10 (LCA10)
WO2018071892A1 (en) 2016-10-14 2018-04-19 Joung J Keith Epigenetically regulated site-specific nucleases
US10000772B2 (en) 2012-05-25 2018-06-19 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10011850B2 (en) 2013-06-21 2018-07-03 The General Hospital Corporation Using RNA-guided FokI Nucleases (RFNs) to increase specificity for RNA-Guided Genome Editing
US10077453B2 (en) 2014-07-30 2018-09-18 President And Fellows Of Harvard College CAS9 proteins including ligand-dependent inteins
WO2018195545A2 (en) 2017-04-21 2018-10-25 The General Hospital Corporation Variants of cpf1 (cas12a) with altered pam specificity
US10113163B2 (en) 2016-08-03 2018-10-30 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
WO2018218206A1 (en) 2017-05-25 2018-11-29 The General Hospital Corporation Bipartite base editor (bbe) architectures and type-ii-c-cas9 zinc finger editing
US10166255B2 (en) 2015-07-31 2019-01-01 Regents Of The University Of Minnesota Intracellular genomic transplant and methods of therapy
US10167457B2 (en) 2015-10-23 2019-01-01 President And Fellows Of Harvard College Nucleobase editors and uses thereof
WO2019023483A1 (en) 2017-07-26 2019-01-31 Oncorus, Inc. Oncolytic viral vectors and uses thereof
US10266851B2 (en) 2016-06-02 2019-04-23 Sigma-Aldrich Co. Llc Using programmable DNA binding proteins to enhance targeted genome modification
US10377998B2 (en) 2013-12-12 2019-08-13 The Broad Institute, Inc. CRISPR-CAS systems and methods for altering expression of gene products, structural information and inducible modular CAS enzymes
US10494621B2 (en) 2015-06-18 2019-12-03 The Broad Institute, Inc. Crispr enzyme mutations reducing off-target effects
US10501794B2 (en) 2014-06-23 2019-12-10 The General Hospital Corporation Genomewide unbiased identification of DSBs evaluated by sequencing (GUIDE-seq)
US10526589B2 (en) 2013-03-15 2020-01-07 The General Hospital Corporation Multiplex guide RNAs
US10550372B2 (en) 2013-12-12 2020-02-04 The Broad Institute, Inc. Systems, methods and compositions for sequence manipulation with optimized functional CRISPR-Cas systems
US10563225B2 (en) 2013-07-26 2020-02-18 President And Fellows Of Harvard College Genome engineering
US10577630B2 (en) 2013-06-17 2020-03-03 The Broad Institute, Inc. Delivery and use of the CRISPR-Cas systems, vectors and compositions for hepatic targeting and therapy
US10696986B2 (en) 2014-12-12 2020-06-30 The Broad Institute, Inc. Protected guide RNAS (PGRNAS)
US10711285B2 (en) 2013-06-17 2020-07-14 The Broad Institute, Inc. Optimized CRISPR-Cas double nickase systems, methods and compositions for sequence manipulation
US10731181B2 (en) 2012-12-06 2020-08-04 Sigma-Aldrich Co. LLC CRISPR-based genome modification and regulation
US10738303B2 (en) 2015-09-30 2020-08-11 The General Hospital Corporation Comprehensive in vitro reporting of cleavage events by sequencing (CIRCLE-seq)
WO2020163396A1 (en) 2019-02-04 2020-08-13 The General Hospital Corporation Adenine dna base editor variants with reduced off-target rna editing
US10745677B2 (en) 2016-12-23 2020-08-18 President And Fellows Of Harvard College Editing of CCR5 receptor gene to protect against HIV infection
US10781444B2 (en) 2013-06-17 2020-09-22 The Broad Institute, Inc. Functional genomics using CRISPR-Cas systems, compositions, methods, screens and applications thereof
US10787684B2 (en) 2013-11-19 2020-09-29 President And Fellows Of Harvard College Large gene excision and insertion
US10851357B2 (en) 2013-12-12 2020-12-01 The Broad Institute, Inc. Compositions and methods of use of CRISPR-Cas systems in nucleotide repeat disorders
US10912797B2 (en) 2016-10-18 2021-02-09 Intima Bioscience, Inc. Tumor infiltrating lymphocytes and methods of therapy
US10930367B2 (en) 2012-12-12 2021-02-23 The Broad Institute, Inc. Methods, models, systems, and apparatus for identifying target sequences for Cas enzymes or CRISPR-Cas systems for target sequences and conveying results thereof
US10946108B2 (en) 2013-06-17 2021-03-16 The Broad Institute, Inc. Delivery, use and therapeutic applications of the CRISPR-Cas systems and compositions for targeting disorders and diseases using viral components
US11008588B2 (en) 2013-06-17 2021-05-18 The Broad Institute, Inc. Delivery, engineering and optimization of tandem guide systems, methods and compositions for sequence manipulation
US11028429B2 (en) 2015-09-11 2021-06-08 The General Hospital Corporation Full interrogation of nuclease DSBs and sequencing (FIND-seq)
US11041173B2 (en) 2012-12-12 2021-06-22 The Broad Institute, Inc. Delivery, engineering and optimization of systems, methods and compositions for sequence manipulation and therapeutic applications
US11078481B1 (en) 2016-08-03 2021-08-03 KSQ Therapeutics, Inc. Methods for screening for cancer targets
US11078483B1 (en) 2016-09-02 2021-08-03 KSQ Therapeutics, Inc. Methods for measuring and improving CRISPR reagent function
US11098325B2 (en) 2017-06-30 2021-08-24 Intima Bioscience, Inc. Adeno-associated viral vectors for gene therapy
US11111521B2 (en) 2011-12-22 2021-09-07 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11149267B2 (en) 2013-10-28 2021-10-19 The Broad Institute, Inc. Functional genomics using CRISPR-Cas systems, compositions, methods, screens and applications thereof
US11155795B2 (en) 2013-12-12 2021-10-26 The Broad Institute, Inc. CRISPR-Cas systems, crystal structure and uses thereof
US11268082B2 (en) 2017-03-23 2022-03-08 President And Fellows Of Harvard College Nucleobase editors comprising nucleic acid programmable DNA binding proteins
US11286468B2 (en) 2017-08-23 2022-03-29 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US11299767B2 (en) 2013-03-12 2022-04-12 President And Fellows Of Harvard College Method for generating a three-dimensional nucleic acid containing matrix
US11306324B2 (en) 2016-10-14 2022-04-19 President And Fellows Of Harvard College AAV delivery of nucleobase editors
US11319532B2 (en) 2017-08-30 2022-05-03 President And Fellows Of Harvard College High efficiency base editors comprising Gam
US11407985B2 (en) 2013-12-12 2022-08-09 The Broad Institute, Inc. Delivery, use and therapeutic applications of the CRISPR-Cas systems and compositions for genome editing
US11447770B1 (en) 2019-03-19 2022-09-20 The Broad Institute, Inc. Methods and compositions for prime editing nucleotide sequences
US11542554B2 (en) 2015-11-03 2023-01-03 President And Fellows Of Harvard College Method and apparatus for volumetric imaging
US11542509B2 (en) 2016-08-24 2023-01-03 President And Fellows Of Harvard College Incorporation of unnatural amino acids into proteins using base editing
US11542496B2 (en) 2017-03-10 2023-01-03 President And Fellows Of Harvard College Cytosine to guanine base editor
US11560566B2 (en) 2017-05-12 2023-01-24 President And Fellows Of Harvard College Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation
US11578312B2 (en) 2015-06-18 2023-02-14 The Broad Institute Inc. Engineering and optimization of systems, methods, enzymes and guide scaffolds of CAS9 orthologs and variants for sequence manipulation
US11661590B2 (en) 2016-08-09 2023-05-30 President And Fellows Of Harvard College Programmable CAS9-recombinase fusion proteins and uses thereof
EP4198124A1 (en) 2021-12-15 2023-06-21 Versitech Limited Engineered cas9-nucleases and method of use thereof
US11713485B2 (en) 2016-04-25 2023-08-01 President And Fellows Of Harvard College Hybridization chain reaction methods for in situ molecular detection
US11725228B2 (en) 2017-10-11 2023-08-15 The General Hospital Corporation Methods for detecting site-specific and spurious genomic deamination induced by base editing technologies
US11732274B2 (en) 2017-07-28 2023-08-22 President And Fellows Of Harvard College Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE)
US11795443B2 (en) 2017-10-16 2023-10-24 The Broad Institute, Inc. Uses of adenosine base editors
US11845987B2 (en) 2018-04-17 2023-12-19 The General Hospital Corporation Highly sensitive in vitro assays to define substrate preferences and sites of nucleic acid cleaving agents
US11898179B2 (en) 2017-03-09 2024-02-13 President And Fellows Of Harvard College Suppression of pain by gene editing
US11912985B2 (en) 2020-05-08 2024-02-27 The Broad Institute, Inc. Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
US11920128B2 (en) 2013-09-18 2024-03-05 Kymab Limited Methods, cells and organisms
US11981917B2 (en) 2013-06-04 2024-05-14 President And Fellows Of Harvard College RNA-guided transcriptional regulation
US12031126B2 (en) 2023-12-08 2024-07-09 The Broad Institute, Inc. Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence

Families Citing this family (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3226329A1 (en) 2011-12-16 2013-06-20 Targetgene Biotechnologies Ltd Compositions and methods for modifying a predetermined target nucleic acid sequence
US9102936B2 (en) 2012-06-11 2015-08-11 Agilent Technologies, Inc. Method of adaptor-dimer subtraction using a CRISPR CAS6 protein
US9688971B2 (en) 2012-06-15 2017-06-27 The Regents Of The University Of California Endoribonuclease and methods of use thereof
EP2880171B1 (en) 2012-08-03 2018-10-03 The Regents of The University of California Methods and compositions for controlling gene expression by rna processing
US9902973B2 (en) 2013-04-11 2018-02-27 Caribou Biosciences, Inc. Methods of modifying a target nucleic acid with an argonaute
US9963689B2 (en) 2013-12-31 2018-05-08 The Regents Of The University Of California Cas9 crystals and methods of use thereof
CN113215219A (en) 2014-02-13 2021-08-06 Takara Bio USA, Inc. Methods of depleting target molecules from an initial collection of nucleic acids, and compositions and kits for practicing same
EP3114227B1 (en) 2014-03-05 2021-07-21 Editas Medicine, Inc. Crispr/cas-related methods and compositions for treating usher syndrome and retinitis pigmentosa
US11141493B2 (en) 2014-03-10 2021-10-12 Editas Medicine, Inc. Compositions and methods for treating CEP290-associated disease
US11339437B2 (en) 2014-03-10 2022-05-24 Editas Medicine, Inc. Compositions and methods for treating CEP290-associated disease
EP3981876A1 (en) 2014-03-26 2022-04-13 Editas Medicine, Inc. Crispr/cas-related methods and compositions for treating sickle cell disease
WO2016025759A1 (en) 2014-08-14 2016-02-18 Shen Yuelei Dna knock-in system
CA2963820A1 (en) 2014-11-07 2016-05-12 Editas Medicine, Inc. Methods for improving crispr/cas-mediated genome-editing
KR102535217B1 (en) 2015-04-24 2023-05-19 Editas Medicine, Inc. Assessment of CAS9 Molecule/Guide RNA Molecule Complexes
JP7030522B2 (en) 2015-05-11 2022-03-07 Editas Medicine, Inc. Optimized CRISPR / CAS9 system and method for gene editing in stem cells
CA2986262A1 (en) 2015-06-09 2016-12-15 Editas Medicine, Inc. Crispr/cas-related methods and compositions for improving transplantation
GB201510296D0 (en) * 2015-06-12 2015-07-29 Univ Wageningen Thermostable CAS9 nucleases
CA2999500A1 (en) 2015-09-24 2017-03-30 Editas Medicine, Inc. Use of exonucleases to improve crispr/cas-mediated genome editing
WO2017120410A1 (en) 2016-01-08 2017-07-13 University Of Georgia Research Foundation, Inc. Methods for cleaving dna and rna molecules
EA201891619A1 (en) 2016-01-11 2019-02-28 The Board Of Trustees Of The Leland Stanford Junior University CHIMERIC PROTEINS AND METHODS OF REGULATING GENE EXPRESSION
WO2017123556A1 (en) 2016-01-11 2017-07-20 The Board Of Trustees Of The Leland Stanford Junior University Chimeric proteins and methods of immunotherapy
WO2017165826A1 (en) 2016-03-25 2017-09-28 Editas Medicine, Inc. Genome editing systems comprising repair-modulating enzyme molecules and methods of their use
EP3433364A1 (en) 2016-03-25 2019-01-30 Editas Medicine, Inc. Systems and methods for treating alpha 1-antitrypsin (a1at) deficiency
EP3443086B1 (en) 2016-04-13 2021-11-24 Editas Medicine, Inc. Cas9 fusion molecules, gene editing systems, and methods of use thereof
CA3032822A1 (en) 2016-08-02 2018-02-08 Editas Medicine, Inc. Compositions and methods for treating cep290 associated disease
WO2018144097A1 (en) 2016-11-04 2018-08-09 Akeagen Llc Genetically modified non-human animals and methods for producing heavy chain-only antibodies
EP3596217A1 (en) 2017-03-14 2020-01-22 Editas Medicine, Inc. Systems and methods for the treatment of hemoglobinopathies
US11499151B2 (en) 2017-04-28 2022-11-15 Editas Medicine, Inc. Methods and systems for analyzing guide RNA molecules
EP3622070A2 (en) 2017-05-10 2020-03-18 Editas Medicine, Inc. Crispr/rna-guided nuclease systems and methods
JP2020524497A (en) 2017-06-09 2020-08-20 Editas Medicine, Inc. Engineered CAS9 nuclease
WO2019014564A1 (en) 2017-07-14 2019-01-17 Editas Medicine, Inc. Systems and methods for targeted integration and genome editing and detection thereof using integrated priming sites
CN110093347B (en) * 2018-12-05 2023-03-28 Northwestern Polytechnical University shRNA for inhibiting mouse NPFFR2 gene expression
WO2020131986A1 (en) * 2018-12-21 2020-06-25 Pioneer Hi-Bred International, Inc. Multiplex genome targeting
WO2023023603A1 (en) * 2021-08-18 2023-02-23 Cornell University Type i-a crispr-cas3 system for genome editing and diagnostics
WO2023039566A2 (en) * 2021-09-13 2023-03-16 North Carolina State University Novel type i-b crispr-cas system from clostridia
WO2023212649A1 (en) * 2022-04-27 2023-11-02 The Penn State Research Foundation Modrna-based cas endonuclease and base editor and uses thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007025097A2 (en) * 2005-08-26 2007-03-01 Danisco A/S Use
WO2008108989A2 (en) * 2007-03-02 2008-09-12 Danisco A/S Cultures with improved phage resistance

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4703004A (en) * 1984-01-24 1987-10-27 Immunex Corporation Synthesis of protein with an identification peptide
US4782137A (en) * 1984-01-24 1988-11-01 Immunex Corporation Synthesis of protein with an identification peptide, and hybrid polypeptide incorporating same
US5594115A (en) * 1990-04-09 1997-01-14 Pharmacia & Upjohn Company Process of purifying recombinant proteins and compounds useful in such process
US5935824A (en) * 1996-01-31 1999-08-10 Technologene, Inc. Protein expression system
ES2373586T3 (en) * 2006-05-19 2012-02-06 Danisco A/S MARKED MICROORGANISMS AND METHODS TO MARK.
US8546553B2 (en) * 2008-07-25 2013-10-01 University Of Georgia Research Foundation, Inc. Prokaryotic RNAi-like system and methods of use

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
CARYN HALE ET AL.: 'Prokaryotic silencing (psi)RNAs in Pyrococcus furiosus' RNA vol. 14, 29 October 2008, pages 2572 - 2579 *
DANIEL H. HAFT ET AL.: 'A Guild of 45 CRISPR-Associated (Cas) Protein Families and Multiple CRISPR/Cas Subtypes Exist in Prokaryotic Genomes' PLOS COMPUTATIONAL BIOLOGY vol. 1, no. 6, 30 November 2005, page E60 *
DATABASE GENBANK 25 February 2002 Database accession no. AAL81255 *
JASON CARTE ET AL.: 'Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes' GENES & DEVELOPMENT vol. 22, 31 December 2008, pages 3489 - 3496 *
NATALIA BELOGLAZOVA ET AL.: 'A Novel Family of Sequence-specific Endoribonucleases Associated with the Clustered Regularly Interspaced Short Palindromic Repeats' JOURNAL OF BIOLOGICAL CHEMISTRY vol. 283, no. 29, 18 July 2008, pages 20361 - 20371 *
RUUD JANSEN ET AL.: 'Identification of genes that are associated with DNA repeats in prokaryotes' MOLECULAR MICROBIOLOGY vol. 43, no. 6, 2002, pages 1565 - 1575 *

Cited By (294)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9115348B2 (en) 2010-05-10 2015-08-25 The Regents Of The University Of California Endoribonuclease compositions and methods of use thereof
JP2013528372A (en) * 2010-05-10 2013-07-11 The Regents Of The University Of California Endoribonuclease composition and method of use thereof
US9708646B2 (en) 2010-05-10 2017-07-18 The Regents Of The University Of California Endoribonuclease compositions and methods of use thereof
US9605246B2 (en) 2010-05-10 2017-03-28 The Regents Of The University Of California Endoribonuclease compositions and methods of use thereof
US9322006B2 (en) 2011-07-22 2016-04-26 President And Fellows Of Harvard College Evaluation and improvement of nuclease cleavage specificity
US12006520B2 (en) 2011-07-22 2024-06-11 President And Fellows Of Harvard College Evaluation and improvement of nuclease cleavage specificity
US10323236B2 (en) 2011-07-22 2019-06-18 President And Fellows Of Harvard College Evaluation and improvement of nuclease cleavage specificity
US11111521B2 (en) 2011-12-22 2021-09-07 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11566277B2 (en) 2011-12-22 2023-01-31 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11293052B2 (en) 2011-12-22 2022-04-05 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11293051B2 (en) 2011-12-22 2022-04-05 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11639518B2 (en) 2011-12-22 2023-05-02 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11976318B2 (en) 2011-12-22 2024-05-07 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11549136B2 (en) 2011-12-22 2023-01-10 President And Fellows Of Harvard College Compositions and methods for analyte detection
US11566276B2 (en) 2011-12-22 2023-01-31 President And Fellows Of Harvard College Compositions and methods for analyte detection
KR20140115335A (en) * 2011-12-30 2014-09-30 Wageningen Universiteit Modified cascade ribonucleoproteins and uses thereof
JP2015503535A (en) * 2011-12-30 2015-02-02 Wageningen Universiteit Modified CASCADE ribonucleoproteins and their uses
US9885026B2 (en) 2011-12-30 2018-02-06 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof
KR101889589B1 (en) 2011-12-30 2018-08-17 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof
WO2013098244A1 (en) * 2011-12-30 2013-07-04 Wageningen Universiteit Modified cascade ribonucleoproteins and uses thereof
US10954498B2 (en) 2011-12-30 2021-03-23 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof
GB2512246A (en) * 2011-12-30 2014-09-24 Univ Wageningen Modified cascade ribonucleoproteins and uses thereof
GB2512246B (en) * 2011-12-30 2016-07-20 Caribou Biosciences Inc Modified cascade ribonucleoproteins and uses thereof
US10435678B2 (en) 2011-12-30 2019-10-08 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof
US10711257B2 (en) 2011-12-30 2020-07-14 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof
US11939604B2 (en) 2011-12-30 2024-03-26 Caribou Biosciences, Inc. Modified cascade ribonucleoproteins and uses thereof
US10526619B2 (en) 2012-05-25 2020-01-07 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10337029B2 (en) 2012-05-25 2019-07-02 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11970711B2 (en) 2012-05-25 2024-04-30 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10988780B2 (en) 2012-05-25 2021-04-27 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11001863B2 (en) 2012-05-25 2021-05-11 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11814645B2 (en) 2012-05-25 2023-11-14 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11674159B2 (en) 2012-05-25 2023-06-13 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10752920B2 (en) 2012-05-25 2020-08-25 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10774344B1 (en) 2012-05-25 2020-09-15 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10676759B2 (en) 2012-05-25 2020-06-09 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10443076B2 (en) 2012-05-25 2019-10-15 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11634730B2 (en) 2012-05-25 2023-04-25 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10669560B2 (en) 2012-05-25 2020-06-02 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10640791B2 (en) 2012-05-25 2020-05-05 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10793878B1 (en) 2012-05-25 2020-10-06 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10626419B2 (en) 2012-05-25 2020-04-21 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11008590B2 (en) 2012-05-25 2021-05-18 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10428352B2 (en) 2012-05-25 2019-10-01 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10612045B2 (en) 2012-05-25 2020-04-07 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10597680B2 (en) 2012-05-25 2020-03-24 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10421980B2 (en) 2012-05-25 2019-09-24 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10988782B2 (en) 2012-05-25 2021-04-27 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10900054B2 (en) 2012-05-25 2021-01-26 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10577631B2 (en) 2012-05-25 2020-03-03 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10415061B2 (en) 2012-05-25 2019-09-17 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11549127B2 (en) 2012-05-25 2023-01-10 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10982230B2 (en) 2012-05-25 2021-04-20 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10407697B2 (en) 2012-05-25 2019-09-10 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10400253B2 (en) 2012-05-25 2019-09-03 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10570419B2 (en) 2012-05-25 2020-02-25 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11008589B2 (en) 2012-05-25 2021-05-18 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10385360B2 (en) 2012-05-25 2019-08-20 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10982231B2 (en) 2012-05-25 2021-04-20 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11479794B2 (en) 2012-05-25 2022-10-25 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11473108B2 (en) 2012-05-25 2022-10-18 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11401532B2 (en) 2012-05-25 2022-08-02 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11028412B2 (en) 2012-05-25 2021-06-08 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10358659B2 (en) 2012-05-25 2019-07-23 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10358658B2 (en) 2012-05-25 2019-07-23 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10000772B2 (en) 2012-05-25 2018-06-19 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10563227B2 (en) 2012-05-25 2020-02-18 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10351878B2 (en) 2012-05-25 2019-07-16 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11332761B2 (en) 2012-05-25 2022-05-17 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10550407B2 (en) 2012-05-25 2020-02-04 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10513712B2 (en) 2012-05-25 2019-12-24 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10308961B2 (en) 2012-05-25 2019-06-04 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10113167B2 (en) 2012-05-25 2018-10-30 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10301651B2 (en) 2012-05-25 2019-05-28 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10533190B2 (en) 2012-05-25 2020-01-14 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10266850B2 (en) 2012-05-25 2019-04-23 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10519467B2 (en) 2012-05-25 2019-12-31 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11293034B2 (en) 2012-05-25 2022-04-05 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10227611B2 (en) 2012-05-25 2019-03-12 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11274318B2 (en) 2012-05-25 2022-03-15 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11242543B2 (en) 2012-05-25 2022-02-08 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10487341B2 (en) 2012-05-25 2019-11-26 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US11186849B2 (en) 2012-05-25 2021-11-30 The Regents Of The University Of California Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription
US10745716B2 (en) 2012-12-06 2020-08-18 Sigma-Aldrich Co. Llc CRISPR-based genome modification and regulation
US10731181B2 (en) 2012-12-06 2020-08-04 Sigma-Aldrich Co. Llc CRISPR-based genome modification and regulation
US8697359B1 (en) 2012-12-12 2014-04-15 The Broad Institute, Inc. CRISPR-Cas systems and methods for altering expression of gene products
US8999641B2 (en) 2012-12-12 2015-04-07 The Broad Institute Inc. Engineering and optimization of systems, methods and compositions for sequence manipulation with functional domains
US8795965B2 (en) 2012-12-12 2014-08-05 The Broad Institute, Inc. CRISPR-Cas component systems, methods and compositions for sequence manipulation
US8865406B2 (en) 2012-12-12 2014-10-21 The Broad Institute Inc. Engineering and optimization of improved systems, methods and enzyme compositions for sequence manipulation
US11041173B2 (en) 2012-12-12 2021-06-22 The Broad Institute, Inc. Delivery, engineering and optimization of systems, methods and compositions for sequence manipulation and therapeutic applications
US8889356B2 (en) 2012-12-12 2014-11-18 The Broad Institute Inc. CRISPR-Cas nickase systems, methods and compositions for sequence manipulation in eukaryotes
US8906616B2 (en) 2012-12-12 2014-12-09 The Broad Institute Inc. Engineering of systems, methods and optimized guide compositions for sequence manipulation
US8771945B1 (en) 2012-12-12 2014-07-08 The Broad Institute, Inc. CRISPR-Cas systems and methods for altering expression of gene products
US8889418B2 (en) 2012-12-12 2014-11-18 The Broad Institute Inc. Engineering and optimization of improved systems, methods and enzyme compositions for sequence manipulation
US8871445B2 (en) 2012-12-12 2014-10-28 The Broad Institute Inc. CRISPR-Cas component systems, methods and compositions for sequence manipulation
US8895308B1 (en) 2012-12-12 2014-11-25 The Broad Institute Inc. Engineering and optimization of improved systems, methods and enzyme compositions for sequence manipulation
US8993233B2 (en) 2012-12-12 2015-03-31 The Broad Institute Inc. Engineering and optimization of systems, methods and compositions for sequence manipulation with functional domains
US8945839B2 (en) 2012-12-12 2015-02-03 The Broad Institute Inc. CRISPR-Cas systems and methods for altering expression of gene products
US9822372B2 (en) 2012-12-12 2017-11-21 The Broad Institute Inc. CRISPR-Cas component systems, methods and compositions for sequence manipulation
US9840713B2 (en) 2012-12-12 2017-12-12 The Broad Institute Inc. CRISPR-Cas component systems, methods and compositions for sequence manipulation
US10930367B2 (en) 2012-12-12 2021-02-23 The Broad Institute, Inc. Methods, models, systems, and apparatus for identifying target sequences for Cas enzymes or CRISPR-Cas systems for target sequences and conveying results thereof
US8932814B2 (en) 2012-12-12 2015-01-13 The Broad Institute Inc. CRISPR-Cas nickase systems, methods and compositions for sequence manipulation in eukaryotes
US11512325B2 (en) 2012-12-17 2022-11-29 President And Fellows Of Harvard College RNA-guided human genome engineering
US11535863B2 (en) 2012-12-17 2022-12-27 President And Fellows Of Harvard College RNA-guided human genome engineering
US11236359B2 (en) 2012-12-17 2022-02-01 President And Fellows Of Harvard College RNA-guided human genome engineering
US11365429B2 (en) 2012-12-17 2022-06-21 President And Fellows Of Harvard College RNA-guided human genome engineering
US9970024B2 (en) 2012-12-17 2018-05-15 President And Fellows Of Harvard College RNA-guided human genome engineering
US11359211B2 (en) 2012-12-17 2022-06-14 President And Fellows Of Harvard College RNA-guided human genome engineering
US10717990B2 (en) 2012-12-17 2020-07-21 President And Fellows Of Harvard College RNA-guided human genome engineering
US9023649B2 (en) 2012-12-17 2015-05-05 President And Fellows Of Harvard College RNA-guided human genome engineering
US10435708B2 (en) 2012-12-17 2019-10-08 President And Fellows Of Harvard College RNA-guided human genome engineering
US12018272B2 (en) 2012-12-17 2024-06-25 President And Fellows Of Harvard College RNA-guided human genome engineering
US10273501B2 (en) 2012-12-17 2019-04-30 President And Fellows Of Harvard College RNA-guided human genome engineering
US9260723B2 (en) 2012-12-17 2016-02-16 President And Fellows Of Harvard College RNA-guided human genome engineering
US11299767B2 (en) 2013-03-12 2022-04-12 President And Fellows Of Harvard College Method for generating a three-dimensional nucleic acid containing matrix
US10125361B2 (en) 2013-03-14 2018-11-13 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US11312953B2 (en) 2013-03-14 2022-04-26 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US9260752B1 (en) 2013-03-14 2016-02-16 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US9410198B2 (en) 2013-03-14 2016-08-09 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US9725714B2 (en) 2013-03-14 2017-08-08 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US9803194B2 (en) 2013-03-14 2017-10-31 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US9809814B1 (en) 2013-03-14 2017-11-07 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US9909122B2 (en) 2013-03-14 2018-03-06 Caribou Biosciences, Inc. Compositions and methods of nucleic acid-targeting nucleic acids
US10544433B2 (en) 2013-03-15 2020-01-28 The General Hospital Corporation Using RNA-guided FokI nucleases (RFNs) to increase specificity for RNA-guided genome editing
US9885033B2 (en) 2013-03-15 2018-02-06 The General Hospital Corporation Increasing specificity for RNA-guided genome editing
US10415059B2 (en) 2013-03-15 2019-09-17 The General Hospital Corporation Using truncated guide RNAs (tru-gRNAs) to increase specificity for RNA-guided genome editing
US10526589B2 (en) 2013-03-15 2020-01-07 The General Hospital Corporation Multiplex guide RNAs
US11920152B2 (en) 2013-03-15 2024-03-05 The General Hospital Corporation Increasing specificity for RNA-guided genome editing
US10760064B2 (en) 2013-03-15 2020-09-01 The General Hospital Corporation RNA-guided targeting of genetic and epigenomic regulatory proteins to specific genomic loci
US10138476B2 (en) 2013-03-15 2018-11-27 The General Hospital Corporation Using RNA-guided FokI nucleases (RFNs) to increase specificity for RNA-guided genome editing
US9567604B2 (en) 2013-03-15 2017-02-14 The General Hospital Corporation Using truncated guide RNAs (tru-gRNAs) to increase specificity for RNA-guided genome editing
US10119133B2 (en) 2013-03-15 2018-11-06 The General Hospital Corporation Using truncated guide RNAs (tru-gRNAs) to increase specificity for RNA-guided genome editing
US11168338B2 (en) 2013-03-15 2021-11-09 The General Hospital Corporation RNA-guided targeting of genetic and epigenomic regulatory proteins to specific genomic loci
US11098326B2 (en) 2013-03-15 2021-08-24 The General Hospital Corporation Using RNA-guided FokI nucleases (RFNs) to increase specificity for RNA-guided genome editing
US10378027B2 (en) 2013-03-15 2019-08-13 The General Hospital Corporation RNA-guided targeting of genetic and epigenomic regulatory proteins to specific genomic loci
US9567603B2 (en) 2013-03-15 2017-02-14 The General Hospital Corporation Using RNA-guided FokI nucleases (RFNs) to increase specificity for RNA-guided genome editing
US11634731B2 (en) 2013-03-15 2023-04-25 The General Hospital Corporation Using truncated guide RNAs (tru-gRNAs) to increase specificity for RNA-guided genome editing
US10844403B2 (en) 2013-03-15 2020-11-24 The General Hospital Corporation Increasing specificity for RNA-guided genome editing
US10767194B2 (en) 2013-06-04 2020-09-08 President And Fellows Of Harvard College RNA-guided transcriptional regulation
US9267135B2 (en) 2013-06-04 2016-02-23 President And Fellows Of Harvard College RNA-guided transcriptional regulation
US10640789B2 (en) 2013-06-04 2020-05-05 President And Fellows Of Harvard College RNA-guided transcriptional regulation
US11981917B2 (en) 2013-06-04 2024-05-14 President And Fellows Of Harvard College RNA-guided transcriptional regulation
US10711285B2 (en) 2013-06-17 2020-07-14 The Broad Institute, Inc. Optimized CRISPR-Cas double nickase systems, methods and compositions for sequence manipulation
US10781444B2 (en) 2013-06-17 2020-09-22 The Broad Institute, Inc. Functional genomics using CRISPR-Cas systems, compositions, methods, screens and applications thereof
US10577630B2 (en) 2013-06-17 2020-03-03 The Broad Institute, Inc. Delivery and use of the CRISPR-Cas systems, vectors and compositions for hepatic targeting and therapy
US10946108B2 (en) 2013-06-17 2021-03-16 The Broad Institute, Inc. Delivery, use and therapeutic applications of the CRISPR-Cas systems and compositions for targeting disorders and diseases using viral components
US11008588B2 (en) 2013-06-17 2021-05-18 The Broad Institute, Inc. Delivery, engineering and optimization of tandem guide systems, methods and compositions for sequence manipulation
US11597949B2 (en) 2013-06-17 2023-03-07 The Broad Institute, Inc. Optimized CRISPR-Cas double nickase systems, methods and compositions for sequence manipulation
US10011850B2 (en) 2013-06-21 2018-07-03 The General Hospital Corporation Using RNA-guided FokI Nucleases (RFNs) to increase specificity for RNA-Guided Genome Editing
US10329587B2 (en) 2013-07-10 2019-06-25 President And Fellows Of Harvard College Orthogonal Cas9 proteins for RNA-guided gene regulation and editing
US9587252B2 (en) 2013-07-10 2017-03-07 President And Fellows Of Harvard College Orthogonal Cas9 proteins for RNA-guided gene regulation and editing
US11649469B2 (en) 2013-07-10 2023-05-16 President And Fellows Of Harvard College Orthogonal Cas9 proteins for RNA-guided gene regulation and editing
US11306328B2 (en) 2013-07-26 2022-04-19 President And Fellows Of Harvard College Genome engineering
US10563225B2 (en) 2013-07-26 2020-02-18 President And Fellows Of Harvard College Genome engineering
US9163284B2 (en) 2013-08-09 2015-10-20 President And Fellows Of Harvard College Methods for identifying a target site of a Cas9 nuclease
US11920181B2 (en) 2013-08-09 2024-03-05 President And Fellows Of Harvard College Nuclease profiling system
US10508298B2 (en) 2013-08-09 2019-12-17 President And Fellows Of Harvard College Methods for identifying a target site of a CAS9 nuclease
US10954548B2 (en) 2013-08-09 2021-03-23 President And Fellows Of Harvard College Nuclease profiling system
US10227581B2 (en) 2013-08-22 2019-03-12 President And Fellows Of Harvard College Engineered transcription activator-like effector (TALE) domains and uses thereof
US9359599B2 (en) 2013-08-22 2016-06-07 President And Fellows Of Harvard College Engineered transcription activator-like effector (TALE) domains and uses thereof
US11046948B2 (en) 2013-08-22 2021-06-29 President And Fellows Of Harvard College Engineered transcription activator-like effector (TALE) domains and uses thereof
US9340800B2 (en) 2013-09-06 2016-05-17 President And Fellows Of Harvard College Extended DNA-sensing GRNAS
US9388430B2 (en) 2013-09-06 2016-07-12 President And Fellows Of Harvard College Cas9-recombinase fusion proteins and uses thereof
US9999671B2 (en) 2013-09-06 2018-06-19 President And Fellows Of Harvard College Delivery of negatively charged proteins using cationic lipids
US9526784B2 (en) 2013-09-06 2016-12-27 President And Fellows Of Harvard College Delivery system for functional nucleases
US9737604B2 (en) 2013-09-06 2017-08-22 President And Fellows Of Harvard College Use of cationic lipids to deliver CAS9
US10858639B2 (en) 2013-09-06 2020-12-08 President And Fellows Of Harvard College CAS9 variants and uses thereof
US11299755B2 (en) 2013-09-06 2022-04-12 President And Fellows Of Harvard College Switchable CAS9 nucleases and uses thereof
US10597679B2 (en) 2013-09-06 2020-03-24 President And Fellows Of Harvard College Switchable Cas9 nucleases and uses thereof
US10912833B2 (en) 2013-09-06 2021-02-09 President And Fellows Of Harvard College Delivery of negatively charged proteins using cationic lipids
US9228207B2 (en) 2013-09-06 2016-01-05 President And Fellows Of Harvard College Switchable gRNAs comprising aptamers
US10682410B2 (en) 2013-09-06 2020-06-16 President And Fellows Of Harvard College Delivery system for functional nucleases
US9322037B2 (en) 2013-09-06 2016-04-26 President And Fellows Of Harvard College Cas9-FokI fusion proteins and uses thereof
US9340799B2 (en) 2013-09-06 2016-05-17 President And Fellows Of Harvard College MRNA-sensing switchable gRNAs
US11920128B2 (en) 2013-09-18 2024-03-05 Kymab Limited Methods, cells and organisms
US11149267B2 (en) 2013-10-28 2021-10-19 The Broad Institute, Inc. Functional genomics using CRISPR-Cas systems, compositions, methods, screens and applications thereof
US9834791B2 (en) 2013-11-07 2017-12-05 Editas Medicine, Inc. CRISPR-related methods and compositions with governing gRNAS
US10640788B2 (en) 2013-11-07 2020-05-05 Editas Medicine, Inc. CRISPR-related methods and compositions with governing gRNAs
US11390887B2 (en) 2013-11-07 2022-07-19 Editas Medicine, Inc. CRISPR-related methods and compositions with governing gRNAS
US10190137B2 (en) 2013-11-07 2019-01-29 Editas Medicine, Inc. CRISPR-related methods and compositions with governing gRNAS
US10100291B2 (en) 2013-11-19 2018-10-16 President And Fellows Of Harvard College Mutant Cas9 proteins
US10787684B2 (en) 2013-11-19 2020-09-29 President And Fellows Of Harvard College Large gene excision and insertion
US10683490B2 (en) 2013-11-19 2020-06-16 President And Fellows Of Harvard College Mutant Cas9 proteins
US10435679B2 (en) 2013-11-19 2019-10-08 President And Fellows Of Harvard College Mutant Cas9 proteins
US11286470B2 (en) 2013-11-19 2022-03-29 President And Fellows Of Harvard College Mutant Cas9 proteins
US9074199B1 (en) 2013-11-19 2015-07-07 President And Fellows Of Harvard College Mutant Cas9 proteins
US10465176B2 (en) 2013-12-12 2019-11-05 President And Fellows Of Harvard College Cas variants for gene editing
US11407985B2 (en) 2013-12-12 2022-08-09 The Broad Institute, Inc. Delivery, use and therapeutic applications of the CRISPR-Cas systems and compositions for genome editing
US11597919B2 (en) 2013-12-12 2023-03-07 The Broad Institute Inc. Systems, methods and compositions for sequence manipulation with optimized functional CRISPR-Cas systems
US11053481B2 (en) 2013-12-12 2021-07-06 President And Fellows Of Harvard College Fusions of Cas9 domains and nucleic acid-editing domains
US11155795B2 (en) 2013-12-12 2021-10-26 The Broad Institute, Inc. CRISPR-Cas systems, crystal structure and uses thereof
US10377998B2 (en) 2013-12-12 2019-08-13 The Broad Institute, Inc. CRISPR-CAS systems and methods for altering expression of gene products, structural information and inducible modular CAS enzymes
US11149259B2 (en) 2013-12-12 2021-10-19 The Broad Institute, Inc. CRISPR-Cas systems and methods for altering expression of gene products, structural information and inducible modular Cas enzymes
US10851357B2 (en) 2013-12-12 2020-12-01 The Broad Institute, Inc. Compositions and methods of use of CRISPR-Cas systems in nucleotide repeat disorders
US10550372B2 (en) 2013-12-12 2020-02-04 The Broad Institute, Inc. Systems, methods and compositions for sequence manipulation with optimized functional CRISPR-Cas systems
US9840699B2 (en) 2013-12-12 2017-12-12 President And Fellows Of Harvard College Methods for nucleic acid editing
US9068179B1 (en) 2013-12-12 2015-06-30 President And Fellows Of Harvard College Methods for correcting presenilin point mutations
US11124782B2 (en) 2013-12-12 2021-09-21 President And Fellows Of Harvard College Cas variants for gene editing
US9938521B2 (en) 2014-03-10 2018-04-10 Editas Medicine, Inc. CRISPR/CAS-related methods and compositions for treating leber's congenital amaurosis 10 (LCA10)
US10501794B2 (en) 2014-06-23 2019-12-10 The General Hospital Corporation Genomewide unbiased identification of DSBs evaluated by sequencing (GUIDE-seq)
US11578343B2 (en) 2014-07-30 2023-02-14 President And Fellows Of Harvard College CAS9 proteins including ligand-dependent inteins
US10077453B2 (en) 2014-07-30 2018-09-18 President And Fellows Of Harvard College CAS9 proteins including ligand-dependent inteins
US10704062B2 (en) 2014-07-30 2020-07-07 President And Fellows Of Harvard College CAS9 proteins including ligand-dependent inteins
US11234418B2 (en) 2014-12-10 2022-02-01 Regents Of The University Of Minnesota Genetically modified cells, tissues, and organs for treating disease
US10993419B2 (en) 2014-12-10 2021-05-04 Regents Of The University Of Minnesota Genetically modified cells, tissues, and organs for treating disease
US9888673B2 (en) 2014-12-10 2018-02-13 Regents Of The University Of Minnesota Genetically modified cells, tissues, and organs for treating disease
US10278372B2 (en) 2014-12-10 2019-05-07 Regents Of The University Of Minnesota Genetically modified cells, tissues, and organs for treating disease
US10696986B2 (en) 2014-12-12 2020-06-30 The Broad Institute, Inc. Protected guide RNAS (PGRNAS)
US11624078B2 (en) 2014-12-12 2023-04-11 The Broad Institute, Inc. Protected guide RNAS (pgRNAS)
US9926545B2 (en) 2015-03-03 2018-03-27 The General Hospital Corporation Engineered CRISPR-CAS9 nucleases with altered PAM specificity
US9944912B2 (en) 2015-03-03 2018-04-17 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US11859220B2 (en) 2015-03-03 2024-01-02 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US11220678B2 (en) 2015-03-03 2022-01-11 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
EP3858990A1 (en) 2015-03-03 2021-08-04 The General Hospital Corporation Engineered crispr-cas9 nucleases with altered pam specificity
US10202589B2 (en) 2015-03-03 2019-02-12 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
WO2016141224A1 (en) 2015-03-03 2016-09-09 The General Hospital Corporation Engineered crispr-cas9 nucleases with altered pam specificity
US10808233B2 (en) 2015-03-03 2020-10-20 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US10479982B2 (en) 2015-03-03 2019-11-19 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US10767168B2 (en) 2015-03-03 2020-09-08 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US9752132B2 (en) 2015-03-03 2017-09-05 The General Hospital Corporation Engineered CRISPR-CAS9 nucleases with altered PAM specificity
US10494621B2 (en) 2015-06-18 2019-12-03 The Broad Institute, Inc. Crispr enzyme mutations reducing off-target effects
US10876100B2 (en) 2015-06-18 2020-12-29 The Broad Institute, Inc. Crispr enzyme mutations reducing off-target effects
US11578312B2 (en) 2015-06-18 2023-02-14 The Broad Institute Inc. Engineering and optimization of systems, methods, enzymes and guide scaffolds of CAS9 orthologs and variants for sequence manipulation
US11642375B2 (en) 2015-07-31 2023-05-09 Intima Bioscience, Inc. Intracellular genomic transplant and methods of therapy
US10166255B2 (en) 2015-07-31 2019-01-01 Regents Of The University Of Minnesota Intracellular genomic transplant and methods of therapy
US11903966B2 (en) 2015-07-31 2024-02-20 Regents Of The University Of Minnesota Intracellular genomic transplant and methods of therapy
US11642374B2 (en) 2015-07-31 2023-05-09 Intima Bioscience, Inc. Intracellular genomic transplant and methods of therapy
US11147837B2 (en) 2015-07-31 2021-10-19 Regents Of The University Of Minnesota Modified cells and methods of therapy
US10406177B2 (en) 2015-07-31 2019-09-10 Regents Of The University Of Minnesota Modified cells and methods of therapy
US11583556B2 (en) 2015-07-31 2023-02-21 Regents Of The University Of Minnesota Modified cells and methods of therapy
US11925664B2 (en) 2015-07-31 2024-03-12 Intima Bioscience, Inc. Intracellular genomic transplant and methods of therapy
US11266692B2 (en) 2015-07-31 2022-03-08 Regents Of The University Of Minnesota Intracellular genomic transplant and methods of therapy
US10526591B2 (en) 2015-08-28 2020-01-07 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
US9926546B2 (en) 2015-08-28 2018-03-27 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
EP4036236A1 (en) 2015-08-28 2022-08-03 The General Hospital Corporation Engineered crispr-cas9 nucleases
US10633642B2 (en) 2015-08-28 2020-04-28 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
WO2017040348A1 (en) 2015-08-28 2017-03-09 The General Hospital Corporation Engineered crispr-cas9 nucleases
US10093910B2 (en) 2015-08-28 2018-10-09 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
US11060078B2 (en) 2015-08-28 2021-07-13 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
US9512446B1 (en) 2015-08-28 2016-12-06 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases
US11028429B2 (en) 2015-09-11 2021-06-08 The General Hospital Corporation Full interrogation of nuclease DSBs and sequencing (FIND-seq)
US10738303B2 (en) 2015-09-30 2020-08-11 The General Hospital Corporation Comprehensive in vitro reporting of cleavage events by sequencing (CIRCLE-seq)
US10167457B2 (en) 2015-10-23 2019-01-01 President And Fellows Of Harvard College Nucleobase editors and uses thereof
US11214780B2 (en) 2015-10-23 2022-01-04 President And Fellows Of Harvard College Nucleobase editors and uses thereof
US11542554B2 (en) 2015-11-03 2023-01-03 President And Fellows Of Harvard College Method and apparatus for volumetric imaging
US11452750B2 (en) 2016-01-27 2022-09-27 Oncorus, Inc. Oncolytic viral vectors and uses thereof
EP4089166A1 (en) 2016-01-27 2022-11-16 Oncorus, Inc. Oncolytic viral vectors and uses thereof
US10391132B2 (en) 2016-01-27 2019-08-27 Oncorus, Inc. Oncolytic viral vectors and uses thereof
WO2017132552A1 (en) 2016-01-27 2017-08-03 Oncorus, Inc. Oncolytic viral vectors and uses thereof
US11713485B2 (en) 2016-04-25 2023-08-01 President And Fellows Of Harvard College Hybridization chain reaction methods for in situ molecular detection
US10266851B2 (en) 2016-06-02 2019-04-23 Sigma-Aldrich Co. Llc Using programmable DNA binding proteins to enhance targeted genome modification
US10947530B2 (en) 2016-08-03 2021-03-16 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
US11702651B2 (en) 2016-08-03 2023-07-18 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
US10113163B2 (en) 2016-08-03 2018-10-30 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
US11078481B1 (en) 2016-08-03 2021-08-03 KSQ Therapeutics, Inc. Methods for screening for cancer targets
US11999947B2 (en) 2016-08-03 2024-06-04 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
US11912987B2 (en) 2016-08-03 2024-02-27 KSQ Therapeutics, Inc. Methods for screening for cancer targets
US11661590B2 (en) 2016-08-09 2023-05-30 President And Fellows Of Harvard College Programmable CAS9-recombinase fusion proteins and uses thereof
US11542509B2 (en) 2016-08-24 2023-01-03 President And Fellows Of Harvard College Incorporation of unnatural amino acids into proteins using base editing
US11078483B1 (en) 2016-09-02 2021-08-03 KSQ Therapeutics, Inc. Methods for measuring and improving CRISPR reagent function
US11946163B2 (en) 2016-09-02 2024-04-02 KSQ Therapeutics, Inc. Methods for measuring and improving CRISPR reagent function
WO2018071892A1 (en) 2016-10-14 2018-04-19 Joung J Keith Epigenetically regulated site-specific nucleases
US11306324B2 (en) 2016-10-14 2022-04-19 President And Fellows Of Harvard College AAV delivery of nucleobase editors
US11154574B2 (en) 2016-10-18 2021-10-26 Regents Of The University Of Minnesota Tumor infiltrating lymphocytes and methods of therapy
US10912797B2 (en) 2016-10-18 2021-02-09 Intima Bioscience, Inc. Tumor infiltrating lymphocytes and methods of therapy
US10745677B2 (en) 2016-12-23 2020-08-18 President And Fellows Of Harvard College Editing of CCR5 receptor gene to protect against HIV infection
US11820969B2 (en) 2016-12-23 2023-11-21 President And Fellows Of Harvard College Editing of CCR2 receptor gene to protect against HIV infection
US11898179B2 (en) 2017-03-09 2024-02-13 President And Fellows Of Harvard College Suppression of pain by gene editing
US11542496B2 (en) 2017-03-10 2023-01-03 President And Fellows Of Harvard College Cytosine to guanine base editor
US11268082B2 (en) 2017-03-23 2022-03-08 President And Fellows Of Harvard College Nucleobase editors comprising nucleic acid programmable DNA binding proteins
WO2018195545A2 (en) 2017-04-21 2018-10-25 The General Hospital Corporation Variants of cpf1 (cas12a) with altered pam specificity
US11560566B2 (en) 2017-05-12 2023-01-24 President And Fellows Of Harvard College Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation
WO2018218206A1 (en) 2017-05-25 2018-11-29 The General Hospital Corporation Bipartite base editor (bbe) architectures and type-ii-c-cas9 zinc finger editing
WO2018218166A1 (en) 2017-05-25 2018-11-29 The General Hospital Corporation Using split deaminases to limit unwanted off-target base editor deamination
US11098325B2 (en) 2017-06-30 2021-08-24 Intima Bioscience, Inc. Adeno-associated viral vectors for gene therapy
WO2019023483A1 (en) 2017-07-26 2019-01-31 Oncorus, Inc. Oncolytic viral vectors and uses thereof
US11612625B2 (en) 2017-07-26 2023-03-28 Oncorus, Inc. Oncolytic viral vectors and uses thereof
US11732274B2 (en) 2017-07-28 2023-08-22 President And Fellows Of Harvard College Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE)
US11624058B2 (en) 2017-08-23 2023-04-11 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US11286468B2 (en) 2017-08-23 2022-03-29 The General Hospital Corporation Engineered CRISPR-Cas9 nucleases with altered PAM specificity
US11319532B2 (en) 2017-08-30 2022-05-03 President And Fellows Of Harvard College High efficiency base editors comprising Gam
US11932884B2 (en) 2017-08-30 2024-03-19 President And Fellows Of Harvard College High efficiency base editors comprising Gam
US11725228B2 (en) 2017-10-11 2023-08-15 The General Hospital Corporation Methods for detecting site-specific and spurious genomic deamination induced by base editing technologies
US11795443B2 (en) 2017-10-16 2023-10-24 The Broad Institute, Inc. Uses of adenosine base editors
US11976324B2 (en) 2018-04-17 2024-05-07 The General Hospital Corporation Highly sensitive in vitro assays to define substrate preferences and sites of nucleic-acid binding, modifying, and cleaving agents
US11898203B2 (en) 2018-04-17 2024-02-13 The General Hospital Corporation Highly sensitive in vitro assays to define substrate preferences and sites of nucleic-acid binding, modifying, and cleaving agents
US11845987B2 (en) 2018-04-17 2023-12-19 The General Hospital Corporation Highly sensitive in vitro assays to define substrate preferences and sites of nucleic acid cleaving agents
WO2020163396A1 (en) 2019-02-04 2020-08-13 The General Hospital Corporation Adenine dna base editor variants with reduced off-target rna editing
US11643652B2 (en) 2019-03-19 2023-05-09 The Broad Institute, Inc. Methods and compositions for prime editing nucleotide sequences
US11447770B1 (en) 2019-03-19 2022-09-20 The Broad Institute, Inc. Methods and compositions for prime editing nucleotide sequences
US11795452B2 (en) 2019-03-19 2023-10-24 The Broad Institute, Inc. Methods and compositions for prime editing nucleotide sequences
US11912985B2 (en) 2020-05-08 2024-02-27 The Broad Institute, Inc. Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
EP4198124A1 (en) 2021-12-15 2023-06-21 Versitech Limited Engineered cas9-nucleases and method of use thereof
US12031126B2 (en) 2023-12-08 2024-07-09 The Broad Institute, Inc. Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence

Also Published As

Publication number Publication date
WO2010054108A9 (en) 2010-12-29
WO2010054108A8 (en) 2011-05-05
US20110217739A1 (en) 2011-09-08
WO2010054108A3 (en) 2010-09-16
US9404098B2 (en) 2016-08-02

Similar Documents

Publication Publication Date Title
US9404098B2 (en) Method for cleaving a target RNA using a Cas6 polypeptide
Hegge et al. DNA-guided DNA cleavage at moderate temperatures by Clostridium butyricum Argonaute
Carte et al. Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes
US9422553B2 (en) Prokaryotic RNAi-like system and methods of use
Kuzmenko et al. Programmable DNA cleavage by Ago nucleases from mesophilic bacteria Clostridium butyricum and Limnothrix rosea
Kageyama et al. An alkaline phosphatase/phosphodiesterase, PhoD, induced by salt stress and secreted out of the cells of Aphanothece halophytica, a halotolerant cyanobacterium
Hartmann et al. Crystal structure of the 2′-specific and double-stranded RNA-activated interferon-induced antiviral protein 2′-5′-oligoadenylate synthetase
Nakanishi et al. Structure of yeast Argonaute with guide RNA
Holzmann et al. RNase P without RNA: identification and functional reconstitution of the human mitochondrial tRNA processing enzyme
Gesner et al. Recognition and maturation of effector RNAs in a CRISPR interference pathway
Gu et al. Insights into the structure, mechanism, and regulation of scavenger mRNA decapping activity
Weinberg et al. The inside-out mechanism of Dicers from budding yeasts
Ahmad et al. RNA topoisomerase is prevalent in all domains of life and associates with polyribosomes in animals
Minczuk et al. Localisation of the human hSuv3p helicase in the mitochondrial matrix and its preferential unwinding of dsDNA
Levy et al. Identification of LACTB2, a metallo-β-lactamase protein, as a human mitochondrial endoribonuclease
Bai et al. Structural basis for dimerization and activity of human PAPD1, a noncanonical poly(A) polymerase
Dominski et al. Emergence of the β-CASP ribonucleases: highly conserved and ubiquitous metallo-enzymes involved in messenger RNA maturation and degradation
Banroques et al. Analyses of the functional regions of DEAD-box RNA “helicases” with deletion and chimera constructs tested in vivo and in vitro
Moeder et al. Crystal structure and biochemical analyses reveal that the Arabidopsis triphosphate tunnel metalloenzyme AtTTM3 is a tripolyphosphatase involved in root development
Silva et al. Structure and activity of a novel archaeal β-CASP protein with N-terminal KH domains
Babu et al. Sinorhizobium meliloti YbeY is a zinc-dependent single-strand specific endoribonuclease that plays an important role in 16S ribosomal RNA processing
Landthaler et al. The nicking homing endonuclease I‐BasI is encoded by a group I intron in the DNA polymerase gene of the Bacillus thuringiensis phage Bastille
Hirata et al. Cleavage of intron from the standard or non-standard position of the precursor tRNA by the splicing endonuclease of Aeropyrum pernix, a hyper-thermophilic Crenarchaeon, involves a novel RNA recognition site in the Crenarchaea specific loop
Richter et al. A mitochondrial rRNA dimethyladenosine methyltransferase in Arabidopsis
Zimmer et al. Genome-based analysis of Chlamydomonas reinhardtii exoribonucleases and poly(A) polymerases predicts unexpected organellar and exosomal features

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 09825428

Country of ref document: EP

Kind code of ref document: A2

WWE WIPO information: entry into national phase

Ref document number: 13127764

Country of ref document: US

NENP Non-entry into the national phase in:

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 09825428

Country of ref document: EP

Kind code of ref document: A2