CN112639121A - Amplification compositions, systems, and methods based on CRISPR double nickases

Publication number
CN112639121A
Authority
CN
China
Prior art keywords
nucleic acid
crispr
cas
nickase
target
Legal status
Pending
Application number
CN201980055278.XA
Other languages
Chinese (zh)
Inventor
F. Zhang
M. Kellner
J. Gootenberg
O. Abudayyeh
Current Assignee
Harvard College
Massachusetts Institute of Technology
Broad Institute Inc
Original Assignee
Harvard College
Massachusetts Institute of Technology
Broad Institute Inc
Application filed by Harvard College, Massachusetts Institute of Technology, Broad Institute Inc

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C12Q2527/00 Reactions demanding special reaction conditions
    • C12Q2527/101 Temperature
    • C12Q2527/125 Specific component of sample, medium or buffer
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Plant Pathology (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Enzymes And Modification Thereof (AREA)

Abstract

Embodiments disclosed herein utilize RNA-targeted effectors to provide robust CRISPR-based nucleic acid amplification methods and systems. Embodiments disclosed herein can amplify both double-stranded nucleic acid targets and single-stranded nucleic acid targets. Furthermore, embodiments disclosed herein can be combined with various detection platforms, such as CRISPR-SHERLOCK, to enable detection and diagnosis with attomolar sensitivity. Such embodiments can be used in a variety of situations in human health, including, for example, viral detection, bacterial strain typing, sensitive genotyping, and detection of disease-associated cell-free DNA.

Description

Amplification compositions, systems, and methods based on CRISPR double nickases
Cross Reference to Related Applications
This application claims the benefit of U.S. Provisional Application No. 62/767,059, filed November 14, 2018, and U.S. Provisional Application No. 62/690,278, filed June 26, 2018. The entire contents of the above-identified applications are hereby fully incorporated herein by reference.
Statement regarding federally sponsored research
This invention was made with government support under grant numbers MH100706, MH110049, and HL141201 awarded by the National Institutes of Health. The government has certain rights in the invention.
Technical Field
The subject matter disclosed herein relates generally to nucleic acid amplification methods, systems, and rapid diagnostics related to the use of CRISPR effector systems.
Background
Nucleic acids are a universal marker of biological information. The ability to rapidly detect nucleic acids with high sensitivity and single-base specificity on portable platforms has the potential to revolutionize the diagnosis and monitoring of many diseases, provide valuable epidemiological information, and serve as a universal scientific tool. Although many methods for detecting nucleic acids have been developed (Du et al., 2017; Green et al., 2014; Kumar et al., 2014; Pardee et al., 2016; Urdea et al., 2006), they inevitably involve tradeoffs among sensitivity, specificity, simplicity, and speed. For example, qPCR methods are sensitive but expensive and rely on complex instrumentation, limiting their usability to highly trained operators in laboratory settings. As nucleic acid diagnostics become increasingly relevant for various healthcare applications, detection techniques that provide high specificity and sensitivity at low cost will have great utility in both clinical and basic research settings.
Many nucleic acid amplification methods are available for use in a variety of detection platforms. Among them, isothermal nucleic acid amplification methods have been developed that can perform amplification without rigorous temperature cycling or complicated instruments. These methods include Nucleic Acid Sequence Based Amplification (NASBA), Recombinase Polymerase Amplification (RPA), loop-mediated isothermal amplification (LAMP), Strand Displacement Amplification (SDA), Helicase Dependent Amplification (HDA), and nicking enzyme amplification reaction (NEAR). However, these isothermal amplification methods may still require an initial denaturation step and multiple sets of primers. Furthermore, new approaches that combine isothermal nucleic acid amplification with portable platforms (Du et al., 2017; Pardee et al., 2016) offer high detection specificity in point-of-care (POC) settings, but their applications are somewhat limited by low sensitivity.
Disclosure of Invention
The present disclosure relates generally to nickase-based nucleic acid amplification and detection methods.
In certain exemplary embodiments, the present invention provides a method of amplifying and/or detecting a target double-stranded nucleic acid, the method comprising: (a) combining a sample comprising a target double-stranded nucleic acid with an amplification reaction mixture comprising: (i) an amplifying CRISPR system comprising a first CRISPR/Cas complex and a second CRISPR/Cas complex, the first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first location on a target nucleic acid, and the second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second location on the target nucleic acid; and (ii) a polymerase; (b) amplifying the target nucleic acid; (c) adding to the reaction mixture a primer pair comprising a first primer and a second primer, the first primer comprising a portion complementary to a first position of the target nucleic acid and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to a second position of the target nucleic acid and a portion comprising a binding site for the second guide molecule; and (d) further amplifying the target nucleic acid by repeating the extension and nicking under isothermal conditions.
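The two-part primer described in step (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and example sequences are hypothetical, and placing the guide-binding-site portion at the 5' end (so that the target-complementary portion sits at the 3' end, where a polymerase would extend it) is an assumption, not something the text above specifies.

```python
# Minimal sketch: assemble a primer from a guide-binding-site portion and a
# portion complementary to a chosen target region. All names and sequences
# are hypothetical illustrations.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def build_primer(guide_binding_site: str, target_region: str) -> str:
    """Concatenate a 5' guide-binding-site portion with a 3' portion that is
    complementary to `target_region` (assumed ordering, see lead-in)."""
    return guide_binding_site.upper() + reverse_complement(target_region)


if __name__ == "__main__":
    # Hypothetical 4-nt pieces, far shorter than any real primer.
    print(build_primer("GGGG", "AACG"))
```

Real primers would be considerably longer, and the guide-binding-site portion would have to satisfy the requirements of the chosen nickase (for example, any PAM constraint), which this sketch does not model.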
In embodiments, the first position and the second position are on the same strand of the target nucleic acid. In other embodiments, the first and second positions are on a first strand and a second strand, respectively, of the double-stranded target nucleic acid. In embodiments in which the first and second positions are on the first and second strands of the target nucleic acid, the amplifying can include: nicking the first strand and the second strand of the target nucleic acid using the first CRISPR/Cas complex and the second CRISPR/Cas complex, and displacing and extending the nicked strands using a polymerase, thereby generating a duplex comprising the target nucleic acid sequence between the first nick site and the second nick site.
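As a rough illustration of the duplex "between the first nick site and the second nick site", the following Python sketch locates two guide sites on opposite strands of a toy target and returns the segment between the resulting nick coordinates. The helper names, the example sequences, and the nick offsets are all hypothetical; where a real nickase cuts relative to its guide site depends on the enzyme and is not modeled here.

```python
# Minimal sketch: find two guide sites (one per strand) and extract the
# duplex segment lying between the two nick coordinates.
from typing import Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_nick(top: str, site: str, on_bottom: bool = False, offset: int = 0) -> int:
    """Return a nick coordinate in top-strand numbering.

    `site` is read 5'->3' on the strand it matches; for a bottom-strand
    site its reverse complement is searched for in the top strand.
    `offset` is a hypothetical stand-in for the enzyme's cut position.
    """
    query = reverse_complement(site) if on_bottom else site.upper()
    index = top.upper().find(query)
    if index < 0:
        raise ValueError("guide site not found in target")
    return index + offset


def duplex_between_nicks(top: str, nick1: int, nick2: int) -> Tuple[str, str]:
    """Return (top segment, bottom segment), both written 5'->3'."""
    start, end = sorted((nick1, nick2))
    if not 0 <= start < end <= len(top):
        raise ValueError("nick coordinates out of range")
    segment = top[start:end].upper()
    return segment, reverse_complement(segment)


if __name__ == "__main__":
    target_top = "TTACGGATCCAGTCAA"  # toy target, top strand 5'->3'
    n1 = find_nick(target_top, "ACGG")                            # first-strand site
    n2 = find_nick(target_top, "GACT", on_bottom=True, offset=4)  # second-strand site
    print(duplex_between_nicks(target_top, n1, n2))
```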
In certain embodiments, the Cas-based nickase may be selected from the group consisting of: Cas9 nickase, Cpf1 nickase, and C2c1 nickase.
In one embodiment, the Cas-based nickase is a Cas9 nickase protein comprising a mutation in the HNH domain. In one embodiment, the Cas-based nickase is a Cas9 nickase protein comprising a mutation corresponding to N863A in SpCas9 or N580A in SaCas9. The Cas-based nickase may be a Cas9 protein derived from a bacterial species selected from the group consisting of: Streptococcus pyogenes, Staphylococcus aureus, Streptococcus thermophilus, S. mutans, S. agalactiae, S. equisimilis, S. sanguinis, S. pneumoniae; C. jejuni, C. coli; N. salsuginis, N. tergarcus; S. auricularis, S. carnosus; N. meningitidis, N. gonorrhoeae; L. monocytogenes, L. ivanovii; C. botulinum, C. difficile, C. tetani, C. sordellii; Francisella tularensis 1, Prevotella albensis, Lachnospiraceae bacterium MC2017 1, Butyrivibrio proteoclasticus, Peregrinibacteria bacterium GW2011_GWA2_33_10, Parcubacteria bacterium GW2011_GWC2_44_17, Smithella sp. SCADC, Acidaminococcus sp. BV3L6, Lachnospiraceae bacterium MA2020, Candidatus Methanoplasma termitum, Eubacterium eligens, Moraxella bovoculi 237, Leptospira inadai, Lachnospiraceae bacterium ND2006, Porphyromonas crevioricanis 3, Prevotella disiens, and Porphyromonas macacae.
In one embodiment, the Cas-based nickase is a Cpf1 nickase protein comprising a mutation in the Nuc domain. In another embodiment, the Cas-based nickase is a Cpf1 nickase protein comprising a mutation corresponding to R1226A in AsCpf1. The Cas-based nickase may be a Cpf1 protein derived from a bacterial species selected from the group consisting of: Francisella tularensis, Prevotella albensis, Lachnospiraceae bacterium, Butyrivibrio proteoclasticus, Peregrinibacteria bacterium, Parcubacteria bacterium, Smithella sp., Acidaminococcus sp., Lachnospiraceae bacterium, Candidatus Methanoplasma termitum, Eubacterium eligens, Moraxella bovoculi, Leptospira inadai, Porphyromonas crevioricanis, Prevotella disiens, Porphyromonas macacae, Succinivibrio dextrinosolvens, Prevotella disiens, Flavobacterium branchiophilum, Helcococcus kunzii, Eubacterium sp., Microgenomates (Roizmanbacteria) bacterium, Flavobacterium sp., Prevotella brevis, Moraxella caprae, Bacteroidetes oral, Porphyromonas cansulci, Synergistes jonesii, Prevotella bryantii, Anaerovibrio sp., Butyrivibrio fibrisolvens, Candidatus Methanomethylophilus, Butyrivibrio sp., Oribacterium sp., Pseudobutyrivibrio ruminis, and Proteocatella sphenisci.
In one embodiment, the Cas-based nickase is a C2c1 nickase protein comprising a mutation in the Nuc domain. In another embodiment, the Cas-based nickase is a C2c1 nickase protein comprising a mutation corresponding to D570A, E848A, or D977A in AacC2c1. The Cas-based nickase may be a C2c1 protein derived from a bacterial species selected from the group consisting of: Alicyclobacillus acidoterrestris, Alicyclobacillus contaminans, Alicyclobacillus macrosporangiidus, Bacillus hisashii, Candidatus Lindowbacteria, Desulfovibrio inopinatus, Desulfonatronum thiodismutans, Elusimicrobia bacterium RIFOXYA12, Omnitrophica WOR_2 bacterium RIFCSPHIGHO2, Opitutaceae bacterium TAV5, Phycisphaerae bacterium ST-NAGAB-D1, Planctomycetes bacterium RBG_13_46_10, Spirochaetes bacterium GWB1_27_13, Verrucomicrobiaceae bacterium UBA2429, Tuberibacillus calidus, Bacillus thermoamylovorans, Brevibacillus sp. CF112, Bacillus sp. NSP2.1, Desulfatirhabdium butyrativorans, Alicyclobacillus herbarius, Citrobacter freundii, Brevibacillus agri (e.g., BAB-2500), and Methylobacterium nodulans.
In one embodiment, the first Cas-based nickase and the second Cas-based nickase are the same. In another embodiment, the first Cas-based nickase and the second Cas-based nickase are different.
The DNA polymerase may be selected from polymerases lacking 5' to 3' exonuclease activity, and may optionally also lack 3' to 5' exonuclease activity. Examples of suitable DNA polymerases include the exonuclease-deficient Klenow fragment of E. coli DNA polymerase I (New England Biolabs, Inc. (Beverly, Mass.)), exonuclease-deficient T7 DNA polymerase (Sequenase; USB (Cleveland, Ohio)), the Klenow fragment of E. coli DNA polymerase I (New England Biolabs, Inc. (Beverly, Mass.)), the large fragment of Bst DNA polymerase (New England Biolabs, Inc.), KlenTaq DNA polymerase (AB Peptides (St. Louis, Mo.)), T5 DNA polymerase (U.S. Patent No. 5,716,819), and Pol III DNA polymerase (U.S. Patent No. 6,555,349). For helicase-dependent amplification, DNA polymerases with strand displacement activity are preferred, such as the exonuclease-deficient Klenow fragment of E. coli DNA polymerase I, the Bst DNA polymerase large fragment, and Sequenase. T7 polymerase is a high-fidelity polymerase with an error rate of 3.5 × 10⁻⁵, which is significantly lower than that of Taq polymerase (Keohavong and Thilly, Proc. Natl. Acad. Sci. USA 86, 9253-9257 (1989)). However, T7 polymerase is not thermostable and is therefore not the best choice for amplification systems that require thermal cycling. In HDA, which can be performed isothermally, T7 Sequenase is one of the preferred polymerases for amplifying DNA.
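To give a feel for what a polymerase error rate means in practice, the sketch below estimates the expected number of misincorporations accumulated in one copy of an amplicon. It is an illustration only: it assumes the error rate is expressed per base per duplication and reads the value quoted above as 3.5 × 10⁻⁵, and the amplicon length and number of duplications are hypothetical inputs rather than values from this disclosure.

```python
# Minimal sketch: expected polymerase errors per amplicon copy, assuming a
# per-base, per-duplication error rate (an assumption; see lead-in).


def expected_errors_per_copy(error_rate: float, length_nt: int, duplications: int) -> float:
    """Expected misincorporations in one copy after `duplications` rounds."""
    return error_rate * length_nt * duplications


def error_free_fraction(error_rate: float, length_nt: int, duplications: int) -> float:
    """Probability that a copy carries no error, treating bases independently."""
    return (1.0 - error_rate) ** (length_nt * duplications)


if __name__ == "__main__":
    rate = 3.5e-5        # assumed reading of the quoted T7 value
    length = 100         # hypothetical amplicon length in nucleotides
    rounds = 20          # hypothetical number of duplications
    print(expected_errors_per_copy(rate, length, rounds))
    print(error_free_fraction(rate, length, rounds))
```

Under these assumptions a lower error rate translates directly into a larger fraction of error-free copies, which is the sense in which a high-fidelity polymerase is advantageous.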
In particular embodiments, the polymerase may be selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, and Sequenase DNA polymerase.
In certain embodiments, the polymerase is selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, Gst polymerase, Taq polymerase, Klenow fragment of E. coli DNA polymerase I, KlenTaq, Pol III DNA polymerase, T5 DNA polymerase, and Sequenase DNA polymerase. Amplification of the target nucleic acid can be performed at about 50 ℃ to 59 ℃, about 60 ℃ to 72 ℃, or about 37 ℃. In certain embodiments, amplification of the target nucleic acid sequence is performed at a constant temperature. In certain embodiments, amplification of the target nucleic acid is performed over a range of temperatures.
In certain embodiments, the target nucleic acid sequence can be about 20-30, about 30-40, about 40-50, or about 50-100 nucleotides in length. In certain embodiments, the target nucleic acid sequence can be about 100-200, about 100-500, or about 100-1000 nucleotides in length. In other embodiments, the target nucleic acid sequence may be about 1000-.
In further embodiments, the first primer or the second primer further comprises an RNA polymerase promoter.
In certain embodiments, the method may further comprise detecting the amplified nucleic acid by a method selected from the group consisting of: gel electrophoresis, intercalating dye detection, PCR, real-time PCR, fluorescence resonance energy transfer (FRET), mass spectrometry, lateral flow assays, colorimetric assays (HRP, ALP, gold nanoparticle-based assays), and CRISPR-SHERLOCK. The CRISPR-SHERLOCK method can be a Cas13-based CRISPR-SHERLOCK method. Target nucleic acids can be detected with attomolar or femtomolar sensitivity.
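To put attomolar and femtomolar concentrations in concrete terms, the following sketch converts a molar concentration and a sample volume into an expected number of target molecules. The function name and the example volume are hypothetical; only Avogadro's constant is a fixed value.

```python
# Minimal sketch: convert a molar concentration and a volume into an
# expected number of molecules.

AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_in_volume(concentration_molar: float, volume_liters: float) -> float:
    """Expected number of molecules in the given volume."""
    return concentration_molar * volume_liters * AVOGADRO


if __name__ == "__main__":
    one_microliter = 1e-6  # hypothetical sample volume, in liters
    for label, conc in (("1 aM", 1e-18), ("1 fM", 1e-15), ("20 nM", 20e-9)):
        print(label, copies_in_volume(conc, one_microliter))
```

At 1 aM, a 1 µL sample contains on average fewer than one molecule, which is why attomolar detection is often described as approaching single-molecule sensitivity.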
In certain embodiments, the target nucleic acid can be DNA or RNA. The DNA may be selected from the group consisting of: genomic DNA, mitochondrial DNA, viral DNA, plasmid DNA, circulating cell-free DNA, environmental DNA, and synthetic double-stranded DNA. In certain embodiments, the target nucleic acid can be a double-stranded nucleic acid or a single-stranded nucleic acid. Where the target nucleic acid is a single-stranded nucleic acid, such single-stranded nucleic acids can include, but are not necessarily limited to, single-stranded viral DNA, viral RNA, messenger RNA, ribosomal RNA, transfer RNA, microRNA, short interfering RNA, small nuclear RNA, synthetic RNA, or synthetic single-stranded DNA.
In one embodiment, the sample is a biological sample or an environmental sample. The biological sample may be blood, plasma, serum, urine, stool, sputum, mucus, lymph, synovial fluid, bile, ascites, pleural effusion, seroma, saliva, cerebrospinal fluid, aqueous or vitreous humor, any bodily secretion, an exudate, fluid obtained from a joint, or a swab of a skin or mucosal surface. In certain embodiments, the sample is blood, plasma, or serum obtained from a human patient. In another embodiment, the sample is a plant sample. In other embodiments, the sample may be a crude sample or a purified sample.
In another aspect, the present disclosure provides a method for amplifying and/or detecting a target single-stranded nucleic acid, the method comprising: (a) converting the single-stranded nucleic acid in the sample into a target double-stranded nucleic acid; and (b) performing the steps of the foregoing method. The target single-stranded nucleic acid may be an RNA molecule. RNA molecules can be converted into double-stranded nucleic acids by reverse transcription and amplification steps. The target single-stranded nucleic acid may be selected from the group consisting of: single-stranded viral DNA, viral RNA, messenger RNA, ribosomal RNA, transfer RNA, microRNA, short interfering RNA, small nuclear RNA, synthetic RNA, long non-coding RNA, pre-microRNA, dsRNA, and synthetic single-stranded DNA.
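The conversion of a single-stranded RNA target into a double-stranded DNA target can be pictured at the sequence level as follows. This is a purely in-silico sketch with hypothetical function names: it models only the base-pairing outcome of reverse transcription followed by second-strand synthesis, not the enzymes, primers, or reaction conditions involved.

```python
# Minimal sketch: sequence-level view of RNA -> first-strand cDNA ->
# double-stranded DNA. All strands are written 5'->3'.
from typing import Tuple

RNA_TO_CDNA = {"A": "T", "C": "G", "G": "C", "U": "A"}
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def first_strand_cdna(rna: str) -> str:
    """cDNA complementary to the RNA template."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


def second_strand(cdna: str) -> str:
    """DNA strand complementary to the first-strand cDNA."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(cdna.upper()))


def rna_to_dsdna(rna: str) -> Tuple[str, str]:
    """Return (sense DNA strand, antisense cDNA strand)."""
    cdna = first_strand_cdna(rna)
    return second_strand(cdna), cdna


if __name__ == "__main__":
    print(rna_to_dsdna("AUGC"))  # toy 4-nt RNA
```

The sense strand returned here is simply the RNA sequence with U replaced by T, which is a convenient sanity check on the two complementation steps.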
In another aspect, the present disclosure provides a system for amplifying and/or detecting a target double-stranded nucleic acid in a sample, the system comprising: (a) an amplifying CRISPR system comprising a first CRISPR/Cas complex and a second CRISPR/Cas complex, the first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first strand of a target nucleic acid, and the second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second strand of the target nucleic acid; (b) a polymerase; (c) a primer pair comprising a first primer and a second primer, the first primer comprising a portion complementary to the first strand of the target nucleic acid and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to the second strand of the target nucleic acid and a portion comprising a binding site for the second guide molecule; and optionally (d) a detection system for detecting amplification of the target nucleic acid. The Cas-based nickase may be selected from the group consisting of: Cas9 nickase, Cpf1 nickase, C2c1 nickase, Cas13a nickase, Cas13b nickase, Cas13c nickase, and Cas13d nickase. The polymerase may be selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, Gst polymerase, Taq polymerase, Klenow fragment of E. coli DNA polymerase I, KlenTaq, Pol III DNA polymerase, T5 DNA polymerase, and Sequenase DNA polymerase. In certain embodiments, the Cas-based nickase and the polymerase operate at the same temperature. In certain embodiments, the Cas-based nickase and the polymerase operate at different temperatures.
For helicase-dependent amplification, DNA polymerases with strand displacement activity are preferred, such as the exonuclease-deficient Klenow fragment of E. coli DNA polymerase I, the Bst DNA polymerase large fragment, and Sequenase. T7 polymerase is a high-fidelity polymerase with an error rate of 3.5 × 10⁻⁵, which is significantly lower than that of Taq polymerase, and it can be used when amplification is carried out isothermally (Keohavong and Thilly, Proc. Natl. Acad. Sci. USA 86, 9253-9257 (1989)).
In another aspect, the present disclosure provides a system for amplifying and/or detecting a target single-stranded nucleic acid in a sample, the system comprising: (a) an agent for converting a target single-stranded nucleic acid into a double-stranded nucleic acid; and (b) components of the above system for amplifying and/or detecting a target double-stranded nucleic acid.
In another aspect, the present disclosure provides a kit for amplifying and/or detecting a target double-stranded nucleic acid in a sample, the kit comprising the components for amplifying and/or detecting a target double-stranded nucleic acid of the above system and a set of instructions for use. The kit may further comprise reagents for purifying double stranded nucleic acid in a sample.
In another aspect, the present disclosure provides a kit for amplifying and/or detecting a target single-stranded nucleic acid in a sample, the kit comprising the components for amplifying and/or detecting a target single-stranded nucleic acid of the above system and a set of instructions for use. The kit may further comprise reagents for purifying single stranded nucleic acid in a sample.
These and other aspects, objects, features and advantages of the exemplary embodiments will become apparent to those skilled in the art from the following detailed description of the illustrated exemplary embodiments.
Drawings
An understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention may be utilized, and the accompanying drawings of which:
FIG. 1 is a schematic of a programmable nickase-based amplification, according to certain exemplary embodiments.
FIG. 2 is a gel electrophoresis image showing optimization of the nicking enzyme amplification reaction. Red arrows indicate target amplification bands.
FIG. 3A is a graph showing nickase-based linear amplification using the Nt.AlwI restriction enzyme with 20 nM target. FIG. 3B is a graph showing nickase-based linear amplification using T7 mismatched Cpf1 with 20 nM target. FIG. 3C is a graph showing nickase-based linear amplification using matched Cpf1 with 20 nM target. FIG. 3D is a graph showing nickase-based linear amplification using the Nt.AlwI restriction enzyme with 20 fM target. FIG. 3E is a graph showing nickase-based linear amplification using T7 mismatched Cpf1 with 20 fM target. FIG. 3F is a graph showing nickase-based linear amplification using matched Cpf1 with 20 fM target.
FIG. 4A is a graph showing Nt.AlwI amplification and detection using SYTO intercalating dye. FIG. 4B is a graph showing T7 mismatched Cpf1 amplification and detection using SYTO intercalating dye. FIG. 4C is a graph showing matched Cpf1 amplification and detection using SYTO intercalating dye. FIG. 4D is a graph showing Nt.AlwI amplification and detection using gel-based readout. FIG. 4E is a graph showing T7 mismatched Cpf1 amplification and detection using gel-based readout. FIG. 4F is a graph showing matched Cpf1 amplification and detection using gel-based readout. FIG. 4G is a graph showing Nt.AlwI amplification and detection using CRISPR-SHERLOCK. FIG. 4H is a graph showing T7 mismatched Cpf1 amplification and detection using CRISPR-SHERLOCK. FIG. 4I is a graph showing matched Cpf1 amplification and detection using CRISPR-SHERLOCK.
FIG. 5 is a graph showing the results of nickase-based amplification combined with SYTO or CRISPR-SHERLOCK detection, plotted as the target/no-target ratio.
FIG. 6A is a graph showing the results of NEAR amplification alone at varying target concentrations. FIG. 6B is a graph showing the results of NEAR amplification combined with CRISPR-SHERLOCK detection at varying target concentrations.
FIG. 7A is a gel electrophoresis image showing the results of NEAR amplification performed using Bst 2.0 WarmStart polymerase at 60 ℃. FIG. 7B is a graph showing quantification of 119A. FIG. 7C is a graph showing the results of NEAR combined with CRISPR-SHERLOCK performed using Bst 2.0 WarmStart polymerase at 60 ℃.
FIG. 8A is a graph showing NEAR amplification performed with Sequenase 2.0 at 37 ℃ at the 16 min time point. FIG. 8B is a graph showing NEAR amplification performed with Sequenase 2.0 at 37 ℃ at the endpoint.
FIG. 9 is a schematic of CRISPR-NEAR combined with SHERLOCK detection.
The drawings herein are for illustration purposes only and are not necessarily drawn to scale.
Detailed Description
General definitions
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Definitions of terms and techniques commonly used in molecular biology can be found in the following documents: Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (edited by F.M. Ausubel et al.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (edited by M.J. MacPherson, B.D. Hames, and G.R. Taylor); Antibodies, A Laboratory Manual (1988) (edited by Harlow and Lane); Antibodies: A Laboratory Manual, 2nd edition (2013) (edited by E.A. Greenfield); Animal Cell Culture (1987) (edited by R.I. Freshney); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: A Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology, 2nd edition, J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry: Reactions, Mechanisms and Structure, 4th edition, John Wiley & Sons (New York, N.Y. 1992); and Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
As used herein, the singular forms "a", "an" and "the" include both singular and plural referents unless the context clearly dictates otherwise.
The term "optional" or "optionally" means that the subsequently described event, circumstance, or alternative may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
The recitation of numerical ranges by endpoints includes all numbers and fractions within the corresponding range, as well as the recited endpoints.
As used herein, the term "about" or "approximately", when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, is intended to encompass variations of and from the specified value, such as variations of +/-10% or less, +/-5% or less, +/-1% or less, and +/-0.1% or less of and from the specified value, insofar as such variations are appropriate for practicing the disclosed invention. It is to be understood that the value to which the modifier "about" or "approximately" refers is itself also specifically, and preferably, disclosed.
As used herein, a "biological sample" may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a "bodily fluid". The present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humor, vitreous humor, bile, serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudate, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, inflammatory secretions, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretions, vomit, and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, and cell cultures derived from bodily fluids. Bodily fluids may be obtained from a mammalian organism, for example, by puncture or by other collection or sampling procedures.
The terms "subject", "individual" and "patient" are used interchangeably herein and refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, rats, monkeys, humans, farm animals, sport animals, and pets. Also included are tissues, cells, and progeny of a biological entity obtained in vivo or cultured in vitro.
"C2c2" is now referred to as "Cas13a", and these terms are used interchangeably herein, unless otherwise indicated. The terms "group 29", "group 30", and "Cas13b" are used interchangeably herein. The terms "Cpf1" and "Cas12a" are used interchangeably herein. The terms "C2c1" and "Cas12b" are used interchangeably herein.
Various embodiments are described below. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation on the broader aspects discussed herein. An aspect described in connection with a particular embodiment is not necessarily limited to that embodiment and may be practiced with any other embodiment or embodiments. Reference throughout this specification to "one embodiment," "an embodiment," or "an example embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment," "in an embodiment," or "an example embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments, as will be apparent to those skilled in the art from this disclosure. Furthermore, although some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are intended to be within the scope of the invention. For example, in the appended claims, any of the claimed embodiments may be used in any combination.
All publications, published patent documents and patent applications cited herein are hereby incorporated by reference to the same extent as if each individual publication, published patent document or patent application were specifically and individually indicated to be incorporated by reference in its entirety.
Overview
Embodiments disclosed herein provide methods for amplifying a target nucleic acid under isothermal conditions using a CRISPR-Cas based nickase.
In another aspect, embodiments disclosed herein relate to a system for amplifying and/or detecting a target double-stranded nucleic acid and a target single-stranded nucleic acid in a sample. In certain embodiments, the system comprises an amplifying CRISPR system, a polymerase, a primer pair, and optionally a detection system for detecting amplification of a target nucleic acid. In certain exemplary embodiments, the system may further comprise an agent for converting the target single-stranded nucleic acid into a double-stranded nucleic acid.
In another aspect, embodiments disclosed herein relate to a kit for amplifying and/or detecting a target double-stranded nucleic acid or a target single-stranded nucleic acid in a sample. In certain exemplary embodiments, the kit can include reagents for purifying double-stranded nucleic acids or single-stranded nucleic acids in a sample and a set of instructions for use.
Amplification system
A system for amplifying a target double-stranded nucleic acid in a sample is provided. The system includes an amplification CRISPR system, a polymerase, and a primer pair. In embodiments, the system may optionally include a detection system, thereby allowing detection of the target nucleic acid.
The amplification CRISPR system comprises a first CRISPR/Cas complex and a second CRISPR/Cas complex. Each CRISPR/Cas complex comprises a Cas-based nickase and a guide molecule that preferentially binds to, or is specific for, a target molecule (e.g., has sufficient complementarity to bind to it), thereby directing the CRISPR/Cas complex to the target nucleic acid. The amplification system comprises a polymerase; a primer pair of a reaction mixture comprising a first primer and a second primer, the first primer comprising a portion complementary to a first target nucleic acid position and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to a second target nucleic acid position and a portion comprising a binding site for the second guide molecule; and optionally a detection system for detecting amplification of the target nucleic acid. The first and second positions can be on the same strand, in which case the Cas-based nickases will nick the same strand, or the first and second positions can be on two different strands.
CRISPR system
The CRISPR systems provided herein comprise a first CRISPR-Cas complex and a second CRISPR-Cas complex. The first CRISPR/Cas complex comprises a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first location of the target nucleic acid, and the second CRISPR/Cas complex comprises a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second location of the target nucleic acid.
In one aspect, the first CRISPR/Cas complex comprises a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first strand of the target nucleic acid, and the second CRISPR/Cas complex comprises a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second strand of the target nucleic acid. In one aspect, the first CRISPR/Cas complex comprises a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first location on a first strand of the target nucleic acid, and the second CRISPR/Cas complex comprises a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second location on the first strand of the target nucleic acid.
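The two arrangements described above (guides directing the complexes to two different strands, or to two locations on the same strand) can be pictured with a small data model. The following is only an illustrative sketch: the class name, field names, strand labels, and coordinates are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NickaseComplex:
    """Hypothetical record for one CRISPR/Cas nickase complex."""
    nickase: str   # label for the Cas-based nickase (illustrative only)
    guide: str     # label for the guide molecule (illustrative only)
    strand: str    # "+" or "-" strand of the target duplex (assumed convention)
    position: int  # assumed coordinate of the guided location


def nicking_layout(first: NickaseComplex, second: NickaseComplex) -> str:
    """Report whether the two complexes are directed to the same strand."""
    for complex_ in (first, second):
        if complex_.strand not in ("+", "-"):
            raise ValueError(f"unknown strand label: {complex_.strand!r}")
    return "same strand" if first.strand == second.strand else "different strands"


first = NickaseComplex("Cas nickase 1", "guide-1", "+", 100)
second = NickaseComplex("Cas nickase 2", "guide-2", "-", 160)
print(nicking_layout(first, second))
```

The sketch only classifies the layout; it makes no claim about which layout a given embodiment uses or about how the nicks affect amplification.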
In general, a CRISPR-Cas or CRISPR system, as used herein and in documents such as WO 2014/093622 (PCT/US2013/074667), refers collectively to transcripts and other elements involved in or directing the activity of CRISPR-associated ("Cas") genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active portion of tracrRNA), a tracr mate sequence (encompassing a "direct repeat" and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a "spacer" in the context of an endogenous CRISPR system), or "RNA(s)" as that term is used herein (e.g., one or more RNAs to guide a Cas such as Cas9, e.g., CRISPR RNA and trans-activating (tracr) RNA, or a single guide RNA (sgRNA)), or other sequences and transcripts from a CRISPR locus. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). When the CRISPR protein is a Cpf1 protein, no tracrRNA is required.
As used herein, the term "Cas" refers generally to a (modified) effector protein of a CRISPR/Cas system or complex, and may be, without limitation, a (modified) Cas9, a (modified) Cas12 (e.g., Cas12a "Cpf1", Cas12b "C2c1", Cas12c "C2c3"), or a (modified) Cas13 (e.g., Cas13a "C2c2", Cas13b "group 29/30", Cas13c, Cas13d). The term "Cas" may be used interchangeably herein with the terms "CRISPR protein", "CRISPR/Cas protein", "CRISPR effector", "CRISPR/Cas effector", "CRISPR enzyme", "CRISPR/Cas enzyme", and the like, unless otherwise apparent, such as by specific and exclusive reference to Cas9. It is to be understood that the term "CRISPR protein" can be used interchangeably with "CRISPR enzyme" regardless of whether the CRISPR protein has altered, such as increased or decreased (or no), enzymatic activity as compared to a wild-type CRISPR protein. Likewise, as used herein, in certain embodiments, the term "nuclease" may refer, where appropriate and as would be apparent to a skilled artisan, to a modified nuclease whose catalytic activity has been altered, such as one having increased or decreased nuclease activity, no nuclease activity at all, or nickase activity, as well as to an otherwise modified nuclease as defined elsewhere herein, unless otherwise apparent, such as by specific and exclusive reference to an unmodified nuclease.
In certain embodiments according to the invention, the CRISPR-Cas protein is preferably mutated with respect to the corresponding wild-type enzyme such that the mutated CRISPR-Cas protein lacks the ability to cleave one or both DNA strands of the target locus containing the target sequence.
In certain embodiments, the CRISPR-Cas protein is a mutant CRISPR-Cas protein that cleaves only one DNA strand, i.e., a nickase. In certain embodiments, the nickase cuts within the non-target sequence (i.e., the sequence that is on the opposite DNA strand from the target sequence and that is 3' of the PAM sequence).
The present invention contemplates methods using two or more nickases, in particular dual or double nickase methods. This results in binding of the target DNA by both Cas nickases. Furthermore, it is also contemplated that different orthologs can be used, e.g., a Cas nickase on one strand of the DNA (e.g., the coding strand) and an ortholog on the non-coding or opposite DNA strand or at the second DNA target location. The ortholog may be, but is not limited to, a Cas9 nickase, such as a SaCas9 nickase or a SpCas9 nickase. It may be advantageous to use two different orthologs that require different PAMs and possibly have different guide requirements, thereby allowing the user a greater degree of control.
CRISPR-Cas protein
The nucleic acid molecule encoding a CRISPR effector protein is advantageously codon optimized. In this case, an example of a codon-optimized sequence is a sequence optimized for expression in a eukaryote, such as a human (i.e., optimized for expression in humans), or in another eukaryote, animal, or mammal as discussed herein; see, e.g., the SaCas9 human codon-optimized sequence in WO 2014/093622 (PCT/US2013/074667). While this is preferred, it will be appreciated that other examples are possible and that codon optimization for host species other than humans, or for specific organs, is known. In some embodiments, the enzyme coding sequence encoding a CRISPR effector protein is codon optimized for expression in a particular cell, such as a eukaryotic cell. Eukaryotic cells can be those of or derived from a particular organism, such as a plant or a mammal, including but not limited to a human, or a non-human eukaryote or animal or mammal as discussed herein, e.g., a mouse, rat, rabbit, dog, livestock, or a non-human mammal or primate. In some embodiments, processes for modifying the germline genetic identity of humans and/or processes for modifying the genetic identity of animals that are likely to cause them suffering without any substantial medical benefit to humans or animals, as well as animals produced by such processes, may be excluded. In general, codon optimization refers to the process of modifying a nucleic acid sequence for enhanced expression in a target host cell by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are used more frequently or most frequently in the genes of the host cell, while maintaining the native amino acid sequence. Different species exhibit particular biases for certain codons of a particular amino acid.
Codon bias (differences in codon usage between organisms) is often correlated with the efficiency of translation of messenger RNA (mRNA), which in turn is believed to depend, inter alia, on the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons most frequently used in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, in the "Codon Usage Database" available from Kazusa. See Nakamura, Y., et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, Pa.). In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in the Cas-encoding sequence correspond to the codons most frequently used for a particular amino acid.
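The codon replacement described above can be illustrated with a toy example. This is a minimal sketch under stated assumptions: the usage weights below are made up for a hypothetical host and cover only a few amino acids, and the function names are illustrative. Real work would use a complete usage table (such as one from the Codon Usage Database) and typically weigh further factors that this sketch ignores.

```python
# Partial, made-up codon usage weights for a hypothetical host (NOT real data).
# Keys are one-letter amino acid codes ("*" marks a stop codon).
HYPOTHETICAL_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.40, "AAG": 0.60},
    "L": {"TTA": 0.10, "CTC": 0.40, "CTG": 0.50},
    "A": {"GCT": 0.30, "GCC": 0.70},
    "*": {"TAA": 0.60, "TGA": 0.40},
}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {
    codon: aa for aa, codons in HYPOTHETICAL_USAGE.items() for codon in codons
}

# Most frequently used codon for each amino acid in the hypothetical host.
PREFERRED = {
    aa: max(codons, key=codons.get) for aa, codons in HYPOTHETICAL_USAGE.items()
}


def split_codons(seq: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq: str) -> str:
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(seq))


def codon_optimize(seq: str) -> str:
    """Replace each codon with the host's preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TO_AA[codon]] for codon in split_codons(seq))


native = "ATGAAATTAGCTTAA"
optimized = codon_optimize(native)
print(optimized)                                   # ATGAAGCTGGCCTAA
print(translate(native) == translate(optimized))   # True
```

The second printed line checks the property the text requires: the codons change, but the encoded amino acid sequence does not.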
In certain embodiments, a method as described herein can include providing a Cas transgenic cell in which one or more nucleic acids encoding one or more guide RNAs are provided or introduced, the one or more nucleic acids being operably linked in the cell to regulatory elements comprising a promoter of one or more genes of interest. As used herein, the term "Cas transgenic cell" refers to a cell, such as a eukaryotic cell, in which the Cas gene has been integrated on the genome. The nature, type or origin of the cells is not particularly restricted according to the invention. Moreover, the manner in which the Cas transgene is introduced into the cell can vary and can be any method as known in the art. In certain embodiments, the Cas transgenic cell is obtained by introducing a Cas transgene into an isolated cell. In certain other embodiments, the Cas transgenic cell is obtained by isolating the cell from a Cas transgenic organism. By way of example and not limitation, Cas transgenic cells as referred to herein may be derived from Cas transgenic eukaryotes, such as Cas knock-in eukaryotes. Reference is made to WO2014/093622(PCT/US 13/74667), incorporated herein by reference. The methods of U.S. patent publication nos. 20120017290 and 20110265198, assigned to Sangamo BioSciences, inc, for targeting Rosa loci can be modified to utilize the CRISPR Cas system of the present invention. The method of U.S. patent publication No. 20130236946 assigned to Cellectis for targeting Rosa loci can also be modified to utilize the CRISPR Cas system of the present invention. By way of another example, reference is made to Platt et al (Cell; 159(2):440-455(2014)) which describes Cas9 knock-in mice, incorporated herein by reference. The Cas transgene may also comprise a Lox-Stop-polyA-Lox (lsl) cassette, thereby facilitating Cas expression inducible by Cre recombinase. Alternatively, Cas transgenic cells can be obtained by introducing a Cas transgene into isolated cells. 
Delivery systems for transgenes are well known in the art. By way of example, a Cas transgene can be delivered in, for example, a eukaryotic cell by means of a vector (e.g., AAV, adenovirus, lentivirus) and/or particle and/or nanoparticle delivery as also described elsewhere herein.
The skilled person will appreciate that a cell as referred to herein, such as a Cas transgenic cell, may comprise a genomic alteration in addition to the integrated Cas gene or a mutation resulting from the sequence-specific action of Cas when complexed with an RNA capable of directing Cas to a target locus.
In certain aspects, the invention relates to vectors, e.g., for delivering or introducing Cas and/or an RNA capable of directing Cas to a target locus (i.e., a guide RNA) into a cell, and for propagating these components (e.g., in prokaryotic cells). As used herein, a "vector" is a tool that allows or facilitates the transfer of an entity from one environment to another. A vector is a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about replication of the inserted segment. Generally, a vector is capable of replication when associated with the proper control elements. In general, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends or no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a "plasmid," which refers to a circular double-stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, in which virally derived DNA or RNA sequences are present in the vector for packaging into a virus, such as a retrovirus, a replication-defective retrovirus, an adenovirus, a replication-defective adenovirus, or an adeno-associated virus (AAV). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors).
Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. In addition, certain vectors are capable of directing the expression of genes to which they are operably linked. Such vectors are referred to herein as "expression vectors". Commonly used expression vectors for effective use in recombinant DNA techniques are often in the form of plasmids.
A recombinant expression vector may comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vector comprises one or more regulatory elements, which may be selected on the basis of the host cell used for expression, operably linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the one or more regulatory elements in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). With regard to recombination and cloning methods, reference is made to US patent application 10/815,730, published on September 2, 2004 as US 2004-0171156 A1, the content of which is incorporated herein by reference in its entirety. Accordingly, embodiments disclosed herein may also include transgenic cells comprising a CRISPR effector system. In certain exemplary embodiments, the transgenic cells may serve as individual discrete volumes. In other words, a sample comprising the masking construct may be delivered to a cell, for example, in a suitable delivery vesicle, and if the target is present in the delivery vesicle, the CRISPR effector is activated and generates a detectable signal.
The one or more vectors may include one or more regulatory elements, such as one or more promoters. The one or more vectors may comprise a Cas coding sequence and/or a single guide RNA (e.g., sgRNA) coding sequence, but may also comprise at least 3 or 8 or 16 or 32 or 48 or 50 guide RNA (e.g., sgRNA) coding sequences, such as 1-2, 1-3, 1-4, 1-5, 3-6, 3-7, 3-8, 3-9, 3-10, 3-16, 3-30, 3-32, 3-48, or 3-50 RNAs (e.g., sgRNAs). In a single vector, there can be a promoter for each RNA (e.g., sgRNA), advantageously when there are up to about 16 RNAs; and when a single vector provides more than 16 RNAs, one or more promoters may drive expression of more than one RNA, e.g., when there are 32 RNAs, each promoter may drive expression of two RNAs, and when there are 48 RNAs, each promoter may drive expression of three RNAs. By simple arithmetic, well-established cloning protocols, and the teachings of the present disclosure, one skilled in the art can readily practice the invention as to the RNA(s) for a suitable exemplary vector, such as AAV, and a suitable promoter, such as the U6 promoter. For example, the packaging limit of AAV is about 4.7 kb. The length of a single U6-gRNA (plus restriction sites for cloning) is 361 bp. Thus, the skilled person can readily fit about 12-16, e.g., 13, U6-gRNA cassettes into a single vector. These can be assembled by any suitable means, such as the Golden Gate strategy used for TALE assembly (genome-engineering.org/taleffectors/). The skilled artisan can also use a tandem guide strategy to increase the number of U6-gRNAs by about 1.5-fold, e.g., from 12-16, e.g., 13, to about 18-24, e.g., about 19 U6-gRNAs. Thus, one skilled in the art can readily reach about 18-24, e.g., about 19, promoter-RNAs, e.g., U6-gRNAs, in a single vector, e.g., an AAV vector. A further means for increasing the number of promoters and RNAs in a vector is to use a single promoter (e.g., U6) to express an array of RNAs separated by cleavable sequences.
A still further means for increasing the number of promoter-RNAs in a vector is to express an array of promoter-RNAs separated by cleavable sequences in the coding sequence or intron of a gene; in this case it is advantageous to use a polymerase II promoter, which can have increased expression and can enable the transcription of long RNAs in a tissue-specific manner (see, e.g., nar.oxfordjournals.org/content/34/7/e53.short and nature.com/mt/journal/v16/n9/abs/mt2008144a.html). In an advantageous embodiment, AAV may package U6 tandem gRNAs targeting up to about 50 genes. Thus, from the knowledge in the art and the teachings of the present disclosure, the skilled person can readily make and use, without undue experimentation, one or more vectors, e.g., a single vector, expressing multiple RNAs or guides under the control of, or operatively or functionally linked to, one or more promoters, in particular the numbers of RNAs or guides discussed herein.
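The cassette counts given above follow from simple arithmetic on the stated figures (a packaging limit of about 4.7 kb, 361 bp per U6-gRNA cassette, and an increase of about 1.5-fold with a tandem guide strategy). The sketch below reproduces that calculation. It assumes, for illustration only, that the whole packaging capacity is available for cassettes; the constant and function names are hypothetical.

```python
AAV_PACKAGING_LIMIT_BP = 4700  # "about 4.7 kb", as stated in the text
U6_GRNA_CASSETTE_BP = 361      # single U6-gRNA plus cloning restriction sites
TANDEM_FOLD_INCREASE = 1.5     # "about 1.5 fold", as stated in the text


def max_cassettes(limit_bp: int, cassette_bp: int) -> int:
    """Whole cassettes that fit within the limit (simplifying assumption:
    nothing else in the vector consumes packaging capacity)."""
    return limit_bp // cassette_bp


single_guides = max_cassettes(AAV_PACKAGING_LIMIT_BP, U6_GRNA_CASSETTE_BP)
tandem_guides = int(single_guides * TANDEM_FOLD_INCREASE)

print(single_guides)  # 13
print(tandem_guides)  # 19
```

Both results match the example values in the text (13 cassettes, and about 19 with the tandem strategy). Any other sequence carried by the vector would reduce these numbers.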
The guide RNA coding sequence and/or the Cas coding sequence may be functionally or operatively linked to one or more regulatory elements, such that the one or more regulatory elements drive expression. The one or more promoters may be one or more constitutive promoters and/or one or more conditional promoters and/or one or more inducible promoters and/or one or more tissue-specific promoters. The promoter may be selected from the group consisting of: RNA polymerase, pol I, pol II, pol III, T7, U6, H1, retroviral Rous Sarcoma Virus (RSV) LTR promoter, Cytomegalovirus (CMV) promoter, SV40 promoter, dihydrofolate reductase promoter, β-actin promoter, phosphoglycerate kinase (PGK) promoter, and EF1α promoter. An advantageous promoter is the U6 promoter.
The CRISPR-Cas protein may additionally be modified. As used herein, the term "modified" with respect to a CRISPR-Cas protein generally refers to a CRISPR-Cas protein having one or more modifications or mutations (including point mutations, truncations, insertions, deletions, chimeras, fusion proteins, etc.) as compared to the wild-type Cas protein from which it is derived. By derived, it is meant that the derivative enzyme is based primarily on the wild-type enzyme in the sense of having a high degree of sequence homology to the wild-type enzyme, but that the derivative enzyme has been mutated (modified) in some manner known in the art or as described herein.
Additional modifications to the CRISPR-Cas protein may or may not result in a functional change. For example, and in particular with regard to CRISPR-Cas proteins, modifications that do not result in a change in function include, for example, codon optimization for expression in a particular host, or providing a nuclease with a particular label (e.g., for visualization). Modifications that may result in altered function may also include mutations, including point mutations, insertions, deletions, truncations (including split nucleases), and the like, as well as chimeric nucleases (e.g., comprising domains from different orthologs or homologs) or fusion proteins. Fusion proteins may include, but are not limited to, for example, fusions with heterologous or functional domains (e.g., localization signals, catalytic domains, etc.). In certain embodiments, a variety of different modifications can be combined (e.g., a catalytically active mutant nuclease is further fused to a functional domain, e.g., to induce DNA methylation or another nucleic acid modification, such as including but not limited to a break (e.g., by a different nuclease (domain)), a mutation, a deletion, an insertion, a substitution, a ligation, a digestion, or a recombination). As used herein, "altered functionality" includes, but is not limited to, altered specificity (e.g., altered target recognition, increased (e.g., "enhanced" Cas proteins) or decreased specificity, or altered PAM recognition), altered activity (e.g., increased or decreased catalytic activity, including catalytically inactive nucleases or nickases), and/or altered stability (e.g., fusion to a destabilizing domain).
Suitable heterologous domains include, but are not limited to, nucleases, ligases, repair proteins, methyltransferases, (viral) integrases, recombinases, transposases, Argonaute, cytidine deaminases, reverse transcriptases, group II introns, phosphatases, phosphorylases, sulfurylases, kinases, polymerases, exonucleases, and the like. Examples of all such modifications are known in the art. It will be understood that "modified" nucleases, and in particular "modified" Cas or "modified" CRISPR-Cas systems or complexes as referred to herein, preferably still have the ability to interact with or bind to a polynucleic acid (e.g., in complex with a guide molecule). Such a modified Cas protein may be combined with a deaminase protein or an active domain thereof as described herein.
In certain embodiments, a CRISPR-Cas protein may comprise one or more modifications that enhance activity and/or specificity, for example, including mutated residues that stabilize targeted or non-targeted strands (e.g., eCas 9; "rational engineered Cas9 nucleotides with improved specificity," Slaymaker et al (2016), Science,351(6268):84-88, herein incorporated by reference in its entirety). In certain embodiments, the altered or modified activity of the engineered CRISPR protein comprises increased targeting efficiency or decreased off-target binding. In certain embodiments, the altered activity of the engineered CRISPR protein comprises a modified cleavage activity. In certain embodiments, the altered activity comprises increased cleavage activity at a target polynucleotide locus. In certain embodiments, the altered activity comprises reduced cleavage activity at a target polynucleotide locus. In certain embodiments, the altered activity comprises reduced cleavage activity at an off-target polynucleotide locus. In certain embodiments, the altered or modified activity of the modified nuclease comprises altered helicase kinetics. In certain embodiments, the modified nuclease comprises a modification that alters the association of a protein with a nucleic acid molecule comprising an RNA (in the case of a Cas protein), or a strand of a target polynucleotide locus, or a strand of an off-target polynucleotide. In one aspect of the invention, the engineered CRISPR protein comprises a modification that alters the formation of a CRISPR complex. In certain embodiments, the altered activity comprises increased cleavage activity at an off-target polynucleotide locus. Thus, in certain embodiments, the specificity for a target polynucleotide locus is increased as compared to an off-target polynucleotide locus. In other embodiments, the specificity for a target polynucleotide locus is reduced as compared to an off-target polynucleotide locus. 
In certain embodiments, the mutation results in a reduction in off-target effects (e.g., cleavage or binding properties, activity, or kinetics), such as in the case of Cas proteins, e.g., resulting in a reduction in tolerance to mismatches between the target and the guide RNA. Other mutations may result in increased off-target effects (e.g., cleavage or binding properties, activity, or kinetics). Other mutations may result in increased or decreased on-target effects (e.g., cleavage or binding properties, activity, or kinetics). In certain embodiments, the mutation causes altered (e.g., increased or decreased) helicase activity, or altered association or formation of a functional nuclease complex (e.g., a CRISPR-Cas complex). In certain embodiments, the mutation results in altered PAM recognition, i.e., a different PAM may be recognized (additionally or alternatively) compared to the unmodified Cas protein (see, e.g., "Engineered CRISPR-Cas9 nucleases with altered PAM specificities", Kleinstiver et al. (2015), Nature, 523(7561):481-). To enhance specificity, particularly preferred mutations include mutations of positively charged residues and/or (evolutionarily) conserved residues, such as conserved positively charged residues. In certain embodiments, such residues may be mutated to uncharged residues, such as alanine.
Cas9-based nickase
In certain embodiments, the CRISPR nickase is a Cas9-based nickase. The Cas9 gene is present in several different bacterial genomes, typically in the same locus as the cas1, cas2 and cas4 genes and a CRISPR cassette. In addition, the Cas9 protein contains an easily recognizable C-terminal region that is homologous to transposon ORF-B and includes an active RuvC-like nuclease and an arginine-rich region.
In particular embodiments, the nickase is a Cas9 nickase from an organism of the genus comprising: streptococcus, Campylobacter, nitrate lysis bacteria, Staphylococcus, Corynebacterium, Rogowsonia, Neisseria, gluconacetobacter, Azospirillum, Sphaerotheca, Lactobacillus, Eubacterium or Corynebacterium.
In particular embodiments, the nickase is a Cas9 nickase from an organism of the genus comprising: the genus Carnobacterium, rhodobacter, Listeria, Marangobacter, Clostridium, Lachnospiraceae, Clostridium, cilium, Francisella, Legionella, Alicyclobacillus, Methanophilus, Porphyromonas, Prevotella, Bacteroides, Paecilomyces, Leptospira, Desulfurovibrio, Desulfosalina, Toxomycetaceae, Bacillus, Brevibacterium, Methylobacterium, or Aminococcus.
In further particular embodiments, the Cas9 nickase is from an organism selected from the group consisting of: streptococcus mutans, Streptococcus agalactiae, Streptococcus equisimilis, Streptococcus sanguis, and Streptococcus pneumoniae; campylobacter jejuni, campylobacter coli; salsuginis, n tergarcus; staphylococcus aureus, staphylococcus carnosus; neisseria meningitidis, neisseria gonorrhoeae; listeria monocytogenes, listeria monocytogenes; clostridium botulinum, clostridium difficile, clostridium tetani, clostridium sojae. In particular embodiments, the nickase is a Cas9 nickase from an organism derived from streptococcus pyogenes, staphylococcus aureus, or streptococcus thermophilus Cas 9.
The nickase can comprise a chimeric protein comprising a first fragment from a first effector protein (e.g., Cas9) ortholog and a second fragment from a second effector protein (e.g., Cas9) ortholog, and wherein the first and second effector protein orthologs are different. At least one of the first and second effector protein (e.g., Cas9) orthologs may comprise an effector protein (e.g., Cas9) from an organism comprising: streptococcus, Campylobacter, nitrate lysis bacteria, Staphylococcus, Microclavus, Rogowsonia, Neisseria, gluconacetobacter, Azospirillum, Spirosoma, Lactobacillus, Eubacterium, Corynebacterium, Carnobacterium, rhodobacter, Listeria, Marsh Bacillus, Clostridium, Lachnospiraceae, Clostridia, Cicilia, Francisella, Legionella, Alicyclobacillus, Methanophilus, Porphyromonas, Prevotella, Bacteroides, Paederus, Trichostoma, Leptospira, Desulfurophyces, Desulfobacter, Fenugiaceae, Phyllobacterium, Bacillus, Brevibacterium, Methylobacterium, or Aminococcus; for example, a chimeric effector protein comprising a first fragment and a second fragment, wherein the first fragment and the second fragment are each selected from Cas9 of an organism comprising: streptococcus, campylobacter, nitrolytic bacteria, staphylococcus, parvulus, roche, neisseria, gluconacetobacter, azospirillum, unisporum, lactobacillus, eubacterium, corynebacterium, carnobacterium, rhodobacter, listeria, swamp bacillus, clostridium, lachnospiraceae, clostridium, leptospiridium, cilium, franciscium, legionella, alicyclobacillus, methanophilus, porphyromonas, prevotella, bacteroidetes, traudiococcus, leptospira, desulfuricus, sulfosalinobacterium, celulaceae, phyromobacterium, bacillus, brevibacillus, methylobacter, or aminoacidococcus, wherein the first and second fragments are not from the same bacterium; for example, a chimeric effector protein comprising a first fragment and a second fragment, wherein the first fragment and the second fragment are each selected from Cas 9: streptococcus 
mutans, Streptococcus agalactiae, Streptococcus equisimilis, Streptococcus sanguis, and Streptococcus pneumoniae; campylobacter jejuni, campylobacter coli; salsuginis, n tergarcus; staphylococcus aureus, staphylococcus carnosus; neisseria meningitidis, neisseria gonorrhoeae; listeria monocytogenes, listeria monocytogenes; clostridium botulinum, clostridium difficile, clostridium tetani, clostridium sojae, francisella tularensis 1, prevotella easily, lachnospiraceae MC 20171, vibrio proteolyticus, isocratic bacteria GW2011_ GWA2_33_10, centipede bacteria GW2011_ GWC2_44_17, smith bacteria SCADC, aminoacetococcus BV3L6, lachnospiraceae MA2020, candidate termite methanogen, shigella, moraxella bovis 237, leptospira paddychii, lachnospiraceae bacteria ND2006, porphyromonas canis 3, prevotella saccharolytica, and porphyromonas macaque, wherein the first and second fragments are not from the same bacterium.
In a more preferred embodiment, the Cas9 nickase is derived from a bacterial species selected from the group consisting of: streptococcus pyogenes, staphylococcus aureus or streptococcus thermophilus Cas 9. In certain embodiments, Cas9p is derived from a bacterial species selected from the group consisting of: francisella tularensis 1, Prevotella facilis, Prospirochaetaceae MC 20171, Vibrio proteolyticus, Heterophaera bacterium GW2011_ GWA2_33_10, Umochloa bacteria GW2011_ GWC2_44_17, SciSenella species SCADC, Aminococcus species BV3L6, Prospirochaetaceae bacterium MA2020, candidate termite methane mycoplasma, shiitake bacteria, Moraxella bovis 237, Leptospira paddy, Prospirochaetaceae bacteria ND2006, Porphyromonas canis 3, Prevotella saccharolytica, and Porphyromonas kiwii. In certain embodiments, Cas9p is derived from a bacterial species selected from the group consisting of: aminococcus BV3L6, Lachnospiraceae MA 2020. In certain embodiments, the effector protein is derived from a subspecies of francisella tularensis 1, including but not limited to, the neotamer subspecies of francisella tularensis.
In particular embodiments, a homolog or ortholog of Cas9 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, e.g., at least 95% sequence homology or identity with Cas9. In further embodiments, a homolog or ortholog of Cas9 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, e.g., at least 95% sequence identity with wild-type Cas9. In case Cas9 has one or more mutations (is mutated), the homolog or ortholog of Cas9 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, e.g., at least 95% sequence identity with the mutated Cas9.
In one embodiment, the Cas9 nickase may be an ortholog of an organism of a genus including, but not limited to, Streptococcus or Staphylococcus; in particular embodiments, the Cas9 protein may be an ortholog of an organism of a species including, but not limited to, Streptococcus pyogenes, Staphylococcus aureus or Streptococcus thermophilus. In particular embodiments, a homolog or ortholog of Cas9 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, for example, at least 95% sequence homology or identity with one or more of the Cas9 sequences disclosed herein. In further embodiments, a homolog or ortholog of Cas9 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, e.g., at least 95% sequence identity with wild-type SpCas9, SaCas9 or StCas9.
In particular embodiments, the Cas9 nickase of the invention has at least 60%, more particularly at least 70%, such as at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, for example, at least 95% sequence homology or identity with SpCas9, SaCas9 or StCas9. In further embodiments, the Cas9 protein as referred to herein has at least 60%, such as at least 70%, more particularly at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, for example, at least 95% sequence identity with wild-type SpCas9, SaCas9 or StCas9. The skilled person will appreciate that this includes truncated forms of the Cas9 protein, whereby sequence identity is determined over the length of the truncated form.
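The identity thresholds above, and the rule that identity for a truncated form is determined over the length of that truncated form, can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by a standard tool and are supplied as equal-length gapped strings; the function names and the toy peptides are hypothetical and are not taken from this disclosure.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity counted only over columns where the query has a residue.

    Columns in which the (possibly truncated) query shows a gap '-' are
    skipped, so the denominator is the length of the truncated form.
    A query residue aligned to a reference gap counts as a mismatch.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if query_res == "-":
            continue  # position absent from the truncated form
        compared += 1
        if ref_res == query_res:
            matches += 1
    if compared == 0:
        raise ValueError("query contains no residues")
    return 100.0 * matches / compared


def meets_threshold(identity: float, threshold: float = 80.0) -> bool:
    """True if the identity reaches the chosen cutoff (e.g. 80, 85, 90 or 95)."""
    return identity >= threshold


# Toy example (hypothetical peptides): the query lacks the first two residues
# and carries one substitution, so 7 of its 8 residues match.
identity = percent_identity("MKRNYILGLD", "--RNYILALD")
print(identity, meets_threshold(identity, 85.0), meets_threshold(identity, 90.0))
```

In this toy case the truncated query clears the 85% cutoff but not the 90% cutoff, which shows why the denominator convention matters when a claim is phrased in terms of such percentages.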
Modified Cas9 protein
In particular embodiments, of interest is the use of an engineered Cas9 protein as defined herein, wherein the protein is complexed with a nucleic acid molecule comprising an RNA to form a CRISPR complex, wherein the nucleic acid molecule targets one or more target polynucleotide loci when in the CRISPR complex, the protein comprising at least one modification as compared to the unmodified Cas9 protein, and wherein the CRISPR complex comprising the modified protein has altered activity as compared to the complex comprising the unmodified Cas9 protein. It is to be understood that when referring to a CRISPR "protein" herein, the Cas9 protein is preferably a modified CRISPR-Cas protein (e.g., having increased or decreased (or no) enzymatic activity), including, without limitation, Cas9. The term "CRISPR protein" can be used interchangeably with "CRISPR-Cas protein", regardless of whether the CRISPR protein has altered, such as increased or decreased (or no), enzymatic activity as compared to a wild-type CRISPR protein.
Several small stretches of unstructured regions are predicted within the Cas9 primary structure. Unstructured regions that are solvent-exposed and not conserved among the different Cas9 orthologs are preferred sites for splitting the protein and for inserting small protein sequences. In addition, these sites can be used to generate chimeric proteins between Cas9 orthologs.
Based on the above information, mutants can be generated that inactivate enzymes or modify double-stranded nucleases to have nickase activity. In alternative embodiments, this information is used to develop enzymes with reduced off-target effects (described elsewhere herein).
Suitable Cas9 enzyme modifications to enhance specificity, particularly by reducing off-target effects, are described, for example, in PCT/US2016/038034, which is incorporated herein by reference in its entirety. In particular embodiments, reduction of off-target cleavage is ensured by destabilizing strand separation, more specifically by introducing mutations in the Cas9 enzyme that reduce the positive charge in the DNA-interacting regions (as described herein and further exemplified for Cas9 by Slaymaker et al. 2016 (Science, 1; 351(6268):84-8)). In further embodiments, reduction of off-target cleavage is ensured by introducing mutations into the Cas9 enzyme that affect the interaction between the target strand and the guide RNA sequence, more specifically that disrupt the interaction between Cas9 and the phosphate backbone of the target DNA strand in such a way that on-target activity is retained but off-target activity is reduced (as described for Cas9 by Kleinstiver et al. 2016, Nature, 28; 529(7587):490-5). In particular embodiments, off-target activity is reduced via a modified Cas9 in which the interaction with both the target and non-target strands is modified compared to wild-type Cas9.
Methods and mutations that can be used in various combinations to increase or decrease on-target activity and/or specificity relative to off-target activity, or to increase or decrease on-target binding and/or specificity relative to off-target binding, can be used to compensate for or enhance mutations or modifications intended to promote other effects. Such mutations or modifications intended to promote other effects include mutations or modifications to the Cas9 effector protein and/or mutations or modifications to the guide RNA.
The specificity of Cas9 can be further improved by mutating residues that stabilize the non-targeted DNA strand, using a strategy similar to that previously used to improve Cas9 specificity (Slaymaker et al. 2015, "Rationally engineered Cas9 nucleases with improved specificity"). This can be done without a crystal structure by using linear structure alignments to predict 1) which domain of Cas9 binds to which strand of DNA, and 2) which residues in these domains contact the DNA.
However, this approach may be limited due to poor conservation of Cas9 with known proteins. Thus, it may be desirable to probe the function of all possible DNA interacting amino acids (lysine, histidine and arginine).
Catalytically active Cas9 protein makes a blunt cut, whereby the cleavage site is usually located within the target sequence. More specifically, the cut is typically 2-3 nucleotides upstream of the PAM. In particular embodiments, the cut on the non-target strand is 3 nucleotides upstream of the PAM (i.e., between the 3rd and 4th nucleotides upstream of the PAM), while the cut on the target strand (i.e., the strand that hybridizes to the guide sequence) occurs at the same position on the complementary strand (that is, 3 nucleotides upstream of the PAM complement on the 3' strand, or between nucleotides 3 and 4 upstream of the PAM complement).
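The cut-site geometry described above can be worked through with a small Python sketch. It assumes a 0-based index for the first PAM nucleotide on the non-target strand written 5' to 3', and uses a made-up 20-nt protospacer followed by a TGG PAM; the helper names are hypothetical.

```python
def cut_index(pam_start: int, offset: int = 3) -> int:
    """Index at which the non-target strand is cut.

    The cut falls `offset` nucleotides upstream of the PAM, i.e. between the
    3rd and 4th nucleotides upstream of the PAM for the default offset.
    The returned value i means the strand splits into seq[:i] and seq[i:].
    """
    if pam_start < offset:
        raise ValueError("PAM is too close to the 5' end for this offset")
    return pam_start - offset


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


# Toy target (assumption): 20-nt protospacer followed by a 3-nt PAM.
protospacer = "ACGTACGTACGTACGTACGT"
pam = "TGG"
non_target = protospacer + pam

i = cut_index(pam_start=len(protospacer))
left, right = non_target[:i], non_target[i:]

# For a blunt cut, the complementary strand is cut at the same position,
# which lies len(seq) - i nucleotides from its own 5' end.
target = reverse_complement(non_target)
j = len(non_target) - i
print(left, right, target[:j], target[j:])
```

Exactly three nucleotides remain between the cut and the PAM on the non-target strand, and the fragment at the 5' end of the complementary strand is the reverse complement of the fragment to the right of the cut, as expected for a blunt end.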
In certain embodiments, one or more catalytic domains of a Cas9 protein (e.g., RuvC I, RuvC II, and RuvC III, or HNH domain of Cas9 protein) are mutated to generate a mutant Cas protein that cleaves only one DNA strand of a target sequence.
As a further guide and not by way of limitation, an aspartate to alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from Streptococcus pyogenes, for example, converts Cas9 from a two-strand cleaving nuclease to a nickase (cleaving a single strand). Other examples of mutations that make Cas9 a nickase include, but are not limited to, H840A, N854A, and N863A. As a further guide, in the case where the enzyme is not SpCas9, mutations can be made at any or all of the residues corresponding to positions 10, 762, 840, 854, 863 and/or 986 of SpCas9 (which can be determined, for example, by standard sequence comparison tools). In particular, any or all of the following mutations are preferred in SpCas9: D10A, E762A, H840A, N854A, N863A and/or D986A; conservative substitutions for any of the substituted amino acids are also contemplated.
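The substitution notation used above (wild-type residue, 1-based position, replacement residue, as in D10A) can be made concrete with a short Python sketch. The helper below is hypothetical and the 15-residue peptide is only a toy sequence chosen so that its tenth residue is aspartate; it is not presented as a sequence of this disclosure.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a point substitution such as 'D10A' to a protein sequence.

    Raises ValueError if the notation is malformed, the position is outside
    the sequence, or the residue found there is not the stated wild type.
    """
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # notation is 1-based
    if index < 0 or index >= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


toy_peptide = "MDKKYSIGLDIGTNS"  # toy sequence, aspartate at position 10
print(apply_substitution(toy_peptide, "D10A"))
```

Checking the wild-type residue before substituting guards against applying SpCas9 numbering to an ortholog whose corresponding residue sits at a different position.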
In a first preferred embodiment, the CRISPR-Cas protein is a SpCas9 nickase with a catalytically inactive HNH domain (e.g., a SpCas9 nickase with the N863A mutation). In a second preferred embodiment, the CRISPR-Cas protein is SaCas9 with catalytically inactive HNH domain (e.g., a SaCas9 nickase with N580A mutation). In a third preferred embodiment, the CRISPR-Cas protein is a SpCas9 nickase with a partially or fully removed HNH domain. In a fourth preferred embodiment, the CRISPR-Cas protein is SaCas9 with the HNH domain partially or completely removed.
In some of the Cas9 enzymes described above, the enzymes are modified by mutation of one or more residues including, but not limited to, positions D917, E1006, E1028, D1227, D1255A, N1257 according to the FnCas9 protein or any corresponding ortholog. In one aspect, the invention provides a composition as discussed herein, wherein the Cas9 enzyme is an inactivated enzyme comprising one or more mutations selected from the group consisting of: D917A, E1006A, E1028A, D1227A, D1255A and N1257A. In one aspect, the present invention provides a composition as discussed herein, wherein the CRISPR-Cas protein comprises D917 or E1006 and D917, or D917 and D1255, according to the respective position in the FnCas9 protein or Cas9 ortholog.
In certain embodiments, the modification or mutation of Cas9 comprises a mutation in a RuvCI, RuvCIII, or HNH domain. In certain embodiments, the modification or mutation comprises an amino acid substitution at one or more of the following positions, relative to the amino acid position numbering of SpCas9: 12, 13, 63, 415, 610, 775, 779, 780, 810, 832, 848, 855, 861, 862, 866, 961, 968, 974, 976, 982, 983, 1000, 1003, 1014, 1047, 1060, 1107, 1108, 1109, 1114, 1129, 1240, 1289, 1296, 1297, 1300, 1311, and 1325; preferably 855; 810, 1003 and 1060; or 848, 1003 and 1060. In certain embodiments, the modifications or mutations at the following positions comprise alanine substitutions: 63, 415, 775, 779, 780, 810, 832, 848, 855, 861, 862, 866, 961, 968, 974, 976, 982, 983, 1000, 1003, 1014, 1047, 1060, 1107, 1108, 1109, 1114, 1129, 1240, 1289, 1296, 1297, 1300, 1311, or 1325; preferably 855; 810, 1003 and 1060; 848, 1003 and 1060; or 497, 661, 695, and 926. In certain embodiments, the modifications include K855A; K810A, K1003A and R1060A; or K848A, K1003A and R1060A (relative to SpCas9). In certain embodiments, the modifications include N497A, R661A, Q695A, and Q926A (relative to SpCas9).
As a further example, two or more catalytic domains of Cas9 (RuvC I, RuvC II, and RuvC III or HNH domains) may be mutated to generate a mutated Cas9 that lacks substantially all DNA cleavage activity. In some embodiments, the D10A mutation is combined with one or more of the H840A, N854A, or N863A mutations to produce a Cas9 enzyme that lacks substantially all DNA cleavage activity. In some embodiments, a CRISPR enzyme is considered to lack substantially all DNA cleavage activity when the DNA cleavage activity of the mutant enzyme is less than about 25%, 10%, 5%, 1%, 0.1%, 0.01% or less relative to its non-mutated form. In the case where the enzyme is not SpCas9, mutations can be made at any or all of the residues corresponding to positions 10, 762, 840, 854, 863 and/or 986 of SpCas9 (which can be determined, for example, by standard sequence comparison tools). In particular, any or all of the following mutations are preferred in SpCas9: D10A, E762A, H840A, N854A, N863A and/or D986A; conservative substitutions for any of the substituted amino acids are also contemplated. The same mutations (or conservative substitutions of these mutations) at corresponding positions in other Cas9 proteins are also preferred. Particularly preferred are D10 and H840 in SpCas9. However, in other Cas9 proteins, residues corresponding to SpCas9 D10 and H840 are also preferred.
In certain embodiments, two different chimeric gRNAs may be used with a Cas9 nickase, which together will introduce cleavage at the target site with an efficiency similar to that of using a single chimeric gRNA. Off-target effects can be reduced in this way because a Cas9 nickase does not have the ability to induce double-strand breaks as wild-type Cas9 does. Such double-nicking approaches are described, for example, in PCT publication nos. WO2014093622 and WO2014204725, which are incorporated herein by reference.
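The passage above refers to residues in another Cas9 that "correspond" to given SpCas9 positions, as determined by standard sequence comparison tools. As a minimal sketch of what that mapping involves, the Python helper below walks a pairwise alignment that is assumed to have been produced beforehand by such a tool and converts a reference position into the aligned position in the other sequence. The function name and the toy alignment are hypothetical.

```python
from typing import Optional


def corresponding_position(
    aligned_ref: str, aligned_query: str, ref_position: int
) -> Optional[int]:
    """Map a 1-based reference position to the aligned 1-based query position.

    Both inputs are equal-length gapped strings from a pairwise alignment.
    Returns None when the reference residue is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res != "-":
            ref_count += 1
        if query_res != "-":
            query_count += 1
        if ref_res != "-" and ref_count == ref_position:
            return query_count if query_res != "-" else None
    raise ValueError("ref_position lies outside the reference sequence")


# Toy alignment (assumption): the query lacks two residues near its N-terminus,
# so reference position 10 corresponds to query position 8.
ref_aln = "MDKKYSIGLDIG"
qry_aln = "M--KYSIGLDIG"
print(corresponding_position(ref_aln, qry_aln, 10))
print(corresponding_position(ref_aln, qry_aln, 2))
```

A None result signals that the ortholog has no aligned residue at that reference position, in which case no corresponding mutation can be read off the alignment.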
Cas12 protein
In certain exemplary embodiments, the compositions, systems, and assays may include multiple Cas12 orthologs or one or more orthologs in combination with one or more Cas9 orthologs. In certain exemplary embodiments, the Cas12 ortholog is a Cpf1 ortholog, a C2C1 ortholog, or a C2C3 ortholog.
Cpf1 ortholog
The present invention encompasses the use of a nickase based on a mutated form of a wild-type Cpf1 effector protein derived from the Cpf1 locus designated as subtype V-A. Such effector proteins are also referred to herein as "Cpf1p", e.g., the Cpf1 protein (and such effector proteins or the Cpf1 protein or proteins derived from the Cpf1 locus are also referred to as "CRISPR enzymes"). Currently, subtype V-A loci include cas1, cas2, a distinct gene designated cpf1, and a CRISPR array. Cpf1 (CRISPR-associated protein Cpf1, subtype PREFRAN) is a large protein (about 1300 amino acids) containing a RuvC-like nuclease domain homologous to the corresponding domain of Cas9, and a portion corresponding to the characteristic arginine-rich cluster of Cas9. However, Cpf1 lacks the HNH nuclease domain present in all Cas9 proteins, and the RuvC-like domain is contiguous in the Cpf1 sequence, whereas in Cas9 it contains long inserts, including the HNH domain. Thus, in particular embodiments, the CRISPR-Cas enzyme comprises only a RuvC-like nuclease domain.
The terms "orthologue" (also referred to herein as "ortholog") and "homologue" (also referred to herein as "homolog") are well known in the art. By way of further guidance, a "homolog" of a protein as used herein is a protein of the same species that performs the same or a similar function as the protein of which it is a homolog. Homologous proteins may, but need not, be structurally related, or only partially structurally related. An "ortholog" of a protein as used herein is a protein of a different species that performs the same or a similar function as the protein of which it is an ortholog. Orthologous proteins may, but need not, be structurally related, or only partially structurally related. Homologs and orthologs can be identified by homology modeling (see, e.g., Greer, Science, vol. 228 (1985) 1055 and Blundell et al., Eur J Biochem, vol. 172 (1988), 513) or "structural BLAST" (Dey F, Cliff Zhang Q, Petrey D, Honig B. Toward a "structural BLAST": using structural relationships to infer function. Protein Sci. 2013 Apr;22(4):359-66. doi: 10.1002/pro.2225.). See also Shmakov et al. (2015) for applications in the field of CRISPR-Cas loci.
The Cpf1 gene is present in several different bacterial genomes, typically in the same locus as the cas1, cas2 and cas4 genes and a CRISPR cassette (e.g., FNFX1_1431-FNFX1_1428 of Francisella cf. novicida Fx1). Thus, the layout of this putative novel CRISPR-Cas system appears to be similar to that of type II-B. Furthermore, similar to Cas9, the Cpf1 protein contains an easily identifiable C-terminal region that is homologous to transposon ORF-B and includes an active RuvC-like nuclease, an arginine-rich region, and a Zn finger (absent in Cas9). However, unlike Cas9, Cpf1 is also present in several genomes without a CRISPR-Cas context, and its relatively high similarity to ORF-B suggests that it might be a transposon component. It was suggested that if this is a genuine CRISPR-Cas system and Cpf1 is a functional analog of Cas9, it would be a novel CRISPR-Cas type, namely type V (see Annotation and Classification of CRISPR-Cas Systems. Makarova KS, Koonin EV. Methods Mol Biol. 2015;1311:47-75). However, as described herein, Cpf1 is designated as subtype V-A to distinguish it from C2C1p, which does not have the same domain structure and is therefore designated as subtype V-B.
In particular embodiments, the effector protein is a Cpf1 effector protein from an organism derived from a genus comprising: streptococcus, Campylobacter, nitrate lysis bacteria, Staphylococcus, Microclavus, Rogowsonia, Neisseria, gluconacetobacter, Azospirillum, Spirochacterium, Lactobacillus, Eubacterium, Corynebacterium, Carnobacterium, rhodobacter, Listeria, Marsh Bacillus, Clostridium, Lachnospiraceae, Clostridia, Cicilia, Francisella, Legionella, Alicyclobacillus, Methanophilus, Porphyromonas, Prevotella, Bacteroides, traudiococcus, Leptospira, Desulfuricus, Desulfobacter, Bluesaceae, Phyllobacterium, Bacillus, Brevibacterium, Methylobacterium, or Aminococcus.
In further particular embodiments, the Cpf1 effector protein is from an organism selected from the group consisting of: streptococcus mutans, Streptococcus agalactiae, Streptococcus equisimilis, Streptococcus sanguis, and Streptococcus pneumoniae; campylobacter jejuni, campylobacter coli; salsuginis, n tergarcus; staphylococcus aureus, staphylococcus carnosus; neisseria meningitidis, neisseria gonorrhoeae; listeria monocytogenes, listeria monocytogenes; clostridium botulinum, clostridium difficile, clostridium tetani, clostridium sojae.
The nickase enzyme may comprise a chimeric protein comprising a first fragment from an ortholog of a first effector protein (e.g., Cpf1) and a second fragment from an ortholog of a second effector protein (e.g., Cpf1), and wherein the first and second effector protein orthologs are different. At least one of the first and second effector protein (e.g., Cpf1) orthologs may comprise an effector protein (e.g., Cpf1) from an organism comprising: streptococcus, Campylobacter, nitrate lysis bacteria, Staphylococcus, Microclavus, Rogowsonia, Neisseria, gluconacetobacter, Azospirillum, Spirosoma, Lactobacillus, Eubacterium, Corynebacterium, Carnobacterium, rhodobacter, Listeria, Marsh Bacillus, Clostridium, Lachnospiraceae, Clostridia, Cicilia, Francisella, Legionella, Alicyclobacillus, Methanophilus, Porphyromonas, Prevotella, Bacteroides, Paederus, Trichostoma, Leptospira, Desulfurophyces, Desulfobacter, Fenugiaceae, Phyllobacterium, Bacillus, Brevibacterium, Methylobacterium, or Aminococcus; for example, a chimeric effector protein comprising a first fragment and a second fragment, wherein each of the first fragment and the second fragment is selected from Cpf1 of an organism comprising: streptococcus, campylobacter, nitrolytic bacteria, staphylococcus, parvulus, roche, neisseria, gluconacetobacter, azospirillum, unisporum, lactobacillus, eubacterium, corynebacterium, carnobacterium, rhodobacter, listeria, swamp bacillus, clostridium, lachnospiraceae, clostridium, leptospiridium, cilium, franciscium, legionella, alicyclobacillus, methanophilus, porphyromonas, prevotella, bacteroidetes, traudiococcus, leptospira, desulfuricus, sulfosalinobacterium, celulaceae, phyromobacterium, bacillus, brevibacillus, methylobacter, or aminoacidococcus, wherein the first and second fragments are not from the same bacterium; for example, a chimeric effector protein comprising a first fragment and a second fragment, wherein each of the first fragment and the second fragment is selected 
from Cpf 1: streptococcus mutans, Streptococcus agalactiae, Streptococcus equisimilis, Streptococcus sanguis, and Streptococcus pneumoniae; campylobacter jejuni, campylobacter coli; salsuginis, n tergarcus; staphylococcus aureus, staphylococcus carnosus; neisseria meningitidis, neisseria gonorrhoeae; listeria monocytogenes, listeria monocytogenes; clostridium botulinum, clostridium difficile, clostridium tetani, clostridium sojae, francisella tularensis 1, prevotella easily, lachnospiraceae MC 20171, vibrio proteolyticus, isocratic bacteria GW2011_ GWA2_33_10, centipede bacteria GW2011_ GWC2_44_17, smith bacteria SCADC, aminoacetococcus BV3L6, lachnospiraceae MA2020, candidate termite methanogen, shigella, moraxella bovis 237, leptospira paddychii, lachnospiraceae bacteria ND2006, porphyromonas canis 3, prevotella saccharolytica, and porphyromonas macaque, wherein the first and second fragments are not from the same bacterium.
In a more preferred embodiment, the Cpf1p nickase is derived from a bacterial species selected from the group consisting of: francisella tularensis 1, Prevotella facilis, Prospirochaetaceae MC 20171, Vibrio proteolyticus, Heterophaera bacterium GW2011_ GWA2_33_10, Umochloa bacteria GW2011_ GWC2_44_17, SciSenella species SCADC, Aminococcus species BV3L6, Prospirochaetaceae bacterium MA2020, candidate termite methane mycoplasma, shiitake bacteria, Moraxella bovis 237, Leptospira paddy, Prospirochaetaceae bacteria ND2006, Porphyromonas canis 3, Prevotella saccharolytica, and Porphyromonas kiwii. In certain embodiments, Cpf1p is derived from a bacterial species selected from the group consisting of: aminococcus BV3L6, Lachnospiraceae MA 2020. In certain embodiments, the effector protein is derived from a subspecies of francisella tularensis 1, including but not limited to, the neotamer subspecies of francisella tularensis.
In some embodiments, the Cpf1p nickase is derived from an organism from the genus eubacterium. In some embodiments, the CRISPR nickase is derived from an organism of the bacterial species eubacterium recta. In some embodiments, the amino acid sequence of the wild-type Cpf1 effector protein corresponds to NCBI reference sequence WP _055225123.1, NCBI reference sequence WP _055237260.1, NCBI reference sequence WP _055272206.1, or GenBank ID OLA 16049.1. In some embodiments, the Cpf1 effector protein has at least 60%, more particularly at least 70%, such as at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, for example, at least 95% sequence homology or sequence identity to NCBI reference sequence WP _055225123.1, NCBI reference sequence WP _055237260.1, NCBI reference sequence WP _055272206.1, or GenBank ID OLA 16049.1. The skilled person will appreciate that this includes truncated forms of the Cpf1 protein, whereby sequence identity is determined over the length of the truncated form. In some embodiments, the Cpf1 effector protein recognizes the PAM sequence of TTTN or CTTN.
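As an illustration of the PAM recognition stated above, the following Python sketch lists every occurrence of TTTN or CTTN on one strand of a DNA sequence. It only locates the motif and makes no claim about which occurrences would support cleavage; the function name and the example sequence are hypothetical.

```python
import re
from typing import List, Tuple

# TTTN or CTTN, where N is any of A, C, G, T. The lookahead lets the scan
# report overlapping occurrences as well.
_PAM = re.compile(r"(?=([TC]TT[ACGT]))")


def find_pam_sites(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start index, matched 4-mer) for each TTTN/CTTN site."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in _PAM.finditer(sequence)]


# Made-up example sequence containing one TTTN and one CTTN site.
print(find_pam_sites("GATTTACCTTGAA"))
```

Scanning the reverse complement in the same way would give the candidate sites on the opposite strand.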
In particular embodiments, a homolog or ortholog of Cpf1as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence homology or identity with Cpf 1. In further embodiments, a homolog or ortholog of Cpf1as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as e.g. at least 95% sequence identity with wild-type Cpf 1. Where Cpf1 has one or more mutations (is mutated), the homolog or ortholog of Cpf1as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence identity to the mutated Cpf 1.
In one embodiment, the Cpf1 protein may be an ortholog of an organism of the genus including, but not limited to: a species of the genus Aminococcus, a bacterium of the family Musaceae or Moraxella bovis; in particular embodiments, the V-type Cas protein may be an ortholog of an organism including, but not limited to, the species: the species Aminococcus BV3L6, the bacterium of the family Lachnospiraceae ND2006(LbCpf1) or Moraxella bovis. In particular embodiments, a homolog or ortholog of Cpf1as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as, for example, at least 95% sequence homology or identity with one or more of the Cpf1 sequences disclosed herein. In further embodiments, a homolog or ortholog of Cpf1as as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as e.g. at least 95% sequence identity with wild-type FnCpf1, AsCpf1 or LbCpf 1.
In particular embodiments, a Cpf1 protein of the invention has at least 60%, more particularly at least 70%, such as at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence homology or identity with FnCpf1, ascipf 1 or LbCpf 1. In further embodiments, a Cpf1 protein as referred to herein has at least 60%, such as at least 70%, more particularly at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence identity with wild-type aspcf 1 or LbCpf 1. In particular embodiments, the Cpf1 protein of the invention has less than 60% sequence identity with FnCpf 1. The skilled person will appreciate that this includes truncated forms of the Cpf1 protein, whereby sequence identity is determined over the length of the truncated form.
In some embodiments, the Cpf1 nickase comprises a mutation in the Nuc domain. In some embodiments, the Cpf1 nickase is capable of nicking a non-targeted DNA strand at a target locus of interest displaced by heteroduplex formation between the targeted DNA strand and the guide molecule. In some embodiments, the Cpf1 nickase comprises a mutation corresponding to R1226A in AsCpf1.
As a further guide and not by way of limitation, an arginine to alanine substitution (R1226A) in the Nuc domain of Cpf1 from Acidaminococcus sp. converts Cpf1 from a nuclease that cleaves both strands to a nickase (which cleaves a single strand). One skilled in the art will appreciate that in the case where the enzyme is not AsCpf1, a mutation may be made at the residue at the corresponding position. In particular embodiments, Cpf1 is FnCpf1 and the mutation is at the arginine at position R1218. In particular embodiments, Cpf1 is LbCpf1 and the mutation is at the arginine at position R1138. In particular embodiments, Cpf1 is MbCpf1 and the mutation is at the arginine at position R1293.
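Finding "the residue at the corresponding position" in another ortholog is usually done by aligning the two protein sequences and reading across the alignment column. The sketch below shows only that lookup step, assuming a pairwise alignment is already available; the function name and the toy aligned strings are hypothetical and do not represent real Cpf1 sequences.

```python
from typing import Optional, Tuple


def corresponding_position(
    aligned_ref: str, aligned_other: str, ref_pos: int
) -> Optional[Tuple[int, str]]:
    """Map a 1-based residue number in the reference onto the other sequence.

    Inputs are two rows of one pairwise alignment ('-' for gaps).
    Returns (position, residue) in the other sequence, or None when the
    reference residue is aligned to a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    other_count = 0
    for ref_res, other_res in zip(aligned_ref, aligned_other):
        if ref_res != "-":
            ref_count += 1
        if other_res != "-":
            other_count += 1
        if ref_res != "-" and ref_count == ref_pos:
            return None if other_res == "-" else (other_count, other_res)
    raise ValueError("ref_pos is beyond the end of the reference sequence")


# Toy alignment: the reference arginine at position 3 lines up with
# position 4 of the other sequence because of a one-residue insertion.
print(corresponding_position("MK-RLA", "MSQRLA", 3))
```

In practice the alignment would come from one of the alignment algorithms named elsewhere in this document, and the mapped residue would be checked to be an arginine before substituting it.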
C2C1 ortholog
The present invention encompasses the use of a C2C1-based nickase derived from a C2C1 locus designated as subtype V-B. Such effector proteins are also referred to herein as "C2C1p", e.g., a C2C1 protein (and such an effector protein or C2C1 protein or protein derived from a C2C1 locus is also referred to as a "CRISPR enzyme"). Currently, the subtype V-B loci include a Cas1-Cas4 fusion, Cas2, a distinct gene designated C2C1, and a CRISPR array. C2C1 (CRISPR-associated protein C2C1) is a large protein (about 1100-1300 amino acids) containing a RuvC-like nuclease domain homologous to the corresponding domain of Cas9 and a counterpart to the characteristic arginine-rich cluster of Cas9. However, C2C1 lacks the HNH nuclease domain present in all Cas9 proteins, and the RuvC-like domain is contiguous in the C2C1 sequence, in contrast to Cas9, where it contains a long insert including the HNH domain. Thus, in particular embodiments, the CRISPR-Cas enzyme comprises only a RuvC-like nuclease domain.
The C2C1 (also known as Cas12b) protein is an RNA-guided nuclease. Its cleavage relies on a tracrRNA to recruit a guide RNA comprising a guide sequence and a direct repeat sequence, wherein the guide sequence hybridizes to a target nucleotide sequence to form a DNA/RNA heteroduplex. Based on current studies, C2C1 nuclease activity also requires recognition of a PAM sequence. The C2C1 PAM sequence is a T-rich sequence. In some embodiments, the PAM sequence is 5' TTN 3' or 5' ATTN 3', wherein N is any nucleotide. In particular embodiments, the PAM sequence is 5' TTC 3'. In particular embodiments, the PAM is within the sequence of Plasmodium falciparum.
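The PAM motifs named above (5' TTN 3', 5' ATTN 3', 5' TTC 3') can be located in a DNA sequence with a simple pattern scan. The following is a minimal sketch, assuming an uppercase or lowercase A/C/G/T input read 5' to 3' on one strand; the function name and the example sequence are hypothetical, and the sketch does not model which side of the PAM the protospacer lies on or the second strand.

```python
import re
from typing import List, Tuple


def find_pams(seq: str, pam: str = "TTN") -> List[Tuple[int, str]]:
    """Return (0-based start, matched bases) for every PAM occurrence.

    'N' in the PAM stands for any nucleotide. A lookahead is used so that
    overlapping occurrences are all reported.
    """
    pattern = "".join("[ACGT]" if base == "N" else base for base in pam.upper())
    return [
        (match.start(), match.group(1))
        for match in re.finditer(f"(?=({pattern}))", seq.upper())
    ]


example = "ATTCGTTA"  # hypothetical sequence
print(find_pams(example, "TTN"))
print(find_pams(example, "ATTN"))
print(find_pams(example, "TTC"))
```

Scanning the reverse complement with the same function would cover the opposite strand.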
C2C1 creates a staggered cut at the target locus, with a 5' overhang, or "sticky end", on the PAM-distal side of the target sequence. In some embodiments, the 5' overhang is 7 nt. See Lewis and Ke, Mol Cell. 2017 Feb 2; 65(3):377-379.
The C2C1 gene is present in several different bacterial genomes, typically in the same locus as the cas1, cas2 and cas4 genes and the CRISPR cassette. Thus, the layout of this putative novel CRISPR-Cas system appears to be similar to that of type II-B. Furthermore, similar to Cas9, the C2C1 protein contains an active RuvC-like nuclease, an arginine-rich region, and a Zn finger (absent from Cas9).
In particular embodiments, the CRISPR nickase is a C2C1 nickase from an organism of a genus comprising: Alicyclobacillus, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacillus, Candidatus, Desulfatirhabdium, Citrobacter, Elusimicrobia, Methylobacterium, Omnitrophica, Phycisphaerae, Planctomycetes, Spirochaetes, and Verrucomicrobiaceae.
In further particular embodiments, the C2C1 nickase is from a species selected from the group consisting of: Alicyclobacillus acidoterrestris (e.g., ATCC 49025), Alicyclobacillus contaminans (e.g., DSM 17975), Alicyclobacillus macrosporangiidus (e.g., DSM 17980), Bacillus hisashii strain C4, Candidatus Lindowbacteria bacterium RIFCSPLOWO2, Desulfovibrio inopinatus (e.g., DSM 10711), Desulfonatronum thiodismutans (e.g., strain MLF-1), Elusimicrobia bacterium RIFOXYA12, Omnitrophica WOR_2 bacterium RIFCSPHIGHO2, Opitutaceae bacterium TAV5, Phycisphaerae bacterium ST-NAGAB-D1, Planctomycetes bacterium RBG_13_46_10, Spirochaetes bacterium GWB1_27_13, Verrucomicrobiaceae bacterium UBA2429, Tuberibacillus calidus (e.g., DSM 17572), Bacillus thermoamylovorans (e.g., strain B4166), Brevibacillus sp. CF112, Bacillus sp. NSP2.1, Desulfatirhabdium butyrativorans (e.g., DSM 18734), Alicyclobacillus herbarius (e.g., DSM 13609), Citrobacter freundii (e.g., ATCC 8090), Brevibacillus agri (e.g., BAB-2500), Methylobacterium nodulans (e.g., ORS 2060).
The nickase may comprise a chimeric effector protein comprising a first fragment from a first effector protein (e.g., C2C1) ortholog and a second fragment from a second effector protein (e.g., C2C1) ortholog, wherein the first and second effector protein orthologs are different. At least one of the first and second effector protein (e.g., C2C1) orthologs may comprise an effector protein (e.g., C2C1) from an organism comprising: Alicyclobacillus, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacillus, Candidatus, Desulfatirhabdium, Citrobacter, Elusimicrobia, Methylobacterium, Omnitrophica, Phycisphaerae, Planctomycetes, Spirochaetes, and Verrucomicrobiaceae; for example, a chimeric effector protein comprising a first fragment and a second fragment, wherein the first fragment and the second fragment are each selected from the C2C1 of an organism comprising: Alicyclobacillus, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacillus, Candidatus, Desulfatirhabdium, Elusimicrobia, Citrobacter, Methylobacterium, Omnitrophica, Phycisphaerae, Planctomycetes, Spirochaetes, and Verrucomicrobiaceae, wherein the first fragment and the second fragment are not from the same bacterium; for example, a chimeric effector protein comprising a first fragment and a second fragment, wherein the first fragment and the second fragment are each selected from the C2C1 of: Alicyclobacillus acidoterrestris (e.g., ATCC 49025), Alicyclobacillus contaminans (e.g., DSM 17975), Alicyclobacillus macrosporangiidus (e.g., DSM 17980), Bacillus hisashii strain C4, Candidatus Lindowbacteria bacterium RIFCSPLOWO2, Desulfovibrio inopinatus (e.g., DSM 10711), Desulfonatronum thiodismutans (e.g., strain MLF-1), Elusimicrobia bacterium RIFOXYA12, Omnitrophica WOR_2 bacterium RIFCSPHIGHO2, Opitutaceae bacterium TAV5, Phycisphaerae bacterium ST-NAGAB-D1, Planctomycetes bacterium RBG_13_46_10, Spirochaetes bacterium GWB1_27_13, Verrucomicrobiaceae bacterium UBA2429, Tuberibacillus calidus (e.g., DSM 17572), Bacillus thermoamylovorans (e.g., strain B4166), Brevibacillus sp. CF112, Bacillus sp. NSP2.1, Desulfatirhabdium butyrativorans (e.g., DSM 18734), Alicyclobacillus herbarius (e.g., DSM 13609), Citrobacter freundii (e.g., ATCC 8090), Brevibacillus agri (e.g., BAB-2500), Methylobacterium nodulans (e.g., ORS 2060), wherein the first fragment and the second fragment are not from the same bacterium.
In a more preferred embodiment, the C2C1p nickase is derived from a species selected from the group consisting of: Alicyclobacillus acidoterrestris (e.g., ATCC 49025), Alicyclobacillus contaminans (e.g., DSM 17975), Alicyclobacillus macrosporangiidus (e.g., DSM 17980), Bacillus hisashii strain C4, Candidatus Lindowbacteria bacterium RIFCSPLOWO2, Desulfovibrio inopinatus (e.g., DSM 10711), Desulfonatronum thiodismutans (e.g., strain MLF-1), Elusimicrobia bacterium RIFOXYA12, Omnitrophica WOR_2 bacterium RIFCSPHIGHO2, Opitutaceae bacterium TAV5, Phycisphaerae bacterium ST-NAGAB-D1, Planctomycetes bacterium RBG_13_46_10, Spirochaetes bacterium GWB1_27_13, Verrucomicrobiaceae bacterium UBA2429, Tuberibacillus calidus (e.g., DSM 17572), Bacillus thermoamylovorans (e.g., strain B4166), Brevibacillus sp. CF112, Bacillus sp. NSP2.1, Desulfatirhabdium butyrativorans (e.g., DSM 18734), Alicyclobacillus herbarius (e.g., DSM 13609), Citrobacter freundii (e.g., ATCC 8090), Brevibacillus agri (e.g., BAB-2500), Methylobacterium nodulans (e.g., ORS 2060). In certain embodiments, C2C1p is derived from a bacterial species selected from the group consisting of: Alicyclobacillus acidoterrestris (e.g., ATCC 49025) and Alicyclobacillus contaminans (e.g., DSM 17975).
In particular embodiments, the homolog or ortholog of C2C1 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence homology or identity with C2C1. In further embodiments, a homolog or ortholog of C2C1 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence identity with wild-type C2C1. Where C2C1 has one or more mutations (is mutated), the homolog or ortholog of C2C1 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence identity with the mutated C2C1.
In one embodiment, the C2C1 protein may be an ortholog of an organism of a genus including, but not limited to: Alicyclobacillus, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacillus, Candidatus, Desulfatirhabdium, Citrobacter, Elusimicrobia, Methylobacterium, Omnitrophica, Phycisphaerae, Planctomycetes, Spirochaetes, and Verrucomicrobiaceae; in particular embodiments, the type V Cas protein may be an ortholog of an organism of a species including, but not limited to: Alicyclobacillus acidoterrestris (e.g., ATCC 49025), Alicyclobacillus contaminans (e.g., DSM 17975), Alicyclobacillus macrosporangiidus (e.g., DSM 17980), Bacillus hisashii strain C4, Candidatus Lindowbacteria bacterium RIFCSPLOWO2, Desulfovibrio inopinatus (e.g., DSM 10711), Desulfonatronum thiodismutans (e.g., strain MLF-1), Elusimicrobia bacterium RIFOXYA12, Omnitrophica WOR_2 bacterium RIFCSPHIGHO2, Opitutaceae bacterium TAV5, Phycisphaerae bacterium ST-NAGAB-D1, Planctomycetes bacterium RBG_13_46_10, Spirochaetes bacterium GWB1_27_13, Verrucomicrobiaceae bacterium UBA2429, Tuberibacillus calidus (e.g., DSM 17572), Bacillus thermoamylovorans (e.g., strain B4166), Brevibacillus sp. CF112, Bacillus sp. NSP2.1, Desulfatirhabdium butyrativorans (e.g., DSM 18734), Alicyclobacillus herbarius (e.g., DSM 13609), Citrobacter freundii (e.g., ATCC 8090), Brevibacillus agri (e.g., BAB-2500), Methylobacterium nodulans (e.g., ORS 2060). In particular embodiments, the homolog or ortholog of C2C1 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence homology or identity with one or more of the C2C1 sequences disclosed herein. In further embodiments, a homolog or ortholog of C2C1 as referred to herein has at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence identity with wild-type AacC2C1 or BthC2C1.
In particular embodiments, the C2C1 nickase of the invention has a sequence homology or identity of at least 60%, more particularly at least 70%, such as at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95%, with AacC2C1 or BthC2C1. In a further embodiment, the C2C1 protein as referred to herein has at least 60%, such as at least 70%, more particularly at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for example at least 95% sequence identity with wild-type AacC2C1. In a particular embodiment, the C2C1 protein of the invention has less than 60% sequence identity with AacC2C1. The skilled person will appreciate that this includes truncated forms of the C2C1 protein, whereby sequence identity is determined over the length of the truncated form.
In certain embodiments, the C2C1 nickase can be provided or expressed transiently or stably in an in vitro system or in a cell, and targeted or triggered to non-specifically cleave cellular nucleic acids. In one embodiment, C2C1 is engineered to knock down ssDNA, e.g., viral ssDNA. In another embodiment, C2C1 is engineered to knock down RNA. The system may be designed such that knockdown is dependent on the presence of target DNA in the cell or in vitro system, or is triggered by the addition of target nucleic acid to the system or cell.
In certain embodiments, the C2C1 protein is a catalytically inactive C2C1, which comprises a mutation in the RuvC domain. In some embodiments, the catalytically inactive C2C1 protein comprises a mutation corresponding to amino acid position D570, E848, or D977 in Alicyclobacillus acidoterrestris C2C1. In some embodiments, the catalytically inactive C2C1 protein comprises a mutation corresponding to D570A, E848A, or D977A in Alicyclobacillus acidoterrestris C2C1.
In certain embodiments, the Cas-based nickase is a C2C1 nickase comprising a mutation in the Nuc domain. In some embodiments, the C2C1 nickase comprises a mutation corresponding to amino acid position R911, R1000, or R1015 in Alicyclobacillus acidoterrestris C2C1. In some embodiments, the C2C1 nickase comprises a mutation corresponding to R911A, R1000A, or R1015A in Alicyclobacillus acidoterrestris C2C1. The skilled person will understand that where the enzyme is not one of the CRISPR-Cas enzymes listed above, the mutation may be made at the residue at the corresponding position.
Mutations can also be made at neighboring residues, for example at amino acids close to those indicated above as being involved in nuclease activity. In some embodiments, only the RuvC domain is inactivated, while in other embodiments, another putative nuclease domain is inactivated, wherein the effector protein complex acts as a nickase and cleaves only one DNA strand. In some embodiments, two CRISPR-Cas variants (each a different nickase) are used to increase specificity: the two nickase variants together cleave the DNA at the target (where both nickases each cleave a DNA strand), while off-target modifications, where only one DNA strand is cleaved and subsequently repaired, are minimized or eliminated.
In certain embodiments, the C2C1 effector protein cleaves a sequence associated with or at a target site of interest as a homodimer comprising two C2C1 effector protein molecules. In a preferred embodiment, the homodimer may comprise two C2C1 effector protein molecules comprising different mutations in the respective RuvC domains.
Guide sequences
The terms "guide sequence", "crRNA", "guide RNA", "single guide RNA", "gRNA" or "guide molecule" as used herein refer to a polynucleotide comprising any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and to direct sequence-specific binding of a nucleic acid-targeting complex, comprising the guide sequence and a CRISPR effector protein, to the target nucleic acid sequence. In some exemplary embodiments, the degree of complementarity is about or greater than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99% or greater when optimally aligned using a suitable alignment algorithm. The optimal alignment may be determined by means of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available on www.novocraft.com), ELAND (Illumina, San Diego, CA), SOAP (available on SOAP. The ability of the guide sequence (within the nucleic acid-targeting guide RNA) to direct sequence-specific binding of the nucleic acid-targeting complex to the target nucleic acid sequence can be assessed by any suitable assay. For example, components of a nucleic acid-targeting CRISPR system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, can be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with a vector encoding the components of the nucleic acid-targeting complex, followed by assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by a Surveyor assay as described herein. 
Similarly, cleavage of a target nucleic acid sequence can be assessed in vitro by providing the target nucleic acid sequence, components of the nucleic acid-targeting complex (including the guide sequence to be tested), and a control guide sequence that is different from the test guide sequence, and comparing the binding or cleavage rate at the target sequence between reactions of the test guide sequence and the control guide sequence. Other assays are possible and will occur to those of skill in the art. The guide sequence, and thus the nucleic acid-targeting guide, can be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of: messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), microRNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double-stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
In some embodiments, the nucleic acid-targeting guide is selected to reduce the degree of secondary structure within the nucleic acid-targeting guide. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1% or fewer of the nucleotides of the nucleic acid-targeting guide participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another exemplary folding algorithm is the online web server RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, which uses the centroid structure prediction algorithm (see, e.g., A.R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
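Once a folding program has returned a predicted structure, the fraction of guide nucleotides engaged in self-complementary base pairing can be read directly from the structure string. The sketch below assumes the structure is supplied in the common dot-bracket notation (`(` and `)` for paired positions, `.` for unpaired ones), as produced by tools such as RNAfold; the function name and the example structure are hypothetical, and the folding step itself is not performed here.

```python
def paired_fraction(dot_bracket: str) -> float:
    """Fraction of positions that are base-paired in a dot-bracket structure."""
    if not dot_bracket:
        raise ValueError("empty structure")
    depth = 0
    paired = 0
    for symbol in dot_bracket:
        if symbol == "(":
            depth += 1
            paired += 1
        elif symbol == ")":
            depth -= 1
            paired += 1
            if depth < 0:
                raise ValueError("unbalanced structure")
        elif symbol != ".":
            raise ValueError(f"unexpected character: {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure")
    return paired / len(dot_bracket)


# Hypothetical 20-nt guide with a 4-bp hairpin: 8 of 20 positions are paired.
structure = "((((....))))........"
print(f"{100 * paired_fraction(structure):.0f}% of nucleotides are paired")
```

The resulting percentage can be compared with the thresholds recited above (for example about 50% or less, or about 10% or less) when ranking candidate guides.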
In certain embodiments, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence. In certain embodiments, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence. In certain embodiments, the direct repeat sequence may be located upstream (i.e., 5') of the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3') of the guide sequence or spacer sequence.
In certain embodiments, the crRNA comprises a stem loop, preferably a single stem loop. In certain embodiments, the direct repeat sequence forms a stem loop, preferably a single stem loop.
In certain embodiments, the spacer of the guide RNA is 15 to 35nt in length. In certain embodiments, the spacer of the guide RNA is at least 15 nucleotides in length. In certain embodiments, the spacer is 15 to 17nt in length, e.g., 15, 16, or 17 nt; 17 to 20nt, such as 17, 18, 19 or 20 nt; 20 to 24nt, such as 20, 21, 22, 23 or 24 nt; 23 to 25nt, such as 23, 24 or 25 nt; 24 to 27nt, such as 24, 25, 26 or 27 nt; 27-30nt, such as 27, 28, 29, or 30 nt; 30-35nt, such as 30, 31, 32, 33, 34, or 35 nt; or 35nt or more.
Generally, a CRISPR-Cas, CRISPR-Cas9, or CRISPR system, as used in the foregoing documents such as WO2014/093622 (PCT/US2013/074667), refers collectively to transcripts and other elements involved in the expression of, or directing the activity of, CRISPR-associated ("Cas") genes, including sequences encoding a Cas gene (in particular, a Cas9 gene in the case of CRISPR-Cas9), a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr mate sequence (encompassing a "direct repeat" and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a "spacer" in the context of an endogenous CRISPR system), or "RNA(s)" as that term is used herein (e.g., one or more RNAs to guide Cas9, e.g., CRISPR RNA and trans-activating (tracr) RNA, or a single guide RNA (sgRNA) (chimeric RNA)), or other sequences and transcripts from a CRISPR locus. Generally, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). In the context of formation of a CRISPR complex, a "target sequence" refers to a sequence to which a guide sequence is designed to have complementarity, wherein hybridization between the target sequence and the guide sequence promotes the formation of the CRISPR complex. The portion of the guide sequence whose complementarity to the target sequence is important for cleavage activity is referred to herein as the seed sequence. The target sequence may comprise any polynucleotide, such as a DNA or RNA polynucleotide. In some embodiments, the target sequence is located in the nucleus or cytoplasm of a cell, and may include nucleic acids in or from mitochondria, organelles, vesicles, liposomes, or particles present within the cell. In some embodiments, particularly for non-nuclear uses, an NLS is not preferred. 
In some embodiments, the CRISPR system comprises one or more nuclear export signals (NES). In some embodiments, the CRISPR system comprises one or more NLS and one or more NES. In some embodiments, direct repeats can be identified in silico by searching for repetitive motifs that fulfill any or all of the following criteria: 1. found in a 2 kb window of genomic sequence flanking the type II CRISPR locus; 2. span from 20 to 50 bp; and 3. interspaced by 20 to 50 bp. In some embodiments, 2 of these criteria may be used, such as 1 and 2, 2 and 3, or 1 and 3. In some embodiments, all 3 criteria may be used.
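The second and third criteria (repeat span of 20 to 50 bp, spacing of 20 to 50 bp) lend themselves to a simple in silico scan. The following is a minimal sketch under stated assumptions: it looks only for exact repeats of one fixed length, it expects the caller to have already extracted the 2 kb flanking window (criterion 1), and the function name, parameters and toy sequence are all hypothetical. Real CRISPR array finders tolerate mismatches and variable repeat lengths, which this sketch does not.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def candidate_direct_repeats(
    window: str, repeat_len: int = 20, min_spacer: int = 20, max_spacer: int = 50
) -> Dict[str, List[Tuple[int, int]]]:
    """Find exact repeats whose consecutive copies are 20-50 bp apart.

    Returns a mapping from repeat sequence to a list of (start_a, start_b)
    pairs of consecutive occurrences that satisfy the spacing criterion.
    """
    if not 20 <= repeat_len <= 50:
        raise ValueError("repeat_len should span 20 to 50 bp")
    starts_by_kmer: Dict[str, List[int]] = defaultdict(list)
    for start in range(len(window) - repeat_len + 1):
        starts_by_kmer[window[start:start + repeat_len]].append(start)

    hits: Dict[str, List[Tuple[int, int]]] = {}
    for kmer, starts in starts_by_kmer.items():
        spaced = [
            (a, b)
            for a, b in zip(starts, starts[1:])
            if min_spacer <= b - (a + repeat_len) <= max_spacer
        ]
        if spaced:
            hits[kmer] = spaced
    return hits


# Toy window: three copies of a 20-nt repeat separated by 30-nt spacers.
repeat = "GTTTTAGAGCTATGCTGTTT"
window = repeat + "A" * 15 + "C" * 15 + repeat + "G" * 15 + "A" * 15 + repeat
print(candidate_direct_repeats(window)[repeat])
```

The window length (criterion 1) is deliberately left to the caller so that the same scan can be run with any two, or all three, of the criteria.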
In embodiments of the invention, the terms guide sequence and guide RNA, i.e., RNA capable of guiding Cas to a target genomic locus, are used interchangeably, as in previously cited documents such as WO2014/093622 (PCT/US2013/074667). Generally, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of the CRISPR complex to the target sequence. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence is about or greater than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more, when optimally aligned using a suitable alignment algorithm. The optimal alignment may be determined by means of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows-Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available on www.novocraft.com), ELAND (Illumina, San Diego, CA), SOAP (available on SOAP. In some embodiments, the guide sequence is about or greater than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75 or more nucleotides in length. In some embodiments, the guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12 or fewer nucleotides in length. Preferably, the guide sequence is 10 to 30 nucleotides in length. The ability of the guide sequence to direct sequence-specific binding of the CRISPR complex to the target sequence can be assessed by any suitable assay. 
For example, components of the CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, can be provided to a host cell having the corresponding target sequence, such as by transfection with a vector encoding the components of the CRISPR sequence, followed by assessment of preferential cleavage within the target sequence, such as by a Surveyor assay as described herein. Similarly, cleavage of a target polynucleotide sequence can be assessed in vitro by providing the target sequence, components of the CRISPR complex (including the guide sequence to be tested), and a control guide sequence different from the test guide sequence, and comparing the binding or cleavage rate at the target sequence between reactions of the test guide sequence and the control guide sequence. Other assays may exist and will occur to those of skill in the art.
In some embodiments of the CRISPR-Cas system, the degree of complementarity between a guide sequence and its corresponding target sequence can be about or greater than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or 100%; the guide, RNA or sgRNA can be about or greater than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or the guide, RNA or sgRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and advantageously the tracrRNA is 30 or 50 nucleotides in length. However, one aspect of the invention is to reduce off-target interactions, e.g., to reduce the interaction of a guide with a target sequence having low complementarity. Indeed, it is shown in the examples that the present invention relates to mutations that enable a CRISPR-Cas system to distinguish a target sequence from off-target sequences having greater than 80% to about 95% complementarity, e.g., 83%-84% or 88%-89% or 94%-95% complementarity (e.g., to distinguish a target having 18 nucleotides from an 18-nucleotide off-target having 1, 2 or 3 mismatches). Thus, in the context of the present invention, the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%. An off-target has less than 100% or 99.9% or 99.5% or 99% or 98.5% or 98% or 97.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide; advantageously, an off-target has 100% or 99.9% or 99.5% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.
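The percentage ranges quoted for an 18-nucleotide off-target with 1, 2 or 3 mismatches follow from simple arithmetic, which the short sketch below reproduces. The function name is hypothetical, and complementarity is taken here simply as the fraction of matched positions over the full guide length, which is an assumption about how the percentages were derived.

```python
def percent_complementarity(length: int, mismatches: int) -> float:
    """Percent of guide positions that pair with the target."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("need length > 0 and 0 <= mismatches <= length")
    return 100.0 * (length - mismatches) / length


# 18-nt target versus off-targets carrying 1, 2 or 3 mismatches.
for mismatches in (1, 2, 3):
    value = percent_complementarity(18, mismatches)
    print(f"{mismatches} mismatch(es): {value:.1f}%")
# prints 94.4%, 88.9% and 83.3%, matching the 94-95%, 88-89% and 83-84% ranges
```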
Guide modifications
In certain embodiments, the guide of the present invention comprises a non-naturally occurring nucleic acid and/or a non-naturally occurring nucleotide and/or nucleotide analog and/or a chemical modification. Non-naturally occurring nucleic acids can include, for example, mixtures of naturally and non-naturally occurring nucleotides. Non-naturally occurring nucleotides and/or nucleotide analogs can be modified at the ribose, phosphate, and/or base moiety. In an embodiment of the invention, the guide nucleic acid comprises ribonucleotides and non-ribonucleotides. In one such embodiment, the guide comprises one or more ribonucleotides and one or more deoxyribonucleotides. In embodiments of the invention, the guide comprises one or more non-naturally occurring nucleotides or nucleotide analogs, such as nucleotides having a phosphorothioate linkage, a boranophosphate linkage, locked nucleic acid (LNA) nucleotides comprising a methylene bridge between the 2' and 4' carbon atoms of the ribose ring, or bridged nucleic acids (BNA). Other examples of modified nucleotides include 2'-O-methyl analogs, 2'-deoxy analogs, 2-thiouridine analogs, N6-methyladenosine analogs, or 2'-fluoro analogs. Other examples of modified bases include, but are not limited to, 2-aminopurine, 5-bromo-uridine, pseudouridine (Ψ), N1-methylpseudouridine (me1Ψ), 5-methoxyuridine (5moU), inosine, and 7-methylguanosine. Examples of guide RNA chemical modifications include, but are not limited to, incorporation of 2'-O-methyl (M), 2'-O-methyl-3'-phosphorothioate (MS), phosphorothioate (PS), S-constrained ethyl (cEt), or 2'-O-methyl-3'-thioPACE (MSP) at one or more terminal nucleotides. Such chemically modified guides may have increased stability and increased activity compared to unmodified guides, although the on-target versus off-target specificity is not predictable. 
(see Hendel, 2015, Nat Biotechnol. 33(9):985-9, doi: 10.1038/nbt.3290, published online 29 June 2015; Ragdarm et al., 2015, PNAS, E7110-E7111; Allerson et al., J. Med. Chem. 2005, 48:901-904; Bramsen et al., Front. Genet., 2012, 3:154; Deng et al., PNAS, 2015, 112:11870-11875; Sharma et al., MedChemComm, 2014, 5:1454-; Hendel et al., Nat. Biotechnol. (2015) 33(9): 985-; Li et al., Nature Biomedical Engineering, 2017, 1, 0066 DOI: 10.1038/s41551-017-0066). In some embodiments, the 5' and/or 3' end of the guide RNA is modified with a variety of functional moieties including fluorescent dyes, polyethylene glycol, cholesterol, proteins, or detection tags (see Kelly et al., 2016, J. Biotech. 233:74-83). In certain embodiments, the guide comprises ribonucleotides in the region that binds to the target DNA and one or more deoxyribonucleotides and/or nucleotide analogs in the region that binds to Cas9, Cpf1, or C2C1. In embodiments of the invention, deoxyribonucleotides and/or nucleotide analogs are incorporated into engineered guide structures such as, but not limited to, the 5' and/or 3' ends, stem-loop regions, and seed regions. In certain embodiments, the modification is not in the 5' handle of the stem-loop region. Chemical modification in the 5' handle of the stem-loop region of the guide may abolish its function (see Li et al., Nature Biomedical Engineering, 2017, 1:0066). In certain embodiments, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides of the guide are chemically modified. In some embodiments, 3-5 nucleotides at the 3' or 5' end of the guide are chemically modified. In some embodiments, only minor modifications, such as 2'-F modifications, are introduced in the seed region. In some embodiments, a 2'-F modification is introduced at the 3' end of the guide. 
In certain embodiments, 3 to 5 nucleotides at the 5' and/or 3' end of the guide are chemically modified with 2'-O-methyl (M), 2'-O-methyl-3'-phosphorothioate (MS), S-constrained ethyl (cEt), or 2'-O-methyl-3'-thioPACE (MSP). Such modifications can improve genome editing efficiency (see Hendel et al., Nat. Biotechnol. (2015) 33(9): 985-). In certain embodiments, all phosphodiester linkages of the guide are replaced with phosphorothioates (PS) to enhance the level of gene disruption. In certain embodiments, more than 5 nucleotides at the 5' and/or 3' end of the guide are chemically modified with 2'-O-Me, 2'-F or S-constrained ethyl (cEt). Such chemically modified guides can mediate enhanced levels of gene disruption (see Ragdarm et al., 2015, PNAS, E7110-E7111). In one embodiment of the invention, the guide is modified to include a chemical moiety at its 3' and/or 5' end. Such moieties include, but are not limited to, amines, azides, alkynes, thio groups, dibenzocyclooctyne (DBCO), or rhodamine. In certain embodiments, the chemical moiety is conjugated to the guide through a linker, such as an alkyl chain. In certain embodiments, the chemical moiety of the modified guide may be used to attach the guide to another molecule, such as DNA, RNA, protein, or a nanoparticle. Such chemically modified guides can be used to identify or enrich for cells generically edited by the CRISPR system (see Lee et al., eLife, 2017, 6:e25312, DOI: 10.7554).
In certain embodiments, a CRISPR system as provided herein can utilize a crRNA or similar polynucleotide comprising a guide sequence, wherein the polynucleotide is an RNA, a DNA, or a mixture of RNA and DNA, and/or wherein the polynucleotide comprises one or more nucleotide analogs. The sequence may comprise any structure, including but not limited to that of a native crRNA, such as a bulge loop, hairpin, or stem-loop structure. In certain embodiments, the polynucleotide comprising the guide sequence forms a duplex with a second polynucleotide sequence, which may be an RNA or DNA sequence.
In certain embodiments, chemically modified guide RNAs are utilized. Examples of guide RNA chemical modifications include, but are not limited to, incorporation of 2'-O-methyl (M), 2'-O-methyl 3' phosphorothioate (MS), or 2'-O-methyl 3' thioPACE (MSP) at one or more terminal nucleotides. Such chemically modified guide RNAs may exhibit increased stability and increased activity compared to unmodified guide RNAs, although the on-target versus off-target specificity is unpredictable (see Hendel, 2015, Nat. Biotechnol. 33(9): 985-9, doi: 10.1038/nbt.3290, published online 29 June 2015). Chemically modified guide RNAs also include, but are not limited to, RNAs with phosphorothioate linkages and locked nucleic acid (LNA) nucleotides comprising a methylene bridge between the 2' and 4' carbons of the ribose ring.
In some embodiments, the guide sequence is about or greater than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75 or more nucleotides in length. In some embodiments, the guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12 or fewer nucleotides in length. Preferably, the guide sequence is 10 to 30 nucleotides in length. The ability of the guide sequence to direct sequence-specific binding of the CRISPR complex to the target sequence can be assessed by any suitable assay. For example, components of the CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, can be provided to a host cell having the corresponding target sequence, such as by transfection with a vector encoding the components of the CRISPR sequence, followed by assessment of preferential cleavage within the target sequence, such as by a Surveyor assay. Similarly, cleavage of a target RNA can be assessed in vitro by providing the target sequence, components of the CRISPR complex (including the guide sequence to be tested), and a control guide sequence different from the test guide sequence, and comparing the binding or cleavage rate at the target sequence between reactions of the test guide sequence and the control guide sequence. Other assays may exist and will occur to those of skill in the art.
In some embodiments, the modification to the guide is a chemical modification, an insertion, a deletion, or a split. In some embodiments, the chemical modification includes, but is not limited to, the incorporation of 2'-O-methyl (M) analogs, 2'-deoxy analogs, 2-thiouridine analogs, N6-methyladenosine analogs, 2'-fluoro analogs, 2-aminopurine, 5-bromo-uridine, pseudouridine (Ψ), N1-methylpseudouridine (me1Ψ), 5-methoxyuridine (5moU), inosine, 7-methylguanosine, 2'-O-methyl-3'-phosphorothioate (MS), S-constrained ethyl (cEt), phosphorothioate (PS), or 2'-O-methyl-3'-thioPACE (MSP). In some embodiments, the guide comprises one or more phosphorothioate modifications. In certain embodiments, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 25 nucleotides of the guide are chemically modified. In certain embodiments, one or more nucleotides in the seed region are chemically modified. In certain embodiments, one or more nucleotides at the 3' end are chemically modified. In certain embodiments, none of the nucleotides in the 5' handle are chemically modified. In some embodiments, the chemical modification in the seed region is a minor modification, such as the incorporation of a 2'-fluoro analog. In a specific embodiment, one nucleotide of the seed region is replaced with a 2'-fluoro analog. In some embodiments, 5 or 10 nucleotides in the 3' end are chemically modified. Such chemical modifications at the 3' end of the Cpf1 crRNA improve gene cleavage efficiency (see Li et al., Nature Biomedical Engineering, 2017, 1: 0066). In a specific embodiment, 5 nucleotides in the 3' end are replaced with 2'-fluoro analogs. In a specific embodiment, 10 nucleotides in the 3' end are replaced with 2'-fluoro analogs. In a specific embodiment, 5 nucleotides in the 3' end are replaced with 2'-O-methyl (M) analogs.
In some embodiments, the loop of the 5' handle of the guide is modified. In some embodiments, the loop of the 5' handle of the guide is modified to have a deletion, an insertion, a split, or a chemical modification. In certain embodiments, the loop comprises 3, 4, or 5 nucleotides. In certain embodiments, the loop comprises the sequence uuu, uuuuuu, UAUU, or UGUU.
The guide sequence, and thus the nucleic acid-targeting guide RNA, can be selected to target any target nucleic acid sequence. In the context of forming a CRISPR complex, a "target sequence" refers to a sequence to which a guide sequence is designed to have complementarity, wherein hybridization between the target sequence and the guide sequence promotes formation of the CRISPR complex. The target sequence may comprise an RNA polynucleotide. The term "target RNA" refers to an RNA polynucleotide that is or comprises a target sequence. In other words, the target RNA can be an RNA polynucleotide, or a portion of an RNA polynucleotide, to which a portion of the gRNA, i.e., the guide sequence, is designed to have complementarity and toward which the effector function mediated by the complex comprising a CRISPR effector protein and the gRNA is directed. In some embodiments, the target sequence is located in the nucleus or cytoplasm of the cell. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of: messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), microRNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double-stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
In certain embodiments, the spacer of the guide RNA is less than 28 nucleotides in length. In certain embodiments, the spacer of the guide RNA is at least 18 nucleotides and less than 28 nucleotides in length. In certain embodiments, the spacer of the guide RNA is between 19 and 28 nucleotides in length. In certain embodiments, the spacer of the guide RNA is between 19 and 25 nucleotides in length. In certain embodiments, the spacer of the guide RNA is 20 nucleotides in length. In certain embodiments, the spacer of the guide RNA is 23 nucleotides in length. In certain embodiments, the spacer of the guide RNA is 25 nucleotides in length.
In certain embodiments, modulation of cleavage efficiency can be exploited by introducing mismatches, e.g., 1 or more mismatches, such as 1 or 2 mismatches, between the spacer sequence and the target sequence, and by choosing the position of each mismatch along the spacer/target. For example, the more central (i.e., not 3' or 5') a double mismatch is, the more the cleavage efficiency is affected. Thus, by selecting the position of the mismatch along the spacer, the cleavage efficiency can be modulated. As an example, if less than 100% target cleavage is required (e.g., in a population of cells), then 1 or more, preferably 2, mismatches between the spacer and the target sequence may be introduced in the spacer sequence. The more central the mismatch location is along the spacer, the lower the percentage of cleavage.
In certain exemplary embodiments, cleavage efficiency can be exploited to design single guides that can distinguish between two or more targets that differ by a single nucleotide, such as a single nucleotide polymorphism (SNP), variation, or (point) mutation. CRISPR effectors may have reduced sensitivity to SNPs (or other single nucleotide variations) and continue to cleave SNP targets with a certain level of efficiency. Thus, for two targets or a set of targets, the guide RNA can be designed to have a nucleotide sequence complementary to one of the targets, i.e., the on-target SNP. The guide RNA is further designed to have a synthetic mismatch. As used herein, "synthetic mismatch" refers to a non-naturally occurring mismatch introduced upstream or downstream of the naturally occurring SNP, such as up to 5 nucleotides upstream or downstream, e.g., 4, 3, 2, or 1 nucleotide upstream or downstream, preferably up to 3 nucleotides upstream or downstream, more preferably up to 2 nucleotides upstream or downstream, most preferably 1 nucleotide upstream or downstream (i.e., adjacent to the SNP). When the guide RNA hybridizes to the on-target SNP, only a single mismatch, the synthetic mismatch, is formed; the CRISPR effector continues to be activated and a detectable signal is produced. When the guide RNA hybridizes to an off-target SNP, two mismatches are formed, i.e., the mismatch from the SNP and the synthetic mismatch, and no detectable signal is produced. Thus, the systems disclosed herein can be designed to differentiate SNPs within a population. For example, the system can be used to distinguish pathogenic strains that differ by a single SNP or to detect certain disease-specific SNPs, such as, but not limited to, disease-associated SNPs, for example cancer-associated SNPs.
In certain embodiments, the guide RNA is designed such that the SNP is located at position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the SNP is located at position 1, 2, 3, 4, 5, 6, 7, 8, or 9 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the SNP is located at position 2, 3, 4, 5, 6, or 7 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the SNP is located at position 3, 4, 5, or 6 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the SNP is located at position 3 of the spacer sequence (starting from the 5' end).
In certain embodiments, the guide RNA is designed such that the mismatch (e.g., a synthetic mismatch, i.e., a mutation other than a SNP) is located at position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the mismatch is located at position 1, 2, 3, 4, 5, 6, 7, 8, or 9 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the mismatch is located at position 4, 5, 6, or 7 of the spacer sequence (starting from the 5' end). In certain embodiments, the guide RNA is designed such that the mismatch is located at position 5 of the spacer sequence (starting from the 5' end).
In certain embodiments, the guide RNA is designed such that the mismatch is located 2 nucleotides upstream of the SNP (i.e., one intervening nucleotide).
In certain embodiments, the guide RNA is designed such that the mismatch is located 2 nucleotides downstream of the SNP (i.e., one intervening nucleotide).
In certain embodiments, the guide RNA is designed such that the mismatch is located at position 5 (starting from the 5 'end) and the SNP is located at position 3 (starting from the 5' end) of the spacer sequence.
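The SNP-discrimination design described above can be illustrated with a minimal sketch. It assumes a simplified model in which the spacer and both target alleles are written as DNA-letter strings in the same sense, so that a mismatch is simply a position where the letters differ. The function names, the example sequences, and the substitution table are hypothetical and chosen only for illustration; the positions (SNP at 3, synthetic mismatch at 5) follow the embodiment above.

```python
# Hypothetical substitution table: any base different from the original would do.
SUBSTITUTE = {"A": "C", "C": "A", "G": "T", "T": "G"}


def design_spacer(on_target: str, mismatch_pos: int = 5) -> str:
    """Return a spacer matching the on-target allele everywhere except at
    the 1-based position mismatch_pos, where a synthetic mismatch is placed."""
    if not 1 <= mismatch_pos <= len(on_target):
        raise ValueError("mismatch position outside the spacer")
    bases = list(on_target)
    bases[mismatch_pos - 1] = SUBSTITUTE[bases[mismatch_pos - 1]]
    return "".join(bases)


def count_mismatches(spacer: str, target: str) -> int:
    """Count positions at which the spacer and the target differ."""
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have equal length")
    return sum(a != b for a, b in zip(spacer, target))


# Illustrative 20-nt alleles that differ only at position 3 (G vs A).
on_target = "ACGTACGTACGTACGTACGT"
off_target = "ACATACGTACGTACGTACGT"

spacer = design_spacer(on_target, mismatch_pos=5)
on_count = count_mismatches(spacer, on_target)    # synthetic mismatch only
off_count = count_mismatches(spacer, off_target)  # SNP plus synthetic mismatch
print(spacer, on_count, off_count)
```

Under this model the on-target allele carries one mismatch and the off-target allele carries two, which is the difference the text relies on to separate a detectable signal from no signal. Real guide design would also have to account for strand orientation, PAM or PFS requirements, and effector-specific tolerance, none of which are modeled here.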
Embodiments described herein encompass inducing one or more nucleotide modifications in a eukaryotic cell (in vitro, i.e., in an isolated eukaryotic cell) as discussed herein, including delivering a vector as discussed herein to a cell. The one or more mutations can include the introduction, deletion, or substitution of one or more nucleotides at each target sequence of the cell via one or more guide RNAs. Mutations may include the introduction, deletion, or substitution of 1-75 nucleotides at each target sequence of the one or more cells via one or more guide RNAs. Mutations can include the introduction, deletion, or substitution of 1, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of the one or more cells via one or more guide RNAs. Mutations can include the introduction, deletion, or substitution of 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of the one or more cells via one or more guide RNAs. Mutations include the introduction, deletion, or substitution of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of the one or more cells via one or more guide RNAs. Mutations may include the introduction, deletion, or substitution of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of the one or more cells via one or more guide RNAs. Mutations may include the introduction, deletion, or substitution of 40, 45, 50, 75, 100, 200, 300, 400, or 500 nucleotides at each target sequence of the one or more cells via one or more guide RNAs.
Typically, in the case of an endogenous CRISPR system, the formation of a CRISPR complex (comprising a guide sequence that hybridizes to a target sequence and is complexed to one or more Cas proteins) results in cleavage in or near (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence, but may depend on, for example, secondary structure, particularly in the case of an RNA target.
Amplification reagent
In certain exemplary embodiments, the systems disclosed herein may include amplification reagents. Described herein are different components or reagents useful for nucleic acid amplification. For example, amplification reagents as described herein may include buffers, such as Tris buffers. Tris buffer may be used at any concentration suitable for the desired application or use, for example including but not limited to concentrations of 1mM, 2mM, 3mM, 4mM, 5mM, 6mM, 7mM, 8mM, 9mM, 10mM, 11mM, 12mM, 13mM, 14mM, 15mM, 25mM, 50mM, 75mM, 1M and the like. One skilled in the art will be able to determine the appropriate concentration of a buffer (such as Tris) for use in the present invention.
To improve amplification of nucleic acid fragments, salts, such as magnesium chloride (MgCl2), potassium chloride (KCl), or sodium chloride (NaCl), can be included in the amplification reaction (such as PCR). Although the salt concentration will depend on the particular reaction and application, in some embodiments, a nucleic acid fragment of a particular size may produce optimal results at a particular salt concentration. Larger products may require different salt concentrations, usually lower ones, to produce the desired results, while amplification of smaller products may produce better results at higher salt concentrations. One skilled in the art will appreciate that the presence and/or concentration of a salt, and changes in salt concentration, can alter the stringency of a biological or chemical reaction, and thus any salt that provides suitable conditions for the present invention and the reactions described herein can be used.
Other components of a biological or chemical reaction may include cell lysis components to break open or lyse cells for analysis of the substances therein. Cell lysis components may include, but are not limited to, detergents; salts as described above, such as NaCl, KCl, and ammonium sulfate [(NH4)2SO4]; or others. Detergents that may be suitable for the present invention may include Triton X-100, sodium dodecyl sulfate (SDS), CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), ethyltrimethylammonium bromide, and nonylphenoxypolyethoxyethanol (NP-40). The concentration of the detergent may depend on the particular application, and in some cases may be specific to the reaction. As detailed herein, the amplification reaction may include dNTPs and nucleic acid primers used at any concentration.
The amplification reaction may include dNTPs and nucleic acid primers used at any concentration suitable for the present invention, such as, but not limited to, concentrations of 100nM, 150nM, 200nM, 250nM, 300nM, 350nM, 400nM, 450nM, 500nM, 550nM, 600nM, 650nM, 700nM, 750nM, 800nM, 850nM, 900nM, 950nM, 1mM, 2mM, 3mM, 4mM, 5mM, 6mM, 7mM, 8mM, 9mM, 10mM, 20mM, 30mM, 40mM, 50mM, 60mM, 70mM, 80mM, 90mM, 100mM, 150mM, 200mM, 250mM, 300mM, 350mM, 400mM, 450mM, 500mM, and the like. Likewise, polymerases useful according to the present invention can be any specific or general polymerase known in the art and useful in the present invention, including Taq polymerase, Q5 polymerase, and the like.
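As a worked illustration of reaching one of the listed working concentrations, the standard dilution relation C1·V1 = C2·V2 can be sketched as follows. The 10 µM stock concentration and 20 µL reaction volume are assumptions chosen for the example and do not come from the text; only the 500 nM working concentration appears in the list above. The function name is hypothetical.

```python
def stock_volume_ul(stock_nM: float, final_nM: float, reaction_ul: float) -> float:
    """Volume of stock (in microliters) needed so that the reagent is at
    final_nM in a reaction of reaction_ul, using C1 * V1 = C2 * V2."""
    if stock_nM <= 0 or final_nM < 0 or reaction_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_nM > stock_nM:
        raise ValueError("final concentration cannot exceed the stock")
    return final_nM * reaction_ul / stock_nM


# Assumed example: 10 uM (10,000 nM) primer stock, 500 nM final, 20 uL reaction.
primer_ul = stock_volume_ul(stock_nM=10_000, final_nM=500, reaction_ul=20)
print(f"Add {primer_ul:.2f} uL of primer stock")
```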
In some embodiments, amplification reagents as described herein may be suitable for use in hot start amplification. Hot start amplification may be beneficial in some embodiments to reduce or eliminate dimerization of adapter molecules or oligonucleotides, or to otherwise prevent undesirable amplification products or artifacts and obtain optimal amplification of the desired products. Many of the components described herein for use in amplification may also be used in hot start amplification. In some embodiments, reagents or components suitable for hot start amplification may be used in place of one or more of the constituent components, as the case may be. For example, a polymerase or other reagent that exhibits the desired activity at a particular temperature or under other reaction conditions may be used. In some embodiments, reagents designed or optimized for use in hot start amplification may be used, e.g., the polymerase may be activated after transposition or after reaching a particular temperature. Such polymerases may be antibody-based or aptamer-based. Polymerases as described herein are known in the art. Examples of such reagents may include, but are not limited to, hot-start polymerases, hot-start dNTPs, and photocaged dNTPs. Such reagents are known and available in the art. One skilled in the art will be able to determine the optimum temperature for an individual reagent.
Polymerase enzyme
The systems and methods herein utilize a polymerase to amplify a target sequence. Polymerases useful according to the present invention can be any specific or general polymerase known in the art and useful in the present invention, including Taq polymerase, Q5 polymerase, and the like. In embodiments, amplification may be utilized such that DNA fragments are nicked and extended in a cycling reaction that exponentially amplifies the target between the nicking sites. In embodiments, the polymerase may be selected from the following: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, Gst polymerase, Taq polymerase, Klenow fragment of Escherichia coli DNA polymerase I, KlenTaq DNA polymerase, Pol III DNA polymerase, T5 DNA polymerase, and Sequenase DNA polymerase.
Amplification may be isothermal and performed at a selected temperature. In one embodiment, amplification proceeds rapidly at 37 °C. In other embodiments, the temperature for isothermal amplification can be chosen by selecting polymerases that operate at different temperatures (e.g., Bsu, Bst, phi29, Klenow fragment, etc.). Nicking enzyme-based amplification can be performed over a range of temperatures or at a constant temperature. In certain embodiments, nicking enzyme-based amplification can be performed at about 50 °C to 59 °C, about 60 °C to 72 °C, or about 37 °C. The Cas-based nicking and the polymerase extension can be performed at the same temperature or at different temperatures.
Isothermal reaction generally refers to a reaction without significant temperature cycling, with temperature fluctuations of no more than about 1 °C, 2 °C, 3 °C, 4 °C, 5 °C, 6 °C, 7 °C, 8 °C, 9 °C, 10 °C, 11 °C, 12 °C, 13 °C, 14 °C, 15 °C, 17 °C, 18 °C, 19 °C, or 20 °C, or temperature fluctuations of less than about 1 °C, 2 °C, 3 °C, 4 °C, 5 °C, 6 °C, 7 °C, ... In certain embodiments, the isothermal reaction is performed within the operable temperature range of the polymerase.
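The fluctuation criterion above can be expressed as a simple check on a logged temperature trace. This is a minimal sketch that assumes "fluctuation" means the difference between the highest and lowest recorded temperature; the text does not define the measure, so that definition, the function name, and the example readings are all illustrative assumptions.

```python
from typing import Sequence


def is_isothermal(temps_c: Sequence[float], max_fluctuation_c: float = 1.0) -> bool:
    """Return True if the spread (max - min) of the recorded temperatures
    does not exceed the allowed fluctuation, in degrees Celsius."""
    if not temps_c:
        raise ValueError("at least one temperature reading is required")
    return max(temps_c) - min(temps_c) <= max_fluctuation_c


# Assumed readings from a 37 degree incubation versus a thermocycling run.
incubation = [37.0, 37.4, 36.8, 37.1]
thermocycling = [95.0, 55.0, 72.0, 95.0]

incubation_ok = is_isothermal(incubation, max_fluctuation_c=1.0)
cycling_ok = is_isothermal(thermocycling, max_fluctuation_c=20.0)
print(incubation_ok, cycling_ok)
```

The tolerance is left as a parameter because the text lists many alternative thresholds, from about 1 °C up to about 20 °C.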
Primer pair
Primer pairs are utilized in embodiments of the systems and methods provided herein. The primer pair comprises a first primer and a second primer. The first primer comprises a portion complementary to a first location on the target nucleic acid and comprises a portion comprising a binding site for the first guide molecule. The second primer comprises a portion complementary to a second location on the target nucleic acid and comprises a portion comprising a binding site for a second guide molecule.
In one aspect, a primer pair is provided, the primer pair comprising a first primer and a second primer of a reaction mixture, the first primer comprising a portion complementary to a first strand of a target nucleic acid and a portion comprising a binding site for a first guide molecule, and the second primer comprising a portion complementary to a second strand of the target nucleic acid and a portion comprising a binding site for a second guide molecule.
In one aspect, a primer pair is provided, the primer pair comprising a first primer and a second primer of a reaction mixture, the first primer comprising a portion complementary to a first position of a strand of a target nucleic acid and a portion comprising a binding site for a first guide molecule, and the second primer comprising a portion complementary to a second position of the strand of the target nucleic acid and a portion comprising a binding site for a second guide molecule.
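The two-part primer structure described above can be sketched in code. The layout used here, with the guide binding site as a 5' tail followed by the target-complementary portion, is an assumption made for illustration; the text states only that each primer contains both portions, not their order. All sequences and function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_primer(target_region: str, guide_binding_site: str) -> str:
    """Assemble a primer as 5'-[guide binding site][target-complementary portion]-3'.
    target_region is the stretch of the target strand (5'->3') that the
    primer should anneal to, so the annealing portion is its reverse complement."""
    return guide_binding_site.upper() + reverse_complement(target_region)


# Hypothetical sequences for the first and second primers of a pair.
first_primer = build_primer(target_region="AACG", guide_binding_site="GGG")
second_primer = build_primer(target_region="TTAGC", guide_binding_site="CCC")
print(first_primer, second_primer)
```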
In particular embodiments, the amplification reaction mixture may further comprise a primer capable of hybridizing to a target nucleic acid strand. The term "hybridization" refers to the binding of an oligonucleotide primer to a region of a single-stranded nucleic acid template under conditions in which the primer specifically binds to its complementary sequence on only one of the template strands, but not to other regions in the template. The specificity of hybridization may be influenced by the length of the oligonucleotide primer, the temperature at which the hybridization reaction is carried out, the ionic strength, and the pH. The term "primer" refers to a single-stranded nucleic acid capable of binding to a single-stranded region on a target nucleic acid to facilitate polymerase-dependent replication of a target nucleic acid strand. "complementary" nucleic acids or "complements" refer to nucleic acids that are capable of base pairing according to the standard Watson-Crick, Hoogsteen, or reverse Hoogsteen binding complementarity rules.
In certain embodiments, a primer is included in the reaction, which is capable of hybridizing to the extended strand, followed by further polymerase extension of the primer to regenerate two dsDNA fragments: a first dsDNA comprising a first strand CRISPR guide site or both a first strand CRISPR guide site and a second strand CRISPR guide site, and a second dsDNA comprising a second strand CRISPR guide site or both a first strand CRISPR guide site and a second strand CRISPR guide site. These fragments continue to be nicked and extended in a cycling reaction that exponentially amplifies the region of the target between the nicking sites.
The present methods provide advantages over previous nicking-based isothermal amplification techniques that use nicking enzymes with a fixed sequence preference (e.g., the nicking enzyme amplification reaction, or NEAR), which require denaturing the original dsDNA target to allow annealing and extension of primers that add the nicking substrate to the ends of the target. The methods of the invention use CRISPR nickases whose nicking sites can be programmed via guide RNAs, which means that no denaturation step is required, making the entire reaction truly isothermal. The reaction is also simplified: in NEAR, the primers that add the nicking substrate differ from the primers used in the later part of the reaction, so NEAR requires two primer sets (i.e., 4 primers), whereas CRISPR nicking amplification, such as Cpf1 nicking amplification, requires only one primer set (i.e., two primers). This makes CRISPR nicking amplification simpler and easier to handle, without the need for complex instruments to denature the target and then cool to the isothermal temperature, thereby providing a simpler, faster amplification method.
The primer may comprise a promoter sequence. In certain embodiments, the promoter sequence is a sequence that can be used in an optional detection step. In embodiments, the primer comprises the T7 promoter sequence that can be used with the SHERLOCK detection method. One skilled in the art can select other promoter sequences for use with other downstream systems and methods.
The nucleic acid may be subjected to a polymerisation step. If the nucleic acid to be amplified is DNA, a DNA polymerase is selected. When the initial target is RNA, the RNA target can first be copied into a cDNA molecule using reverse transcriptase, and the cDNA then further amplified.
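The RNA-to-cDNA step can be illustrated in silico. The sketch below only computes the sequence of a first-strand cDNA, i.e., the DNA reverse complement of the RNA template; it does not model priming, enzyme choice, or yield. The function name and the example sequence are hypothetical.

```python
_RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') complementary to an
    RNA template written 5'->3'."""
    rna = rna.upper()
    unknown = set(rna) - set(_RNA_TO_CDNA)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # Reverse the template, then complement each base into DNA letters.
    return "".join(_RNA_TO_CDNA[base] for base in reversed(rna))


cdna = first_strand_cdna("AUGC")
print(cdna)
```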
Target nucleic acid
In the context of forming a CRISPR complex, a "target sequence" refers to a sequence to which a guide sequence is designed to have complementarity, wherein hybridization between the target sequence and the guide sequence promotes formation of the CRISPR complex. The target sequence may comprise a DNA or RNA polynucleotide. The term "target DNA or RNA" refers to a DNA or RNA polynucleotide that is or comprises a target sequence. In other words, the target DNA or RNA may be a DNA or RNA polynucleotide, or a portion of a DNA or RNA polynucleotide, to which a portion of the gRNA, i.e., the guide sequence, is designed to have complementarity and toward which the effector function mediated by the complex comprising a CRISPR effector protein and a gRNA is directed. In some embodiments, the target sequence is located in the nucleus or cytoplasm of the cell.
Nickase-based amplification can be used to amplify target nucleic acid sequences of varying lengths. For example, the target nucleic acid sequence may be about 10-20, about 20-30, about 30-40, about 40-50, about 50-100, about 100-200, about 100-1000, about 1000-2000, about 2000-3000, about 3000-4000, or about 4000-5000 nucleotides in length. The target nucleic acid can be DNA, such as genomic DNA, mitochondrial DNA, viral DNA, plasmid DNA, circulating cell-free DNA, environmental DNA, or synthetic double-stranded DNA. The target nucleic acid can be a single-stranded nucleic acid, such as an RNA molecule. Single-stranded nucleic acids can be converted to double-stranded nucleic acids prior to nickase-based amplification. For example, an RNA molecule can be converted to double-stranded DNA by reverse transcription prior to amplification. The single-stranded nucleic acid may be selected from the group consisting of: single-stranded viral DNA, viral RNA, messenger RNA, ribosomal RNA, transfer RNA, microRNA, short interfering RNA, synthetic RNA, long non-coding RNA, microRNA precursors, dsRNA, and synthetic single-stranded DNA.
Sample
As described herein, a sample for use in the present invention can be a biological or environmental sample, such as a food sample (fresh fruit or vegetable, meat), a beverage sample, a paper surface, a fabric surface, a metal surface, a wood surface, a plastic surface, a soil sample, a fresh water sample, a wastewater sample, a saline water sample, a sample exposed to atmospheric air or another gas, or a combination thereof. For example, household/commercial/industrial surfaces made of any material including, but not limited to, metal, wood, plastic, rubber, etc. can be swabbed and tested for contaminants. Soil samples may be tested for the presence of pathogenic bacteria or parasites or other microorganisms for environmental purposes and/or for human, animal, or plant disease testing. Water samples, such as fresh water samples, wastewater samples, or saline water samples, can be evaluated for cleanliness and safety and/or potability to detect the presence of, for example, Cryptosporidium parvum, Giardia lamblia, or other microbial contamination. In other embodiments, the biological sample may be obtained from a source including, but not limited to, a tissue sample, saliva, blood, plasma, serum, stool, urine, sputum, mucus, lymph, synovial fluid, cerebrospinal fluid, ascites fluid, pleural effusion, seroma, pus, or a swab of a skin or mucosal surface. In some embodiments, the environmental or biological sample may be a crude sample and/or the one or more target molecules may not be purified or amplified from the sample prior to application of the method. The identification of microorganisms may be useful and/or desirable for many applications, and thus any type of sample from any source deemed appropriate by one skilled in the art may be used in accordance with the present invention.
In some embodiments, the biological sample may include, but is not limited to, blood, plasma, serum, urine, stool, sputum, mucus, lymph, synovial fluid, bile, ascites, pleural effusion, seroma, saliva, cerebrospinal fluid, aqueous or vitreous fluid, or any bodily secretion, exudate, or fluid obtained from a joint, or a swab of the skin or mucosal surface.
In particular embodiments, the sample may be blood, plasma, or serum obtained from a human patient.
In some embodiments, the sample may be a plant sample. In some embodiments, the sample may be a crude sample. In some embodiments, the sample may be a purified sample.
Detection
The systems described herein may also include a detection system. Nicking enzyme-based amplification can be combined with a variety of detection methods to detect the amplified nucleic acid product. For example, detection systems and methods may include gel electrophoresis, intercalating dye detection, PCR, real-time PCR, fluorescence resonance energy transfer (FRET), mass spectrometry, lateral flow assays, colorimetric assays (HRP, ALP, gold nanoparticle-based assays), and CRISPR-SHERLOCK. The combination of amplification and detection can achieve attomole or femtomole sensitivity. In certain embodiments, DNA detection using the methods or systems of the invention requires transcription of the (amplified) DNA into RNA prior to detection.
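For a sense of scale, the attomole and femtomole amounts mentioned above can be converted to molecule counts with the Avogadro constant. The snippet below is only an illustrative calculation; the function and constant names are hypothetical and are not part of the disclosed systems.

```python
# Illustrative conversion from an amount of substance to a number of molecules.
AVOGADRO = 6.02214076e23  # molecules per mole (exact value in the SI)

ATTOMOLE = 1e-18   # mol
FEMTOMOLE = 1e-15  # mol

def molecules(amount_mol: float) -> float:
    """Return the number of molecules contained in `amount_mol` moles."""
    return amount_mol * AVOGADRO

if __name__ == "__main__":
    print(f"1 amol = {molecules(ATTOMOLE):.3e} molecules")
    print(f"1 fmol = {molecules(FEMTOMOLE):.3e} molecules")
```

One attomole therefore corresponds to roughly 6 x 10^5 target molecules, and one femtomole to roughly 6 x 10^8.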
The detection methods of the present invention may involve various combinations of nucleic acid amplification and detection procedures. The nucleic acid to be detected may be any naturally occurring or synthetic nucleic acid, including but not limited to DNA and RNA, which may be amplified by any suitable method to provide an intermediate product that can be detected. Detection of the intermediate product can be performed by any suitable method, including but not limited to binding and activating a CRISPR protein that produces a detectable signal moiety, either directly or through collateral activity.
In particular embodiments, the amplified nucleic acid can be detected by a CRISPR Cas13-based system. In particular embodiments, the amplified nucleic acid can be detected by a CRISPR Cas12-based system (see Chen et al Science 360: 436-. In particular embodiments, the amplified nucleic acid can be detected by a combination of a CRISPR Cas13-based system and a CRISPR Cas12-based system.
Detection of nucleic acids (including single nucleotide variants), detection based on rRNA sequences, drug resistance screening, monitoring microbial outbreaks, genetic perturbation, and screening of environmental samples can be as described, for example, in [0183] - [0327] of WO/2019/07105 filed 2018, 10 months, 22, which is incorporated herein by reference. Reference is made to WO 2017/219027; WO 2018/107129; US 20180298445; US 2018-0274017; US 2018-0305773; WO 2018/170340; us application 15/922,837 entitled "diagnostic Devices for CRISPR effect System Based Diagnostics" filed 3/15 2018; PCT/US18/50091 entitled "Multi-Effect CRISPR Based Diagnostic Systems (Multi-Effect CRISPR Based Diagnostic Systems)" filed on 7.9.2018; PCT/US18/66940 entitled "CRISPR Effector System Based Multiplex Diagnostics" filed on 12/20/2018; PCT/US18/054472 entitled "CRISPR Effector System Based diagnostics", filed on 4.10.2018; us provisional 62/740,728 entitled "CRISPR effect System Based Diagnostics for hemorrhaging heat Detection" filed on 3.10.2018; us provisional 62/690,278 filed on 26.6.2018 and us provisional 62/767,059 filed on 14.11.2018, entitled "CRISPR Double Nickase Based Amplification, Compositions, Systems and Methods" (CRISPR Double Nickase Based Amplification, Compositions, Systems and Methods) "; us provisional 62/690,160 filed on 26.6.2018 And us provisional 62,767,077 filed on 14.11.2018, entitled "CRISPR/CAS And Transposase Based Amplification Compositions, Systems, And Methods" (CRISPR/CAS And Transposase Based Amplification Compositions, Systems, And Methods) "; us provisional 62/690,257 filed on 26.6.2018 And us provisional 62/767,052 filed on 14.11.2018, entitled "CRISPR effect System Based Amplification Methods, Systems And Diagnostics" for CRISPR effect System Based Amplification Methods, Systems, And d Diagnostics; us provisional title 62/767,076 entitled "Multiplexing Highly evolved Viral Variants With SHERLOCK Multiplexing (multiplexed highlyevling Viral 
Variants With SHERLOCK)" filed on 14/11 in 2018; and 62/767,070 entitled "droplet SHERLLOCK (Droplet SHERLLOCK)", filed on 11/14/2018. Reference is also made to WO 2017/127807; WO 2017/184786; WO 2017/184768; WO 2017/189308; WO 2018/035388; WO 2018/170333; WO 2018/191388; WO 2018/213708; WO 2019/005866; PCT/US18/67328 entitled "Novel CRISPR Enzymes and Systems" (Novel CRISPR Enzymes and Systems), filed 2018, 12, month 21; PCT/US18/67225 entitled "Novel CRISPR Enzymes and Systems" (Novel CRISPR Enzymes and Systems) filed on 21.12.2018 and PCT/US18/67307 entitled "Novel CRISPR Enzymes and Systems" (Novel CRISPR Enzymes and Systems) filed on 21.12.2018; US 62/712,809 entitled "Novel CRISPR Enzymes and Systems" (Novel CRISPR Enzymes and Systems) filed 2018, 7, 31/month; US62/744,080 entitled "Novel Cas12b Enzymes and Systems" (Novel Cas12b Enzymes and Systems) filed on 10/10.2018 and u.s.62/751,196 entitled "Novel Cas12b Enzymes and Systems" (Novel Cas12b Enzymes and Systems) filed on 26.10.2018; U.S. Pat. No. 
715,640 entitled "Novel CRISPR Enzymes and Systems" (Novel CRISPR Enzymes and Systems) filed 2-18, 8/7/8; WO 2016/205711; U.S.9,790,490; WO 2016/205749; WO 2016/205764; WO 2017/070605; WO 2017/106657 and WO 2016/149661; WO 2018/035387; WO 2018/194963; cox DBT et al, RNA editing with CRISPR-Cas13, science.2017, 11 months 24 days; 358(6366) 1019-1027; gootenberg JS et al, Multiplexed and portable nucleic acid detection platform with Cas13, Cas12a, and Csm6, science.2018, month 4 and day 27; 360(6387) 439-444; gootenberg JS et al, Nucleic acid detection with CRISPR-Cas13a/C2c2, science.2017, 4 months and 28 days; 356(6336), 438 and 442; abudayyeh OO et al, RNA targeting with CRISPR-Cas13, Nature.2017, 10 months and 12 days; 550(7675) 280-284; smargon AA et al, Cas13B Is a Type VI-B CRISPR-Associated RNA-Guided RNase differential Regulated by access Proteins Csx27 and Csx28.mol cell.2017, 2 months 16 days; 65(4) 618-630.e 7; abudayyeh OO et al, C2C2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector, science.2016, 8/5; 353(6299) aaf 5573; yang L et al, Engineering and optimizing amino fusions for genome editing. nat Commun.2016, 11 months and 2 days; 13330, Myrvhold et al, Field-featured visual diagnostics using CRISPR-Cas13, Science 2018360, 444-; shmakov et al, "Diversity and evolution of class 2CRISPR-Cas systems," Nat Rev Microbiol.201715(3):169-182, each of which is incorporated herein by reference in its entirety.
In some specific embodiments, RNA-targeting effectors may be utilized to provide robust CRISPR-based detection. Embodiments disclosed herein can detect both DNA and RNA with comparable levels of sensitivity and can be used in conjunction with the disclosed HDA methods and systems. For ease of reference, the detection embodiments disclosed herein may also be referred to as SHERLOCK (Specific High-sensitivity Enzymatic Reporter unLOCKing); in some embodiments they are performed after the HDA methods disclosed herein, including under mesophilic and thermophilic isothermal conditions.
In some embodiments, one or more elements of the nucleic acid-targeting detection system are derived from a particular organism comprising an endogenous CRISPR RNA-targeting system. In certain exemplary embodiments, the effector protein of the CRISPR RNA-targeting detection system comprises at least one HEPN domain, including but not limited to the HEPN domains described herein, HEPN domains known in the art, and domains recognized as HEPN domains by comparison to consensus sequence motifs. Several such domains are provided herein. In one non-limiting example, the consensus sequence can be derived from the sequences of the C2C2 or Cas13b orthologs provided herein. In certain exemplary embodiments, the effector protein comprises a single HEPN domain. In certain other exemplary embodiments, the effector protein comprises two HEPN domains.
In an exemplary embodiment, the effector protein comprises one or more HEPN domains comprising an RxxxxH motif sequence. The RxxxxH motif sequence can be, but is not limited to, one from a HEPN domain described herein or known in the art. RxxxxH motif sequences also include motif sequences created by combining portions of two or more HEPN domains. As noted, the consensus sequence may be derived from the sequences of the orthologs disclosed in the following documents: PCT/US2017/038154 entitled "Novel Type VI CRISPR Orthologs and Systems", for example at pages 256 and 285 and 336; U.S. provisional patent application 62/432,240 entitled "Novel CRISPR Enzymes and Systems"; U.S. provisional patent application 62/471,710 entitled "Novel Type VI CRISPR Orthologs and Systems", filed March 15, 2017; and U.S. provisional patent application 62/484,786 entitled "Novel Type VI CRISPR Orthologs and Systems", filed April 12, 2017.
In an embodiment of the invention, the HEPN domain comprises at least one RxxxxH motif comprising the sequence R{N/H/K}X1X2X3H (SEQ ID NO: 15). In an embodiment of the invention, the HEPN domain comprises an RxxxxH motif comprising the sequence R{N/H}X1X2X3H (SEQ ID NO: 16). In an embodiment of the invention, the HEPN domain comprises the sequence R{N/K}X1X2X3H (SEQ ID NO: 17). In certain embodiments, X1 is R, S, D, E, Q, N, G, Y or H. In certain embodiments, X2 is I, S, T, V or L. In certain embodiments, X3 is L, F, N, Y, V, I, S, D, E or A.
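The motif definitions above map naturally onto regular expressions over one-letter amino acid codes. The following is a minimal sketch, assuming the narrower residue sets given for X1, X2 and X3 are applied by default; the function names, the `restrict` flag and the example peptides are hypothetical and serve only as illustration.

```python
import re

# Residue sets copied from the paragraph above (one-letter amino acid codes).
X1 = "RSDEQNGYH"
X2 = "ISTVL"
X3 = "LFNYVISDEA"

def motif_regex(second_position, restrict=True):
    """Compile R{second_position}X1X2X3H as a regular expression.

    With restrict=False, X1 to X3 match any residue (assumption: the residue
    sets are optional, since the text introduces them "in certain embodiments").
    """
    if restrict:
        middle = f"[{X1}][{X2}][{X3}]"
    else:
        middle = "..."
    return re.compile(f"R[{second_position}]{middle}H")

def find_motifs(protein, restrict=True):
    """Return 0-based start positions of each motif variant in a protein sequence."""
    variants = {
        "SEQ ID NO: 15": "NHK",
        "SEQ ID NO: 16": "NH",
        "SEQ ID NO: 17": "NK",
    }
    seq = protein.upper()
    return {
        name: [m.start() for m in motif_regex(second, restrict).finditer(seq)]
        for name, second in variants.items()
    }

if __name__ == "__main__":
    # Made-up peptides, not sequences from the disclosure.
    print(find_motifs("MKRNRILHAA"))
    print(find_motifs("RKSTFH"))
```

A hit for SEQ ID NO: 15 that is absent for SEQ ID NO: 16 indicates a lysine at the second position, and so on for the other variants.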
Additional effectors for use according to the present invention may be identified by their proximity to the cas1 gene, for example, but not limited to, within the region extending 20 kb from the start of the cas1 gene and 20 kb from the end of the cas1 gene. In certain embodiments, the effector protein comprises at least one HEPN domain and at least 500 amino acids, and the C2C2 effector protein is naturally present in a prokaryotic genome within 20 kb upstream or downstream of a Cas gene or a CRISPR array. Non-limiting examples of Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas9 (also known as Csn1 and Csx12), Cas10, and members of the Csy, Cse, Csc, Csa, Csn, Csm, Cmr, Csb, Csx, CsaX, and Csf families, or modified versions thereof. In certain exemplary embodiments, the C2C2 effector protein is naturally present in a prokaryotic genome within 20 kb upstream or downstream of a Cas1 gene. The terms "orthologue" (also referred to herein as "ortholog") and "homologue" (also referred to herein as "homolog") are well known in the art. By way of further guidance, a "homolog" of a protein as used herein is a protein of the same species that performs the same or a similar function as the protein of which it is a homolog. Homologous proteins may, but need not, be structurally related, or may be only partially structurally related. An "ortholog" of a protein as used herein is a protein of a different species that performs the same or a similar function as the protein of which it is an ortholog. Orthologous proteins may, but need not, be structurally related, or may be only partially structurally related.
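The proximity criterion described above can be expressed as a simple coordinate test. The sketch below assumes that all coordinates lie on the same contig, are given as inclusive start and end positions with start <= end, and that any overlap with the 20 kb flanking window counts as "within" it; these conventions and all function names are assumptions made for illustration only.

```python
WINDOW_BP = 20_000  # "within 20 kb", as stated in the text
MIN_LENGTH_AA = 500  # "at least 500 amino acids", as stated in the text

def within_cas1_window(gene_start, gene_end, cas1_start, cas1_end, window=WINDOW_BP):
    """True if a gene overlaps the region spanning `window` bp on either side of cas1."""
    region_start = cas1_start - window
    region_end = cas1_end + window
    return gene_start <= region_end and gene_end >= region_start

def is_candidate_effector(gene_start, gene_end, length_aa, has_hepn_domain,
                          cas1_start, cas1_end):
    """Combine the three criteria: a HEPN domain, at least 500 aa, and proximity to cas1."""
    return (
        has_hepn_domain
        and length_aa >= MIN_LENGTH_AA
        and within_cas1_window(gene_start, gene_end, cas1_start, cas1_end)
    )

if __name__ == "__main__":
    # Made-up coordinates for illustration.
    print(is_candidate_effector(1_000, 4_600, 1_200, True, 10_000, 11_000))
```

In practice the HEPN flag would come from a separate domain search; here it is simply passed in as a boolean.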
In particular embodiments, the RNA-targeting type VI Cas enzyme is C2C2. In other exemplary embodiments, the RNA-targeting type VI Cas enzyme is Cas13 b. In particular embodiments, a type VI protein as referred to herein, such as a homolog or ortholog of C2C2, has at least one of C2C2 (e.g., a wild-type sequence based on any of cilium saxifrage C2C2, lachnospiraceae MA 2020C 2C2, lachnospiraceae NK4a 179C 2C2, clostridium ammoniaphilum (DSM 10710) C2C2, gallibacterium (DSM 4847) C2C2, manobacterium propionicum (WB4) C2C2, Listeria westersii (FSL R9-0317) C2C2, listeriaceae bacterium (FSL M6-0635) C2C2, Listeria newyoensis (Listeria newwenshuensis) (FSL 6-0635) C2C2, vibrio westercoriella C (F9) C2C 3642, capsular strain (C) C4630, C2%, rhodobacter caldarius sp 2C 4630, or rhodobacter caldarieri (FSL) C4635), rhodobacter caldarierii (FSL) C24, C2), or at least 60%, or at least 70%, or at least 80%, more preferably at least 85%, even more preferably at least 90%, such as at least 95% sequence homology or identity. In other embodiments, a type VI protein as referred to herein, such as a homolog or ortholog of C2C2, has at least one of wild-type C2C2 (e.g., wild-type sequences based on any of cilium sartorius C2C2, lachnospiraceae MA 2020C 2C2, lachnospiraceae NK4a 179C 2C2, clostridium ammoniaphilum (DSM 10710) C2C2, gallibacterium gallinarum (DSM 4847) C2, manobacterium propionicum (WB4) C2C2, listeria wegener (FSL R9-0317) C2C2, listeriaceae (FSL M6-0635) C2C2, listeria newyork (FSL M6-0635) C2C 84, listeria wegener (F0279) C2C2, listeria rhynchophylla (SB 1003) C2C2, rhodobacter capsulatus (R) C462C 2C 5830%, or rhodobacter iwoffii (DE 4630), at least 24%, or at least 24% C5830%, at least 20%, or at least one of rhodobacter iwoffii (FSL 20), or at least 80%, more preferably at least 85%, even more preferably at least 90%, such as at least 95% sequence identity.
In certain other exemplary embodiments, the CRISPR system effector protein is a C2C2 nuclease. The activity of C2C2 may depend on the presence of two HEPN domains. These have been shown to be RNase domains, i.e., nucleases (in particular endonucleases) that cleave RNA. C2C2 HEPN may also target DNA, or potentially DNA and/or RNA. Because the HEPN domains of C2C2 are at least capable of binding to and, in their wild-type form, cleaving RNA, it is preferred that the C2C2 effector protein has RNase function. Regarding C2C2 CRISPR systems, reference is made to U.S. provisional 62/351,662 filed June 17, 2016, and U.S. provisional 62/376,377 filed August 17, 2016. Reference is also made to U.S. provisional 62/351,803 filed June 17, 2016. Reference is also made to the U.S. provisional entitled "Novel Crispr Enzymes and Systems" filed December 8, 2016, bearing Broad Institute No. 10035.PA4 and attorney docket number 47627.03.2133. Further reference is made to East-Seletsky et al., "Two distinct RNase activities of CRISPR-C2c2 enable guide-RNA processing and RNA detection," Nature doi:10.1038/nature19802, and Abudayyeh et al., "C2c2 is a single-component programmable RNA-guided RNA targeting CRISPR effector," bioRxiv doi:10.1101/054742.
RNAse function in CRISPR systems is known, for example, mRNA targeting has been reported for certain type III CRISPR-Cas systems (Hale et al 2014, Genes Dev, Vol.28, 2432-. In the Staphylococcus epidermidis type III-A system, transcription across the target cleaves target DNA and its transcripts, which is mediated by an independent active site within the Cas10-Csm ribonucleoprotein effector complex (see Samai et al, 2015, Cell, Vol 151, 1164-1174). Thereby providing CRISPR-Cas systems, compositions, or methods of targeting RNA via the effector proteins of the invention.
In one embodiment, the Cas protein may be a C2C2 ortholog of an organism of the genus: including but not limited to, cilia, listeria, corynebacterium, sauteria, legionella, treponema, Proteus, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Vibrio, Flavobacterium, Spirochacterium, Azospirillum, gluconacetobacter, Neisseria, Rochelia, Microclavus, Staphylococcus, nitrate lyase, Mycoplasma, Campylobacter, and Muspirillum. The species of organisms of this genus can be as discussed elsewhere herein.
In certain exemplary embodiments, the C2C2 effector proteins of the invention include, but are not limited to, the following 21 ortholog species (including multiple CRISPR loci): ciliate sarmentosum; velveteenia virginica (Lw 2); listeria monocytogenes; lachnospiraceae MA 2020; a bacterium of the family lachnospiraceae NK4a 179; clostridium ammoniaphilum DSM 10710; carnis gallus Domesticus DSM 4847; gallibacterium gallisepticum DSM 4847 (second CRISPR locus); producing the methane propionic acid bacillus WB 4; listeria wegener FSL R9-0317; listeria family bacteria FSL M6-0635; ciliate wedder F0279; rhodobacter capsulatus SB 1003; rhodobacter capsulatus R121; rhodobacter capsulatus DE 442; ciliate stomatitis bacterium C-1013-b; decomposing the hemicelluloses of the Hericium; rectum [ eubacterium ]; eubacteriaceae CHKCI 004; blautia species mosaic-P2398; and cilium oral taxon 879 strain F0557. Another twelve (12) non-limiting examples are: a bacterium of the family lachnospiraceae NK4a 144; collecting green flexor bacteria; norquinone bacterium aurantiacus; sea spira species TSL 5-1; pseudobutyric acid vibrio species OR 37; vibrio butyricum species YAB 3001; blautia species mosaic-P2398; cilium species mosaic-P3007; bacteroides albopictus; a bacterium belonging to the family of monosporaceae, KH3CP3 RA; listeria fringensis; and strange non-adapted spirochete bacteria.
Some methods of identifying orthologs of CRISPR-Cas system enzymes may involve identifying tracr sequences in genomes of interest. Identification of tracr sequences may involve the following steps: searching a database for direct repeats or tracr mate sequences to identify a CRISPR region comprising a CRISPR enzyme; searching the CRISPR region flanking the CRISPR enzyme, in both the sense and antisense directions, for homologous sequences; and looking for transcriptional terminators and secondary structures. Any sequence that is not a direct repeat or a tracr mate sequence, but has more than 50% identity to the direct repeat or tracr mate sequence, is identified as a potential tracr sequence. The potential tracr sequence is then analyzed for associated transcriptional terminator sequences.
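The identity criterion in the steps above can be illustrated with a deliberately simplified scan. The sketch below slides a window of the repeat's length across a flanking region on both strands and reports windows that share more than 50% identity with the repeat without being identical to it. It uses ungapped, position-by-position identity rather than a real alignment and omits the terminator and secondary-structure searches; all names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def candidate_tracr_windows(flank, repeat, min_identity=0.5):
    """List (strand, start, identity) for windows resembling, but not equal to, the repeat.

    Start positions on the "-" strand are relative to the reverse complement of `flank`.
    """
    n = len(repeat)
    hits = []
    for strand, seq in (("+", flank), ("-", reverse_complement(flank))):
        for i in range(len(seq) - n + 1):
            ident = identity(seq[i:i + n], repeat)
            if min_identity < ident < 1.0:
                hits.append((strand, i, ident))
    return hits

if __name__ == "__main__":
    # Made-up sequences for illustration.
    repeat = "ACGTACGTAC"
    flank = "TTTT" + "ACGTACGTAA" + "GGGG"
    for hit in candidate_tracr_windows(flank, repeat):
        print(hit)
```

Candidates found this way would still need the downstream terminator analysis described in the text before being treated as potential tracr sequences.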
It is to be understood that any of the functionalities described herein can be engineered into CRISPR enzymes from other orthologs, including chimeric enzymes comprising fragments from multiple orthologs. Examples of such orthologs are described elsewhere herein. Thus, a chimeric enzyme may comprise fragments of CRISPR enzyme orthologs of the following organisms: including but not limited to, cilia, listeria, corynebacterium, sauteria, legionella, treponema, Proteus, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Vibrio, Flavobacterium, Spirochaete, Azospirillum, gluconacetobacter, Neisseria, Rochelia, Microclavulirus, Staphylococcus, nitrate lyase, Mycoplasma, and Campylobacter. The chimeric enzyme may comprise a first fragment and a second fragment, and the fragments may be fragments of CRISPR enzyme orthologs of organisms of the genus or species mentioned herein; advantageously, the fragments are from different species of CRISPR enzyme orthologs.
In embodiments, the C2C2 protein as referred to herein also encompasses functional variants of C2C2 or a homolog or ortholog thereof. As used herein, a "functional variant" of a protein refers to a variant of such a protein that at least partially retains the activity of the protein. Functional variants may include mutants (which may be insertion, deletion or substitution mutants), including polymorphs and the like. Functional variants also include fusion products of such a protein with another, usually unrelated, nucleic acid, protein, polypeptide or peptide. Functional variants may be naturally occurring or may be artificial. Advantageous embodiments may relate to engineered or non-naturally occurring RNA targeting type VI effector proteins.
In one embodiment, one or more nucleic acid molecules encoding C2C2 or an ortholog or homolog thereof may be codon optimized for expression in a eukaryotic cell. Eukaryotes can be as discussed herein. One or more nucleic acid molecules may be engineered or non-naturally occurring.
In one embodiment, C2C2 or an ortholog or homolog thereof may comprise one or more mutations, and thus one or more nucleic acid molecules encoding the same may have one or more mutations. The mutation may be an artificially introduced mutation and may include, but is not limited to, one or more mutations in the catalytic domain. Examples of catalytic domains for Cas9 enzymes may include, but are not limited to, RuvC I, RuvC II, RuvC III, and HNH domains.
In embodiments, C2C2 or an orthologue or homolog thereof may comprise one or more mutations. The mutation may be an artificially introduced mutation and may include, but is not limited to, one or more mutations in the catalytic domain. Examples of catalytic domains for Cas enzymes may include, but are not limited to, HEPN domains.
In one embodiment, C2C2 or an ortholog or homolog thereof can be used as a universal nucleic acid binding protein fused to or operably linked to a functional domain. Exemplary functional domains may include, but are not limited to, translation initiators, translation activators, translation repressors, nucleases (particularly ribonucleases), spliceosomes, beads, light inducible/controllable domains or chemically inducible/controllable domains.
In certain exemplary embodiments, the C2C2 effector protein may be from an organism selected from the group consisting of: cilium, listeria, corynebacterium, sauter, legionella, treponema, Proteus, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Vibrio, Flavobacterium, Spirochaeta, Azospirillum, gluconacetobacter, Neisseria, Rochelia, Microclavus, Staphylococcus, nitrate lyase, Mycoplasma and Campylobacter.
In certain embodiments, the effector protein may be Listeria species C2p, preferably Listeria monocytogenes C2p, more preferably Listeria monocytogenes serovar 1/2b strain SLCC 3954 C2p, and the crRNA sequence may be 44 to 47 nucleotides in length, with a 5' 29-nt direct repeat (DR) and a 15-nt to 18-nt spacer.
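The stated crRNA length follows directly from the architecture: a 29-nt DR plus a 15-nt to 18-nt spacer gives 44 to 47 nt. The following minimal sketch checks that arithmetic and splits a crRNA into its two parts; the function name and the placeholder sequence are hypothetical.

```python
DR_LENGTH = 29                 # 5' DR length in nt, from the text
SPACER_LENGTHS = range(15, 19) # 15 to 18 nt inclusive, from the text

# Total crRNA lengths implied by the architecture: 44, 45, 46, 47.
ALLOWED_LENGTHS = [DR_LENGTH + s for s in SPACER_LENGTHS]

def split_crrna(crrna):
    """Split a crRNA into (dr, spacer); raise ValueError if the length does not fit."""
    spacer_len = len(crrna) - DR_LENGTH
    if spacer_len not in SPACER_LENGTHS:
        raise ValueError(
            f"crRNA of {len(crrna)} nt does not fit a {DR_LENGTH}-nt DR "
            f"plus a 15-18 nt spacer"
        )
    return crrna[:DR_LENGTH], crrna[DR_LENGTH:]

if __name__ == "__main__":
    print(ALLOWED_LENGTHS)
    # Placeholder sequence: 29 x 'A' for the DR, 16 x 'C' for the spacer.
    dr, spacer = split_crrna("A" * 29 + "C" * 16)
    print(len(dr), len(spacer))
```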
In certain embodiments, the effector protein may be cilium species C2p, preferably cilium saxatilis C2p, more preferably cilium saxatilis DSM 19757 C2p, and the crRNA sequence may be 42 to 58 nucleotides in length, with a 5' direct repeat (DR) of at least 24 nt, such as a 5' 24-28 nt DR, and a spacer of at least 14 nt, such as 14 nt to 28 nt, or at least 18 nt, such as 19, 20, 21, 22 or more nt, such as 18-28, 19-28, 20-28, 21-28, or 22-28 nt.
In certain exemplary embodiments, the effector protein may be a cilium species, widescreenia F0279; or a species of Listeria, preferably Listeria newyork FSL M6-0635.
In certain embodiments, the C2C2 protein according to the invention is or is derived from one of the orthologs, or is a chimeric protein of two or more of the orthologs as described herein, or is a mutant or variant (or chimeric mutant or variant) of one of the orthologs, including dead C2C2, split C2C2, destabilized C2C2, etc., as defined elsewhere herein, with or without fusion to heterologous/functional domains.
In certain exemplary embodiments, the RNA-targeting effector protein is a Type VI-B effector protein, such as Cas13b and Group 29 or Group 30 proteins. In certain exemplary embodiments, the RNA-targeting effector protein comprises one or more HEPN domains. In certain exemplary embodiments, the RNA-targeting effector protein comprises a C-terminal HEPN domain, an N-terminal HEPN domain, or both. With respect to exemplary Type VI-B effector proteins that may be used in the context of the present invention, reference is made to U.S. application No. 15/331,792 entitled "Novel CRISPR Enzymes and Systems" and filed October 21, 2016; international patent application No. PCT/US2016/058302 entitled "Novel CRISPR Enzymes and Systems" and filed October 21, 2016; Smargon et al., "Cas13b is a Type VI-B CRISPR-associated RNA-Guided RNase differentially regulated by accessory proteins Csx27 and Csx28," Molecular Cell, 65, 1-13 (2017); dx.doi.org/10.1016/j.molcel.2016.12.023; and the U.S. provisional application, number to be assigned, entitled "Novel Cas13b Orthologues CRISPR Enzymes and Systems" filed March 15, 2017. In a preferred embodiment, the Cas13 protein is LwaCas13.
Masking constructs
As used herein, a "masking construct" refers to a molecule that can be cleaved or otherwise inactivated by an activated CRISPR system effector protein described herein. The term "masking construct" may alternatively also be referred to as a "detection construct". In certain exemplary embodiments, the masking construct is an RNA-based masking construct. The RNA-based masking construct comprises an RNA element that is cleavable by a CRISPR effector protein. Cleavage of the RNA element releases the agent or produces a conformational change that allows the generation of a detectable signal. Exemplary constructs demonstrating how to use RNA elements to prevent or mask the generation of detectable signals are described below, and embodiments of the invention include variants thereof. Prior to cleavage, or when the masking construct is in an "active" state, the masking construct blocks the generation or detection of a positive detectable signal. It will be appreciated that in certain exemplary embodiments, minimal background signal may be generated in the presence of an active RNA-masking construct. The positively detectable signal can be any signal that can be detected using optical, fluorescent, chemiluminescent, electrochemical, or other detection methods known in the art. The term "positive detectable signal" is used to distinguish it from other detectable signals detectable in the presence of the masking construct. For example, in certain embodiments, a first signal (i.e., a negative detectable signal) can be detected when a masking agent is present, which is then converted to a second signal (e.g., a positive detectable signal) when the target molecule is detected and the masking agent is cleaved or inactivated by the activated CRISPR effector protein.
In certain exemplary embodiments, the masking construct may repress the production of a gene product. The gene product may be encoded by a reporter construct added to the sample. The masking construct may be an interfering RNA involved in an RNA interference pathway, such as a short hairpin RNA (shRNA) or a small interfering RNA (siRNA). The masking construct may also comprise a microRNA (miRNA). When present, the masking construct represses expression of the gene product. The gene product may be a fluorescent protein or other RNA transcript or protein that could be detected by a labeled probe, aptamer, or antibody in the absence of the masking construct. Upon activation of the effector protein, the masking construct is cleaved or otherwise silenced, allowing the gene product to be expressed and detected as the positive detectable signal.
In certain exemplary embodiments, the masking construct may sequester one or more reagents required to generate a detectable positive signal, such that release of the one or more reagents from the masking construct results in the generation of the detectable positive signal. The one or more reagents may combine to produce a colorimetric signal, a chemiluminescent signal, a fluorescent signal, or any other detectable signal, and may include any reagent known to be suitable for such a purpose. In certain exemplary embodiments, the one or more reagents are sequestered by RNA aptamers that bind the one or more reagents. The one or more reagents are released when the target molecule is detected, the effector protein is activated, and the RNA aptamers are degraded.
In other embodiments of the invention, the RNA-based masking construct suppresses the generation of a detectable positive signal, or the RNA-based masking construct suppresses the generation of a detectable positive signal by masking the detectable positive signal or alternatively generating a detectable negative signal, or the RNA-based masking construct comprises a silencing RNA that suppresses the generation of a gene product encoded by the reporter construct, wherein the gene product, when expressed, generates the detectable positive signal.
In other embodiments, the RNA-based masking construct is a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is inactivated, or the ribozyme converts a substrate to a first color, and wherein the substrate is converted to a second color when the ribozyme is inactivated.
In other embodiments, the RNA-based masking agent is an RNA aptamer, or the aptamer sequesters an enzyme, wherein the enzyme generates a detectable signal by acting on a substrate upon release from the aptamer, or the aptamer sequesters a pair of agents that combine to generate a detectable signal upon release from the aptamer.
In another embodiment, the RNA-based masking construct comprises an RNA oligonucleotide to which a detectable ligand and a masking component are attached. In another embodiment, the detectable ligand is a fluorophore and the masking component is a quenching molecule, or an agent used to amplify a target RNA molecule, such as
In certain exemplary embodiments, the masking construct may be immobilized on a solid substrate in an individual discrete volume (further defined below) and sequester a single reagent. For example, the reagent may be a bead comprising a dye. When sequestered by the immobilized reagent, the individual beads are too diffuse to generate a detectable signal, but upon release from the masking construct they are able to generate a detectable signal, for example by aggregation or by a simple increase in solution concentration. In certain exemplary embodiments, the immobilized masking agent is an RNA-based aptamer that can be cleaved by the activated effector protein upon detection of the target molecule.
In certain other exemplary embodiments, the masking construct binds to an immobilized reagent in solution, thereby blocking the ability of the reagent to bind to a free, individually labeled binding partner in solution. Thus, after applying a washing step to the sample, the labeled binding partner may be washed out of the sample in the absence of the target molecule. However, if the effector protein is activated, the masking construct is cleaved to a degree sufficient to interfere with the ability of the masking construct to bind to the agent, thereby allowing the labeled binding partner to bind to the immobilized agent. Thus, the labeled binding partner remains after the washing step, indicating the presence of the target molecule in the sample. In certain aspects, the masking construct that binds the immobilized agent is an RNA aptamer. The immobilized reagent may be a protein and the labeled binding partner may be a labeled antibody. Alternatively, the immobilized reagent may be streptavidin and the labeled binding partner may be labeled biotin. The label on the binding partner used in the above embodiments may be any detectable label known in the art. In addition, other known binding partners may be used according to the general design described herein.
In certain exemplary embodiments, the masking construct may comprise a ribozyme. Ribozymes are RNA molecules with catalytic properties. Both natural and engineered ribozymes comprise or consist of RNA that can be targeted by the effector proteins disclosed herein. Ribozymes may be selected or engineered to catalyze a reaction that either generates a negative detectable signal or prevents the generation of a positive control signal. Upon inactivation of the ribozyme by the activated effector protein, the reaction that generates a negative control signal or prevents the generation of a positive detectable signal is removed, thereby allowing the generation of a positive detectable signal. In an exemplary embodiment, the ribozyme may catalyze a colorimetric reaction that results in a solution that exhibits a first color. When the ribozyme is inactivated, the solution then changes to a second color, which is the detectable positive signal. Zhao et al., "Signal amplification of glucosamine-6-phosphate based on ribozyme glmS," Biosens Bioelectron. 2014; 16:337-42, describes an example of how ribozymes can be used to catalyze colorimetric reactions and provides an example of how such a system can be modified to work in the context of the embodiments disclosed herein. Alternatively, ribozymes, when present, can produce cleavage products of, e.g., RNA transcripts. Thus, detection of a positive detectable signal can include detection of an uncleaved RNA transcript that is only produced in the absence of the ribozyme.
In certain exemplary embodiments, the one or more reagents are proteins, such as enzymes, that are capable of promoting the generation of a detectable signal, such as a colorimetric, chemiluminescent, or fluorescent signal, that are inhibited or sequestered such that the protein is unable to generate a detectable signal due to the binding of the one or more RNA aptamers to the protein. Upon activation of the effector proteins disclosed herein, the RNA aptamers are cleaved or degraded to the extent that they no longer inhibit the ability of the proteins to produce a detectable signal. In certain exemplary embodiments, the aptamer is a thrombin inhibitor aptamer. In certain exemplary embodiments, the thrombin inhibitor aptamer has the sequence of GGGAACAAAGCUGAAGUACUUACCC (SEQ ID NO: 18). When the aptamer is cleaved, thrombin will become active and will cleave the peptide colorimetric or fluorescent substrate. In certain exemplary embodiments, the colorimetric substrate is p-nitroaniline (pNA) covalently linked to a peptide substrate of thrombin. Upon cleavage by thrombin, pNA is released and becomes yellow and readily visible to the eye. In certain exemplary embodiments, the fluorogenic substrate is a blue fluorophore of 7-amino-4-methylcoumarin that can be detected using a fluorescence detector. Inhibitory aptamers can also be used with horseradish peroxidase (HRP), beta-galactosidase, or Calf Alkaline Phosphatase (CAP), and are within the general principles described above.
In certain embodiments, RNase activity is detected colorimetrically via cleavage of enzyme-inhibiting aptamers. One potential mode of converting RNase activity into a colorimetric signal is to couple the cleavage of an RNA aptamer with the reactivation of an enzyme capable of producing a colorimetric output. In the absence of RNA cleavage, the intact aptamer will bind to the enzyme target and inhibit its activity. The advantage of this readout system is that the enzyme provides an additional amplification step: once released from the aptamer via collateral activity (e.g., Cas13a collateral activity), the colorimetric enzyme will continue to produce a colorimetric product, resulting in signal amplification.
In certain embodiments, existing aptamers that inhibit enzymes with colorimetric readouts are used. There are several aptamer/enzyme pairs with colorimetric readouts, such as thrombin, protein C, neutrophil elastase, and subtilisin. These proteases have pNA-based colorimetric substrates and are commercially available. In certain embodiments, novel aptamers that target a common colorimetric enzyme are used. Common and robust enzymes, such as β-galactosidase, horseradish peroxidase, or calf intestinal alkaline phosphatase, can be targeted by engineered aptamers designed by selection strategies (such as SELEX). Such strategies allow for the rapid selection of aptamers with nanomolar binding efficiencies and can be used to develop additional enzyme/aptamer pairs for colorimetric readout.
In certain embodiments, RNase activity is detected colorimetrically via cleavage of RNA-tethered inhibitors. Many common colorimetric enzymes have competitive, reversible inhibitors: for example, β-galactosidase can be inhibited by galactose. Many of these inhibitors are weak, but their effect can be increased by raising their local concentration. Colorimetric enzyme and inhibitor pairs can be engineered into RNase sensors by linking the local concentration of the inhibitor to RNase activity. A small-molecule-inhibitor-based colorimetric RNase sensor involves three components: a colorimetric enzyme, an inhibitor, and a bridging RNA that is covalently linked to both the inhibitor and the enzyme, tethering the inhibitor to the enzyme. In the uncleaved configuration, the enzyme is inhibited by the increased local concentration of the small molecule; when the RNA is cleaved (e.g., by Cas13a collateral cleavage), the inhibitor will be released and the colorimetric enzyme will be activated.
In certain embodiments, RNase activity is detected colorimetrically via the formation and/or activation of G-quadruplexes. G-quadruplexes in DNA can complex with heme (iron(III)-protoporphyrin IX) to form a DNAzyme with peroxidase activity. When supplied with a peroxidase substrate (e.g., ABTS (2,2'-azinobis[3-ethylbenzothiazoline-6-sulfonic acid]-diammonium salt)), the G-quadruplex-heme complex oxidizes the substrate in the presence of hydrogen peroxide, which then forms a green color in solution. An exemplary G-quadruplex-forming DNA sequence is: GGGTAGGGCGGGTTGGGA (SEQ ID NO: 19). By hybridizing an RNA sequence to this DNA aptamer, the formation of the G-quadruplex structure will be limited. Upon collateral activation of the RNase (e.g., of the C2c2 complex), the RNA staple will be cleaved, allowing the G-quadruplex to form and bind heme. This strategy is particularly attractive because color formation is enzymatic, which means that there is additional amplification beyond RNase activation.
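As a rough illustration of the sequence pattern involved, the following Python sketch checks whether a DNA sequence contains four runs of at least three guanines separated by short loops, a commonly used heuristic for G-quadruplex-forming potential. The regular expression, the loop-length bounds (1 to 7 nucleotides), and the function name are assumptions introduced here for illustration only; they are not part of the disclosure and do not predict folding or heme binding.

```python
import re

# Heuristic motif: four G-runs (>= 3 G each) separated by loops of 1-7 nt.
# The loop bounds are an illustrative assumption, not taken from the disclosure.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def has_g4_motif(dna: str) -> bool:
    """Return True if the sequence contains a putative G-quadruplex motif."""
    return G4_PATTERN.search(dna.upper()) is not None


# Exemplary G-quadruplex-forming sequence given in the text (SEQ ID NO: 19).
SEQ_ID_19 = "GGGTAGGGCGGGTTGGGA"

if __name__ == "__main__":
    print(has_g4_motif(SEQ_ID_19))
```

A sequence-level check of this kind could only serve as a first filter; whether a hybridized RNA staple actually prevents folding would have to be established experimentally.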
In an exemplary embodiment, the masking construct comprises a detection agent that changes color upon aggregation or dispersion of the detection agent in solution. For example, certain nanoparticles, such as colloidal gold, undergo a visible violet to red color shift as they move from aggregates to dispersed particles. Thus, in certain exemplary embodiments, such detection agents may be held in aggregate by one or more bridge molecules. At least a portion of the bridge molecule comprises RNA. Upon activation of the effector proteins disclosed herein, the RNA portion of the bridge molecule is cleaved, allowing the detection agent to disperse and cause a corresponding color change. In certain exemplary embodiments, the bridge molecule is an RNA molecule. In certain exemplary embodiments, the detection agent is a colloidal metal. The colloidal metal material may comprise water-insoluble metal particles or metal compounds dispersed in a liquid, hydrosol, or metal sol. The colloidal metal may be selected from the metals of groups IA, IB, IIB, and IIIB of the periodic table, as well as transition metals, especially those of group VIII. Preferred metals include gold, silver, aluminum, ruthenium, zinc, iron, nickel, and calcium. Other suitable metals also include the various oxidation states of the following metals: lithium, sodium, magnesium, potassium, scandium, titanium, vanadium, chromium, manganese, cobalt, copper, gallium, strontium, niobium, molybdenum, palladium, indium, tin, tungsten, rhenium, platinum, and gadolinium. The metal is preferably provided in ionic form and is derived from a suitable metal compound, for example Al3+, Ru3+, Zn2+, Fe3+, Ni2+, and Ca2+ ions.
The aforementioned color shift is observed when the RNA bridge is cleaved by the activated CRISPR effector. In certain exemplary embodiments, the particles are colloidal metals. In certain other exemplary embodiments, the colloidal metal is colloidal gold. In certain exemplary embodiments, the colloidal nanoparticle is a 15 nm gold nanoparticle (AuNP). Due to the unique surface characteristics of colloidal gold nanoparticles, a maximum absorbance is observed at 520 nm when they are fully dispersed in solution, and they appear red to the naked eye. Upon aggregation, AuNPs exhibit a red shift in maximum absorbance and appear darker in color, eventually precipitating out of solution as dark purple aggregates. In certain exemplary embodiments, the nanoparticle is modified to include a DNA linker extending from the surface of the nanoparticle. The individual particles are linked together by single-stranded RNA (ssRNA) bridges that hybridize to at least a portion of the DNA linkers at each end of the RNA. Thus, the nanoparticles will form a network of connected particles and aggregate, appearing as a dark precipitate. Upon activation of the CRISPR effectors disclosed herein, the ssRNA bridges will be cleaved, releasing the AuNPs from the linked network and producing a visible red color. Exemplary DNA linker and RNA bridge sequences are listed below. Thiol linkers at the end of the DNA linker can be used for conjugation to the surface of the AuNP. Other forms of conjugation may be used. In certain exemplary embodiments, two AuNP populations may be generated, one for each DNA linker. This will help to promote binding of the ssRNA bridges in the correct orientation. In certain exemplary embodiments, the first DNA linker is conjugated through the 3' end and the second DNA linker is conjugated through the 5' end.
Table 1. Exemplary DNA linker and RNA bridge sequences (provided as an image in the original publication).
In certain other exemplary embodiments, the masking construct may comprise an RNA oligonucleotide to which a detectable label is attached and a masking agent for the detectable label. Examples of such detectable label/masking agent pairs are fluorophores and quenchers of fluorophores. Quenching of a fluorophore may occur due to the formation of a non-fluorescent complex between the fluorophore and another fluorophore or a non-fluorescent molecule. This mechanism is called ground state complex formation, static quenching or contact quenching. Thus, the RNA oligonucleotide can be designed such that the fluorophore and quencher are sufficiently close for contact quenching to occur. Fluorophores and their associated quenchers are known in the art and can be selected for this purpose by one of ordinary skill in the art. The particular fluorophore/quencher is not critical in the context of the present invention, so long as the fluorophore/quencher pair is selected to ensure masking of the fluorophore. Upon activation of the effector proteins disclosed herein, the RNA oligonucleotide is cleaved, thereby severing the proximity between the fluorophore and quencher needed to maintain the contact quenching effect. Thus, detection of a fluorophore can be used to determine the presence of the target molecule in a sample.
In certain other exemplary embodiments, the masking construct may comprise one or more RNA oligonucleotides to which one or more metal nanoparticles, such as gold particles, are attached. In some embodiments, the masking construct comprises a plurality of metal nanoparticles crosslinked by a plurality of RNA oligonucleotides that form closed loops. In one embodiment, the masking construct comprises three gold nanoparticles crosslinked by three RNA oligonucleotides forming a closed loop. In some embodiments, the cleavage of the RNA oligonucleotide by the CRISPR effector protein results in the production of a detectable signal by the metal nanoparticle.
In certain other exemplary embodiments, the masking construct may comprise one or more RNA oligonucleotides to which one or more quantum dots are attached. In some embodiments, the cleavage of the RNA oligonucleotide by the CRISPR effector protein results in a detectable signal produced by the quantum dot.
In one exemplary embodiment, the masking construct may comprise quantum dots. The quantum dots can have a plurality of linker molecules attached to the surface. At least a portion of the linker molecule comprises RNA. The linker molecule is attached to the quantum dot at one end and to one or more quenchers along the length of the linker or at the end of the linker, such that the quenchers remain close enough for quenching of the quantum dot to occur. The linker may be branched. As mentioned above, the quantum dot/quencher pair is not critical, so long as the quantum dot/quencher pair is selected to ensure masking of the fluorophore. Quantum dots and their associated quenchers are known in the art and can be selected for this purpose by one of ordinary skill in the art. Upon activation of the effector proteins disclosed herein, the RNA portion of the linker molecule is cleaved, thereby eliminating the proximity between the quantum dot and the quencher or quenchers required to maintain the quenching effect. In certain exemplary embodiments, the quantum dots are streptavidin-conjugated. The RNA is attached via a biotin linker and recruits the quencher molecule, with the sequence /5Biosg/UCUCGUACGUUC/3IAbRQSP/ (SEQ ID NO: 23) or /5Biosg/UCUCGUACGUUCUCUCGUACGUUC/3IAbRQSP/ (SEQ ID NO: 24), wherein /5Biosg/ is a biotin tag and /3IAbRQSP/ is an Iowa Black quencher. Upon cleavage by the activated effectors disclosed herein, the quantum dots will visibly fluoresce.
In a similar manner, fluorescence resonance energy transfer (FRET) may be used to generate a detectable positive signal. FRET is a non-radiative process by which a photon from an energetically excited fluorophore (i.e., a "donor fluorophore") raises the energy state of an electron in another molecule (i.e., an "acceptor") to a higher vibrational level of the excited singlet state. The donor fluorophore returns to the ground state without emitting the fluorescence characteristic of that fluorophore. The acceptor may be another fluorophore or a non-fluorescent molecule. If the acceptor is a fluorophore, the transferred energy is emitted as the fluorescence characteristic of that fluorophore. If the acceptor is a non-fluorescent molecule, the absorbed energy is lost as heat. Thus, in the context of the embodiments disclosed herein, the fluorophore/quencher pair is replaced by a donor fluorophore/acceptor pair attached to the oligonucleotide molecule. When intact, the masking construct generates a first signal (a negative detectable signal), as detected by the fluorescence or heat emitted from the acceptor. Upon activation of the effector proteins disclosed herein, the RNA oligonucleotide is cleaved and FRET is disrupted, such that fluorescence of the donor fluorophore (the positive detectable signal) is now detected.
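The distance dependence that makes cleavage of the oligonucleotide detectable can be illustrated with the standard Förster relationship, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The sketch below is illustrative only: the function name and the example R0 of 5 nm are assumptions and are not values taken from the disclosure.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Forster transfer efficiency E = 1 / (1 + (r / R0)**6).

    r_nm  -- donor-acceptor distance in nanometers
    r0_nm -- Forster radius in nanometers (pair-specific; assumed here)
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


if __name__ == "__main__":
    R0 = 5.0  # hypothetical Forster radius, for illustration only
    for r in (2.5, 5.0, 10.0):
        print(f"r = {r} nm -> E = {fret_efficiency(r, R0):.3f}")
```

Because the efficiency falls with the sixth power of distance, doubling the separation from R0 reduces transfer from 50% to under 2%, which is why separating donor and acceptor by cleaving the linker restores donor fluorescence.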
In certain exemplary embodiments, the masking construct comprises the use of intercalating dyes that change their absorbance in response to cleavage of long RNAs into short nucleotides. There are several such dyes. For example, pyronin-Y will complex with RNA and form a complex with absorbance at 572 nm. Cleavage of RNA results in loss of absorbance and color change. Methylene blue can be used in a similar manner, with the absorbance change at 688nm of methylene blue after RNA cleavage. Thus, in certain exemplary embodiments, the masking construct comprises an RNA and an intercalating dye complex that changes absorbance upon cleavage of the RNA by the effector proteins disclosed herein.
In certain exemplary embodiments, the masking construct may comprise an initiator for a hybridization chain reaction (HCR). See, e.g., Dirks and Pierce, PNAS 101, 15275-. HCR exploits the potential energy stored in two hairpin species. When a single-stranded initiator having a portion complementary to a corresponding region on one of the hairpins is released into the previously stable mixture, it opens a hairpin of one species. This process in turn exposes a single-stranded region that opens a hairpin of the other species, which in turn exposes a single-stranded region identical to the original initiator. The resulting chain reaction can lead to the formation of a nicked double helix that grows until the hairpin supply is exhausted. Detection of the resulting products can be carried out on a gel or by colorimetric methods. Exemplary colorimetric detection methods include, for example, those described in "Ultra-sensitive colorimetric assay system based on the hybridization reaction-triggered enzyme assay ACS application interface, 2017,9(1): 167-; Wang et al., "An enzyme-free colorimetric estimation hybridization reaction and split aptamers" analysis 2015,150, 7657-7662; and Song et al., "Non-covalent fluorescent labeling of hairpin DNA coupled with hybridization reaction for sensitive DNA detection," Applied Spectroscopy, 70(4): 686-694 (2016).
In certain exemplary embodiments, the masking construct may comprise an HCR initiator sequence and a cleavable structural element, such as a loop or hairpin, that prevents the initiator from initiating the HCR reaction. Upon cleavage of the cleavable structural element by the activated CRISPR effector protein, the initiator is released to trigger the HCR reaction, detection of which indicates the presence of the one or more targets in the sample. In certain exemplary embodiments, the masking construct comprises a hairpin with an RNA loop. When an activated CRISPR effector protein cleaves the RNA loop, the initiator can be released to trigger the HCR reaction.
In particular embodiments, target nucleic acids can be detected with attomolar sensitivity. In particular embodiments, target nucleic acids can be detected with femtomolar sensitivity. In some embodiments, the method is performed in less than about 2 hours, less than about 90 minutes, less than about 60 minutes, less than about 30 minutes, or less than about 15 minutes. In some preferred embodiments, amplification and detection can be performed in a one-pot method, with detection of 2 fM in less than about 2 hours.
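To put these concentrations in terms of molecule counts, a molar concentration can be converted to copies per microliter using Avogadro's number. The short Python sketch below performs that conversion for the 2 fM figure mentioned above and for 1 aM; the function name is an illustrative assumption, and the calculation says nothing about the reaction volume actually used.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_per_microliter(molar_concentration: float) -> float:
    """Convert a concentration in mol/L to molecules per microliter."""
    return molar_concentration * AVOGADRO * 1e-6  # 1 uL = 1e-6 L


if __name__ == "__main__":
    print(f"2 fM = {copies_per_microliter(2e-15):.0f} copies per microliter")
    print(f"1 aM = {copies_per_microliter(1e-18):.2f} copies per microliter")
```

On this arithmetic, 2 fM corresponds to roughly 1,200 target molecules per microliter, whereas 1 aM corresponds to less than one molecule per microliter on average.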
Kit for amplification and detection
Also provided herein are kits for amplifying and/or detecting a target double-stranded nucleic acid in a sample. Such kits may include, but are not necessarily limited to, the amplification CRISPR system as described herein.
In some embodiments, the kit may include reagents for purifying double stranded nucleic acids in a sample.
In some embodiments, the kit may be a kit for amplifying and/or detecting a target single-stranded nucleic acid in a sample, and may include reagents for purifying single-stranded nucleic acid in a sample. The kit may also include a set of instructions for use.
The kit may also comprise a detection system, which in a preferred embodiment is a CRISPR detection system. The detection system may be as described, for example, in U.S. application 62/432,553, filed December 9, 2016; 62/456,645, filed February 8, 2017; 62/471,930, filed March 15, 2017; 62/484,869, filed April 12, 2017; and 62/568,268, filed October 4, 2017, all of which are incorporated by reference in their entirety; and as also described in PCT/US2017/065477, entitled "diagnosis based on CRISPR effector system", filed December 8, 2017, which is incorporated herein by reference, and in which the components of the CRISPR system for detection are described in particular at [0142]-[0289].
Method
Amplification and/or detection methods are provided and may be used with systems as disclosed herein.
In one embodiment of the invention, nicking enzyme based amplification may be included. The nickase enzyme can be a CRISPR protein. Thus, the introduction of nicks into dsDNA can be programmable and sequence specific. Figure 1 depicts an embodiment of the invention that begins with two guides designed to target opposite strands of a dsDNA target. According to the present invention, the nickase may be Cpf1, C2C1, Cas9 or any ortholog or CRISPR protein that cleaves or is engineered to cleave a single strand of a DNA duplex. The nicked strand can then be extended by a polymerase. In one embodiment, the position of the nick is selected such that the polymerase extends the strand towards the central portion of the target duplex DNA between the nick sites. In certain embodiments, a primer is included in the reaction, which is capable of hybridizing to the extended strand, followed by further polymerase extension of the primer to regenerate two dsDNA fragments: a first dsDNA comprising a first strand CRISPR guide site or both a first strand CRISPR guide site and a second strand CRISPR guide site, and a second dsDNA comprising a second strand CRISPR guide site or both a first strand CRISPR guide site and a second strand CRISPR guide site. These fragments continue to be nicked and extended in a cycling reaction that exponentially amplifies the region of the target between the nicking sites.
In certain embodiments, the amplification is based on CRISPR-nickase amplification, i.e., programmable CRISPR nicking amplification. The amplification may comprise: (a) combining a sample comprising a target double-stranded nucleic acid with an amplification reaction mixture comprising: (i) an amplifying CRISPR system comprising a first CRISPR/Cas complex and a second CRISPR/Cas complex, the first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first strand of a target nucleic acid, and the second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second strand of the target nucleic acid; and (ii) a polymerase; (b) amplifying a target nucleic acid by: nicking a first strand and a second strand of a target nucleic acid using a first CRISPR/Cas complex and a second CRISPR/Cas complex and displacing and extending the nicked strands using a polymerase, thereby generating a duplex comprising the target nucleic acid sequence between the first nick site and the second nick site; (c) adding to the reaction mixture a primer pair comprising a first primer and a second primer, the first primer comprising a portion complementary to a first strand of the target nucleic acid and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to a second strand of the target nucleic acid and a portion comprising a binding site for the second guide molecule; and (d) further amplifying the target nucleic acid by repeating the extension and nicking under isothermal conditions. The first Cas-based nickase and the second Cas-based nickase may be the same or different.
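As one way to picture the selection of a first and a second guide site on opposite strands, the following Python sketch scans both strands of a double-stranded DNA target for candidate protospacers. It assumes, purely for illustration, an SpCas9-type nickase with a 20-nucleotide protospacer followed by an NGG PAM; other nickases named herein (e.g., Cpf1 or C2c1) have different PAM requirements. The function names are hypothetical, and the sketch does not model nick position, primer design, or polymerase extension.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def find_protospacers(strand: str, spacer_len: int = 20):
    """List (start, protospacer, pam) for each site on a 5'->3' strand
    where a protospacer is immediately followed by an NGG PAM
    (assumed SpCas9-type requirement)."""
    strand = strand.upper()
    hits = []
    for i in range(len(strand) - spacer_len - 2):
        pam = strand[i + spacer_len : i + spacer_len + 3]
        if pam[1:] == "GG":
            hits.append((i, strand[i : i + spacer_len], pam))
    return hits


def candidate_guide_sites(top_strand: str, spacer_len: int = 20):
    """Return candidate sites on the first (top) strand and on the
    second (bottom) strand, each reported in its own 5'->3' coordinates."""
    first = find_protospacers(top_strand, spacer_len)
    second = find_protospacers(reverse_complement(top_strand), spacer_len)
    return first, second
```

For example, `candidate_guide_sites(target)` returns two lists, from which one site per strand could be chosen so that the region to be amplified lies between the two nick sites.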
Nucleic acid amplification can be performed using a particular thermal cycling machine or apparatus, and can be performed in a single reaction or in batches, so that any desired number of reactions can be performed simultaneously. In some embodiments, amplification can be performed using a microfluidic or robotic device, or can be performed using manual changes in temperature to achieve the desired amplification. In some embodiments, optimization may be performed to obtain optimal reaction conditions for a particular application or material. One skilled in the art will know and be able to optimize the reaction conditions to obtain sufficient amplification.
In some embodiments, amplification of the target nucleic acid is performed at about 37 ℃ to 65 ℃. In some embodiments, amplification of the target nucleic acid is performed at about 50 ℃ to 59 ℃. In some embodiments, amplification of the target nucleic acid is performed at about 60 ℃ to 72 ℃. In some embodiments, amplification of the target nucleic acid is performed at about 37 ℃. In some embodiments, amplification of the target nucleic acid is performed at room temperature.
Additional embodiments are disclosed in the following numbered paragraphs.
1. A method of amplifying and/or detecting a target double-stranded nucleic acid, the method comprising:
a. combining a sample comprising the target double-stranded nucleic acid with an amplification reaction mixture comprising:
i. an amplifying CRISPR system comprising a first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first target nucleic acid location and a second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second target nucleic acid location; and
ii. a polymerase;
b. amplifying the target nucleic acid;
c. adding to the reaction mixture a primer pair comprising a first primer and a second primer, the first primer comprising a portion complementary to the first position and the second primer comprising a portion complementary to the second position and a portion comprising a binding site for the second guide molecule; and
d. further amplifying the target nucleic acid by repeating the extension and nicking under isothermal conditions.
2. The method of paragraph 1, wherein the first guide molecule directs the first CRISPR/Cas complex to a first strand of the target nucleic acid and the second guide molecule directs the second CRISPR/Cas complex to a second strand of the target nucleic acid.
3. The method of paragraph 1, wherein the first and second target nucleic acid positions are on the first strand of the target nucleic acid, thereby generating ssDNA comprising the sequence of the first strand of the target nucleic acid between the first and second target nucleic acid positions.
4. The method of paragraph 2, the method comprising amplifying the target nucleic acid by: nicking the first and second strands of the target nucleic acid using the first and second CRISPR/Cas complexes and replacing and extending the nicked strands using the polymerase, thereby generating a duplex comprising the target nucleic acid sequence between the first and second nick sites.
5. The method of paragraph 1, wherein the Cas-based nickase is selected from the group consisting of: Cas9 nickase, Cpf1 nickase, and C2c1 nickase.
6. The method of paragraph 2, wherein the Cas-based nickase is a Cas9 nickase protein, the Cas9 nickase protein comprising a mutation in an HNH domain.
7. The method of paragraph 2, wherein the Cas-based nickase is a Cas9 nickase protein, the Cas9 nickase protein comprising a mutation corresponding to N863A in SpCas9 or N580A in SaCas9.
8. The method of paragraphs 3 or 4, wherein the Cas-based nickase is a Cas9 protein derived from a bacterial species selected from the group consisting of: streptococcus pyogenes, Staphylococcus aureus, Streptococcus thermophilus, Streptococcus mutans, Streptococcus agalactiae, Streptococcus equisimilis, Streptococcus sanguis, and Streptococcus pneumoniae; campylobacter jejuni, campylobacter coli; salsuginis, n tergarcus; staphylococcus aureus, staphylococcus carnosus; neisseria meningitidis, neisseria gonorrhoeae; listeria monocytogenes, listeria monocytogenes; clostridium botulinum, clostridium difficile, clostridium tetani, clostridium sojae, francisella tularensis 1, prevotella easily, lachnospiraceae MC 20171, vibrio proteolyticus, isocratic bacteria GW2011_ GWA2_33_10, centipede bacteria GW2011_ GWC2_44_17, smith bacteria SCADC, aminoacetococcus BV3L6, lachnospiraceae MA2020, candidate termite methanogen, shigella, moraxella bovis 237, paddy field leptospira, lachnospiraceae bacteria ND2006, porphyromonas canicola 3, prevotella saccharolytica, and porphyromonas macaque.
9. The method of paragraph 2, wherein the Cas-based nickase is a Cpf1 nickase protein comprising a mutation in the Nuc domain.
10. The method of paragraph 6, wherein the Cas-based nickase is a Cpf1 nickase protein, said Cpf1 nickase protein comprising a mutation corresponding to R1226A in AsCpf1.
11. The method of paragraph 6 or 7, wherein the Cas-based nickase is a Cpf1 protein derived from a bacterial species selected from the group consisting of: lenflansii, Prevotella facilis, Despirochaetaceae, Vibrio proteolyticus, Isodomycota, thrifton, Smith, Aminococcus, Demospiromyces, Methanobacterium candidate for Termite, Shigella, Moraxella bovis, Leptospira padina, Porphyromonas canicola, Prevotella saccharolytica, Porphyromonas kiwii, Vibrio amylovorans, Prevotella saccharolytica, Flavobacterium gilophilum, Statezomucor, Pseudomomyces poricus, Eubacterium species, GenBank (Loermania), Flavobacterium species, Prevotella brevis, Moraxella capriae, Bacteroides stomatitis, Porphyromonas canicola, Intertrophomonas Otopteria, Prevotella, Anaerobacter, Vibrio fibrinolytica, Methylophilus methanophilus candidate, Vibrio butyricum, Vibrio species, Oral spore-free anaerobe species, rumen vibrio pseudobutyrate and butyric acid producing bacteria.
12. The method of paragraph 2, wherein the Cas-based nickase is a C2C1 nickase protein comprising a mutation in the Nuc domain.
13. The method of paragraph 9, wherein the Cas-based nickase is a C2c1 nickase protein, the C2c1 nickase protein comprising a mutation corresponding to D570A, E848A, or D977A in AacC2c1.
14. The method of paragraphs 9 or 10, wherein the Cas-based nickase is a C2C1 protein derived from a bacterial species selected from the group consisting of: acid-fast alicyclic acid bacillus, contaminated alicyclic acid bacillus, alicyclobacillus macrocephalosporans, cottoniella, cottonwood bacillus, candidate forest tree bacteria, very desulfovibrio, sulfur dismutase desulfonium saline alkali bacillus, Zygomycota bacteria RIFOXYA12, omnivora WOR _2 bacteria RIFCSPHIGHO2, Blastomycetaceae bacteria TAV5, Furomycetes bacteria ST-NAGAB-D1, Fomitomycota bacteria RBG _13_46_10, spirochete bacteria GWB1_27_13, Microbacteraceae bacteria UBA2429, Thermobacter thermoblock, Bacillus amyloliquefaciens CF112, Bacillus NSP2.1, butyrate-reducing sulfate bacillus, alicyclobacillus, Citrobacter freundii, Brevibacillus agri (e.g. BAB-2500) and Methylobacterium nodosum.
15. The method of any of the preceding paragraphs, wherein the first Cas-based nickase and the second Cas-based nickase are the same.
16. The method of any of paragraphs 1-11, wherein the first Cas-based nickase and the second Cas-based nickase are different.
17. The method of any one of the preceding paragraphs, wherein the polymerase is selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, Gst polymerase, Taq polymerase, Klenow fragment of Escherichia coli DNA polymerase I, KlenTaq, Pol III DNA polymerase, T5 DNA polymerase, and Sequenase DNA polymerase.
18. The method of any one of the preceding paragraphs, wherein amplification of the target nucleic acid is performed at about 50-59 ℃.
19. The method of any of paragraphs 1-14, wherein amplification of the target nucleic acid is performed at about 60-72 ℃.
20. The method of any of paragraphs 1-14, wherein amplification of the target nucleic acid is performed at about 37 ℃ or about 65 ℃.
21. The method of any of paragraphs 1-14, wherein amplification of the target nucleic acid is performed at a constant temperature.
22. The method of any of the preceding paragraphs, wherein the target nucleic acid sequence is about 20-30, about 30-40, about 40-50, or about 50-100 nucleotides in length.
23. The method of any one of paragraphs 1-18, wherein the length of the target nucleic acid sequence is about 100-200, about 100-500 or about 100-1000 nucleotides.
24. The method of any of paragraphs 1-18, wherein the length of the target nucleic acid sequence is about 1000-2000 nucleotides, about 2000-3000 nucleotides, about 3000-4000 nucleotides or about 4000-5000 nucleotides.
25. The method of any one of the preceding paragraphs, wherein the first primer or the second primer comprises an RNA polymerase promoter.
26. The method of any one of the preceding paragraphs, further comprising detecting the amplified nucleic acids by a method selected from the group consisting of: gel electrophoresis, intercalating dye detection, PCR, real-time PCR, Fluorescence Resonance Energy Transfer (FRET), mass spectrometry, and CRISPR-SHERLOCK.
27. The method of paragraph 23, wherein the amplified nucleic acid is detected by a Cas13-based CRISPR-SHERLOCK method.
28. The method of any one of the preceding paragraphs, wherein the target nucleic acid is detected with attomolar sensitivity.
29. The method of any one of paragraphs 1-24, wherein the target nucleic acid is detected with femtomolar sensitivity.
30. The method of any one of the preceding paragraphs, wherein the target nucleic acid is selected from the group consisting of: genomic DNA, mitochondrial DNA, viral DNA, plasmid DNA, and synthetic double-stranded DNA.
31. The method of any of the preceding paragraphs, wherein the sample is a biological sample or an environmental sample.
32. The method of paragraph 28, wherein the biological sample is blood, plasma, serum, urine, stool, sputum, mucus, lymph, synovial fluid, bile, ascites, pleural effusion, seroma, saliva, cerebrospinal fluid, aqueous or vitreous humor, or any bodily secretion, exudate, or fluid obtained from a joint, or a swab of a skin or mucosal surface.
33. The method of paragraph 29 wherein said sample is blood, plasma or serum obtained from a human patient.
34. The method of paragraph 28 wherein said sample is a plant sample.
35. The method of any of the preceding paragraphs, wherein the sample is a crude sample.
36. The method of any of paragraphs 1-31, wherein the sample is a purified sample.
37. A method for amplifying and/or detecting a target single-stranded nucleic acid, the method comprising:
(a) converting the single-stranded nucleic acid in the sample into a target double-stranded nucleic acid; and
(b) performing the steps as described in paragraph 1.
38. The method of paragraph 34, wherein the target single stranded nucleic acid is an RNA molecule.
39. The method of paragraph 35, wherein said RNA molecule is converted to said double stranded nucleic acid by reverse transcription and amplification steps.
40. The method of paragraph 34, wherein said target single stranded nucleic acid is selected from the group consisting of: single-stranded viral DNA, viral RNA, messenger RNA, ribosomal RNA, transfer RNA, microRNA, short interfering RNA, microRNA, synthetic RNA, and synthetic single-stranded DNA.
41. A system for amplifying and/or detecting a target double-stranded nucleic acid in a sample, the system comprising:
a) an amplifying CRISPR system comprising a first CRISPR/Cas complex and a second CRISPR/Cas complex, the first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first strand of the target nucleic acid, and the second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second strand of the target nucleic acid;
b) a polymerase;
c) a primer pair comprising a first primer and a second primer of the reaction mixture, the first primer comprising a portion complementary to the first strand of the target nucleic acid and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to the second strand of the target nucleic acid and a portion comprising a binding site for the second guide molecule; and optionally
d) a detection system for detecting amplification of the target nucleic acid.
42. The system of paragraph 38, wherein the Cas-based nickase is selected from the group consisting of: Cas9 nickase, Cpf1 nickase, and C2C1 nickase.
43. The system of paragraph 38 or 39, wherein the polymerase is selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, and Sequenase DNA polymerase.
44. The system of any of paragraphs 38-40, wherein the Cas-based nickase and the polymerase function at the same temperature.
45. A system for amplifying and/or detecting a target single-stranded nucleic acid in a sample, the system comprising:
a) an agent for converting the target single-stranded nucleic acid into a double-stranded nucleic acid;
b) a component as described in paragraph 38.
46. A kit for amplifying and/or detecting a target double-stranded nucleic acid in a sample, the kit comprising the components of paragraph 38 and a set of instructions for use.
47. The kit of paragraph 43, further comprising reagents for purifying the double stranded nucleic acid in the sample.
48. A kit for amplifying and/or detecting a target single-stranded nucleic acid in a sample, the kit comprising the components of paragraph 43 and a set of instructions for use.
49. The kit of paragraph 4, further comprising reagents for purifying the single stranded nucleic acid in the sample.
The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
Examples
Working examples
Example 1 CRISPR-nickase-based amplification (CRISPR-NEAR) and NEAR SHERLOCK assays
In this example, nickase-based amplification using a CRISPR-Cas enzyme, termed CRISPR-NEAR, was tested in combination with the CRISPR SHERLOCK detection method. Figure 1 shows a schematic of nickase-based amplification using CRISPR-Cas enzymes.
CRISPR-NEAR can be performed using DNA or RNA input. By incorporating the T7 promoter sequence in the amplification primers, CRISPR-NEAR is also compatible with downstream SHERLOCK detection methods. Figure 9 shows a schematic of CRISPR-NEAR in conjunction with SHERLOCK detection. One of the main advantages of CRISPR-NEAR is that it amplifies much faster than RPA. The method uses a very simple buffer, which makes it easy to combine all steps of the SHERLOCK assay into one reaction. RPA amplification, by contrast, uses a very viscous buffer that is difficult to combine with other reagents.
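The primer design idea described above can be sketched as follows. This is a minimal illustration only: the promoter string is the commonly cited minimal T7 promoter, and the target-binding sequence and the function name add_t7_promoter are hypothetical placeholders, not sequences or tools from this disclosure.

```python
# Commonly cited minimal T7 RNA polymerase promoter (5'->3').
# Assumption: verify the promoter variant against your transcription kit.
T7_PROMOTER = "TAATACGACTCACTATAG"


def add_t7_promoter(target_binding_seq: str) -> str:
    """Return a primer with the T7 promoter placed 5' of the target-binding part."""
    seq = target_binding_seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty DNA sequence (A, C, G, T)")
    return T7_PROMOTER + seq


# Hypothetical 20-nt target-binding region, for illustration only.
example_primer = add_t7_promoter("gctagctaggatccgattac")
print(example_primer)
```

Placing the promoter at the 5' end leaves the 3' end of the primer free to anneal and be extended, so the promoter is carried into the amplified product and is available for T7 transcription ahead of SHERLOCK detection.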
FIG. 2 is a gel electrophoresis image showing an optimized nicking enzyme amplification reaction. The results show that NEAR amplification depends on both the nicking enzyme and the polymerase. In the absence of primers, only linear amplification occurs. Primers and other PCR additives (such as gp32 SSB or trehalose) may increase amplification and modulate non-specific product formation.
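The difference between primer-free (linear) and primer-driven amplification can be illustrated with a toy model. The sketch below is an idealization and not a kinetic model from this disclosure: it assumes one displaced strand per template per round in the linear case and perfect doubling per round when primers let products act as new templates.

```python
def linear_copies(rounds: int, templates: int = 1) -> int:
    """Primer-free case: each nick-and-extend round displaces one strand per template."""
    return templates * rounds


def exponential_copies(rounds: int, templates: int = 1) -> int:
    """Idealized primer-driven case: every product becomes a template, doubling per round."""
    return templates * 2 ** rounds


# Compare the two regimes over a few round counts.
for n in (1, 5, 10, 20):
    print(n, linear_copies(n), exponential_copies(n))
```

Even under these simplified assumptions, the gap between the two regimes grows quickly with the number of rounds, which is why primers matter for reaching low detection limits.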
FIGS. 3A-3F show a series of experiments demonstrating that nicking enzyme-based linear amplification depends on an optimal nicking enzyme concentration. In these experiments, no other primers were included in the reaction, so only nick-based linear amplification occurred. The nicking enzymes used in these experiments were Nt.AlwI (used as a positive control), T7-mismatched nAsCpf1, or matched nAsCpf1. The guide concentration was held constant at 5 μM input while the nicking enzyme concentration was titrated. nAsCpf1 is capable of nicking double-stranded DNA, which is then amplified by a strand-displacing polymerase. These data show that the optimal concentration for nAsCpf1 amplification was 500 nM, not the highest concentration tested (1 μM).
Using the amplified NEAR reaction as input, a series of experiments was performed in which nucleic acid targets were amplified and detected using SYTO intercalating dyes (FIGS. 4A-4C), gel-based readout (FIGS. 4D-4F), or Cas13-based SHERLOCK detection (FIGS. 4G-4I). These results indicate that amplification with NEAR produces many non-specific products and is therefore incompatible with SYTO or gel-based readout. However, CRISPR SHERLOCK-based detection can avoid this problem and allow specific detection of the product of interest. Data obtained using SYTO or CRISPR SHERLOCK-based detection (using Cas13 or Cpf1 detection) were further plotted as target/no-target ratios (FIG. 5). The figure shows that the LwCas13a and Cpf1 guide complexes programmed to the target site are able to distinguish between specific and non-specific amplification, whereas SYTO intercalating dye detection cannot do so under standard conditions.
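The target/no-target ratio used for the comparison above can be computed as sketched below. The fluorescence values are hypothetical placeholders chosen only to illustrate the calculation; they are not data from this disclosure, and the function name is an assumption.

```python
def target_to_no_target_ratio(target_signal: float, no_target_signal: float) -> float:
    """Ratio of the signal with target present to the signal of the no-target control."""
    if no_target_signal <= 0:
        raise ValueError("no-target signal must be positive")
    return target_signal / no_target_signal


# Hypothetical fluorescence readouts (arbitrary units): (with target, no target).
readouts = {
    "SYTO dye": (9800.0, 9500.0),
    "Cas13 SHERLOCK": (12000.0, 400.0),
}

ratios = {name: target_to_no_target_ratio(t, b) for name, (t, b) in readouts.items()}
for name, ratio in ratios.items():
    print(f"{name}: target/no-target ratio = {ratio:.2f}")
```

A ratio close to 1 means the readout cannot separate specific product from non-specific background, while a ratio well above 1 indicates target-specific signal.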
FIGS. 6A and 6B are two graphs comparing NEAR alone with NEAR combined with SHERLOCK detection. Several conclusions can be drawn from these figures. First, LwCas13a SHERLOCK can achieve a lower detection limit through T7 amplification and strong collateral RNase activity. Second, a detection limit of 2 aM can be reached using Nt.AlwI NEAR with Cas13, and a detection limit of 2 fM can be reached using nAsCpf1-NEAR with Cas13. Finally, AsCpf1 detection was not sensitive enough, in combination with either NEAR reaction, to provide a reliable signal below 20 fM.
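To put these detection limits in concrete terms, molar concentrations can be converted to approximate molecule counts per microliter. The sketch below uses only the Avogadro constant and unit conversion; the function name is an assumption, and the reaction volume actually used is not stated here.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_microliter(molar: float) -> float:
    """Convert a concentration in mol/L to molecules per microliter."""
    return molar * AVOGADRO * 1e-6  # mol/L -> molecules/L -> molecules/uL


# Detection limits mentioned in the text: 2 aM, 2 fM, and 20 fM.
for label, conc in (("2 aM", 2e-18), ("2 fM", 2e-15), ("20 fM", 20e-15)):
    print(f"{label}: about {molecules_per_microliter(conc):.3g} molecules per microliter")
```

On this arithmetic, 2 aM corresponds to roughly one molecule per microliter, whereas 2 fM corresponds to roughly a thousand.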
NEAR SHERLOCK can be performed at different temperatures depending on the polymerase used. FIGS. 7A-7C show that NEAR can be performed using Bst 2.0 WarmStart polymerase at 60 °C; FIGS. 8A-8B show that NEAR can also be performed using Sequenase 2.0 polymerase at 37 °C.
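The polymerase-dependent choice of reaction temperature can be captured in a small lookup, sketched below. The two temperatures are those stated above; the dictionary keys are the usual commercial enzyme names, assumed here to correspond to the polymerases mentioned in the text, and the function name is hypothetical.

```python
# Reaction temperatures (degrees Celsius) as stated in the text above.
# Assumption: keys are illustrative labels, not identifiers from any software.
NEAR_REACTION_TEMP_C = {
    "Bst 2.0 WarmStart": 60,
    "Sequenase 2.0": 37,
}


def reaction_temperature(polymerase: str) -> int:
    """Look up the isothermal reaction temperature recorded for a polymerase."""
    try:
        return NEAR_REACTION_TEMP_C[polymerase]
    except KeyError:
        raise KeyError(f"no temperature recorded for {polymerase!r}") from None


print(reaction_temperature("Bst 2.0 WarmStart"))
```

Any other polymerase would need its own experimentally determined temperature before being added to such a table.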
***
Various modifications and variations of the methods, pharmaceutical compositions and kits described herein will be apparent to those skilled in the art without departing from the scope and spirit of the invention. While the invention has been described in conjunction with specific embodiments, it will be understood that the invention is capable of further modifications and that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in the art are intended to be within the scope of the invention. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features herein before set forth.

Claims (49)

1. A method of amplifying and/or detecting a target double-stranded nucleic acid, the method comprising:
a. combining a sample comprising the target double-stranded nucleic acid with an amplification reaction mixture comprising:
i. an amplifying CRISPR system comprising a first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first target nucleic acid location and a second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second target nucleic acid location; and
ii. a polymerase;
b. amplifying the target nucleic acid;
c. adding to the reaction mixture a primer pair comprising a first primer and a second primer, the first primer comprising a portion complementary to the first position and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to the second position and a portion comprising a binding site for the second guide molecule; and
d. further amplifying the target nucleic acid by repeated extension and nicking under isothermal conditions.
2. The method of claim 1, wherein the first guide molecule directs the first CRISPR/Cas complex to a first strand of the target nucleic acid and the second guide molecule directs the second CRISPR/Cas complex to a second strand of the target nucleic acid.
3. The method of claim 1, wherein the first and second target nucleic acid positions are on the first strand of the target nucleic acid, thereby generating ssDNA comprising the sequence of the first strand of the target nucleic acid between the first and second target nucleic acid positions.
4. The method of claim 2, comprising amplifying the target nucleic acid by: nicking the first and second strands of the target nucleic acid using the first and second CRISPR/Cas complexes, and displacing and extending the nicked strands using the polymerase, thereby generating a duplex comprising the target nucleic acid sequence between the first and second nick sites.
5. The method of claim 1, wherein the Cas-based nickase is selected from the group consisting of: Cas9 nickase, Cpf1 nickase, and C2C1 nickase.
6. The method of claim 2, wherein the Cas-based nickase is a Cas9 nickase protein comprising a mutation in an HNH domain.
7. The method of claim 2, wherein the Cas-based nickase is a Cas9 nickase protein, the Cas9 nickase protein comprising a mutation corresponding to N863A in SpCas9 or N580A in SaCas9.
8. The method of claim 3 or 4, wherein the Cas-based nickase is a Cas9 protein derived from a bacterial species selected from the group consisting of: streptococcus pyogenes, Staphylococcus aureus, Streptococcus thermophilus, Streptococcus mutans, Streptococcus agalactiae, Streptococcus equisimilis, Streptococcus sanguis, and Streptococcus pneumoniae; campylobacter jejuni, campylobacter coli; salsuginis, n tergarcus; staphylococcus aureus, staphylococcus carnosus; neisseria meningitidis, neisseria gonorrhoeae; listeria monocytogenes, listeria monocytogenes; clostridium botulinum, clostridium difficile, clostridium tetani, clostridium sojae, francisella tularensis 1, prevotella easily, lachnospiraceae MC 20171, vibrio proteolyticus, isocratic bacteria GW2011_ GWA2_33_10, centipede bacteria GW2011_ GWC2_44_17, smith bacteria SCADC, aminoacetococcus BV3L6, lachnospiraceae MA2020, candidate termite methanogen, shigella, moraxella bovis 237, paddy field leptospira, lachnospiraceae bacteria ND2006, porphyromonas canicola 3, prevotella saccharolytica, and porphyromonas macaque.
9. The method of claim 2, wherein the Cas-based nickase is a Cpf1 nickase protein comprising a mutation in the Nuc domain.
10. The method of claim 6, wherein the Cas-based nickase is a Cpf1 nickase protein comprising a mutation corresponding to R1226A in AsCpf1.
11. The method of claim 6 or 7, wherein the Cas-based nickase is a Cpf1 protein derived from a bacterial species selected from the group consisting of: lenflansii, Prevotella facilis, Despirochaetaceae, Vibrio proteolyticus, Isodomycota, thrifton, Smith, Aminococcus, Demospiromyces, Methanobacterium candidate for Termite, Shigella, Moraxella bovis, Leptospira padina, Porphyromonas canicola, Prevotella saccharolytica, Porphyromonas kiwii, Vibrio amylovorans, Prevotella saccharolytica, Flavobacterium gilophilum, Statezomucor, Pseudomomyces poricus, Eubacterium species, GenBank (Loermania), Flavobacterium species, Prevotella brevis, Moraxella capriae, Bacteroides stomatitis, Porphyromonas canicola, Intertrophomonas Otopteria, Prevotella, Anaerobacter, Vibrio fibrinolytica, Methylophilus methanophilus candidate, Vibrio butyricum, Vibrio species, Oral spore-free anaerobe species, rumen vibrio pseudobutyrate and butyric acid producing bacteria.
12. The method of claim 2, wherein the Cas-based nickase is a C2C1 nickase protein comprising a mutation in the Nuc domain.
13. The method of claim 9, wherein the Cas-based nickase is a C2C1 nickase protein and the C2C1 nickase protein comprises a mutation corresponding to D570A, E848A, or D977A in AacC2C1.
14. The method of claim 9 or 10, wherein the Cas-based nickase is a C2C1 protein derived from a bacterial species selected from the group consisting of: acid-fast alicyclic acid bacillus, contaminated alicyclic acid bacillus, alicyclobacillus macrocephalosporans, cottoniella, cottonwood bacillus, candidate forest tree bacteria, very desulfovibrio, sulfur dismutase desulfonium saline alkali bacillus, Zygomycota bacteria RIFOXYA12, omnivora WOR _2 bacteria RIFCSPHIGHO2, Blastomycetaceae bacteria TAV5, Furomycetes bacteria ST-NAGAB-D1, Fomitomycota bacteria RBG _13_46_10, spirochete bacteria GWB1_27_13, Microbacteraceae bacteria UBA2429, Thermobacter thermoblock, Bacillus amyloliquefaciens CF112, Bacillus NSP2.1, butyrate-reducing sulfate bacillus, alicyclobacillus, Citrobacter freundii, Brevibacillus agri (e.g. BAB-2500) and Methylobacterium nodosum.
15. The method of any one of the preceding claims, wherein the first Cas-based nickase and the second Cas-based nickase are the same.
16. The method of any one of claims 1-11, wherein the first Cas-based nickase and the second Cas-based nickase are different.
17. The method of any one of the preceding claims, wherein the polymerase is selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, Gst polymerase, Taq polymerase, Klenow fragment of Escherichia coli DNA polymerase I, KlenTaq, Pol III DNA polymerase, T5 DNA polymerase, and Sequenase DNA polymerase.
18. The method of any one of the preceding claims, wherein amplification of the target nucleic acid is performed at about 50-59 ℃.
19. The method of any one of claims 1-14, wherein amplification of the target nucleic acid is performed at about 60-72 ℃.
20. The method of any one of claims 1-14, wherein amplification of the target nucleic acid is performed at about 37 ℃ or about 65 ℃.
21. The method of any one of claims 1-14, wherein amplification of the target nucleic acid is performed at a constant temperature.
22. The method of any one of the preceding claims, wherein the target nucleic acid sequence is about 20-30, about 30-40, about 40-50, or about 50-100 nucleotides in length.
23. The method of any one of claims 1-18, wherein the target nucleic acid sequence is about 100-200, about 100-500, or about 100-1000 nucleotides in length.
24. The method of any one of claims 1-18, wherein the length of the target nucleic acid sequence is about 1000-2000 nucleotides, about 2000-3000 nucleotides, about 3000-4000 nucleotides or about 4000-5000 nucleotides.
25. The method of any one of the preceding claims, wherein the first primer or the second primer comprises an RNA polymerase promoter.
26. The method of any one of the preceding claims, further comprising detecting the amplified nucleic acid by a method selected from the group consisting of: gel electrophoresis, intercalating dye detection, PCR, real-time PCR, Fluorescence Resonance Energy Transfer (FRET), mass spectrometry, and CRISPR-SHERLOCK.
27. The method of claim 23, wherein the amplified nucleic acid is detected by a Cas13-based CRISPR-SHERLOCK method.
28. The method of any one of the preceding claims, wherein the target nucleic acid is detected with attomolar sensitivity.
29. The method of any one of claims 1-24, wherein the target nucleic acid is detected with femtomolar sensitivity.
30. The method of any one of the preceding claims, wherein the target nucleic acid is selected from the group consisting of: genomic DNA, mitochondrial DNA, viral DNA, plasmid DNA, and synthetic double-stranded DNA.
31. The method of any one of the preceding claims, wherein the sample is a biological sample or an environmental sample.
32. The method of claim 28, wherein the biological sample is blood, plasma, serum, urine, stool, sputum, mucus, lymph, synovial fluid, bile, ascites, pleural effusion, seroma, saliva, cerebrospinal fluid, aqueous or vitreous fluid, or any bodily secretion, exudate, or fluid obtained from a joint, or a swab of a skin or mucosal surface.
33. The method of claim 29, wherein the sample is blood, plasma, or serum obtained from a human patient.
34. The method of claim 28, wherein the sample is a plant sample.
35. The method of any one of the preceding claims, wherein the sample is a crude sample.
36. The method of any one of claims 1-31, wherein the sample is a purified sample.
37. A method for amplifying and/or detecting a target single-stranded nucleic acid, the method comprising:
(a) converting the single-stranded nucleic acid in the sample into a target double-stranded nucleic acid; and
(b) performing the steps of claim 1.
38. The method of claim 34, wherein the target single-stranded nucleic acid is an RNA molecule.
39. The method of claim 35, wherein the RNA molecule is converted to the double-stranded nucleic acid by a reverse transcription and amplification step.
40. The method of claim 34, wherein the target single-stranded nucleic acid is selected from the group consisting of: single-stranded viral DNA, viral RNA, messenger RNA, ribosomal RNA, transfer RNA, microRNA, short interfering RNA, microRNA, synthetic RNA, and synthetic single-stranded DNA.
41. A system for amplifying and/or detecting a target double-stranded nucleic acid in a sample, the system comprising:
a) an amplifying CRISPR system comprising a first CRISPR/Cas complex and a second CRISPR/Cas complex, the first CRISPR/Cas complex comprising a first Cas-based nickase and a first guide molecule that directs the first CRISPR/Cas complex to a first strand of the target nucleic acid, and the second CRISPR/Cas complex comprising a second Cas-based nickase and a second guide molecule that directs the second CRISPR/Cas complex to a second strand of the target nucleic acid;
b) a polymerase;
c) a primer pair comprising a first primer and a second primer of the reaction mixture, the first primer comprising a portion complementary to the first strand of the target nucleic acid and a portion comprising a binding site for the first guide molecule, and the second primer comprising a portion complementary to the second strand of the target nucleic acid and a portion comprising a binding site for the second guide molecule; and optionally
d) a detection system for detecting amplification of the target nucleic acid.
42. The system of claim 38, wherein the Cas-based nickase is selected from the group consisting of: Cas9 nickase, Cpf1 nickase, and C2C1 nickase.
43. The system of claim 38 or 39, wherein the polymerase is selected from the group consisting of: Bst 2.0 DNA polymerase, Bst 2.0 WarmStart DNA polymerase, Bst 3.0 DNA polymerase, full-length Bst DNA polymerase, large fragment Bsu DNA polymerase, phi29 DNA polymerase, T7 DNA polymerase, and Sequenase DNA polymerase.
44. The system of any one of claims 38-40, wherein the Cas-based nickase and the polymerase function at the same temperature.
45. A system for amplifying and/or detecting a target single-stranded nucleic acid in a sample, the system comprising:
a) an agent for converting the target single-stranded nucleic acid into a double-stranded nucleic acid;
b) the composition of claim 38.
46. A kit for amplifying and/or detecting a target double-stranded nucleic acid in a sample, the kit comprising the components of claim 38 and a set of instructions for use.
47. The kit of claim 43, further comprising reagents for purifying the double stranded nucleic acid in the sample.
48. A kit for amplifying and/or detecting a target single-stranded nucleic acid in a sample, the kit comprising the components of claim 43 and a set of instructions for use.
49. The kit of claim 4, further comprising reagents for purifying the single stranded nucleic acid in the sample.
CN201980055278.XA 2018-06-26 2019-06-26 Amplification compositions, systems, and methods based on CRISPR double nickases Pending CN112639121A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201862690278P 2018-06-26 2018-06-26
US62/690,278 2018-06-26
US201862767059P 2018-11-14 2018-11-14
US62/767,059 2018-11-14
PCT/US2019/039221 WO2020006067A1 (en) 2018-06-26 2019-06-26 Crispr double nickase based amplification compositions, systems, and methods

Publications (1)

Publication Number Publication Date
CN112639121A true CN112639121A (en) 2021-04-09

Family

ID=67297327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980055278.XA Pending CN112639121A (en) 2018-06-26 2019-06-26 Amplification compositions, systems, and methods based on CRISPR double nickases

Country Status (12)

Country Link
US (1) US20210207203A1 (en)
EP (1) EP3814520A1 (en)
JP (1) JP2021528091A (en)
KR (1) KR20210024010A (en)
CN (1) CN112639121A (en)
AU (1) AU2019291827A1 (en)
BR (1) BR112020026246A2 (en)
CA (1) CA3102211A1 (en)
IL (1) IL278963A (en)
MX (1) MX2020013461A (en)
SG (1) SG11202012785VA (en)
WO (1) WO2020006067A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021247870A1 (en) * 2020-06-03 2021-12-09 Siphox, Inc. Cascading amplification for chemical and biosensing
CN111733216B (en) * 2020-06-22 2023-03-28 山东舜丰生物科技有限公司 Method for improving detection efficiency of target nucleic acid
CN112831544B (en) * 2020-12-31 2024-06-14 华南农业大学 Biological detection method and biological detection device based on CRISPR/Cas12a system
CN113186253B (en) * 2021-04-27 2022-06-21 福州大学 Cas12a-DNAzyme sensor for detecting Lewy body disease marker and preparation method thereof
EP4373963A2 (en) 2021-07-21 2024-05-29 Montana State University Nucleic acid detection using type iii crispr complex

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090017453A1 (en) * 2007-07-14 2009-01-15 Maples Brian K Nicking and extension amplification reaction for the exponential amplification of nucleic acids
WO2016205749A1 (en) * 2015-06-18 2016-12-22 The Broad Institute Inc. Novel crispr enzymes and systems
CN107075546A (en) * 2014-08-19 2017-08-18 President and Fellows of Harvard College RNA-guided systems for probing and mapping of nucleic acids
WO2017189308A1 (en) * 2016-04-19 2017-11-02 The Broad Institute Inc. Novel crispr enzymes and systems
US20170332610A1 (en) * 2016-05-20 2017-11-23 Regeneron Pharmaceuticals, Inc. Methods for breaking immunological tolerance using multiple guide rnas
CN107488710A (en) * 2017-07-14 2017-12-19 Shanghai Tolo Biotechnology Co., Ltd. Use of a Cas protein, and method and kit for detecting target nucleic acid molecules

Family Cites Families (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US715640A (en) 1902-09-10 1902-12-09 Whitney Mfg Company Clutch mechanism.
US5541099A (en) 1989-08-10 1996-07-30 Life Technologies, Inc. Cloning and expression of T5 DNA polymerase reduced in 3'-to-5' exonuclease activity
US6555349B1 (en) 1993-01-22 2003-04-29 Cornell Research Foundation, Inc. Methods for amplifying and sequencing nucleic acid molecules using a three component polymerase
NZ520579A (en) 1997-10-24 2004-08-27 Invitrogen Corp Recombinational cloning using nucleic acids having recombination sites and methods for synthesizing double stranded nucleic acids
WO2008149176A1 (en) 2007-06-06 2008-12-11 Cellectis Meganuclease variants cleaving a dna target sequence from the mouse rosa26 locus and uses thereof
US9567573B2 (en) 2010-04-26 2017-02-14 Sangamo Biosciences, Inc. Genome editing of a Rosa locus using nucleases
US9096897B2 (en) * 2012-04-09 2015-08-04 Envirologix Inc. Compositions and methods for quantifying a nucleic acid sequence in a sample comprising a primer oligonucleotide with a 3′-terminal region comprising a 2′-modified nucleotide
RU2721275C2 (en) 2012-12-12 2020-05-18 Те Брод Инститьют, Инк. Delivery, construction and optimization of systems, methods and compositions for sequence manipulation and use in therapy
US20140255928A1 (en) * 2013-03-11 2014-09-11 Elitech Holding B.V. Methods for true isothermal strand displacement amplification
AU2014281027A1 (en) 2013-06-17 2016-01-28 Massachusetts Institute Of Technology Optimized CRISPR-Cas double nickase systems, methods and compositions for sequence manipulation
WO2016149661A1 (en) 2015-03-18 2016-09-22 The Broad Institute, Inc. Massively parallel on-chip coalescence of microemulsions
US9790490B2 (en) 2015-06-18 2017-10-17 The Broad Institute Inc. CRISPR enzymes and systems
CA3012631A1 (en) 2015-06-18 2016-12-22 The Broad Institute Inc. Novel crispr enzymes and systems
CA3024543A1 (en) 2015-10-22 2017-04-27 The Broad Institute, Inc. Type vi-b crispr enzymes and systems
WO2017106657A1 (en) 2015-12-18 2017-06-22 The Broad Institute Inc. Novel crispr enzymes and systems
JP6914274B2 (en) 2016-01-22 2021-08-04 ザ・ブロード・インスティテュート・インコーポレイテッド Crystal structure of CRISPRCPF1
EP3445848A1 (en) 2016-04-19 2019-02-27 The Broad Institute, Inc. Novel crispr enzymes and systems
AU2017253107B2 (en) 2016-04-19 2023-07-20 Massachusetts Institute Of Technology CPF1 complexes with reduced indel activity
CA3028158A1 (en) 2016-06-17 2017-12-21 The Broad Institute, Inc. Type vi crispr orthologs and systems
WO2018035388A1 (en) 2016-08-17 2018-02-22 The Broad Institute, Inc. Novel crispr enzymes and systems
WO2018035387A1 (en) 2016-08-17 2018-02-22 The Broad Institute, Inc. Novel crispr enzymes and systems
JP7228514B2 (en) 2016-12-09 2023-02-24 ザ・ブロード・インスティテュート・インコーポレイテッド CRISPR effector system-based diagnostics
WO2018170340A1 (en) 2017-03-15 2018-09-20 The Broad Institute, Inc. Crispr effector system based diagnostics for virus detection
US11739308B2 (en) 2017-03-15 2023-08-29 The Broad Institute, Inc. Cas13b orthologues CRISPR enzymes and systems
US11174515B2 (en) 2017-03-15 2021-11-16 The Broad Institute, Inc. CRISPR effector system based diagnostics
US11104937B2 (en) 2017-03-15 2021-08-31 The Broad Institute, Inc. CRISPR effector system based diagnostics
US11618928B2 (en) 2017-04-12 2023-04-04 The Broad Institute, Inc. CRISPR effector system based diagnostics for malaria detection
BR112019021378A2 (en) 2017-04-12 2020-05-05 Massachusetts Inst Technology innovative crispr type vi orthologs and systems
US20210121280A1 (en) 2017-04-16 2021-04-29 Sanford Health Filter for Stent Retriever and Methods for Use Thereof
JP7364472B2 (en) 2017-05-18 2023-10-18 ザ・ブロード・インスティテュート・インコーポレイテッド Systems, methods, and compositions for targeted nucleic acid editing
WO2019005866A1 (en) 2017-06-26 2019-01-03 The Broad Institute, Inc. Novel type vi crispr orthologs and systems
CN109209763B (en) 2017-07-06 2019-11-29 北京金风科创风电设备有限公司 Blade pitch changing device and method for wind generating set and wind generating set

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090017453A1 (en) * 2007-07-14 2009-01-15 Maples Brian K Nicking and extension amplification reaction for the exponential amplification of nucleic acids
CN107075546A (en) * 2014-08-19 2017-08-18 President and Fellows of Harvard College RNA-guided systems for probing and mapping of nucleic acids
WO2016205749A1 (en) * 2015-06-18 2016-12-22 The Broad Institute Inc. Novel crispr enzymes and systems
US20180320163A1 (en) * 2015-06-18 2018-11-08 The Broad Institute Inc. Novel crispr enzymes and systems
WO2017189308A1 (en) * 2016-04-19 2017-11-02 The Broad Institute Inc. Novel crispr enzymes and systems
US20170332610A1 (en) * 2016-05-20 2017-11-23 Regeneron Pharmaceuticals, Inc. Methods for breaking immunological tolerance using multiple guide rnas
CN107488710A (en) * 2017-07-14 2017-12-19 Shanghai Tolo Biotechnology Co., Ltd. Use of a Cas protein, and method and kit for detecting target nucleic acid molecules

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUI YANG et al.: "PAM-dependent target DNA recognition and cleavage by C2c1 CRISPR-Cas endonuclease", CELL, vol. 167, no. 7, pages 7 *
JONATHAN S GOOTENBERG et al.: "Nucleic acid detection with CRISPR-Cas13a/C2c2", SCIENCE, vol. 356, no. 6336, XP055781069, DOI: 10.1126/science.aam9321 *

Also Published As

Publication number Publication date
JP2021528091A (en) 2021-10-21
US20210207203A1 (en) 2021-07-08
IL278963A (en) 2021-01-31
BR112020026246A2 (en) 2021-04-20
WO2020006067A1 (en) 2020-01-02
CA3102211A1 (en) 2020-01-02
MX2020013461A (en) 2021-04-28
AU2019291827A1 (en) 2020-12-24
KR20210024010A (en) 2021-03-04
EP3814520A1 (en) 2021-05-05
SG11202012785VA (en) 2021-01-28

Similar Documents

Publication Publication Date Title
CN112543812A (en) Amplification methods, systems and diagnostics based on CRISPR effector systems
CN112639121A (en) Amplification compositions, systems, and methods based on CRISPR double nickases
EP3765616B1 (en) Novel crispr dna and rna targeting enzymes and systems
CA3106035A1 (en) Cas12b enzymes and systems
CN112041444A (en) Novel CRISPR DNA targeting enzymes and systems
US20220154258A1 (en) Crispr effector system based multiplex diagnostics
US20220333208A1 (en) Crispr effector system based multiplex cancer diagnostics
EA038500B1 (en) THERMOSTABLE Cas9 NUCLEASES
CN105408497A (en) Using truncated guide rnas (tru-grnas) to increase specificity for rna-guided genome editing
CN109844113B (en) Scalable biotechnological production of sequence and length-defined DNA single-stranded molecules
CN114207145A (en) Type III CRISPR/Cas based diagnostics
US20210147915A1 (en) Crispr/cas and transposase based amplification compositions, systems and methods
JP2020517299A (en) Site-specific DNA modification using a donor DNA repair template with tandem repeats
Qu et al. Group II intron inhibits conjugative relaxase expression in bacteria by mRNA targeting
CA3093580A1 (en) Novel crispr dna and rna targeting enzymes and systems
EP4121532A1 (en) Crispr system high throughput diagnostic systems and methods
CN113302312A (en) Multiplexing of highly evolved virus variants using the SHERLock detection method
Yang et al. Antisense RNA elements for downregulating expression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination