WO2019067011A1 - Programmed modulation of crispr/cas9 activity - Google Patents

Programmed modulation of CRISPR/Cas9 activity

Info

Publication number
WO2019067011A1
Authority
WO
WIPO (PCT)
Prior art keywords
cas9
crispr
sequence
gene
pgf
Application number
PCT/US2018/016231
Other languages
French (fr)
Inventor
Gregory FINNIGAN
Original Assignee
Kansas State University Research Foundation
Application filed by Kansas State University Research Foundation

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/905 Stable introduction of foreign DNA into chromosome using homologous recombination in yeast
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09 Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/095 Fusion polypeptide containing a localisation/targetting motif containing a nuclear export signal

Abstract

Modifications to the CRISPR-Cas gene editing and gene drive systems are described for inhibiting the effectiveness of the gene editing system, such as by including a regulatory element that reduces expression of the CRISPR nuclease in the cell, or by inducing one or more base pair mismatches in the single guide RNA to reduce specificity of the system to the target sequence, or by including one or more nuclear export signal sequences to reduce accumulation of the CRISPR nuclease in the nucleus of the cell, as well as by inducing competition between CRISPR nucleases by inclusion of a secondary functional or dead CRISPR nuclease in the system. Anti-CRISPR peptides that inhibit the activity of the CRISPR nuclease are also described.

Description

PROGRAMMED MODULATION OF CRISPR/CAS9 ACTIVITY
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application claims the priority benefit of U.S. Provisional Patent Application Serial No. 62/565,651, filed September 29, 2017, entitled PROGRAMMED MODULATION OF CRISPR/CAS9 ACTIVITY, incorporated by reference in its entirety herein.
FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with government support under P20 GM103418 awarded by the National Institutes of Health. The government has certain rights in the invention.
SEQUENCE LISTING
The following application contains a sequence listing in computer readable format (CRF), submitted as a text file in ASCII format entitled "Sequence Listing," created on January 31, 2018, as 25KB. The content of the CRF is hereby incorporated by reference.
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention relates to modified CRISPR-based gene editing and gene drive systems.
Description of Related Art
The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) gene editing system has emerged with the potential to revolutionize fields including basic research, biomedical applications, and agriculture. The discovery and rapid expansion of CRISPR-based gene editing across all fields of molecular biology has led to many unique applications of this biotechnology. These range from "traditional" editing (deletion, modification, or replacement of genomic loci), to genome-wide screens, to repurposing of dCas9 to modulate gene transcription, to imaging of chromosome dynamics in real-time.
Arguably one of the most powerful (and ethically concerning) uses of CRISPR is within the nuclease-based "gene drive." CRISPR-associated protein-9 nuclease (Cas9) from Streptococcus pyogenes is a particularly well-studied arrangement of this system. The gene drive consists of the Cas9 nuclease, a single-guide RNA, and genomic DNA. Once expressed and activated, the RNA-loaded Cas9 targets the chromosomal DNA and creates a double-stranded break (DSB). Upon induction of a DSB within a diploid cell, the homologous chromosome serves as the DNA "donor" and, via homology directed repair (HDR), will copy the entire gene drive (Cas9 and its sgRNA) to the second chromosome, modifying the endogenous sequence in the process. Example modifications include inserting or installing the gene drive "cargo" via the DSB at the desired genomic location or even deleting and replacing the entire endogenous copy of the gene. However, action and copying of Cas9 and guide RNA remain the same for both versions. In any event, this process "forces" the heterozygous individual into the homozygous state, ensuring rapid propagation through a population. The technology allows targeted bypass and editing of genetic material (in the form of one or more genes). The CRISPR-based gene drive bypasses traditional genetic actions and "forces/drives" a predetermined genetic element into a population at a very high rate of speed and penetrance. Indeed, early studies in the laboratory have demonstrated the unnatural power of a gene drive system to impose strong selection and over a 95% reduction in a population in only a few generations.
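The super-Mendelian propagation described above can be illustrated with a deliberately simplified, deterministic model. The following Python sketch is not taken from the application: it assumes random mating, no fitness cost, no resistance alleles, and a single hypothetical parameter, `conversion`, for the fraction of heterozygotes in which the drive is copied onto the homologous chromosome.

```python
from typing import List


def next_drive_frequency(q: float, conversion: float) -> float:
    """Drive allele frequency in the next generation (illustrative model).

    q          -- current drive allele frequency, between 0 and 1
    conversion -- assumed probability that the wild-type allele of a
                  heterozygote is converted to the drive by HDR
                  (0 corresponds to ordinary Mendelian inheritance)
    """
    # Homozygous drive carriers (q*q) always transmit the drive.
    # Heterozygotes (2q(1-q)) transmit it with probability
    # (1 + conversion) / 2 instead of the Mendelian 1/2.
    return q * q + 2 * q * (1 - q) * (1 + conversion) / 2


def simulate(q0: float, conversion: float, generations: int) -> List[float]:
    """Return the drive frequency for generation 0..generations."""
    freqs = [q0]
    for _ in range(generations):
        freqs.append(next_drive_frequency(freqs[-1], conversion))
    return freqs
```

With `conversion` set to 0 the allele frequency stays constant, while values near 1 push a rare allele toward fixation within a handful of generations; intermediate values give the slower spread that a titrated drive would be expected to show under these assumptions.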
CRISPR-based systems have been proposed as a technique to control pest populations, and are currently being studied in model systems including mosquitos for their role as hosts for pathogens including malaria. In some cases, such as insects that harbor human pathogens (e.g. malaria), elimination or reduction of a small number of species would have a dramatic impact across the globe. Destruction of critical genes can cause a reduction in organism breeding, spread, and population size. This can cause a dramatic decrease in population levels of key insects, pests, or pathogens in wild populations.
However, there are logistical, technical, and ethical considerations for utilizing this very new technology in real applications. Recent studies have demonstrated the ability of a gene drive to rapidly spread within and nearly eliminate insect populations in a laboratory setting. Further, given the incredibly strong selective pressure on a population, insect species have been shown to evolve resistance to the gene drive. In addition, wild populations may naturally have a diverse set of polymorphisms within a given gene target (and thus provide a "native" source to evade a gene drive). Moreover, there is ongoing competition between HR-based propagation of the drive and repair of the double-stranded break via non-homologous end-joining.
Before deployment of this powerful biological agent, we must consider mechanisms to control the unintentional, accidental, or malicious introduction of CRISPR-based genetic drive systems into native ecological systems, and potential unintended consequences of extinction-level population control of a given species, as well as naturally evolved resistance to gene drives that may render the CRISPR approach ineffective in the future.
SUMMARY OF THE INVENTION
The present invention is broadly concerned with four independent approaches, illustrated in Fig. 1 (Cas9 protein level, sgRNA identity, NLS/NES shuttling of Cas9, dCas9 fusion competition) to modulate CRISPR/Cas9 activity, as well as the use of anti-CRISPR proteins. The techniques can be applied to traditional gene editing technologies in any cell type that relies on RNA-guided nuclease activity, as well as to gene drive systems in diploid (or polyploid) organisms/cells. Across a population, each of these mechanisms may result in a proportion of individuals that achieve proper activity and copying of the gene drive system; other individuals will be unable to propagate the drive in a Super-Mendelian fashion (right). Together, these systems may be used to titrate a specific success (propagation) rate for a CRISPR gene drive within a population.
Embodiments include transcriptional regulation of Cas9, wherein titration of Cas9 protein level using an inducible promoter causes a range of gene drive activities within a population given the amount of induced transcript/protein of the nuclease.
Embodiments also include modification of the guide RNA to reduce drive activity.
Embodiments also include nucleic acid constructs using nuclear localization signals (NLS) and/or nuclear export signals (NES), with combinations of NLS and NES sequences, mutation of such sequences, and adjusted positioning of the sequences (N-terminus of protein, C-terminus, etc.) to modulate nuclear shuttling of Cas9 protein.
Embodiments also include dead Cas9 (dCas9) competition, including a fusion of identical Cas9 and dCas9 as a tandem (single) protein; use of other orthologs (other species) in the fusion; and separation of dCas9 and active Cas9 while targeting both (with sgRNAs carrying identical targeting information) to the identical position so that they compete with each other. Cas9 and dCas9 systems can be created using either identical species or different species. Likewise, dCas9 and active Cas9 can be placed under two different promoters to allow for titration of one over the other.
Embodiments also include the use of AcrIIA2/AcrIIA4 anti-CRISPR proteins to inhibit drives or modulate/titrate activity, including mutation of anti-CRISPR protein sequence; use of biochemical or cellular tags (NLS sequence, or GFP) to reduce A2/A4 effectiveness at interacting/inhibiting Cas9; temporal expression control - differential expression (Cas9 on first, inhibitor on second), lower promoter for inhibitor, etc.; and spatial control - placement or trafficking of inhibitor into nucleus, or other cellular compartment/structure to limit or regulate access to interaction with Cas9.
Combinations of one or more of the foregoing approaches can also be used to create a near limitless ability to finely tune the activity of the gene editing system.
It will be appreciated that these techniques would be useful for: agricultural pest eradication (plant, fungal, or animal (insect)); human health, through removal of disease-carrying hosts (e.g., plant, fungal, or animal, mainly insect); invasive species (plant, fungal, animal); ecological control/programming (restoration of ecological balance, removal of pests, invasive species, etc.); basic biological research and biomedical research, by aiding construction of genetically modified (homozygous diploid) strains or individuals; safeguarding against accidental or malicious release of a gene drive bio-agent (e.g., defense/countermeasure); and detaching a pathogen from the host (disrupting some key factor so that the mosquito no longer harbors the malarial parasite), rather than killing the host itself.
The ability to insert pre-programmed genetic information into an entire population (gene drive used as a "cargo" delivery system) has numerous advantages. This technology means that gene editing does not have to involve eradication or reduction in population size, but instead can serve to insert genetic material, modify, alter, change, or pre-program and install genetic "software" into a given species to a) provide protection or an immune response, b) provide resistance to a foreign chemical or agent, or c) improve the species (resistance to disease or pathogens, e.g., in bees and other pollinators), or make it more susceptible to other agents (removal of resistant strains) and "sensitize" the population. This could include priming populations with the ability to withstand climate change, making the species more able to tolerate higher temperatures, higher salinity, higher acidity, etc.
In one aspect, a modified CRISPR-Cas gene drive system is described herein. The gene drive is configured for integration into a diploid eukaryotic cell genome at a target site and comprises a gene drive construct comprising a first nucleotide sequence encoding a single guide RNA sequence complementary to the endogenous target site; a second nucleotide sequence encoding for a functional CRISPR nuclease that induces a double-stranded break in or near the target site; and a pair of endogenous flanking sequences homologous to sequences adjacent the target site of integration. The sgRNA and CRISPR nuclease coding sequences are typically on the same vector and located between the pair of flanking sequences in the construct. Advantageously, the gene drive system further comprises one or more modifications that inhibit activity of the gene drive in the cell. One modification includes a regulatory element operably linked to the second nucleotide sequence that reduces expression of the functional CRISPR nuclease in the cell, such that the nuclease activity of the enzyme is reduced. One modification includes one or more base pair mismatches in the single guide RNA to reduce specificity of the system to the target site. One modification includes the inclusion of one or more nuclear export signal sequences to reduce accumulation of the functional CRISPR nuclease in the nucleus of the cell. One modification includes a secondary CRISPR nuclease expressed in the cell that competes with the functional CRISPR nuclease for binding to the target site. Combinations of these modifications can be included in the system, along with one or more anti-CRISPR peptides.
Also described herein are methods of integrating a gene drive into a eukaryotic cell genome at a target site. The methods generally comprise introducing into the eukaryotic cell a modified CRISPR-Cas gene drive system according to the various embodiments described herein under conditions where the gene drive is expressed in the cell to produce functional CRISPR nuclease and single guide RNA, which co-localize in the cell at the endogenous target site on a first chromosome, so that the CRISPR nuclease can induce a double-stranded break at the site, wherein homology-directed repair mediated by the flanking sequences integrates the gene drive construct into (or in replacement of) the target site.
Methods of altering genetic sequences of a genome at a target site in a target population of a sexually-reproducing species are also contemplated herein. The methods generally comprise integrating a modified CRISPR-Cas gene drive system according to the various embodiments described herein into a genome of a member of the population. The modified system has at least 5% lower effectiveness in integrating into a genome of subsequent members of the target population as compared to a system introduced into the population without one or more of the proposed modifications.
Also described herein is a modified CRISPR-Cas gene editing system for alteration of genetic sequences in a eukaryotic cell containing a DNA molecule having a target sequence and encoding a gene product. The system comprises one or more vectors comprising a first nucleotide sequence encoding a single guide RNA sequence complementary to the target sequence in said eukaryotic cell; and a second nucleotide sequence encoding for a functional CRISPR nuclease that induces a double-stranded break in or near the target sequence thereby altering the genetic sequence. Advantageously, the gene editing system also includes one or more modifications to inhibit the effectiveness of the alteration of the genetic sequence in the cell. One modification includes a regulatory element operably linked to the second nucleotide sequence that reduces expression of the functional CRISPR nuclease in the cell, such that the nuclease activity of the enzyme is reduced. One modification includes one or more base pair mismatches in the single guide RNA to reduce specificity of the system to the target sequence. One modification includes the inclusion of one or more nuclear export signal sequences to reduce accumulation of the functional CRISPR nuclease in the nucleus of the cell. One modification includes a secondary CRISPR nuclease expressed in the cell that competes with the functional CRISPR nuclease for binding to the target sequence. Combinations of these modifications can be included in the system, along with one or more anti-CRISPR peptides.
Methods of altering a genetic sequence in a eukaryotic cell containing a DNA molecule having a target sequence and encoding a gene product are also described. The methods generally comprise introducing or expressing in the eukaryotic cell a modified CRISPR-Cas gene editing system according to the various embodiments described herein.
Embodiments of the invention are also concerned with eukaryotic host cells comprising a CRISPR-Cas gene drive system or gene editing system according to the various embodiments described herein. Organisms comprising such modified or altered (engineered) cells are also contemplated.
Also described herein are methods of inhibiting a CRISPR-Cas gene editing or gene drive system, wherein said system has been introduced into a eukaryotic cell, said method comprising introducing or expressing an anti-CRISPR protein comprising at least a portion of an amino acid sequence selected from the group consisting of AcrIIA2 (SEQ ID NO:2), AcrIIA4 (SEQ ID NO:3), mutant variants thereof, and combinations thereof into the eukaryotic cell.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure (Fig.) 1 is a schematic illustration of the model for four independent mechanisms for titration of Cas9-based gene drive activity.
Fig. 2A is a schematic illustration of vectors and system for analyzing CRISPR editing in an artificial yeast system.
Fig. 2B is a graph demonstrating that Cas9-dependent editing results in cell inviability.
Fig. 2C is an image from diagnostic PCR of the HIS3 locus after editing.
Fig. 2D shows sequence analysis of insertions and deletions at "repaired" u1 sites following Cas9 editing.
Fig. 3A shows quantification and images of colonies following dextrose or galactose treatment.
Fig. 3B shows sequence analysis of colonies from Fig. 3A after dextrose or galactose treatment.
Fig. 4A shows select sgRNA variants (left) tested in the yeast editing system, along with quantification of colonies (middle) and representative plates (right).
Fig. 4B shows various sgRNA variants and quantification of colonies and comparison of the 19 bp (1 mismatch) embodiment using an unpaired t-test.
Fig. 5 shows (A) quantification of colonies from haploid editing with the modified 19-bp sgRNA (1 mismatch); and (B) representative plates for gene drive function with the modified 19-bp sgRNA.
Fig. 6 is a graph of data examining editing with GFP-tagged and catalytically dead Cas9 in haploid yeast, confirming that GFP tagging does not interfere with editing ability.
Fig. 7A illustrates vectors used to examine nucleocytoplasmic shuttling of Cas9 to control gene editing and quantification of colonies.
Fig. 7B provides comparisons of the colony counts between two strains from 7A analyzed using an unpaired t-test. Red text indicates p-values larger than 0.05.
Fig. 7C contains images of Cas9-eGFP fusions integrated into the yeast genome at the HIS3 locus in a strain expressing an endogenously tagged Nup188-mCherry to mark the nuclear periphery.
Fig. 8A illustrates the editing percentage in haploid yeast using a combination of sgRNA mismatch and Cas9 nuclear localization.
Fig. 8B shows the data from Fig. 8A displayed from lowest to highest editing percentage (left) or as a histogram with 10% binning categories (right).
Fig. 8C shows select comparisons (sgRNA(u1) 19 WT versus 19 with 1 mismatch) between editing percentages analyzed using an unpaired t-test. Red text, p-values greater than 0.05.
Fig. 9A is a schematic illustration of a programmable artificial gene drive for diploid yeast cells.
Fig. 9B contains images of representative colonies on SD-LEU and SD-HIS plates over time.
Fig. 9C is a graph showing drive activity over time when grown in galactose, causing higher expression of Cas9.
Fig. 9D shows images from diagnostic PCR of the HIS3 locus of a random sampling of clonal isolates from SD-LEU plates.
Fig. 10 shows (A) images of representative plates from colonies on SD-LEU and SD-HIS plates; (B) isolates analyzed at each time point; and (C) tested for (i) ploidy status, (ii) G418 resistance, and (iii) growth on SD-HIS showing correlation between phenotype (loss of marker) following gene drive action and genotype (HIS3 locus).
Fig. 11 shows sgRNA variants tested and the percentage of yeast colonies with an active gene drive as quantified.
Fig. 12A shows images of representative plates of colonies edited with the Cas9-eGFP fusions containing various NLS/NES combinations.
Fig. 12B shows two-way comparisons between strains from Fig. 12A using an unpaired t-test. Red text, p-values greater than 0.05. Asterisk, the collective average of all three strains was used for comparisons.
Fig. 12C shows the data from Fig. 12A reordered from least to greatest percentage of active gene drive (top), and presented in a histogram with 10% binning categories (bottom).
Fig. 13 shows data confirming partial editing by the Cas9-eGFP fusions containing NLS and/or NES signals, based upon (A) colony count; (B) gene drive activity, as well as (C) imaging of these colonies.
Fig. 14A contains a schematic illustration of tandem Cas9 fusion design.
Fig. 14B quantifies colonies edited with the Cas9 fusion designs.
Fig. 14C shows images of SD-URA-LEU and SD-HIS plates of colonies integrated with the Cas9 fusion designs.
Fig. 14D shows data comparing the strains at different time points and drive activity.
Fig. 15 shows images and quantification of colonies demonstrating rapid loss of sgRNA(u1) plasmid from diploid yeast in the absence of any selective pressure.
Fig. 16A is a schematic illustration of five independent approaches to activate self-excision of Cas9 from the genome using the (u2) sites and a plasmid expressing the sgRNA(u2) guide sequence.
Fig. 16B shows a graph of the five different approaches, representative plates, and colony counts.
Fig. 17A is a schematic of the yeast Cas9 expression platform at the endogenous HIS3 locus.
Fig. 17B shows a table of strains transformed with the sgRNA[u2] plasmid (A; pGF-V809) or an empty vector control (B; pRS425), or co-transformed with a PCR fragment (C; WT HIS3 ORF with 1,000 bp of flanking 5' and 3' UTR). The total number of surviving colonies was quantified and graphed on a log10 scale.
Fig. 18A shows schematic illustration of three scenarios of gene drive activity involving AcrIIA2 and AcrIIA4 in yeast.
Fig. 18B shows representative plates of colonies edited using CRISPR-Cas9 editing and either A2 or A4 anti-CRISPR proteins.
Fig. 18C shows the total number of surviving colonies quantified for each plate.
Fig. 18D shows the images from diagnostic PCRs on the colonies.
Fig. 19A shows images from inhibition of S. pyogenes Cas9 editing through in vivo expression of AcrIIA2 and AcrIIA4.
Fig. 19B shows the select comparisons between experimental conditions that were analyzed using an unpaired t-test.
Fig. 20 shows images of example editing plates (S. pyogenes) for SD-URA-LEU (repair via NHEJ) or SD-HIS (repair via HDR) after 3-5 days of incubation at 30°C. Empty Vector, pRS425. Plasmid harboring sgRNA[u2], pGF-V809. HIS3 PCR includes approximately 1,000 bp of 5' and 3' UTR.
Fig. 21 shows data demonstrating that small deletions of the AcrIIA2 and AcrIIA4 proteins are not tolerated in vivo and disrupt the ability to inhibit S. pyogenes Cas9 editing.
Fig. 22A shows the results of an unbiased alanine scan of the AcrIIA2 protein and effects on inhibition of S. pyogenes Cas9 editing.
Fig. 22B shows the results of an unbiased alanine scan of the AcrIIA4 protein and effects on inhibition of S. pyogenes Cas9 editing.
Fig. 23A shows the results of mutational analysis of the Cas9 inhibitory function of AcrIIA2 in an active gene drive system.
Fig. 23B shows the results of mutational analysis of the Cas9 inhibitory function of AcrIIA4 in an active gene drive system.
Fig. 23C shows the crystal structure of the AcrIIA4 protein bound to Cas9/sgRNA (PDB 5XBL).
Fig. 24A shows images of colonies from in vivo dCas9 association assays.
Fig. 24B shows the data for localization of A4 to the plasma membrane or cytosol for different strains.
Fig. 25 shows images demonstrating that GFP-tagged AcrIIA2 and AcrIIA4 proteins are not recruited to the plasma membrane by mCherry or the LactC2 domain.
Fig. 26A shows a gene drive system harboring an inducible AcrIIA2/A4 within the drive cassette for temporal control of anti-CRISPR expression, which can modulate the overall activity of a nuclease-based gene drive.
Fig. 26B illustrates five distinct growth conditions tested (labeled 1-5) altering the order of either Cas9 induction, AcrIIA2/A4 induction, or control conditions and images of diploids plated on SD-LEU or SD-HIS medium as previously described (500-1000 cells per plate), along with quantification of percentage of drive activity.
DETAILED DESCRIPTION
The present invention is concerned with various genetic safety mechanisms to program, control, or inhibit CRISPR activity, such as through pre-programming or inhibiting the activity level of the gene drive for the CRISPR expression system. The invention includes multiple independent approaches to regulate, control, titer, optimize, and/or inhibit gene drive systems, which can be used alone or in combination to provide a specific level of desired activity. This technology is currently demonstrated in model yeast systems, but can be applied in various eukaryotic systems including animals, insects, fungi, and plants, which may lead to the eradication of pests and pathogens, as well as removal of accidentally released gene drive systems from the environment. The invention can be used to develop safe, ethical, and environmentally conscientious CRISPR-based gene drive systems with multiple points of regulation, control, or self-eradication. Again, the technology can also address accidental (or malicious) release of a gene-drive system into the wild, and aid other researchers in safely implementing CRISPR-drive systems as a useful genetic tool for basic or applied research.
In some embodiments, described herein are "sub-lethal" CRISPR-based gene editing and gene drive arrangements that are tunable for a desired level of activity and/or effectiveness in the target population. Described herein are gene editing expression vectors, cassettes, plasmids, constructs, gene packages, and the like for programmed gene editing systems, and specifically CRISPR-based systems. Methods of using the same, and methods of modulating gene editing systems are also described. Embodiments of the invention can be used for titrated or controlled modulation of gene editing or gene drive systems in a target pest species to reduce pest populations. Embodiments of the invention can also be used for general research and study related to a variety of gene editing technologies, including therapeutic approaches in humans and animals, agricultural technology (both plant and pest), and the like.
The programmed regulation of gene editing or gene drive activity is achieved through modification of one or more of four conserved components of all CRISPR-based drives; the present disclosure demonstrates the ability of each drive component (Cas9 protein level, sgRNA specificity, Cas9 nucleocytoplasmic shuttling, and Cas9-dCas9 fusion variants) to modulate the overall drive activity within a host population. Advantageously, the techniques to program a desired drive effectiveness described herein are fully self-contained and require no external regulatory mechanisms. Thus, the editing drive system is tunable along a full spectrum of drive efficiencies (up to less than 99%) within a population, for example efficiencies, effectiveness, or activities that are less than about 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10% and even less than about 5% as efficient, effective, and/or active as the same system without one or more of the modifications described herein. In other words, the efficiencies, effectiveness, or activities of the system are decreased by at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% and even by at least 95% as compared to the same system without one or more of the modifications described herein.
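The two ways of stating the reduction above ("less than about X% as active" and "decreased by at least about Y%") are complementary, with Y = 100 - X. A minimal Python sketch of that arithmetic follows; the function names and the example activity values are illustrative assumptions, not measurements from the Examples.

```python
def relative_activity(modified: float, unmodified: float) -> float:
    """Activity of the modified system as a percentage of the unmodified one."""
    if unmodified <= 0:
        raise ValueError("unmodified activity must be positive")
    return 100.0 * modified / unmodified


def percent_decrease(modified: float, unmodified: float) -> float:
    """Percentage by which the modification lowers activity."""
    return 100.0 - relative_activity(modified, unmodified)
```

For instance, a hypothetical drive that converts 45% of diploids after modification, compared with 90% for the unmodified drive, would be 50% as active, which is the same as a 50% decrease.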
Embodiments of the invention include methods for reducing a pest population through genetic modification or alteration of genetic sequences of one or more target genes in the target pest using a CRISPR-based system, wherein the system is less than 99% effective in achieving genetic modification throughout the target population. Preferably, the system is less than about 95% effective, preferably less than about 90% effective, more preferably less than about 75% effective, and in some cases is less than about 50% effective in achieving genetic modification throughout the target population.
Alternative embodiments described herein are also concerned with anti-CRISPR peptides that can be included in the vector system to reduce Cas9 activity once expressed in the system.
Thus, CRISPR-based gene editing systems are contemplated herein which have an (intentionally) impaired activity level in a eukaryotic cell due to one or more pre-programmed impairments to the gene drive system. In more detail, CRISPR-Cas vector systems are described comprising one or more vectors. The vectors comprise generally a single guide RNA (sgRNA) sequence and an enzyme-coding sequence encoding a functional CRISPR enzyme (nuclease). The sgRNA and/or CRISPR enzyme sequences may be operably linked to one or more respective regulatory elements.
The sgRNA can be any short nucleotide sequence that is complementary to the target sequence and therefore capable of hybridizing to the target sequence of a DNA molecule in a eukaryotic cell for the desired genetic modification. The sgRNA interacts (via noncovalent binding) with the CRISPR enzyme (e.g., Cas9) to form a ribonucleoprotein (RNP) complex and thereby directs sequence-specific binding of the CRISPR complex to the target sequence via base pairing of the sgRNA and target genomic DNA sequence. The sgRNA contains the preprogrammed CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA) which is specific to the particular species of Cas9 used. For S. pyogenes, a particular tracrRNA is used, but modified forms of tracrRNA can also be used (e.g., different size, position of stem-loop, chemical modifications, etc.). Generally, the tracrRNA is required either fused to the crRNA (a "single" guide), or expressed as its own piece of RNA. Thus, the "CRISPR complex" comprises the CRISPR enzyme complexed with the sgRNA that is hybridized to the target sequence. Examples of various CRISPR enzymes and sgRNA systems can be found in the art.
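Because targeting relies on base pairing between the guide and the genomic target, the number of mismatched positions is a natural quantity to track when designing the mismatch-containing guides discussed elsewhere in this application. The Python sketch below is only an illustration: the function name is hypothetical, and the comparison is made against the non-target (PAM-containing) strand, so a perfectly matched spacer reads identically to the protospacer once U is written as T.

```python
def spacer_mismatches(spacer_rna: str, protospacer_dna: str) -> int:
    """Count positions at which a guide spacer differs from the protospacer.

    spacer_rna      -- guide spacer, 5' to 3', written with A/C/G/U
    protospacer_dna -- protospacer on the non-target strand, 5' to 3'
    """
    spacer = spacer_rna.upper().replace("U", "T")
    proto = protospacer_dna.upper()
    if len(spacer) != len(proto):
        raise ValueError("spacer and protospacer must have equal length")
    # A position is a mismatch when the two letters differ.
    return sum(1 for a, b in zip(spacer, proto) if a != b)
```

This counts mismatches only; it does not model where in the spacer a mismatch falls, which can also influence how strongly editing is affected.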
The CRISPR enzyme is a nuclease, and preferably a type II CRISPR system enzyme, and more preferably a Cas9 enzyme. Exemplary Cas9 enzymes include S. pyogenes, S. aureus, S. pneumoniae, S. thermophilus, or N. meningitidis Cas9 nucleases, as well as Cas9 orthologs and mutant/variant Cas9 derived from these organisms, so long as the mutant remains functional (i.e., can bind and cleave the target nucleotide sequence). An exemplary Cas9 sequence is provided in SEQ ID NO: 1. Alternative Cas9-like nucleases that could be used in the CRISPR system include Cpf1 (Cas12a), among others, as well as functional mutants and derivatives of Cas9 that otherwise maintain nuclease activity. In practice, the CRISPR enzyme in the bound CRISPR complex cleaves both strands in or near the target sequence in the eukaryotic cell, such that repair of the targeted gene is activated within the given cell (via non-homologous end joining (NHEJ) or homologous recombination). In addition to the sgRNA, site-specific Cas9 activity is targeted by the presence of a protospacer adjacent motif (PAM) sequence adjacent to the target DNA sequence. That is, editing will not occur at any location other than the one at which the Cas9 in the complex recognizes the PAM. PAM sequences are selected to correspond to the particular Cas9 species selected for the complex. The conventional PAM sequence for S. pyogenes is 5'-NGG-3', where "N" is any nucleotide, and in the Examples GGG was typically used. However, different PAMs are associated with different Cas9 proteins (e.g., 5'-NGA-3', 5'-YG-3'), and attempts have also been made to engineer Cas9 to recognize different PAMs. In addition, other nucleases, such as Cpf1, have been shown to recognize other PAM sequences (e.g., 5'-YTN-3'). In some cases, the PAM sequence is present at the designated position in the genome prior to integration of the CRISPR complex, such that the gene drive is inserted next to the PAM.
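The PAM requirement described above can be illustrated with a minimal Python sketch that scans one strand of a DNA string for candidate target sites immediately followed by a 5'-NGG-3' PAM. The function name `find_ngg_sites`, the 20-nt default target length, and the example sequence are illustrative assumptions and are not taken from the disclosure or its sequence listing.

```python
import re

def find_ngg_sites(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for each site followed by 5'-NGG-3'.

    Only the given strand is scanned; a fuller tool would also scan the
    reverse complement. Names and defaults here are illustrative assumptions.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Hypothetical example sequence (not from the source document).
example = "TTTT" + "ACGTACGTACGTACGTACGT" + "GGG" + "AAAA"
sites = find_ngg_sites(example)
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

A different PAM (for example 5'-NGA-3') could be handled by changing the second capture group of the pattern; this sketch does not model PAM preferences of any engineered Cas9 variant.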
In some cases, the PAM sequence is part of the gene drive package, such that both the PAM sequence and gene drive sequence are inserted into the genome. In one or more embodiments, the vector system further comprises one or more nuclear localization sequences (NLS) of sufficient strength to drive accumulation of the CRISPR enzyme in the nucleus of the eukaryotic cell. In one or more embodiments, the vector system contains active Cas9 nuclease and guide RNA and no donor DNA. In such a case, NHEJ repair systems will introduce a mutation, insertion, or deletion of nucleotides at the intended cut site and disrupt gene expression (e.g., deletion of start codon, etc.) or protein function (e.g., frameshift, premature stop, etc.). In one or more embodiments, the vector comprises a double-stranded DNA (dsDNA) donor (exogenous) sequence for incorporation into the eukaryotic cell. In such embodiments, the cleaved DNA sequence can be repaired via homologous recombination with the dsDNA, resulting in alteration of the target sequence (and in some cases altered expression of the gene product(s)). This may result in insertion of exogenous sequences or deletion of endogenous sequences, leading to the introduction of non-native DNA sequence at the site of cleavage (e.g., translational protein fusions, tags, entire gene expression cassettes, or multiple genetic pathways). The components of the CRISPR-based gene editing systems may be located on the same vector or co-expressed on different vectors of the system. The component sequences are preferably codon optimized for expression in the eukaryotic cell.
In the context of modulating engineered gene drive activity, a gene drive vector system typically comprises nucleic acid encoding the CRISPR nuclease, sgRNA, and optional additional donor DNA (and any corresponding regulatory elements), and is capable of copying (integrating) itself into the genome of the host cell into which it is introduced. More specifically, the gene drive components are integrated into an endogenous locus in the genome and may either replace and delete an endogenous gene, or be integrated next to an endogenous gene without causing any disruption. The gene drive vector further comprises flanking sequences on both ends of the gene drive construct that are homologous to sequences adjacent to the insertion (cleavage) site in the host genome. Thus, integration can be at any site in the target sequence guided by the RNA and having homologous flanking sequence adjacent to the insertion site. Once integrated, the sgRNA and CRISPR nuclease are then expressed from the integrated gene drive construct, and the sgRNA targets the resulting CRISPR complex to the corresponding locus on the "wild-type" (or endogenous) copy of the target DNA on the homologous chromosome in a diploid cell at the same site of insertion of the original gene drive construct. The CRISPR complex introduces the double-stranded break at this site, and homologous recombination mediated by the flanking sequences allows the gene drive construct to align with the wild-type (or endogenous) sequence, such that the entire gene drive construct (including the Cas9 and sgRNA nucleotide cassette) is copied over into the chromosome. In this way, a heterozygous diploid is immediately converted into a homozygous diploid condition (i.e., it now contains two copies of the gene drive, rather than one). Thus, the organism (yeast, or multicellular creature) will develop and produce haploid egg/sperm via meiosis where 100% of these gametes have the gene drive present.
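The consequence of heterozygote-to-homozygote conversion for inheritance can be sketched numerically. The following Python example is a deliberately simplified, hypothetical model and is not part of the disclosure: it assumes random mating, no fitness cost, no resistance alleles, and a single parameter `conversion` (the fraction of heterozygotes converted to homozygotes before meiosis). Under those assumptions a heterozygote transmits the drive to a fraction (1 + conversion)/2 of its gametes, which is 50% for Mendelian inheritance and 100% for complete conversion.

```python
def drive_gamete_fraction(conversion):
    """Fraction of a heterozygote's gametes carrying the drive allele.

    conversion = 0.0 gives Mendelian inheritance (0.5);
    conversion = 1.0 gives the fully converted case (1.0).
    """
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must be between 0 and 1")
    return 0.5 * (1.0 + conversion)

def next_allele_frequency(q, conversion):
    """Drive allele frequency after one generation.

    Assumes random mating and no fitness cost (illustrative assumptions).
    """
    homozygous_drive = q * q              # transmit the drive in all gametes
    heterozygous = 2.0 * q * (1.0 - q)    # transmit per drive_gamete_fraction
    return homozygous_drive + heterozygous * drive_gamete_fraction(conversion)

def simulate(q0, conversion, generations):
    """Return the list of allele frequencies from generation 0 onward."""
    freqs = [q0]
    for _ in range(generations):
        freqs.append(next_allele_frequency(freqs[-1], conversion))
    return freqs

# Compare an unimpaired drive with a hypothetical 50%-effective drive.
for c in (1.0, 0.5, 0.0):
    print(c, [round(f, 3) for f in simulate(0.1, c, 5)])
```

In this toy model, lowering `conversion` slows the spread of the drive allele through the population, which is the qualitative effect the impaired systems described herein are intended to provide.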
In general, the gene drive contains regulatory elements that induce expression of the CRISPR complex only during defined developmental conditions, such as early development, or in the fertilized embryo (single stage diploid/polyploid stage). In some cases, inducible promoters can be arranged to drive sgRNA to be "on," while Cas9 protein expression is only timed to be "on" during specific developmental stages. It will be appreciated that the gene drive system must target a DNA sequence that is only found within the "wild type" chromosome and not present within itself to prevent "self-cutting." In the context of the present invention, the term "wild type" is not limited to truly naturally-occurring sequences, but includes previously mutated (whether naturally or artificially) sequences that are nonetheless different from the modified sequence being sought through CRISPR gene editing or incorporation of the gene drive.
The approaches described herein can be applied to improve emerging as well as existing CRISPR-based gene editing systems, such as those described in U.S. Patent No. 8,697,359, filed October 15, 2013, and U.S. Patent No. 9,260,723, filed June 30, 2014, each of which is incorporated by reference herein. The approaches described herein can be applied to improve emerging as well as existing CRISPR-based gene drive systems. Exemplary gene drives have been proposed and are described in detail in US 2016/0333376, filed June 3, 2016 (also WO 2015/105928); DiCarlo et al., Safeguarding CRISPR-Cas9 gene drives in yeast, Nat. Biotechnol. 33:1250-1255 (2015); Hammond et al., A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae, Nat. Biotechnol. 34:78-83 (2016); Gantz et al., Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi, Proc. Nat. Acad. Sci. 112(49):E6736-E6743 (2015); Champer et al., Novel CRISPR/Cas9 gene drive constructs reveal insights into mechanisms of resistance allele formation and drive efficiency in genetically diverse populations, PLoS Genet 13:e1006796 (2017); Drury et al., CRISPR/Cas9 gene drives in genetically variable and nonrandomly mating wild populations, Sci Adv 3:e1601910 (2017), each of which is incorporated by reference herein with respect to the described gene drive constructs.
CRISPR-based gene editing and gene drive systems according to the embodiments of the invention further comprise one or more components for modulating the activity of the system, and in fact, for impairing or inhibiting the effectiveness of the system. In one or more embodiments, CRISPR/Cas9 gene editing or gene drive activity is programmed through transcriptional regulation of the CRISPR enzyme, wherein titration of Cas9 protein level using an inducible regulatory element (e.g., promoter) causes a range of gene drive activities within a population in relation to the amount of induced transcript/protein of the nuclease. Thus, embodiments of the invention include CRISPR-based gene editing expression vectors comprising a nucleic acid encoding a Cas9 protein under the control of an inefficient, inactivated, or ineffective regulatory element to reduce or inhibit the expression of Cas9 protein in the system. Inducible regulatory elements include those that are activated by external stimuli (e.g., chemical exposure, temperature, etc.) or a specific developmental stage (or biological process) of the host. Various regulatory elements and inducible promoters are known in the art and can be tuned to the particular host/target species. Examples include metallothionein promoter, heat shock protein promoters, vitellogenin (Vg) promoter, ubiquitin C promoter, carboxypeptidase (Cp), TetOn promoter, U6 promoter, CBh promoter, promoters listed in the Examples, and the like. For example, for mosquitoes, preferred promoters include vitellogenin (fat body expression upon blood meal) and carboxypeptidase (midgut expression upon blood meal), which permit the system to specifically target females.
In one or more embodiments, CRISPR/Cas9 gene editing or gene drive activity is modulated through modifications to the single guide RNA, such as by changing its length or creating a sequence "mismatch" to the proposed target sequence, thereby reducing binding efficiency. In one or more embodiments, a 19 bp guide RNA with a single mismatch at the 5' end is demonstrated to reduce drive activity by about 50% when included in the CRISPR gene editing vector (haploid editing and diploid gene drive systems) in the exemplary yeast system.
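As a rough illustration of the guide modifications described above, the following Python sketch derives a shortened guide with a single mismatch from a 20-nt target sequence written in the same sense as the guide. Several choices here are assumptions made only for illustration and are not stated in the source: the guide is truncated from its 5' (PAM-distal) end, the base substitution table `SUBSTITUTE` is arbitrary, and the example target is hypothetical rather than SEQ ID NO: 9 or 10. The sketch makes no prediction about the resulting editing efficiency.

```python
# Arbitrary substitution table used to force a mismatch (assumption).
SUBSTITUTE = {"A": "C", "C": "A", "G": "T", "T": "G"}

def impaired_guide(protospacer, length=19, mismatch_index=0):
    """Return a truncated guide with one mismatch at mismatch_index.

    The PAM-proximal (3') end of the target is kept, and index 0 is the
    5' end of the truncated guide.
    """
    protospacer = protospacer.upper()
    if not 1 <= length <= len(protospacer):
        raise ValueError("length must be between 1 and the target length")
    if not 0 <= mismatch_index < length:
        raise ValueError("mismatch_index is outside the guide")
    bases = list(protospacer[-length:])
    bases[mismatch_index] = SUBSTITUTE[bases[mismatch_index]]
    return "".join(bases)

def count_mismatches(guide, protospacer):
    """Count mismatches with the guide aligned to the 3' end of the target."""
    aligned = protospacer.upper()[-len(guide):]
    return sum(1 for g, t in zip(guide.upper(), aligned) if g != t)

target = "ACGTACGTACGTACGTACGT"   # hypothetical 20-nt target
guide = impaired_guide(target)     # 19 nt, single 5' mismatch
print(guide, len(guide), count_mismatches(guide, target))
```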
In one or more embodiments, CRISPR/Cas9 drive activity is programmed through the inclusion of one or more nuclear export signal (NES) sequences in an expression vector of the CRISPR-Cas vector system. In one or more embodiments, only one NES sequence is included in the expression vector system. In one or more embodiments, the expression vector system comprises only two or three NES sequences. In one or more embodiments, the vector system may include one or more NES sequences under the control of an inducible regulatory element (e.g., promoter) to selectively express an NES-containing peptide that attaches and binds to Cas9 (to cause export from the nucleus) under designated conditions. Exemplary regulatory elements are described above, and can be tuned or adjusted to the particular host species. In one or more embodiments, the NES sequence can be used to reduce drive effectiveness to zero, and thus provide constructs that serve a primarily inhibitory role. Exemplary NES sequences that can be used in the expression vectors include those encoding leucine-rich peptides, such as LAKILGALDIN (SEQ ID NO: 4). NES peptides can be exemplified by the consensus sequence LXXXLXXLXLX (SEQ ID NO: 5), where L stands for a hydrophobic amino acid residue (typically leucine or isoleucine), and X is any other amino acid. Various NES sequences have been characterized in the art, with many being catalogued via NESBase, maintained by DTU Bioinformatics at the Technical University of Denmark (see la Cour et al., Protein Engineering, Design & Selection, vol. 17, no. 6, pp. 527-536 (2004), incorporated by reference herein). NES peptides are typically less than 15 residues in length.
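The consensus LXXXLXXLXLX can be expressed as a simple pattern search. The Python sketch below is illustrative only: the hydrophobic residue set `LIVMF` is an assumption (the text above states only that the position is typically leucine or isoleucine), X is treated as any residue, and a pattern hit does not establish that a peptide functions as an export signal.

```python
import re

HYDROPHOBIC = "[LIVMF]"   # assumed hydrophobic set for the "L" positions
ANY = "[A-Z]"             # "X": any residue (simplifying assumption)

# L XXX L XX L X L X, following the consensus given for SEQ ID NO: 5.
CORE = (HYDROPHOBIC + ANY * 3 + HYDROPHOBIC + ANY * 2 +
        HYDROPHOBIC + ANY + HYDROPHOBIC + ANY)
NES_LIKE = re.compile("(?=(" + CORE + "))")  # lookahead reports overlaps

def find_nes_like(protein):
    """Return (start, matched_segment) for each consensus-like stretch."""
    protein = protein.upper()
    return [(m.start(), m.group(1)) for m in NES_LIKE.finditer(protein)]

print(find_nes_like("LAKILGALDIN"))   # exemplary NES, SEQ ID NO: 4
print(find_nes_like("SRADPKKKRKV"))   # SV40 NLS, SEQ ID NO: 6
```

The exemplary NES of SEQ ID NO: 4 fits the pattern with isoleucine at the fourth hydrophobic position, whereas the SV40 NLS does not match.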
In one or more embodiments, one or more nuclear localization signal (NLS) sequences are included in the vector system. In one or more embodiments, only one NLS sequence is included in the expression vector system. In one or more embodiments, the expression vector system comprises only two NLS sequences, and in some cases only three NLS sequences. In some embodiments, activity is regulated through inclusion of an NLS sequence that has been mutated or selected for impaired activity. In other words, instead of (or in addition to) relying on export signals from NES in the expression vector system, nuclear localization is reduced or inhibited through use of an NLS sequence with reduced activity. Various NLS signals are known in the art, including signal sequences that encode short peptides of less than about 20 residues, including the SV40 NLS sequence SRADPKKKRKV (SEQ ID NO: 6), the nucleoplasmin NLS sequence KRPAATKKAGQAKKKK (SEQ ID NO: 7), and the like.
The NES/NLS signal mechanism is highly conserved across all eukaryotic species, and this approach has wide applicability to a variety of CRISPR expression systems for tunable gene editing in various organisms. In one or more embodiments, NES and NLS sequences are included in the vector system in a 1:1 ratio. In one or more embodiments, the expression vector system comprises a 2:1 ratio of NLS to NES. In one or more embodiments, the expression vector system comprises a 3:1 ratio of NLS to NES. Competition between the import (NLS) and export (NES) signals can also be used to titer the amount of Cas9 present within the nucleus of the cell. Limiting residence time of the expression vector in the cell (e.g., less NLS or more NES contribution) causes a tunable reduction in Cas9 editing and subsequent gene drive activity in the cell. NES and/or NLS sequences can be included in the same or on different expression vectors in the system. In one or more embodiments, the NES and/or NLS sequence is included at or near the N- and/or C-terminus of the CRISPR enzyme sequence in the expression vector, such that expression yields a fusion protein with one or more NLS and/or NES at or near (i.e., within about 10 residues of) the N-terminus and/or the C-terminus of the CRISPR enzyme. When more than one NLS and/or NES is present, each may be selected independently of the other, such that different NLS and/or NES sequences may be included in the same vector system.
In one or more embodiments, CRISPR/Cas9 drive activity is programmed through inclusion of a secondary CRISPR nuclease that inhibits gene drive effectiveness, possibly through competition with the functional Cas9 in the system. In one or more embodiments, a nucleic acid encoding a "dead" Cas9 (dCas9) is included in the system. In one or more embodiments, the nucleic acid encodes for a fusion protein, wherein the fusion protein comprises two Cas9 enzymes: one active Cas9 and one dead Cas9 (or vice versa). Other fusions include active Cas9-Cas9 fusion variants with identical proteins. In one or more embodiments, the two Cas9 enzymes are in tandem fusion separated by an artificial flexible linker sequence of GRRIPGLINGGSSGS (SEQ ID NO:8), although other arrangements can be used. Use of the terms "active" (or "functional") in relation to Cas9 refers to an enzymatically active (and in some cases "native") version of the enzyme that can both bind to the target sequence and induce the double-stranded break (DSB) fundamental to the CRISPR gene editing process. In contrast, a "dead" Cas9 is enzymatically impaired or inactive (such as through a mutation), provided that it can still compete with the active Cas9 for binding to the target DNA site(s). Exemplary mutations that reduce or eliminate nuclease activity in Cas9 include one or more mutations selected from D10A, H840A, and combinations thereof.
Advantageously, in this system both Cas9 versions can accept the same guide RNA fragment, such that they each have equal propensity to seek out, bind, and associate with the target sequence. However, dCas9 cannot induce a DSB, and thus just "sits" in position on the DNA and blocks active Cas9 from accessing (and inducing the cut at) the same chromosomal position. Dead Cas9 may originate from within the fusion (e.g., directly fused to the N- or C-terminus of active Cas9) or from a different tandem pair or expression vector (provided with an identical sgRNA), or be expressed under a different promoter. In other words, embodiments contemplated herein also include non-fused or independently expressed dCas9 from independent expression vectors and not part of a translational fusion. In any case, dCas9 competes with the active Cas9 in the expression system to reduce overall activity of the system, such that activity of the system is reduced to about 45% to about 70% effectiveness. In some cases, the "dead" Cas9 may simply be impaired and not necessarily inactive altogether, but have reduced activity compared to the active/native Cas9. This could be achieved, for example, by selecting a Cas9 from a different species (e.g., other than S. pyogenes) that has lower activity than the native Cas9, or through mutations in the enzyme or nucleic acid sequence. For example, orthologous Cas9 enzymes (from S. pyogenes, N. meningitidis, and S. thermophilus) can be assessed for their ability to function within a given editing system, and their editing efficiencies can be directly compared. In addition, as noted, fusion proteins of two active Cas9 enzymes can also be used to disrupt the wild-type efficiency of the system to a reduced level of about 80% to about 90%.
It will be appreciated that one or more of the above approaches can be used in combination to provide a wide range of gene drive efficiencies ranging from very low activity (near 0%) to wild-type levels (95%-99%), and levels in between (~1%-90%). "Drive activity," "effectiveness" or "efficiency" as used herein can be calculated by synthesizing drives or editing systems with selectable markers associated with the genetic modification and detecting the presence (or absence) of the selectable marker in the target population (of cells, individual subjects, etc.), which can be calculated to give a percentage based upon the total number of targets into which the gene editing or drive system was introduced. Further, it will be appreciated that one or more reporter genes can be included in the vector system. Components of the vector system can also be labeled with a detectable moiety or label if desired.
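The percentage calculation described above amounts to dividing the number of targets scored as modified (by presence or absence of the selectable marker, depending on the assay design) by the total number of targets that received the system. A minimal Python sketch follows; the function name and the colony counts are hypothetical and are not data from the Examples.

```python
def drive_activity_percent(modified, total):
    """Percent of targets scored as modified out of all targets assayed.

    modified: number of targets (e.g., colonies) scored as edited
    total:    number of targets into which the system was introduced
    """
    if total <= 0:
        raise ValueError("total must be a positive count")
    if not 0 <= modified <= total:
        raise ValueError("modified must be between 0 and total")
    return 100.0 * modified / total

# Hypothetical counts for illustration only.
print(drive_activity_percent(45, 100))   # an impaired system
print(drive_activity_percent(198, 200))  # a near wild-type system
```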
In one or more embodiments, the approaches of the invention also include the use of a new class of protein (termed "anti-CRISPR" peptides) for their ability to directly inhibit the enzymatic action of Cas9 expressed by the vector system. These peptides originally evolved within bacteriophage as a response to the CRISPR system, and act as a DNA mimic to associate with the nuclease where the PAM sequence normally resides.
A2 and/or A4 anti-CRISPR genes can be synthesized, cloned, and expressed in the host cell in combination with the CRISPR editing system. The Examples use budding yeast as a safe and fully-contained model system to pilot the use of bacteriophage proteins AcrIIA2 (SEQ ID NO:2) and AcrIIA4 (SEQ ID NO:3) to inhibit the action of Cas9 in vivo within a gene drive. Mutant variants of these proteins can also be used, such as those that reduce binding of the protein to Cas9. Exemplary mutants that can be used to titrate inhibitory activity on Cas9 are listed in Table 6 below. Thus, in one or more embodiments, expression vector systems in the invention can further comprise sequences encoding one or more of AcrIIA2 and/or AcrIIA4 or mutants thereof for co-expression in the host cell with the CRISPR-Cas9 gene drive system. It will be appreciated that A2 or A4 could be introduced exogenously or included in the expression vector, plasmid, or gene package. Further, either of A2 or A4 can be operably linked to one or more regulatory elements to control expression of the proteins. For example, either of A2 or A4 can be operably linked to an inducible promoter to selectively turn on expression of the anti-CRISPR proteins to cause inhibition (or titration) of the gene drive under designated conditions. The sequences encoding each of A2 and/or A4 can be codon optimized for expression in the particular eukaryotic cell.
The foregoing approaches can be used in methods of altering expression of one or more gene products from a target nucleotide sequence in a diploid eukaryotic cell, in vitro, in vivo, or ex vivo. The methods involve introducing into the eukaryotic cell one or more of the modified vector systems of the invention. The vector systems drive expression of the CRISPR components in the cell, including sequence-specific binding, cleavage and repair, such that expression of the gene products is altered. However, because the expression vector systems are impaired, it will be understood that within a host population, the effectiveness of the alteration will typically be less than 95%, in some cases less than about 90%, and in some cases about 50%. The modified or impaired "activity" of the modified gene drive systems according to the various embodiments of the invention can be measured in a variety of ways as known in the art. For example, expression, activity, or level of a reporter gene, or expression or activity of a gene encoded by the genetic element can be measured.
The modified CRISPR-Cas vector systems or their components can be introduced into the cell as a nucleic acid construct encoding the CRISPR enzyme, sgRNA, anti-CRISPR peptides, or other components for expression in the cell. Methods of the invention may involve activating regulatory elements, such as inducible promoters, to induce expression of the construct. The components can also be introduced as preassembled proteins or RNP complexes. In either case, expression of the construct or activity of the preassembled protein is impaired in the modified vector systems of the invention, such that it has reduced effectiveness or efficiency in altering expression of the gene product(s) in the eukaryotic cell.
Additional advantages of the various embodiments of the invention will be apparent to those skilled in the art upon review of the disclosure herein and the working examples below. It will be appreciated that the various embodiments described herein are not necessarily mutually exclusive unless otherwise indicated herein. For example, a feature described or depicted in one embodiment may also be included in other embodiments, but is not necessarily included. Thus, the present invention encompasses a variety of combinations and/or integrations of the specific embodiments described herein.
As used herein, the phrase "and/or," when used in a list of two or more items, means that any one of the listed items can be employed by itself or any combination of two or more of the listed items can be employed. For example, if a composition is described as containing or excluding components A, B, and/or C, the composition can contain or exclude A alone; B alone; C alone; A and B in combination; A and C in combination; B and C in combination; or A, B, and C in combination.
In general, and throughout this specification, the term "vector" refers to a nucleic acid molecule capable of transporting nucleic acids into a host cell to thereby produce transcripts, proteins, or peptides encoded by the nucleic acids in the cell. The term includes recombinant DNA molecules containing a desired coding sequence(s) and appropriate nucleic acid sequences (e.g., promoters) necessary for the expression of the operably linked coding sequence in a particular host organism. One type of vector is a "plasmid," which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. It will be appreciated that the design of the expression vector(s) will depend on various factors, including the organism species, host cell type, and type of editing desired.
The term "gene drive" refers to a nucleic acid construct that is capable of copying itself into the genome of the cell into which it is introduced. The present application is particularly concerned with endonuclease gene drives in diploid (or polyploid) organisms, wherein the gene drive induces a double-stranded break in the chromosome and induces the cell to copy the gene drive sequence to repair the damaged sequence via homologous recombination using the gene drive construct as the template. As a result, this process will re-occur in each organism that inherits one copy of the modification (and one wild-type copy). In this manner, gene drives are self-propagating, in comparison to the self-limited nature of traditional gene editing techniques.
The phrase "operably linked" refers to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule (regulatory element) is capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule. The term also refers to the linkage of amino acid sequences in such a manner that a functional protein is produced. "Regulatory element" refers to promoters, enhancers, and other expression control signals that direct constitutive expression of a nucleotide sequence. Such elements may be host-specific or may drive expression broadly across various host cell types. Such elements may also be inducible and direct expression only under certain conditions (e.g., active or "on" only under an external stimulus, tissue-specific, or developmentally determined parameter).
A "host cell" or "target cell" as used herein, refers to the eukaryotic cell into which the CRISPR vector system has been introduced, including the progeny of the original transformed cell. A "host" or "subject" as used herein refers to an individual organism targeted for altered gene expression via CRISPR-based gene editing. Likewise, a "host" or "target" population refers to a plurality of individual host organisms which may be targeted for altered gene expression through CRISPR-based gene editing, such as a population of mosquitoes or other pests.
The present description also uses numerical ranges to quantify certain parameters relating to various embodiments of the invention. It should be understood that when numerical ranges are provided, such ranges are to be construed as providing literal support for claim limitations that only recite the lower value of the range as well as claim limitations that only recite the upper value of the range. For example, a disclosed numerical range of about 10 to about 100 provides literal support for a claim reciting "greater than about 10" (with no upper bounds) and a claim reciting "less than about 100" (with no lower bounds).
EXAMPLES
The following examples set forth methods in accordance with the invention. It is to be understood, however, that these examples are provided by way of illustration and nothing therein should be taken as a limitation upon the overall scope of the invention.
EXAMPLE 1
Tuning CRISPR gene drives in yeast
MATERIALS and METHODS
Yeast strains and plasmids
Saccharomyces cerevisiae strains used in this study are in Table 1. Standard molecular and cellular biology procedures were used to manipulate all plasmids and yeast strains. The Cas9 gene was synthesized de novo with a yeast codon bias (Genscript, Piscataway, NJ). The second Cas9* gene used for the tandem fusions was also synthesized after manual manipulation of each codon to an alternate codon (primarily within the wobble position). Enzymatically dead Cas9 (D10A H840A) was generated by a modified PCR mutagenesis protocol on the pUC57-based plasmid(s) harboring the Cas9 gene using a high-fidelity DNA polymerase (KOD Hot Start, EMD Millipore). The general strategy for integration of Cas9 (or the target gene cassette) into the yeast genome was as follows. First, a CEN-based (pRS316) plasmid was generated by in vivo plasmid assembly including the GAL1/10 promoter (814 bp), Cas9 open reading frame (ORF), a C-terminal NLS signal (SEQ ID NO: 6), the ADH1 terminator (238 bp), and the MX-based kanamycin resistance cassette. Second, the assembled Cas9 gene cassette was PCR amplified and a second round of in vivo ligation was performed using a second vector (pGF-IVL974) to insert 992 bp of HIS3 5' UTR, 993 bp of 3' UTR, and two (u2) sites upstream of the GAL1/10 promoter and downstream of the MX terminator. Third, the entire ensemble was amplified in two fragments (generating 120 bp of overlapping sequence within the Cas9 ORF), treated with DpnI enzyme, and transformed into BY4741 yeast using a modified lithium acetate protocol for integration at the native HIS3 locus (his3Δ1). Colonies resistant to G418 sulfate (and lacking the selectable marker for the yeast vector used as a PCR template) were tested by diagnostic PCR and Sanger DNA sequencing (Genscript) to generate GFY-2383.
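The recoding of the second Cas9* gene (changing each codon to an alternate codon, primarily at the wobble position, while preserving the encoded protein) can be sketched in Python as follows. This is a simplified illustration and not the procedure actually used, which the text describes as manual: the sketch uses the standard genetic code, always picks the first available synonymous codon, and ignores yeast codon usage. The function names and the short example sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons(cds):
    """Split an in-frame coding sequence into codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]

def translate(cds):
    return "".join(CODON_TABLE[c] for c in codons(cds))

def alternate_codon(codon):
    """Pick a synonymous codon, preferring a wobble-position-only change."""
    aa = CODON_TABLE[codon]
    synonyms = [c for c in CODON_TABLE if CODON_TABLE[c] == aa and c != codon]
    wobble_only = [c for c in synonyms if c[:2] == codon[:2]]
    if wobble_only:
        return wobble_only[0]
    # Fall back to any synonym; Met (ATG) and Trp (TGG) have none.
    return synonyms[0] if synonyms else codon

def recode(cds):
    return "".join(alternate_codon(c) for c in codons(cds))

def fraction_codons_changed(original, recoded):
    pairs = list(zip(codons(original), codons(recoded)))
    return sum(1 for a, b in pairs if a != b) / len(pairs)

original = "ATGGCTAAATGGTAA"   # hypothetical ORF: M A K W stop
recoded = recode(original)
print(recoded, translate(recoded), fraction_codons_changed(original, recoded))
```

Because methionine and tryptophan each have a single codon, not every codon can be changed, which is consistent with a recoded gene in which most, but not all, codons differ from the original.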
Table 1 : Yeast strains used in this study.
Figure imgf000024_0001
GFY- BY4741; This study 2754 his3A::(u2)::prGAL::SpCas9::NLS::eGFP::NLS::NES::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2755 his3A::(u2)::prGAL::SpCas9::eGFP::ADHl(t)::KanR::(u2)::HIS3(t)
GFY- BY4741; This study 2756 his3A::(u2)::prGAL::SpCas9::eGFP::NLS::ADHl(t)::KanR::(u2)::HIS3(t)
GFY- BY4741; This study 3101 his3A::(u2)::prGAL::SpCas9::eGFP::NES::ADHl(t)::KanR::(u2)::HIS3(t)
GFY- BY4741; This study 2758 his3A::(u2)::prGAL::SpCas9::eGFP::NLS::NES::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2759 his3A::(u2)::prGAL::NLS::SpCas9::NLS::eGFP::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2760 his3A::(u2)::prGAL::NLS::SpCas9::NLS::eGFP::NLS::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2761 his3A::(u2)::prGAL::NLS::SpCas9::NLS::eGFP::NES::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2762 his3A::(u2)::prGAL::NLS::SpCas9::NLS::eGFP::NLS::NES::ADHl(t)::KanR::
(u2)::HIS3(t)
GFY- BY4741; his3A::(u2)::prGAL::NLS::SpCas9::eGFP::ADHl(t)::KanR::(u2):: This study 2763 HIS3(t)
GFY- BY4741; This study 2764 his3A::(u2)::prGAL::NLS::SpCas9::eGFP::NLS::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2765 his3A::(u2)::prGAL::NLS::SpCas9::eGFP::NES::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; This study 2766 his3A::(u2)::prGAL::NLS::SpCas9::eGFP::NLS::NES::ADHl(t)::KanR::(u2)::
HIS3(t)
GFY- BY4741; his3A::(u2)::prGAL::SpCas9(D10A H840A)::NLS::ADHl(t):: This study 32506 KanR::(u2)::HIS3(t)
GFY- BY4741; his3A::(u2)::prGAL::SpCas9(D10A This study 30997 H840A)::Link::SpCas9*::NLS::
ADHl(t)::KanR::(u2)::HIS3(t)
GFY- BY4741; his3A::(u2)::prGAL::SpCas9::Link::SpCas9(D10A This study 3100 H840A)*::NLS::
ADHl(t)::KanR::(u2)::HIS3(t)
GFY- BY4741; his3A::(u2)::prGAL::SpCas9::Link::SpCas9*::NLS:;ADHl(t):: This study 3336 KanR::(u2)::HIS3(t)
GFY- BY4741; This study 3264s his3A::(u2)::prGAL::SpCas9::NLS::eGFP::ADHl(t)::KanR::(u2)::HIS3(t) 18
NUP188::mCherry::ADHl(t)::SpHIS5
GFY- BY4741; This study 3265 his3A::(u2)::prGAL::SpCas9::NLS::eGFP::NLS::ADHl(t)::KanR::(u2):: 19
HIS3(t) NUP188::mCherry::ADHl(t)::SpHIS5
GFY- BY4741; This study 3266 his3A::(u2)::prGAL::SpCas9::NLS::eGFP::NES::ADHl(t)::KanR::(u2):: 20
HIS3(t) NUP188::mCherry::ADHl(t)::SpHIS5
GFY- BY4741; This study 3267 his3A::(u2)::prGAL::SpCas9::NLS::eGFP::NLS::NES::ADHl(t)::KanR::(u2):: 21
HIS3(t) NUP188::mCherry::ADHl(t)::SpHIS5
GFY- BY4741; This study 3270 his3A::(u2)::prGAL::SpCas9::eGFP::NES::ADHl(t)::KanR::(u2)::HIS3(t) 24
NUP188::mCherry::ADHl(t)::SpHIS5 GFY- BY4741; This study 3271 his3A::(u2)::prGAL::SpCas9::eGFP::NLS::NES::ADHl(t)::KanR::(u2):: 25
HIS3(t) NUP188::mCherry::ADHl(t)::SpHIS5
GFY- BY4741; This study 3273 his3A::(u2)::prGAL::NLS::SpCas9::NLS::eGFP::NLS::ADHl(t)::KanR::(u2):: 27
HIS3(t) NUP188::mCherry::ADHl(t)::SpHIS5
GFY- BY4741; This study 3275 his3A::(u2)::prGAL::NLS::SpCas9::NLS::eGFP::NLS::NES::ADHl(t)::KanR:: 29
(u2)::HIS3(t) NUP188::mCherry::ADHl (t)::SpHIS5
Brachmann et al., 1998 Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14: 115-132, incorporated by reference herein.
¹The "unique Cas9 target site" (u1) contains the 20 bp target site (SEQ ID NO:9) with adjacent PAM sequence (GGG) on the 3' end. This (u1) sequence was inserted directly flanking the HygR MX-based cassette and integrated at the native HIS3 locus in BY4741 WT yeast by amplifying the entire locus from pGF-IVL1143.
²The HygR cassette was replaced with the KanR cassette. Strain GFY-2588 is otherwise isogenic to GFY-2353.
³The Cas9-expressing gene drive strain is flanked by (u2) sites at the HIS3 locus of the sequence containing the 20 bp target site (SEQ ID NO:10) and adjacent PAM sequence (GGG) on the 3' end.
⁴The gene drive target locus contains 448 bp of 5' UTR of the CDC12 gene, 486 bp of 3' UTR of the SHS1 gene, and 992 bp of 5' UTR of the CCW12 gene. The S. pombe HIS5 gene is the functional equivalent to S. cerevisiae HIS3.
⁵Strains GFY-2751 - GFY-2756, GFY-2758 - GFY-2766, and GFY-3101 were constructed by first generating plasmids containing the Cas9-expression cassettes from pGF-IVL1162 through pGF-IVL1177 flanked by (u2) sites and HIS3 5' and 3' UTR (plasmids pGF-IVL1318 - pGF-IVL1333, respectively) using in vivo plasmid assembly. Next, the entire cassette was PCR amplified in two fragments using overlapping primers within the coding sequence of the Cas9 gene, transformed into BY4741 WT yeast, and integrated at the HIS3 locus. Each strain was confirmed by DNA sequencing of PCR amplified fragments spanning the entire expression cassette and flanking UTR.
⁶The catalytically dead mutations (D10A and H840A) were introduced by a modified Quikchange protocol in the pUC57 vector prior to assembly by in vivo ligation in yeast. The dCas9 expression cassette was first assembled into pGF-IVL1180 followed by a second round of assembly to include flanking (u2) sites and HIS3 5' and 3' UTR. The entire cassette was PCR amplified and integrated at the HIS3 locus.
⁷GFY-3099, GFY-3100, and GFY-3336 were constructed by the following methodology. First, two parental plasmids were constructed by in vivo assembly containing either prGAL-SpCas9(D10A H840A)-SpeI-ADH1(t)-KanR or prGAL-SpCas9-SpeI-ADH1(t)-KanR (pGF-IVL1312 and pGF-IVL1313, respectively). A 15-residue flexible linker sequence (SEQ ID NO:8) was also inserted in-frame at the C-terminus of Cas9. Second, a second SpCas9 gene (designated SpCas9*) was synthesized de novo with over 90% of all codons changed to an alternate sequence (if possible), primarily within the wobble position (to provide maximum mismatch between the two identical copies of SpCas9 and prevent homologous recombination between the tandem genes). Third, digestion with SpeI and a second round of in vivo ligation including the amplified SpCas9* (either a WT or catalytically dead mutant version) created a tandem fusion between dCas9-Cas9* (pGF-IVL1345) and Cas9-dCas9* (pGF-IVL1346B). Attempts to perform a third round of in vivo ligation (to include the flanking (u2) and HIS3 UTR sequences) were unsuccessful. Therefore, the fourth step included direct integration at the HIS3 locus with 4 overlapping PCR fragments (treated with DpnI) from pGF-IVL1396 and pGF-1345 (to construct GFY-3099) or pGF-IVL1192 and pGF-IVL1346B (to construct GFY-3100) in a single transformation event. For GFY-3336, similar PCR fragments were generated from the same set of parental vectors harboring WT Cas9 (native or wobble variants). Confirmation of these strains included multiple diagnostic PCRs and DNA sequencing of the entire locus.
⁸Strains GFY-2751 - GFY-2756, GFY-2758 - GFY-2766, and GFY-3101 were transformed with an amplified PCR fragment of the C-terminus of NUP188 fused to mCherry-ADH1(t)-SpHIS5 from a chromosomal DNA preparation from GFY-1517.
DNA plasmids used in this study are found in Table 2. In vivo plasmid assembly was used to construct all Cas9 and gene target vectors including those used for integration into the genome. Plasmids expressing the sgRNA cassette were created from previous vectors. Briefly, the SNR52 promoter (269 bp), crRNA sequence (16-22 bp), tracrRNA sequence (79 bp), and SUP4 terminator (20 bp) were synthesized de novo (Genscript), subcloned into the pRS423, pRS425, or pRS426 (high-copy) plasmids, and mutagenized by PCR to generate all sgRNA variants. All vectors were confirmed by Sanger DNA sequencing (Genscript).
Table 2: Plasmids used in this study.
pGF-V1218 | pRS425; prSNR52::Sp-sgRNA(u1-18WT)::SUP4(t) | This study
pGF-V1219 | pRS425; prSNR52::Sp-sgRNA(u1-19WT)::SUP4(t) | This study
pGF-V1220 | pRS425; prSNR52::Sp-sgRNA(u1-20WT)::SUP4(t) | This study
pGF-V1625 | pRS426; prSNR52::Sp-sgRNA(u1-20WT)::SUP4(t) | This study
pGF-V1221 | pRS425; prSNR52::Sp-sgRNA(u1-21WT)::SUP4(t) | This study
pGF-V1222 | pRS425; prSNR52::Sp-sgRNA(u1-22WT)::SUP4(t) | This study
pGF-V1223⁵ | pRS425; prSNR52::Sp-sgRNA(u1-17-1mut)::SUP4(t) | This study
pGF-V1224 | pRS425; prSNR52::Sp-sgRNA(u1-18-1mut)::SUP4(t) | This study
pGF-V1225 | pRS425; prSNR52::Sp-sgRNA(u1-19-1mut)::SUP4(t) [5' G→A] | This study
pGF-V1797 | pRS425; prSNR52::Sp-sgRNA(u1-19-1mut)::SUP4(t) [5' G→C] | This study
pGF-V1799 | pRS425; prSNR52::Sp-sgRNA(u1-19-1mut)::SUP4(t) [5' G→T] | This study
pGF-V1226 | pRS425; prSNR52::Sp-sgRNA(u1-20-1mut)::SUP4(t) | This study
pGF-V1227 | pRS425; prSNR52::Sp-sgRNA(u1-21-1mut)::SUP4(t) | This study
pGF-V1228 | pRS425; prSNR52::Sp-sgRNA(u1-22-1mut)::SUP4(t) | This study
pGF-V1229⁵ | pRS425; prSNR52::Sp-sgRNA(u1-17-2mut)::SUP4(t) | This study
pGF-V1230 | pRS425; prSNR52::Sp-sgRNA(u1-18-2mut)::SUP4(t) | This study
pGF-V1231 | pRS425; prSNR52::Sp-sgRNA(u1-19-2mut)::SUP4(t) | This study
pGF-V1232 | pRS425; prSNR52::Sp-sgRNA(u1-20-2mut)::SUP4(t) | This study
pGF-V1233 | pRS425; prSNR52::Sp-sgRNA(u1-21-2mut)::SUP4(t) | This study
pGF-V1234 | pRS425; prSNR52::Sp-sgRNA(u1-22-2mut)::SUP4(t) | This study
pGF-V1235⁵ | pRS425; prSNR52::Sp-sgRNA(u1-17-3mut)::SUP4(t) | This study
pGF-V1236 | pRS425; prSNR52::Sp-sgRNA(u1-18-3mut)::SUP4(t) | This study
pGF-V1237 | pRS425; prSNR52::Sp-sgRNA(u1-19-3mut)::SUP4(t) | This study
pGF-V1238 | pRS425; prSNR52::Sp-sgRNA(u1-20-3mut)::SUP4(t) | This study
pGF-V1239 | pRS425; prSNR52::Sp-sgRNA(u1-21-3mut)::SUP4(t) | This study
pGF-V1240 | pRS425; prSNR52::Sp-sgRNA(u1-22-3mut)::SUP4(t) | This study
pGF-V1241⁶ | pRS425; prSNR52::Sp-sgRNA(u1-17-1Del)::SUP4(t) | This study
pGF-V1242 | pRS425; prSNR52::Sp-sgRNA(u1-18-1Del)::SUP4(t) | This study
pGF-V1243 | pRS425; prSNR52::Sp-sgRNA(u1-19-1Del)::SUP4(t) | This study
pGF-V1244 | pRS425; prSNR52::Sp-sgRNA(u1-20-1Del)::SUP4(t) | This study
pGF-V1245 | pRS425; prSNR52::Sp-sgRNA(u1-21-1Del)::SUP4(t) | This study
pGF-V1246 | pRS425; prSNR52::Sp-sgRNA(u1-22-1Del)::SUP4(t) | This study
pGF-IVL1162⁷ | pRS316; prGAL::SpCas9::NLS::eGFP::ADH1(t)::KanR | This study
pGF-IVL1163 | pRS316; prGAL::SpCas9::NLS::eGFP::NLS::ADH1(t)::KanR | This study
pGF-IVL1164⁸ | pRS316; prGAL::SpCas9::NLS::eGFP::NES::ADH1(t)::KanR | This study
pGF-IVL1165⁹ | pRS316; prGAL::SpCas9::NLS::eGFP::NLS::NES::ADH1(t)::KanR | This study
pGF-IVL1166 | pRS316; prGAL::SpCas9::eGFP::ADH1(t)::KanR | This study
pGF-IVL1167 | pRS316; prGAL::SpCas9::eGFP::NLS::ADH1(t)::KanR | This study
pGF-IVL1168 | pRS316; prGAL::SpCas9::eGFP::NES::ADH1(t)::KanR | This study
pGF-IVL1169 | pRS316; prGAL::SpCas9::eGFP::NLS::NES::ADH1(t)::KanR | This study
pGF-IVL1170 | pRS316; prGAL::NLS::SpCas9::NLS::eGFP::ADH1(t)::KanR | This study
pGF-IVL1171 | pRS316; prGAL::NLS::SpCas9::NLS::eGFP::NLS::ADH1(t)::KanR | This study
pGF-IVL1172 | pRS316; prGAL::NLS::SpCas9::NLS::eGFP::NES::ADH1(t)::KanR | This study
pGF-IVL1173 | pRS316; prGAL::NLS::SpCas9::NLS::eGFP::NLS::NES::ADH1(t)::KanR | This study
pGF-IVL1174 | pRS316; prGAL::NLS::SpCas9::eGFP::ADH1(t)::KanR | This study
pGF-IVL1175 | pRS316; prGAL::NLS::SpCas9::eGFP::NLS::ADH1(t)::KanR | This study
pGF-IVL1176 | pRS316; prGAL::NLS::SpCas9::eGFP::NES::ADH1(t)::KanR | This study
pGF-IVL1177 | pRS316; prGAL::NLS::SpCas9::eGFP::NLS::NES::ADH1(t)::KanR | This study
Sikorski & Hieter, 1989 A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics 122: 19-27; Christianson et al., 1992 Multifunctional yeast high-copy-number shuttle vectors. Gene 110: 119-122; Finnigan & Thorner, 2016 mCAL: A New Approach for Versatile Multiplex Action of Cas9 Using One sgRNA and Loci Flanked by a Programmed Target Sequence. G3 (Bethesda) 6: 2147-2156, each of which is incorporated by reference herein.
¹The S. pyogenes Cas9 gene was synthesized de novo (Genscript) with a yeast codon bias and assembled by in vivo ligation under control of the GAL1/10 promoter (814 bp 5' UTR) and a C-terminal NLS (SEQ ID NO:6) signal sequence.
²The sgRNA cassette was synthesized de novo. It contains 269 bp of the SNR52 promoter and 20 bp of the 3' UTR of SUP4. Various methodologies (e.g. restriction digests and in vitro ligation) were used to sub-clone the sgRNA cassette from the original pUC57 vector to TOPO II (pCR™-Blunt II-TOPO®, KanR-marked, Invitrogen) and to either pRS315 or pRS425 (or other pRS-family vectors). The u2 guide sequence SEQ ID NO:10 was used. For the sgRNA(u2) plasmid cloned into pRS423 (pGF-V798), the backbone sequence contains 317 bp of 5' UTR and 201 bp of 3' UTR flanking genomic sequence to the HIS3 locus.
³The 20 bp (u1) guide sequence used was SEQ ID NO:9. For guide RNAs of 21 or 22 bp, the sequence included additional GA residues inserted at the 5' end.
⁴For sgRNAs of lengths less than 20 bp, the 3'-most segment of the target site was used.
⁵The mismatch(es) occur at the 5' end of the sgRNA guide sequence. G/C was (randomly) changed to A/T and vice versa.
⁶The penultimate bp at the 5' end of the sgRNA sequence was deleted.
⁷The NLS signal sequences used in pGF-IVL1162 - pGF-IVL1177 are identical at the amino acid level, yet have codons altered at the DNA sequence level to aid in plasmid assembly. The central (between Cas9 and eGFP) NLS signal is immediately followed by a short flexible linker (SEQ ID NO:11). The S. pyogenes Cas9 gene has a yeast codon bias.
⁸The NES signal (SEQ ID NO:4) immediately follows the eGFP sequence. This sequence was modeled after the prototypical cyclic AMP-dependent protein kinase inhibitor NES.
⁹The C-terminal NES signal is separated from the penultimate NLS signal by two glycine residues.
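The guide-variant rules in the notes above (3'-most segment for shorter guides, GA residues added at the 5' end for longer guides, 5' mismatches, and deletion of the penultimate 5' base) can be sketched in a few lines of Python. This is only an illustrative sketch: `GUIDE_20` is a hypothetical placeholder and not SEQ ID NO:9, the fixed substitution map stands in for the random G/C to A/T changes described in the notes, and it is an assumption that a 21 nt guide carries a single added "A" while a 22 nt guide carries "GA".

```python
# Hypothetical 20-nt placeholder; the actual (u1) guide is SEQ ID NO:9 (not reproduced here).
GUIDE_20 = "GATTACAGATTACAGATTAC"

# Assumption: a fixed G<->A / C<->T swap replaces the random choice described in the notes.
SWAP = {"G": "A", "A": "G", "C": "T", "T": "C"}


def resize_guide(guide: str, length: int) -> str:
    """Return a guide of 16-22 nt derived from a 20-nt guide."""
    if not 16 <= length <= 22:
        raise ValueError("guide length must be between 16 and 22 nt")
    if length <= len(guide):
        # Shorter guides keep the 3'-most (PAM-proximal) segment.
        return guide[-length:]
    extra = length - len(guide)
    # Assumption: 21 nt adds "A", 22 nt adds "GA" at the 5' end.
    return "GA"[-extra:] + guide


def mismatch_5prime(guide: str, n: int) -> str:
    """Substitute the first n bases at the 5' end."""
    return "".join(SWAP[base] for base in guide[:n]) + guide[n:]


def delete_penultimate_5prime(guide: str) -> str:
    """Delete the second base counted from the 5' end."""
    return guide[0] + guide[2:]


if __name__ == "__main__":
    for n in range(16, 23):
        print(n, resize_guide(GUIDE_20, n))
    print("1mut", mismatch_5prime(resize_guide(GUIDE_20, 19), 1))
    print("1Del", delete_penultimate_5prime(GUIDE_20))
```

Whether a "1Del" guide is counted by its length before or after the deletion is not stated in the notes; the sketch simply removes one base from whatever guide it is given.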
Culture Conditions
Yeast were propagated in solid or liquid medium including YPD (2% peptone, 1% yeast extract, 2% dextrose) or synthetic media containing a nitrogen base, ammonium sulfate, and all necessary amino acid supplements. Pre-induction medium included a 2% raffinose and 0.2% sucrose mixture. For experiments requiring induction of the GAL1/10 promoter, liquid media containing 2% galactose was used. All sugars were filter-sterilized rather than autoclaved. The nomenclature for synthetic media is as follows: "S" refers to "synthetic complete" (e.g. SD-URA, synthetic complete plus dextrose minus uracil).
CRISPR/Cas9-Based Editing
The mCAL system was used for all Cas9 editing in haploid yeast and in diploid gene drive-containing strains. Briefly, this system harnesses two artificial, programmed Cas9 target sequences (u1 and u2) that contain a maximum mismatch to the S. cerevisiae genome. Two identical (u1) sites flank the gene "target" locus whereas two identical (u2) sites flank the Cas9 gene cassette itself. Both the (u1) and (u2) sites within the genome also contain the 5'-NGG-3' PAM required for S. pyogenes Cas9. All sgRNAs were designed to target either the (u1) or (u2) artificial sites. For plasmid-driven Cas9, the URA3-based vector (e.g. pGF-IVL1116) was transformed into the appropriate yeast strain prior to editing for several reasons: (i) growth on dextrose repressed Cas9 expression, (ii) rapid counter-selection of the vector could be achieved on media containing 5-FOA, and (iii) consistent propagation of the plasmid could be maintained prior to introduction of the sgRNA-expressing plasmid.
Activation of Cas9 and gene editing in haploid yeast was performed as follows. Briefly, strains harboring the Cas9 vector were cultured overnight in a raffinose/sucrose mixture at 30°C to saturation. Next, yeast were back-diluted into a YP + galactose mixture to an OD600 of approximately 0.35 OD/mL and grown for an additional 4.5 hr. Cells were harvested and transformed with 1000 ng of sgRNA-containing plasmid, heat-shocked for 0.75 hr at 42°C, incubated in YP + galactose overnight (approximately 16 hr) without shaking, spread onto SD-URA-LEU plates, and incubated for 3-4 days prior to imaging. The number of colonies was quantified using a single-blind protocol (researchers counting were unaware of the genotype of each plate) and a sectoring method (when appropriate). Several random fractions (1/4 or 1/8, etc.) were sampled and averaged to estimate the total number of colonies. For plates containing fewer than 500 colonies, the entire plate was quantified. To assess the drug resistance of individual colonies, 50-200 colonies were randomly selected and transferred to an identical plate type as a small patch (to increase the surface area) and incubated for 1-2 additional days. Next, plates (each containing 50-100 colony patches) were replica-plated using sterile velvet cloths to rich medium containing dextrose and hygromycin (300 μg/mL) or G418 (200 μg/mL). Assessment of the HIS3 locus was performed by re-selecting clonal isolates, preparing purified chromosomal DNA, and PCR amplification and DNA sequencing (when appropriate).
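The sectoring estimate described above amounts to averaging the counts from equally sized plate sectors and scaling by the sector fraction. A minimal sketch follows, assuming every sampled sector on a plate covers the same fraction; the function name and the example counts are hypothetical and are not data from this study.

```python
from fractions import Fraction


def estimate_total_colonies(sector_counts, sector_fraction):
    """Estimate colonies on a whole plate from counts in equally sized sectors.

    sector_counts: colony counts from each sampled sector.
    sector_fraction: fraction of the plate covered by one sector (e.g. 1/8).
    """
    if not sector_counts:
        raise ValueError("at least one sector count is required")
    if not 0 < sector_fraction <= 1:
        raise ValueError("sector_fraction must be in (0, 1]")
    mean_per_sector = sum(sector_counts) / len(sector_counts)
    return mean_per_sector / float(sector_fraction)


if __name__ == "__main__":
    # Hypothetical counts from three 1/8 sectors of a single plate.
    print(estimate_total_colonies([120, 130, 110], Fraction(1, 8)))
```

Passing a fraction of 1 with a single count reduces to the whole-plate count used for plates with fewer colonies.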
Gene Drives and Containment
Cas9 gene drives were constructed and manipulated using the following protocol. First, the Cas9-expression cassette was integrated at the HIS3 locus and maintained on dextrose (to repress the GAL1/10 promoter). Second, yeast were transformed with the sgRNA(u1) plasmid; since the target (u1) sequence does not exist within S. cerevisiae, no editing can occur even if Cas9 is present and primed with the guide RNA. Third, haploid yeast expressing the sgRNA were mated to the gene "target" containing strain (harboring (u1) sites) of the opposite mating type on rich media containing dextrose for 24 hr at 30°C. Fourth, yeast were transferred to SD-LEU-HIS plates using sterile velvet cloths to select for diploid yeast and incubated for 24 hr at 30°C. The diploid selection step was repeated a second (or sometimes third) time on the same media type. The choice of the S. pombe HIS5 gene within the target genome to select for diploids ensures that any cell in which (rare) promiscuous Cas9 has been prematurely activated will be eliminated on SD-LEU-HIS medium. Fifth, diploid yeast were cultured overnight to saturation in synthetic medium containing raffinose and sucrose and lacking leucine. Sixth, cells were back-diluted to an OD600 of approximately 0.35 OD/mL in YP medium containing galactose for 0 to 24 hr at 30°C depending on the time course. Following induction of Cas9, yeast were harvested, washed with water once, and diluted to a density of approximately 1,000 cells/mL. Between 250 and 1000 cells were spread onto SD-LEU plates and incubated for 2-3 days. Yeast were transferred to a fresh SD-LEU plate and SD-HIS plate using sterile velvet cloths and incubated for 24 hr prior to imaging. The number of colonies sensitive on SD-HIS medium was quantified (over total number of colonies on SD-LEU) to obtain the percentage of active drives within a given genotype.
Analysis of individual colonies was done by re-selecting yeast on SD-LEU medium and testing individual clonal isolates for growth on various media types, ploidy status (by mating to MATa and MATα control strains), and chromosomal DNA isolation for diagnostic PCRs and DNA sequencing. For gene drives harboring two plasmids (URA3/LEU2-marked), the procedure included diploid selection on SD-URA-LEU-HIS plates, and final testing on SD-URA-LEU to maintain the presence of both plasmids.
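The drive-activity readout described in this protocol is a simple ratio: colonies that fail to grow on SD-HIS divided by the total number of colonies on the matching SD-LEU plate. A minimal sketch is given below, together with the plating volume implied by a suspension of roughly 1,000 cells/mL. The function names and the example numbers are hypothetical illustrations and are not values reported here.

```python
def percent_active_drive(total_on_sd_leu: int, growing_on_sd_his: int) -> float:
    """Percentage of colonies that lost the HIS marker (sensitive on SD-HIS)."""
    if total_on_sd_leu <= 0:
        raise ValueError("total colony count must be positive")
    if not 0 <= growing_on_sd_his <= total_on_sd_leu:
        raise ValueError("SD-HIS count must lie between 0 and the SD-LEU total")
    sensitive = total_on_sd_leu - growing_on_sd_his
    return 100.0 * sensitive / total_on_sd_leu


def plating_volume_ml(target_cells: float, cells_per_ml: float = 1000.0) -> float:
    """Volume of suspension to spread in order to plate a target number of cells."""
    if cells_per_ml <= 0:
        raise ValueError("cell density must be positive")
    return target_cells / cells_per_ml


if __name__ == "__main__":
    # Hypothetical plate: 400 colonies on SD-LEU, 20 of which also grow on SD-HIS.
    print(percent_active_drive(400, 20))
    # Plating 250 to 1000 cells from a suspension at about 1,000 cells/mL.
    print(plating_volume_ml(250), plating_volume_ml(1000))
```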
Fluorescence Microscopy
Yeast were grown overnight in a pre-induction culture containing raffinose and sucrose to saturation and back-diluted into YP + galactose for 4.5 hr. Cells were harvested, washed with water, prepared on a microscope slide with a coverslip, and imaged within 15 min. Yeast were imaged on an inverted Leica DMI6500 fluorescence microscope (Leica Microsystems Inc., Buffalo Grove, IL) with a 100x objective lens, and fluorescence filters (Semrock, GFP-4050B-LDKM-ZERO and mCherry-C-LDMK-ZERO). A Leica DFC340 FX camera, Leica Microsystems Application Suite AF software, and ImageJ (National Institutes of Health) were used to process all images. All images were obtained using identical exposure times and were processed and rescaled together. The "merged" images were generated in ImageJ and do not contain any additional processing from the GFP/mCherry images. The yeast cell periphery (and the yeast vacuole) was determined using the DIC image. Representative cells were chosen for each image.
RESULTS
An artificial system for Cas9-based gene editing in budding yeast
The goal of this study was to identify aspects of CRISPR gene editing that have the potential to modify, in a predictable manner, the activity and effectiveness of gene drives. The molecular mechanisms identified have direct relevance and application to CRISPR drive systems in other organisms for various reasons. Given the recent interest and potential applications of nuclease-based gene drives across numerous industries, described here is a safe, programmable system in S. cerevisiae to investigate control of Cas9 editing in vivo. The methodology includes several unique aspects that (i) remove concerns of off-target effects, (ii) minimize the number of sgRNA constructs required, (iii) allow for an unbiased assessment of editing (no selective pressure for or against an editing event), (iv) allow for full excision of a target marker (rather than rely on disruption of a coding sequence by a single targeted event), and (v) provide a potent genetic safeguard for use within a gene drive system (Fig. 2A). As shown in Fig. 2A, the design for a yeast system for analysis of CRISPR editing includes (i) an inducible S. pyogenes Cas9 expressed from a URA3-based plasmid, (ii) a sgRNA expression cassette on a high-copy LEU2-based plasmid, and (iii) a programmable gene "target" (consisting of a drug resistance marker cassette) at a safe-harbor locus (HIS3) flanked by two "unique" DNA sequences (u1) that do not exist within the S. cerevisiae genome. Induction of Cas9 allows targeting and double-stranded break formation at the identical (u1) sequences. In the absence of exogenous DNA (e.g. amplified PCR product) to be used for HDR, NHEJ of the exposed chromosomal ends causes full excision of the selectable marker.
However, given the unique arrangement of the identical (u1) sites, NHEJ in the absence of any insertion/deletion mutation at the Cas9 cut site (left) recreates another WT (u1) site and subsequent re-editing of the same target sequence until Cas9 expression is shut off or a mutation is positioned within the (u1) site (right).
Thus, the work involves development of a "target" strain of yeast in which the target locus is flanked by two unique artificial sequences, termed (u1). Since these programmed (u1) sequences do not exist within the yeast genome (and provide a maximum mismatch to the closest native sequences), this reduces any possibility of off-target effects, or biased results based on similar gene target(s) or repetitive sequences. The design also includes two (u1) sites (see WO 2017/189336, filed April 20, 2017, incorporated by reference herein) flanking a selectable marker (conferring hygromycin or G418 resistance). This is in stark contrast to numerous other Cas9 editing assays which target the coding sequence of a given gene (or marker) and depend upon disruption of the final protein product to provide a detectable growth phenotype. Several issues can arise from "traditional" targeting with a single sgRNA construct, as opposed to our system of a "clean" excision of the entire physical gene. First, since disruption of the gene (e.g. CAN1) product allows for growth (on medium containing canavanine), there would be selective pressure to perform either NHEJ or HDR and allow for cell survival. Second, while traditional "red/white" colony color screening has been a very useful genetic tool for screening (adenine biosynthesis), the presence of the red pigment is slightly toxic to cells. Third, editing events that do not include an insertion or deletion (indel) of any sort would be virtually undetectable as there would be no change in sequence following repair via NHEJ. Fourth, use of two distinct sgRNAs to target two positions at a genomic locus of interest would require different target sequences; this may allow for preference or bias to one cut site.
Our novel system controls for any difference in target sequence since all targets are identical (and allow for up to 22 bp targeting with a 3 bp PAM); this is the first setup to provide two identical Cas9 cut sites flanking a selectable marker. The arrangement imposes no native selective pressure for or against said marker, since selection for the target is not assayed until after the editing event; this is an important technical difference between our system and traditional Cas9 editing approaches. Successful targeting of Cas9 to both sites, introduction of two double-stranded breaks (DSB), and subsequent repair of the broken DNA by NHEJ would recreate a single (u1) site from fusion of the two flanking "half sites" along with full excision of the marker gene (Fig. 2A). In this way, we would be able to detect Cas9 editing of a single sequence in the absence of any generated indel by NHEJ.
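The "half site" fusion described above can be illustrated in silico. The sketch below cuts a toy locus 3 bp upstream of the PAM at each of two identical target sites and joins the outer ends without any indel, which removes the marker and leaves a single intact site. The sequences are hypothetical placeholders (not SEQ ID NO:9 or the real HIS3 flanks), both sites are assumed to lie in the same orientation, and the function name is illustrative only.

```python
# Hypothetical 20-nt placeholder standing in for the (u1) target; not SEQ ID NO:9.
U1 = "GATTACAGATTACAGATTAC"
PAM = "GGG"


def excise_between_sites(locus: str, site: str = U1, pam: str = PAM) -> str:
    """Model two Cas9 cuts followed by precise (indel-free) end joining.

    Cas9 is modeled as cutting 3 bp upstream of the PAM within each site.
    Assumes exactly two sites, both in the same orientation.
    """
    target = site + pam
    first = locus.find(target)
    last = locus.rfind(target)
    if first == -1 or first == last:
        raise ValueError("locus must contain two target sites")
    cut_offset = len(site) - 3  # position of the cut relative to the site start
    left_end = locus[: first + cut_offset]
    right_end = locus[last + cut_offset:]
    return left_end + right_end


if __name__ == "__main__":
    marker = "TTTTCCCCTTTT"  # stand-in for the drug resistance cassette
    locus = "ACGT" + U1 + PAM + marker + U1 + PAM + "TGCA"
    repaired = excise_between_sites(locus)
    print(repaired)
    print("sites remaining:", repaired.count(U1 + PAM))
```

Because the repaired product again contains one intact site, the same model makes it easy to see why cutting can repeat until an indel disrupts the site or Cas9 expression stops.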
Yeast containing the (u1) flanked target gene and an inducible S. pyogenes Cas9 (GAL1/10 promoter) were transformed with either control (empty) plasmids or sgRNA(u1)-expressing plasmids (low or high copy) in the presence of galactose (Fig. 2B). GFY-2353 yeast already harboring Cas9-NLS on a vector (pGF-IVL1116) or an empty vector control (pRS316) were induced in medium containing galactose, transformed with the sgRNA(u1)-expression cassette on either a CEN-based (pGF-V1215) or 2μ-based (pGF-V1220) plasmid, and plated onto SD-URA-LEU media. Importantly, the subsequent selection step following introduction of the guide RNA plasmid and activation of Cas9 was for the presence of both vectors. In the absence of any donor DNA fragment to repair the edited HIS3 locus, yeast activate the NHEJ pathway and repair the cleaved DNA together (sans our selectable marker). In the presence of both the sgRNA(u1) guide and active Cas9, only a small number of surviving colonies were obtained; this is consistent with previous studies in budding yeast using S. pyogenes Cas9. Following introduction of a DSB, the NHEJ system allows for precise repair of the severed (u1) sites. However, NHEJ with no alteration of sequence would result in a functional target (u1) DNA site and a second round of Cas9-dependent cleavage. Consecutive rounds of editing followed by repair occur during treatment with galactose and, given the excess of Cas9 available per cell, result in few cells that are viable once plated onto selective media (Fig. 2B). In this way, general cell viability serves as a potent selection mechanism for assaying Cas9 editing in yeast. When individual surviving clones were tested for drug resistance following editing, the majority had fully excised the gene cassette as assayed by diagnostic PCRs of the HIS3 locus (Fig. 2C). As illustrated in Fig. 2C, GFY-2588 yeast containing pGF-IVL1342 were transformed with sgRNA(u1) plasmid (pGF-V1216) and selected on SD-URA-LEU medium.
The isolated chromosomal DNA of individual clonal (surviving) isolates was assayed by PCR using DNA oligonucleotides (F1/R1) to the flanking HIS3 UTR. The expected product sizes of the amplified PCR fragments are approximately 379 bp (depending on the type of insertion/deletion(s) at the cut site, if any), or 1839 bp in the absence of any editing. Colonies were tested for resistance on medium containing G418.
Finally, DNA sequencing of a large pool of isolated clones yielded a diverse assortment of insertions and deletions at the "repaired" (u1) site following editing by Cas9 (Fig. 2D). Clonal isolates from Cas9 editing (a dozen independent experiments) using the high copy sgRNA(u1) plasmid from (B) and that had also excised the selection cassette were analyzed by DNA sequencing at the HIS3 locus. The number of each genotype obtained is listed. Many clones (27 separate isolates) did not contain any alteration at the expected Cas9 cut site (+3 position upstream of the 5' end of the PAM). However, this was not due to a lack of editing because the entire drug selection marker was not present and had been fully excised. Interestingly, plating of yeast onto galactose-containing medium (rather than dextrose) following transformation of the sgRNA plasmid resulted in a ten-fold decrease in the number of surviving colonies (Fig. 3A & B). GFY-2588 yeast containing pGF-IVL1342 (inducible Cas9) were transformed with sgRNA(u1) plasmid (pGF-V1220) and incubated overnight (16 hr) in YPGal as previously described. For 6 independent trials, yeast were plated onto SD-URA-LEU media for 4 days. For 29 additional independent experiments, yeast were plated onto SGAL-URA-LEU. The total number of colonies across all treatments was quantified (top). Selection on galactose-containing plates (>4 days) resulted in nearly a 10-fold reduction of viable colonies following Cas9-based editing. All surviving colonies (14 for dextrose treatment, 1-14, and 8 for galactose treatment, 15-22) were selected a second time (identical plate condition) as single clonal isolates and tested for G418-resistance (bottom). As illustrated in Fig. 3B, chromosomal DNA was prepared, PCR amplified at the HIS3 locus, and assayed by Sanger DNA sequencing.
For dextrose-treated isolates (14), a variety of sequences were obtained including the original unmodified target cassette and dual (u1) sites, or various insertions/deletions. However, for galactose-treated isolates (8), only the unmodified target cassette (with KanR) was obtained.
Moreover, none of the surviving clones had excised the selectable marker or included any indel at the (u1) site(s), suggesting that prolonged Cas9 activation increases the stringency of selection. Thus, we have provided the first evidence to suggest that, in fact, estimates of editing are limited to detection of a generated indel at the proposed editing site. Our arrangement clearly demonstrates the ability of NHEJ to repair cleaved DNA with no alteration of sequence, at least in budding yeast. Additionally, our assay for Cas9 editing does not impose selection for any altered genetic marker until after the editing event, prevents off-target effects, and can also achieve multiplexing using identical sequences pre-programmed within a genome and a single guide RNA fragment.
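The diagnostic PCR readout above distinguishes two outcomes by amplicon size: roughly 379 bp when the marker has been excised and 1839 bp when the locus is unedited, a difference of 1460 bp. A minimal sketch of that classification follows. The tolerance window is a hypothetical parameter introduced only to allow for small indels and gel-sizing error; it is not a value taken from this document.

```python
EXCISED_BP = 379    # expected product after excision of the marker
UNEDITED_BP = 1839  # expected product when no editing has occurred


def classify_amplicon(size_bp: int, tolerance_bp: int = 50) -> str:
    """Classify a diagnostic PCR product by its approximate size.

    tolerance_bp is an assumed allowance for indels and sizing error.
    """
    if abs(size_bp - EXCISED_BP) <= tolerance_bp:
        return "excised"
    if abs(size_bp - UNEDITED_BP) <= tolerance_bp:
        return "unedited"
    return "ambiguous"


if __name__ == "__main__":
    for size in (379, 372, 1839, 900):
        print(size, classify_amplicon(size))
    print("size difference:", UNEDITED_BP - EXCISED_BP)
```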
Exploring control of CRISPR editing in haploid cells
Since the long-term goal is the exploration of gene drive control and its possible application in plants, fungi, microbes, and metazoans, focus here is limited to the conserved components of the CRISPR system, namely, the Cas9 nuclease and the sgRNA. We tested a set of sgRNAs variants within the yeast editing system (Fig. 4A & 4B). Of note, we have maintained an identical target DNA sequence (namely, the (ul) artificial target sites) in order to directly compare modifications to each guide sequence. We first altered the guide length from 16 to 22 base pairs and found that editing in our system required a crRNA length between 19 to 22 bp (Fig. 4A). Briefly, GFY-2353 yeast containing the Cas9-NLS vector (pGF-IVLl 116) were transformed with sgRNA(ul) cassettes (plasmids pGF-V1216 - pGF-V1222) with guide sequences of varying length along with an empty pRS425 vector control. The number of colonies was quantified {left) for three independent trials. Error, SD. Representative plates are shown {right). A random sampling of colonies was chosen across all three trials following editing on SD-URA-LEU plates and tested for growth on rich medium containing hygromycin. The percentage of isolates displaying sensitivity to the drug were quantified. For conditions (e.g. sgRNA(ul) 20 bp length) where a small number of colonies were viable, all surviving isolates (typically 5-20 total) were tested on hygromycin; for other combinations, between 150-200 colonies were sampled.
Next, for guide lengths of 17 to 22 bp, we varied the number of 5' mismatches (one, two, or three consecutive mutations— A/T to G/C and vice versa, chosen at random) as well as deletion of the penultimate base at the 5' end (Fig. 4B). Cas9 editing was repeated using sgRNA(ul) cassettes containing varying mismatches at the 5' end of the guide sequence. A single mismatch at the 5' end (pGF-V1223 - pGF-V1228), two mismatches (pGF-V1229 - pGF- V1234), three mismatches (pGF-V1235 - pGF-V1240), or a deletion of one base at the penultimate -2 position from the 5' end (pGF-V1241 - pGF-V1246) were assayed for both total surviving colonies and the percentage of isolates with an excised marker cassette at the target locus {top). Select comparisons with the sgRNA(ul) 19 bp guide with 1 mismatch data were performed using an unpaired t-test {bottom).
Testing of isolated clones for excision of the selectable marker largely mirrored the cell viability results— guide RNAs allowing efficient Cas9 editing resulted in loss of the marker whereas a lack of or reduction in editing was paired with drug resistance. Our results indicate that sgRNA length and identity had either no effect on editing or resulted in a complete loss of editing with only a single exception— a guide length of 19 bp with a single mismatch at the 5' end (SEQ ID NO: 19, with G- A 5' substitution). Numerous independent trials found an intermediate level of editing for this sgRNA to the dual (ul) sites. 85% of yeast clones had excised the drug resistance cassette, yet editing with this sgRNA always resulted in more surviving colonies. We suspect that, given the unique arrangement of our (ul) flanked system, repair of the broken chromosome ends results in presentation of a (third) wild-type (ul) target site that can also be edited by Cas9 (Fig. 4A). A slight reduction in Cas9 targeting may impede this editing and "re-editing" cycle to allow an increase in surviving yeast clones.
However, altering the 5' mismatch from G- A to either G- C or G- T resulted in a near loss of editing and a phenocopy of the 18 bp guide length sgRNA (Fig. 5). GFY-2588 haploid yeast harboring pGF-IVL1342 (inducible Cas9) were grown overnight to saturation in a raffinose/sucrose mixture and back-diluted into YPGal medium for 4.5 hrs in triplicate. Cell were harvested, washed, and transformed with 1000 ng of sgRNA(ul) plasmid harboring a mutation at the 5' end (19 bp guide sequence; pGF-V1225, pGF-V1797, or pGF-V1799, triplicate) or control plasmids (pGF-V1220, 20 bp WT sgRNA or pRS425, empty vector, duplicate). Following recovery in fresh YPGal overnight, cells were plated onto SD-LEU-URA and incubated for 3 days prior to imaging and quantification. A sampling of colonies was tested on media containing G418 to assess the percentage of cells that had excised the KanR marker at the HIS3 locus. Fig. 5(A) shows the results. Next, GFY-2383 yeast were transformed with high- copy sgRNA(ul)-expressing plasmids with a 19 base pair guide sequence containing a single substitution at the 5' end as in (A). Yeast were mated to the gene drive target strains of the opposite mating type (GFY-3206 and GFY-3207) in quadruplicate and diploids were selected on SD-LEU-HIS media (three consecutive selection steps). Strains were pre-induced overnight (raffinose/sucrose mixture) and grown in rich medium containing galactose for 5 hr prior to plating onto SD-LEU plates (500-1000 cells per plate). Colonies were replica-plated to both SD- LEU and SD-HIS after two days of growth and grown for an additional 24 hr before imaging. Representative plates are shown. The percentage of colonies with active drives is illustrated for sample plates (red text) in Fig. 5(B).
These results suggest that editing using the 19 bp guide length with a 5' mismatch is likely sequence and/or context dependent. We also observed a similar trend for other sgRNA variants that included mismatches. For instance, a 20 bp guide with two mismatches at the 5' end provided a slight reduction in the total number of colonies (compared to the 18 bp guide length with no mismatch) and a small fraction of colonies that had properly excised the drug marker. While our analysis of sgRNAs is by no means comprehensive, our results demonstrate that all but one guide sequence (32/33 tested) either allow for optimal editing or do not allow for editing at all, with potentially only a few rare exceptions (such as the 19 bp guide with one 5' mismatch) that may result in an intermediate level of editing.
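The guide variants discussed above (shorter guide lengths and a single substitution of the 5' base) can be sketched in a few lines of Python. This is only an illustration: the 20-nt sequence below is a made-up placeholder, not the (u1) guide from the sequence listing, and the assumption that shorter guides are produced by removing bases from the 5' end is ours rather than something stated in this passage.

```python
# Placeholder 20-nt guide for illustration only (NOT the real (u1) guide).
# It is chosen so that the 19-nt truncation begins with G.
EXAMPLE_GUIDE_20 = "TGACCTGAAGTTCATCTGCA"


def truncate_guide(guide: str, length: int) -> str:
    """Keep the 3'-most `length` bases (assumes truncation from the 5' end)."""
    if not 1 <= length <= len(guide):
        raise ValueError("length must be between 1 and the guide length")
    return guide[len(guide) - length:]


def substitute_5prime(guide: str, new_base: str) -> str:
    """Replace the first (5'-most) base of the guide with `new_base`."""
    if new_base not in "ACGT" or len(new_base) != 1:
        raise ValueError("new_base must be one of A, C, G, T")
    return new_base + guide[1:]


def count_mismatches(variant: str, reference: str) -> int:
    """Count mismatches after aligning both sequences at their 3' ends."""
    aligned_reference = reference[len(reference) - len(variant):]
    return sum(a != b for a, b in zip(variant, aligned_reference))


guide_19 = truncate_guide(EXAMPLE_GUIDE_20, 19)
# The three 5' substitutions compared in the text: G->A, G->C, G->T.
variants_19 = {base: substitute_5prime(guide_19, base) for base in "ACT"}
```

Each entry of `variants_19` differs from the 19-nt guide by exactly one base at the 5' end, which is the class of variant that gave intermediate (G→A) or near-zero (G→C, G→T) editing in the experiments described.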
Next, we investigated whether nucleocytoplasmic shuttling might be used to control Cas9 editing. We hypothesized that nuclear localization and residence time might provide a conserved means to control and titrate editing when all other factors (Cas9 protein, sgRNA, target DNA) are held constant. We constructed a C-terminal Cas9-eGFP fusion and confirmed that it was fully functional and competent for editing compared to Cas9 alone. Yeast (GFY-2353) containing the HygR cassette flanked by the artificial (u1) sites were transformed with plasmids containing Cas9-NLS (pGF-IVL1116), dCas9-NLS (pGF-IVL1180), Cas9-eGFP-NLS (pGF-IVL1119), or dCas9-eGFP-NLS (pGF-IVL1183). Cultures were grown overnight in pre-induction media (raffinose/sucrose), back-diluted into YP + galactose for 4.5 hr, transformed with either the sgRNA(u1) plasmid (pGF-V1220) or an empty pRS425 vector, recovered overnight in galactose, and plated to SD-URA-LEU medium. The total number of colonies was quantified after 3 days at 30°C. Error, SD. Addition of the D10A and H840A mutations to Cas9 (catalytically dead Cas9) prevented any editing. Fusion of eGFP between Cas9 and a C-terminal NLS did not affect the ability of Cas9 to edit the yeast genome, as shown in Fig. 6. This provided three locations (N-terminus, C-terminus, and between the Cas9 and eGFP fusion) at which to include an NLS sequence.
We tested 8 variants of Cas9 containing 0, 1, 2, or 3 NLS signals in all possible combinations and their ability to edit in vivo (Fig. 7A & 7B). 16 variations of Cas9-eGFP were constructed that included combinations of NLS and/or NES signals at various protein positions. Either 0, 1, 2, or 3 (identical) NLS signals were included along with either 0 or 1 NES signals; the positions chosen included the N-terminus, between Cas9 and eGFP, or at the C-terminus (left). GFY-2353 yeast were transformed with each Cas9 fusion (pGF-IVL1162 - pGF-IVL1177) along with Cas9-NLS (pGF-IVL1116) as a positive control. Editing was performed by induction of Cas9 expression followed by transformation of equimolar amounts of sgRNA(u1) (20 bp WT guide) plasmid in triplicate. The strain expressing Cas9-NLS served as a control (transformed with either sgRNA(u1) or an empty pRS425 vector). The total number of surviving colonies (SD-URA-LEU medium) was quantified. Error, SD. Following editing, randomly selected isolates from all trials (n = 100-200) were tested for growth on rich media containing hygromycin. For combinations where only a few surviving colonies existed, all possible isolates were tested for hygromycin sensitivity.
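The combinatorial design above (an NLS present or absent at each of three positions, with or without one C-terminal NES) can be enumerated as a minimal Python sketch. The numbering by NLS count follows the comparisons made in the text (fusion 1 with no NLS, fusions 2-4 with one, 5-7 with two, 8 with three, and 9-16 as the NES-containing counterparts). The ordering within each group is an assumption chosen so that fusion 12 carries a C-terminal NLS next to the NES, and the names `POSITIONS`, `Fusion`, and `enumerate_fusions` are illustrative only.

```python
from itertools import product
from typing import NamedTuple, Tuple

# The three candidate NLS positions named in the text.
POSITIONS = ("N-terminus", "between Cas9 and eGFP", "C-terminus")


class Fusion(NamedTuple):
    number: int                    # fusion number, 1-16
    nls: Tuple[bool, bool, bool]   # NLS present at each entry of POSITIONS
    nes: bool                      # single C-terminal NES present


def enumerate_fusions():
    """List all 2**3 NLS patterns, without and then with a C-terminal NES."""
    # Sort by number of NLS signals; within a group, N-terminal placements
    # come first (this within-group order is an assumption).
    patterns = sorted(
        product((True, False), repeat=len(POSITIONS)),
        key=lambda p: (sum(p), tuple(not present for present in p)),
    )
    fusions = []
    for nes in (False, True):
        for pattern in patterns:
            fusions.append(Fusion(len(fusions) + 1, pattern, nes))
    return fusions


fusions = enumerate_fusions()
```

Enumerating the design this way makes the pairing used in the comparisons explicit: fusion `n + 8` is always fusion `n` with an added NES.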
A Cas9-eGFP (lacking any added NLS) variant was still able to edit, albeit at a lower level compared to all other Cas9-eGFP variants harboring at least one NLS sequence. We suspect that Cas9 itself may (i) harbor a cryptic NLS signal(s) consisting of a cluster of positively-charged residues and/or (ii) achieve a low level of diffusion into the nucleus, despite its large molecular weight. We did not observe any significant differences in this assay for all Cas9 fusions containing only 1-3 NLS signals (1 NLS was sufficient to promote maximum editing), although others have reported increases in editing following the addition of numerous NLS sequences in other cell systems.
Additionally, we directly fused a C-terminal NES signal sequence to each of the 8 Cas9-NLS variants to determine if the interplay between import and export shuttling would affect editing (Fig. 7A & 7B). Our results demonstrate that the direct competition between nuclear export and import can titrate the level of gene editing in otherwise isogenic strains. Direct fusion of an NES signal to the Cas9-eGFP variant greatly reduced editing, yet did not fully eliminate nuclear access (compare fusion 1 to 9) because nearly 50% of surviving clones had properly excised the target drug resistance marker (Fig. 7A). Furthermore, this supported the observation that our Cas9-eGFP could access the nucleus in the absence of any added NLS signal. Direct competition of one NLS versus one NES signal had the strongest reduction in gene editing (compare fusions 2/3/4 to 10/11/12). Altering the competition to two NLSs versus one NES shifted the level of editing (compare fusions 5/6/7 to 13/14/15). Finally, a Cas9 variant harboring three NLS signals and one C-terminal NES displayed wild-type levels of editing (compare fusions 2/3/4 to 16). Live-cell images of strains harboring 1, 2, or 3 NLS signals were compared to those of the same constructs also containing an NES (Fig. 7C). Six Cas9-eGFP fusions were integrated into the yeast genome at the HIS3 locus in a strain expressing an endogenously tagged Nup188-mCherry to mark the nuclear periphery (strains GFY-3264 - GFY-3267, GFY-3273, and GFY-3275). Cultures were induced in galactose for 4.5 hr prior to imaging by fluorescence microscopy. Scale bar, 3 μm. White dotted lines, cell periphery. White triangles, yeast vacuole. Strain numbers (right) refer to the Cas9 fusions in (A) for clarity. The nuclear periphery was marked with Nup188-mCherry; steady-state levels of Cas9-eGFP were found within the nucleus for fusions containing only NLS signals (Fig. 7C, left) whereas the presence of the NES signal caused spatial exclusion from the nucleus (Fig. 7C, right).
These data demonstrate that, given sub-optimal nuclear localization (via the presence of an appended NES signal), there is a definite contribution of having more than one NLS sequence present. In our yeast system, given (i) sufficient GAL1/10-driven expression of Cas9 and (ii) an extended recovery phase overnight in medium containing galactose, the SV40 NLS sequence was sufficient to promote maximal editing (Fig. 7A). Therefore, in other cell systems where the SV40 NLS may represent a non-optimal signal sequence, or where a lower level of Cas9 protein is present, it appears the addition of extra nuclear localization signals can improve import and subsequent editing. Finally, we examined whether there would be an additive effect between sgRNA identity (Fig. 4A & 4B) and nuclear shuttling (Fig. 7A & 7B) by testing each of the Cas9-eGFP fusions with three different sgRNAs (20 bp WT, 19 bp WT, and 19 bp with one 5' G→A mismatch) (Fig. 8A). GFY-2353 yeast containing 16 Cas9-eGFP fusions with NLS/NES combinations (pGF-IVL1162 - pGF-IVL1177) from Fig. 7A were transformed with sgRNA(u1)-expressing plasmids (pGF-V1219, pGF-V1220, and pGF-V1225) or empty pRS425 and the total number of colonies quantified. For each Cas9-eGFP fusion, all four plasmids were transformed in triplicate. The editing efficiency is displayed as a percentage by the following calculation: (i) the total number of colonies (per sample) was first divided by the total number of colonies obtained for the empty vector control followed by (ii) 100% minus the calculated percentage from (i). Error, SD from (i). By comparing the number of yeast colonies for each of the guide RNAs (editing) to the number of colonies for an empty sgRNA vector (control) transformation, we determined a standardized "editing percentage." (For combinations of Cas9 and sgRNA that resulted in 0 colonies, the editing percentage would be 100%. Conversely, a combination with 1,000 colonies where the empty vector control also yielded 1,000 colonies would calculate as 0% editing.) As expected, the 8 Cas9 fusions lacking any NES signal provided nearly 100% editing. Use of the sgRNA with a 19 bp guide and single 5' mismatch reduced the editing by approximately 15 to 25% (Fig. 8A). Testing of the Cas9 fusions in the presence of a C-terminal NES signal displayed the same pattern for both 20 bp (WT) and 19 bp (WT) sgRNAs: a reduction in editing based on the number of competing NLS signals present. Use of the 19 bp guide RNA with a single mismatch caused a significant reduction (15% to 75%) in editing that was also correlated to the number of NLS signals; more NLS sequences buffered against the effects of the less-effective guide RNA. Additionally, we observed the Cas9-eGFP-NLS-NES construct (fusion 12) deviated from the other two single NLS versus single NES variants (fusions 10 and 11). In our experiments, the NLS-NES tag (fusion 12) nearly phenocopied the NES tag alone (fusion 9). Given the immediate proximity of the two signals (separated by only two residues), we suspect that physical access to the penultimate NLS might be restricted by binding and competition by nuclear export machinery (and possibly vice versa as well). We demonstrate using these 48 combinations of Cas9 fusions and sgRNAs that a wide range of editing efficiencies can be achieved, with the majority between 75-100% effectiveness (Fig. 8B). Moreover, our data illustrate there is a statistically significant added alteration in editing when using the 19 bp guide RNA with a single 5' mismatch across 15 out of 16 of our Cas9 fusions, with the only exception being Cas9-eGFP with a triple NLS signal (Fig. 8C).
Four independent mechanisms to titrate gene drive activity
Based upon our design for haploid gene editing, we constructed a safe and programmable gene drive system in budding yeast (Fig. 9A). Our design of a programmable gene drive included (i) an integrated copy of S. pyogenes Cas9 (asterisk denotes use of various Cas9 fusions in an otherwise identical construct) under the inducible GAL1/10 promoter at the HIS3 locus in MATa cells, (ii) a Kanamycin-resistance gene cassette, (iii) flanking unique sites (u2) (see WO 2017/189336, filed April 20, 2017, incorporated by reference herein) surrounding the entire gene drive system to be used as a genetic failsafe (see Fig. 16A), and (iv) an artificial gene "target" containing a different selectable marker (S. pombe HIS5) and flanked by (u1) artificial Cas9 sites at the HIS3 locus in a strain of the opposite mating type (MATα).
As described, our drive strain included Cas9 under the inducible GAL1/10 promoter, a drug resistance cassette (KanR), and flanking artificial (u2) sites. For biosecurity reasons, we have kept the sgRNA-expression cassette separate from the physical chromosomal gene drive and, instead, maintained it on a high-copy plasmid. We generated an artificial gene "target" strain in yeast of the opposite mating type containing flanking (u1) sites at the HIS3 locus including (i) a target gene to be excised and (ii) a distinct selection cassette (S. pombe HIS5) under control of the constitutive CCW12 promoter. Importantly, we included unique terminator sequences (ADH1 and SHS1 3' UTR) for both the drive and target systems: the presence of any identical sequences between the two homologous chromosomes can allow for inappropriate crossover within the drive itself (our unpublished results).
The gene drive strain was transformed with the sgRNA(u1) plasmid (20 bp guide WT), mated to the target strain, and diploids were selected while maintaining growth on dextrose to repress Cas9 expression. Cultures were shifted to galactose for a time course of 0 to 12 hr and plated onto SD-LEU medium. Additionally, activity (or lack thereof) of the drive was not necessary for cell survival; in this way, we provided an unbiased assessment of the proportion of cells within a given sampling that were able to (i) express Cas9, (ii) bind sgRNA, (iii) target the dual (u1) artificial sequence(s), and (iv) undergo homologous recombination to repair the DSB and copy the entirety of the drive to the opposite chromosome to replace the target gene.
As in our experimental design for assaying Cas9 activity in haploids (Fig. 9A), there is no selective pressure of any kind during the editing and repair events. Once single colonies were sufficiently grown, the entire sampling was transferred to both SD-LEU and SD-HIS plates where we assessed the status of the HIS3 locus gene drive and/or target. 100% of colonies maintained G418 resistance regardless of whether an active drive was induced (our unpublished results). However, as cells were grown in galactose for increasing amounts of time, causing higher expression of Cas9, a larger proportion of the population was sensitive on the SD-HIS condition (Fig. 9B & 9C). Activation and testing of all gene drives was performed as follows. First, the Cas9-containing strain (shown, GFY-2383) was transformed with the sgRNA(u1) plasmid (pGF-IVL1220) or an empty vector (pRS425) control and maintained on dextrose. Second, the gene drive strain (MATa) harboring the sgRNA(u1) plasmid was mated to the target strain (MATα; GFY-3206 or GFY-3207) on rich medium for 24 hr at 30°C. Third, diploid yeast were selected twice on SD-LEU-HIS medium (24 hr incubation at 30°C). Fourth, diploids were cultured overnight in S-LEU+Raffinose/Sucrose liquid medium. Fifth, strains were back-diluted to an OD600 of approximately 0.35 OD/mL in YP+Galactose and grown at 30°C for various amounts of time. Sixth, yeast were harvested by a brief centrifugation, washed with water, diluted to approximately 1000 cells/mL, and 0.5 mL was plated onto SD-LEU medium and incubated at 30°C for two days. Finally, yeast were transferred by replica-plating to SD-LEU and SD-HIS plates and incubated for 24 additional hr before imaging. Representative plates are shown for the GFY-3206 cross.
Fig. 9C shows the quantification of the percentage of colonies displaying an active gene drive (assayed by sensitivity on SD-HIS medium). Error, SD. Statistically significant comparisons are denoted using an unpaired t-test. N.S., not significant. The value for 0 hr is 0% drive activity, not 50%. Experimental runs with an empty plasmid (pRS425) were also performed and displayed a value of zero drive activity for all time points.
After 4 hr post galactose shift, 99% of colonies had lost the S. pombe HIS5 marker within the target locus. Clonal isolates were randomly selected from SD-LEU plates and retested on G418 and SD-HIS media. Multiple crosses were used to determine ploidy status. Diagnostic PCRs were performed on isolated diploid chromosomal DNA to assess the HIS3 locus at 0 hr and 12 hr post galactose shift. A random sampling of clonal isolates (all tested and confirmed as diploids) from each time point were chosen from SD-LEU plates and the HIS3 locus was interrogated by multiple diagnostic PCRs (Fig. 9D, Fig. 10).
In Fig. 10, GFY-2583 yeast harboring the high-copy sgRNA(u1) plasmid, pGF-V1220, were mated with the gene drive target strains, GFY-3206 and GFY-3207, and diploids were selected on SD-LEU-HIS medium. Following an overnight culture in pre-induction medium (raffinose/sucrose), strains were cultured in rich media containing galactose for 0, 1, 2, 4, or 8 hr before being plated onto SD-LEU (roughly 500-1000 cells per plate). Colonies were replica-plated to SD-LEU and SD-HIS medium and incubated for 24 hr prior to imaging (top left). (B) Individual clonal isolates were obtained from the SD-LEU condition for each time point. A summary of the total number of isolates tested (including from Fig. 9) is provided. (C) Clonal isolates were tested for (i) ploidy status, (ii) G418 resistance, and (iii) growth on SD-HIS. Chromosomal DNA was isolated and subjected to four independent diagnostic PCRs to assess the status of the HIS3 locus following action of the gene drive at each condition. The Cas9 gene drive was marked with the KanR cassette (G418 resistance) whereas the target locus included the S. pombe HIS5 gene. PCRs A-D are identical to Fig. 9 (A and B assay the presence of the Cas9 gene drive whereas C and D are specific to the target locus). Including the data presented in Fig. 9B-9D, a total of 10 HIS+ diploids (all from the 0 hr time point) and 39 HIS- diploids (across multiple time points), each confirmed by four PCRs to the HIS3 locus, demonstrate action of the gene drive and correlation between phenotype and genotype. There was a 100% correlation between colonies sensitive on SD-HIS medium and lack of the entire target locus as assayed by PCR. For those (few) surviving colonies on SD-HIS (even after the 12 hr galactose shift), the diploid genome still contained both the Cas9 gene drive and the target cassette (our unpublished results). These data suggest that titration of the Cas9 nuclease itself correlates strongly with gene drive activity within a population.
We also examined the effects of altering the sgRNA length and mismatch in the context of our gene drive system (Fig. 11). GFY-2383 yeast were transformed with the collection of sgRNA(u1) cassettes. Yeast were mated to the target strains (GFY-3206 and GFY-3207), diploids selected, and drives were activated as described above. Diploids were induced in YP+Galactose for 24 hr prior to plating in triplicate. For sgRNA(u1) 20 bp (WT) and 19 bp (1 mismatch), 6 independent trials were performed. The percentage of yeast colonies with an active gene drive was quantified. The total number of dead colonies on SD-HIS plates compared to the corresponding colonies on SD-LEU plates represented the active gene drive percentage. Error, SD. The two comparisons highlighted were analyzed using an unpaired t-test. We obtained nearly identical results to our haploid editing experiments: only guide lengths of 19 to 22 allowed for an active gene drive even when the galactose induction time was increased to 24 hr. However, the 19 bp guide length with one 5' mismatch (G→A) did provide the only partially functioning (approximately 45% active) drive system. We recognize that this is reduced compared to the efficiency of editing in haploid cells (80-85% effective), yet there are major differences in the mode of repair between the two assays (NHEJ with the option to re-edit multiple rounds versus homologous recombination along the length of the paired chromosome in a diploid genome for action of a gene drive). Altering the 5' base to either C or T resulted in a near total loss of drive activity.
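The two percentages used throughout this description can be written as a short Python sketch: the haploid "editing percentage" (colony count normalized to the empty-vector control, as defined for Fig. 8A) and the active gene drive percentage (colonies dead on SD-HIS relative to the corresponding colonies on SD-LEU). The function and parameter names are illustrative and do not come from the original work.

```python
def editing_percentage(sample_colonies: float, empty_vector_colonies: float) -> float:
    """100% minus the colony count expressed as a percentage of the empty-vector control."""
    if empty_vector_colonies <= 0:
        raise ValueError("empty-vector control must have at least one colony")
    return 100.0 - 100.0 * sample_colonies / empty_vector_colonies


def drive_activity_percentage(dead_on_sd_his: float, colonies_on_sd_leu: float) -> float:
    """Percentage of SD-LEU colonies that fail to grow on SD-HIS (active drive)."""
    if colonies_on_sd_leu <= 0:
        raise ValueError("at least one SD-LEU colony is required")
    return 100.0 * dead_on_sd_his / colonies_on_sd_leu
```

For example, `editing_percentage(0, 1000)` gives 100.0 and `editing_percentage(1000, 1000)` gives 0.0, matching the two limiting cases described for the editing percentage. The two metrics are not interchangeable: the first infers editing from cell death relative to a control transformation, whereas the second scores loss of the HIS5 target marker among surviving diploid colonies.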
We observed the same trend (lower gene drive activity when compared to haploid NHEJ-based editing) when we examined the Cas9-eGFP fusions containing various NLS/NES combinations (Fig. 12A). Given that our WT Cas9-based drive (Fig. 9A) was nearly 99% active after 4 hr post galactose shift, we tested our 16 Cas9-eGFP variants at 1.25, 2.5, and 5.0 hr following induction (Fig. 12A). Gene drives were generated based on the 16 plasmid-borne Cas9 constructs described above and integrated at the HIS3 locus. All gene drive strains (GFY-2751 - GFY-2766) were transformed with the sgRNA(u1) 20 bp WT guide plasmid and mated to the two target strains (GFY-3206 and GFY-3207). Following diploid selection and pre-induction in a raffinose/sucrose mixture, diploid yeast were cultured in YP + galactose for 1.25, 2.5 or 5.0 hr prior to plating. Representative plates (the Cas9-eGFP fusion number illustrated for clarity) for two groupings are illustrated at the 5 hr time point on SD-LEU and SD-HIS medium (left). The percentage of yeast with active gene drives (percentage of colonies dead on SD-HIS) was quantified in triplicate (right). Error, SD.
Indeed, all constructs containing only added NLS signals (fusions 2-8) displayed nearly 99% active drive efficiencies after 5 hr of Cas9 expression; the Cas9-eGFP fusion with no added signal sequence was markedly lower at 75% (fusion 1). Addition of the C-terminal NES signal resulted in a dramatic reduction in nearly all drive activities (fusions 9-16). In fact, constructs containing only an NES or NLS-NES motif showed no activity at the 5 hr mark; however, this was not due to a lack of Cas9 expression (or nuclease activity) as these constructs were readily expressed, excluded from the nucleus, and were able to initiate editing in both haploid cells and gene drive diploids at a later time point (Fig. 13). Yeast (GFY-3101 and GFY-2758) were assayed for self-excision by using the sgRNA(u2)-expression plasmid (pGF-V809). Strains were pre-induced in a synthetic raffinose/sucrose mixture, back-diluted into YP + galactose, and transformed with equimolar amounts of either empty pRS425 vector or sgRNA(u2) vector, recovered overnight in galactose, and plated to SD-LEU medium for 3 days at 30°C. Colonies were quantified in triplicate. Error, SD. An unpaired t-test was used to compare the final colony counts. While the decrease in the total number of colonies was modest, these data illustrate there is (some) action of Cas9 for both the NES and NLS-NES tagged Cas9-eGFP fusions (Fig. 13).
Haploid strains from Fig. 13(A) as well as a WT Cas9 control strain (GFY-2383), all containing the sgRNA(u1) plasmid, were mated to the gene drive "target" strains (GFY-3206 and GFY-3207), diploids selected, Cas9 expression was pre-induced (raffinose/sucrose) overnight, and yeast were cultured in YP + galactose for 24 hr prior to plating onto SD-LEU medium. After 3 days, clonal isolates (randomly selected) were re-tested on SD-LEU plates and assayed for sensitivity on SD-HIS (between 50-75 colonies were tested for each strain) in quadruplicate. Error, SD. These data illustrate that both Cas9-eGFP-NES and Cas9-eGFP-NLS-NES display some gene drive activity, but greatly reduced compared to the action of WT Cas9-NLS. Moreover, our assessment of colonies from SD-LEU onto SD-HIS is likely an overestimate of drive activity since there is the possibility of clonal siblings being accidentally assayed. Regardless, these data show that there is gene drive action for the two NES-containing Cas9 constructs, albeit much lower, and requiring a 24 hr induction. Strains (GFY-3270 and GFY-3271) were constructed based on those used in Fig. 13(A) that contain an integrated NUP188-mCherry to mark the nuclear periphery. Yeast were pre-induced overnight (raffinose/sucrose), back-diluted into YP + galactose for 4.5 hr, and imaged by fluorescence microscopy. White dotted lines, cell periphery. Scale bar, 3 μm. A faint circular haze seen in the mCherry channel represents the yeast vacuole (co-localized with the circular indentation in the DIC image). Representative images are shown. Steady state levels of Cas9-eGFP are excluded from the nucleus in both strains.
The same general trend was observed for the competition between NLS and NES signals with additional NLS motifs providing higher drive activity (Fig. 12B). These results highlight that titration of Cas9 protein expression with nucleocytoplasmic shuttling can provide a wide range of gene drive efficiencies from very low activity (0%) to wild-type levels (99%) (Fig. 12C). Moreover, our findings demonstrate that nuclear exclusion (or limited residence time) may serve as a reasonable mechanism to fully inhibit or titrate drive activity (depending on the presence of other signals). Finally, given less time for Cas9 induction (less than 5 hr) there was still a modest increase in drive activity for constructs with two or three NLS sequences compared to only one nuclear signal, even in the absence of any NES.
Finally, we tested whether enzymatically dead Cas9 (dCas9)— which is still able to associate with the guide RNA and to target DNA— could serve as a direct competitor of active Cas9 (of the same species) when provided with an identical sgRNA (Fig. 14A). Rather than construct two separate expression cassettes for active Cas9 and dCas9, we created three separate tandem fusions between S. pyogenes Cas9 and either a second active nuclease, or an enzymatically dead version for several reasons (while the protein products were identical, the coding sequences were altered to prevent inappropriate homologous recombination and potential loss of one copy) (Fig. 14A). First, translational fusions to Cas9 (or dCas9) have been extensively used by many groups to add fluorescent proteins, chromatin-modifying enzymes, transcriptional activators/repressors, or other DNA modifying enzymes. Second, a gene fusion between otherwise identical Cas9 enzymes would circumvent the need for exacting titration of transcript/translation for a direct one-to-one comparison. Third, incorporation of both Cas9 genes at the site of the gene drive (HIS3 locus) would require the addition of unique promoter and terminator sequences flanking the secondary gene copy in a "separate" arrangement. Fourth, use of an additional yeast-specific (inducible) promoter sequence to drive expression of a second Cas9 variant would have limited utility to gene drive systems in other organisms. Thus, we have focused our initial efforts on tandem S. pyogenes Cas9 gene fusions, although we recognize that future iterations might consist of "competing" freely-expressed Cas9 nucleases to titrate editing.
As illustrated in Fig. 14A, the second Cas9 gene (asterisk) was synthesized de novo by altering greater than 90% of the codons (primarily within the wobble position). A 15-residue flexible linker was inserted between the two Cas9 copies. Dead Cas9 contains the mutations D10A and H840A. Transformation of these tandem Cas9-Cas9 fusions along with controls with the self-excising guide RNA (u2) demonstrated that both orientations (Cas9-dCas9 and dCas9-Cas9) resulted in a reduced level of editing in haploids compared to wild-type Cas9 expressed alone (Fig. 14B). GFY-2383, GFY-3250, GFY-3099, GFY-3100, and GFY-3336 yeast were transformed with equimolar amounts of either an empty vector (pRS425, duplicate), or a plasmid expressing the sgRNA(u2) 20 bp WT cassette (pGF-V809, triplicate), plated onto SD-LEU medium and incubated for 3 days. The total number of viable colonies was quantified (left). Error, SD. Two-strain comparisons were performed using an unpaired t-test (right). Red text, p-values higher than 0.05. Interestingly, the tandem fusion of two nuclease-active Cas9 proteins (Strain T) displayed a marked increase in editing compared to the dCas9-containing fusions yet was still significantly impeded when compared to free WT Cas9. We next examined these Cas9 variants in the context of our gene drive system. We hypothesized that one possible contributing factor to the overall reduction in editing with all Cas9 fusions could be the requirement of additional sgRNA fragments per Cas9 polypeptide; the fusion might serve as an sgRNA "sink" requiring double the level of RNA compared to free Cas9. Therefore, we tested three conditions at various time points using two identical sgRNA-expressing plasmids (marked with URA3 or LEU2): (i) two empty vectors, (ii) two sgRNA(u1) plasmids, and (iii) one sgRNA/one empty plasmid (Fig. 14C).
Yeast strains were each transformed with two plasmids (URA3 and LEU2 markers) resulting in four conditions: (i) sgRNA(u1)/sgRNA(u1), (ii) sgRNA(u1)/empty, (iii) empty/sgRNA(u1), and (iv) empty/empty. These included pRS425, pRS426, pGF-V1220, and pGF-V1625. Only the data for one of the sgRNA(u1)/empty combinations (pGF-V1625/pRS425) is presented. Strains were mated to the gene drive target strains (GFY-3206 and GFY-3207) and diploids were selected on SD-URA-LEU-HIS for three consecutive rounds. Strains harboring either (i) two empty vectors or (ii) expressing a single copy of dCas9, were only mated to GFY-3206. Diploid yeast were pre-induced overnight as previously described, and Cas9 expression was induced for 5, 12, or 24 hr in YPGal medium prior to dilution onto SD-URA-LEU plates. Finally, yeast were transferred to SD-URA-LEU and SD-HIS plates before imaging (top).
Both combinations of the sgRNA/empty plasmid condition were tested and displayed nearly identical results (our unpublished data). We found that the Cas9-dCas9 and dCas9-Cas9 fusions resulted in approximately 45-70% drive activities (24 hr) (Conditions A and B). Testing of the dual active Cas9-Cas9 fusion revealed a marked increase (24 hr, 80-90%) in overall drive activity, especially at earlier time points, yet still fell short of the optimal freely-expressed WT Cas9 (24 hr, >95%) (Condition C). The percentage of active gene drives (percentage of colonies dead on SD-HIS plates) was quantified (bottom). Error, SD. Comparisons between strains (all time points included) were performed using an unpaired t-test. P-values are shown in red text (>0.10), green text (between 0.05 and 0.10), or black text (<0.05). For individual time point comparisons, see Table 3. See Fig. 14D.
Table 3. Unpaired t-test comparisons between gene drive activities for Cas9 fusion variants.
Comparison              p-value
A1-5 versus B1-5        0.790
A1-12 versus B1-12      0.586
A1-24 versus B1-24      0.285
A2-5 versus B2-5        0.424
A2-12 versus B2-12      0.777
A2-24 versus B2-24      0.389
A1-5 versus C1-5        0.084
A1-12 versus C1-12      0.036
A1-24 versus C1-24      0.073
A2-5 versus C2-5        0.102
A2-12 versus C2-12      0.063
A2-24 versus C2-24      0.053
B1-5 versus C1-5        0.154
B1-12 versus C1-12      0.047
B1-24 versus C1-24      0.065
B2-5 versus C2-5        0.057
B2-12 versus C2-12      0.026
B2-24 versus C2-24      0.035
C1-5 versus D1-5        0.005
C1-12 versus D1-12      0.036
C1-24 versus D1-24      0.071
C2-5 versus D2-5        0.038
C2-12 versus D2-12      0.043
C2-24 versus D2-24      0.226
Yeast strains (A-D) correspond to the nomenclature used in Fig. 14C (A, dCas9-Cas9 fusion; B, Cas9-dCas9 fusion; C, Cas9-Cas9 fusion, and D, freely expressed WT Cas9). The number designation (e.g. A1, A2) refers to the presence of either 1 or 2 identical sgRNA-expressing plasmids (e.g. A2 contains both pRS425-based and pRS426-based sgRNA(u1) cassettes). For consistency, strains with only 1 sgRNA plasmid also harbor the corresponding high-copy empty vector. The final number indicates the induction time in galactose for each gene drive experiment (5, 12, or 24 hr).
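The condition labels in Table 3 follow the scheme just described (strain letter, number of sgRNA plasmids, induction time in hours), and the p-values fall into the three ranges used for the colored text in Fig. 14. A minimal Python sketch of how such labels and p-values could be decoded is given below; the function names and the returned strings are illustrative choices, not part of the original analysis.

```python
import re

# Strain letters as defined in the table footnote.
STRAINS = {
    "A": "dCas9-Cas9 fusion",
    "B": "Cas9-dCas9 fusion",
    "C": "Cas9-Cas9 fusion",
    "D": "freely expressed WT Cas9",
}

# Label = strain letter, 1 or 2 sgRNA plasmids, then induction time (hr).
LABEL_PATTERN = re.compile(r"^([A-D])([12])-(5|12|24)$")


def parse_label(label: str):
    """Split a label such as 'A1-12' into (strain, sgRNA plasmids, hours)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized condition label: {label!r}")
    letter, plasmids, hours = match.groups()
    return STRAINS[letter], int(plasmids), int(hours)


def p_value_bin(p: float) -> str:
    """Assign a p-value to one of the three ranges used for the figure text."""
    if p > 0.10:
        return "p > 0.10"
    if p >= 0.05:
        return "0.05 <= p <= 0.10"
    return "p < 0.05"
```

For instance, `parse_label("C1-5")` returns the Cas9-Cas9 fusion with one sgRNA plasmid at 5 hr, and the corresponding comparison with D1-5 (p = 0.005) falls in the `p < 0.05` bin.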
Interestingly, we observed only a modest contribution of having multiple guide RNA-expressing plasmids under all conditions tested and the majority were not statistically significant (Table 3), suggesting that, in our experimental setup (high-copy plasmid and SNR52 promoter), there is likely near-sufficient guide RNA present. Next, there was no difference between the placement of the dCas9 variant on the N- or C-terminus of active Cas9 (both are separated by a 15 residue flexible linker). Also, use of the Cas9-Cas9 variant, identical save for two mutational substitutions, illustrated that there is a direct contribution of the second active nuclease within this arrangement and there is likely direct competition of the dCas9 variant within the mixed fusions.
Finally, the decrease in drive activity (and haploid editing) of the Cas9-Cas9 fusion compared with free WT Cas9 may result from slowed nuclear import, as the molecular weight of the tandem fusions is expected to be nearly 320 kDa. However, passive diffusion of chimeric model proteins was found to exceed the 60 kDa limit in a previous study, suggesting that, perhaps, the nuclear pore complex (NPC) can accommodate much larger proteins, especially those constructed from protein fusions, rather than natively assembled masses (Wang and Brattain 2007). Moreover, upper estimates of a eukaryotic NPC cargo size included a diameter of up to 39 nm (using cargo-receptor-gold particle complexes) whereas the diameter of a single Cas9 protein is well below 15 nm. In support of this model of slowed nuclear import, the earliest time point (5 hr) for this fusion displayed a dramatic difference with WT Cas9, yet given additional induction time (12 or 24 hr), the Cas9-Cas9 variant was able to achieve a significant level of editing and drive activity. Collectively, our results present four molecular aspects of CRISPR-based gene drives that have the potential to modulate, in a programmable and predictable fashion, the activity and effectiveness of a given gene drive system.
DISCUSSION
This work describes one major aspect of gene drives that has never been studied or considered to date: methodologies to pre-program a desired drive effectiveness that is fully self-contained and requires no external regulatory mechanism.
We envision numerous reasons why a "sub-optimal" drive arrangement may be desirable in practice. First, lowering the rate by which a gene drive can propagate through a given population would require more time to achieve full penetrance (e.g. 10% effective drive versus 99%). Future work will be required to carefully map the relationship between drive activity, organism generation time, and penetrance within a population; indeed, recent studies have begun using in silico modeling to examine the dynamics of gene drives within populations. A titratable drive system might be useful in the initial stages of field testing or studying optimization of a drive system in a native or controlled population— increasing the number of generations required for full conversion would allow for (i) control over the length of time required for full penetrance which may be of use for organisms with a very rapid (or too rapid) generation time and (ii) the option to counter, halt, or release a fail-safe drive to reverse, inhibit, or destroy the primary drive should the need arise. Population suppression (rather than elimination) is still a useful goal that might benefit from a tunable drive system.
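The relationship between drive efficiency and the time needed for full penetrance can be illustrated with a deliberately simple model. The sketch below is not the in silico modeling referred to above; it assumes an idealized, infinite, randomly mating population with no fitness cost and no resistance alleles, and the function names, the starting frequency of 1%, and the 99% threshold are all illustrative assumptions. Under these assumptions a heterozygote transmits the drive allele with probability (1 + e)/2, where e is the conversion efficiency, so the allele frequency q updates as q' = q + e·q·(1 − q).

```python
def next_frequency(q: float, efficiency: float) -> float:
    """Drive allele frequency after one generation of random mating.

    Assumed toy model: heterozygotes pass on the drive allele with
    probability (1 + efficiency) / 2, which gives
    q' = q**2 + 2*q*(1 - q)*(1 + efficiency)/2 = q + efficiency*q*(1 - q).
    """
    return q + efficiency * q * (1.0 - q)


def generations_to_reach(threshold: float, efficiency: float,
                         q0: float = 0.01,
                         max_generations: int = 10_000) -> int:
    """Count generations until the drive allele frequency reaches threshold."""
    q = q0
    for generation in range(max_generations + 1):
        if q >= threshold:
            return generation
        q = next_frequency(q, efficiency)
    raise ValueError("threshold not reached within max_generations")


if __name__ == "__main__":
    # Compare a weak (10%) and a strong (99%) drive, as in the text above.
    for eff in (0.10, 0.99):
        print(f"efficiency={eff:.2f}: "
              f"{generations_to_reach(0.99, eff)} generations to 99% frequency")
```

In this toy model, lowering the efficiency stretches the same conversion over many more generations, which is the window that a tunable drive would provide for monitoring or countermeasures.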
Second, the use of gene drives does not necessarily have to be restricted to application within wild populations. Indeed, several studies have demonstrated that gene drives represent powerful genetic screening tools that can be used within basic laboratory research. A tunable gene drive could be useful in studying population dynamics, evolved resistance, or in the generation of a heterogeneous population of edited cells/cell types for use in high throughput screening. Third, nearly every form of biological control of pests or pathogens (especially that used in agriculture, such as natural predators, chemical agents, or physical barriers/traps) includes the ability to titrate the proposed solution to a level that is safe (to surrounding plant and animal life and to humans), cost-effective, and appropriate given the nature of the problem at hand. In its current state, deployment of an active gene drive would have only two levels: 0% (not active) or nearly 99% (fully active). Fourth, components that can partially reduce or slow drive activity may be able, selectively or in combination, to impose maximum reduction in drive activity should the need arise and serve as a controllable and inducible (temporary or permanent) off-switch. Therefore, we have performed an investigation into the invariant components of all gene drive systems, regardless of the proposed organism of use: Cas9 and the sgRNA.
Molecular mechanisms to tune drive activity in yeast
Use of S. cerevisiae as a model system to study gene drives comes with the numerous benefits of this popular and genetically tractable model organism. Many groups have begun to use yeast, as a model eukaryote, to demonstrate novel applications and uses of Cas9. We have developed a system for examination of Cas9 editing both in a haploid genome and in the diploid state (for gene drive activity). This model system provides several important benefits to the study of gene drives: the risk of unintended (or malicious) drive release is minimized because laboratory yeast do not sporulate well, are not airborne, and, as we have documented here, can be programmed with multiple genetic safeguards that render any escaped drive inviable. These simple yet powerful genetic additions (our artificial "unique" target sequences and self-excising sequences flanking Cas9 itself) ensure that no native yeast strain or population would ever be inappropriately targeted by the gene drive. Moreover, separation of the sgRNA (on a plasmid) serves as a potent natural failsafe for the study of gene drives in yeast. Finally, the rapid lifecycle, the molecular tools available to the yeast community, and high-throughput infrastructure allow hundreds, if not thousands, of gene drive arrangements to be explored and tested. Given the technical challenges of constructing even one or two viable drive systems in insects, it would be extremely difficult to explore more than a single variable in the challenging and time-consuming in vivo systems of arthropods or higher eukaryotes.
Here, we explored four conserved mechanisms of the CRISPR system and demonstrate that all four are independent techniques to titrate gene drive activity within our yeast system. We envision further development of each of these modes of control to varying degrees. While an inducible promoter system driving Cas9 transcription may not be a practical solution to study/test in an insect model, this could still be useful in promoter choice, and possibly complex modulation of the Cas9 promoter itself. For instance, we envision that multiple layers of transcriptional regulation could be assembled onto the "primary" Cas9 nuclease used to initiate the gene drive by a secondary copy of dCas9 fused to either an activator or repressor to modulate expression. Moreover, numerous evolved variants of Cas9 are now paired with either an inducible promoter or external stimuli including small molecules, temperature, and even light.
A plethora of studies have extensively tested many variables surrounding sgRNA design, including guide format (single or two-part guides), length, mismatches, deletions, stem loop identities, and chemical modifications or fusions. Based on our limited set, and given further study, we are hopeful that there is a viable population of guide RNA options that might produce a titratable level of editing (like our 19 bp guide with a single 5' G-A mismatch) in the context of a drive system. Importantly, other groups have also reported varying levels of editing given distinct mismatches within the two most 5' bases of the crRNA sequence, as we have demonstrated in this study. This might be explained by varied expression, stability, and/or Cas9 loading of the RNA. The commonly used U6 promoter requires a 5' G base for expression, although crRNA transcript levels might also be affected by other positions within the guide, including the seed region.
An important finding of our study involves titration of Cas9 nuclear occupancy through the active nucleocytoplasmic shuttling achieved by presentation of multiple NLS or NES signal(s). Nuclear Cas9/sgRNA complex residence time has been shown to limit editing efficiency. Given that the mechanisms for nuclear transport of proteins are conserved and that canonical signal sequences including, but not restricted to, the SV40 NLS are used across model systems, we envision this as a viable and rich option for modulation of editing in gene drives. The contribution of additional NLS sequences to promoting Cas9 editing has already been demonstrated in other cell types, and current Cas9 systems use two or three NLS sequences to maximize editing. In our designed system, using the potent GAL1/10 promoter, a single SV40 NLS appeared sufficient to direct trafficking of Cas9 in either haploid cells or diploid gene drive setups. However, our work demonstrates that, like other cell systems that often require more than one nuclear signal, multiple NLS sequences provide a more robust import signal when challenged with a sub-optimal degree of Cas9 expression and/or an opposing nuclear export signal. Of note, our genetically encoded Cas9 system uses one of the most highly expressed promoters in yeast, whereas other CRISPR editing systems utilize various means of Cas9 delivery, including chromosomally encoded Cas9, plasmid-expressed Cas9, or microinjected purified Cas9/sgRNA ribonucleoprotein. Furthermore, different cell types have been shown to display highly variable Cas9 localizations regardless of the presence of one or two NLS sequences. Finally, the placement of NLS signal sequences (distance from the nuclease coding sequence) may also be a factor in accessibility to the nuclear import machinery, as may the context of the fused protein of interest.
However, as a general strategy, our study and others have concluded that the addition of more than one NLS sequence can serve to increase nuclear localization and editing and may buffer against other factors that could interfere with optimal import. The identification and characterization of endogenous NLS signal sequences specific to the organism of interest would also provide an additional suite of options for either optimized or titratable nuclear import and subsequent Cas9 editing. Using the dynamic nuclear import/export of Cas9, we have demonstrated that both the level of Cas9 and its nuclear occupancy can modulate drive activity over a wide range of efficiencies. In fact, one possible mechanism for shutoff (or reduction) of Cas9 editing could be induced attachment or recruitment of a NES-containing peptide or protein and subsequent nuclear exclusion. This could even be coupled with the newly discovered "anti-CRISPR" family of short peptides that serve to directly bind and inhibit Cas9 nuclease activity.
Finally, we have piloted the use of a set of Cas9-Cas9 fusions to illustrate that (i) dCas9 can compete with native nuclease-active Cas9 and (ii) the large size of a tandem active fusion can partially impede editing. We recognize that alternative Cas9 orthologs would also provide a unique mode of titration between one active and one dead nuclease (or other combinations therein) competing for identical or nearby target sequences. Our system was developed because (i) only a single sgRNA cassette was required, (ii) only a single gene module was needed to express the dual Cas9 fusion (rather than two separate or identical promoters), (iii) a flexible linker provided both the N- and C-terminal nucleases ample spacing and (iv) addition of coding sequence as part of a gene fusion could be directly applicable to gene drive systems in other organisms. We are excited about the potential that unique Cas9 fusions may serve within the context of gene drives given the rapid explosion of new variants that currently exist. Indeed, dCas9 has provided the expansion of an entirely new field of fusing other enzymes of interest from DNA modifying enzymes, to transcriptional regulators, to "base editors," to fluorescent proteins, to other nuclease enzymes. Similarly, placement of different arrangements of Cas9 fusions (or expressed as separate proteins, or a cleavable fusion), different linker lengths or restrictions between fused proteins, or the presence of other non-nuclease modifying enzymes or tags within the drive system could aid to optimize, inhibit, or, as we have demonstrated, titrate the level of overall drive activity.
Given the technical, societal, and ethical challenges facing application of gene drives in the wild, additional study in controlled laboratory settings is critical. Our yeast drive system represents a safe, contained, and rapid testing platform to explore the numerous new Cas9 variants, sgRNA arrangements, and the subcellular trafficking of the Cas9/sgRNA complex to identify new means for future control, regulation, or inhibition in fungi, plant, or metazoan hosts and possible application in wild populations.
Finally, a number of safeguards were included to ensure the safe, ethical, and contained use of all yeast strains harboring the (potentially) active CRISPR gene drive arrangement. First, the most powerful safeguard is the use of a programmed target DNA site (u1) at the HIS3 locus that does not exist within the native budding yeast genome. This sequence has a maximum mismatch to any other sequence within S. cerevisiae, reducing the possibility of inappropriate editing (off-target effects) and virtually eliminating the possibility of the drive propagating within a wild yeast population of any strain type or related species. Second, all haploid gene drive strains (containing Cas9 and the (u1)-targeting sgRNA plasmid) were grown on dextrose to repress transcription of Cas9. Within the diploid strain, Cas9 was only activated for a limited amount of time (0-24 hr). Third, the sgRNA to target the (u1) sequence was exclusively maintained on a high-copy yeast plasmid (pRS425). Previous work has suggested that separation of the guide sequence from the Cas9 gene can provide an additional safeguard. Here, we have also documented the rapid loss of the sgRNA(u1) plasmid from diploid yeast in the absence of any selective pressure (Fig. 15). Yeast (GFY-2383) were transformed with the sgRNA(u1) expression cassette on the high-copy pRS425 vector (pGF-IVL1220) and mated to the gene drive "target" strain (GFY-3206); diploids were selected (SD-LEU-HIS medium), cultured in pre-induction medium (raffinose/sucrose) overnight, and, with no activation of Cas9, directly plated to SD-LEU medium (to maintain the plasmid) and incubated for 3 days. A random sampling of colonies from the SD-LEU plate was separated into single colonies on rich medium (YP + Dextrose) and grown for 3 additional days.
From this YPD plate, a random sampling of individual clonal isolates was chosen (n = 60-80 colonies), moved to a fresh YPD plate (left), incubated for 24 hr, replica-plated to SD-LEU (right), and grown for 24 hr prior to imaging. The number of clonal isolates sensitive on the SD-LEU condition was quantified (far right) in duplicate. Error, SD. From the remaining yeast on the YPD plate, a random sampling (full streak across the plate) was taken and propagated to a fresh YPD plate to isolate a new round of single colonies. The process was repeated 3 times on successive rounds of YPD plates. By the third round of colony isolation, nearly 80% of isolates had lost the pRS425-based plasmid (in less than 2 weeks) in the absence of any selective pressure or counter-selection.
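As a rough way to read the plasmid-loss figure above, one can ask what per-round retention would produce roughly 80% loss after three rounds of colony isolation. The sketch below assumes, purely for illustration, that the same fraction of isolates retains the plasmid in every round; the source does not state that loss is uniform across rounds, and the function names are hypothetical.

```python
def per_round_retention(fraction_lost: float, rounds: int) -> float:
    """Per-round plasmid retention implied by a cumulative loss.

    Assumes a constant retention r per round, so that
    (1 - fraction_lost) = r ** rounds.
    """
    if not 0.0 <= fraction_lost < 1.0:
        raise ValueError("fraction_lost must be in [0, 1)")
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    return (1.0 - fraction_lost) ** (1.0 / rounds)


def fraction_lost_after(retention_per_round: float, rounds: int) -> float:
    """Cumulative fraction of isolates that have lost the plasmid."""
    return 1.0 - retention_per_round ** rounds


if __name__ == "__main__":
    r = per_round_retention(0.80, 3)  # ~80% loss by round 3 (from the text)
    print(f"implied retention per round: {r:.3f}")
    print(f"projected loss after 5 rounds: {fraction_lost_after(r, 5):.3f}")
```

Under this constant-rate assumption, roughly 58% of isolates would retain the plasmid per round, which is consistent with the description of rapid loss without selection.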
Fourth, all the Cas9-expression cassettes are also flanked by the artificial (u2) site. This serves as a specific safeguard to exactly excise the entire Cas9 gene and associated sequence from any haploid or diploid genome in a single step. We demonstrate that introduction (by direct transformation or mating via a strain of the opposite mating type) of the (u2) sgRNA plasmid causes removal and destruction of the existing drive system (Fig. 16A & 16B). All gene drive strains were constructed with this (u2) system in place. Our Cas9 gene drives all contain a preprogrammed genetic mechanism to activate self-excision in either haploid or diploid cells. We have purposefully inserted the two artificial sites (u2) flanking the entire Cas9 drive cassette in all our gene drive configurations. These (u2) sites also contain a maximum mismatch to the S. cerevisiae genome and are distinct from the (u1) sites built into the target strains (these have been omitted from the diagrams for clarity). As illustrated in Fig. 16A, we have demonstrated five independent means to activate self-excision of Cas9 from the genome using the (u2) sites and a plasmid expressing the sgRNA(u2) guide sequence. WT, the traditional mode of action for all gene drives. Scenario 1, haploid yeast containing Cas9 in the gene drive strain (GFY-2383) were mated to the target strain of the opposite mating type (GFY-3206) harboring the sgRNA(u2) plasmid (pGF-V809) with a LEU2 marker, and diploids were selected twice. Following activation of the drive (pre-induction overnight followed by 24 hr in galactose medium), approximately 2,000 to 4,000 cells were plated to SD-LEU, grown for four days at 30°C, and transferred to fresh SD-LEU and G418 medium in triplicate (B, lower middle). Total colonies were quantified (B, bottom, right). Error, SD.
Action of Cas9 on the flanking (u2) sites caused self-excision of the gene drive expression cassette and removal of the KanR marker by replacement with the target strain cassette (His+). We achieved >99.9% removal of the gene drive. Scenario 2, an identical procedure was performed with the same gene drive strain as in Scenario 1, but rather than the MATα target strain, WT BY4742 yeast were used harboring the same LEU2-marked sgRNA(u2) plasmid. The HIS3 locus, following self-excision of the drive, would be homozygous diploid for the his3Δ1 allele (His-). Scenario 3, the same protocol as Scenario 2 was performed, but rather than a pRS425 vector containing the sgRNA(u2) cassette, a HIS3-containing pRS423 plasmid (pGF-V798) was included in the BY4742 strain for delivery to the Cas9 gene drive diploid. This plasmid includes 317 bp of 5' UTR and 201 bp of 3' UTR genomic sequence flanking the HIS3 locus and therefore serves as a source of donor DNA for repair of the DSB. Homology directed repair could occur through HR off the plasmid HIS3 sequence, or from action of the his3Δ1 allele present on the homologous chromosome; in either case, the Cas9 drive is excised and removed (note, all cells would remain His+ due to the presence of the pRS423 plasmid). Scenario 4, the haploid gene drive strain (GFY-2383) was directly transformed with the sgRNA(u2) on pRS423 (pGF-V798), induced for expression, and plated onto SD-HIS medium prior to assaying on G418 plates. Self-excision of Cas9 was coupled with repair of the DSB by the provided HIS3 donor DNA on the pRS423 plasmid (cells His+). Scenario 5, the haploid gene drive strain was transformed with the LEU2-based sgRNA(u2) plasmid and editing was initiated as previously described. Yeast were plated onto SD-LEU medium and action of Cas9 excises itself out of the genome; there is no donor DNA present to repair the DSB and surviving colonies rely on NHEJ.
We estimate that 96-98% of edited yeast (plates not shown) were inviable following action of Cas9 (B, bottom, right). Together, our system provides several options to approach destruction and removal of the gene drive itself, either in a haploid state (by direct addition of the guide RNA) or by introduction of a "suiciding" strain of the opposite mating type designed to deliver the self-excising guide RNA. Moreover, our gene drive suicide system utilizes the artificial (u2) sites and does not require targeting of any native yeast genomic sequence nor any other aspect native to the gene drive or the Cas9 gene itself. Our system is distinct from a previous method describing use of a second gene drive to destroy an initial drive-containing strain because (i) any yeast strain of the opposite mating type can serve as the delivery mechanism of the sgRNA(u2) plasmid, (ii) no additional Cas9 drive is required, since the original drive excises itself from the genome, (iii) our safety mechanism utilizes non-native DNA targets (u2) and would not present any risk of off-target or inappropriate editing, and (iv) our system could include a secondary Cas9 drive (say, under a distinct promoter sequence) to selectively target, edit, and destroy any intended (original) drive setup. These data demonstrate a powerful mechanism for destruction of our programmed gene drives in either the haploid or diploid state.
Fifth, the laboratory diploid strain (BY4741/BY4742) has been previously documented to be very inefficient at sporulation, even under optimal conditions that induce meiosis and spore formation. Sixth, careful destruction of all haploid and diploid yeast immediately following experimentation was performed (including capture of all washes of glassware for autoclaving). All materials used (tubes, velvet cloths, pipette tips, plates, wooden sticks, liquid cultures, etc.) were autoclaved at > 121°C for at least 0.75 hr before disposal or reuse. Diploid yeast strains containing gene drive systems were all destroyed (not frozen) following experimentation.
EXAMPLE 2
Anti-CRISPR Proteins
MATERIALS and METHODS
Yeast strains and plasmids
Saccharomyces cerevisiae strains can be found in Table 4. Standard molecular biology techniques were used to generate all constructs. Strains containing Cas9 were constructed by first creating a CEN-based plasmid including HIS3 UTR sequence, artificial [u2] sequences, and the KanR cassette using in vivo assembly. The Streptococcus pyogenes Cas9 gene was synthesized de novo with a yeast codon bias. Two overlapping (120 bp within the Cas9 ORF) PCR fragments were amplified, digested with DpnI, transformed into WT BY4741 yeast, and selected on media containing G418. The artificial site [u2] was placed directly upstream of the GAL1/10 promoter sequence (814 bp) and downstream of the MX(t) sequence.
The haploid "gene drive" strain (harboring inducible S. pyogenes Cas9) was built in MATa yeast (GFY-2383) and included (i) a LEU2-based high-copy plasmid with the sgRNA[u1] cassette and (ii) the URA3-based CEN plasmid expressing untagged AcrIIA2 or AcrIIA4 (top). For GFY-2586 and GFY-2583, the [u2] sequence included the Sp Cas9 target site (SEQ ID NO: 10) and adjacent 3' PAM (GGG). A C-terminal NLS (SEQ ID NO: 6) was also included. Target yeast strains (GFY-3206 and GFY-3207) were built in MATα yeast and contained an artificial target gene and selectable HIS5 marker (from S. pombe) with flanking (u1) sites (SEQ ID NO: 9) and 3' PAM sequence (GGG). 992 bp of the constitutive CCW12 promoter were used to drive S. pombe HIS5 expression. For strains GFY-3285 and GFY-3287, the following methodology was employed. First, plasmids harboring ADH1(t)::prMET25::AcrIIA2::CDC10(t)::prCCW12::SpHIS5::MX(t) (pGF-IVL1410) or the same construct with AcrIIA4 (pGF-IVL1411) were generated, PCR amplified in overlapping fragments, DpnI treated, and integrated into GFY-2383, and colonies were selected for survival on SD-HIS medium and sensitivity to G418. 384 bp of the MET25 promoter were used. Second, the HIS5 marker was replaced with the KanR MX-based cassette using an amplified fragment containing CDC10(t)::prMX::KanR::MX(t) (from pGF-IVL1412). There is a (u1) site downstream of the MX(t) sequence rather than a (u2) site. Construction of GFY-3104 and GFY-3268 included use of enzymatically dead Cas9 (D10A H840A). A modified site-directed mutagenesis protocol was used to introduce substitutions to the ORF in a pUC57 vector prior to in vivo plasmid assembly. The LactC2 domain was amplified from pGF-IVL687. The constructs were integrated into the yeast genome as previously described.
Table 4. Yeast strains used in this study.
Figure imgf000061_0001
GFY-3104 (b)   his3Δ::(u2)::prGAL1/10::dCas9(D10A H840A)::mCherry::LactC2(1-158)::ADH1(t)::KanR::(u2)::HIS3(t)
GFY-3268 (b)   BY4741; his3Δ::(u2)::prGAL1/10::mCherry::LactC2(1-158)::ADH1(t)::KanR::(u2)::HIS3(t)   This study
a. For strains GFY-3285 and GFY-3287, the following methodology was employed. First, plasmids harboring ADH1(t)::prMET25::AcrIIA2::CDC10(t)::prCCW12::SpHIS5::MX(t) (pGF-IVL1410) or the same construct with AcrIIA4 (pGF-IVL1411) were generated, PCR amplified in overlapping fragments, DpnI treated, and integrated into GFY-2383, and colonies were selected for survival on SD-HIS medium and sensitivity to G418. 384 bp of the MET25 promoter were used. Second, the HIS5 marker was replaced with the KanR MX-based cassette using an amplified fragment containing CDC10(t)::prMX::KanR::MX(t) (from pGF-IVL1412). There is a (u1) site downstream of the MX(t) sequence rather than a (u2) site.
b. Construction of GFY-3104 and GFY-3268 included use of enzymatically dead Cas9 (D10A H840A). A modified site-directed mutagenesis protocol was used to introduce substitutions to the ORF in a pUC57 vector prior to in vivo plasmid assembly. The LactC2 domain was amplified from pGF-IVL687. The constructs were integrated into the yeast genome as previously described.
Cassettes were PCR amplified with a high-fidelity polymerase (KOD Hot Start, EMD Millipore) and transformed into yeast using a lithium acetate protocol. Diagnostic PCRs followed by DNA sequencing confirmed successful integration.
DNA plasmids generated in this study are in Table 5. The anti-CRISPR genes AcrIIA2 and AcrIIA4 were cloned (pGF-IVL1384 to pGF-IVL1387) under control of the CDC11 promoter on CEN-based plasmids, tagged with GFP at either their N- or C-terminus, and transformed into WT yeast (BY4741). The AcrIIA2 protein was mutated using pairs of alanine substitutions, and a similar mutational analysis was performed on the AcrIIA4 protein. For all substitutions, the AcrIIA2 and AcrIIA4 expression cassettes were amplified, cloned into a TOPO II vector (pCR™-Blunt II-TOPO®, Invitrogen), mutagenized by PCR, and sub-cloned to pRS316 using flanking NotI/SpeI sites. Yeast (GFY-2383) were first transformed with all URA3-based plasmids: (i) empty pRS316, (ii) WT AcrIIA2 (pGF-IVL1336) or WT AcrIIA4 (pGF-IVL1337), or (iii) mutant AcrIIA2 (pGF-V1399 to pGF-V1420) or AcrIIA4 (pGF-V1421 to pGF-V1439).
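The paired alanine substitutions are named with the usual wild-type residue, position, new residue convention (for example, D14A Y15A). The following sketch shows one way to check such labels against a protein sequence and apply them in silico before ordering mutagenesis primers. It is an illustration only: the helper names are hypothetical, and the five-residue sequence is a toy placeholder, not the AcrIIA2 or AcrIIA4 sequence.

```python
import re

# Substitution labels such as "D14A": wild-type residue, 1-based position, new residue.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label like 'D14A' into ('D', 14, 'A')."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(protein: str, labels: list[str]) -> str:
    """Return the mutant sequence, checking each wild-type residue first."""
    residues = list(protein)
    for label in labels:
        wild_type, position, new = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    toy_protein = "MDYEA"  # hypothetical placeholder sequence
    print(apply_substitutions(toy_protein, ["D2A", "Y3A"]))
```

Checking the wild-type residue before substituting catches off-by-one numbering errors, which are easy to make when a tag or deletion shifts residue positions.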
Plasmids containing sgRNA cassettes included 269 bp of the SNR52 promoter sequence, 20 bp of the SUP4 terminator sequence, and the appropriate crRNA and tracrRNA (per orthologous species used). For S. pyogenes Cas9, a guide sequence of 20 bp was used. The S. pyogenes Cas9 (yeast codon bias) expression cassette is in SEQ ID NO: 13, including the terminal SV40 NLS sequence. The S. pyogenes sgRNA expression cassette [u2] is in SEQ ID NO: 14, including the SNR52 promoter (residues 34-302), crRNA guide sequence (residues 303-322), tracrRNA (residues 323-401), and flanking restriction sites (residues 1-6 and 450-455). All vectors were confirmed via DNA sequencing (Genscript).
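The residue ranges given for SEQ ID NO: 14 are 1-based and inclusive, and they can be checked against the stated element sizes. The sketch below records only the coordinates listed in the text (the feature names are labels chosen here for illustration, and no coordinates are assumed for elements the text does not position, such as the terminator) and shows how such ranges map onto Python slices.

```python
# 1-based, inclusive coordinates as stated for SEQ ID NO: 14.
FEATURES = {
    "restriction_site_5prime": (1, 6),
    "SNR52_promoter": (34, 302),
    "crRNA_guide": (303, 322),
    "tracrRNA": (323, 401),
    "restriction_site_3prime": (450, 455),
}


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a sequence using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (start, end) in FEATURES.items():
        print(f"{name}: {start}-{end} ({feature_length(start, end)} nt)")
```

The promoter interval works out to 269 nt and the guide interval to 20 nt, matching the 269 bp SNR52 promoter and 20 bp guide described above.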
Table 5. Plasmids used in this study.
Figure imgf000063_0001
pGF-V1409 pRS316; prCDCll::AcrllA2(D99A L100A)::ADHl(t) This study pGF-V1410 pRS316; prCDCll::AcrllA2(l97A D98A)::ADHl(t) This study pGF-V1411 pRS316; prCDCll::AcrllA2(E72A V75A)::ADHl(t) This study pGF-V1412 pRS316; prCDCll::AcrllA2(D65A D71A)::ADHl(t) This study pGF-V1413 pRS316; prCDCll::AcrllA2(E63A Y64A)::ADHl(t) This study pGF-V1414 pRS316; prCDCll::AcrllA2(D60A E61A)::ADHl(t) This study pGF-V1415 pRS316; prCDCll::AcrllA2(E93A D96A)::ADHl(t) This study pGF-V1416 pRS316; prCDCll::AcrllA2(D76A D81A)::ADHl(t) This study pGF-V1417 pRS316; prCDCll::AcrllA2(D38A D40A)::ADHl(t) This study pGF-V1418 pRS316; prCDCll::AcrllA2(E25A E26A)::ADHl(t) This study pGF-V1419 pRS316; prCDCll::AcrllA2(D22A D23A)::ADHl(t) This study pGF-V1420 pRS316; prCDCll::AcrllA2(E12A E16A)::ADHl(t) This study pGF-V1421 pRS316; prCDCll::AcrllA4(L86A N87A)::ADHl(t) This study pGF-V1422 pRS316; prCDCll::AcrllA4(S84A E85A)::ADHl(t) This study pGF-V1423 pRS316; prCDCll::AcrllA4(L82A K83A)::ADHl(t) This study pGF-V1424 pRS316; prCDCll::AcrllA4(l80A T81A)::ADHl(t) This study pGF-V1425 pRS316; prCDCll::AcrllA4(Q78A T79A)::ADHl(t) This study pGF-V1426 pRS316; prCDCll::AcrllA4(D76A M77A)::ADHl(t) This study pGF-V1427 pRS316; prCDCll::AcrllA4(Y74A N75A)::ADHl(t) This study pGF-V1428 pRS316; prCDCll::AcrllA4(E72A F73A)::ADHl(t) This study pGF-V1429 pRS316; prCDCll::AcrllA4(E70A E71A)::ADHl(t) This study pGF-V1430 pRS316; prCDCll::AcrllA4(E68A D69A)::ADHl(t) This study pGF-V1431 pRS316; prCDCll::AcrllA4(E66A Y67A)::ADHl(t) This study pGF-V1432 pRS316; prCDCll::AcrllA4(N64A Q65A)::ADHl(t) This study pGF-V1433 pRS316; prCDCll::AcrllA4(E49A V52A)::ADHl(t) This study pGF-V1434 pRS316; prCDCll::AcrllA4(E45A E47A)::ADHl(t) This study pGF-V1435 pRS316; prCDCll::AcrllA4(E40A Y41A)::ADHl(t) This study pGF-V1436 pRS316; prCDCll::AcrllA4(D37A G38A)::ADHl(t) This study pGF-V1437 pRS316; prCDCll::AcrllA4(T22A D23A)::ADHl(t) This study pGF-V1438 pRS316; prCDCll::AcrllA4(D14A Y15A)::ADHl(t) This study pGF-V1439 pRS316; 
prCDCll::AcrllA4(D5A E9A)::ADHl(t) This study pGF-V1470 pRS316; prCDCll::AcrllA4(D14A)::ADHl(t) This study pGF-V1471 pRS316; prCDCll::AcrllA4(N36A)::ADHl(t) This study pGF-V1472 pRS316; prCDCll::AcrllA4(D37A)::ADHl(t) This study pGF-V1535 pRS316; prCDCll::AcrllA4(G38A)::ADHl(t) This study pGF-V1536 pRS316; prCDCll::AcrllA4(N39A)::ADHl(t) This study pGF-V1473 pRS316; prCDCll::AcrllA4(E40A)::ADHl(t) This study pGF-V1474 pRS316; prCDCll::AcrllA4(N48A)::ADHl(t) This study pGF-V1475 pRS316; prCDCll::AcrllA4(D69A)::ADHl (t) This study pGF-V1476 pRS316; prCDCll::AcrllA4(E70A)::ADHl(t) This study pGF-V1477 pRS316; prCDCll::AcrllA4(E72A)::ADHl(t) This study pGF-V1534 pRS316; prCDCll::AcrllA4(F73A)::ADHl (t) This study pGF-V1478 pRS316; prCDCll::AcrllA4(D76A)::ADHl (t) This study pGF-V1479 pRS316; prCDCll::AcrllA4(M77A)::ADHl(t) This study pGF-V1480 pRS316; prCDCll::AcrllA4(D23R)::ADHl (t) This study pGF-V1481 pRS316; prCDCll::AcrllA4(N39R)::ADHl(t) This study pGF-V1482 pRS316; prCDCll::AcrllA4(D69R)::ADHl (t) This study pGF-V1483 pRS316; prCDCll::AcrllA4(E70R)::ADHl (t) This study pGF-V1484 pRS316; prCDCll::AcrllA4(Y67A D69A)::ADHl(t) This study pGF-V1485 pRS316; prCDCll::AcrllA4(D69A E70A)::ADHl(t) This study pGF-V1220f pRS425; prSNR52::Sp-sgRNA(ul-20WT)::SUP4(t) Example 1 pGF- pRS425; prSNR52::Sp-sgRNA(mCherry)::SUP4(t) Roggenkamp
425+IVL1277g et al. 2017 pGF-IVL1431 pRS316; prCDCll::GFP::AcrllA4(E70A E71A)::ADHl(t)::HygR This study pGF-IVL1432 pRS316; prCDCll::GFP::AcrllA4(E40A Y41A)::ADHl(t)::HygR This study pGF-IVL1433 pRS316; This study prCDCll::GFP::AcrllA4(D14A Y15A)::ADHl(t)::HygR
pGF-IVL1388h pRS316; prCDCll::AcrllA2(2-HA)::ADHl(t)::HygR This study pGF-IVL1389 pRS316; prCDCll::AcrllA2(2-21A)::ADHl(t)::HygR This study pGF-IVL1392 pRS316; prCDCll::AcrllA2(114-123A)::ADHl(t)::HygR This study pGF-IVL1393 pRS316; prCDCll::AcrllA2(104-123A)::ADHl(t)::HygR This study pGF-IVL1390 pRS316; prCDCll::AcrllA4(2-HA)::ADHl(t)::HygR This study pGF-IVL1391 pRS316; prCDCll::AcrllA4::ADHl(t)::HygR This study pGF-IVL1394 pRS316; prCDCll::AcrllA4(78-87)::ADHl(t)::HygR This study pGF-IVL1395 pRS316; prCDCll::AcrllA4(68-87)::ADHl(t)::HygR This study
Roggenkamp et al. CRISPR-UnLOCK: multipurpose Cas9-Based strategies for conversion of yeast libraries and strains. Front Microbiol 2017; 8: 1773, incorporated by reference herein.
a. The sgRNA cassette was constructed as previously described. Briefly, 269 bp of the SNR52 promoter sequence, 20 bp of the SUP4 terminator sequence, and the appropriate crRNA and tracrRNA (per orthologous species used) were included. For S. pyogenes Cas9, a guide sequence of 20 bp was used.
b. The anti-CRISPR AcrIIA2 and AcrIIA4 genes (or C1, C2, and C3) were synthesized de novo with a yeast codon bias and cloned into CEN-based yeast expression vectors under control of the CDC11 promoter using in vivo plasmid assembly.
c. GFP includes the substitutions F64L and S65T.
d. eGFP includes the substitutions F64L, S65T, R88Q, and H239L.
e. The general cloning scheme for AcrIIA2 and AcrIIA4 mutants included the following. First, the prCDC11::AcrIIA2/A4::ADH1(t) fragment was sub-cloned into a TOPO II vector (pCR-Blunt II-TOPO, Life Technologies, Inc.). Second, substitutions were introduced using PCR. Third, the entire construct was sub-cloned to the pRS316 vector using the flanking NotI/SpeI sites.
f. The sgRNA(u1) includes the crRNA (SEQ ID NO: 9).
g. The sgRNA(mCherry) crRNA is in SEQ ID NO: 15.
h. For constructs containing an N-terminal or C-terminal deletion, residues were removed during in vivo plasmid assembly.
Culture Conditions
Yeast were grown on solid or in liquid medium including YPD (2% peptone, 1% yeast extract, 2% dextrose) or synthetic media (nitrogen base, ammonium sulfate, and amino acids). Pre-induction medium included 2% raffinose and 0.2% sucrose. Induction (prGAL1/10) media included 2% galactose. All sugars were filter sterilized.
CRISPR/Cas9-Based Editing
Editing of haploid yeast utilized the mCAL system. Multiplexing of Cas9 was accomplished by programming two DNA sequences flanking a locus of interest with a maximum mismatch to the genome. These artificial targets [u1]/[u2] also included a PAM. Fig. 17A shows the schematic of the yeast Cas9 expression platform at the endogenous HIS3 locus. The Cas9 gene is under control of the inducible GAL1/10 promoter and the locus is marked with the KanR cassette. The entire expression module is flanked by two identical artificial [u2] sites (23 base pairs including the PAM sequence), as previously described. A high-copy LEU2-marked plasmid harbors the sgRNA[u2] cassette whereas a URA3-based plasmid is also present (empty or expressing an anti-CRISPR gene). The S. pyogenes Cas9 gene was synthesized with a yeast codon bias and integrated. Editing was performed by transforming the pRS316-based plasmid (empty or harboring the AcrII gene) into yeast followed by a second transformation event to add the sgRNA plasmid.
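The idea of an artificial 23 bp site (a 20 nt target followed by a 3 nt PAM) with a maximum mismatch to the genome can be sketched in code. The example below is a simplified illustration, not the procedure used to design [u1] or [u2]: it scans only the forward strand, treats NGG as the S. pyogenes PAM, scores candidates by Hamming distance, and uses short made-up sequences; all function names are hypothetical.

```python
def is_valid_target(site: str) -> bool:
    """True for a 23 nt site: 20 nt protospacer followed by an NGG PAM."""
    site = site.upper()
    return len(site) == 23 and set(site) <= set("ACGT") and site[21:] == "GG"


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_mismatches(protospacer: str, genome: str) -> int:
    """Fewest mismatches to any PAM-adjacent site on the forward strand.

    Returns len(protospacer) if the genome contains no NGG-adjacent window.
    A real design would also scan the reverse complement.
    """
    n = len(protospacer)
    best = n
    for i in range(len(genome) - n - 2):
        window = genome[i:i + n + 3]      # protospacer-sized window plus PAM
        if window[n + 1:] == "GG":        # NGG: only the last two bases are fixed
            best = min(best, hamming(protospacer, window[:n]))
    return best


if __name__ == "__main__":
    candidate = "ACGT" * 5                                 # made-up 20 nt target
    toy_genome = "TT" + "ACGT" * 4 + "ACGA" + "CGG" + "TT"  # made-up sequence
    print(is_valid_target(candidate + "GGG"))
    print(min_mismatches(candidate, toy_genome))
```

A candidate whose minimum mismatch count is large across the whole genome is less likely to direct editing at a native site, which is the property the artificial targets rely on.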
Activation of Cas9 was performed according to published protocols. Yeast (GFY-2383) harboring an empty URA3-based plasmid (pRS316) were cultured in pre-induction medium (raffinose/sucrose mixture) overnight at 30°C, back-diluted to an OD600 of approximately 0.35 in rich medium containing galactose (YPGal medium) and grown for 4.5 additional hr. Cells were harvested, transformed with equimolar amounts of the appropriate sgRNA[u2] plasmid (A, pGF-V809) or an empty vector control (B; pRS425), recovered overnight in fresh YPGal medium, and plated onto SD-URA-LEU selection plates. In one sample, the guide RNA plasmid was co-transformed with a PCR fragment (C; WT HIS3 ORF with 1,000 bp of flanking 5' and 3' UTR). The total number of surviving colonies was quantified using a single-blind protocol and sectoring method and graphed on a log10 scale.
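The back-dilution step is a simple proportional calculation. The sketch below assumes that OD600 scales linearly with cell density over the working range; the function name, the overnight OD600, and the culture volume are hypothetical illustration values, not taken from the protocol.

```python
def back_dilution_volume(overnight_od: float, target_od: float,
                         final_volume_ml: float) -> float:
    """Volume (mL) of overnight culture needed to reach `target_od`
    in `final_volume_ml` of fresh medium (C1 * V1 = C2 * V2)."""
    if overnight_od <= 0 or target_od <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_od > overnight_od:
        raise ValueError("cannot dilute up to a higher OD")
    return target_od * final_volume_ml / overnight_od


# Hypothetical example: overnight culture at OD600 3.5, 50 mL of YPGal wanted
# at the stated starting OD600 of approximately 0.35.
volume = back_dilution_volume(overnight_od=3.5, target_od=0.35,
                              final_volume_ml=50.0)
print(f"Add about {volume:.1f} mL of overnight culture")
```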
Gene Drives and Containment
Three scenarios are depicted in Fig. 18A involving (i) a fully active gene drive, (ii) partially active drive activity, and (iii) fully inhibited drive activity due to the presence of the anti-CRISPR proteins. The haploid "gene drive" strain (harboring inducible S. pyogenes Cas9) was built in MATa yeast (GFY-2383) and included (i) a LEU2-based high copy plasmid with the sgRNA[u1] cassette and (ii) the URA3-based CEN plasmid expressing untagged AcrIIA2 or AcrIIA4 (top). Two nearly isogenic "target" strains were built in MATα yeast (GFY-3206 and GFY-3207) containing an artificial target gene and selectable HIS5 marker (from S. pombe). The entire construct was flanked with two [u1] artificial target sites. Middle, the three scenarios are depicted.
For activation of an artificial gene drive system, yeast were first transformed with (i) either an empty vector (pRS316) or plasmid expressing AcrIIA2 (pGF-IVL1336) or AcrIIA4 (pGF-IVL1337) and (ii) the sgRNA[u1] plasmid (pGF-V1220), and mated to the target strains (GFY-3206 and GFY-3207) on rich medium (containing dextrose) for 24 hr at 30°C. Second, diploid yeast were obtained by velvet transfer of all colonies to SD-URA-LEU-HIS medium for two consecutive rounds of selection. Third, cultures of pre-induction medium (raffinose/sucrose lacking leucine and lacking uracil) were grown overnight, back-diluted into YPGal, and cultured for between 0 and 12 hr. Fourth, cells were harvested, washed, and diluted to approximately 500-1000 cells per plate (SD-URA-LEU medium) and grown for 2-3 days. Fifth, yeast were transferred by velvet to an identical plate type and SD-HIS medium for an additional 24 hr incubation before imaging. Representative plates for each time point are illustrated in Fig. 18B.
The total number of surviving colonies was quantified for each plate type in duplicate. "Gene drive activity" was illustrated as the proportion of sampled colonies (n = 100-300 colonies per plate) sensitive to the SD-HIS condition (e.g. 99% of colonies present on SD-URA-LEU plate but absent on the SD-HIS plate corresponds to 99% gene drive activity) (Fig. 18C). Molecular analysis of diploid yeast following gene drive activation is shown in Fig. 18D. Clonal isolates were obtained from the 0 and 12 hr time points (SD-URA-LEU plate), chromosomal DNA was purified, and PCRs were performed on the diploid genomes. Primer combinations and the expected fragment sizes (right) are illustrated in the gene drive schematic. Four representative isolates from each genotype are illustrated; for the gene drive containing AcrIIA2, two isolates displaying no growth on SD-HIS were also tested (red asterisk).
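The "gene drive activity" measure defined above reduces to a proportion of colony counts. The following is a minimal sketch of that arithmetic; the function name and the colony counts are hypothetical illustration values, not data from the study.

```python
from statistics import mean


def gene_drive_activity(colonies_on_ura_leu: int, colonies_on_his: int) -> float:
    """Percent of sampled colonies that grew on SD-URA-LEU but not on SD-HIS."""
    if colonies_on_ura_leu <= 0:
        raise ValueError("need at least one sampled colony")
    if not 0 <= colonies_on_his <= colonies_on_ura_leu:
        raise ValueError("SD-HIS count must lie between 0 and the sampled count")
    sensitive = colonies_on_ura_leu - colonies_on_his
    return 100.0 * sensitive / colonies_on_ura_leu


# Hypothetical plate: 200 colonies sampled, 2 of them still grow on SD-HIS.
print(gene_drive_activity(200, 2))  # 99.0

# Duplicate plates (hypothetical counts) are averaged to give one value.
duplicates = [gene_drive_activity(200, 2), gene_drive_activity(150, 3)]
print(mean(duplicates))
```

A value near 100 corresponds to an active drive (the HIS5 marker has been lost from nearly all diploids), and a value near 0 corresponds to full inhibition.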
Numerous safeguards were implemented to ensure contained use of all yeast strains harboring gene drives. Briefly, these included (i) use of the artificial sequences at the HIS3 locus, (ii) flanking [u2] sites at the drive cassette itself providing the option for rapid self-excision and removal, (iii) placement of the sgRNA expression module on an unstable high-copy (2μ) vector, (iv) poor sporulation of the S. cerevisiae BY4741/BY4742 background, (v) constant repression of Cas9 expression until required, and (vi) careful inactivation of all yeast strains (cultures, plates, and consumables) by autoclaving.
Fluorescence Microscopy, Imaging, and Graphics
Yeast were grown overnight in a pre-induction culture, back-diluted into medium containing galactose for 4.5 hr, washed, and prepared on a microscope slide. Cells were imaged using a Leica DMI6500 fluorescence microscope (Leica Microsystems Inc., Buffalo Grove, IL) with a 100x lens, and fluorescence filters (Semrock, GFP-4050B-LDKM-ZERO, mCherry-C-LDMK-ZERO). A Leica DFC340 FX camera, Leica Microsystems Application Suite software, and ImageJ (National Institutes of Health) were used. All images were obtained using identical exposure times and were rescaled together. The "merged" images do not contain any additional processing. Representative cells were chosen for each image. Quantification of the plasma membrane (maximum) pixel value was done by measuring the cell periphery with a line tool; the cytosol (mean) pixel value was obtained using a line tool. Ten independent measurements were made per cell. Samples were analyzed in a single-blind fashion.
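The per-cell quantification can be summarized as a ratio of plasma membrane signal to cytosolic signal. The sketch below assumes that the ten line-tool measurements per compartment are averaged before taking the ratio; the text does not state how the ten values were combined, so that step, the function name, and the pixel values are all hypothetical.

```python
from statistics import mean
from typing import Sequence


def pm_to_cytosol_ratio(pm_max_values: Sequence[float],
                        cytosol_mean_values: Sequence[float]) -> float:
    """Ratio of averaged plasma membrane (maximum) pixel values to
    averaged cytosol (mean) pixel values for a single cell."""
    if not pm_max_values or not cytosol_mean_values:
        raise ValueError("need at least one measurement per compartment")
    cytosol = mean(cytosol_mean_values)
    if cytosol <= 0:
        raise ValueError("cytosolic signal must be positive")
    return mean(pm_max_values) / cytosol


# Hypothetical pixel values for one cell (ten line-tool measurements each).
pm = [210, 198, 205, 190, 202, 215, 199, 207, 193, 201]
cyt = [100, 104, 98, 101, 97, 103, 99, 102, 100, 96]
print(round(pm_to_cytosol_ratio(pm, cyt), 2))
```

A ratio close to 1 would indicate no enrichment at the cell periphery, whereas larger values indicate recruitment of the GFP signal to the plasma membrane.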
The anti-CRISPR genes AcrIIA2 and AcrIIA4 were cloned (pGF-IVL1384 to pGF-IVL1387) under control of the CDC11 promoter on CEN-based plasmids, tagged with GFP at either their N- or C-terminus, transformed into WT yeast (BY4741), and imaged by fluorescence microscopy (Fig. 19A). White dotted outline, cell periphery. Scale bar, 3 μm. Representative cells are illustrated (left). An average pixel intensity (cytosol) was measured for individual cells (right, n = 30-75 cells per genotype). Error, SD.
Molecular graphics were performed with the Univ. of California, San Francisco Chimera package (Resource for Biocomputing, Visualization, and Informatics).
RESULTS
Cas9-based editing in haploid yeast
Given the enormous utility and versatility of the CRISPR/Cas editing system across many fields of scientific inquiry, we have developed a programmable, artificial editing system in budding yeast for use in analyzing genome editing (haploids) and gene drives (diploids). This system allows for rapid exploration of conserved CRISPR components ranging from guide RNA identities to Cas9 subcellular localization(s). Our system utilizes the presence of artificially programmed sites that contain a maximum mismatch to the yeast genome and are positioned flanking the introduced Cas9 expression cassette and target locus of interest (Fig. 17A). In this way, (i) off-target effects are virtually eliminated, (ii) multiplexing at identical targets requires only a single guide RNA, and (iii) gene drive containment and security are maximized. Here, we modified our in vivo editing assay to assess whether S. pyogenes Cas9 could efficiently edit the yeast genome (Figs. 17A & 17B). Yeast codon optimized Cas9 was integrated under control of an inducible GAL1/10 promoter and flanked by two identical [u2] sites including the PAM sequence 5'-NGG-3'. Targeting of the dual identical [u2] sites causes full excision of the Cas9 cassette and KanR marker. Survival of the yeast cell requires non-homologous end joining (NHEJ) to repair the double-strand break (DSB) in the absence of any provided donor DNA. However, given that exact repair of the dual [u2] sites allows for the formation of a new [u2] site (Fig. 17A), subsequent editing will cause cell death unless a sequence modification occurs to disrupt the target site. We have previously demonstrated that successful editing with S. pyogenes Cas9 causes cell inviability. In our haploid editing assay, plasmids expressing the sgRNA cassette were transformed into yeast (selecting only for the presence of the plasmid marker). An identical protocol was performed by co-transforming the guide plasmid and a PCR fragment providing donor DNA for homologous recombination (Figs. 17B, 20).
Fig. 20 shows example editing plates (S. pyogenes) from Figs. 17A & 17B. Sample SD-URA-LEU (repair via NHEJ) and SD-HIS (repair via HDR) plates are illustrated after 3-5 days of incubation at 30°C. Empty vector, pRS425. Plasmid harboring sgRNA[u2], pGF-V809. HIS3 PCR includes approximately 1,000 bp of 5' and 3' UTR. The total number of surviving colonies demonstrated that editing was very efficient with S. pyogenes Cas9.
Our aim was to provide a controlled setting to test for inhibition of Cas9 editing in vivo by the newly identified class of anti-CRISPR proteins. These proteins evolved within bacteriophages to counteract the enzymatic function of the Cas9 nuclease. The application of such a naturally occurring counter to the CRISPR editing system could have great potential in many areas of current investigation. We synthesized two anti-CRISPR proteins, AcrIIA2 and AcrIIA4, with a yeast codon bias and expressed them in vivo to address whether they could efficiently inhibit the editing function of Cas9. Expression of AcrIIA2 or AcrIIA4 tagged with GFP and under control of a modest promoter element (CDC11) provided detectable levels of expression within both the cytosol and nuclei of WT cells (Fig. 19A). In Fig. 19A, the white dotted outline is the cell periphery, and the scale bar is 3 μm. Representative cells are illustrated (left). An average pixel intensity (cytosol) was measured for individual cells (right, n = 30-75 cells per genotype). We observed a higher steady-state level of AcrIIA4 protein in multiple experiments compared to AcrIIA2. GFP-tagged and untagged versions of the AcrIIA2 and AcrIIA4 proteins were expressed in haploid yeast containing SpCas9 and were transformed with the sgRNA plasmid. Fig. 19B shows select comparisons between experimental conditions (left) that were analyzed using an unpaired t-test. The haploid yeast strain harboring S. pyogenes Cas9 (GFY-2383) was first transformed with plasmids expressing either GFP-tagged (A) or untagged (pGF-IVL1336 and pGF-IVL1337) AcrIIA2 and AcrIIA4 constructs. Second, following induction of Cas9 in galactose, yeast were transformed with equimolar amounts of sgRNA[u2]-containing plasmid (pGF-V809), and the total number of colonies was quantified on SD-URA-LEU plates in triplicate. Error, SD. Red text highlights p-values greater than 0.05. We found significant inhibition of Cas9 editing in cells expressing either WT AcrIIA2 or AcrIIA4; however, tagging of AcrIIA2 with GFP at either the N- or C-terminus impaired its ability to inhibit editing, distinct from AcrIIA4, which could tolerate the presence of full-length GFP.
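For readers unfamiliar with the comparison, an unpaired t-test on triplicate colony counts can be sketched as follows. This is an illustration under stated assumptions: it uses Student's pooled-variance form (the text does not say whether equal variances were assumed), it returns only the t statistic and degrees of freedom rather than a p-value, and the function name and counts are hypothetical placeholders rather than measured data.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def unpaired_t(sample_a: Sequence[float],
               sample_b: Sequence[float]) -> Tuple[float, int]:
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    df = n_a + n_b - 2
    # Pooled estimate of the common variance across both samples.
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("both samples have zero variance")
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Placeholder colony counts for two conditions plated in triplicate.
empty_vector = [12, 15, 9]
with_inhibitor = [950, 1020, 880]
t_stat, dof = unpaired_t(empty_vector, with_inhibitor)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The p-value would then be read from a t distribution with the returned degrees of freedom, for example with a statistics package.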
In order to determine which residue(s) or motifs within AcrIIA2 and AcrIIA4 were necessary and/or sufficient for their inhibitory function, we constructed a series of small deletions from either the N- or C-terminus of each protein, such that 10 or 20 residues were removed from the A2 and A4 N- or C-termini and expressed in vivo on a plasmid under the CDC11 promoter (pGF-IVL1388 to pGF-IVL1395). Vectors were transformed into yeast harboring an inducible S. pyogenes Cas9 expression cassette (GFY-2383). Editing in haploid yeast was performed using the sgRNA[u2] vector (pGF-V809). Yeast were plated onto SD-URA-LEU media and the total number of surviving cells was quantified in triplicate. Error, SD. Deletion of only ten residues from either terminus was sufficient to destroy the inhibitory function, as shown in Fig. 21.
Therefore, we performed an unbiased alanine scan across both anti-CRISPR proteins.
Pairs of adjacent residues were chosen as well as several combinations of acidic residues (as both proteins contain many Asp and Glu amino acids). Editing in haploid yeast was performed on all alanine mutants with an identical guide RNA. The AcrIIA2 protein was mutated using pairs of alanine substitutions; red text highlights all amino acids included in the mutation analysis (top). Yeast (GFY-2383) were first transformed with all URA3-based plasmids: (i) empty pRS316, (ii) WT AcrIIA2 (pGF-IVL1336), or (iii) mutant AcrIIA2 (pGF-V1399 to pGF-V1420). Second, editing of haploid yeast was performed as previously described. Briefly, following induction of Cas9, sgRNA[u2] plasmid (pGF-V809) was transformed, recovered overnight, and plated to SD-URA-LEU medium. The total number of surviving colonies was quantified in triplicate. Error, SD. A similar mutational analysis was performed on the A4 protein as in (A). Plasmids included: (i) empty pRS316, (ii) WT A4 (pGF-IVL1337), and (iii) mutant A4 (pGF-V1421 to pGF-V1439). For AcrIIA2, 6/22 mutants caused a total loss of inhibitory function, 2/22 mutants caused an intermediate level of inhibition, and 14/22 had little to no effect on editing (Fig. 22A). For AcrIIA4, 2/19 mutants caused a loss of inhibition, 3/19 displayed an intermediate range, and 14/19 had little to no effect on protein function (Fig. 22B). These data demonstrate that alteration of the AcrIIA2/A4 primary sequence may provide a means to titrate inhibition of SpCas9 in vivo.
Inhibition of nuclease-based gene drives
One application of Cas9 inhibition that has been previously proposed is to halt the action and progression of a nuclease-based gene drive. Given the potential for this super-Mendelian drive of a genetic element through a population, we tested whether AcrIIA2 and AcrIIA4 could inhibit a gene drive in budding yeast. Design of our gene drive system includes use of flanking [u1] sites, an artificial target gene, and a selectable marker (S. pombe HIS5) to assess "success" of the drive following activation of Cas9 (Fig. 18A). The plasmids containing AcrIIA2 or AcrIIA4 and the sgRNA plasmid were transformed into yeast prior to activation of Cas9.
Following mating, diploid selection, and induction of Cas9, we quantified success of the active gene drive within a population of yeast by scoring loss of the genetic marker within the target genome (Figs. 18B and 18C). "Gene drive activity" was illustrated as the proportion of sampled colonies (n = 100-300 colonies per plate) sensitive to the SD-HIS condition (e.g. 99% of colonies present on SD-URA-LEU plate but absent on the SD-HIS plate corresponds to 99% gene drive activity). An active drive system results in >99% activity following expression of Cas9. However, inclusion of either AcrIIA2 or AcrIIA4 in the same strain caused a near total loss of gene drive activity (>99.9% inhibition by AcrIIA4). Molecular analysis of diploid yeast following gene drive activation was based upon clonal isolates obtained from the 0 and 12 hr time points (SD-URA-LEU plate). Chromosomal DNA was purified, and PCRs were performed on the diploid genomes. Primer combinations and the expected fragment sizes (right) are illustrated in the gene drive schematic (Fig. 18A). Four representative isolates from each genotype are illustrated; for the gene drive containing AcrIIA2, two isolates displaying no growth on SD-HIS were also tested (red asterisk). Characterization of samples of surviving diploids demonstrated that strains expressing the anti-CRISPR proteins maintained copies of both the gene drive and target cassettes whereas diploids following action of the drive only contain two identical copies of the gene drive allele (Fig. 18D).
We also tested the inhibitory function of all AcrIIA2 and AcrIIA4 mutants in the context of this drive system. Vectors expressing AcrIIA2 mutants (Fig. 22A, red text) and the sgRNA[u1] plasmid (pGF-V1220) were transformed into yeast along with controls (2x, empty pRS316, pRS425; 1x, empty pRS316, sgRNA[u1] plasmid). Gene drive strains were mated to the target strains (GFY-3206 and GFY-3207) and the diploid strains were induced in rich media containing galactose for 5 hr before plating. The percentage of colonies displaying an active drive system was quantified in duplicate. Error, SD. Vectors expressing AcrIIA4 mutants (Fig. 22B, red text) and the sgRNA[u1] plasmid were also analyzed. Additional mutants not previously tested were also included (pGF-V1470 to pGF-V1485 and pGF-V1534 to pGF-V1536). Blue asterisks, residues tested by previous groups shown to contribute to AcrIIA4 association with sgRNA-loaded Cas9. Red asterisks, mutational substitutions displaying a partial loss of inhibitory function not previously documented. We observed an inverse correlation between anti-CRISPR inhibitory function and gene drive activity (Figs. 22A & 22B, Figs. 23A and 23B). We also included additional mutations (single and double residue changes) and tested their contribution to Cas9 gene drive inhibition (Fig. 23B, Table 6). Of note, our mutational analysis is consistent with previous work that highlighted AcrIIA4 residues N39, E40, Y67, D69, and E70 as critical for association with Cas9 in in vitro binding and nuclease-dependent DNA cleavage assays.
Mutational substitution of these critical residues (individually and in combination) was shown to reduce or eliminate binding and prevent AcrIIA4 inhibition of Cas9. Here, we provide in vivo evidence that mutation of these residues (to alanine or arginine) largely decreases the effectiveness of AcrIIA4 as a Cas9/gene drive inhibitor. While the strongest single mutations included N39R and E70R, we have demonstrated further decrease in binding (and function) by combinatorial substitutions (Fig. 23B, Table 6).
Table 6: Comparison of AcrIIA4 substitution mutants: effects on Cas9 binding, in vitro inhibition of SpCas9, and in vivo inhibition of a Cas9-based gene drive.
[Table 6 is provided as an image (imgf000074_0001) in the source. The only row recoverable as text: L82, K83, S84, E85, L86, N87 all show no effect on inhibition.]
aAcrIIA4 mutant substitutions are listed (in general) from residue 1 to 87. Groupings of more than one residue are for clarity: some combinations were only tested in combinations with other residues. Several residues are found within multiple categories if they occur as part of double/triple/quadruple substitutions.
bThe in vivo assay used in this study to quantify AcrIIA4 inhibitory function against S. pyogenes Cas9 was the halting of an active gene drive in diploid yeast after 5 hr of nuclease induction.
cFor several double mutant combinations, we tested the sgRNA-loaded dCas9/AcrIIA4 association by an in vivo fluorescence localization assay. All three mutants displayed a total loss of Cas9 association as determined by a lack of PM-localization.
Our analysis also revealed that residues Y15, F73, and M77 provided an intermediate reduction of inhibition that might be further exploited to titrate the level of Cas9 activity (Fig. 23B). Surprisingly, these three residues are found in close proximity within the AcrIIA4 structure forming part of the hydrophobic interior and do not appear to contact Cas9, based upon the crystal structure of the AcrIIA4 protein bound to Cas9/sgRNA (PDB 5XBL) (Fig. 23C). Cyan-labeled residues (and side chains) are residues previously demonstrated to be critical for AcrIIA4 association with Cas9. Yellow-labeled residues (red text) illustrate three substitutions found to partially reduce AcrIIA4 inhibitory function in vivo. Right, side views of AcrIIA4 with a 180° rotation.
For three of the most potent mutant combinations resulting in a loss of inhibition by AcrIIA4 (E70A/E71A, E40A/Y41A, and D14A/Y15A), we developed an in vivo dCas9 association assay (Figs. 24A & 24B). A yeast strain was constructed harboring an inducible dCas9 (D10A H840A) fused to both mCherry and LactC2 at its C-terminus (GFY-3104) and transformed with (i) a sgRNA[u1] plasmid (pGF-V1220) and (ii) GFP-tagged AcrIIA2/A4-containing plasmids (pGF-IVL1384 to pGF-IVL1387). Strains were cultured overnight in raffinose/sucrose medium lacking uracil and lacking leucine, back-diluted into synthetic medium containing galactose also lacking uracil and leucine, and grown for 4.5 hr at 30°C. Cells were harvested, washed with water, and imaged by fluorescence microscopy. Representative images are displayed in Fig. 24A; scale bar, 3 μm. A membrane-tethered (via a LactC2 domain and lacking any NLS sequence), mCherry-tagged dCas9 primed with a (nonsense) sgRNA was co-expressed with WT AcrIIA2 or AcrIIA4 in yeast. For the GFP signal, the ratio of the maximum pixel intensity on the plasma membrane to the average cytosolic pixel intensity was measured (n = 15-30 cells per genotype). Ten random individual measurements were taken for both the plasma membrane and cytosolic levels per cell. Error, SD. Three additional AcrIIA4 constructs were tested (asterisks) containing sets of two alanine substitutions (pGF-IVL1431 to pGF-IVL1433). Bottom, select strains were compared using an unpaired t-test. Red text, p-values greater than 0.05. We observed recruitment of the GFP-tagged AcrIIA4 protein to the plasma membrane; this co-localization was (i) dependent on the presence of dCas9 (Fig. 25), (ii) independent of GFP interacting with the dCas9 construct (our unpublished data), and (iii) dependent on a fully functional AcrIIA4 protein as all three alanine mutants failed to localize to the plasma membrane (Fig. 24B).
As shown in Fig. 25, GFP-tagged AcrIIA2 and AcrIIA4 proteins are not recruited to the plasma membrane by mCherry or the LactC2 domain. A strain (GFY-3268) was created which included an mCherry-LactC2 fusion under control of the GAL1/10 promoter at the HIS3 locus. Plasmids expressing (i) C-terminally tagged AcrIIA2 or AcrIIA4 (pGF-IVL1386 and pGF-IVL1387) and (ii) sgRNA[u1] (pGF-IVL1220) were transformed into yeast. Cultures were pre-induced overnight in raffinose/sucrose lacking uracil and leucine, back-diluted to medium containing galactose and lacking uracil and leucine, and were grown for 4.5 hr at 30°C. Cells were harvested, washed with water, and imaged by fluorescence microscopy. Representative images are shown. Scale bar, 3 μm.
Finally, we developed a gene drive system harboring an inducible AcrIIA2/A4 within the drive cassette. Fig. 26A shows a schematic of the gene drive system harboring an inducible AcrIIA2 or AcrIIA4 inhibitor within the cassette. Modifications to the previously tested gene drive (Figs. 18A & 18B, 23A & 23B) included integration of a second inducible expression cassette (MET25 promoter) proximal to the S. pyogenes gene drive system controlling expression of either AcrIIA2 or AcrIIA4. The sgRNA targeted the mCherry gene within the target strains. While SpCas9 expression was activated in the presence of galactose, AcrIIA2/A4 expression was induced by lack of methionine (MET25 promoter). Several experimental conditions were tested including co-expression of both Cas9 and the inhibitor, expression of only one component prior to the other, and relevant controls (Fig. 26B). The two gene drive strains (GFY-3285 and GFY-3287) were transformed with the sgRNA(mCh) plasmid (pGF-425+IVL1277), mated to the target strains (GFY-3206 and GFY-3207), and diploids were selected on SD-LEU-HIS plates (three consecutive times). Diploids were then cultured overnight in pre-induction medium (raffinose/sucrose lacking leucine and containing methionine). Five distinct growth conditions were tested (labeled 1-5) altering the order of either Cas9 induction, AcrIIA2/A4 induction, or control conditions. Cas9 induction included culturing in medium containing galactose. Inhibition (asterisk) of Cas9 expression included use of the raffinose/sucrose mixture. Activation of AcrIIA2/A4 included culturing in medium lacking methionine (repression by addition of methionine). All culturing steps also lacked leucine to maintain the sgRNA(mCh) plasmid. For conditions (1) and (5), a media switch occurred: after 2 hr, yeast were harvested, washed 4 times with water, and resuspended in the new media type for an additional 2 hr.
All diploids were plated on SD-LEU medium as previously described (500-1000 cells per plate).
A recent study had also provided data that inhibition of Cas9 editing was sensitive to the timing of delivery/presence of the inhibitor protein. Here, we observed that drive inhibition can be titrated to various levels of activity (including full inhibition) based on (i) the choice of either AcrIIA2 or AcrIIA4 and (ii) the timing of anti-CRISPR expression. While the MET25 promoter does allow for some leaked transcript of AcrIIA2/A4 (Fig. 26B, condition 3), our design demonstrates that a gene drive could be programmed to halt or titrate drive activity depending on the intended application and choice of inducible promoter systems.
DISCUSSION
Our study has demonstrated that the anti-CRISPR proteins AcrIIA2 and AcrIIA4 are able to inhibit the function of S. pyogenes Cas9 in vivo within haploid yeast and an active gene drive system. We found that epitopes or gene fusions were not tolerated on either terminus of AcrIIA2, in contrast to AcrIIA4. Therefore, we performed an extensive mutational scan across both proteins to (i) determine which residues were necessary for function and (ii) determine whether specific substitutions might provide an intermediate level of Cas9 inhibition. Based on the recent AcrIIA4 crystal structure, it is evident that inhibition of SpCas9 occurs through mimicry of the PAM-binding motif within the nuclease. Several critical residues on AcrIIA4 were shown to be necessary for binding of Cas9 and inhibition of function in vitro. Our approach has confirmed these findings, revealed additional residues necessary for function within AcrIIA2/A4, and identified several positions that provided a partial loss of function. Use of a tagged or mutated AcrIIA4 protein coupled with temporal control over expression (as demonstrated with an inducible promoter) could provide an expanded suite of options for control of Cas9 function in vivo, and especially as a potent gene drive inhibitor.
The anti-CRISPR proteins provide several advantages over other (current or proposed) methods of Cas9 inhibition. First, the inhibitor is a separate peptide that does not require translational fusion to Cas9/dCas9, so the timing and level of inhibitor expression can be controlled separately from the nuclease. We suspect that regulation of subcellular location (of both Cas9 and the inhibitor) may provide even more options to enhance or restrict interaction with the Cas9/guide RNA complex. Second, this class of proteins is extremely small (87 residues for AcrIIA4) and, in some cases, can tolerate tags (GFP) three times their size and still function to inhibit Cas9. Third, inhibition appears to be titratable based on the amount of AcrII protein present and the inclusion of specific amino acid substitutions. Fourth, these proteins have evolved as a natural DNA mimic and may inform further design and/or optimization of new classes of peptides or small molecules. Our study has provided the first documented use of the anti-CRISPR proteins AcrIIA2 and AcrIIA4 to inhibit a gene drive system. Future nuclease-based gene drives could include an inducible drive inhibitor within the original cassette to provide a useful off-switch for the system, control the timing of drive activation, or halt propagation of a current drive element while a second drive replaces or destroys the first.

Claims

CLAIMS:
1. A modified CRISPR-Cas gene drive system configured for integration into a diploid eukaryotic cell genome at a target site, comprising a gene drive construct comprising
a first nucleotide sequence encoding a single guide RNA sequence complementary to said target site;
a second nucleotide sequence encoding for a functional CRISPR nuclease that induces a double-stranded break in or near said target site; and
a pair of flanking sequences homologous to sequences adjacent said target site of integration, wherein said first and second nucleotides are located between said pair of flanking sequences in said construct;
wherein said system further comprises one or more modifications to thereby inhibit activity of said gene drive in said cell, said modifications being selected from the group consisting of:
a regulatory element operably linked to said second nucleotide sequence, wherein said regulatory element reduces expression of said functional CRISPR nuclease in said cell;
one or more base pair mismatches in said single guide RNA to reduce specificity of said system to said target site;
one or more nuclear export signal sequences to reduce accumulation of said functional CRISPR nuclease in the nucleus of said cell; and
a secondary CRISPR nuclease expressed in said cell that competes with said functional CRISPR nuclease for binding to said target site.
2. The modified CRISPR-Cas gene drive system of claim 1, wherein said system has at least 5% lower activity as compared to a system without one or more of said modifications.
3. The modified CRISPR-Cas gene drive system of claim 1, further comprising a double stranded DNA donor sequence for incorporation into the eukaryotic cell.
4. The modified CRISPR-Cas gene drive system of claim 1, wherein said single guide RNA has a single mismatch at the 5' end of the sequence.
5. The modified CRISPR-Cas gene drive system of claim 1, further comprising one or more nuclear localization signal sequences.
6. The modified CRISPR-Cas gene drive system of claim 5, wherein the ratio of nuclear localization signals to nuclear export signals in said system ranges from 1:1 to 3:1.
7. The modified CRISPR-Cas gene drive system of claim 5, wherein said second nucleotide sequence comprises said nuclear localization signal sequence and/or nuclear export signal sequence at the C- or N-terminus of said functional CRISPR nuclease sequence.
8. The modified CRISPR-Cas gene drive system of claim 1, wherein said functional CRISPR nuclease is Cas9.
9. The modified CRISPR-Cas gene drive system of claim 8, wherein said Cas9 is selected from the group consisting of S. pyogenes, S. aureus, S. pneumoniae, S. thermophilus, N. meningitidis Cas9 nucleases, and functional Cas9 orthologs and mutant Cas9 derived from these organisms.
10. The modified CRISPR-Cas gene drive system of claim 1, wherein said secondary CRISPR nuclease is dead Cas9.
11. The modified CRISPR-Cas gene drive system of claim 10, wherein said dead Cas9 comprises a mutation selected from the group consisting of D10A, H840A, and combinations thereof.
12. The modified CRISPR-Cas gene drive system of claim 1, wherein said second nucleotide sequence encodes for a fusion protein of said functional CRISPR nuclease and secondary CRISPR nuclease.
13. The modified CRISPR-Cas gene drive system of claim 12, wherein said secondary CRISPR nuclease in said fusion protein is dead Cas9, wherein activity of said system is less than 70% as compared to a system without said modification.
14. The modified CRISPR-Cas gene drive system of claim 12, wherein said secondary CRISPR nuclease in said fusion protein is functional Cas9, wherein activity of said system is less than 90% as compared to a system without said modification.
15. The modified CRISPR-Cas gene drive system of claim 1, said system further comprising one or more vectors comprising sequences encoding for one or more of AcrIIA2 (SEQ ID NO:2) and AcrIIA4 (SEQ ID NO:3), or functional mutants thereof, for expression in the eukaryotic cell.
16. A eukaryotic host cell comprising a modified CRISPR-Cas gene drive system according to any one of claims 1-15 integrated into its genome.
17. An organism comprising the eukaryotic host cell of claim 16.
18. A method of integrating a gene drive into a eukaryotic cell genome at a target site, said method comprising:
introducing into said eukaryotic cell a modified CRISPR-Cas gene drive system according to any one of claims 1-15,
wherein said construct is expressed in said cell to produce said functional CRISPR nuclease and single guide RNA, which co-localize in said cell at said target site on a first chromosome, said CRISPR nuclease inducing a double-stranded break at said site, wherein homology-directed repair mediated by said flanking sequences integrates said gene drive construct into said target site.
19. The method of claim 18, wherein said integrated gene drive construct is expressed in said cell to produce a functional CRISPR nuclease and single guide RNA, which co-localize in said cell at a homologous target site on a second chromosome, said CRISPR nuclease induces a double-stranded break at said homologous target site, wherein homology-directed repair uses said integrated gene drive construct-containing first chromosome as a template to thereby integrate a copy of said gene drive construct into said second chromosome yielding a chromosome pair that is homozygous for said gene drive construct, wherein the effectiveness of integrating said copy of the said gene drive construct into said second chromosome is decreased by at least 5% as compared to a gene drive construct without one or more of said modifications.
20. A method of altering genetic sequences of a genome at a target site in a target population of a sexually-reproducing species comprising:
integrating a modified CRISPR-Cas gene drive system according to any one of claims 1-15 into a genome of a member of said population,
wherein said system has at least 5% lower effectiveness in integrating into a genome of subsequent members of said target population as compared to a system without one or more of said modifications.
21. A modified CRISPR-Cas gene editing system for alteration of genetic sequences in a eukaryotic cell containing a DNA molecule having a target sequence and encoding a gene product, said system comprising one or more vectors comprising:
a first nucleotide sequence encoding a single guide RNA sequence complementary to said target sequence in said eukaryotic cell; and
a second nucleotide sequence encoding for a functional CRISPR nuclease that induces a double-stranded break in or near said target sequence thereby altering said genetic sequence,
wherein said system further comprises one or more modifications to thereby inhibit said alteration of said genetic sequences in said cell, said modifications being selected from the group consisting of:
a regulatory element operably linked to said second nucleotide sequence, wherein said regulatory element reduces expression of said functional CRISPR nuclease in said cell;
one or more base pair mismatches in said single guide RNA to reduce specificity of said system to said target sequence;
one or more nuclear export signal sequences to reduce accumulation of said functional CRISPR nuclease in the nucleus of said cell; and
a secondary CRISPR nuclease expressed in said cell that competes with said functional CRISPR nuclease for binding to said target sequence.
22. The modified CRISPR-Cas gene editing system of claim 21, wherein said system has at least 5% lower activity as compared to a system without one or more of said modifications.
23. The modified CRISPR-Cas gene editing system of claim 21, further comprising a double stranded DNA donor sequence for incorporation into the eukaryotic cell.
24. The modified CRISPR-Cas gene editing system of claim 21, wherein said single guide RNA has a single mismatch at the 5' end of the sequence.
25. The modified CRISPR-Cas gene editing system of claim 21, further comprising one or more nuclear localization signal sequences.
26. The modified CRISPR-Cas gene editing system of claim 25, wherein the ratio of nuclear localization signals to nuclear export signals in said system ranges from 1:1 to 3:1.
27. The modified CRISPR-Cas gene editing system of claim 25, wherein said second nucleotide sequence comprises said nuclear localization signal sequence and/or nuclear export signal sequence at the C- or N-terminus of said functional CRISPR nuclease sequence.
28. The modified CRISPR-Cas gene editing system of claim 21, wherein said functional CRISPR nuclease is Cas9 or functional variants thereof.
29. The modified CRISPR-Cas gene editing system of claim 28, wherein said Cas9 is selected from the group consisting of S. pyogenes, S. aureus, S. pneumoniae, S. thermophilus, N. meningitidis Cas9 nucleases, and functional Cas9 orthologs and mutant Cas9 derived from these organisms.
30. The modified CRISPR-Cas gene editing system of claim 21, wherein said secondary CRISPR nuclease is dead Cas9.
31. The modified CRISPR-Cas gene editing system of claim 30, wherein said dead Cas9 comprises a mutation selected from the group consisting of D10A, H840A, and combinations thereof.
32. The modified CRISPR-Cas gene editing system of claim 21, wherein said second nucleotide sequence encodes for a fusion protein of said functional CRISPR nuclease and secondary CRISPR nuclease.
33. The modified CRISPR-Cas gene editing system of claim 32, wherein said secondary CRISPR nuclease in said fusion protein is dead Cas9, wherein effectiveness of said system in altering said genetic sequence is less than 70% as compared to a system without said modification.
34. The modified CRISPR-Cas gene editing system of claim 32, wherein said secondary CRISPR nuclease in said fusion protein is functional Cas9, wherein effectiveness of said system in altering said genetic sequence is less than 90% as compared to a system without said modification.
35. The modified CRISPR-Cas gene editing system of claim 21, said system further comprising sequences encoding for one or more of AcrIIA2 (SEQ ID NO:2) and AcrIIA4 (SEQ ID NO:3) for expression in the eukaryotic cell.
36. A eukaryotic host cell comprising the system according to any one of claims 21-35.
37. An organism comprising the eukaryotic host cell of claim 36.
38. A method of altering a genetic sequence in a eukaryotic cell containing a DNA molecule having a target sequence and encoding a gene product, said method comprising:
introducing or expressing in said eukaryotic cell a modified CRISPR-Cas gene editing system according to any one of claims 21-35.
39. A method of inhibiting a CRISPR-Cas gene editing system or gene drive system, wherein said system has been introduced into a eukaryotic cell, said method comprising introducing or expressing an anti-CRISPR protein comprising at least a portion of an amino acid sequence selected from the group consisting of AcrIIA2 (SEQ ID NO:2), AcrIIA4 (SEQ ID NO:3), functional mutants thereof, and combinations thereof into the eukaryotic cell.
40. The method of claim 39, comprising introducing one or more vectors comprising sequences encoding for said one or more of AcrIIA2 (SEQ ID NO:2) and AcrIIA4 (SEQ ID NO:3) for expression in the eukaryotic cell.
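
Claims 19 and 20 recite a drive whose copying into the second chromosome, or into later members of the population, is at least 5% less effective than an unmodified drive. As an illustration only, and not part of the claims or the disclosure, the effect of such a reduction on spread through a randomly mating population can be sketched with a textbook deterministic model. In that model a heterozygote converts its wild-type allele to the drive allele with probability c and the drive carries no fitness cost, so the drive-allele frequency updates as q' = q + c*q*(1 - q). The function names, the starting frequency, the chosen conversion values, and the no-fitness-cost assumption are all hypothetical choices made for this example.

```python
def next_frequency(q, conversion):
    """One generation of random mating for a homing drive with no fitness cost.

    q          -- current drive-allele frequency (0..1)
    conversion -- assumed probability that a heterozygote's wild-type
                  allele is converted to the drive allele (0..1)
    """
    if not 0.0 <= q <= 1.0 or not 0.0 <= conversion <= 1.0:
        raise ValueError("q and conversion must lie in [0, 1]")
    # Heterozygotes (frequency 2q(1-q)) transmit the drive with
    # probability (1 + conversion) / 2; summing with homozygotes (q^2)
    # simplifies to the expression below. Clamp to guard against rounding.
    return min(1.0, q + conversion * q * (1.0 - q))


def generations_to_reach(threshold, q0, conversion, max_generations=1000):
    """Count generations until the drive-allele frequency reaches threshold."""
    q = q0
    for generation in range(max_generations + 1):
        if q >= threshold:
            return generation
        q = next_frequency(q, conversion)
    return None  # threshold not reached within max_generations


if __name__ == "__main__":
    # Hypothetical comparison: an unmodified drive (conversion = 1.0)
    # versus drives whose conversion has been reduced by modifications.
    for conversion in (1.0, 0.95, 0.70, 0.30):
        n = generations_to_reach(0.99, q0=0.01, conversion=conversion)
        print(f"conversion={conversion:.2f}: {n} generations to 99% frequency")
```

Under these assumptions, lowering the conversion probability slows the drive's spread without stopping it. With a conversion probability of zero the allele frequency stays at its starting value, which is ordinary Mendelian inheritance.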
PCT/US2018/016231 2017-09-29 2018-01-31 Programmed modulation of crispr/cas9 activity WO2019067011A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762565651P 2017-09-29 2017-09-29
US62/565,651 2017-09-29

Publications (1)

Publication Number Publication Date
WO2019067011A1 (en) 2019-04-04

Family

ID=65902657

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/016231 WO2019067011A1 (en) 2017-09-29 2018-01-31 Programmed modulation of crispr/cas9 activity

Country Status (1)

Country Link
WO (1) WO2019067011A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160333376A1 (en) * 2014-01-08 2016-11-17 President And Fellows Of Harvard College RNA-Guided Gene Drives
WO2017049266A2 (en) * 2015-09-18 2017-03-23 The Regents Of The University Of California Methods for autocatalytic genome editing and neutralizing autocatalytic genome editing and compositions thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ROGGENKAMP ET AL.: "CRISPR-UnLOCK: Multipurpose Cas9-Based Strategies for Conversion of Yeast Libraries and Strains", FRONT MICROBIOL., vol. 8, 20 September 2017 (2017-09-20), pages 1 - 24, XP055586064 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3766968A1 (en) * 2019-07-16 2021-01-20 Deutsches Krebsforschungszentrum Improving cas nuclease target specificity
WO2021009247A1 (en) * 2019-07-16 2021-01-21 Deutsches Krebsforschungszentrum Improving cas nuclease target specificity
CN114302960A (en) * 2019-07-16 2022-04-08 德国癌症研究中心 Increasing Cas nuclease target specificity
WO2021021677A1 (en) * 2019-07-26 2021-02-04 The Regents Of The University Of California Control of mammalian gene dosage using crispr
WO2021089828A1 (en) 2019-11-08 2021-05-14 Georg-August-Universitaet Goettingen Stiftung Oeffentlichen Rechts Treatment of aberrant fibroblast proliferation
US11331333B2 (en) 2019-11-08 2022-05-17 Georg-August-Universität Göttingen Stiftung Öffentichen Rechts, Universitätsmadizin Treatment of aberrant fibroblast proliferation
WO2021108442A3 (en) * 2019-11-27 2021-07-08 The Regents Of The University Of California Modulators of cas9 polypeptide activity and methods of use thereof
WO2023060058A1 (en) * 2021-10-08 2023-04-13 The Regents Of The University Of California Methods for delaying crispr action and improving gene drive effectiveness

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 18860658; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 18860658; Country of ref document: EP; Kind code of ref document: A1