WO2018118585A1

WO2018118585A1 - Antiviral compositions

Info

Publication number: WO2018118585A1
Application number: PCT/US2017/066108
Authority: WO
Inventors: Derek D. Sloan; Xin Cindy XIONG
Original assignee: Agenovir Corporation
Priority date: 2016-12-22
Filing date: 2017-12-13
Publication date: 2018-06-28

Abstract

Compositions include programmable nucleases with targeting mechanisms to prevent off-target cleavage of genetic material. Compositions may include a programmable nuclease such as a CRISPR-associated (Cas) endonuclease complexed with a short guide RNA sequence and linked to a DNA-binding domain from, for example, a TALE, ZFN, or Bat protein, such that the guide RNA and the DNA-binding domain target separate target sequences in a target viral genome and binding of both is required for the Cas endonuclease to cleave the bound genetic material. Compositions and methods may also include using a single-stranded DNA oligo complementary to a targeting region sequence on a guide RNA-Cas endonuclease complex to competitively block mismatched binding of the targeting region sequence.

Description

ANTIVIRAL COMPOSITIONS

Cross-Reference to Related Applications

This application claims the benefit of priority of U.S. Provisional Application No. 62/438,074, filed December 22, 2016, the contents of which are incorporated by reference.

Technical Field

The disclosure relates to antiviral therapeutics.

Background

Viral infections cause problems ranging from social embarrassment and discomfort to severe pain or death. Rabies, chicken pox, the flu, shingles, hepatitis, and cancer are examples of painful or fatal conditions that may arise as a consequence of viral infection. Some of the most common targets of viral infection include the respiratory system, the gastrointestinal tract, the liver, the nervous system, and skin— all systems that are important to a healthy and productive life. Thus viral infections pose significant problems to human health and welfare.

Some viruses have the ability to go into a latent stage or a persistent stage of infection, in which the virus does not replicate itself as it does in the active stage. For some viruses, in the latent phase, the viral genome is maintained in the host cell as an episome, such as a closed circular DNA molecule that replicates independently of the host chromosomes.

Unfortunately, latent or persistent infections are difficult to treat, at least because the virus is not expressing the proteins that are targeted by some antiviral therapeutics. During latency, the only viral target available may be the DNA genome. While it may be possible to treat the infection by attempting to disrupt the viral genome, e.g., by digesting the genome with nucleases, there are potential difficulties associated with off-target activity. The viral episome is typically resident in human host cells, which are also home to the approximately 3.2 billion base pair human genome. Any attempt to digest the viral episome that also does widespread damage to the human genome may be of limited clinical value.

Summary

The invention includes a programmable nuclease such as a CRISPR-associated (Cas) endonuclease linked to a DNA-binding domain. The DNA-binding domain recognizes and binds to one site on a target viral genome, while the programmable nuclease recognizes and binds to another. The Cas endonuclease-associated guide RNA of the invention comprises a shorter protospacer sequence that has minimal binding affinity for the target DNA. The DNA-binding domain is also designed to have minimal binding affinity for the target DNA. Accordingly, neither the guide RNA nor the DNA-binding domain alone is sufficient to bind the target DNA sequence. Only when both components have bound to their respective targets does the nuclease have enough DNA binding affinity to cleave the viral genome. The addition of a DNA-binding domain, such as a transcription activator-like effector (TALE) binding domain, to a programmable nuclease increases the target specificity of the composition. In preferred embodiments, the nuclease is a Cas endonuclease and is complexed with a guide RNA (gRNA) that has a recognition sequence complementary to the target in the viral genome. The gRNA and the TALE DNA-binding domain target different sites on a viral genome for cleavage. The two targeting mechanisms individually may have reduced binding affinity and specificity compared to a larger TALE binding domain or to a gRNA as would naturally be associated with a Cas endonuclease, but in combination the requirement for a Cas target proximal to a TALE target increases specificity of binding and reduces off-target cleavage.

Thus a Cas endonuclease complexed with a gRNA with a recognition sequence of minimal length and linked to a minimal TALE domain provides an antiviral therapeutic with good target specificity. A composition that includes the nuclease linked to the TALE domain, with both being designed to bind to targets determined to be proximal to each other in a viral genome, may be delivered to a site of a viral infection. The TALE-linked nuclease will differentiate the target viral genome from the host genome due to the increased specificity. Thus, the increased specificity provided by the combined minimal TALE DNA-binding domain and short gRNA with minimal targeting sequence is particularly suited to targeting viral genetic material in infected host cells, and particularly for targeting the genomes of latent or persistent viruses in infected human cells.

The DNA-binding domain may be a TALE domain or another binding domain, and can be similarly designed to require both components to have bound to their respective targets before the nuclease has enough DNA binding affinity to cleave the viral genome. The DNA-binding domain may be a zinc finger DNA-binding domain of a zinc finger nuclease (ZFN). Standard zinc finger DNA-binding domains may contain between three and six individual zinc finger repeats and recognize between 9 and 18 base pairs. Zinc finger DNA-binding domains of the invention may be shortened as described above to reduce binding affinity.

The DNA-binding domain may be a second, catalytically inactive Cas endonuclease/gRNA complex of the same or different species as the first cleavage-active Cas endonuclease/gRNA complex.

The DNA-binding domain may be from a Burkholderia rhizoxinica protein (Bat protein). Burkholderia rhizoxinica is an endosymbiotic bacterium that contains TALE-like proteins. Bat protein binding domains function similarly to TALE binding domains and may be similarly programmed to recognize target sequences. Bat protein binding domains generally require less sequence identity than TALE binding domains in order to bind.

The length of the linker joining the TALE or other binding domain to the Cas endonuclease may be selected based on the distance within the viral genome between the TALE target and the gRNA target. By targeting two specific and separate sequences separated by an anticipated distance, mismatch binding and off-target cleaving are reduced and specificity is increased over single-sequence targeting. Compositions of the invention are particularly useful where the Cas endonuclease target plus protospacer adjacent motif (PAM) is found in both the viral genome and the host genome or, stated differently, where the viral genome does not offer any sequence adjacent to a PAM that cannot also be found in the host genome. Using a composition of the invention, the Cas recognition sequence can be shorter than would otherwise be permissible, which prevents the Cas endonuclease from binding to the host genome. The Cas endonuclease is linked to a TALE domain that is designed to recognize a nearby target found only in the viral genome. Thus, using a TALE domain allows an antiviral therapeutic composition to require an additional targeting sequence that may be spaced away from the PAM at a point where the viral and host genomes diverge.

In preferred embodiments, where the programmable nuclease is a Cas endonuclease, the gRNA may include a minimal recognition sequence, e.g., between about 8 and about 16 base pairs in length, while the TALE binding domain may include 7 repeat variable diresidues (RVDs). Allowing these targeting portions to be shorter than they otherwise might be means that each targeting portion, taken alone, would have poor binding affinity and specificity. As a result, in the presence of only one of the two targets, the nuclease may have insufficient target binding to cleave the target. However, in the presence of both targets, binding affinity and specificity are increased to a degree that the TALE-linked nuclease can be designed to specifically cleave the genome of a virus without any off-target cleavage.

Additionally or alternatively, in certain embodiments, binding specificity is increased by delivering a Cas endonuclease along with short oligonucleotides, such as single-stranded DNA (ssDNA) oligonucleotides, that are substantially complementary to a portion of the targeting sequence of a guide RNA of the Cas endonuclease. The oligonucleotides hybridize to the guide RNA targeting sequence unless and until the true target, which is a better match to the targeting sequence and therefore binds with higher affinity than the oligonucleotides, is present. In the presence of the viral genome, the true target within the viral genome displaces the oligonucleotides from the guide RNA. In the absence of the target viral genome, the oligonucleotides prevent the guide RNA from mediating any off-target binding (e.g., of the Cas endonuclease to a similar target within a human genome). That is, ssDNA oligos may be used to compete with the target viral DNA sequence such that only perfect or near-perfect matches between the gRNA and the target viral nucleic acid will result in the target viral nucleic acid displacing the ssDNA, allowing the gRNA to bind and the endonuclease to cleave the bound nucleic acid. Accordingly, off-target cleavage of genetic material is avoided. The gRNA contains a targeting region at least substantially complementary to a portion of a target viral genome. The ssDNA oligos are shorter than the complementary portion of the targeting region and may comprise between about 10 and about 15 nucleotides, with a melting temperature of 40°C to 50°C.
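
The length and melting-temperature window described for the ssDNA oligos can be illustrated with a minimal sketch. It enumerates oligos complementary to sub-windows of a guide RNA targeting region and keeps those of 10 to 15 nucleotides whose estimated melting temperature falls between 40°C and 50°C. The function names and the example sequence are hypothetical. The melting temperature uses the Wallace rule of thumb (2°C per A/T, 4°C per G/C), which is an assumption made here for illustration: it is a rough estimate for short DNA duplexes and does not model the DNA:RNA hybrid or the conditions inside a cell.

```python
# RNA base of the targeting region -> complementary DNA base of the oligo.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}

def wallace_tm(dna):
    """Rough Tm in degrees C: 2 per A/T plus 4 per G/C (assumed rule of thumb)."""
    return sum(4 if base in "GC" else 2 for base in dna)

def candidate_oligos(targeting_rna, min_len=10, max_len=15, tm_range=(40, 50)):
    """Return (start, oligo, tm) for ssDNA oligos complementary to a window
    of the targeting region that satisfy the length and Tm limits."""
    rna = targeting_rna.upper()
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(rna) - length + 1):
            window = rna[start:start + length]
            # Reverse complement, so the oligo is written 5' to 3'.
            oligo = "".join(COMPLEMENT[base] for base in reversed(window))
            tm = wallace_tm(oligo)
            if tm_range[0] <= tm <= tm_range[1]:
                hits.append((start, oligo, tm))
    return hits

if __name__ == "__main__":
    # Hypothetical 20-nt targeting region, not taken from any real guide RNA.
    for start, oligo, tm in candidate_oligos("GAUUACAGGCUUAACGUCCA"):
        print(start, oligo, tm)
```

Every candidate returned for a 20-nucleotide targeting region is shorter than the region itself, in keeping with the requirement that the oligo cover only part of the complementary portion.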

Compositions and methods of the invention that reduce off-target cleavage are particularly suited to antiviral treatments where viral genetic material must be specifically targeted within a living host cell without cleaving the host's own genetic material.

Preferred viral targets include a hepatitis virus such as a hepatitis B virus (HBV), an Epstein-Barr virus, a Kaposi's sarcoma-associated herpesvirus (KSHV), a herpes-simplex virus (HSV), a cytomegalovirus (CMV), human papilloma virus (HPV), or Merkel cell polyomavirus. Compositions of the preferred embodiments are formulated for topical delivery, i.e., so that the potential for systemic side effects may be reduced. For delivery to tissue such as basal epithelium or mucosal epithelium, the Cas endonuclease, or a messenger RNA (mRNA) encoding the Cas endonuclease, is preferably delivered via a nanoparticle such as a lipid nanoparticle that includes cationic lipids to encourage tissue and cellular penetration. Cas endonuclease and gRNA may be delivered as a plasmid encoding the same; however, delivery of the active, ribonucleoprotein (RNP) form of the Cas endonuclease, or of the mRNA encoding the Cas endonuclease, avoids the requirement for nuclear import and transcription. Topical delivery of RNP or mRNA to tissue such as vaginal or anal tissue is used for preferred targets such as HPV to treat warts, lesions, or even cancers such as cervical cancer.

Aspects of the disclosure include a composition for treating a viral infection. The composition includes a programmable nuclease linked to a DNA-binding domain, in which the programmable nuclease hybridizes to a first target in a viral genome and the DNA-binding domain binds to a second target in the viral genome. The DNA-binding domain may be a transcription activator-like effector (TALE) DNA-binding domain having about 7 repeat variable diresidues (RVDs) corresponding to the second target in the viral genome. In certain embodiments, the DNA-binding domain may be a zinc finger DNA-binding domain of a zinc finger nuclease (ZFN). The DNA-binding domain can be a second, catalytically inactive Cas endonuclease and gRNA complex. In various embodiments, the DNA-binding domain may be from an endosymbiotic bacterium Burkholderia rhizoxinica protein (Bat protein).

The programmable nuclease may be a Cas endonuclease and guide RNA having a targeting region substantially complementary to the first target in a viral genome. Upon binding to both the first target and the second target, the nuclease cleaves the viral genome. The targeting region of the guide RNA can include a sequence of about 8-16 (preferably about 15-16) base pairs complementary to the first target in the viral genome. Any suitable virus may be targeted including, for example, Herpes Simplex virus (HSV), human Herpes virus 6 (HHV6), human Herpes virus 7 (HHV7), Kaposi's sarcoma-associated herpesvirus (KSHV), Cytomegalovirus (CMV), Epstein Barr virus (EBV), Varicella zoster virus (VZV), human papillomavirus (HPV), and hepatitis B virus (HBV).

In certain embodiments, compositions of the invention include a Cas endonuclease with a guide RNA that has a targeting sequence complementary to a target in a viral genome, and the composition further includes short oligonucleotides at least partially complementary to the targeting sequence of the guide RNA. The short oligonucleotides may be provided as a plurality of ssDNA molecules that are co-delivered with the Cas endonuclease. The oligonucleotides hybridize to the guide RNA unless and until the true target in the viral genome is present. Using the ssDNA oligonucleotides, the composition may be used to cleave viral targets even where somewhat similar sequences may be present in the host human genome. The ssDNA oligonucleotides will prevent the Cas endonuclease from binding to the somewhat similar sequences in the host. Thus, the targeting sequence of the guide RNA may be at least partially complementary to a sequence in a host genome and the oligonucleotides will prevent off-target activity. The ssDNA oligonucleotides may be used in conjunction with embodiments in which a TALE domain, specific to a second target in a viral genome, is linked to the Cas endonuclease. The Cas endonuclease may be covalently or non-covalently linked to the TALE DNA-binding domain.

In certain embodiments, the Cas endonuclease may be linked to the TALE DNA-binding domain through a linker. The linker can be of a length chosen to correspond to a distance between the first target and the second target in the viral genome. The linker may be protein and may include one or more of proline, glycine, threonine, serine, or combinations thereof (e.g., glycine-rich for flexibility, or proline-rich for rigidity). A non-protein linker may be included. A non-protein linker may include, for example, a disulfide bond; a thioether bond; an amine bond; a hydrazine linkage; an amide bond; an imidoester; maleimide; PEG; and BM(PEG)n with 1 < n < 9. The linker may be attached to the Cas endonuclease at an amino acid selected from the group consisting of lysine, cysteine, aspartic acid, and glutamic acid.

In certain aspects, the invention provides a composition for treating a viral infection. The composition includes a Cas endonuclease, or mRNA encoding the Cas endonuclease, and a guide RNA (gRNA) comprising a targeting region complementary to a target in a viral genome. The composition also includes at least one short oligonucleotide at least partially complementary to the targeting region. The targeting region has a first binding affinity for the oligonucleotide that is lower than a perfect match. However, the first binding affinity may be greater than a binding affinity between the targeting region and a mismatched target sequence. The mismatched target sequence can be at least 75% complementary to the targeting region. The mismatched target sequence may be at least 80% complementary to the targeting region. In certain embodiments, the mismatched target sequence may be at least 90% complementary to the targeting region. In preferred embodiments, the mismatched target sequence may be at least 95% complementary to the targeting region. The targeting region can include about 20 nucleotides. The ssDNA oligonucleotide may comprise between about 10 and about 15 nucleotides.
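
For a targeting region of about 20 nucleotides, the complementarity thresholds above translate into a small number of tolerated mismatches. The following sketch works through that arithmetic; the function names are illustrative, and it assumes that percent complementarity is counted position by position over the full targeting region.

```python
def min_matches(region_len, percent):
    """Smallest number of complementary positions that reaches `percent`
    complementarity over a region of `region_len` nucleotides."""
    # Integer ceiling of region_len * percent / 100.
    return (region_len * percent + 99) // 100

def max_mismatches(region_len, percent):
    """Largest number of mismatched positions still meeting `percent`."""
    return region_len - min_matches(region_len, percent)

if __name__ == "__main__":
    for pct in (75, 80, 90, 95):
        print(f"{pct}% of 20 nt: up to {max_mismatches(20, pct)} mismatches")
```

Under that assumption, a 20-nucleotide targeting region that is at least 75%, 80%, 90%, or 95% complementary to a mismatched sequence carries at most 5, 4, 2, or 1 mismatches, respectively.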

Brief Description of the Drawings

FIG. 1 shows a composition for treating a viral infection comprising a Cas endonuclease linked to a TALE binding domain.

FIG. 2 shows a composition for treating a viral infection, the composition including a Cas endonuclease-gRNA complex with a ssDNA oligo complementary to the targeting sequence of the gRNA.

FIG. 3 shows a targeting region of a Cas endonuclease-gRNA complex that has preferentially bound a viral nucleic acid over the ssDNA oligo.

FIG. 4 shows a targeting region of a Cas endonuclease-gRNA complex and a mismatched viral nucleic acid where the ssDNA oligo has preferentially bound the targeting region over the mismatched viral nucleic acid.

FIG. 5 shows a Cas endonuclease linked to a TALE binding domain in a nanoparticle.

FIG. 6 shows a Cas endonuclease and a ssDNA oligonucleotide in a nanoparticle.

FIG. 7 shows a programmable nuclease in a nanoparticle.

FIG. 8 is a map of the Epstein Barr genome to illustrate target categories.

FIG. 9 shows a composition that includes a messenger RNA.

FIG. 10 shows an antiviral composition with a DNA vector encoding a nuclease.

FIG. 11 shows a method of preparing an antiviral composition.

FIG. 12 diagrams a method for treating a viral infection.

Detailed Description

FIG. 1 shows a composition 101 for treating a viral infection. The composition 101 includes a programmable nuclease 107 linked to a TALE domain 157, in which the programmable nuclease 107 binds to a first target in a viral genome 175 and the TALE domain 157 is designed to bind to a second target in the viral genome, and wherein the first target and the second target are not both found in a human genome. In the depicted embodiment, the programmable nuclease 107 is a Cas endonuclease and is complexed with a guide RNA 121 that includes a targeting region 127 that is substantially (e.g., at least 60%) complementary to the first target in viral nucleic acid 175. A linker 161 joins the programmable nuclease 107 to the TALE domain 157. The TALE domain 157 includes a series of repeat variable diresidues (RVDs) designed to bind the second target in the viral genome 175.

The targeting region 127 of the guide RNA 121 hybridizes to the first target in the viral genome 175 and the TALE domain 157 binds to the second target. Both the TALE binding domain and the targeting region may be short enough in length that neither alone provides sufficient binding affinity to allow the programmable nuclease to efficiently cleave the viral nucleic acid. Instead, binding of both the targeting region and the linked TALE binding domain to their recognition sites along the viral nucleic acid provides strong enough recognition for the programmable nuclease to function by cleaving the viral nucleic acid. By using two separate binding sites and binding modalities, off-target cleavage is avoided. Because the composition 101 is designed to require specific first and second targets a distance apart from each other on the viral genome, the composition may be particularly useful for targeting the genome of one organism within the cells of an infected host, and thus for digesting genomes of latent viruses within the cells of the infected human host.

Compositions and methods of the invention are particularly suited to applications where a PAM and its adjacent sequence are found in both the viral genome and the human genome. That PAM and its adjacent sequence in the viral genome may still be targeted specifically using compositions and methods of the disclosure by having the DNA-binding domain linked to the Cas endonuclease be designed to bind to a sequence found only in the viral genome. By requiring binding of a second, separate target sequence that may be spaced away from the first target sequence in order to allow cleavage, compositions of the invention may provide the ability to use a conserved first target sequence between the host and the target virus. The second target sequence of the DNA-binding domain may be selected to be removed from the conserved section of sequence.
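
The selection of a second target that lies near a shared first target but is absent from the host can be sketched as a simple sequence search. The sketch below is illustrative only: the function names, the default 7-base length (chosen to match the preferred 7-RVD TALE domain), and the 30-base search distance are assumptions, and only the given strand of each sequence is examined.

```python
def flanks(genome, site, max_gap):
    """Sequence within max_gap bases of every occurrence of `site` in `genome`."""
    windows, start = [], genome.find(site)
    while start != -1:
        windows.append(genome[max(0, start - max_gap):start + len(site) + max_gap])
        start = genome.find(site, start + 1)
    return windows

def second_target_candidates(viral, host, first_target, k=7, max_gap=30):
    """Return (position, k-mer) pairs near the first target in the viral
    sequence that do not occur near any copy of the first target in the host."""
    viral, host, first_target = viral.upper(), host.upper(), first_target.upper()
    host_kmers = {w[i:i + k] for w in flanks(host, first_target, max_gap)
                  for i in range(len(w) - k + 1)}
    site = viral.find(first_target)
    if site == -1:
        return []
    site_end = site + len(first_target)
    lo = max(0, site - max_gap)
    hi = min(len(viral), site_end + max_gap)
    hits = []
    for i in range(lo, hi - k + 1):
        overlaps_first = i < site_end and i + k > site
        kmer = viral[i:i + k]
        if not overlaps_first and kmer not in host_kmers:
            hits.append((i, kmer))
    return hits

if __name__ == "__main__":
    # Hypothetical toy sequences sharing the first target ACGTACGTAC.
    shared = "ACGTACGTAC"
    print(second_target_candidates("TTTTTTT" + shared + "GGGCCCA",
                                   "AAAAAAA" + shared + "GGGCCCA", shared))
```

The host comparison is restricted to the neighborhood of host copies of the first target because a sequence this short will generally occur somewhere in a genome of billions of base pairs; what matters for the dual-binding requirement is whether it occurs close to the shared first target.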

The DNA-binding domain may be linked to a programmable nuclease, such as a Cas endonuclease, at, for example, a side chain of an amino acid of the Cas endonuclease, wherein the side chain may present an amine, a carboxyl, a sulfhydryl, or a carbonyl. Optionally, the DNA-binding domain can be attached to the side chain through a linker 161, which may include one or more of a disulfide bond, a thioether, an amine bond, a hydrazine linkage, an amide bond, an imidoester, a peptide bond, maleimide, polyethylene glycol (PEG), BM(PEG)n with 1 < n < 9, biotin, or other molecule or agent. The linker 161 may be configured in length to space the DNA-binding domain apart from the gRNA's targeting region by a distance corresponding to the anticipated distance between the target sequences of the DNA-binding domain and the targeting region in the viral genome. Compositions and methods of the invention recognize that target DNA may not be encountered by the programmable nuclease in a perfect line and that, because of this, sequences that are separated by a distance within the target genome may, in fact, be quite close in physical space. Accordingly, the anticipated distance may be based upon the nucleotide length separating the two target sequences or upon spatial relationships based on known conformation of the viral genetic material.
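
As a rough illustration of sizing a linker from the nucleotide length separating the two targets, the sketch below converts a gap in base pairs to a straight-line distance and then to a number of peptide residues. The constants are assumptions introduced for illustration and do not come from the disclosure: about 0.34 nm of rise per base pair for B-form DNA, and about 0.35 nm spanned per residue of an extended peptide. The sketch treats the DNA as straight and ignores the dimensions of the protein domains, so, as noted above, the true spatial distance may be considerably shorter.

```python
import math

BP_RISE_NM = 0.34       # approximate rise per base pair of B-form DNA
RESIDUE_SPAN_NM = 0.35  # assumed span per residue of an extended peptide linker

def linker_estimate(first_target_end, second_target_start):
    """Return (gap in bp, gap in nm, residues needed) for a straight-DNA model.

    Positions are hypothetical coordinates on the viral genome."""
    gap_bp = abs(second_target_start - first_target_end)
    gap_nm = gap_bp * BP_RISE_NM
    residues = math.ceil(gap_nm / RESIDUE_SPAN_NM)
    return gap_bp, gap_nm, residues

if __name__ == "__main__":
    gap_bp, gap_nm, residues = linker_estimate(100, 110)
    print(f"{gap_bp} bp gap, about {gap_nm:.1f} nm, about {residues} residues")
```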

Certain Cas endonucleases are thought to function naturally by forming a complex with a guide RNA that includes a ~20-bp targeting region that is substantially complementary to a target in viral nucleic acid. Compositions and methods of the invention may include a targeting region 127 that is shorter in length than 20 bp. In various embodiments, the substantially complementary targeting region may be, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 19 bp in length. In preferred embodiments, the targeting region may include a 10 to 16 base sequence substantially complementary to a target sequence in a viral genome.

The DNA-binding domains may be derived from TALE DNA-binding domains, zinc finger DNA-binding domains from a zinc finger nuclease (ZFN), a second, catalytically inactive Cas endonuclease/gRNA complex, or a DNA-binding domain from a Bat protein for example.

Transcription activator-like effector nucleases (TALENs) are nucleases that include a DNA-binding domain and a cleavage domain. Specifically, TALENs contain a FokI nuclease domain and a DNA-binding domain known as a transcription activator-like effector (TALE). The TALE is composed of tandem arrays of amino acid repeats, each of which recognizes a single base pair in the major groove of target viral DNA. The nucleotide specificity of a domain comes from repeat variable diresidues (RVDs), the two amino acids at positions 12 and 13, where Asn-Asn, Asn-Ile, His-Asp and Asn-Gly recognize guanine, adenine, cytosine and thymine, respectively. That pattern allows one to design a TALE domain to be a sequence-specific DNA-binding domain. Compositions and methods of the invention may include a TALE DNA-binding domain comprising, for example, 3, 4, 5, 6, 7, 8, 9, or 10 RVDs programmed to bind a target sequence in a viral genome. In preferred embodiments, the TALE DNA-binding domain comprises about 7 RVDs.
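
The RVD code stated above maps each base of a target to one diresidue, so a short TALE domain can be laid out directly from the target sequence. The following is a minimal sketch of that lookup; the function name and the example target are hypothetical, and the RVDs are written with one-letter amino acid codes (NN for Asn-Asn, NI for Asn-Ile, HD for His-Asp, NG for Asn-Gly).

```python
# RVD code as stated in the text: Asn-Asn -> G, Asn-Ile -> A,
# His-Asp -> C, Asn-Gly -> T (one-letter amino acid codes).
RVD_FOR_BASE = {"G": "NN", "A": "NI", "C": "HD", "T": "NG"}

def rvds_for_target(target):
    """Return one RVD per base of the DNA target, in order."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]

if __name__ == "__main__":
    # Hypothetical 7-base second target, matching the preferred length of about 7 RVDs.
    print(rvds_for_target("GATTACA"))
```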

The DNA-binding domain may be a zinc finger DNA-binding domain of a zinc finger nuclease (ZFN). Zinc-finger nucleases (ZFNs) are artificial restriction enzymes generated by fusing a zinc finger DNA-binding domain to a DNA-cleavage domain. Zinc finger DNA-binding domains can be engineered to target specific desired DNA sequences. Standard zinc finger DNA-binding domains may contain between three and six individual zinc finger repeats and recognize between 9 and 18 base pairs. Zinc finger DNA-binding domains of the invention may be shortened to reduce binding affinity in order to require binding of both the zinc finger binding domain and the coupled gRNA/Cas endonuclease complex before the Cas endonuclease can cleave the target viral genome.

In certain embodiments, the DNA-binding domain may be a second, catalytically inactive Cas endonuclease/gRNA complex of the same or different species as the first cleavage-active Cas endonuclease/gRNA complex. See Guilinger, et al., 2014, Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification, Nature Biotechnology 32, 577-582, incorporated herein by reference.

According to various embodiments, the DNA-binding domain linked to the Cas endonuclease/gRNA complex may be from a Burkholderia rhizoxinica protein (Bat protein). Burkholderia rhizoxinica is an endosymbiotic bacterium that contains TALE-like proteins. Bat protein binding domains function similarly to TALE binding domains and may be similarly programmed to recognize target sequences. Bat repeat domains have been shown to mediate sequence-specific DNA binding similarly to TALEs while requiring less than 40% sequence identity. See de Lange, et al., 2014, Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain, Nucleic Acids Res. 42(11): 7436-7449, incorporated herein by reference. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic, allowing for the creation of Bat derivatives with programmable specificity. Id.

A composition that includes a Cas endonuclease linked to a TALE or other DNA-binding domain will bind to a viral genome with good specificity and can be designed to not exhibit any off-target binding or cleavage. Thus compositions of the invention have good specificity and may be used to digest a viral genome within a human cell. Additionally, compositions of the invention may include other features that even further increase the target specificity.

FIG. 2 shows a composition 201 of the invention including a programmable nuclease 207 complexed with a guide RNA 221 comprising a targeting region 227 that is substantially (e.g., at least 60%) complementary to a target in viral nucleic acid. The composition 201 includes a short oligonucleotide 225 such as a single-stranded DNA (ssDNA) oligonucleotide that is co-delivered with the programmable nuclease 207. The oligonucleotide 225 is complementary to at least a portion of the targeting region 227 of the guide RNA 221. The oligonucleotide 225 may be shorter than the targeting region 227 or may simply contain a sequence complementary to the targeting region 227 that is shorter than the viral genome-complementary portion of the targeting region 227. The oligonucleotide 225 hybridizes to the targeting sequence 227 of the guide RNA 221 with an affinity greater than that with which a mismatched target would hybridize to the guide RNA. Thus the oligonucleotide 225 prevents off-target binding and activity. However, the oligonucleotide 225 binds to the guide RNA 221 less well than would the true target. Accordingly, a perfectly matched target sequence in the viral genome will displace the oligonucleotide 225 and allow for cleavage by the programmable nuclease 207.

FIG. 3 illustrates a strong match between a target sequence on the viral genome 175 and the targeting region 227. The targeting region 227 has preferentially bound to the viral genome 175 and displaced the oligonucleotide 225. The programmable nuclease 107 will accordingly cleave the viral genome 175, disrupting the virus.

FIG. 4 illustrates a mismatch between the viral genome 175 and the targeting region 227. Accordingly, the ssDNA oligo 225 remains preferentially bound to the targeting region 227 preventing the mismatch binding of the targeting region 227 to the viral genome 175 or potential mismatched binding sites in the host genome. Off-target cleavage by the programmable nuclease 207 is thereby prevented. Reducing off-target cleavage is of particular importance where a sequence in the host genome is similar to the target sequence in the viral genome in order to prevent unwanted disruption of the host genetic material.

Compositions of the invention use a programmable nuclease to digest viral nucleic acid and include features that prevent off-target activity. Compositions and methods of the disclosure may have particular benefit for the treatment of viruses that contain target sequences that are similar to the host's genetic sequence. Where the target in the viral genome is similar to a region within the host genome, the inclusion of a TALE domain (while using a shorter guide RNA recognition sequence), the use of ssDNA oligonucleotides, or both, can prevent the nuclease from binding to and cleaving the host genome.

Any suitable virus may be targeted using compositions and methods of the invention. Programmable nucleases may be used in various embodiments to target viral genetic material within host cells. In certain embodiments, compositions include, as the programmable nuclease, an RNA-guided nuclease (e.g., Cas9) and at least one gRNA targeting the genome of the virus. Suitable targets in viral genomes include, but are not limited to, a portion of a genome or gene of adenovirus, herpes simplex virus, varicella-zoster virus, Epstein-Barr virus, human cytomegalovirus, human herpesvirus type 8, human papillomavirus, BK virus, JC virus, smallpox, hepatitis B virus, human bocavirus, parvovirus B19, human astrovirus, Norwalk virus, coxsackievirus, hepatitis A virus, poliovirus, rhinovirus, severe acute respiratory syndrome virus, hepatitis C virus, yellow fever virus, dengue virus, West Nile virus, rubella virus, hepatitis E virus, human immunodeficiency virus, influenza virus, Guanarito virus, Junin virus, Lassa virus, Machupo virus, Sabia virus, Crimean-Congo hemorrhagic fever virus, Ebola virus, Marburg virus, measles virus, mumps virus, parainfluenza virus, respiratory syncytial virus, human metapneumovirus, Hendra virus, Nipah virus, rabies virus, hepatitis D virus, rotavirus, orbivirus, Coltivirus, or Banna virus. In preferred embodiments, compositions of the invention are provided as antiviral therapeutics that include a modified programmable nuclease programmed to treat an infection by a hepatitis virus, a hepatitis B virus (HBV), an Epstein-Barr virus, a Kaposi's sarcoma-associated herpesvirus (KSHV), a herpes-simplex virus (HSV), a cytomegalovirus (CMV), human papilloma virus (HPV), or Merkel cell polyomavirus.

Methods and compositions of the disclosure use a programmable nuclease to digest nucleic acid of the virus, thereby rendering the virus incapable of replication or infection of the host patient.

The composition may be delivered as a programmable nuclease such as Cas endonuclease or a nucleic acid encoding the programmable nuclease. The programmable nuclease may be complexed with a guide RNA (gRNA) to form a ribonucleoprotein. The programmable nuclease may be delivered as mRNA that encodes a programmable nuclease, co-delivered with a gRNA. The programmable nuclease may be delivered in the form of a vector encoding for the gRNA and the programmable nuclease.

The programmable nuclease may be any suitable programmable nuclease. A programmable nuclease is a molecule that can be designed to, or "programmed" to, cleave a nucleic acid in a sequence-specific manner. Programmable nucleases include CRISPR-associated (Cas) nucleases, such as Cas9, Cpf1, C2c1, C2c3, and C2c2.

In preferred embodiments, methods and compositions of the disclosure use a Cas endonuclease. Cas endonucleases were first found as part of bacterial immune systems. The host bacteria capture small DNA fragments (~20 bp) from invading viruses and insert those sequences (termed targeting regions) into their own genome to form a CRISPR. Those CRISPR regions are transcribed as pre-CRISPR RNA (pre-crRNA) and processed to give rise to target-specific crRNA. Invariable target-independent trans-activating crRNA (tracrRNA) is also transcribed from the locus and contributes to the processing of pre-crRNA. The crRNA and tracrRNA have been shown to be combinable into a single guide RNA. As used herein, "guide RNA" or gRNA refers to either format. Guide RNA and a Cas endonuclease form an active ribonucleoprotein (RNP) complex that cleaves the target nucleic acid. An sgRNA forms the RNP with Cas endonuclease protein, and the RNP finds the target by hybridization of a targeting region to the intended target. The RNP will cleave when the target is found next to a sequence known as a protospacer adjacent motif (PAM).

Cas endonucleases are generally programmed to target a specific viral nucleic acid by providing a gRNA that includes a ~20-bp targeting region that is substantially complementary to a target in viral nucleic acid. In preferred embodiments of the invention, the length of the targeting region is reduced to about 15-16 bp to lower binding efficiency. The targetable sequences include, among others, 5'-X15NGG-3' or 5'-X15NAG-3', where X15 is substantially complementary to the targeting region in the gRNA and NGG and NAG are PAMs. It will be appreciated that recognition sequences with lengths other than 15-16 bp and PAMs other than NGG and NAG are known and are included within the scope of the invention.
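As a minimal sketch of how candidate sites of the form 5'-X15NGG-3' or 5'-X15NAG-3' might be located, the following Python function scans one strand of a DNA sequence for a 15-nucleotide stretch followed by an NGG or NAG PAM. The function name find_pam_sites, its parameters, and the example sequence are hypothetical and are not part of the disclosure. A practical search would also scan the reverse-complement strand.

```python
def find_pam_sites(seq, spacer_len=15, pam_suffixes=("GG", "AG")):
    """Return (start, protospacer, pam) for each 5'-X{spacer_len}-N-GG/AG-3' site.

    Only the strand given in `seq` is scanned (illustrative assumption).
    """
    seq = seq.upper()
    sites = []
    window = spacer_len + 3  # protospacer plus a 3-nt PAM (N + two fixed bases)
    for i in range(len(seq) - window + 1):
        protospacer = seq[i:i + spacer_len]
        pam = seq[i + spacer_len:i + window]
        # Skip windows containing ambiguous bases such as N.
        if not set(protospacer + pam) <= set("ACGT"):
            continue
        if pam[1:] in pam_suffixes:
            sites.append((i, protospacer, pam))
    return sites


# Hypothetical 18-nt example: 15-nt protospacer followed by a TGG PAM.
example_sites = find_pam_sites("ACGTACGTACGTACGTGG")
```

Setting spacer_len to 20 would give the conventional full-length targeting region rather than the shortened 15-16 bp region described above.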

CRISPR systems with single-subunit effectors are known as Class 2. These are then subdivided even further into type II (e.g., Cas9) and type V (e.g., Cpf1). Cas endonucleases include Cas9, Cpf1, C2c1, C2c3, and C2c2, and modified versions of Cas9, Cpf1, C2c1, C2c3, and C2c2, such as a nuclease with an amino acid sequence that is different from, but at least about 85% similar to, an amino acid sequence of wild-type Cas9, Cpf1, C2c1, C2c3, or C2c2, or a Cas9, Cpf1, C2c1, C2c3, or C2c2 protein linked to an accessory element such as another polypeptide or protein domain (e.g., within a recombinant fusion protein or linked via an amino acid side-chain) or other molecule or agent. C2c1 (Class 2, candidate 1) is a type V-B Cas endonuclease that has been found.

Examples of C2c1 have been indicated to be functional in E. coli. tracrRNAs (short RNAs that help separate the CRISPR array into individual spacers, or crRNAs) were required. As is the case for Cas9, with C2c1, the tracrRNA may be fused to the crRNA to make a single short guide, or sgRNA. C2c1 targets DNA with a 5' PAM sequence TTN.

C2c3 (Class 2, candidate 3) is a type V-C Cas endonuclease that clusters with C2c1 and Cpf1 within type V. C2c3 was found in metagenomic sequences, and the species is not known.

C2c2 (Class 2, candidate 2) is a type VI Cas endonuclease. C2c2 has been indicated to make mature crRNAs in E. coli. See Shmakov, 2015, Discovery and functional characterization of diverse class 2 CRISPR-Cas systems, Mol Cell 60(3):385-397, incorporated by reference.

In embodiments of the invention, a TALE DNA-binding domain is linked to a programmable nuclease through a linker. A linker may be chosen for its properties. For example, for a polypeptide linker (e.g., within a recombinant fusion protein) to be flexible it may be provided with a plurality of glycine residues (e.g., > 30% or > 50%). For a more rigid polypeptide linker, it may be desirable to include a plurality of proline residues. The linker may be biodegradable.
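The glycine content mentioned above can be checked for a candidate polypeptide linker with a short calculation. The following Python sketch assumes the linker is given as a one-letter amino acid string; the function name residue_fraction and the example linker sequence are hypothetical and are not taken from the disclosure.

```python
def residue_fraction(linker, residues="G"):
    """Fraction of positions in a one-letter amino acid sequence found in `residues`."""
    linker = linker.upper()
    if not linker:
        raise ValueError("linker sequence is empty")
    return sum(aa in residues for aa in linker) / len(linker)


# Hypothetical glycine/serine linker used only for illustration.
example_linker = "GGGGSGGGGS"
glycine_share = residue_fraction(example_linker, "G")
is_glycine_rich = glycine_share > 0.30  # the "> 30%" example threshold
```

The same function can report proline content, e.g., residue_fraction(linker, "P"), when a more rigid linker is of interest.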

In some embodiments, the linker is cleavable. For example, the linker may include an enzyme cleavage region. Where a polypeptide linker is used, an enzyme cleavage region can be the target of a protease.

In certain embodiments, the TALE DNA-binding domain is non-covalently bound to the programmable nuclease. For example, either the programmable nuclease or the DNA-binding domain may be biotinylated and the DNA-binding domain may thus be non-covalently bound to the programmable nuclease through a biotin/streptavidin linkage.

A composition for treating a viral infection may include a programmable nuclease covalently linked to a TALE DNA-binding domain through a protein linker. Some recombinant fusion proteins are composed of two or more functional domains joined by linker peptides. The linker serves to connect the proteins and also provides other functions, such as maintaining cooperative inter-domain interactions or preserving biological activity. The natural length of linkers in multi-domain proteins is about 6 to 10 residues on average. Preferred residues for linkers include threonine (Thr), serine (Ser), proline (Pro), glycine (Gly), aspartic acid (Asp), lysine (Lys), glutamine (Gln), asparagine (Asn), alanine (Ala), arginine (Arg), phenylalanine (Phe), and glutamic acid (Glu). I.e., preferably residues are polar (charged or uncharged).

Proline may be included to give the linker rigidity. It is thought that the lack of an amide hydrogen, as well as the cyclic side chain, limit proline's ability to participate in promiscuous hydrogen bonding and restrict its flexibility. The small, polar amino acids, such as Thr, Ser, and Gly are thought to be favorable for providing good flexibility due to their small sizes, and also help maintain stability of the linker structure in the aqueous solvent through formation of hydrogen bonds with water. For flexibility, the linker may include a plurality of glycine residues. In some embodiments, the linker comprises a plurality of threonine and serine residues.

A linker may provide functionality such as flexibility, rigidity (e.g., even a mixture of both flexibility and rigidity at different points along it), solubility, cleavage targets, binding targets, others, or combinations thereof. In some embodiments, the linker is included to provide a spacer arm. The spacer arm is the chemical chain between two groups. The length of a spacer arm (e.g., in angstroms) determines how flexible a conjugate will be. Longer spacer arms have greater flexibility, reduced steric hindrance, and offer more sites for potential nonspecific binding. Spacer arms can range from zero length to > 100 angstroms. The molecular composition of a spacer arm can affect solubility and nonspecific binding. Some linkers have spacer arms that contain hydrocarbon chains or polyethylene glycol (PEG) chains. Hydrocarbon chains are not water soluble and typically require an organic solvent such as DMSO or DMF for suspension. Those crosslinkers are suited for penetrating the cell membrane and performing intracellular crosslinking because they are hydrophobic and uncharged. If a charged sulfonate group is added to the termini of such crosslinkers, a water soluble analogue is formed.

Certain exemplary categories of cross-linkers use bismaleimide-activated PEG (BM(PEG)n) or bis(succinimidyl) PEG (BS(PEG)n). Canonically, BM(PEG)n cross-links sulfhydryls and BS(PEG)n cross-links amines, although variations will be understood by one of skill in the art. It may be preferable for most technological and therapeutic applications to use BS(PEG)n or BM(PEG)n with 1 < n < 9, although related PEG-based chemistries will be understood by one of skill in the art and are included in the invention.

Optionally using a suitable linker, a TALE DNA-binding domain may be attached to a programmable nuclease at an amino acid with a side chain comprising an amine, a carboxyl, a sulfhydryl, or a carbonyl. For example, the agent or the linker may be attached to the programmable nuclease at an amino acid in the nuclease such as lysine, cysteine, aspartic acid, or glutamic acid.

In some embodiments, a programmable nuclease is linked to a TALE DNA-binding domain through a click reaction product such as one or more five-membered rings or acyclic derivatives thereof. Click chemistry includes a class of biocompatible reactions intended primarily to join substrates of choice with specific biomolecules. Click chemistry provides methods for joining small modular units. In general, click reactions join a biomolecule and a TALE DNA-binding domain. Typical click reactions occur in one pot, are not disturbed by water, make unremarkable byproducts, and are driven quickly and irreversibly to high yield of a single click reaction product, with high reaction specificity (in some cases, with both regio- and stereo-specificity). Click reaction products are physiologically stable with only non-toxic byproducts. In one example, the Azide-Alkyne Huisgen Cycloaddition is a 1,3-dipolar cycloaddition between an azide and a terminal or internal alkyne to give a 1,2,3-triazole. The 1,3-dipolar cycloaddition is a chemical reaction between a 1,3-dipole and a dipolarophile to form a five-membered ring. Linking a programmable nuclease to an agent via click chemistry can create a linker that includes, as the click reaction product, one or more five-membered rings or acyclic derivatives thereof. This 1,3-dipolar cycloaddition is an important route to the regio- and stereoselective synthesis of five-membered heterocycles and their ring-opened acyclic derivatives.

The 1,3-dipolar cycloaddition between organic azides and terminal alkynes, e.g., for bioconjugation, may proceed by a copper(I)-catalyzed version of the Huisgen reaction, CuAAC (for Copper-catalyzed Azide-Alkyne Cycloaddition), which proceeds readily in mild conditions that can approximate physiological conditions. Click chemistry may be bioorthogonal: azides and alkynes are typically not found in biomolecules discussed herein and can be selectively reacted. For discussion see Hein et al., 2009, Click chemistry, a powerful tool for pharmaceutical sciences, Pharm Res 25(10):2216-2230 and McCombs & Owen, 2015, Antibody drug conjugates: design and selection of linker, payload and conjugation chemistry, AAPS J 17(2):339-51, both incorporated by reference.

FIG. 5 shows a composition 501 for treating a viral infection. The composition 501 includes a programmable nuclease 507 designed to bind to and cleave viral nucleic acid. The programmable nuclease 507 is linked by a linker 561 to a TALE domain 557. The composition preferably includes a nanoparticle 571 encapsulating at least the programmable nuclease 507 and the TALE domain 557. The programmable nuclease 507 linked to the TALE domain 557 is delivered using the nanoparticle 571. The nanoparticle 571 may be any suitable nanoparticle including, for example, a liposome. In preferred embodiments, the nanoparticle 571 is a lipid nanoparticle and preferably includes cationic lipids, which may be found to be particularly well suited for topical delivery of compositions of the disclosure to tissues harboring a latent or persistent infection.

FIG. 6 shows a composition 601 for treating a viral infection. The composition 601 includes a programmable nuclease 607 programmed to cleave viral nucleic acid or an RNA encoding the programmable nuclease and a short oligonucleotide 625, such as an ssDNA oligo. The composition preferably includes a nanoparticle 671 encapsulating at least the programmable nuclease 607, or the RNA encoding the programmable nuclease, and the oligonucleotide 625. A feature of the depicted composition 601 is that the nuclease 607 is complexed with a guide RNA 621. The guide RNA 621 has a targeting sequence that is complementary to a target sequence within a viral genome, and the oligonucleotide 625 is complementary, over a shorter extent than for the true target, to the targeting sequence of the guide RNA 621. I.e., in some embodiments, the targeting sequence includes about a 16 to 20 base pair stretch that is complementary to a target in a viral genome (preferably about 20), and the oligonucleotide includes a stretch that is a few base pairs shorter (e.g., about 10 to 15) that is complementary to the targeting sequence. In certain embodiments, the short oligonucleotide 625 may be bound to the guide RNA 621 within the nanoparticle 671 at the time of delivery.
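The relationship between the targeting sequence of the guide RNA and the shorter ssDNA oligonucleotide can be illustrated with a small Python sketch. It assumes the oligonucleotide is the DNA reverse complement of a contiguous window of the RNA targeting sequence. The function name blocking_oligo, the choice of window (via the offset parameter), and the example sequence are hypothetical, since the disclosure does not specify which portion of the targeting sequence the oligonucleotide covers.

```python
# Watson-Crick DNA complement for each RNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def blocking_oligo(targeting_rna, oligo_len=12, offset=0):
    """Return a 5'->3' ssDNA oligo complementary to a window of the targeting RNA.

    `offset` is the 0-based start of the window within the targeting sequence
    (an illustrative parameter, not specified in the disclosure).
    """
    rna = targeting_rna.upper()
    window = rna[offset:offset + oligo_len]
    if len(window) != oligo_len:
        raise ValueError("window extends past the end of the targeting sequence")
    # Reverse the window so the complement is written 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(window))


# Hypothetical 20-nt targeting sequence and a 12-nt blocking oligo.
example_targeting = "GACUUCGAAUGCCGUAGCAU"
example_oligo = blocking_oligo(example_targeting, oligo_len=12)
```

Because the oligonucleotide pairs with fewer bases than a fully matched viral target, it would be expected to be displaced by the true target while still competing with shorter, mismatched pairings.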

Any suitable nanoparticle 671 may be used in the composition 601. In certain embodiments, the nanoparticle is a lipid nanoparticle.

FIG. 7 shows a composition 701 for treating a viral infection, in which the composition 701 includes a programmable nuclease 707 linked to a TALE domain 757 packaged within a lipid nanoparticle 767, e.g., a liposome. Once delivered to cells in vivo in a patient, the TALE domain 757 binds to a first target in a viral genome while the programmable nuclease 707 binds to a second target in the viral genome. In preferred embodiments, the programmable nuclease 707 is a Cas endonuclease and is complexed with a guide RNA 721 as an active ribonucleoprotein (RNP) within the lipid nanoparticle 767. Preferred viral targets include those with episomal DNA genomes during persistence or latency such as human papillomavirus (HPV), Herpes Simplex virus (HSV), Epstein Barr virus (EBV), Kaposi's sarcoma-associated herpesvirus (KSHV), hepatitis B virus (HBV), human Herpes virus 6 (HHV6), human Herpes virus 7 (HHV7), Cytomegalovirus (CMV), or Varicella zoster virus (VZV). The first target, recognized by the TALE domain 757, and the second target, recognized by the guide RNA 721, are both in the genome of a virus. This means that the first and second targets are selected from within the genome of the virus and that the TALE domain 757 and the guide RNA 721 are designed to bind to those targets.

In preferred embodiments, the targets are selected by scanning the genome of the virus for suitable targets. Once suitable first and second targets are identified, the human genome may be scanned for the presence of both of those targets. Once first and second targets are selected that appear in the viral genome and preferably do not also appear in the human genome, the TALE domain and guide RNA may be designed accordingly. It may be preferable to select targets within the viral genome based on where within the genome those targets lie. For example, it may be preferable to select a target that will cause the nuclease to cleave a protein-coding gene and either leave it cleaved or introduce a frame shift. Targets may be selected by reading a reference genome (e.g., a GenBank file) and selection may include referencing annotations to identify genes by category or function. For example, it may be preferable to target cleavage within a gene that is annotated as being functional during latency.
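The screening step described above can be sketched in Python as an exact-match check on both strands. This is a simplified illustration under stated assumptions: genomes are held as plain strings, only exact matches are considered, and a pair is rejected only when both targets also occur in the host genome. The function names and the toy sequences are hypothetical. A screen against a full human reference genome would need an indexed aligner and mismatch tolerance rather than substring search.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def occurs(target, genome):
    """True if `target` appears exactly on either strand of `genome`."""
    target, genome = target.upper(), genome.upper()
    return target in genome or reverse_complement(target) in genome


def pair_is_viral_specific(first, second, viral_genome, host_genome):
    """Accept a (first, second) target pair found in the virus but not, as a pair, in the host."""
    in_virus = occurs(first, viral_genome) and occurs(second, viral_genome)
    both_in_host = occurs(first, host_genome) and occurs(second, host_genome)
    return in_virus and not both_in_host


# Toy sequences for illustration only; not real viral or human sequence.
toy_virus = "TTTACGTACGAAACCCGGGTTT"
toy_host = "GGGACGTACGGGG"
accepted = pair_is_viral_specific("ACGTACG", "CCCGGG", toy_virus, toy_host)
```

In this toy case the first target also occurs in the host sequence, but the pair is still accepted because the second target does not, which mirrors the requirement that both sites be bound before cleavage occurs.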

FIG. 8 is a map of the Epstein Barr genome and is used to illustrate how the guide sequence may be designed. The map shown in FIG. 8 shows certain features in the EBV genome that may be targeted with a programmable nuclease. The marks "#", and "+" are used to indicate features that are related to viral structure, transformation, and latency, respectively. Guide RNAs that target the EBV genome are used in compositions according to certain embodiments. Within a genome of interest, such as EBV, selected regions, or genes are targeted. For example, six regions can be targeted with seven guide RNA designs for different genome editing purposes. In relation to EBV, EBNA1 is the only nuclear Epstein-Barr virus (EBV) protein expressed in both latent and lytic modes of infection. EBNA1 is known to play several important roles in latent infection and is crucial for many EBV functions including gene regulation and latent genome replication. Therefore, guide RNAs sgEBV4 and sgEBV5 were selected to target both ends of the EBNA1 coding region in order to excise this whole region of the genome. These "structural" targets enable systematic digestion of the EBV genome into smaller pieces. EBNA3C and LMP1 are essential for host cell transformation, and guide RNAs sgEBV3 and sgEBV7 were designed to target the 5' exons of these two proteins respectively.

To design guide RNA targeting the EBV genome, the EBV reference genome is referred to. EBNA1 is crucial for many EBV functions including gene regulation and latent genome replication. Guide RNA sgEBV4 and sgEBV5 are targeted to both ends of the EBNA1 coding region in order to excise that region of the genome. Guide RNAs sgEBV1, 2 and 6 fall in repeat regions, so that the success rate of at least one CRISPR cut is multiplied. Those "structural" targets enable systematic digestion of the EBV genome into smaller pieces. EBNA3C and LMP1 are essential for host cell transformation, and guide RNAs sgEBV3 and sgEBV7 are designed to target the 5' exons of these two proteins respectively. Using one or more guide RNA designed to target a selected location within a viral genome, a Cas endonuclease will digest the genome of the virus. Suitable targets include viruses such as Herpes Simplex virus (HSV), human Herpes virus 6 (HHV6), human Herpes virus 7 (HHV7), Kaposi's sarcoma-associated herpesvirus (KSHV), Cytomegalovirus (CMV), Epstein Barr virus (EBV), Varicella zoster virus (VZV), human papillomavirus (HPV), or hepatitis B virus (HBV).

FIG. 9 illustrates a preferred embodiment that includes a composition 901 for treating a viral infection. The composition 901 includes a messenger RNA 937 (mRNA) encoding a programmable nuclease linked to a TALE domain— here, as a transcript of a recombinant gene that will be translated in the infected cells to provide a recombinant fusion protein that includes a Cas endonuclease linked to a TALE domain. The composition 901 also includes a guide RNA 921 with a targeting sequence substantially complementary to a first target within a viral genome. The composition 901 further includes a nanoparticle 971 encapsulating at least the mRNA 937 encoding the programmable nuclease linked to the TALE domain.

FIG. 10 diagrams a composition 1000 for treating a viral infection that includes a nucleic acid vector 1001 (e.g., a plasmid) encoding a programmable nuclease for delivery to viral-infected cells. In the depicted embodiment, the nucleic acid vector 1001 is a plasmid that includes a gene 1027, preferably under control of a promoter 1039. The gene 1027 is preferably a recombinant gene encoding a TALE domain linked to a programmable endonuclease. The plasmid may also include a viral origin of replication 1035 to support maintenance of the plasmid preferentially in viral-infected cells. Where the programmable nuclease gene 1027 codes for Cas endonuclease linked to a TALE DNA-binding domain, the plasmid may also include a guide RNA segment 1055, which includes portions that correspond to targets in genetic material of a virus. When the guide RNA segment 1055 is transcribed, the product is one gRNA with a portion substantially complementary to a target in viral genetic material, preferably with no match in a human genome.

In certain embodiments, vector 1001 is a plasmid and the gene 1027 codes for a Cas endonuclease (e.g., Cas9 or a modified version of Cas9 that is at least a 95% match to Cas9) linked to a TALE domain. The guide RNA segment 1055 preferably includes a 15-16 nucleotide segment that is at least a 70% match to a segment in a genome of a virus adjacent to a protospacer adjacent motif (PAM) (e.g., NGG); and the viral origin of replication 1035 is an origin of replication from the genome of a virus. The virus may be selected from Human papillomavirus (HPV), Hepatitis B virus, Cytomegalovirus, herpes simplex virus, and Epstein Barr virus, for example. These certain embodiments may be preferred where the nucleic acid vector 1001 is part of an antiviral therapeutic composition to be delivered to infected cells.
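As one way to read the "at least a 70% match" criterion, the following Python sketch computes position-by-position identity between a guide segment and an equal-length viral segment. Treating "match" as ungapped positional identity is an assumption made for illustration, and the function name percent_match and the example sequences are hypothetical.

```python
def percent_match(guide_segment, viral_segment):
    """Percent of aligned positions that are identical (ungapped, equal length)."""
    # Compare in DNA letters so an RNA guide (U) can be checked against DNA (T).
    a = guide_segment.upper().replace("U", "T")
    b = viral_segment.upper().replace("U", "T")
    if not a or len(a) != len(b):
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 15-nt segments differing at four positions (11 of 15 identical).
guide = "ACGTACGTACGTACG"
viral = "TCGTTCGTTCGTTCG"
meets_threshold = percent_match(guide, viral) >= 70.0
```

For a 15-nucleotide segment, the 70% threshold corresponds to at least 11 identical positions, since 10 of 15 is about 66.7%.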

The programmable nuclease segment 1027 may preferably code for an RNA-guided nuclease such as Cas9, a modified Cas9 linked by a linker to a TALE DNA-binding domain, Cpf1, or a modified Cpf1. In some embodiments, the guide RNA segment 1055 and the viral origin of replication 1035 are omitted. Any suitable promoter 1039 (e.g., U6 promoter) may be included. These embodiments may be preferred where the gene 1027 is to be expressed e.g., in culture (for example, in E. coli, yeast, or a Lactobacillus) to produce a nuclease for use in an antiviral therapeutic composition. Where the programmable nuclease segment 1027 codes for an RNA-guided nuclease, the expressed protein is preferably complexed with a gRNA to form an active ribonucleoprotein (RNP).

FIG. 11 shows preparation of an antiviral composition. A programmable nuclease 1107 linked to a TALE domain 1157 and guide RNA 1121 are obtained. Those elements are formed in an RNP 1167. In a preferred embodiment, the guide RNA 1121 includes a targeting region substantially complementary to a target site within a viral genome and the TALE domain 1157 binds a second target site within a viral genome. The RNP 1167 is packaged with nanoparticles 1137 to form a composition 1101. The composition 1101 may further include any suitable carrier fluid, cream, or gel e.g., for topical delivery. The composition 1101 is delivered topically to a site in tissue in a patient. The nanoparticles (e.g., liposomes) penetrate tissue, preferably to the basal epithelium or mucosal epithelium where, for example, the virus is HPV. The nanoparticles 1137 deliver the TALE-linked RNP 1167 molecules to infected cells 1179, where the TALE-linked RNP 1167 then cleaves viral DNA 1175.

FIG. 12 diagrams a method 1201 for treating a viral infection. The method 1201 includes providing 1205 a composition that includes a programmable nuclease programmed to cleave viral nucleic acid. The programmable nuclease may include a Cas endonuclease linked to a TALE DNA-binding domain. The programmable nuclease may be provided with a ssDNA oligo partially complementary to a targeting region sequence of a gRNA complexed with the programmable nuclease. Preferably, the programmable nuclease or the RNA encoding the programmable nuclease is encapsulated in a nanoparticle such as a lipid nanoparticle that includes cationic lipids. The composition is delivered 1209 to cells infected by a virus. In the most preferred embodiments, the composition is delivered topically to avoid systemic distribution. The programmable nuclease is preferably a Cas endonuclease programmed to cleave genetic material of the virus. The composition is then used to cleave 1213 the genetic material of the virus.

Claims

What is claimed is:

1. A composition for treating a viral infection, the composition comprising:

a Cas endonuclease complexed with a guide RNA (gRNA) comprising a targeting region at least substantially complementary to a first target in a viral genome; and

a DNA-binding domain linked to the Cas endonuclease, the DNA-binding domain programmed to bind a second target in the viral genome,

wherein binding of both the first target and the second target is required for the Cas endonuclease to cleave the viral genome.

2. The composition of claim 1, wherein the DNA-binding domain is a transcription activator-like effector (TALE) DNA-binding domain comprising about 7 repeat variable diresidues (RVDs) corresponding to the second target in the viral genome.

3. The composition of claim 1, wherein the DNA-binding domain is a zinc finger DNA-binding domain of a zinc finger nuclease (ZFN).

4. The composition of claim 1, wherein the DNA-binding domain is a second, catalytically inactive Cas endonuclease and gRNA complex.

5. The composition of claim 1, wherein the DNA-binding domain is from an endosymbiotic bacterium Burkholderia rhizoxinica protein (Bat protein).

6. The composition of claim 1, wherein the targeting region comprises an about 15-16 base pair sequence complementary to the first target in the viral genome.

7. The composition of claim 1, wherein the viral genome is selected from the group consisting of Herpes Simplex virus (HSV), human Herpes virus 6 (HHV6), human Herpes virus 7 (HHV7), Kaposi's sarcoma-associated herpesvirus (KSHV), Cytomegalovirus (CMV), Epstein Barr virus (EBV), Varicella zoster virus (VZV), human papillomavirus (HPV), Merkel cell polyomavirus (MCV), and hepatitis B virus (HBV).

8. The composition of claim 1, further comprising a single-strand DNA oligonucleotide at least partially complementary to the targeting region.

9. The composition of claim 1, wherein the targeting region is at least partially complementary to a sequence in a host genome.

10. The composition of claim 1, wherein the TALE DNA-binding domain is non-covalently bound to the Cas endonuclease.

11. The composition of claim 1, wherein the Cas endonuclease is covalently linked to the TALE DNA-binding domain.

12. The composition of claim 1, wherein the Cas endonuclease is linked to the TALE DNA-binding domain through a linker.

13. The composition of claim 12, wherein the linker is of a length approximately equivalent to a conformational distance between the first target and the second target in the viral genome.

14. The composition of claim 12, wherein the linker comprises protein.

15. The composition of claim 14, wherein the linker comprises at least one selected from the group consisting of: a plurality of proline residues; a plurality of glycine residues; and a plurality of threonine and serine residues.

16. The composition of claim 13, wherein the linker comprises a non-protein chemical linker.

17. The composition of claim 16, wherein the non-protein chemical linker comprises one selected from the group consisting of: a disulfide bond; a thioether bond; an amine bond; a hydrazine linkage; an amide bond; an imidoester; maleimide; PEG; and BM(PEG)n with 1 < n < 9.

18. The composition of claim 12, wherein the linker is attached to the Cas endonuclease at an amino acid selected from the group consisting of lysine, cysteine, aspartic acid, and glutamic acid.

19. A composition for treating a viral infection, the composition comprising:

a single-strand DNA (ssDNA) oligonucleotide at least partially complementary to the targeting region,

wherein the targeting region has a first binding affinity for the ssDNA oligonucleotide that is less than a binding affinity between the targeting region and a target sequence that is 100% complementary to the targeting region, and

wherein the first binding affinity is greater than a binding affinity between the targeting region and a mismatched target sequence.

20. The composition of claim 19, wherein the mismatched target sequence is at least 95% complementary to the targeting region.

21. The composition of claim 19, wherein the targeting region comprises about 20 nucleotides.

22. The composition of claim 21, wherein the ssDNA oligonucleotide comprises between about 10 and about 15 nucleotides.

23. The composition of claim 19, further comprising a transcription activator-like effector (TALE) DNA-binding domain linked to the Cas endonuclease, the TALE DNA-binding domain programmed to bind a second target in the viral genome,