WO2024050548A2 - Compact promoters for targeting hypoxia induced genes - Google Patents

Compact promoters for targeting hypoxia induced genes

Info

Publication number
WO2024050548A2
Authority
WO
WIPO (PCT)
Prior art keywords
promoter
certain embodiments
endonuclease
aav
seq
Prior art date
Application number
PCT/US2023/073368
Other languages
French (fr)
Other versions
WO2024050548A3 (en)
WO2024050548A9 (en)
Inventor
Vinod JASKULA-RANGA
Original Assignee
Hunterian Medicine Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunterian Medicine Llc filed Critical Hunterian Medicine Llc
Publication of WO2024050548A2 publication Critical patent/WO2024050548A2/en
Publication of WO2024050548A3 publication Critical patent/WO2024050548A3/en
Publication of WO2024050548A9 publication Critical patent/WO2024050548A9/en

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2320/00 Applications; Uses
    • C12N2320/10 Applications; Uses in screening processes
    • C12N2320/11 Applications; Uses in screening processes for the determination of target sites, i.e. of active nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2330/00 Production
    • C12N2330/50 Biochemical production, i.e. in a transformed host cell
    • C12N2330/51 Specially adapted vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14122 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14145 Special targeting system for viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/001 Vector systems having a special element relevant for transcription controllable enhancer/promoter combination
    • C12N2830/002 Vector systems having a special element relevant for transcription controllable enhancer/promoter combination inducible enhancer/promoter combination, e.g. hypoxia, iron, transcription factor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/20 Vector systems having a special element relevant for transcription transcription of more than one cistron
    • C12N2830/205 Vector systems having a special element relevant for transcription transcription of more than one cistron bidirectional

Definitions

  • the invention relates to compact promoters and their use in expressing gene editing systems, e.g., for treating or preventing disease, such as pulmonary arterial hypertension.
  • Pulmonary arterial hypertension is a rapidly progressing pulmonary vascular disease that leads to right heart failure and premature death.
  • the hallmark features of PAH are increased pulmonary arterial pressure, vascular remodeling, and right ventricle hypertrophy.
  • There are no curative treatments for PAH, and current medications result in modest impacts to morbidity and mortality; approximately 1,000 new cases of PAH are diagnosed in the U.S. each year, with a median survival of 6 years.
  • novel therapies are urgently needed.
  • HIF-2α, encoded by the endothelial PAS domain protein 1 (EPAS1) gene, is expressed in the pulmonary endothelium, and strong evidence supports HIF-2α pathway activation in PAH.
  • Patients with PAH have elevated HIF-2α levels in endothelial cells, and pathway activation is well supported by preclinical models.
  • reduced pathway activity through either pharmacologic inhibition or conditional knockouts is protective in multiple animal models.
  • a role for HIF-2α is further supported by genome-wide studies of high-altitude populations with low pulmonary arterial pressure that carry EPAS1 variants with reduced activity.
  • Gene-editing represents a novel therapeutic approach to suppress the HIF-2α pathway activation that underlies PAH etiology.
  • CRISPR clustered regularly interspersed short palindromic repeats
  • AAV adeno-associated viruses
  • the disclosure is based, in part, upon the discovery of compact, bidirectional promoters that can be used to express both an endonuclease (e.g., a Cas9 endonuclease) and a guide RNA (gRNA).
  • a compact, bidirectional promoter directs expression of a gRNA in one direction and an endonuclease in the other direction. Accordingly, the promoters disclosed herein use less space than prior art promoters, allowing both an endonuclease and a gRNA to be packaged in a single vector (e.g., a plasmid or an AAV).
  • the disclosure relates to a non-naturally occurring endonuclease system having a nucleotide sequence encoding a promoter, a nucleotide sequence encoding a polyribonucleotide, and a nucleotide sequence encoding an endonuclease wherein the promoter is operably linked to the sequence encoding the polyribonucleotide and to the sequence encoding the endonuclease, wherein a portion of the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene, and wherein the polyribonucleotide directs the endonuclease to the target sequence.
  • the promoter is a bidirectional promoter.
  • the promoter is an H1 promoter.
  • the H1 promoter is a bidirectional promoter that includes pol II and pol III activity.
  • the promoter has a length of from 50 bp to 225 bp. In certain embodiments, the promoter has a length of from 50 bp to 200 bp. In certain embodiments, the promoter has a length of from 50 bp to 180 bp.
  • the promoter includes a nucleic acid sequence selected from SEQ ID NOs: 1-226, 242-521, and 171-175, or a nucleic acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the DNA endonuclease is Cas9, Cas12, or MAD7.
  • the Cas9 endonuclease is selected from the group consisting of SpCas9, SaCas9, StCas9, NmCas9, and GeoCas9.
  • the Cas12 endonuclease is selected from the group consisting of Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, and Cas12i.
  • the DNA endonuclease is selected from the group consisting of Cas14a, Cas14b, Cas14c, and Cas .
  • the endonuclease is codon optimized for expression in a eukaryotic cell.
  • the portion of the polyribonucleotide that hybridizes to the target sequence includes a nucleotide sequence selected from SEQ ID NOs: 538-710.
  • the endonuclease system is incorporated into a single vector.
  • the single vector is a viral vector or a plasmid.
  • the single vector is an AAV vector.
  • the AAV vector includes a nucleotide sequence having a length of from 3 kbp to 6 kbp.
  • the AAV vector is selected from the group including AAV1, AAV2, and AAV5.
  • the AAV vector includes a non-naturally occurring nucleotide sequence encoding a targeting peptide.
  • the targeting peptide confers cell type-specific tropism to the AAV vector.
  • the AAV vector confers tropism to a lung endothelial cell and/or to a lung artery smooth muscle cell.
  • the targeting peptide includes from 5 to 25 amino acids.
  • the targeting peptide includes an amino acid sequence selected from SEQ ID NOs: 533-537 or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the targeting peptide includes an amino acid sequence including ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537).
  • the AAV vector is AAV-L1.
  • the hypoxia-induced gene includes hypoxia-inducible factor-1-alpha (HIF1A), hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1).
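The embodiments above recite two length windows: a promoter of 50 bp to 225 bp and an AAV vector sequence of 3 kbp to 6 kbp. The following is a minimal Python sketch of the size arithmetic behind fitting a promoter, a guide RNA, and an endonuclease into a single vector. The component names and lengths are hypothetical placeholders, and treating the 3 kbp to 6 kbp window as a bound on the summed components (ITRs included) is an assumption of this sketch, not something the text states.

```python
# Length windows taken from the embodiments above (values in base pairs).
PROMOTER_BP_RANGE = (50, 225)
AAV_PAYLOAD_BP_RANGE = (3_000, 6_000)


def within(value, bounds):
    """Return True when value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def check_construct(parts):
    """Sum component lengths and test them against the stated windows.

    `parts` maps a component name to its length in bp and must contain a
    "promoter" entry. Names and lengths are hypothetical illustrations.
    """
    total = sum(parts.values())
    return {
        "total_bp": total,
        "promoter_ok": within(parts["promoter"], PROMOTER_BP_RANGE),
        "payload_ok": within(total, AAV_PAYLOAD_BP_RANGE),
    }


# Hypothetical component lengths, chosen only to illustrate the arithmetic.
example = {
    "itr_5": 145,
    "promoter": 180,
    "guide_rna": 100,
    "endonuclease": 4_100,
    "terminator": 50,
    "itr_3": 145,
}
report = check_construct(example)
print(report)
```

Swapping the hypothetical 180 bp promoter for a longer promoter pair changes only the `promoter` entry, which makes it easy to see how much of the payload window a given promoter choice consumes.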
  • the invention provides a method of preventing or treating Pulmonary Hypertension (PH) or Pulmonary Arterial Hypertension (PAH) in a subject in need thereof, the method including administering to the subject an adeno-associated viral (AAV) vector having a nucleotide sequence encoding a promoter, a nucleotide sequence encoding a polyribonucleotide, and a nucleotide sequence encoding an endonuclease, wherein the promoter is operably linked to the sequence encoding the polyribonucleotide and to the sequence encoding the endonuclease, wherein a portion of the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene in a cell of the subject, and wherein the polyribonucleotide directs the endonuclease to the target sequence.
  • the promoter is a bidirectional promoter.
  • the promoter is an H1 promoter.
  • the H1 promoter is a bidirectional promoter including pol II and pol III activity.
  • the pol II activity promotes expression of the endonuclease and the pol III activity promotes expression of the polyribonucleotide.
  • the promoter has a length of from 50 bp to 225 bp. In certain embodiments, the promoter has a length of from 50 bp to 200 bp. In certain embodiments, the promoter has a length of from 50 bp to 180 bp.
  • the promoter includes a nucleic acid sequence selected from SEQ ID NOs: 1-226, 242-521, and 171-175, or a nucleic acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the endonuclease is Cas9, Cas12, or MAD7.
  • the Cas9 endonuclease is selected from the group including SpCas9, SaCas9, StCas9, NmCas9, and GeoCas9.
  • the Cas12 endonuclease is selected from the group including Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, and Cas12i.
  • the DNA endonuclease is selected from the group including Cas14a, Cas14b, Cas14c, and Cas . In certain embodiments, the endonuclease is codon optimized for expression in a eukaryotic cell.
  • the portion of the polyribonucleotide that hybridizes to the target sequence includes a nucleotide sequence selected from SEQ ID NOs: 538-710.
  • the AAV vector includes a nucleotide sequence having a length of from 3 kbp to 6 kbp.
  • the AAV vector is selected from the group consisting of AAV1, AAV2, and AAV5.
  • the AAV vector further includes a non-naturally occurring nucleotide sequence encoding a targeting peptide.
  • the targeting peptide confers cell type-specific tropism to the AAV vector.
  • the AAV vector confers tropism to a lung endothelial cell and/or to a lung artery smooth muscle cell.
  • the targeting peptide includes from about 4 to about 25 amino acids, for example, from about 4 to about 10 amino acids, from about 4 to about 15 amino acids, from about 4 to about 20 amino acids, from about 5 to about 10 amino acids, from about 5 to about 15 amino acids, or from about 5 to about 20 amino acids.
  • the targeting peptide includes an amino acid sequence selected from SEQ ID NOs: 533-537 or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the targeting peptide includes an amino acid sequence including ESGHGYF (SEQ ID NO: 533).
  • the targeting peptide includes an amino acid sequence including ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537).
  • the AAV vector is AAV-L1.
  • the method further includes administering the composition prophylactically, concurrently, or following onset of PH or PAH.
  • the hypoxia-induced gene includes hypoxia-inducible factor-1-alpha (HIF1A), hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1).
  • the method further includes administering an inhibitor of the hypoxia-induced gene.
  • the inhibitor is a small molecule inhibitor.
  • the small molecule inhibitor is selected from the group including belzutifan, PT2385, vadadustat, KC7F2, CAY10585, 2-Methoxyestradiol, SYP-5, PT2399, N-Acetylcysteine amide, IDF-11774, Lificiguat (YC-1), PX-478 2HCl, BAY 87-2243, C76 (Methyl-3-(2-(cyano(methylsulfonyl)methylene)hydrazino)thiophene-2-carboxylate), Roxadustat (FG-4592), Daprodustat (GSK1278863), Desidustat (ZYAN-1), Molidustat (Bay 85-3934), MK-8617, IOX-2, 2-methoxyestradiol, GN-44028
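Several embodiments above recite sequences having "at least about 80%" up to "at least about 99.5%" identity to a listed sequence. As a rough illustration of that arithmetic, the sketch below computes percent identity over two sequences that are assumed to be already aligned and of equal length. This is a simplification: the text does not specify how identity is measured, and gapped alignment is outside the scope of the sketch. The variant peptide is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned sequences.

    Assumes both sequences are already aligned to the same length; no gap
    handling or alignment is attempted here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


reference = "CGFECVRQCPER"  # SEQ ID NO: 535, as listed in the text
variant = "CGFECVRQCPEK"    # hypothetical single-residue variant
identity = percent_identity(reference, variant)
print(round(identity, 1))
```

For a 12-residue peptide a single substitution already drops identity below 95%, which shows why the lower thresholds matter most for the short targeting peptides.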
  • FIG. 1 is a schematic drawing showing the region in which the H1 promoter is located, between the start of the H1RNA gene (left) to the start of the PARP2 gene (right). The approximate distance from the H1RNA transcriptional start site to the PARP2 translational start site is indicated as 175 bp.
  • FIG. 2 is a graph showing quantified editing observed as a result of suppressing GFP expression, at three different target sites of GFP, using a single H1 bidirectional promoter or the CBh Pol II promoter and the U6 Pol III promoter.
  • FIG. 3 is a graph showing the editing observed as a result of suppressing GFP expression, at three different target sites of GFP, by an endonuclease (SpCas9) system operably linked to a single H1 bidirectional promoter packaged into an AAV vector. No comparison to pX330, a widely used CRISPR plasmid, is shown because the pX330 system was too large to be packaged into an AAV vector.
  • FIG. 4A is an electron micrograph image of a purified AAV sample indicating proper packaging of the endonuclease system constructs into viral capsids (arrows indicate empty particles). Scale bar indicates 100 nm.
  • FIG. 4B is an electron micrograph image of a purified AAV sample indicating proper packaging of the endonuclease system constructs into viral capsids (arrows indicate empty particles). Scale bar indicates 20 nm.
  • FIG. 5A is a schematic drawing showing a timeline of the in vivo targeting of eGFP expression in a transgenic mouse model using AAV9 loaded with eGFP-targeting and editing constructs employing an H1 promoter. Two organs with AAV9 tropism, the liver and the heart, were analyzed for eGFP expression changes.
  • FIG. 5B is a graph showing indel analysis (quantified deletions) in the liver when measured at 14, 21, 28, 35, and 42 days following administration of the eGFP targeting endonuclease system packaged in AAV9 particles.
  • FIG. 5C is a graph showing indel analysis (quantified deletions) in the heart when measured at 14, 21, 28, 35, and 42 days following administration of the eGFP targeting endonuclease system packaged in AAV9 particles.
  • FIG. 6 is a schematic drawing showing an experimental timeline for prophylactic treatment.
  • the experimental timeline is shown with a 5-week old mouse being administered AAV at day 0 under normoxic conditions and shifted to hypoxic conditions at 3 weeks.
  • the dashed line models AAV expression in vivo, with expression onset just prior to hypoxia exposure.
  • FIG. 7 is a schematic drawing showing an experimental timeline for therapeutic treatment.
  • the experimental timeline shows a mouse being administered AAV at initiation of hypoxic conditions.
  • the dashed line models AAV expression in vivo, with expression onset occurring during hypoxic conditions.
  • the disclosure is based, in part, upon the discovery of compact, bidirectional promoters that can be used to express both an endonuclease (e.g., a Cas9 endonuclease) and a guide RNA (gRNA).
  • a non-naturally occurring endonuclease system including a compact promoter is disclosed herein.
  • the endonuclease system includes a nucleotide sequence encoding the promoter, a nucleotide sequence encoding a polyribonucleotide, and a nucleotide sequence encoding an endonuclease.
  • the disclosure is based on methods for preventing and treating Pulmonary Hypertension (PH) or Pulmonary Arterial Hypertension (PAH) by administering the non-naturally occurring endonuclease system.
  • the endonuclease is administered with an inhibitor of a hypoxia-induced gene.
  • Enzymatic reactions and purification techniques are performed according to manufacturer’s specifications, as commonly accomplished in the art or as described herein.
  • the nomenclatures used in connection with, and the laboratory procedures and techniques of, analytical chemistry, biochemistry, immunology, molecular biology, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques are used for chemical syntheses, and chemical analyses.
  • AAV adeno-associated virus
  • AAV refers to a vector derived from an AAV serotype, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7,
  • AAV vectors can have one or more of the AAV wild-type genes deleted in whole or part, e.g., the rep and/or cap genes, but retain functional flanking inverted terminal repeat (ITR) sequences.
  • Functional ITR sequences promote the rescue, replication, and packaging of the AAV virion.
  • an AAV vector is defined herein to include at least those sequences required in cis for replication and packaging (e.g., functional ITRs) of the virus. ITRs do not need to be the wild-type polynucleotide sequences and may be altered, e.g., by the insertion, deletion, or substitution of nucleotides, so long as the sequences provide for functional rescue, replication, and packaging.
  • AAV expression vectors are constructed using known techniques to at least provide as operatively linked components in the direction of transcription, control elements including a transcriptional initiation region, the DNA of interest (e.g., a polynucleotide encoding a nucleic acid molecule of the disclosure) and a transcriptional termination region.
  • the terms “adeno-associated virus inverted terminal repeats” and “AAV ITRs” refer to art-recognized regions flanking each end of the AAV genome which function together in cis as origins of DNA replication and as packaging signals for the virus.
  • AAV ITRs together with the AAV rep coding region, provide for the efficient excision and integration of a polynucleotide sequence interposed between two flanking ITRs into a mammalian genome.
  • the polynucleotide sequences of AAV ITR regions are known.
  • an “AAV ITR” does not necessarily include the wildtype polynucleotide sequence, which may be altered, e.g., by the insertion, deletion, or substitution of nucleotides.
  • the AAV ITR may be derived from any of several AAV serotypes, including without limitation AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC
  • 5' and 3' ITRs which flank a selected polynucleotide sequence in an AAV vector need not be identical or derived from the same AAV serotype or isolate, so long as they function as intended, e.g., to allow for excision and rescue of the sequence of interest from a host cell genome or vector, and to allow integration of the heterologous sequence into the recipient cell genome when AAV Rep gene products are present in the cell.
  • AAV ITRs may be derived from any of several AAV serotypes, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HS
  • An “AAV inverted terminal repeat (ITR)” sequence is an approximately 145-nucleotide sequence that is present at both termini of the native single-stranded AAV genome.
  • the outermost 125 nucleotides of the ITR can be present in either of two alternative orientations, leading to heterogeneity between different AAV genomes and between the two ends of a single AAV genome.
  • the outermost 125 nucleotides also contain several shorter regions of self-complementarity (designated A, A', B, B', C, C' and D regions), allowing intrastrand base-pairing to occur within this portion of the ITR.
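The ITR description above turns on self-complementarity: a region such as A can base-pair with its primed partner A' on the same strand. A minimal Python sketch of that relationship follows, using reverse complements. The arm sequence is a hypothetical toy string rather than real ITR sequence, and only perfect Watson-Crick pairing is modelled.

```python
# Translation table for Watson-Crick complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_pair(region, partner):
    """True when `partner` is the exact reverse complement of `region`.

    This models perfect intrastrand base-pairing only; mismatches and
    bulges found in real hairpins are not considered.
    """
    return reverse_complement(region).upper() == partner.upper()


# Hypothetical toy arm, not taken from any real ITR.
arm = "GGCCTCAGT"
arm_prime = reverse_complement(arm)
print(arm_prime, can_pair(arm, arm_prime))
```

Two regions that satisfy `can_pair` can fold back on each other within a single strand, which is the property the A/A', B/B', and C/C' designations describe.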
  • administering or “administration” of a substance, a compound, or an agent to a subject can be carried out using one of a variety of methods known to those skilled in the art.
  • administration may be local.
  • administration may be systemic.
  • Administering can also be performed, for example, once, a plurality of times, and/or over one or more extended periods.
  • the administration includes both direct administration, including self-administration, and indirect administration, including the act of prescribing a drug.
  • a physician who instructs a subject to self-administer a drug, or to have the drug administered by another, and/or who provides a subject with a prescription for a drug is administering the drug to the subject.
  • a “coding sequence” is a portion of a nucleic acid that contains codons that can be translated into amino acids. Although a “stop codon” (TAG, TGA, TAA) is not translated into an amino acid, it may be considered to be part of a coding region, if present, but any flanking sequences, for example, promoters, ribosome binding sites, transcriptional terminators, introns, 5' and 3' untranslated regions, and the like, are not part of the coding region.
  • “codon optimization” refers to the process of modifying a nucleic acid sequence in accordance with the principle that the frequency of occurrence of synonymous codons (e.g., codons that code for the same amino acid) in coding DNA is biased in different species. Such codon degeneracy allows an identical polypeptide to be encoded by a variety of nucleotide sequences. Sequences modified in this way are referred to herein as “codon-optimized.” This process may be performed on any of the sequences described in this specification to enhance expression or stability. Codon optimization may be performed in a manner, such as that described in, e.g., U.S. Patent Nos.
  • codon optimization includes the incorporation of multiple stop codons.
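The codon optimization definition above rests on codon degeneracy: different nucleotide sequences can encode an identical polypeptide. The Python sketch below illustrates this by recoding a short coding sequence with a different set of synonymous codons and confirming the translation is unchanged. The `PREFERRED` table is a hypothetical stand-in for real host codon-usage data, the input DNA is just one possible encoding of the ESGHGYF peptide plus a stop codon, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA codon by codon; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical preferred codons; a real table would come from codon-usage
# statistics for the intended host species.
PREFERRED = {
    "E": "GAG", "S": "AGC", "G": "GGC", "H": "CAC",
    "Y": "TAC", "F": "TTC", "*": "TGA",
}


def recode(dna):
    """Replace each codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


original = "GAATCTGGTCATGGTTATTTTTAA"  # one possible encoding of ESGHGYF + stop
optimized = recode(original)
print(optimized, translate(optimized) == translate(original))
```

The recoded sequence differs at the nucleotide level but translates to the same peptide, which is the sense in which a sequence is "codon-optimized" without changing the encoded protein.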
  • consensus sequence refers to a calculated sequence representing the most frequent nucleotide residues found at each position in a plurality of similar sequences. Typically, a consensus sequence is determined by sequence alignment in which similar sequences are compared to each other and similar sequence motifs are calculated.
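The consensus sequence definition above describes taking the most frequent residue at each position across aligned, similar sequences. A minimal Python sketch of that calculation follows. It assumes the input sequences are already aligned to equal length, and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Most frequent residue at each column of pre-aligned sequences.

    Ties are resolved in favor of the residue seen first in the column,
    which follows from Counter's insertion ordering.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned)
    )


# Hypothetical aligned sequences, for illustration only.
sequences = ["ACGTAC", "ACGTTC", "ACCTAC", "TCGTAC"]
result = consensus(sequences)
print(result)
```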
  • a “deletion” may include the deletion of subject amino acids, deletion of small groups of amino acids such as 2, 3, 4, or 5 amino acids or deletion of larger amino acid regions, such as the deletion of specific amino acid domains or other features.
  • the term “functional fragment” refers to a fragment of (a) a promoter or (b) a gene or coding sequence (e.g., an mRNA) that encodes a protein (e.g., an endonuclease) that retains, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of at least one activity of the corresponding full-length, naturally occurring promoter or protein.
  • the term “fragment of” refers to a segment (e.g., a segment of at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99%, at least about 99.5%, or at least about 99.9%) of the full-length gene(s) or nucleic acid molecule(s) of interest.
  • a “helper virus” for AAV refers to a virus that allows an AAV (which is a defective parvovirus) to be replicated and packaged by a host cell.
  • a number of such helper viruses are known in the art.
  • heterologous refers to regions that are not normally associated with a particular nucleic acid in nature.
  • a “coding region heterologous to a promoter” is a coding region that is not normally associated with the promoter in nature.
  • a “host cell” includes an individual cell or cell culture that can be or has been a recipient for vector(s) for incorporation of polynucleotide inserts.
  • the term host cell may refer to the packaging cell line in which a recombinant AAV (rAAV) is produced from a plasmid.
  • the term “host cell” may refer to a target cell in which expression of a transgene is desired.
  • An “insertion” may include the insertion of subject amino acids; insertion of small groups of amino acids, such as 2, 3, 4, or 5 amino acids; or insertion of larger amino acid regions, such as the insertion of specific amino acid domains or other features.
  • ITR inverted terminal repeat
  • the term “isolated,” with reference to nucleic acid molecules or polypeptides, means that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which it is naturally associated in nature and as found in nature.
  • “Operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner.
  • a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression.
  • the terms “patient,” “subject,” or “individual” are used interchangeably herein and refer to either a human or a non-human animal. These terms include mammals, such as humans, nonhuman primates, laboratory animals, livestock animals (including bovines, porcines, camels, etc.), companion animals (e.g., canines, felines, other domesticated animals, etc.) and rodents (e.g., mice and rats).
  • the subject is a human that is at least 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or 95 years of age.
  • Percent (%) sequence identity or “percent (%) identical to” with respect to a reference polypeptide (or nucleotide) sequence is defined as the percentage of amino acid residues (or nucleic acids) in a candidate sequence that are identical with the amino acid residues (or nucleic acids) in the reference polypeptide (nucleotide) sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for aligning sequences, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
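As a rough illustration of the definition above, the following Python sketch computes percent identity from two sequences that have already been aligned (for example by one of the tools named above), with `-` marking gaps. The function name and the choice of denominator (the number of aligned columns, gaps included) are assumptions made for this example; alignment programs differ in how they normalize.

```python
def percent_identity(reference_aligned, candidate_aligned, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumptions: both strings come from the same pairwise alignment, gaps
    are marked with `gap`, and the denominator is the number of alignment
    columns (columns that are gaps in both sequences are ignored).
    Conservative substitutions are not counted as identical.
    """
    if len(reference_aligned) != len(candidate_aligned):
        raise ValueError("aligned sequences must have equal length")

    columns = [
        (r, c)
        for r, c in zip(reference_aligned, candidate_aligned)
        if not (r == gap and c == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")

    matches = sum(1 for r, c in columns if r == c and r != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

In the toy alignment, four of six columns are identical.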
  • polynucleotide refers to chains of nucleotides of any length, and includes DNA and RNA (e.g., polyribonucleotides).
  • the nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases, and/or their analogs, or any substrate that can be incorporated into a chain by DNA or RNA polymerase.
  • a polynucleotide may comprise modified nucleotides, such as methylated nucleotides and their analogs. If present, modification to the nucleotide structure may be imparted before or after assembly of the chain.
  • the sequence of nucleotides may be interrupted by non-nucleotide components.
  • a polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
  • Other types of modifications include, for example, “caps,” substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelators (e.g.
  • any of the hydroxyl groups ordinarily present in the sugars may be replaced, for example, by phosphonate groups, phosphate groups, protected by standard protecting groups, or activated to prepare additional linkages to additional nucleotides, or may be conjugated to solid supports.
  • the 5' and 3' terminal OH can be phosphorylated or substituted with amines or organic capping group moieties of from 1 to 20 carbon atoms.
  • Other hydroxyls may also be derivatized to standard protecting groups.
  • Polynucleotides can also contain analogous forms of ribose or deoxyribose sugars that are generally known in the art, including, for example, 2'-O-methyl-, 2'-O-allyl, 2'-fluoro- or 2'-azido-ribose, carbocyclic sugar analogs, alpha- or beta-anomeric sugars, epimeric sugars such as arabinose, xyloses or lyxoses, pyranose sugars, furanose sugars, sedoheptuloses, acyclic analogs and abasic nucleoside analogs such as methyl riboside.
  • One or more phosphodiester linkages may be replaced by alternative linking groups.
  • linking groups include, but are not limited to, embodiments wherein phosphate is replaced by P(O)S (“thioate”), P(S)S (“dithioate”), (O)NR2 (“amidate”), P(O)R, P(O)OR', CO or CH2 (“formacetal”), in which each R or R' is independently H or substituted or unsubstituted alkyl (1-20 C) optionally containing an ether (-O-) linkage, aryl, alkenyl, cycloalkyl, cycloalkenyl, or aralkyl. Not all linkages in a polynucleotide need be identical. The preceding description applies to all polynucleotides referred to herein, including RNA and DNA.
  • IUPAC nucleotide code is used throughout. IUPAC nucleotide code is provided in
  • polyribonucleotide refers to polynucleotide polymers containing 50% or more ribose bases including unmodified and/or modified ribonucleotides.
  • a “guide RNA” is a type of polyribonucleotide that includes a CRISPR RNA sequence (crRNA, also referred to as a “guide sequence” or “spacer”), and, in certain embodiments, a trans-activating CRISPR RNA sequence (tracrRNA).
  • the tracrRNA if present, binds to an endonuclease (e.g., a CRISPR enzyme such as Cas9) and the crRNA is complementary to a target sequence.
  • polypeptide “oligopeptide,” “peptide,” and “protein” are used interchangeably herein to refer to chains of amino acids of any length.
  • the chain may be linear or branched, it may comprise modified amino acids, and/or may be interrupted by non-amino acids.
  • the terms also encompass an amino acid chain that has been modified naturally or by intervention, for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component.
  • polypeptides containing one or more analogs of an amino acid including, for example, unnatural amino acids, etc.
  • the polypeptides can occur as single chains or associated chains.
  • the terms “prevent,” “preventing,” and “prevention” refer to the prevention of the recurrence or onset of, or a reduction in one or more symptoms of a disease or condition in a subject as a result of the administration of a therapy (e.g., a prophylactic or therapeutic agent).
  • “prevent,” “preventing,” and “prevention” refer to the inhibition or a reduction in the development or onset of a disease or condition, or the prevention of the recurrence, onset, or development of one or more symptoms of a disease or condition, in a subject resulting from the administration of a therapy (e.g., a prophylactic or therapeutic agent), or the administration of a combination of therapies (e.g., a combination of prophylactic or therapeutic agents).
  • promoter refers to a recognition site on DNA that is bound by an RNA polymerase. The polymerase drives transcription of a transgene. Exemplary promoters suitable for use with the compositions and methods described herein are described herein. Additionally, the term “promoter” may refer to a synthetic promoter, such as a regulatory DNA sequence that does not occur naturally in a biological system. Synthetic promoters contain parts of naturally occurring promoters combined with polynucleotide sequences that do not occur in nature and can be optimized to express recombinant DNA.
  • a “recombinant adeno-associated virus (rAAV virus)” or “rAAV viral particle” refers to a viral particle composed of at least one AAV capsid protein and an encapsidated rAAV vector genome.
  • a “recombinant AAV vector” refers to a polynucleotide vector based on an AAV comprising one or more heterologous sequences (i.e., nucleic acid sequence not of AAV origin) that are flanked by at least one AAV ITR.
  • Such rAAV vectors can be replicated and packaged into infectious viral particles when present in a host cell that has been infected with a suitable helper virus (or that is expressing suitable helper functions) and that is expressing AAV rep and cap gene products (i.e., AAV Rep and Cap proteins).
  • When an rAAV vector is incorporated into a larger polynucleotide (e.g., in a chromosome or in another vector such as a plasmid used for cloning or transfection), then the rAAV vector may be referred to as a “provector” which can be “rescued” by replication and encapsidation in the presence of AAV packaging functions and suitable helper functions.
  • An rAAV vector can be in any of a number of forms, including, but not limited to, plasmids, linear artificial chromosomes, complexed with lipids, encapsulated within liposomes, and encapsidated in a viral particle, e.g., an AAV particle.
  • An rAAV vector can be packaged into an AAV virus capsid to generate a “recombinant adeno-associated viral particle (rAAV particle)”.
  • regulatory element or “regulatory sequence” is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences).
  • Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).
  • a tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver and pancreas), or particular cell types (e.g., lymphocytes). Regulatory elements may also direct expression in a temporally dependent manner, such as in a cell cycle-dependent or developmental stage-dependent manner, which may also be tissue- or cell type-specific. Also encompassed by the term “regulatory element” are enhancer elements, such as WPRE; CMV enhancers; the R-U5' segment in LTR of HTLV-I (Takebe et al. (1988) MOL. CELL. BIOL.
  • a “substitution” includes replacing a wild-type amino acid with another (e.g., a non-wild-type amino acid).
  • the another (e.g., non-wild-type) or inserted amino acid is Ala (A), His (H), Lys (K), Phe (F), Met (M), Thr (T), Gln (Q), Asp (D), or Glu (E).
  • the another (e.g., non-wild-type) or inserted amino acid is A.
  • the another (e.g., non-wild-type) amino acid is Arg (R), Asn (N), Cys (C), Gly (G), Ile (I), Leu (L), Pro (P), Ser (S), Trp (W), Tyr (Y), or Val (V).
  • Conventional amino acids include L or D stereochemistry.
  • the another (e.g., non-wild-type) amino acid is a member of a different group (e.g., an aromatic amino acid is substituted for a non-polar amino acid).
  • Substantial modifications in the biological properties of the polypeptide are accomplished by selecting substitutions that differ significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a β-sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
  • Naturally occurring residues are divided into groups based on common sidechain properties: (1) non-polar: Norleucine, Met, Ala, Val, Leu and Ile; (2) polar without charge: Cys, Ser, Thr, Asn and Gln; (3) acidic (negatively charged): Asp and Glu; (4) basic (positively charged): Lys and Arg; (5) residues that influence chain orientation: Gly and Pro; and (6) aromatic: Trp, Tyr, Phe, and His.
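The six side-chain groups listed above lend themselves to a simple lookup. The Python sketch below encodes them with three-letter codes (`Nle` is used here for norleucine) and reports whether a substitution stays within a group or crosses groups. The names `SIDE_CHAIN_GROUPS`, `group_of`, and `same_group` are hypothetical helpers introduced only for illustration.

```python
# Side-chain groups as enumerated in the text; keys are descriptive labels.
SIDE_CHAIN_GROUPS = {
    "non-polar": {"Nle", "Met", "Ala", "Val", "Leu", "Ile"},
    "polar without charge": {"Cys", "Ser", "Thr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg"},
    "chain orientation": {"Gly", "Pro"},
    "aromatic": {"Trp", "Tyr", "Phe", "His"},
}


def group_of(residue):
    """Return the group label for a three-letter residue code."""
    for label, members in SIDE_CHAIN_GROUPS.items():
        if residue in members:
            return label
    raise KeyError(f"residue {residue!r} is not in any listed group")


def same_group(wild_type, replacement):
    """True if the substitution keeps the residue within its group."""
    return group_of(wild_type) == group_of(replacement)


if __name__ == "__main__":
    print(same_group("Lys", "Arg"))  # basic for basic
    print(same_group("Ala", "Trp"))  # non-polar replaced by aromatic
```

A same-group result corresponds to the conservative case described in the text, while a cross-group result corresponds to a substitution from a different group.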
  • the another (e.g., non-wild-type) amino acid is a member of a different group (e.g., a hydrophobic amino acid for a hydrophilic amino acid, a charged amino acid for a neutral amino acid, or an acidic amino acid for a basic amino acid).
  • the another (e.g., non-wild-type) amino acid is a member of the same group (e.g., another basic amino acid, another acidic amino acid, another neutral amino acid, another charged amino acid, another hydrophilic amino acid, another hydrophobic amino acid, another polar amino acid, another aromatic amino acid, or another aliphatic amino acid).
  • the another (e.g., non-wild-type) amino acid is an unconventional amino acid.
  • Unconventional amino acids are non-naturally occurring amino acids.
  • Examples of an unconventional amino acid include, but are not limited to, aminoadipic acid, beta-alanine, beta-aminopropionic acid, aminobutyric acid, piperidinic acid, aminocaproic acid, aminoheptanoic acid, aminoisobutyric acid, aminopimelic acid, citrulline, diaminobutyric acid, desmosine, diaminopimelic acid, diaminopropionic acid, N-ethylglycine, N-ethylasparagine, hydroxylysine, allo-hydroxylysine, hydroxyproline, isodesmosine, allo-isoleucine, N-methylglycine, sarcosine, N-methylisoleucine, N-methylvaline, norvaline, norleucine, ornithine, 4-hydroxypro
  • transgene refers to a polynucleotide that is introduced into a cell and is capable of being transcribed into RNA and optionally, translated and/or expressed under appropriate conditions. In aspects, it confers a desired property to a cell into which it was introduced, or otherwise leads to a desired therapeutic or diagnostic outcome.
  • “Treating” a condition or subject refers to taking steps to obtain beneficial or desired results, including clinical results.
  • treatment refers to the reduction or amelioration of the progression, severity, and/or duration of one or more symptoms of the disease, or the amelioration of one or more symptoms resulting from the administration of one or more therapies (including, but not limited to, the administration of one or more prophylactic or therapeutic agents).
  • variant refers to a variant of (a) a promoter or (b) a gene or coding sequence (e.g., an mRNA) that encodes a protein (e.g., an endonuclease) that retains, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of at least one activity of the corresponding full-length, naturally occurring promoter or protein.
  • a variant can comprise a splice variant or a gene comprising a mutation such as an insertion, deletion, or substitution.
  • the term “vector” includes a nucleic acid vector, e.g., a DNA vector, such as a plasmid, an RNA vector, or another suitable replicon (e.g., viral vector).
  • a variety of vectors have been developed for the delivery of polynucleotides encoding exogenous polynucleotides or proteins into a prokaryotic or eukaryotic cell. Examples of such expression vectors are disclosed in, e.g., WO 1994/011026; incorporated herein by reference as it pertains to vectors suitable for the expression of a nucleic acid molecule of interest.
  • Expression vectors suitable for use with the compositions and methods described herein contain a polynucleotide sequence as well as, e.g., additional sequence elements used for the expression of heterologous nucleic acid materials (e.g., a nucleic acid molecule) in a mammalian cell.
  • Certain vectors that can be used for the expression of the nucleic acid molecules described herein include plasmids that contain regulatory sequences, such as promoter and enhancer regions, which direct gene transcription.
  • the compact bidirectional promoters do not contain an enhancer.
  • nucleic acid molecule agents disclosed herein contain polynucleotide sequences that enhance the rate of translation of these polynucleotides or improve the stability or nuclear export of the RNA that results from gene transcription. These sequence elements include, e.g., 5' and 3' untranslated regions, an IRES, and polyA in order to direct efficient transcription of the gene carried on the expression vector.
  • the expression vectors suitable for use with the compositions and methods described herein may also contain a polynucleotide encoding a marker for selection of cells that contain such a vector. Examples of a suitable marker are genes that encode resistance to antibiotics, such as ampicillin, chloramphenicol, kanamycin, nourseothricin, or zeocin.
  • a vector comprises one or more pol III promoters, one or more pol II promoters, one or more pol I promoters, or combinations thereof.
  • pol III promoters include, but are not limited to, U6 and H1 promoters.
  • pol II promoters include, but are not limited to the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) (e.g., Boshart et al.
  • a vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., CRISPR transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.).
  • Advantageous vectors include lentiviruses and AAVs, and types of such vectors can also be selected for targeting particular types of cells.
  • vector genome (vg) may refer to one or more polynucleotides comprising a set of the polynucleotide sequences of a vector, e.g., a viral vector.
  • a vector genome may be encapsidated in a viral particle.
  • a vector genome may comprise single-stranded DNA, double-stranded DNA, single-stranded RNA, or double-stranded RNA.
  • a vector genome may include endogenous sequences associated with a particular viral vector and/or any heterologous sequences inserted into a particular viral vector through recombinant techniques.
  • a rAAV vector genome may include at least one ITR sequence flanking a promoter, a stuffer, a sequence of interest (e.g., an interfering RNA (RNAi)), and a polyadenylation sequence.
  • a complete vector genome may include a complete set of the polynucleotide sequences of a vector.
  • the nucleic acid titer of a viral vector may be measured in terms of vg/mL. Methods suitable for measuring this titer are known in the art (e.g., quantitative PCR).
  • wild-type or wild type is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene, or characteristic as it occurs in nature as distinguished from mutant or variant forms.
  • expression control sequence means a nucleic acid sequence that directs transcription of a nucleic acid.
  • An expression control sequence can be a promoter, such as a constitutive promoter, or an enhancer.
  • the expression control sequence is operably linked to the nucleic acid sequence to be transcribed.
  • the disclosure is based, in part, upon the discovery that compact promoters can effectively drive expression of endonuclease systems, for example, those including both an endonuclease and a gRNA (FIG. 1), to target a hypoxia-induced gene.
  • nucleic acids, expression constructs, and vectors comprising a compact bidirectional promoter and a gene editing system (e.g., a polyribonucleotide and an endonuclease), wherein the compact promoter is small enough to allow for the inclusion of both an endonuclease and a gRNA in a single vector, such as an AAV vector, which has a size limit that makes expression of both endonuclease and gRNA difficult using conventional promoters.
  • the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene and where the polyribonucleotide directs the endonuclease to the target sequence.
  • the endonuclease can induce a break in the target DNA, which disrupts processing of an encoded mRNA transcript and expression of the encoded protein.
  • an “endonuclease system” refers collectively to transcripts and other elements, including the promoters described herein, involved in the expression of or directing the activity of a gene encoding a gene-editing endonuclease (e.g., a Cas endonuclease) and a polyribonucleotide (e.g., a gRNA) having a guide sequence (also referred to as a “spacer” in the context of certain endogenous gene editing systems, e.g., a CRISPR system).
  • AAV and other vectors make it difficult to package both a gRNA and an endonuclease into a single vector.
  • this problem can be overcome by using a compact promoter, as described herein, to incorporate a non-naturally occurring endonuclease system via a single vector.
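The packaging constraint described above is essentially a length budget. The Python sketch below shows one way to check whether a set of cassette components fits within a vector's capacity. Every number in the usage example is a hypothetical placeholder rather than a value from this disclosure (the 4,700 bp capacity reflects the commonly cited approximate AAV packaging limit), and `remaining_capacity` is an illustrative helper name.

```python
def remaining_capacity(capacity_bp, component_lengths_bp):
    """Return the base pairs left after all components are placed.

    A negative result means the construct exceeds the vector capacity.
    Both arguments are supplied by the caller; nothing here is taken
    from the specification.
    """
    if capacity_bp <= 0:
        raise ValueError("capacity must be positive")
    if any(length < 0 for length in component_lengths_bp.values()):
        raise ValueError("component lengths must be non-negative")
    return capacity_bp - sum(component_lengths_bp.values())


if __name__ == "__main__":
    # Hypothetical placeholder lengths, for illustration only.
    cassette = {
        "compact bidirectional promoter": 200,
        "endonuclease coding sequence": 3200,
        "gRNA": 100,
        "polyadenylation signal": 50,
    }
    left = remaining_capacity(4700, cassette)
    print("fits" if left >= 0 else "too large", left)
```

Swapping a compact promoter for a longer conventional one in the dictionary shows directly how much of the budget the promoter choice consumes.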
  • the single vector is a viral vector.
  • the single vector is a plasmid.
  • the promoter may be a compact, bidirectional promoter that directs expression of a gRNA in one direction and an endonuclease in the other direction.
  • the promoter is operably linked to the sequence encoding the polyribonucleotide and the sequence encoding the endonuclease.
  • a compact promoter provided herein can be selected to express the selected endonuclease system in a desired target cell.
  • the target cell is a lung endothelial cell and/or a lung artery smooth muscle cell.
  • the target cell is a lung endothelial cell.
  • the target cell is a lung artery smooth muscle cell.
  • the promoter may be derived from any species, including human.
  • the promoter is “cell specific.” The term “cell-specific” means that the particular promoter selected for the recombinant vector can direct expression of the selected transgene in a particular cell.
  • the promoter is of a small size, e.g., less than about 500 bp, due to the size limitations of the AAV vector. In certain embodiments, the promoter is less than about 500 bp, less than about 300 bp, or less than about 200 bp in size.
  • the promoter is between about 50 bp and about 400 bp, between about 75 bp and about 400 bp, between about 99 bp and about 400 bp, between about 100 bp and about 400 bp, between about 150 bp and about 400 bp, between about 200 bp and about 400 bp, between about 250 bp and about 400 bp, between about 300 bp and about 400 bp, between about 50 bp and about 300 bp, between about 75 bp and about 300 bp, between about 100 bp and about 300 bp, between about 150 bp and about 300 bp, between about 200 bp and about 300 bp, between about 50 bp and about 250 bp, between about 75 bp and about 250 bp, between about 100 bp and about 250 bp, between about 150 bp and about 250 bp, between about 200 bp and about 250 bp, between about 50 bp and about 250 bp, between about 50
  • the promoter is a bidirectional promoter. In certain embodiments, the bidirectional promoter is less than about 500 bp in size. In certain embodiments, the bidirectional promoter is less than about 300 bp in size. In certain embodiments, the bidirectional promoter is less than about 200 bp in size.
  • the bidirectional promoter is between about 50 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 75 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 99 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 200 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 250 bp and about 400 bp in size.
  • the bidirectional promoter is between about 300 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 75 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 200 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 250 bp in size.
  • the bidirectional promoter is between about 75 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 200 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 225 bp in size. In certain embodiments, the bidirectional promoter is between about 75 bp and about 200 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 200 bp in size.
  • the bidirectional promoter is between about 150 bp and about 200 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 180 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 180 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 180 bp in size.
  • the promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof.
  • the promoter comprises a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof.
  • the promoter comprises a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof.
  • a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
  • a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
  • a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
  • a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
  • a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
  • a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
  • a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
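The end truncations listed above can be illustrated with a minimal sketch. The function name `truncate`, the `end` argument, and the placeholder sequence are assumptions made for illustration only; the placeholder is not a sequence from the disclosure, and the sketch says nothing about whether a given fragment remains functional.

```python
def truncate(seq: str, n: int, end: str = "both") -> str:
    """Remove n bases from the 5' end, the 3' end, or both ends of seq."""
    if end not in {"5", "3", "both"}:
        raise ValueError("end must be '5', '3', or 'both'")
    cut5 = n if end in {"5", "both"} else 0
    cut3 = n if end in {"3", "both"} else 0
    if cut5 + cut3 >= len(seq):
        raise ValueError("truncation would remove the entire sequence")
    return seq[cut5:len(seq) - cut3]


# Hypothetical 200-base placeholder standing in for a promoter sequence.
promoter = "ACGT" * 50

# Enumerate candidate fragments for each truncation size and each end.
fragments = {
    (n, end): truncate(promoter, n, end)
    for n in (10, 20, 30, 40, 50, 60, 70)
    for end in ("5", "3", "both")
}
```

Each candidate produced this way would still need to be tested experimentally, for example in a reporter assay, before it could be called a functional fragment.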
  • the functional fragment comprises at least a transcription factor binding site. Identification of transcription factor binding sites can be determined by consensus or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
  • a functional fragment comprises at least one transcription factor binding site selected from Staf, DSE, PSE, c-REL, GATA-1, GATA-2, and CREB.
  • a functional fragment can comprise the B recognition sequence (BRE) or TATA box.
  • the promoter comprises a TATA mutation.
  • the TATA mutation is a TATAA to TCGAA mutation.
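As a minimal sketch of the TATA mutation, assuming it replaces the motif TATAA with TCGAA at its first occurrence, the substitution can be written as a simple string operation. The function name `mutate_tata` and the example sequence are hypothetical and are not taken from the disclosure.

```python
def mutate_tata(seq: str, wild_type: str = "TATAA", mutant: str = "TCGAA") -> str:
    """Replace the first occurrence of the wild-type motif with the mutant motif."""
    pos = seq.find(wild_type)
    if pos == -1:
        raise ValueError("wild-type TATA motif not found")
    return seq[:pos] + mutant + seq[pos + len(wild_type):]


# Hypothetical sequence containing a TATA box; not a sequence from the disclosure.
example = "GGCTATAAAGGC"
mutated = mutate_tata(example)
```

The sequence length is unchanged by the substitution, so downstream coordinates are preserved.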
  • a nucleic acid comprising a promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence.
  • the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof.
  • the 6 bp fragment is 5'-GCCACC-3'.
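For illustration, the 6 bp, 7 bp, and 8 bp fragments of the 9 bp sequence 5'-GCCGCCACC-3' can be enumerated as follows. This sketch assumes that "fragment" means a contiguous subsequence; the helper name `contiguous_fragments` is hypothetical.

```python
KOZAK = "GCCGCCACC"  # the 9 bp sequence recited in the text


def contiguous_fragments(seq: str, size: int) -> list[str]:
    """Return every contiguous subsequence of the given size, 5' to 3'."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


six_bp = contiguous_fragments(KOZAK, 6)
seven_bp = contiguous_fragments(KOZAK, 7)
eight_bp = contiguous_fragments(KOZAK, 8)
```

Under that assumption, the recited 6 bp fragment 5'-GCCACC-3' is the 3'-most of the four possible 6 bp fragments.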
  • a nucleic acid comprising a promoter described herein further comprises a terminator sequence.
  • the terminator sequence comprises one of the terminator sequences in TABLE 2.
  • the compact promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
  • the compact promoter does not comprise a viral promoter and/or a synthetic promoter.
  • the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring mammalian promoter.
  • the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring human promoter.
  • the expression level of a compact promoter can be determined by expressing a reporter molecule in a cell, e.g., a human embryonic kidney (HEK) cell line or an N2A cell line.
  • the compact promoter is capable of expressing a luciferase reporter at a higher level than is an HSV thymidine kinase (TK) promoter.
  • the promoter is an H1 promoter.
  • the H1 promoter is a bidirectional promoter having both pol II and pol III activity.
  • the disclosure provides previously unidentified H1 promoters that the Applicant identified by generating a Hidden Markov model (HMM) profile from a multispecies alignment of known H1 promoters (see, e.g., International Patent Publication Nos. WO2015/195621 and WO2018/009534). Regions flanking the H1 promoter region that were conserved throughout mammals were identified. As shown in FIG.
  • the region comprising the H1 promoter is located between the RPPH1 (H1 RNA) gene located on the minus strand to the left, and the beginning (i.e., the ATG(GCG)) of the protein coding gene, PARP2, located to the right.
  • the RPPH1 gene comprises a highly conserved region in the H1 RNA gene (5'-GGAAGCTCA-3') that is conserved throughout all mammals.
  • the H1 promoter comprises or consists of a region between the ATG(GCG) of PARP2, and the highly conserved region in the H1 RNA gene (5'-GGAAGCTCA-3').
  • FIG. 1 is the position of the pol III portion of the H1 promoter. Additional conserved regions present in the H1 promoter are shown, including, for example, conserved transcription factor binding sites, like a TATA box.
  • the H1 promoter is a mammalian promoter, e.g., an artiodactyla H1 promoter, a carnivora H1 promoter, a cetacea H1 promoter, a chiroptera H1 promoter, an insectivora H1 promoter, a lagomorpha H1 promoter, a marsupial H1 promoter, a pangolin H1 promoter, a perissodactyla H1 promoter, a primate H1 promoter, a rodent H1 promoter, or a xenarthra H1 promoter.
  • the promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof.
  • the promoter comprises a nucleotide sequence having at least 85% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 90% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 95% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof.
  • the promoter comprises a nucleotide sequence having at least 96% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 97% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 98% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 99% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof.
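The percent-identity thresholds above presuppose some way of aligning two sequences and counting matches. The text does not specify one here, so the following is only a minimal sketch that assumes a Needleman-Wunsch global alignment with hypothetical scoring values (match +1, mismatch -1, gap -1) and defines identity as matching columns divided by alignment length. Other definitions of identity exist, and the example sequences are placeholders.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity from one optimal Needleman-Wunsch global alignment."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting matches and alignment columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = (match if a[i - 1] == b[j - 1] else mismatch) if i > 0 and j > 0 else None
        if sub is not None and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * matches / columns


# Hypothetical placeholder sequences differing at one of ten positions.
identity = global_identity("ACGTACGTAC", "ACGTTCGTAC")
meets_85 = identity >= 85.0
```

Because the result depends on the scoring scheme and on how gaps are counted, any reported identity value should state the alignment method used.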
  • a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • a functional fragment comprises a truncation of about 15 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • a functional fragment comprises a truncation of about 25 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • a functional fragment comprises a truncation of about 35 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
  • the functional fragment comprises at least a transcription factor binding site. Identification of transcription factor binding sites can be determined by consensus, or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
  • the promoter comprises a TATA mutation.
  • the TATA mutation is a TATAA to TCGAA mutation.
  • a nucleic acid comprising a promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence.
  • the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof.
  • the 6 bp fragment is 5'-GCCACC-3'.
  • a nucleic acid comprising a promoter described herein further comprises a terminator sequence.
  • the terminator sequence comprises one of the terminator sequences in TABLE 2.
  • the compact promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
  • the compact promoter does not comprise a viral promoter and/or a synthetic promoter.
  • the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring mammalian promoter.
  • the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring human promoter.
  • the expression level of a compact promoter can be determined by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line.
  • the compact promoter is capable of expressing a luciferase reporter at a higher level than is an HSV TK promoter.
  • a custom perl script was developed to compare the 5' transcriptional start sites of pol III genes with that of pol II genes. The results were filtered for those that are orientated in opposite directions (divergent transcription).
  • One compact bidirectional promoter identified using this method was the Gar1 promoter.
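The screening step described above (comparing the 5' transcriptional start sites of pol III genes with those of pol II genes and keeping divergently oriented pairs) was performed with a custom perl script that is not reproduced in the text. The following Python sketch is therefore not that script. It only illustrates the filtering idea under stated assumptions: the `Gene` record, the `max_gap` cutoff of 500 bases, and all gene names and coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # 5' transcriptional start site coordinate


def divergent_pairs(pol3_genes, pol2_genes, max_gap=500):
    """Return (pol III gene, pol II gene, gap) tuples for divergently transcribed pairs.

    A pair is divergent when the genes lie on opposite strands and the
    minus-strand TSS is upstream (lower coordinate) of the plus-strand TSS,
    so the two genes are transcribed away from each other.
    """
    pairs = []
    for p3 in pol3_genes:
        for p2 in pol2_genes:
            if p3.chrom != p2.chrom or p3.strand == p2.strand:
                continue
            minus, plus = (p3, p2) if p3.strand == "-" else (p2, p3)
            gap = plus.tss - minus.tss
            if 0 < gap <= max_gap:
                pairs.append((p3.name, p2.name, gap))
    return sorted(pairs, key=lambda pair: pair[2])


# Hypothetical annotations; names and coordinates are placeholders.
pol3 = [Gene("pol3_A", "chr1", "-", 1000), Gene("pol3_B", "chr2", "+", 5000)]
pol2 = [
    Gene("pol2_A", "chr1", "+", 1230),  # divergent with pol3_A, 230 bases apart
    Gene("pol2_B", "chr2", "+", 5100),  # same strand as pol3_B, excluded
    Gene("pol2_C", "chr1", "+", 900),   # convergent with pol3_A, excluded
]
candidates = divergent_pairs(pol3, pol2)
```

Sorting by gap puts the most compact candidate intergenic regions first, which matches the stated goal of finding compact bidirectional promoters.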
  • the Gar1 promoter expresses the GAR1 protein, which is involved with snoRNAs, rRNA processing, and telomerase activity.
  • the GAR1 protein appears to be expressed in all tissues, suggesting that the Gar1 promoter can drive expression ubiquitously (https://www.proteinatlas.org/ENSG00000109534-GAR1/tissue).
  • it expresses an lncRNA (AC126283.1 or ENSG00000272795) with unknown function and high expression in the testis.
  • the promoter is a Gar1 promoter.
  • the Gar1 promoter is a mammalian promoter, e.g., a human Gar1 promoter, a carnivora Gar1 promoter, a primate Gar1 promoter, or a rodent Gar1 promoter.
  • the Gar1 promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof.
  • the promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 83-189 or a fragment thereof.
  • the Gar1 promoter comprises a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof.
  • the Gar1 promoter comprises a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof.
  • the Gar1 promoter comprises a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof.
  • the Gar1 promoter comprises a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof.
  • the Gar1 promoter comprises a consensus sequence.
  • the Gar1 promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof, where the IUPAC nucleotide code is used.
  • the promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 711-715 or a fragment thereof.
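Because the consensus sequences are written in the IUPAC nucleotide code, comparing a concrete sequence against them requires expanding the ambiguity symbols. The following is a minimal sketch of that expansion using a regular expression. The helper names are hypothetical, and the motif `TATAWAW` is used only as a commonly cited illustrative TATA-box consensus, not as a sequence from the disclosure.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_consensus(consensus: str, seq: str) -> list[int]:
    """Return 0-based start positions where seq matches the IUPAC consensus.

    A zero-width lookahead is used so that overlapping matches are reported.
    """
    body = "".join(IUPAC[symbol] for symbol in consensus.upper())
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical example sequence containing one match to the illustrative motif.
hits = find_consensus("TATAWAW", "GGTATAAATGG")
```

This checks exact agreement with the consensus only. Scoring partial agreement, such as the percent-identity thresholds recited above, would need an alignment step in addition.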
  • the Gar1 promoter comprises a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof.
  • the Gar1 promoter comprises a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof.
  • a functional fragment comprises a truncation of from about 10 bases to about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
  • a functional fragment comprises a truncation of from about 10 bases to about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
  • the functional fragment comprises at least a transcription factor binding site. Identification of transcription factor binding sites can be determined by consensus, or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
  • the Gar1 promoter comprises a TATA mutation.
  • the TATA mutation is a TATAA to TCGAA mutation.
  • a nucleic acid comprising a Gar1 promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence.
  • the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof.
  • the 6 bp fragment is 5'-GCCACC-3'.
  • a nucleic acid comprising a Gar1 promoter described herein further comprises a terminator sequence.
  • the terminator sequence comprises one of the terminator sequences in TABLE 2.
  • the Gar1 promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
  • the Gar1 promoter does not comprise a viral promoter and/or a synthetic promoter.
  • the Gar1 promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring mammalian promoter.
  • the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring human promoter.
  • the expression level of a Gar1 promoter can be determined by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line.
  • the compact promoter is capable of expressing a luciferase reporter at a higher level than is an HSV TK promoter.
  • the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof.
  • the bidirectional promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 190-226 or a fragment thereof.
  • the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof.
  • the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof.
  • a functional fragment comprises a truncation of from about 10 bases to about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
  • the functional fragment comprises at least a transcription factor binding site. Identification of transcription factor binding sites can be determined by consensus, or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
  • the promoter comprises a TATA mutation.
  • the TATA mutation is a TATAA to TCGAA mutation.
  • the promoter is not an H1 promoter. In certain embodiments, the promoter is not one or more of the H1 promoters set forth in SEQ ID NOs: 1-82 and 242-521. In certain embodiments, the promoter is not one or more of the H1 promoters set forth in SEQ ID NOs: 1-82. In certain embodiments, the promoter is not one or more of the H1 promoters set forth in SEQ ID NOs: 242-521.
  • the promoter is not one or more of an SRP-RPS29 promoter (SEQ ID NO: 219 or SEQ ID NO: 227), a mouse 7sk promoter (SEQ ID NO: 190), a 7sk1 promoter (SEQ ID NO: 228), a 7sk2 promoter (SEQ ID NO: 229), a 7sk3 promoter (SEQ ID NO: 230), an RMRP-CCDC107 promoter (SEQ ID NO: 207 or SEQ ID NO: 231), an ALOXE3 promoter (SEQ ID NO: 232), a CGB1 promoter (SEQ ID NO: 233), a CGB2 promoter (SEQ ID NO: 234), a Med16-1 promoter (SEQ ID NO: 235), a Med16-2 promoter (SEQ ID NO: 236), a DPP9-1 promoter (SEQ ID NO: 237), a DPP9-2 promoter (SEQ ID NO: 238), a DPP
  • a nucleic acid comprising a bidirectional promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence.
  • the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof.
  • the 6 bp fragment is 5'-GCCACC-3'.
  • a nucleic acid comprising a bidirectional promoter described herein further comprises a terminator sequence.
  • the terminator sequence comprises one of the terminator sequences in TABLE 2.
  • the bidirectional promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
  • the bidirectional promoter does not comprise a viral promoter and/or a synthetic promoter.
  • the compact promoter does not comprise a F5tg83 promoter.
  • the bidirectional promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring mammalian promoter.
  • the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring human promoter.
  • the expression level of a bidirectional promoter can be determined by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line.
  • the compact promoter is capable of expressing a luciferase reporter at a higher level than is an HSV TK promoter.
  • polyribonucleotide refers to polynucleotide polymers containing 50% or more ribose bases including unmodified and/or modified ribonucleotides.
  • a “guide RNA” (“gRNA”) is a type of polyribonucleotide that includes a CRISPR RNA sequence (crRNA, also referred to as a “guide sequence” or “spacer”), and, in certain embodiments, a trans-activating CRISPR RNA sequence (tracrRNA).
  • the tracrRNA if present, binds to an endonuclease (e.g., a CRISPR enzyme such as Cas9) and the crRNA is complementary to a target sequence.
  • target sequence refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a gene editing endonuclease complex (e.g., a CRISPR complex). Full complementarity is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a gene editing endonuclease complex (e.g., a CRISPR complex).
  • a target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides.
  • a target sequence is located in the nucleus or cytoplasm of a cell.
  • the target sequence may be within an organelle of a eukaryotic cell, for example, mitochondrion or chloroplast.
  • a sequence or template that may be used for recombination into the targeted locus comprising the target sequences is referred to as an “editing template,” “editing polynucleotide,” or “editing sequence.”
  • an exogenous template polynucleotide may be referred to as an editing template.
  • the recombination is homologous recombination.
  • a guide sequence is any polyribonucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence.
  • the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about 50% or more, about 60% or more, about 75% or more, about 80% or more, about 85% or more, about 90% or more, about 95% or more, about 97.5% or more, or about 99% or more.
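As a rough illustration of the degree of complementarity, the sketch below counts Watson-Crick pairs between an RNA guide and a DNA target of equal length in antiparallel orientation. It assumes an ungapped comparison and ignores G-U wobble pairing, which is a simplification of the "optimally aligned" comparison described above. The function name and both sequences are hypothetical.

```python
# Watson-Crick pairs written as (RNA guide base, DNA target base).
WATSON_CRICK = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percent of guide positions that pair with the target, read antiparallel."""
    if not guide_rna or len(guide_rna) != len(target_dna):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (g, t) in WATSON_CRICK
        for g, t in zip(guide_rna.upper(), reversed(target_dna.upper()))
    )
    return 100.0 * paired / len(guide_rna)


# Hypothetical 10-nucleotide example; real guide sequences are typically longer.
target = "AAACCCGGGT"
perfect = percent_complementarity("ACCCGGGUUU", target)
one_mismatch = percent_complementarity("ACCCGGGUUA", target)
```

A gapped comparison would first require one of the alignment algorithms named elsewhere in the text.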
  • the polyribonucleotide comprises a nucleic acid sequence that is the reverse complement of any one of SEQ ID NOs: 538-710 and/or a nucleic acid that binds to any one of SEQ ID NOs: 538-710.
  • the target sequence is selected from SEQ ID NOs: 538-710.
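The reverse-complement relationship and the complementarity thresholds described above can be sketched in Python. This is a minimal illustration, not the disclosed method: it assumes an ungapped, equal-length comparison rather than an optimal alignment, the function names are hypothetical, and the 20-nt target is a made-up placeholder rather than any of SEQ ID NOs: 538-710.

```python
# Hypothetical helper names; the ungapped comparison is a simplifying assumption.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percent of guide positions that can base-pair with the target.

    The guide (5'->3') is compared position by position with the reverse
    complement of the target (5'->3'); U in the guide is treated as T.
    """
    guide = guide_rna.upper().replace("U", "T")
    expected = reverse_complement(target_dna)
    if len(guide) != len(expected):
        raise ValueError("ungapped comparison needs equal-length sequences")
    matches = sum(g == e for g, e in zip(guide, expected))
    return 100.0 * matches / len(guide)


# Placeholder target, not taken from the sequence listing.
target = "ACGTACGTTGCAAGCTTGCA"
# A fully complementary RNA guide is the reverse complement with T -> U.
perfect_guide = reverse_complement(target).replace("T", "U")
# Same guide with its first base (U) swapped for A: one mismatch in 20.
one_mismatch_guide = "A" + perfect_guide[1:]

print(percent_complementarity(perfect_guide, target))
print(percent_complementarity(one_mismatch_guide, target))
```

Under these assumptions a single mismatch in a 20-nt guide still satisfies the "about 95% or more" threshold, since 19 of 20 positions pair.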
  • a portion of the polyribonucleotide hybridizes to a target sequence of a hypoxia-induced gene.
  • the polyribonucleotide directs the endonuclease to the target sequence of the hypoxia-induced gene.
  • the hypoxia-induced gene includes hypoxia-inducible factor-1-alpha (HIF1A) or hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1).
  • upon hybridization with the target sequence, the endonuclease can induce a break in the hypoxia-induced gene (e.g., HIF1A, HIF2A, BMPR2, or ALK1), which disrupts processing of the encoded mRNA transcript and expression of the encoded protein.
  • Optimal alignment of the polyribonucleotide to the target sequence may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
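Of the alignment algorithms named above, Smith-Waterman is compact enough to sketch directly. The following Python example computes only the best local alignment score, with no traceback. The scoring values (match +2, mismatch -1, linear gap -1) and the function name are assumptions chosen for illustration; the disclosure does not specify scoring parameters.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Only two rows of the dynamic-programming matrix are kept, because the
    score of a cell depends on the previous row and the current row only.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Diagonal move: align a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Made-up sequences sharing the local motif "ACGT".
print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

In practice a maintained aligner would normally be preferred over a hand-written routine; the sketch is only meant to show what "optimally aligned" means for a local alignment.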
  • a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In certain embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, or 12, or fewer nucleotides in length. The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay.
  • the components of a CRISPR system sufficient to form a CRISPR complex may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by a Surveyor assay.
  • cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions.
  • a guide sequence may be selected to target any target sequence.
  • the target sequence is a sequence within a genome of a cell.
  • Exemplary target sequences include those that are unique in the target genome.
  • the target sequence is selected from SEQ ID NOs: 538-710.
  • the invention provides an endonuclease system having a nucleotide sequence encoding an endonuclease.
  • the endonuclease can be any endonuclease that is capable of cleaving DNA to effect a single or double strand break at the intended locus.
  • Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), StCas9, NmCas9, GeoCas9, Cas10, Cas12, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, Cas12i, Cas14a, Cas14b, Cas14c, Cas , Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3,
  • the CRISPR enzyme has DNA cleavage activity, such as Cas9.
  • the CRISPR enzyme is Cas9, and may be Cas9 from S. pyogenes or S. pneumoniae.
  • the endonuclease is a MAD1, MAD2, MAD3, MAD4, MAD5, MAD6, MAD7, MAD8, MAD9, MAD10, MAD11, or MAD1 1 endonuclease (see, e.g., U.S. Patent No. 9,982,279).
  • the DNA endonuclease can be a Cpf1 endonuclease, a homolog thereof, a recombinant of the naturally-occurring molecule thereof, a codon-optimized version thereof, a modified version thereof (e.g., a mutated variant such as a Nickase), and combinations of any of the foregoing.
  • the DNA endonuclease is a Cas9 or Cpf1 endonuclease that effects a single-strand break or double-strand break at a locus within or near a target sequence.
  • the DNA endonuclease is a Cas9 endonuclease.
  • the Cas9 endonuclease is a recombinant Cas9, a codon-optimized Cas9, or a modified or mutated Cas9.
  • the Cas9 endonuclease can be derived from a variety of bacterial species.
  • the Cas9 endonuclease is derived from Streptococcus thermophilus, Streptococcus pyogenes, Neisseria meningitidis, Staphylococcus aureus, or Treponema denticola.
  • the Cas9 endonuclease is derived from Staphylococcus aureus (SaCas9). In certain embodiments, the Cas9 endonuclease is derived from Streptococcus pyogenes (SpCas9). Wild-type Cas9 has two active sites, RuvC and HNH nuclease domains, for cleaving DNA, one for each strand of the double helix.
  • the Cas9 endonuclease is a mutated SpCas9 endonuclease (e.g., a Nickase) and/or a codon-optimized version thereof.
  • the DNA endonuclease is a Cpf1 endonuclease (e.g., a recombinant Cpf1, a codon-optimized Cpf1, or a modified or mutated Cpf1).
  • the Cpf1 endonuclease can be derived from a variety of bacterial species.
  • the Cpf1 endonuclease is derived from Acidaminococcus bacteria or Lachnospiraceae bacteria.
  • the Cpf1 endonuclease is a Lachnospiraceae bacterium ND2006 Cpf1.
  • the DNA endonuclease is a MAD7 endonuclease (e.g., a recombinant MAD7, a codon-optimized MAD7, or a modified or mutated MAD7).
  • MAD7 is a codon-optimized endonuclease that can be derived from Eubacterium rectale (Inscripta, Boulder, CO). MAD7 is described in U.S. Patent No. 9,982,279.
  • an RNA-targeting endonuclease is used.
  • RNA-targeting endonucleases include Cas13a, Cas13b, and Cas13d.
  • the endonuclease (e.g., a CRISPR enzyme) directs cleavage of one or both strands at the location of a target sequence, such as within the target sequence and/or within the complement of the target sequence. In certain embodiments, the endonuclease directs cleavage of one or both strands within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, or 500, or more base pairs from the first or last nucleotide of a target sequence.
  • a vector encodes an endonuclease that is mutated with respect to a corresponding wild-type enzyme such that the mutated endonuclease lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence.
  • an endonuclease system comprises a nuclease-dead version of an endonuclease (e.g., Cas9 (dCas9)) (Qi et al. (2013) CELL 152, 1173-1183; Gilbert et al. (2013) CELL 154, 442-451; Larson et al.
  • the nuclease-dead endonuclease stays bound tightly to a target sequence.
  • inhibition of pol II progression through a steric hindrance mechanism can lead to efficient transcriptional repression.
  • use of a nuclease-dead nuclease can achieve therapeutic repression of a target gene without inducing a break in the target nucleotide sequence.
  • CRISPR system refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus.
  • one or more elements of a CRISPR system is derived from a type I, type II, or type III CRISPR system.
  • one or more elements of a CRISPR system is derived from a particular organism comprising an endogenous CRISPR system, such as Streptococcus pyogenes.
  • a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system).
  • an enzyme coding sequence encoding an endonuclease is codon optimized for expression in particular cells, such as eukaryotic cells.
  • the eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate.
  • codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, or 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence.
  • codon bias refers to differences in codon usage between organisms.
  • mRNA messenger RNA
  • tRNA transfer RNA
  • genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database,” and these tables can be adapted in a number of ways. See Nakamura et al. (2000) Nucl. Acids Res. 28:292. Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, Pa.).
  • one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, or 50, or more, or all codons) in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
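The codon optimization described above, in which native codons are replaced by the host's most frequently used synonymous codons while the amino acid sequence is preserved, can be sketched as follows. This is a toy Python illustration under stated assumptions: the codon table covers only the amino acids in the example, the usage values in `TOY_USAGE` are placeholders for a hypothetical host rather than real codon usage data, and all function names are hypothetical.

```python
# Standard genetic code, restricted to the codons needed for the toy example.
CODON_TO_AA = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Placeholder relative usage values for a hypothetical host; NOT real data.
TOY_USAGE = {
    "ATG": 1.0, "AAA": 0.4, "AAG": 0.6, "GAA": 0.4, "GAG": 0.6,
    "CTT": 0.1, "CTC": 0.2, "CTA": 0.1, "CTG": 0.4, "TTA": 0.1, "TTG": 0.1,
    "TAA": 0.3, "TAG": 0.2, "TGA": 0.5,
}


def split_codons(cds: str) -> list:
    """Split a coding sequence into consecutive 3-nt codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence using the restricted table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds))


def preferred_codons(usage: dict) -> dict:
    """Map each amino acid to its most frequently used codon."""
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TO_AA[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return best


def codon_optimize(cds: str, usage: dict) -> str:
    """Replace every codon with the preferred synonymous codon."""
    best = preferred_codons(usage)
    return "".join(best[CODON_TO_AA[codon]] for codon in split_codons(cds))


native = "ATGAAACTTGAATAA"  # encodes M-K-L-E-stop
optimized = codon_optimize(native, TOY_USAGE)
print(optimized, translate(optimized) == translate(native))
```

This sketch replaces all codons; replacing only a subset, as the text also contemplates, would amount to applying the same substitution at selected positions.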
  • the CRISPR enzyme is part of a fusion protein comprising one or more heterologous protein domains (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or more domains in addition to the CRISPR enzyme).
  • a CRISPR enzyme fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains.
  • epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags.
  • reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), enhanced GFP (eGFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP).
  • a CRISPR enzyme may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4A DNA binding domain fusions, and herpes simplex virus (HSV) BP16 protein fusions. Additional domains that may form part of a fusion protein comprising a CRISPR enzyme are described in U.S. Patent Publication No.
  • a tagged CRISPR enzyme is used to identify the location of a target sequence.
  • a reporter gene which includes but is not limited to GST, HRP, CAT, beta-galactosidase, beta-glucuronidase, luciferase, GFP, eGFP, HcRed, DsRed, CFP, YFP, and autofluorescent proteins including BFP, may be introduced into a cell to encode a gene product which serves as a marker by which to measure the alteration or modification of expression of the gene product.
  • the DNA molecule encoding the gene product may be introduced into the cell via a vector.
  • the gene product is luciferase.
  • the expression of the gene product is decreased.
  • Vectors can be designed for expression of CRISPR transcripts (e.g., nucleic acid transcripts, proteins, or enzymes) in prokaryotic or eukaryotic cells.
  • CRISPR transcripts can be expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression vectors), yeast cells, or mammalian cells. Suitable host cells are discussed further in Goeddel (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif.
  • a recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
  • a vector comprises one or more insertion sites, such as a restriction endonuclease recognition sequence (also referred to as a “cloning site”).
  • one or more insertion sites e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or more insertion sites
  • a single expression construct may be used to target endonuclease activity to multiple different, corresponding target sequences within a cell.
  • a single vector may comprise about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20, or more guide sequences.
  • a vector comprises a regulatory element operably linked to an enzyme-coding sequence encoding an endonuclease, such as a CRISPR enzyme (e.g., a Cas protein).
  • Vectors may be introduced and propagated in a prokaryote.
  • a prokaryote is used to amplify copies of a vector to be introduced into a eukaryotic cell or as an intermediate vector in the production of a vector to be introduced into a eukaryotic cell (e.g., amplifying a plasmid as part of a viral vector packaging system).
  • a prokaryote is used to amplify copies of a vector and express one or more nucleic acids, such as to provide a source of one or more proteins for delivery to a host cell or host organism. Expression of proteins in prokaryotes is most often carried out in Escherichia coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins.
  • Fusion vectors add a number of amino acids to a protein encoded therein, such as to the amino terminus of the recombinant protein.
  • Such fusion vectors may serve one or more purposes, such as: (i) to increase expression of recombinant protein; (ii) to increase the solubility of the recombinant protein; and (iii) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification.
  • a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.
  • Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin, and enterokinase.
  • Example fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway N.J.) that fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.
  • examples of E. coli expression vectors include pTrc (Amrann et al. (1988) GENE 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif.).
  • a vector is a yeast expression vector.
  • yeast expression vectors for expression in yeast Saccharomyces cerevisiae include pYepSecl (Baldari, et al. (1987) EMBO J. 6:229-234), pMFa (Kuijan and Herskowitz (1982) CELL 30: 933-943), pJRY88 (Schultz et al. (1987) Gene 54: 113-123), pYES2 (Invitrogen Corporation, San Diego, Calif.), and picZ (Invitrogen Corp, San Diego, Calif.).
  • a vector is capable of driving expression of one or more sequences in mammalian cells using a mammalian expression vector.
  • mammalian expression vectors include pCDM8 (Seed (1987) NATURE 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6: 187-195).
  • the expression vector’s control functions are typically provided by one or more regulatory elements.
  • commonly used promoters are derived from polyoma, adenovirus 2, cytomegalovirus, simian virus 40, and others disclosed herein and known in the art.
  • the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid).
  • tissue-specific regulatory elements are known in the art.
  • suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) GENES DEV. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) ADV. IMMUNOL. 43:235-275), promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J.
  • promoters are also encompassed, e.g., the murine hox promoters (Kessel and Gruss (1990) SCIENCE 249:374-379) and the alpha-fetoprotein promoter (Campes and Tilghman (1989) GENES DEV. 3:537-546).
  • a regulatory element is operably linked to one or more elements of a CRISPR system so as to drive expression of the one or more elements of the CRISPR system.
  • CRISPRs Clustered Regularly Interspaced Short Palindromic Repeats
  • SPIDRs SPacer Interspersed Direct Repeats
  • the CRISPR locus comprises a distinct class of interspersed short sequence repeats (SSRs) that were recognized in E. coli (Ishino et al. (1987) J. BACTERIOL., 169:5429-5433; and Nakata et al. (1989) J.
  • the CRISPR loci typically differ from other SSRs by the structure of the repeats, which have been termed short regularly spaced repeats (SRSRs) (Janssen et al. (2002) OMICS J. INTEG. BIOL., 6:23-33; and Mojica et al. (2000) MOL. MICROBIOL., 36:244-246).
  • the repeats are short elements that occur in clusters that are regularly spaced by unique intervening sequences with a substantially constant length (Mojica et al. (2000) MOL. MICROBIOL., 36:244-246).
  • CRISPR loci have been identified in more than 40 prokaryotes (e.g., Jansen et al. (2002) MOL. MICROBIOL., 43:1565-1575; and Mojica et al. (2005) J. MOL. EVOL. 60:174-82) including, but not limited to, Aeropyrum, Pyrobaculum, Sulfolobus, Archaeoglobus, Haloarcula, Methanobacterium, Methanococcus, Methanosarcina, Methanopyrus, Pyrococcus, Picrophilus, Thermoplasma, Corynebacterium, Mycobacterium, Streptomyces, Aquifex, Porphyromonas, Chlorobium, Thermus, Bacillus, Listeria, Staphylococcus, Clostridium, Thermoanaerobacter, Mycoplasma, Fusobacterium, Azoarcus, Chromobacterium, Neisseria, Nitrosomonas, Desulfovibrio, Geobacter, Myxococcus, Campylobacter, Wolinella, Acinetobacter, Erwinia, Escherichia, Legionella, Methylococcus, Pasteurella, Photobacterium, Salmonella, X
  • the AAV vector includes a nucleotide sequence having a length of from 3 kbp to 6 kbp (e.g., 3.1 kbp, 3.2 kbp, 3.3 kbp, 3.4 kbp, 3.5 kbp, 3.6 kbp, 3.7 kbp,
  • the disclosure provides rAAV vectors comprising an endonuclease system under the control of a suitable promoter (e.g., a compact bidirectional promoter) to direct the expression of the gRNA and endonuclease.
  • the disclosure further provides a therapeutic composition comprising an rAAV vector comprising an endonuclease system under the control of a suitable promoter (e.g., a compact bidirectional promoter).
  • a variety of rAAV vectors may be used to deliver the desired complement system gene to the appropriate cells and/or tissues and to direct its expression. More than 30 naturally-occurring serotypes of AAV from humans and non-human primates are known. Many natural variants of the AAV capsid exist, and an rAAV vector of the disclosure may be designed based on an AAV with properties specifically suited for expression in the cells and/or tissues relevant for the endonuclease system to be expressed
  • an rAAV vector is comprised of, in order, a 5' AAV ITR, a transgene or gene of interest encoding an endonuclease system operably linked to a sequence which regulates its expression in a target cell, and a 3' AAV ITR.
  • the rAAV vector may have a polyadenylation sequence.
  • rAAV vectors have one copy of the AAV ITR at each end of the transgene or gene of interest, in order to allow replication, packaging, and efficient integration into cell chromosomes.
  • the transgene sequence encoding a complement system polypeptide (or a functional fragment or variant thereof) or a biologically active fragment thereof will be of about 2 kb to 5 kb in length (or alternatively, the transgene may additionally contain a “stuffer” or “filler” sequence to bring the total size of the nucleic acid sequence between the two ITRs to between about 2 kb and 5 kb).
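The "stuffer" arithmetic described above can be illustrated with a short Python sketch. It assumes, for simplicity, that "about 2 kb" and "about 5 kb" are treated as hard bounds of 2,000 bp and 5,000 bp on the sequence between the two ITRs; the function name and the example lengths are hypothetical.

```python
def stuffer_length(transgene_bp: int, min_total_bp: int = 2000,
                   max_total_bp: int = 5000) -> int:
    """Minimum filler length (bp) needed to reach the lower size bound.

    Returns 0 when the transgene already falls within the window and raises
    if the transgene alone exceeds the upper bound, since a stuffer can only
    add length.
    """
    if transgene_bp < 0:
        raise ValueError("length must be non-negative")
    if transgene_bp > max_total_bp:
        raise ValueError("transgene exceeds the upper size bound")
    return max(0, min_total_bp - transgene_bp)


# Made-up cassette lengths for illustration.
print(stuffer_length(1500))  # short cassette: filler needed
print(stuffer_length(3200))  # already within the window
```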
  • Recombinant AAV vectors of the present disclosure may be generated from a variety of AAVs.
  • ITRs from any AAV serotype are expected to have similar structures and functions with regard to replication, integration, excision, and transcriptional mechanisms.
  • AAV serotypes include AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, and AAV12.
  • the AAV vector is generated from serotype AAV1, AAV2, AAV4, AAV5, or AAV8.
  • the AAV vector includes a targeting peptide that confers tropism to lung vascular cells (e.g., lung endothelial cells and/or lung artery smooth muscle cells).
  • the AAV vector includes the targeting peptide ESGHGYF (SEQ ID NO: 533) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the AAV vector includes the targeting peptide GHGYF (SEQ ID NO: 534) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the AAV vector includes the targeting peptide CGFECVRQCPER (SEQ ID NO: 535) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the AAV vector includes the targeting peptide CGSPGWVRC (SEQ ID NO: 536) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the AAV vector includes the targeting peptide CARSKNKDC (SEQ ID NO: 537) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the targeting peptide includes from about 4 to about 25 amino acids, for example, about 4 to about 10 amino acids, about 4 to about 15 amino acids, about 4 to about 20 amino acids, about 5 to about 10 amino acids, about 5 to about 15 amino acids, or about 5 to about 20 amino acids.
  • the targeting peptide includes an amino acid sequence of any one of SEQ ID NOs: 533-537 or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
  • the AAV comprises a targeting peptide that allows it to target a particular cell and/or tissue type.
  • the targeting peptide comprises or consists of ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537), or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto, which confers tropism to lung vascular cells.
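The percent-identity thresholds recited for the targeting peptides can be made concrete with a small Python sketch. It rests on several assumptions: identity is computed over an ungapped, equal-length comparison, "at least about X%" is read as a simple greater-than-or-equal test, and the single-substitution variant shown is hypothetical rather than a sequence from the disclosure.

```python
# Thresholds recited in the text, as percentages.
THRESHOLDS = (80, 85, 90, 95, 98, 99, 99.5)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical residues in an ungapped, equal-length comparison."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def thresholds_met(identity: float) -> list:
    """Return the recited thresholds that a given identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


reference = "CARSKNKDC"   # SEQ ID NO: 537 as given in the text
variant = "CARSKNKEC"     # hypothetical D->E substitution, for illustration

identity = percent_identity(reference, variant)
print(round(identity, 1), thresholds_met(identity))
```

For a 9-residue peptide a single substitution leaves 8 of 9 residues identical, so under these assumptions only the 80% and 85% thresholds are met; for the shorter peptides, even one substitution removes a larger share of identity.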
  • the rAAV vector is generated from serotype AAV2.
  • the rAAV vector is an AAV-L1 vector, which is a modified AAV2 serotype that comprises a peptide insertion (ESGHGYF (SEQ ID NO: 533)) at position R588 of the VP1 protein, thereby allowing it to target pulmonary vascular cells (e.g., endothelial and/or smooth muscle cells).
  • the AAV serotypes include AAVrh8, AAVrh8R, or AAVrh10. It will also be understood that rAAV vectors of the disclosure may be chimeras of two or more serotypes selected from serotypes AAV1-AAV12, AAV-DJ, AAV-DJ8, AAV-DJ9, or other modified serotypes. The tropism of the vector may be altered by packaging the recombinant genome of one serotype into capsids derived from another AAV serotype.
  • the ITRs of the rAAV virus may be based on the ITRs of, for example, any one of AAV1-12 and may be combined with an AAV capsid selected from any one of AAV1-12, AAV-DJ, AAV-DJ8, AAV-DJ9, or other modified serotypes.
  • any AAV capsid serotype may be used with the vectors of the disclosure.
  • AAV serotypes include AAV1, AAV2, AAV-L1, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV-DJ, AAV-DJ8, AAV-DJ9, AAVrh8, AAVrh8R, and AAVrh10.
  • the AAV capsid serotype is AAV2.
  • Desirable AAV fragments for assembly into vectors may include the Cap proteins, including the VP1, VP2, VP3 and hypervariable regions, the Rep proteins, including Rep78, Rep68, Rep52, and Rep40, and the sequences encoding these proteins. These fragments may be readily utilized in a variety of vector systems and host cells. Such fragments may be used alone, in combination with other AAV serotype sequences or fragments, or in combination with elements from other AAV or non-AAV viral sequences.
  • artificial AAV serotypes include, without limitation, AAV with a non-naturally occurring capsid protein.
  • Such an artificial capsid may be generated by any suitable technique using a selected AAV sequence (e.g., a fragment of a vp1 capsid protein) in combination with heterologous sequences which may be obtained from a different selected AAV serotype, non-contiguous portions of the same AAV serotype, non-AAV viral source, or non-viral source.
  • An artificial AAV serotype may be, without limitation, a pseudotyped AAV, a chimeric AAV capsid, a recombinant AAV capsid, or a “humanized” AAV capsid.
  • any of the vectors disclosed herein include a spacer, i.e., a DNA sequence interposed between the promoter and the Rep gene ATG start site.
  • the spacer may be a random sequence of nucleotides, or alternatively, it may encode a gene product, such as a marker gene.
  • the spacer may contain genes which typically incorporate start/stop and poly A sites.
  • the spacer may be a non-coding DNA sequence from a prokaryote or eukaryote, a repetitive non-coding sequence, a coding sequence without transcriptional controls, or a coding sequence with transcriptional controls.
  • the spacer is a phage ladder sequence or a yeast ladder sequence. In certain embodiments, the spacer is of a size sufficient to reduce expression of the Rep78 and Rep68 gene products, leaving the Rep52, Rep40 and Cap gene products expressed at normal levels. In certain embodiments, the length of the spacer may therefore range from about 10 bp to about 6 kbp, such as in the range of about 100 bp to about 6 kbp. In certain embodiments, the spacer is less than 2 kbp in length.
  • Numerous methods are known in the art for production of rAAV vectors, including transfection, stable cell line production, and infectious hybrid virus production systems which include adenovirus-AAV hybrids, herpesvirus-AAV hybrids (Conway, JE et al, (1997). Virology 71(11):8780-8789), and baculovirus-AAV hybrids.
  • rAAV production cultures for the production of rAAV virus particles all require: 1) suitable host cells, including, for example, human-derived cell lines such as HeLa, A549, or 293 cells, or insect-derived cell lines such as SF-9, in the case of baculovirus production systems; 2) suitable helper virus function, provided by wild-type or mutant adenovirus (such as temperature-sensitive adenovirus), herpes virus, baculovirus, or a plasmid construct providing helper functions; 3) AAV Rep and Cap genes and gene products; 4) a transgene (such as a transgene comprising an endonuclease system) flanked by at least one AAV ITR sequence; and 5) suitable media and media components to support rAAV production.
  • Suitable media known in the art may be used for the production of rAAV vectors.
  • These media include, without limitation, media produced by Hyclone Laboratories and JRH including Modified Eagle Medium (MEM), Dulbecco's Modified Eagle Medium (DMEM), custom formulations such as those described in U.S. Patent No. 6,566,118, and Sf-900 II SFM media as described in U.S. Patent No. 6,723,551, each of which is incorporated herein by reference in its entirety, particularly with respect to custom media formulations for use in production of recombinant AAV vectors.
  • the rAAV particles can be produced using methods known in the art. See, e.g., U.S. Pat. Nos. 6,566,118; 6,989,264; and 6,995,006.
  • host cells for producing rAAV particles include mammalian cells, insect cells, plant cells, microorganisms, and yeast.
  • Host cells can also be packaging cells in which the AAV Rep and Cap genes are stably maintained in the host cell or producer cells in which the AAV vector genome is stably maintained.
  • Exemplary packaging and producer cells are derived from 293, A549, and HeLa cells.
  • AAV vectors are purified and formulated using standard techniques known in the art.
  • rAAV particles are generated by transfecting producer cells with a plasmid (cis-plasmid) containing an rAAV genome comprising a transgene flanked by the 145-nucleotide-long AAV ITRs and a separate construct expressing the AAV Rep and Cap genes in trans.
  • adenovirus helper factors such as E1A, E1B, E2A, E4ORF6, and VA RNAs, etc. may be provided by either adenovirus infection or by transfecting a third plasmid providing adenovirus helper genes into the producer cells.
  • Producer cells may be HEK293 cells.
  • Packaging cell lines suitable for producing AAV vectors may be readily generated using readily available techniques (see, e.g., U.S. Pat. No. 5,872,005).
  • the helper factors provided will vary depending on the producer cells used and whether the producer cells already carry some of these helper factors.
  • rAAV particles may be produced by a triple transfection method, such as the exemplary triple transfection method provided infra. Briefly, a plasmid containing a Rep gene and a Cap gene, along with a helper adenoviral plasmid, may be transfected (e.g., using the calcium phosphate method) into a cell line (e.g., HEK-293 cells), and virus may be collected and optionally purified.
  • rAAV particles may be produced by a producer cell line method, such as the exemplary producer cell line method provided infra (see also Martin et al. (2013) HUMAN GENE THERAPY METHODS 24:253-269).
  • a cell line e.g., a HeLa cell line
  • a cell line may be stably transfected with a plasmid containing a Rep gene, a Cap gene, and a promoter-transgene sequence.
  • Cell lines may be screened to select a lead clone for rAAV production, which may then be expanded to a production bioreactor and infected with an adenovirus (e.g., a wild-type adenovirus) as helper to initiate rAAV production.
  • Virus may subsequently be harvested, adenovirus may be inactivated (e.g., by heat) and/or removed, and the rAAV particles may be purified.
  • rAAV vector particles of the disclosure may be harvested from rAAV production cultures by lysis of the host cells of the production culture or by harvest of the spent media from the production culture, provided the cells are cultured under conditions known in the art to cause release of rAAV particles into the media from intact cells, as described more fully in U.S. Patent No. 6,566,118.
  • Suitable methods of lysing cells include for example multiple freeze/thaw cycles, sonication, microfluidization, and treatment with chemicals, such as detergents and/or proteases.
  • the rAAV particles are purified.
  • purified includes a preparation of rAAV particles devoid of at least some of the other components that may also be present where the rAAV particles naturally occur or are initially prepared from.
  • isolated rAAV particles may be prepared using a purification technique to enrich it from a source mixture, such as a culture lysate or production culture supernatant.
  • Enrichment can be measured in a variety of ways, such as, for example, by the proportion of DNase-resistant particles (DRPs) or genome copies (gc) present in a solution, or by infectivity, or it can be measured in relation to a second, potentially interfering substance present in the source mixture, such as contaminants, including production culture contaminants or in-process contaminants, including helper virus, media components, and the like.
  • rAAV particles may be isolated or purified using one or more of the following purification steps: equilibrium centrifugation; flow-through anionic exchange filtration; tangential flow filtration (TFF) for concentrating the rAAV particles; rAAV capture by apatite chromatography; heat inactivation of helper virus; rAAV capture by hydrophobic interaction chromatography; buffer exchange by size exclusion chromatography (SEC); nanofiltration; rAAV capture by anionic exchange chromatography, cationic exchange chromatography, or affinity chromatography.
  • the disclosure provides methods of preventing or treating PH or PAH in a subject by administering the endonuclease system as herein described.
  • the endonuclease system is administered using an AAV vector.
  • the AAV vector includes a nucleotide sequence encoding a promoter, a nucleotide sequence encoding a polyribonucleotide (e.g., a gRNA), and a nucleotide sequence encoding an endonuclease.
  • a gRNA e.g., a guide sequence
  • a target sequence of a hypoxia-induced gene (e.g., HIF1A, HIF2A, BMPR2, or ALK1)
  • the polyribonucleotide directs the endonuclease to the target sequence.
  • the endonuclease can induce a break in the target DNA. The break in the target DNA disrupts processing of an encoded mRNA transcript and expression of the encoded protein, thereby to prevent or treat PH or PAH.
  • the endonuclease induces a break in a HIF1A, HIF2A, BMPR2, or ALK1 gene, which disrupts processing of the encoded HIF1A, HIF2A, BMPR2, or ALK1 mRNA transcript and expression of the encoded HIF1A, HIF2A, BMPR2, or ALK1 protein, thereby to prevent or treat PH or PAH.
  • the method of preventing or treating PH or PAH in a subject includes co-administration of the AAV vector, as herein described, and an inhibitor of a hypoxia-induced gene.
  • the inhibitor is a small molecule inhibitor.
  • the small molecule inhibitor is selected from the group including belzutifan, PT2385, vadadustat, KC7F2, CAY10585, 2-Methoxyestradiol, SYP-5, PT2399, N-Acetylcysteine amide, IDF-11774, Lificiguat (YC-1), PX-478 2HCl, BAY 87-2243, C76 (Methyl-3-(2-(cyano(methylsulfonyl)methylene)hydrazino)thiophene-2-carboxylate), Roxadustat (FG-4592), Daprodustat (GSK1278863), Desidustat (ZYAN-1), Molidustat (BAY 85-3934), MK-8617, IOX-2, 2-methoxyestradiol, GN-44028, AKB-4924, FG-2216, and FG-4497.
  • the AAV vector may be administered prophylactically in individuals identified as having an elevated risk of developing PH or PAH (FIG. 6).
  • Clinical precursors used to anticipate the development of PH or PAH may include high blood pressure, left-sided heart disease (systolic LV failure, LV diastolic dysfunction, valvular diseases), lung disease (chronic obstructive pulmonary disease, interstitial lung disease, chronic thromboembolic pulmonary hypertension), chronic hemolytic anemia, sarcoidosis, chronic renal failure, connective tissue disease, liver cirrhosis, use of medication with a risk of producing PH or PAH as a side effect, genetic disorders or conditions (e.g., mutations in BMPR2 or ALK1).
  • the AAV vector and the small molecule inhibitor may be administered substantially simultaneously. In certain embodiments, the AAV vector and the small molecule inhibitor may be administered sequentially in any order.
  • the AAV alone or in combination with the small molecule inhibitor may be administered subcutaneously, intradermally, intravenously, intraperitoneally, via inhalation, nasally, orally, intramuscularly, intracranially, via intrapulmonary route, via ophthalmic route, parenterally, rectally, vaginally, or via a transmucosal route.
  • the endonuclease system may be administered at substantially the same time that hypoxia-induced effects are detected. In certain embodiments, the endonuclease system may be administered after hypoxia-induced effects are detected.
  • the AAV capsid is modified to improve therapy.
  • the capsid may be modified using conventional molecular biology techniques.
  • the capsid is modified for minimized immunogenicity, better stability and particle lifetime, efficient degradation, and/or accurate delivery of the endonuclease system to the nucleus.
  • the modification or mutation is an amino acid deletion, insertion, substitution, or any combination thereof in a capsid protein.
  • a modified polypeptide may comprise 1, 2, 3, 4, 5, up to 10, or more amino acid substitutions and/or deletions and/or insertions.
  • a “deletion” may comprise the deletion of individual amino acids, deletion of small groups of amino acids such as 2, 3, 4, or 5 amino acids, or deletion of larger amino acid regions, such as the deletion of specific amino acid domains or other features.
  • An “insertion” may comprise the insertion of individual amino acids, insertion of small groups of amino acids such as 2, 3, 4, or 5 amino acids, or insertion of larger amino acid regions, such as the insertion of specific amino acid domains or other features.
  • one or more amino acid substitutions are introduced into one or more of VP1, VP2, and VP3.
  • a modified capsid protein comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 conservative or non-conservative substitutions relative to the wild-type polypeptide.
  • the modified capsid polypeptide of the disclosure comprises modified sequences, wherein such modifications can include both conservative and non-conservative substitutions, deletions, and/or additions, and typically include peptides that share at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 87%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the corresponding wild-type capsid protein.
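The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two capsid sequences have already been aligned to equal length (with `-` marking gaps) and counts a residue opposite a gap as a mismatch. The function names and the toy sequences are hypothetical and are not taken from the disclosure, which does not specify an alignment method.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap; columns where both sequences have a gap
    are skipped, and a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity(identity_percent: float, minimum_percent: float) -> bool:
    """True if the identity satisfies an 'at least X%' criterion."""
    return identity_percent >= minimum_percent


# Toy 10-residue strings (hypothetical, not real capsid sequences).
wild_type = "ACDEFGHIKL"
variant = "ACDEFGHIRL"  # one substitution
identity = percent_identity(wild_type, variant)
print(identity, meets_identity(identity, 90.0))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only shows how a threshold such as "at least 90%" would be applied once an alignment exists.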
  • the methods of the invention also provide administering pharmaceutical compositions comprising an endonuclease system described herein and a pharmaceutically acceptable carrier.
  • the pharmaceutical compositions may be suitable for any mode of administration described herein.
  • the pharmaceutical compositions comprising a nucleic acid described herein and a pharmaceutically acceptable carrier are suitable for administration to a human subject.
  • Such carriers are well known in the art (see, e.g., Remington's Pharmaceutical Sciences, 15th Edition, pp. 1035-1038 and 1570-1580).
  • Such pharmaceutically acceptable carriers can be sterile liquids, such as water and oil, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, and the like.
  • Saline solutions and aqueous dextrose, polyethylene glycol (PEG) and glycerol solutions can also be employed as liquid carriers, particularly for injectable solutions.
  • the pharmaceutical composition may further comprise additional ingredients, for example preservatives, buffers, tonicity agents, antioxidants and stabilizers, nonionic wetting or clarifying agents, viscosity-increasing agents, and the like.
  • the pharmaceutical compositions described herein can be packaged in single unit dosages or in multi-dosage forms.
  • the compositions are generally formulated as sterile and substantially isotonic solutions.
  • the nucleic acid comprising the endonuclease system and compact bidirectional promoter for use in the target cells as detailed above is formulated into a pharmaceutical composition intended for oral, inhalation, intranasal, intratracheal, intravenous, intramuscular, subcutaneous, intradermal, or parenteral routes of administration.
  • a pharmaceutically and/or physiologically acceptable vehicle or carrier such as buffered saline or other buffers, e.g., HEPES, to maintain pH at appropriate physiological levels, and, optionally, other medicinal agents, pharmaceutical agents, stabilizing agents, buffers, carriers, adjuvants, diluents, etc.
  • the carrier will typically be a liquid.
  • physiologically acceptable carriers include sterile, pyrogen-free water and sterile, pyrogen-free, phosphate buffered saline.
  • the carrier is an isotonic sodium chloride solution.
  • the carrier is balanced salt solution.
  • the carrier includes Tween. If the virus is to be stored long-term, it may be frozen in the presence of glycerol or Tween 20.
  • the pharmaceutically acceptable carrier comprises a surfactant, such as perfluorooctane (Perfluoron liquid). Routes of administration may be combined, if desired.
  • the composition may be delivered in a volume of from about 0.1 µL to about 1 mL, including all numbers within the range, depending on the size of the area to be treated, the viral titer used, the route of administration, and the desired effect of the method.
  • the volume is about 50 µL.
  • the volume is about 70 µL.
  • the volume is about 100 µL.
  • the volume is about 125 µL.
  • the volume is about 150 µL.
  • the volume is about 175 µL.
  • the volume is about 200 µL.
  • the volume is about 250 µL.
  • the volume is about 300 µL. In certain embodiments, the volume is about 450 µL. In certain embodiments, the volume is about 500 µL. In certain embodiments, the volume is about 600 µL. In certain embodiments, the volume is about 750 µL. In certain embodiments, the volume is about 850 µL. In certain embodiments, the volume is about 1000 µL.
  • An effective concentration of a rAAV carrying a nucleic acid sequence encoding the desired transgene under the control of the cell-specific promoter sequence desirably ranges from about 10^7 to about 10^13 vector genomes per milliliter (vg/mL) (also called genome copies/mL (GC/mL)).
  • the rAAV infectious units in certain embodiments, are measured as described in S.K. McLaughlin et al. (1988) J. VIROL., 62: 1963, which is incorporated herein by reference.
  • the concentration in the target tissue is from about 1.5 x 10^9 vg/mL to about 1.5 x 10^12 vg/mL, such as from about 1.5 x 10^9 vg/mL to about 1.5 x 10^11 vg/mL.
  • the effective concentration is about 2.5 x 10^10 vg to about 1.4 x 10^11.
  • the effective concentration is about 1.4 x 10^8 vg/mL.
  • the effective concentration is about 3.5 x 10^10 vg/mL.
  • the effective concentration is about 5.6 x 10^11 vg/mL.
  • the effective concentration is about 5.3 x 10^12 vg/mL.
  • the effective concentration is about 1.5 x 10^12 vg/mL. In certain embodiments, the effective concentration is about 1.5 x 10^13 vg/mL. In certain embodiments, the effective dosage (total genome copies delivered) is from about 10^7 to 10^13 vector genomes. It is desirable that the lowest effective concentration of virus be utilized in order to reduce the risk of undesirable effects, such as toxicity. Still other dosages and administration volumes in these ranges may be selected by the attending physician, taking into account the physical state of the subject, such as a human, being treated, the age of the subject, the particular disorder and the degree to which the disorder, if progressive, has developed.
  • the vector is administered at a dose between 2.5 x 10^10 vg/kg and 1.4 x 10^11 vg/kg. In certain embodiments, the vectors are administered at a dose between 1.0 x 10^11 vg/kg and 1.5 x 10^13 vg/kg. In certain embodiments, the vectors are administered at a dose between 1.0 x 10^11 vg/kg and 1.5 x 10^12 vg/kg.
  • the vectors are administered at a dose of about 1.4 x 10^12. In certain embodiments, the vectors are administered at a dose of 1.4 x 10^12 vg/kg.
  • the pharmaceutical compositions of the disclosure comprise a pharmaceutically acceptable carrier. In certain embodiments, the pharmaceutical compositions of the disclosure comprise PBS. In certain embodiments, the pharmaceutical compositions of the disclosure comprise PLURONIC®. In certain embodiments, the pharmaceutical compositions of the disclosure comprise PBS, NaCl, and PLURONIC®. In certain embodiments, the vectors are administered by intravitreal injection in a solution of PBS with additional NaCl and PLURONIC®.
  • any of the endonuclease systems disclosed herein are assembled into a pharmaceutical, diagnostic, or research kit to facilitate their use in prophylactic, therapeutic, diagnostic, or research applications.
  • a kit may include one or more containers housing any of the vectors including the endonuclease system disclosed herein and instructions for use.
  • the kit may be designed to facilitate use of the methods described herein by researchers and can take many forms.
  • Each of the compositions of the kit may be provided in liquid form (e.g., in solution) or in solid form, (e.g., a dry powder).
  • some of the compositions may be constitutable or otherwise processable (e.g., to an active form), for example, by the addition of a suitable solvent or other species (e.g., water or a cell culture medium), which may or may not be provided with the kit.
  • “instructions” can define a component of instruction and/or promotion, and typically involve written instructions on or associated with packaging of the disclosure.
  • Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, e.g., audiovisual (e.g., videotape, DVD, etc.), internet, and/or web-based communications, etc.
  • the written instructions may be in a form prescribed by a governmental agency regulating the manufacture, use, or sale of pharmaceuticals or biological products, which instructions can also reflect approval by the agency of manufacture, use, or sale for animal administration.
  • Where compositions are described as having, including, or comprising specific components, or where processes and methods are described as having, including, or comprising specific steps, it is contemplated that, additionally, there are compositions of the present invention that consist essentially of, or consist of, the recited components, and that there are processes and methods according to the present invention that consist essentially of, or consist of, the recited processing steps.
  • Where an element or component is said to be included in and/or selected from a list of recited elements or components, it should be understood that the element or component can be any one of the recited elements or components, or the element or component can be selected from a group consisting of two or more of the recited elements or components.
  • Example 1 In vitro characterization of editing performance using an endonuclease system with a compact promoter
  • pX330 a widely used CRISPR plasmid.
  • the CBh promoter (~800 bp) in pX330 is an engineered, strong Pol II promoter system that contains an enhancer and a hybrid intron, two additional elements that boost expression.
  • the pX330 plasmid also contains the U6 promoter (a Pol III promoter). This demonstrates that the small endogenous mammalian promoter (H1) resulted in comparable editing activity to a strong promoter-enhancer, despite being less than half the size (FIG. 2).
  • packaging of the constructs into an AAV resulted in robust editing (FIG. 3).
  • AAVs comprising the compact promoter were properly packaged with few disrupted virions (FIGs. 4A-4B), indicating that the promoter was sufficiently small to enable efficient packaging and editing.
  • Example 2 In vivo characterization of editing performance using an endonuclease system with a compact promoter
  • FIG. 5A eGFP-expressing mice that were administered the endonuclease system showed increasing levels of editing in the liver (FIG. 5B) and in the heart (FIG. 5C) at 14, 21, 28, and 42 days post administration. To the inventors’ knowledge, this was the first and only in vivo demonstration of gene-editing using SpCas9-gRNA delivered through a single AAV.
  • This Example describes identification of target nucleotide sequences in the human HIF-2a gene and characterization of HIF-2a transcript and protein levels.
  • HeLa cells were selected due to their robust expression of HIF-2a following activation by either hypoxia or small-molecule activators.
  • genomic DNA was extracted and amplified by PCR.
  • PCR primers flanking the on-target site generated 150-500 bp single-band amplicons.
  • Primer3 was used to identify 3 forward primers and 3 reverse primers for a given genomic locus, which enabled testing of 9 amplicons.
  • PCR reactions were performed using a high-fidelity DNA polymerase (Platinum SuperFi II, Thermo Fisher), then analyzed on PAGE gels stained with SYBR Gold and visualized with a UV transilluminator to identify single-band amplifications. This sensitive approach enabled the selection of highly specific amplicons, as agarose gel and ethidium bromide staining often fail to detect spurious amplifications.
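The 3 x 3 primer design described above yields 9 candidate amplicons per locus. A minimal sketch of that enumeration follows, together with a check against the 150-500 bp amplicon window. The primer names and the helper functions are hypothetical placeholders and do not reflect Primer3 output or the actual primers used.

```python
from itertools import product


def primer_pairs(forward, reverse):
    """Return every (forward, reverse) primer combination for one locus."""
    return list(product(forward, reverse))


def in_size_window(amplicon_bp, low=150, high=500):
    """True if an amplicon length falls inside the 150-500 bp window."""
    return low <= amplicon_bp <= high


# Hypothetical primer labels; real sequences would come from primer design.
forward = ["F1", "F2", "F3"]
reverse = ["R1", "R2", "R3"]

pairs = primer_pairs(forward, reverse)
print(len(pairs))  # 9 combinations to test
for fwd, rev in pairs:
    print(fwd, rev)
```

Each pair would then be tested empirically, and only pairs giving a single band on the gel would be carried forward.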
  • DNA editing and repair alleles will be confirmed by NGS in two additional human cell lines (A549 and HEK293). Two targets (plus 1 nontargeting control) will be advanced, based on (i) editing levels and (ii) repair alleles leading to protein disruption.
  • Plasmid constructs will be transfected into HeLa cells, followed by activation of HIF-2a expression with chemically-induced hypoxia (150 µM CoCl2) or dH2O vehicle control for 24 h.
  • RNA will be extracted from cells and treated with DNase I.
  • cDNA will be generated using an oligo(dT) primer (High-Capacity cDNA RT Kit, Thermo Fisher), followed by PCR with HIF-2a cDNA primers spanning the DNA target site, column-purification, and sequencing to detect mRNA sequence changes.
  • HIF-2a protein (10-30 µg) will be separated by SDS-PAGE, transferred to PVDF, and blotted using primary antibodies against HIF-2a and β-actin. Blots will be probed with HRP-conjugated secondary antibodies and visualized using chemiluminescence. Protein bands corresponding to full-length HIF-2a will be quantified by ImageJ and normalized to β-actin. HIF-2a expression will be compared across treatment groups for statistical significance by ANOVA, and targets that reduced HIF-2a protein by > 50% will be advanced for AAV validation studies based on a target of 50% inhibition (a level that is protective in mouse models) multiplied by an expected plasmid transfection efficiency in HeLa cells determined above.
  • the transfection efficiency determined in the optimization experiments will be used to determine the precise threshold criteria. For example, if the plasmid transfection efficiency in cells is determined to be 60%, the threshold for success will be a 30% overall reduction (60% transfection efficiency x 50% therapeutic threshold).
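The threshold arithmetic above (transfection efficiency multiplied by the 50% therapeutic target) can be written as a short sketch. The function names are hypothetical, and treating a reduction that exactly meets the threshold as a success is an assumption, since the text does not state how ties are handled.

```python
def success_threshold(transfection_efficiency, therapeutic_threshold=0.50):
    """Overall protein reduction required, as a fraction of total signal.

    Both arguments are fractions between 0 and 1. The 0.50 default is the
    50% inhibition target stated in the text.
    """
    for value in (transfection_efficiency, therapeutic_threshold):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return transfection_efficiency * therapeutic_threshold


def target_advances(observed_reduction, transfection_efficiency):
    """Assumption: a target advances if it meets or exceeds the threshold."""
    return observed_reduction >= success_threshold(transfection_efficiency)


# Worked example from the text: 60% transfection efficiency.
threshold = success_threshold(0.60)
print(f"required overall reduction: {threshold:.0%}")  # 30%
print(target_advances(0.35, 0.60))  # True
print(target_advances(0.25, 0.60))  # False
```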
  • Cas12a target sites will be packaged into an AAV vector and HIF-2a pathway disruption will be assessed following transduction of hPAECs.
  • AAV serotype specificity and transduction efficiency in human pulmonary artery endothelial cells (hPAECs) are not well established.
  • AAV transduction will be optimized in hPAECs using an AAV serotype testing kit to test 12 AAV serotypes using a range of MOIs (1 x 10^3 to 1 x 10^5).
  • the AAV serotype with the greatest transduction efficiency without visible cellular toxicity will be used for packaging.
  • a minimal threshold of 30% transduction will be set for hPAECs.
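As an illustration of the MOI range above, the sketch below converts an MOI into the number of vector genomes and the volume of AAV stock needed per well. It assumes MOI is expressed as vector genomes per cell. The cell count per well, the stock titer, and the function names are hypothetical values chosen only for the example.

```python
def vector_genomes_needed(moi, cells_per_well):
    """Vector genomes required for one well (assumes MOI = vg per cell)."""
    return moi * cells_per_well


def stock_volume_ul(moi, cells_per_well, titer_vg_per_ml):
    """Volume of AAV stock, in microliters, that delivers the requested MOI."""
    vg = vector_genomes_needed(moi, cells_per_well)
    return vg / titer_vg_per_ml * 1000.0


# Hypothetical plate setup: 5e4 cells per well, stock at 1e11 vg/mL.
cells = 5e4
titer = 1e11
for moi in (1e3, 1e4, 1e5):
    volume = stock_volume_ul(moi, cells, titer)
    print(f"MOI {moi:.0e}: {volume:.2f} uL per well")
```

A serotype would then be compared against the 30% minimal transduction threshold using whatever readout the serotype testing kit provides.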
  • Two targeting constructs and one non-targeting control will be cloned into ITR-containing plasmids and sequenced to verify the expression cassette. Validated plasmids will be amplified and maxi-prepped to generate sufficient endotoxin-free material. Plasmids will be sequenced and digested with SmaI to verify ITR integrity prior to packaging. Small-scale preps will be produced and crude lysate will be titered by dPCR and quantified on a QuantStudio 3D Digital PCR. Crude AAV lysates will be added to the cell culture media of hPAECs and incubated for 48 hours. HIF-2a activation will be induced via exposure to hypoxia (1% O2) or normoxia for 24 hours. DNA will be isolated and Sanger-sequenced to verify editing, followed by confirmation by NGS (TABLE 5).
  • HIF pathway suppression will be determined in hPAECs grown in either hypoxic or normoxic conditions for 24 h. Pathway inhibition will be benchmarked against 20 µM C76, a selective HIF-2a translation inhibitor that reduces activated HIF-2a to wild-type levels. Due to rapid degradation of HIF-2a, RNA and protein will be harvested immediately. RNA will be extracted and cDNA will be generated as described above.
  • cDNA samples and RT controls will be used as templates for qPCR reactions containing primer-probe sets for HIF-1a and HIF-2a, as well as HIF-2a transcriptional targets, DLL4, ANGPT2, VEGFA, FGF2, and TIE2 (refs. 54-58); and GAPDH and HIF-1a as endogenous controls (TABLE 6).
  • HIF-1a and HIF-2a proteins will be quantified as described above.
  • Example 5 In vivo Hif2a targeting in a murine pulmonary hypertension model assessing a prophylactic intervention strategy
  • Mouse HIF-2a Cas12a target sites were identified bioinformatically and will be tested in mouse cells for HIF-2a pathway disruption, as described for human targets above. Validated targets will be packaged with AAV-L1, and HIF-2a targeting will be assessed in vivo in a hypoxia mouse model for PH, a tractable model dependent on HIF-2a activation that recapitulates many hallmarks of PAH pathology.
  • AAV will be administered 3 weeks prior to the initiation of exposure to hypoxic conditions (FIG. 6).
  • AAV will be administered concomitantly with exposure (FIG. 7), thereby increasing the stringency through a reduction in the therapeutic window prior to hypoxia.
  • the duration of hypoxia will be 4 weeks; and DNA editing, mean PA pressure (mPAP), hematocrit, RV hypertrophy, and pulmonary arterial remodeling will be determined.
  • the lead target (and non-targeting control) will be cloned into an ITR-containing vector, as described.
  • Packaging is done by triple-transfection of HEK293 cells, as described above with a few exceptions: an AAV-L1 cap plasmid is used, and in vivo preps are then purified by iodixanol gradient centrifugation and concentrated.
  • Each vector is subjected to standardized assessments such as titering by digital PCR, endotoxin quantification, and AAV prep purity and stoichiometric analysis of VP1, VP2, and VP3 capsid proteins by polyacrylamide gel electrophoresis followed by silver staining or SYPRO Red staining.
  • Preps are further imaged using electron microscopy to visualize full and empty vector particles and viral prep integrity.
  • Male C57BL/6 mice will be randomized at 5 weeks of age to receive 2.5 x 10^12 vg/kg AAV-L1-Hif-2a or control via retroorbital injection (note: female mice have milder phenotypes in the hypoxia model). After 3 weeks of normoxia (to enable AAV-mediated expression), half of the mice will be exposed to 10% O2 (hypoxia) for 4 weeks while the other half will remain at normoxia for 4 weeks. This treatment strategy will be benchmarked against C76, which will be administered daily by intraperitoneal injection at 12.5 mg/kg (or 0.5% DMSO vehicle control) throughout the exposure.
  • Mouse PA endothelial cells will be isolated from AAV-treated hypoxic mice as previously described (see, Dong, Q. G. et al. (1997) ARTERIOSCLER THROMB VASC BIOL 17, 1599-1604; Marelli-Berg et al. (2000) J IMMUNOL METHODS 244, 205-215). DNA will be extracted and amplified using PCR primers specific to the mouse Hif-2a sequence and column-purified amplicons analyzed by NGS, as described (TABLE 8).
  • (i) mPAP.
  • mPAP will be calculated from closed-chest RV diastolic and systolic pressures measured in anesthetized mice via a 23-gauge needle filled with heparinized saline and connected to a pressure transducer.
  • the threshold of pathologic mPAP will be > 20 mmHg;
  • RV Hypertrophy. Under a dissecting microscope, the atria and extraneous vascular material will be removed from the heart. The RV wall will be separated from the LV and septum, and both portions will be quickly blotted dry and weighed. RV weight will be normalized to the combined weight of LV plus septum. The threshold of pathologic RV/(LV+S) in this model will be > 0.25;
  • Pulmonary Arterial Remodeling. A suture will be used to occlude the right lung, which will be removed for subsequent DNA analysis.
  • the left lung will be inflated with 10% formalin, embedded in paraffin, and sectioned. It is expected that extension of smooth muscle into previously non-muscular vessels will be observed by confocal microscopy as an increase in small diameter vessels (~100 µm outer diameter) that are positive for smooth muscle-specific α-actin (SMA). Morphometric analysis of arterial medial thickness will be quantified from H&E-stained sections, while collagen deposition will be quantified from sections stained with Picrosirius red. The threshold of pathologic remodeling in this model will be > 40% SMA-positive vessels.
  • 50 µL of AAV (in saline) will be injected into the retroorbital vessels of anesthetized mice using a 26-30 gauge needle.
  • light pressure will be applied to the eye to control bleeding, and ophthalmic ointment will be applied to prevent eye drying. Animals will be immediately returned to their home cage and monitored for recovery from anesthesia before returning to the exposure chamber.
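The weight-based dose (2.5 x 10^12 vg/kg) and the fixed retroorbital injection volume of 50 µL together determine the vector titer that must be prepared. A minimal sketch of that conversion follows. The 20 g body weight is an assumed, illustrative value, and the function names are hypothetical.

```python
def total_vector_genomes(dose_vg_per_kg, body_weight_g):
    """Total vector genomes for one animal from a vg/kg dose."""
    return dose_vg_per_kg * (body_weight_g / 1000.0)


def required_titer_vg_per_ml(total_vg, injection_volume_ul):
    """Titer needed so the full dose fits in the injection volume."""
    return total_vg / (injection_volume_ul / 1000.0)


dose = 2.5e12        # vg/kg, from the text
weight_g = 20.0      # assumed body weight of a young mouse (illustrative)
volume_ul = 50.0     # injection volume, from the text

total = total_vector_genomes(dose, weight_g)
titer = required_titer_vg_per_ml(total, volume_ul)
print(f"total dose: {total:.2e} vg")          # 5.00e+10 vg
print(f"required titer: {titer:.2e} vg/mL")   # 1.00e+12 vg/mL
```

Heavier or lighter animals scale the total dose linearly, so the required titer would be recomputed per cohort if the injection volume is kept fixed.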
  • a positive (C76) and negative (0.5% DMSO) control group will be included.
  • Conscious mice will receive daily dosing of the HIF-2a translation inhibitor C76 (12.5 mg/kg) or DMSO vehicle control via intraperitoneal injection (100 µL).
  • mice will be dosed with C76 or DMSO throughout the entire 4-week exposure.
  • mice will be dosed only for the final 2 weeks of exposure.
  • After pressure measurements are obtained (approximately 20 sec), the animal will be euthanized via exsanguination from the right ventricle to obtain blood for hematocrit measurement. At the conclusion of this measurement, the mice will be euthanized according to AVMA-approved methods. Hearts and lungs will be harvested from euthanized mice. mPAP will be determined using the equation: mPAP = 2/3 PADP + 1/3 PASP. Pulmonary hypertension will be defined as a resting mean pulmonary arterial pressure of > 20 mmHg.
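The hemodynamic and hypertrophy readouts described for this model reduce to two short formulas: mPAP as 2/3 PADP + 1/3 PASP, and RV weight normalized to LV plus septum. The sketch below applies them with the stated cutoffs (> 20 mmHg and > 0.25). The function names and the example measurements are hypothetical and are not data from the disclosure.

```python
def mean_pap(padp_mmhg, pasp_mmhg):
    """Mean pulmonary arterial pressure: 2/3 diastolic + 1/3 systolic."""
    return (2.0 * padp_mmhg + pasp_mmhg) / 3.0


def rv_hypertrophy_index(rv_mg, lv_plus_septum_mg):
    """RV weight normalized to the combined LV plus septum weight."""
    if lv_plus_septum_mg <= 0:
        raise ValueError("LV plus septum weight must be positive")
    return rv_mg / lv_plus_septum_mg


def is_pulmonary_hypertension(mpap_mmhg, threshold=20.0):
    """Cutoff from the text: mPAP > 20 mmHg."""
    return mpap_mmhg > threshold


def is_rv_hypertrophy(index, threshold=0.25):
    """Cutoff from the text: RV/(LV+S) > 0.25."""
    return index > threshold


# Hypothetical measurements for one animal.
mpap = mean_pap(padp_mmhg=15.0, pasp_mmhg=36.0)
index = rv_hypertrophy_index(rv_mg=6.0, lv_plus_septum_mg=20.0)
print(mpap, is_pulmonary_hypertension(mpap))   # 22.0 True
print(index, is_rv_hypertrophy(index))         # 0.3 True
```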

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Biotechnology (AREA)
  • Wood Science & Technology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Mycology (AREA)
  • Biophysics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Epidemiology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates generally to compact promoters and their use in gene editing, e.g., for treating or preventing disease, such as pulmonary arterial hypertension.

Description

COMPACT PROMOTERS FOR TARGETING HYPOXIA INDUCED GENES
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of and priority to U.S. Provisional Application No. 63/403,559, filed September 2, 2022, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes.
FIELD OF THE INVENTION
[0002] The invention relates to compact promoters and their use in expressing gene editing systems, e.g., for treating or preventing disease, such as pulmonary arterial hypertension.
BACKGROUND
[0003] Pulmonary arterial hypertension (PAH) is a rapidly progressing pulmonary vascular disease that leads to right heart failure and premature death. The hallmark features of PAH are increased pulmonary arterial pressure, vascular remodeling, and right ventricle hypertrophy. There are no curative treatments for PAH, and current medications result in modest impacts to morbidity and mortality; approximately 1,000 new cases of PAH are diagnosed in the U.S. each year, with a median survival of 6 years. Thus, novel therapies are urgently needed.
[0004] Hypoxia-inducible factors (HIF) are critical mediators of the oxygen sensing and adaptive pathways. HIF-2a, encoded by the endothelial PAS domain protein 1 (EPAS1) gene, is expressed in the pulmonary endothelium, and strong evidence supports HIF-2a pathway activation in PAH. Patients with PAH have elevated HIF-2a levels in endothelial cells, and pathway activation is well supported by preclinical models. Conversely, reduced pathway activity through either pharmacologic inhibition or conditional knockouts is protective in multiple animal models. A role for HIF-2a is further supported by genome-wide studies of high-altitude populations with low pulmonary arterial pressure that carry EPAS1 variants with reduced activity. Gene-editing represents a novel therapeutic approach to suppress the HIF-2a pathway activation that underlies PAH etiology.
[0005] The therapeutic potential of clustered regularly interspaced short palindromic repeats (CRISPR) is well-appreciated and has begun to be applied to treating select diseases; however, there is a significant barrier for in vivo therapeutic development. Though adeno-associated viruses (AAV) are a safe and effective gene delivery vehicle, the small genetic payload of AAV presents a major impediment toward therapeutic development of CRISPR. The discovery of compact bidirectional promoters that enable packaging of CRISPR components into a single AAV virus presents an opportunity for gene-editing-based therapeutics.
SUMMARY OF THE INVENTION
[0006] The disclosure is based, in part, upon the discovery of compact, bidirectional promoters that can be used to express both an endonuclease (e.g., a Cas9 endonuclease) and a guide RNA (gRNA). For example, in certain embodiments disclosed herein, a compact, bidirectional promoter directs expression of a gRNA in one direction and an endonuclease in the other direction. Accordingly, the promoters disclosed herein use less space than prior art promoters, allowing both an endonuclease and a gRNA to be packaged in a single vector (e.g., a plasmid or an AAV).
[0007] In one aspect, the disclosure relates to a non-naturally occurring endonuclease system having a nucleotide sequence encoding a promoter, a nucleotide sequence encoding a polyribonucleotide, and a nucleotide sequence encoding an endonuclease wherein the promoter is operably linked to the sequence encoding the polyribonucleotide and to the sequence encoding the endonuclease, wherein a portion of the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene, and wherein the polyribonucleotide directs the endonuclease to the target sequence.
[0008] In certain embodiments, the promoter is a bidirectional promoter. In certain embodiments, the promoter is an H1 promoter. In certain embodiments, the H1 promoter is a bidirectional promoter that includes pol II and pol III activity. In certain embodiments, the promoter has a length of from 50 bp to 225 bp. In certain embodiments, the promoter has a length of from 50 bp to 200 bp. In certain embodiments, the promoter has a length of from 50 bp to 180 bp. In certain embodiments, the promoter includes a nucleic acid sequence selected from SEQ ID NOs: 1-226, 242-521, and 171-175, or a nucleic acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
[0009] In certain embodiments, the DNA endonuclease is Cas9, Cas12, or MAD7. In certain embodiments, the Cas9 endonuclease is selected from the group consisting of SpCas9, SaCas9, StCas9, NmCas9, and GeoCas9. In certain embodiments, the Cas12 endonuclease is selected from the group consisting of Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, and Cas12i. In certain embodiments, the DNA endonuclease is selected from the group consisting of Cas14a, Cas14b, Cas14c, and Cas . In certain embodiments, the endonuclease is codon optimized for expression in a eukaryotic cell.
[0010] In certain embodiments, the portion of the polyribonucleotide that hybridizes to the target sequence includes a nucleotide sequence selected from SEQ ID NOs: 538-710.
[0011] In certain embodiments, the endonuclease system is incorporated into a single vector. In certain embodiments, the single vector is a viral vector or a plasmid. In certain embodiments, the single vector is an AAV vector. In certain embodiments, the AAV vector includes a nucleotide sequence having a length of from 3 kbp to 6 kbp. In certain embodiments, the AAV vector is selected from the group including AAV1, AAV2, and AAV5. In certain embodiments, the AAV vector includes a non-naturally occurring nucleotide sequence encoding a targeting peptide. In certain embodiments, the targeting peptide confers cell type-specific tropism to the AAV vector. In certain embodiments, the AAV vector confers tropism to a lung endothelial cell and/or to a lung artery smooth muscle cell.
[0012] In certain embodiments, the targeting peptide includes from 5 to 25 amino acids. In certain embodiments, the targeting peptide includes an amino acid sequence selected from SEQ ID NOs: 533-537 or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto. In certain embodiments, the targeting peptide includes an amino acid sequence including ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537). In certain embodiments, the AAV vector is AAV-L1. In certain embodiments, the hypoxia-induced gene includes hypoxia-inducible factor-1-alpha (HIF1A), hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1).
[0013] In another aspect, the invention provides a method of preventing or treating Pulmonary Hypertension (PH) or Pulmonary Arterial Hypertension (PAH) in a subject in need thereof, the method including administering to the subject an adeno-associated viral (AAV) vector having a nucleotide sequence encoding a promoter, a nucleotide sequence encoding a polyribonucleotide, and a nucleotide sequence encoding an endonuclease, wherein the promoter is operably linked to the sequence encoding the polyribonucleotide and to the sequence encoding the endonuclease, wherein a portion of the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene in a cell of the subject, and wherein the polyribonucleotide directs the endonuclease to the target sequence.
[0014] In certain embodiments, the promoter is a bidirectional promoter. In certain embodiments, the promoter is an H1 promoter. In certain embodiments, the H1 promoter is a bidirectional promoter including pol II and pol III activity. In certain embodiments, the pol II activity promotes expression of the endonuclease and the pol III activity promotes expression of the polyribonucleotide. In certain embodiments, the promoter has a length of from 50 bp to 225 bp. In certain embodiments, the promoter has a length of from 50 bp to 200 bp. In certain embodiments, the promoter has a length of from 50 bp to 180 bp. In certain embodiments, the promoter includes a nucleic acid sequence selected from SEQ ID NOs: 1-226, 242-521, and 171-175, or a nucleic acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
[0015] In certain embodiments, the endonuclease is Cas9, Cas12, or MAD7. In certain embodiments, the Cas9 endonuclease is selected from the group including SpCas9, SaCas9, StCas9, NmCas9, and GeoCas9. In certain embodiments, the Cas12 endonuclease is selected from the group including Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, and Cas12i. In certain embodiments, the DNA endonuclease is selected from the group including Cas14a, Cas14b, Cas14c, and Cas . In certain embodiments, the endonuclease is codon optimized for expression in a eukaryotic cell.
[0016] In certain embodiments, the portion of the polyribonucleotide that hybridizes to the target sequence includes a nucleotide sequence selected from SEQ ID NOs: 538-710. In certain embodiments, the AAV vector includes a nucleotide sequence having a length of from 3 kbp to 6 kbp. In certain embodiments, the AAV vector is selected from the group consisting of AAV1, AAV2, and AAV5. In certain embodiments, the AAV vector further includes a non-naturally occurring nucleotide sequence encoding a targeting peptide.
[0017] In certain embodiments, the targeting peptide confers cell type-specific tropism to the AAV vector. In certain embodiments, the AAV vector confers tropism to a lung endothelial cell and/or to a lung artery smooth muscle cell. In certain embodiments, the targeting peptide includes from about 4 to about 25 amino acids, for example, from about 4 to about 10 amino acids, from about 4 to about 15 amino acids, from about 4 to about 20 amino acids, from about 5 to about 10 amino acids, from about 5 to about 15 amino acids, or from about 5 to about 20 amino acids. In certain embodiments, the targeting peptide includes an amino acid sequence selected from SEQ ID NOs: 533-537 or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto. In certain embodiments, the targeting peptide includes an amino acid sequence including ESGHGYF (SEQ ID NO: 533). In certain embodiments, the targeting peptide includes an amino acid sequence including ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537).
[0018] In certain embodiments, the AAV vector is AAV-L1. In certain embodiments, the method further includes administering the composition prophylactically, concurrently, or following onset of PH or PAH. In certain embodiments, the hypoxia-induced gene includes hypoxia-inducible factor-1-alpha (HIF1A), hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1).
[0019] In certain embodiments, the method further includes administering an inhibitor of the hypoxia-induced gene. In certain embodiments, the inhibitor is a small molecule inhibitor. In certain embodiments, the small molecule inhibitor is selected from the group including belzutifan, PT2385, vadadustat, KC7F2, CAY10585, 2-Methoxyestradiol, SYP-5, PT2399, N-Acetylcysteine amide, IDF-11774, Lificiguat (YC-1), PX-478 2HCl, BAY 87-2243, C76 (Methyl-3-(2-(cyano(methylsulfonyl)methylene)hydrazino)thiophene-2-carboxylate), Roxadustat (FG-4592), Daprodustat (GSK1278863), Desidustat (ZYAN-1), Molidustat (BAY 85-3934), MK-8617, IOX-2, 2-methoxyestradiol, GN-44028, AKB-4924, FG-2216, and FG-4497.
[0020] These and other aspects and features of the invention are described in the following detailed description and claims.
DESCRIPTION OF THE DRAWINGS
[0021] The invention can be more completely understood with reference to the following drawings.
[0022] FIG. 1 is a schematic drawing showing the region in which the H1 promoter is located, between the start of the H1RNA gene (left) to the start of the PARP2 gene (right). The approximate distance from the H1RNA transcriptional start site to the PARP2 translational start site is indicated as 175 bp.
[0023] FIG. 2 is a graph showing quantified editing observed as a result of suppressing GFP expression, at three different target sites of GFP, using a single H1 bidirectional promoter or the CBh Pol II promoter and the U6 Pol III promoter.
[0024] FIG. 3 is a graph showing the editing observed as a result of suppressing GFP expression, at three different target sites of GFP, by an endonuclease (SpCas9) system operably linked to a single H1 bidirectional promoter packaged into an AAV vector. No comparison to pX330, a widely used CRISPR plasmid, is shown because the pX330 system was too large to be packaged into an AAV vector.
[0025] FIG. 4A is an electron micrograph image of a purified AAV sample indicating proper packaging of the endonuclease system constructs into viral capsids (arrows indicate empty particles). Scale bar indicates 100 nm.
[0026] FIG. 4B is an electron micrograph image of a purified AAV sample indicating proper packaging of the endonuclease system constructs into viral capsids (arrows indicate empty particles). Scale bar indicates 20 nm.
[0027] FIG. 5A is a schematic drawing showing a timeline of the in vivo targeting of eGFP expression in a transgenic mouse model using AAV9 loaded with eGFP-targeting and editing constructs employing an H1 promoter. Two organs with AAV9 tropism, the liver and the heart, were analyzed for eGFP expression changes.
[0028] FIG. 5B is a graph showing indel analysis (quantified deletions) in the liver when measured at 14, 21, 28, 35, and 42 days following administration of the eGFP targeting endonuclease system packaged in AAV9 particles.
[0029] FIG. 5C is a graph showing indel analysis (quantified deletions) in the heart when measured at 14, 21, 28, 35, and 42 days following administration of the eGFP targeting endonuclease system packaged in AAV9 particles.
[0030] FIG. 6 is a schematic drawing showing an experimental timeline for prophylactic treatment. The experimental timeline is shown with a 5-week-old mouse being administered AAV at day 0 under normoxic conditions and shifted to hypoxic conditions at 3 weeks. The dashed line models AAV expression in vivo, with expression onset just prior to hypoxia exposure.
[0031] FIG. 7 is a schematic drawing showing an experimental timeline for therapeutic treatment. The experimental timeline shows a mouse being administered AAV at initiation of hypoxic conditions. The dashed line models AAV expression in vivo, with expression onset occurring during hypoxic conditions.
DETAILED DESCRIPTION
[0032] Various features and aspects of the invention are discussed in more detail below.
[0033] The disclosure is based, in part, upon the discovery of compact, bidirectional promoters that can be used to express both an endonuclease (e.g., a Cas9 endonuclease) and a guide RNA (gRNA). For example, a non-naturally occurring endonuclease system including a compact promoter is disclosed herein. In certain embodiments, the endonuclease system includes a nucleotide sequence encoding the promoter, a nucleotide sequence encoding a polyribonucleotide, and a nucleotide sequence encoding an endonuclease. In addition, the disclosure is based on methods for preventing and treating Pulmonary Hypertension (PH) or Pulmonary Arterial Hypertension (PAH) by administering the non-naturally occurring endonuclease system. In certain embodiments, the endonuclease is administered with an inhibitor of a hypoxia-induced gene.
[0034] Unless otherwise defined herein, scientific and technical terms used in this application shall have the meanings that are commonly understood by those of ordinary skill in the art.
[0035] Generally, nomenclature used in connection with, and techniques of, pharmacology, cell and tissue culture, molecular biology, cell and cancer biology, neurobiology, neurochemistry, virology, immunology, microbiology, genetics and protein and nucleic acid chemistry, described herein, are those well-known and commonly used in the art. In case of conflict, the present specification, including definitions, will control.
[0036] The practice of the present disclosure will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, such as, Molecular Cloning: A Laboratory Manual, second edition (Sambrook et al., 1989) Cold Spring Harbor Press; Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Methods in Molecular Biology, Humana Press; Cell Biology: A Laboratory Notebook (J.E. Celis, ed., 1998) Academic Press; Animal Cell Culture (R.I. Freshney, ed., 1987); Introduction to Cell and Tissue Culture (J.P. Mather and P.E. Roberts, 1998) Plenum Press; Cell and Tissue Culture: Laboratory Procedures (A. Doyle, J.B. Griffiths, and D.G. Newell, eds., 1993-1998) J. Wiley and Sons; Methods in Enzymology (Academic Press, Inc.); Gene Transfer Vectors for Mammalian Cells (J.M. Miller and M.P. Calos, eds., 1987); Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987); PCR: The Polymerase Chain Reaction, (Mullis et al., eds., 1994); Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (2001); Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, NY (2002); Harlow and Lane Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1998); Coligan et al., Short Protocols in Protein Science, John Wiley & Sons, NY (2003); Short Protocols in Molecular Biology (Wiley and Sons, 1999).
[0037] Enzymatic reactions and purification techniques are performed according to manufacturer’s specifications, as commonly accomplished in the art or as described herein. The nomenclatures used in connection with, and the laboratory procedures and techniques of, analytical chemistry, biochemistry, immunology, molecular biology, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques are used for chemical syntheses, and chemical analyses.
[0038] Throughout this specification and embodiments, the word “comprise,” or variations such as “comprises” or “comprising,” will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
[0039] It is understood that wherever embodiments are described herein with the language “comprising,” otherwise analogous embodiments described in terms of “consisting of” and/or “consisting essentially of” are also provided.
[0040] Any example(s) following the term “e.g.” or “for example” is not meant to be exhaustive or limiting.
[0041] Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.
[0042] Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements. Moreover, all ranges disclosed herein are to be understood to encompass any and all subranges subsumed therein. For example, a stated range of “1 to 10” should be considered to include any and all subranges between (and inclusive of) the minimum value of 1 and the maximum value of 10; that is, all subranges beginning with a minimum value of 1 or more, e.g., 1 to 6.1, and ending with a maximum value of 10 or less, e.g., 5.5 to 10.
[0043] Where aspects or embodiments of the disclosure are described in terms of a Markush group or other grouping of alternatives, the present disclosure encompasses not only the entire group listed as a whole, but each member of the group individually and all possible subgroups of the main group, but also the main group absent one or more of the group members. The present disclosure also envisages the explicit exclusion of one or more of any of the group members in an embodiment of the disclosure.
[0044] Exemplary methods and materials are described herein, although methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure. The materials, methods, and examples are illustrative only and not intended to be limiting.
I. Definitions
[0045] The following terms, unless otherwise indicated, shall be understood to have the following meanings:
[0046] The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element. Reference to “about” a value or parameter herein includes (and describes) embodiments that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.” Numeric ranges are inclusive of the numbers defining the range. Where the use of the term “about” is before a quantitative value, the present invention also includes the specific quantitative value itself, unless specifically stated otherwise. As used herein, the term “about” refers to a ± 10% variation from the nominal value unless otherwise indicated or inferred.
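By way of non-limiting illustration, the ± 10% reading of “about” stated above can be sketched in Python as follows. The function names `about_range` and `is_about` and the `tolerance` parameter are hypothetical names introduced here for illustration only and are not part of the disclosure.

```python
def about_range(nominal, tolerance=0.10):
    """Return the (low, high) bounds covered by "about <nominal>".

    Assumes the default +/-10% variation from the nominal value;
    a different tolerance can be supplied where context indicates one.
    """
    delta = abs(nominal) * tolerance
    return (nominal - delta, nominal + delta)


def is_about(value, nominal, tolerance=0.10):
    """Return True if value falls within "about <nominal>" (bounds inclusive)."""
    low, high = about_range(nominal, tolerance)
    return low <= value <= high


if __name__ == "__main__":
    # Illustrative nominal value only: a length of "about 200 bp".
    low, high = about_range(200)
    print("about 200 bp covers roughly %.0f bp to %.0f bp" % (low, high))
```

Under this reading, a length of about 200 bp spans roughly 180 bp to 220 bp, and the nominal value itself is always included.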
[0047] As used herein, the term “adeno-associated virus” (AAV) refers to a vector derived from an AAV serotype, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC10, AAV.HSC11, AAV.HSC12, AAV.HSC13, AAV.HSC14, AAV.HSC15, AAV-TT, AAV-DJ8, or AAV.HSC16. AAV vectors can have one or more of the AAV wild-type genes deleted in whole or part, e.g., the rep and/or cap genes, but retain functional flanking inverted terminal repeat (ITR) sequences. Functional ITR sequences promote the rescue, replication, and packaging of the AAV virion. Thus, an AAV vector is defined herein to include at least those sequences required in cis for replication and packaging (e.g., functional ITRs) of the virus. ITRs do not need to be the wild-type polynucleotide sequences and may be altered, e.g., by the insertion, deletion, or substitution of nucleotides, so long as the sequences provide for functional rescue, replication, and packaging. AAV expression vectors are constructed using known techniques to at least provide as operatively linked components in the direction of transcription, control elements including a transcriptional initiation region, the DNA of interest (e.g., a polynucleotide encoding a nucleic acid molecule of the disclosure) and a transcriptional termination region. The terms “adeno-associated virus inverted terminal repeats” and “AAV ITRs” refer to art-recognized regions flanking each end of the AAV genome which function together in cis as origins of DNA replication and as packaging signals for the virus. AAV ITRs, together with the AAV rep coding region, provide for the efficient excision and integration of a polynucleotide sequence interposed between two flanking ITRs into a mammalian genome. The polynucleotide sequences of AAV ITR regions are known. As used herein, an “AAV ITR” does not necessarily include the wild-type polynucleotide sequence, which may be altered, e.g., by the insertion, deletion, or substitution of nucleotides. Additionally, the AAV ITR may be derived from any of several AAV serotypes, including without limitation AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC10, AAV.HSC11, AAV.HSC12, AAV.HSC13, AAV.HSC14, AAV.HSC15, AAV-TT, AAV-DJ8, or AAV.HSC16, among others. Furthermore, 5' and 3' ITRs which flank a selected polynucleotide sequence in an AAV vector need not be identical or derived from the same AAV serotype or isolate, so long as they function as intended, e.g., to allow for excision and rescue of the sequence of interest from a host cell genome or vector, and to allow integration of the heterologous sequence into the recipient cell genome when AAV Rep gene products are present in the cell. Additionally, AAV ITRs may be derived from any of several AAV serotypes, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC10, AAV.HSC11, and AAV.HSC12.
[0048] An “AAV inverted terminal repeat (ITR)” sequence, a term well-understood in the art, is an approximately 145-nucleotide sequence that is present at both termini of the native single-stranded AAV genome. The outermost 125 nucleotides of the ITR can be present in either of two alternative orientations, leading to heterogeneity between different AAV genomes and between the two ends of a single AAV genome. The outermost 125 nucleotides also contain several shorter regions of self-complementarity (designated A, A', B, B', C, C' and D regions), allowing intrastrand base-pairing to occur within this portion of the ITR.
[0049] “Administering” or “administration” of a substance, a compound, or an agent to a subject can be carried out using one of a variety of methods known to those skilled in the art. In certain embodiments, administration may be local. In other embodiments, administration may be systemic. Administering can also be performed, for example, once, a plurality of times, and/or over one or more extended periods. In some aspects, the administration includes both direct administration, including self-administration, and indirect administration, including the act of prescribing a drug. For example, as used herein, a physician who instructs a subject to self-administer a drug, or to have the drug administered by another and/or who provides a subject with a prescription for a drug is administering the drug to the subject.
[0050] It should be understood that the expression of “at least one of” includes individually each of the recited objects after the expression and the various combinations of two or more of the recited objects unless otherwise understood from the context and use. The expression “and/or” in connection with three or more recited objects should be understood to have the same meaning unless otherwise understood from the context.
[0051] As used herein, a “coding sequence” is a portion of a nucleic acid that contains codons that can be translated into amino acids. Although a “stop codon” (TAG, TGA, TAA) is not translated into an amino acid, it may be considered to be part of a coding region, if present, but any flanking sequences, for example, promoters, ribosome binding sites, transcriptional terminators, introns, 5' and 3' untranslated regions, and the like, are not part of the coding region.
[0052] As used herein, “codon optimization” refers to the process of modifying a nucleic acid sequence in accordance with the principle that the frequency of occurrence of synonymous codons (e.g., codons that code for the same amino acid) in coding DNA is biased in different species. Such codon degeneracy allows an identical polypeptide to be encoded by a variety of nucleotide sequences. Sequences modified in this way are referred to herein as “codon-optimized.” This process may be performed on any of the sequences described in this specification to enhance expression or stability. Codon optimization may be performed in a manner, such as that described in, e.g., U.S. Patent Nos. 7,561,972, 7,561,973, and 7,888,112, the entire contents of each of which is incorporated herein by reference. The sequence surrounding the translational start site can be converted to a consensus Kozak sequence according to known methods. See, e.g., Kozak et al. (Nucleic Acids Res 15(20): 8125-8148, 1987), the entire contents of which is hereby incorporated by reference. In certain embodiments, codon optimization includes the incorporation of multiple stop codons.
[0053] Throughout this specification and embodiments, the word “include,” or variations such as “includes” or “including,” will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers. It is understood that wherever embodiments are described herein with the language “including,” otherwise analogous embodiments described in terms of “consisting of” and/or “consisting essentially of” are also provided.
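By way of non-limiting illustration, the principle of codon optimization described above (replacing each codon with a synonymous codon while leaving the encoded polypeptide unchanged) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the method of the patents cited above. The `PREFERRED` table is hypothetical and does not reflect measured codon usage for any species, the `CODON_TO_AA` table is deliberately partial (only Met, Ala, Lys, Leu, and stop codons of the standard genetic code), and the function names are introduced here for illustration only.

```python
# Partial standard genetic code: only the codons needed for the example.
CODON_TO_AA = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# HYPOTHETICAL preferred codon per amino acid (illustrative values only).
PREFERRED = {"M": "ATG", "A": "GCC", "K": "AAG", "L": "CTG", "*": "TGA"}


def split_codons(cds):
    """Split a coding sequence into codons; length must be a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(cds))


def codon_optimize(cds):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TO_AA[codon]] for codon in split_codons(cds))


if __name__ == "__main__":
    original = "ATGGCTAAATTATAA"
    optimized = codon_optimize(original)
    # The nucleotide sequence changes, but the encoded polypeptide does not.
    print(original, translate(original))
    print(optimized, translate(optimized))
```

The check that `translate` returns the same result before and after optimization reflects the statement above that codon degeneracy allows an identical polypeptide to be encoded by a variety of nucleotide sequences.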
[0054] The term “consensus sequence,” as used herein in the context of nucleic acid sequences, refers to a calculated sequence representing the most frequent nucleotide residues found at each position in a plurality of similar sequences. Typically, a consensus sequence is determined by sequence alignment in which similar sequences are compared to each other and similar sequence motifs are calculated.
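By way of non-limiting illustration, the calculation of a consensus sequence from already-aligned sequences of equal length can be sketched in Python as follows. The function name `consensus` is hypothetical, the input is assumed to be pre-aligned, and the alphabetical tie-break is an assumption made here only so that the result is deterministic.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most frequent residue at each position.

    aligned_sequences: a non-empty list of equal-length strings that have
    already been aligned. Ties are broken alphabetically (an assumption).
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")

    result = []
    # zip(*...) walks the alignment column by column.
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        # Highest count first; alphabetical order among equal counts.
        best = min(counts, key=lambda base: (-counts[base], base))
        result.append(best)
    return "".join(result)


if __name__ == "__main__":
    print(consensus(["ACGT", "ACGA", "TCGT"]))
```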
[0055] A “deletion” may include the deletion of subject amino acids, deletion of small groups of amino acids such as 2, 3, 4, or 5 amino acids or deletion of larger amino acid regions, such as the deletion of specific amino acid domains or other features.
[0056] As used herein, the term “functional fragment” refers to a fragment of (a) a promoter or (b) a gene or coding sequence (e.g., an mRNA) that encodes a protein (e.g., an endonuclease) that retains, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of at least one activity of the corresponding full-length, naturally occurring promoter or protein. The term “fragment of,” or “fragment thereof,” as used herein, refers to a segment (e.g., a segment of at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99%, at least about 99.5%, or at least about 99.9%) of the full length gene(s) or nucleic acid molecule(s) of interest.
[0057] A “helper virus” for AAV refers to a virus that allows an AAV (which is a defective parvovirus) to be replicated and packaged by a host cell. A number of such helper viruses are known in the art.
[0058] As used herein, the term “heterologous” refers to regions that are not normally associated with a particular nucleic acid in nature. For example, a “coding region heterologous to a promoter” is a coding region that is not normally associated with the promoter in nature.
[0059] “Homologous,” in all its grammatical forms and spelling variations, refers to the relationship between two proteins that possess a “common evolutionary origin,” including proteins from superfamilies in the same species of organism, as well as homologous proteins from different species of organism. Such proteins (and their encoding nucleic acids) have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or by the presence of specific residues or motifs and conserved positions.
[0060] However, in common usage and in the instant application, the term “homologous,” when modified with an adverb such as “highly,” may refer to sequence similarity and may or may not relate to a common evolutionary origin.
[0061] As used herein, a “host cell” includes an individual cell or cell culture that can be or has been a recipient for vector(s) for incorporation of polynucleotide inserts. The term host cell may refer to the packaging cell line in which a recombinant AAV (rAAV) is produced from a plasmid. In the alternative, the term “host cell” may refer to a target cell in which expression of a transgene is desired.
[0062] The use of the terms “include,” “includes,” “including,” “have,” “has,” “having,” “contain,” “contains,” or “containing,” including grammatical equivalents thereof, should be understood generally as open-ended and non-limiting, for example, not excluding additional unrecited elements or steps, unless otherwise specifically stated or understood from the context.
[0063] An “insertion” may include the insertion of subject amino acids; insertion of small groups of amino acids, such as 2, 3, 4, or 5 amino acids; or insertion of larger amino acid regions, such as the insertion of specific amino acid domains or other features.
[0064] An “inverted terminal repeat” or “ITR” sequence is a term well understood in the art and refers to relatively short sequences found at the termini of viral genomes which are in opposite orientation.
[0065] The terms “non-naturally occurring” and “engineered” are used interchangeably and indicate the involvement of the hand of man. The terms, when referring to nucleic acid molecules or polypeptides, mean that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which it is naturally associated and as found in nature.
[0066] “Operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression.
[0067] The terms “patient,” “subject,” or “individual” are used interchangeably herein and refer to either a human or a non-human animal. These terms include mammals, such as humans, nonhuman primates, laboratory animals, livestock animals (including bovines, porcines, camels, etc.), companion animals (e.g., canines, felines, other domesticated animals, etc.) and rodents (e.g., mice and rats). In certain embodiments, the subject is a human that is at least 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or 95 years of age.
[0068] “Percent (%) sequence identity” or “percent (%) identical to” with respect to a reference polypeptide (or nucleotide) sequence is defined as the percentage of amino acid residues (or nucleic acids) in a candidate sequence that are identical with the amino acid residues (or nucleic acids) in the reference polypeptide (nucleotide) sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for aligning sequences, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
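By way of non-limiting illustration, the percent identity calculation described above can be sketched for two sequences that have already been aligned (e.g., by BLAST or another alignment tool). The function name `percent_identity`, the gap character `-`, and the choice of denominator (the number of residues in the candidate sequence, following the wording of the definition) are assumptions made for this sketch; alignment tools may report identity over a different denominator, such as the alignment length.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str, gap: str = "-") -> float:
    """Percent of candidate residues identical to the aligned reference residue.

    Both inputs must be rows of the same pairwise alignment (equal length,
    gaps already introduced). Conservative substitutions count as mismatches.
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")

    identical = 0
    candidate_residues = 0
    for cand, ref in zip(aligned_candidate.upper(), aligned_reference.upper()):
        if cand == gap:
            continue  # a gap in the candidate is not a candidate residue
        candidate_residues += 1
        if cand == ref:
            identical += 1

    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    return 100.0 * identical / candidate_residues


# Hypothetical alignment: 7 of the 8 candidate residues match the reference.
print(percent_identity("ACGT-ACGT", "ACGTTAC-T"))
```

The sketch only scores a given alignment; finding the alignment that maximizes identity is left to the alignment software.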
[0069] As known in the art, “polynucleotide,” or “nucleic acid,” as used interchangeably herein, refer to chains of nucleotides of any length, and include DNA and RNA (e.g., polyribonucleotides). The nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases, and/or their analogs, or any substrate that can be incorporated into a chain by DNA or RNA polymerase. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and their analogs. If present, modification to the nucleotide structure may be imparted before or after assembly of the chain. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. Other types of modifications include, for example, “caps,” substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the polynucleotide(s). Further, any of the hydroxyl groups ordinarily present in the sugars may be replaced, for example, by phosphonate groups, phosphate groups, protected by standard protecting groups, or activated to prepare additional linkages to additional nucleotides, or may be conjugated to solid supports. 
The 5' and 3' terminal OH can be phosphorylated or substituted with amines or organic capping group moieties of from 1 to 20 carbon atoms. Other hydroxyls may also be derivatized to standard protecting groups. Polynucleotides can also contain analogous forms of ribose or deoxyribose sugars that are generally known in the art, including, for example, 2'-O-methyl-, 2'-O-allyl, 2'-fluoro- or 2'-azido-ribose, carbocyclic sugar analogs, alpha- or beta-anomeric sugars, epimeric sugars such as arabinose, xyloses or lyxoses, pyranose sugars, furanose sugars, sedoheptuloses, acyclic analogs and abasic nucleoside analogs such as methyl riboside. One or more phosphodiester linkages may be replaced by alternative linking groups. These alternative linking groups include, but are not limited to, embodiments wherein phosphate is replaced by P(O)S (“thioate”), P(S)S (“dithioate”), (O)NR2 (“amidate”), P(O)R, P(O)OR', CO or CH2 (“formacetal”), in which each R or R' is independently H or substituted or unsubstituted alkyl (1-20 C) optionally containing an ether (-O-) linkage, aryl, alkenyl, cycloalkyl, cycloalkenyl, or araldyl. Not all linkages in a polynucleotide need be identical. The preceding description applies to all polynucleotides referred to herein, including RNA and DNA.
[0070] IUPAC nucleotide code is used throughout. IUPAC nucleotide code is provided in TABLE 1.
TABLE 1
[Table image: IUPAC nucleotide code]
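By way of non-limiting illustration, the standard IUPAC nucleotide ambiguity codes can be expressed as a lookup and used to test whether a concrete sequence is consistent with a pattern written in that code. The mapping below reflects the standard IUPAC definitions; the names `IUPAC_NUCLEOTIDES` and `matches_iupac`, and the choice to treat U as equivalent to T, are assumptions made for this sketch.

```python
# Standard IUPAC nucleotide codes mapped to the bases each code can represent.
# U is treated as T here so that RNA and DNA patterns can be compared (an assumption).
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_iupac(pattern: str, sequence: str) -> bool:
    """Return True if every base of `sequence` is allowed by `pattern`."""
    if len(pattern) != len(sequence):
        return False
    for code, base in zip(pattern.upper(), sequence.upper()):
        base = "T" if base == "U" else base
        if base not in IUPAC_NUCLEOTIDES[code]:
            return False
    return True


# "NGG" allows any base followed by two guanines.
print(matches_iupac("NGG", "AGG"), matches_iupac("NGG", "AGC"))
```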
[0071] In aspects of the presently disclosed subject matter the term “polyribonucleotide” refers to polynucleotide polymers containing 50% or more ribose bases including unmodified and/or modified ribonucleotides. A “guide RNA” (gRNA) is a type of polyribonucleotide that includes a CRISPR RNA sequence (crRNA, also referred to as a “guide sequence” or “spacer”), and, in certain embodiments, a trans-activating CRISPR RNA sequence (tracrRNA). The tracrRNA, if present, binds to an endonuclease (e.g., a CRISPR enzyme such as Cas9) and the crRNA is complementary to a target sequence.
[0072] The terms “polypeptide,” “oligopeptide,” “peptide,” and “protein” are used interchangeably herein to refer to chains of amino acids of any length. The chain may be linear or branched, it may comprise modified amino acids, and/or may be interrupted by non-amino acids. The terms also encompass an amino acid chain that has been modified naturally or by intervention, for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component. Also included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), as well as other modifications known in the art. It is understood that the polypeptides can occur as single chains or associated chains.
[0073] As used herein, the terms “prevent,” “preventing,” and “prevention” refer to the prevention of the recurrence or onset of, or a reduction in one or more symptoms of a disease or condition in a subject as a result of the administration of a therapy (e.g., a prophylactic or therapeutic agent). For example, in the context of the administration of a therapy to a subject for a disease or condition, “prevent,” “preventing,” and “prevention” refer to the inhibition or a reduction in the development or onset of a disease or condition, or the prevention of the recurrence, onset, or development of one or more symptoms of a disease or condition, in a subject resulting from the administration of a therapy (e.g., a prophylactic or therapeutic agent), or the administration of a combination of therapies (e.g., a combination of prophylactic or therapeutic agents).
[0074] As used herein, the term “promoter” refers to a recognition site on DNA that is bound by an RNA polymerase. The polymerase drives transcription of a transgene. Exemplary promoters suitable for use with the compositions and methods described herein are described herein. Additionally, the term “promoter” may refer to a synthetic promoter, such as a regulatory DNA sequence that does not occur naturally in a biological system. Synthetic promoters contain parts of naturally occurring promoters combined with polynucleotide sequences that do not occur in nature and can be optimized to express recombinant DNA.
[0075] A “recombinant adeno-associated virus (rAAV virus)” or “rAAV viral particle” refers to a viral particle composed of at least one AAV capsid protein and an encapsidated rAAV vector genome.
[0076] A “recombinant AAV vector (rAAV vector)” refers to a polynucleotide vector based on an AAV comprising one or more heterologous sequences (i.e., nucleic acid sequence not of AAV origin) that are flanked by at least one AAV ITR. Such rAAV vectors can be replicated and packaged into infectious viral particles when present in a host cell that has been infected with a suitable helper virus (or that is expressing suitable helper functions) and that is expressing AAV rep and cap gene products (i.e., AAV Rep and Cap proteins). When an rAAV vector is incorporated into a larger polynucleotide (e.g., in a chromosome or in another vector such as a plasmid used for cloning or transfection), then the rAAV vector may be referred to as a “provector” which can be “rescued” by replication and encapsidation in the presence of AAV packaging functions and suitable helper functions. An rAAV vector can be in any of a number of forms, including, but not limited to, plasmids, linear artificial chromosomes, complexed with lipids, encapsulated within liposomes, and encapsidated in a viral particle, e.g., an AAV particle. An rAAV vector can be packaged into an AAV virus capsid to generate a “recombinant adeno-associated viral particle (rAAV particle)”.
[0077] The term “regulatory element” or “regulatory sequence” is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego Calif. Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver and pancreas), or particular cell types (e.g., lymphocytes). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell cycle-dependent or developmental stage-dependent manner, which may also be tissue- or cell type-specific. Also encompassed by the term “regulatory element” are enhancer elements, such as WPRE; CMV enhancers; the R-U5' segment in LTR of HTLV-I (Takebe et al. (1988) MOL. CELL. BIOL. 8:466-472); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (O'Hare et al. (1981) PROC. NATL. ACAD. SCI. USA. 78(3):1527-31). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc.
[0078] As used herein, “residue” refers to a position in a protein and its associated amino acid identity.
[0079] A “substitution” includes replacing a wild-type amino acid with another (e.g., a non-wild-type amino acid). In certain embodiments, the another (e.g., non-wild-type) or inserted amino acid is Ala (A), His (H), Lys (K), Phe (F), Met (M), Thr (T), Gln (Q), Asp (D), or Glu (E). In certain embodiments, the another (e.g., non-wild-type) or inserted amino acid is A. In certain embodiments, the another (e.g., non-wild-type) amino acid is Arg (R), Asn (N), Cys (C), Gly (G), Ile (I), Leu (L), Pro (P), Ser (S), Trp (W), Tyr (Y), or Val (V). Conventional or naturally occurring amino acids are divided into the following basic groups based on common side-chain properties: (1) non-polar: Norleucine, Met, Ala, Val, Leu, and Ile; (2) polar without charge: Cys, Ser, Thr, Asn, and Gln; (3) acidic (negatively charged): Asp and Glu; (4) basic (positively charged): Lys and Arg; and (5) residues that influence chain orientation: Gly and Pro; and (6) aromatic: Trp, Tyr, Phe and His. Conventional amino acids include L or D stereochemistry. In certain embodiments, the another (e.g., non-wild-type) amino acid is a member of a different group (e.g., an aromatic amino acid is substituted for a non-polar amino acid). Substantial modifications in the biological properties of the polypeptide are accomplished by selecting substitutions that differ significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a β-sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. 
Naturally occurring residues are divided into groups based on common sidechain properties: (1) non-polar: Norleucine, Met, Ala, Val, Leu and Ile; (2) polar without charge: Cys, Ser, Thr, Asn and Gln; (3) acidic (negatively charged): Asp and Glu; (4) basic (positively charged): Lys and Arg; (5) residues that influence chain orientation: Gly and Pro; and (6) aromatic: Trp, Tyr, Phe, and His. In certain embodiments, the another (e.g., non-wild-type) amino acid is a member of a different group (e.g., a hydrophobic amino acid for a hydrophilic amino acid, a charged amino acid for a neutral amino acid, or an acidic amino acid for a basic amino acid). In certain embodiments, the another (e.g., non-wild-type) amino acid is a member of the same group (e.g., another basic amino acid, another acidic amino acid, another neutral amino acid, another charged amino acid, another hydrophilic amino acid, another hydrophobic amino acid, another polar amino acid, another aromatic amino acid, or another aliphatic amino acid). In certain embodiments, the another (e.g., non-wild-type) amino acid is an unconventional amino acid. Unconventional amino acids are non-naturally occurring amino acids. 
Examples of an unconventional amino acid include, but are not limited to, aminoadipic acid, beta-alanine, beta-aminopropionic acid, aminobutyric acid, piperidinic acid, aminocaproic acid, aminoheptanoic acid, aminoisobutyric acid, aminopimelic acid, citrulline, diaminobutyric acid, desmosine, diaminopimelic acid, diaminopropionic acid, N-ethylglycine, N-ethylasparagine, hydroxylysine, allo-hydroxylysine, hydroxyproline, isodesmosine, allo-isoleucine, N-methylglycine, sarcosine, N-methylisoleucine, N-methylvaline, norvaline, norleucine, ornithine, 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, o-N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline).
[0080] The term “transgene” refers to a polynucleotide that is introduced into a cell and is capable of being transcribed into RNA and optionally, translated and/or expressed under appropriate conditions. In aspects, it confers a desired property to a cell into which it was introduced, or otherwise leads to a desired therapeutic or diagnostic outcome.
[0081] “Treating” a condition or subject refers to taking steps to obtain beneficial or desired results, including clinical results. With respect to a disease or condition, treatment refers to the reduction or amelioration of the progression, severity, and/or duration of one or more symptoms of the disease, or the amelioration of one or more symptoms resulting from the administration of one or more therapies (including, but not limited to, the administration of one or more prophylactic or therapeutic agents).
[0082] As used herein, the term “variant” refers to a variant of (a) a promoter or (b) a gene or coding sequence (e.g., an mRNA) that encodes a protein (e.g., an endonuclease) that retains, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of at least one activity of the corresponding full-length, naturally occurring promoter or protein. For example, a variant can comprise a splice variant or a gene comprising a mutation such as an insertion, deletion, or substitution.
[0083] As used herein, the term “vector” includes a nucleic acid vector, e.g., a DNA vector, such as a plasmid, an RNA vector, or another suitable replicon (e.g., viral vector). A variety of vectors have been developed for the delivery of polynucleotides encoding exogenous polynucleotides or proteins into a prokaryotic or eukaryotic cell. Examples of such expression vectors are disclosed in, e.g., WO 1994/011026; incorporated herein by reference as it pertains to vectors suitable for the expression of a nucleic acid molecule of interest. Expression vectors suitable for use with the compositions and methods described herein contain a polynucleotide sequence as well as, e.g., additional sequence elements used for the expression of heterologous nucleic acid materials (e.g., a nucleic acid molecule) in a mammalian cell. Certain vectors that can be used for the expression of the nucleic acid molecules described herein include plasmids that contain regulatory sequences, such as promoter and enhancer regions, which direct gene transcription. In certain embodiments, the compact bidirectional promoters do not contain an enhancer. Other useful vectors for expression of nucleic acid molecule agents disclosed herein contain polynucleotide sequences that enhance the rate of translation of these polynucleotides or improve the stability or nuclear export of the RNA that results from gene transcription. These sequence elements include, e.g., 5' and 3' untranslated regions, an IRES, and polyA in order to direct efficient transcription of the gene carried on the expression vector. The expression vectors suitable for use with the compositions and methods described herein may also contain a polynucleotide encoding a marker for selection of cells that contain such a vector. Examples of a suitable marker are genes that encode resistance to antibiotics, such as ampicillin, chloramphenicol, kanamycin, nourseothricin, or zeocin.
[0084] In certain embodiments, a vector comprises one or more pol III promoters, one or more pol II promoters, one or more pol I promoters, or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) (e.g., Boshart et al. (1985) CELL 41:521-530), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter.
[0085] A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., CRISPR transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.). Advantageous vectors include lentiviruses and AAVs, and types of such vectors can also be selected for targeting particular types of cells.
[0086] The term “vector genome (vg)” as used herein may refer to one or more polynucleotides comprising a set of the polynucleotide sequences of a vector, e.g., a viral vector. A vector genome may be encapsidated in a viral particle. Depending on the particular viral vector, a vector genome may comprise single-stranded DNA, double-stranded DNA, single-stranded RNA, or double-stranded RNA. A vector genome may include endogenous sequences associated with a particular viral vector and/or any heterologous sequences inserted into a particular viral vector through recombinant techniques. For example, a rAAV vector genome may include at least one ITR sequence flanking a promoter, a stuffer, a sequence of interest (e.g., an interfering RNA (RNAi)), and a polyadenylation sequence. A complete vector genome may include a complete set of the polynucleotide sequences of a vector. In certain embodiments, the nucleic acid titer of a viral vector may be measured in terms of vg/mL. Methods suitable for measuring this titer are known in the art (e.g., quantitative PCR).
[0087] As used herein the term “wild-type” or “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene, or characteristic as it occurs in nature as distinguished from mutant or variant forms.
[0088] As used herein, “expression control sequence” means a nucleic acid sequence that directs transcription of a nucleic acid. An expression control sequence can be a promoter, such as a constitutive promoter or an enhancer. The expression control sequence is operably linked to the nucleic acid sequence to be transcribed.
[0089] Each embodiment described herein may be used individually or in combination with any other embodiment described herein.
II. Endonuclease system
[0090] The disclosure is based, in part, upon the discovery that compact promoters can effectively drive expression of endonuclease systems, for example, those including both an endonuclease and a gRNA (FIG. 1), to target a hypoxia-induced gene. Accordingly, the disclosure provides nucleic acids, expression constructs, and vectors comprising a compact bidirectional promoter and a gene editing system (e.g., a polyribonucleotide and an endonuclease), wherein the compact promoter is small enough to allow for the inclusion of both an endonuclease and a gRNA in a single vector, such as an AAV vector, which has a size limit that makes expression of both endonuclease and gRNA difficult using conventional promoters. Herein disclosed are also embodiments where all or a portion of the polyribonucleotide (e.g., a guide sequence) hybridizes with a target sequence of a hypoxia-induced gene and where the polyribonucleotide directs the endonuclease to the target sequence. Upon hybridization with the target sequence, the endonuclease can induce a break in the target DNA, which disrupts processing of an encoded mRNA transcript and expression of the encoded protein.
[0091] In general, an “endonuclease system” refers collectively to transcripts and other elements, including the promoters described herein, involved in the expression of or directing the activity of a gene encoding a gene-editing endonuclease (e.g., a Cas endonuclease) and a polyribonucleotide (e.g., a gRNA) having a guide sequence (also referred to as a “spacer” in the context of certain endogenous gene editing systems, e.g., a CRISPR system).
A. Promoters
[0092] The size limitations of AAV and other vectors (e.g., plasmids) make it difficult to package both a gRNA and an endonuclease into a single vector. However, this problem can be overcome by using a compact promoter, as described herein, to incorporate a non-naturally occurring endonuclease system via a single vector. In certain embodiments, the single vector is a viral vector. In certain embodiments, the single vector is a plasmid.
[0093] In certain embodiments, the promoter may be a compact, bidirectional promoter that directs expression of a gRNA in one direction and an endonuclease in the other direction.
[0094] In certain embodiments, the promoter is operably linked to the sequence encoding the polyribonucleotide and the sequence encoding the endonuclease.
[0095] A compact promoter provided herein can be selected to express the selected endonuclease system in a desired target cell. In certain embodiments, the target cell is a lung endothelial cell and/or a lung artery smooth muscle cell. In certain embodiments, the target cell is a lung endothelial cell. In certain embodiments, the target cell is a lung artery smooth muscle cell. The promoter may be derived from any species, including human. In certain embodiments, the promoter is “cell specific.” The term “cell-specific” means that the particular promoter selected for the recombinant vector can direct expression of the selected transgene in a particular cell.
[0096] In certain embodiments, the promoter is of a small size, e.g., less than about 500 bp, due to the size limitations of the AAV vector. In certain embodiments, the promoter is less than about 500 bp, less than about 300 bp, or less than about 200 bp in size.
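By way of non-limiting illustration, the effect of promoter size on the space available in a single AAV vector can be sketched as simple arithmetic. Every number below is an assumption chosen for illustration and is not taken from this disclosure: a packaging capacity of roughly 4,700 bp is a commonly cited approximation for AAV, and the element names and sizes (ITRs, endonuclease coding sequence, gRNA, polyadenylation signal, promoters) are hypothetical placeholders.

```python
from typing import Mapping

# Commonly cited approximate AAV packaging capacity in bp (an assumption).
AAV_CAPACITY_BP = 4700


def remaining_capacity(elements: Mapping[str, int], capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Base pairs left after summing all element sizes; negative means oversized."""
    return capacity_bp - sum(elements.values())


# Hypothetical sizes (bp) for elements shared by both designs.
shared = {
    "5' ITR": 145,
    "3' ITR": 145,
    "endonuclease coding sequence": 3200,
    "gRNA (spacer plus scaffold)": 100,
    "polyadenylation signal": 50,
}

# One compact bidirectional promoter versus two separate, larger promoters.
compact_design = {**shared, "compact bidirectional promoter": 200}
conventional_design = {**shared, "pol II promoter": 600, "pol III promoter": 250}

print(remaining_capacity(compact_design))       # space left with the compact promoter
print(remaining_capacity(conventional_design))  # space left with two conventional promoters
```

Under these illustrative sizes, replacing two promoters with one compact bidirectional promoter frees several hundred base pairs for other elements; actual values depend on the endonuclease and regulatory elements chosen.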
[0097] In certain embodiments, the promoter is between about 50 bp and about 400 bp, between about 75 bp and about 400 bp, between about 99 bp and about 400 bp, between about 100 bp and about 400 bp, between about 150 bp and about 400 bp, between about 200 bp and about 400 bp, between about 250 bp and about 400 bp, between about 300 bp and about 400 bp, between about 50 bp and about 300 bp, between about 75 bp and about 300 bp, between about 100 bp and about 300 bp, between about 150 bp and about 300 bp, between about 200 bp and about 300 bp, between about 50 bp and about 250 bp, between about 75 bp and about 250 bp, between about 100 bp and about 250 bp, between about 150 bp and about 250 bp, between about 200 bp and about 250 bp, between about 50 bp and about 225 bp, between about 75 bp and about 200 bp, between about 100 bp and about 200 bp, between about 150 bp and about 200 bp, between about 50 bp and about 180 bp, between about 100 bp and about 180 bp, or between about 150 bp and about 180 bp in size.
[0098] In certain embodiments, the promoter is a bidirectional promoter. In certain embodiments, the bidirectional promoter is less than about 500 bp in size. In certain embodiments, the bidirectional promoter is less than about 300 bp in size. In certain embodiments, the bidirectional promoter is less than about 200 bp in size.
[0099] In certain embodiments, the bidirectional promoter is between about 50 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 75 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 99 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 200 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 250 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 300 bp and about 400 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 75 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 200 bp and about 300 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 75 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 200 bp and about 250 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 225 bp in size. 
In certain embodiments, the bidirectional promoter is between about 75 bp and about 200 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 200 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 200 bp in size. In certain embodiments, the bidirectional promoter is between about 50 bp and about 180 bp in size. In certain embodiments, the bidirectional promoter is between about 100 bp and about 180 bp in size. In certain embodiments, the bidirectional promoter is between about 150 bp and about 180 bp in size.
[00100] In certain embodiments, the promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof.
[00101] In certain embodiments, the promoter comprises a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a functional fragment or variant thereof.
[00102] In certain embodiments, a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175). In certain embodiments, a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175). In certain embodiments, a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175). In certain embodiments, a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175). 
In certain embodiments, a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175). In certain embodiments, a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175). In certain embodiments, a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-226, 242-521, and 171-175 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-226, 242-521, and 171-175).
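By way of non-limiting illustration, candidate truncated fragments of the kind described above can be generated by removing a fixed number of bases from the 5' end, the 3' end, or each end of a promoter sequence. The function name `truncate_ends`, the `end` labels, and the placeholder sequence are assumptions made for this sketch; the sketch only produces candidate sequences, and whether a given fragment is functional would need to be determined experimentally.

```python
def truncate_ends(sequence: str, n_bases: int, end: str = "both") -> str:
    """Remove `n_bases` from the 5' end, the 3' end, or each end of `sequence`.

    `sequence` is read 5' to 3'. `end` is one of "5prime", "3prime", or "both".
    """
    if n_bases < 0:
        raise ValueError("n_bases must be non-negative")
    if end not in ("5prime", "3prime", "both"):
        raise ValueError("end must be '5prime', '3prime', or 'both'")

    removed = n_bases * (2 if end == "both" else 1)
    if removed >= len(sequence):
        raise ValueError("truncation would remove the entire sequence")

    start = n_bases if end in ("5prime", "both") else 0
    stop = len(sequence) - n_bases if end in ("3prime", "both") else len(sequence)
    return sequence[start:stop]


# Placeholder 50-base sequence (not one of the disclosed SEQ ID NOs).
placeholder = "A" * 10 + "C" * 30 + "G" * 10
for n in (10,):
    print(len(truncate_ends(placeholder, n, "5prime")),
          len(truncate_ends(placeholder, n, "3prime")),
          len(truncate_ends(placeholder, n, "both")))
```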
[00103] In certain embodiments, the functional fragment comprises at least one transcription factor binding site. Transcription factor binding sites can be identified by consensus or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83). In certain embodiments, a functional fragment comprises at least one transcription factor binding site selected from Staf, DSE, PSE, c-REL, GATA-1, GATA-2, and CREB. A functional fragment can comprise the B recognition sequence (BRE) or TATA box.
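Identification of a binding site by consensus can be illustrated with a short Python sketch that scans a sequence for matches to an IUPAC-coded motif. The motif used here, TATAWAW, is a commonly cited TATA box consensus chosen only for illustration; the function name and toy sequence are assumptions, and the disclosure does not prescribe this procedure.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(seq: str, consensus: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) motif matches."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead keeps the scanner from skipping overlapping matches.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


# Toy sequence containing one TATA-like element (illustrative only).
positions = find_sites("GGCTATAAAAGGC", "TATAWAW")
```

This scans one strand only; a fuller search would also scan the reverse complement and would typically score matches with a position weight matrix rather than an exact consensus.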
[00104] In certain embodiments, the promoter comprises a TATA mutation. In certain embodiments, the TATA mutation is a TATAA → TCGAA mutation.
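As an illustration of the TATA mutation, the sketch below substitutes the first TATAA motif in a sequence with TCGAA. It assumes the mutation is a direct in-place substitution of the five-base motif; the function name and the toy sequence are hypothetical.

```python
def mutate_tata(seq: str, wild_type: str = "TATAA", mutant: str = "TCGAA") -> str:
    """Replace the first occurrence of the wild-type motif with the mutant motif."""
    idx = seq.upper().find(wild_type)
    if idx == -1:
        raise ValueError("wild-type TATA motif not found")
    return seq[:idx] + mutant + seq[idx + len(wild_type):]


# Toy promoter fragment (illustrative only).
mutated = mutate_tata("GGCTATAAAAGG")
```

Because the two motifs have the same length, the substitution leaves the spacing of the surrounding promoter elements unchanged.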
[00105] In certain embodiments, a nucleic acid comprising a promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence. In certain embodiments, the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof. In certain embodiments, the 6 bp fragment is 5'-GCCACC-3'.
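The 6 bp, 7 bp, and 8 bp fragments of 5'-GCCGCCACC-3' can be listed directly. This Python sketch assumes that a fragment means a contiguous subsequence, which is an interpretation rather than a definition given in the text; the function name is hypothetical.

```python
KOZAK = "GCCGCCACC"  # 5'-GCCGCCACC-3' as recited above


def contiguous_fragments(seq: str, lengths=(6, 7, 8)) -> dict[int, list[str]]:
    """Map each fragment length to all contiguous subsequences of that length."""
    return {
        k: [seq[i:i + k] for i in range(len(seq) - k + 1)]
        for k in lengths
    }


fragments_by_length = contiguous_fragments(KOZAK)
```

Under this reading, the recited 6 bp fragment 5'-GCCACC-3' is the 3'-most 6 bp window of the 9 bp sequence.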
[00106] In certain embodiments, a nucleic acid comprising a promoter described herein further comprises a terminator sequence. In certain embodiments, the terminator sequence comprises one of the terminator sequences in TABLE 2.
TABLE 2 (terminator sequences; presented as images in the original publication)
[00107] In certain embodiments, the compact promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
[00108] In certain embodiments, the compact promoter does not comprise a viral promoter and/or a synthetic promoter.
[00109] In certain embodiments, the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring mammalian promoter. In certain embodiments, the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring human promoter.
[00110] The expression level of a compact promoter can be determined by expressing a reporter molecule in a cell, e.g., a human embryonic kidney (HEK) cell line or an N2A cell line. In certain embodiments, the compact promoter is capable of expressing a luciferase reporter at a higher level than is a HSK thymidine kinase (TK) promoter.
i) H1 promoters
[00111] In certain embodiments, the promoter is an H1 promoter. The H1 promoter is a bidirectional promoter having both pol II and pol III activity. The disclosure provides previously unidentified H1 promoters that the Applicant identified by generating a Hidden Markov model (HMM) profile from a multispecies alignment of known H1 promoters (see, e.g., International Patent Publication Nos. WO2015/195621 and WO2018/009534). Regions flanking the H1 promoter region that were conserved throughout mammals were identified. As shown in FIG. 1, the region comprising the H1 promoter is located between the RPPH1 (H1 RNA) gene located on the minus strand to the left, and the beginning (i.e., the ATG(GCG)) of the protein coding gene, PARP2, located to the right. The RPPH1 gene comprises a highly conserved region in the H1 RNA gene (5'-GGAAGCTCA-3') that is conserved throughout all mammals. Accordingly, in certain embodiments, the H1 promoter comprises or consists of a region between the ATG(GCG) of PARP2, and the highly conserved region in the H1 RNA gene (5'-GGAAGCTCA-3'). Also shown in FIG. 1 is the position of the pol III portion of the H1 promoter. Additional conserved regions present in the H1 promoter are shown, including, for example, conserved transcription factor binding sites, like a TATA box.
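Delimiting a promoter region by two flanking landmarks can be sketched in Python as follows. This is a simplified illustration, not the Applicant's HMM-based method: it assumes both landmarks are searched on the same strand of the supplied sequence, that the landmarks themselves are excluded from the returned region, and that the toy sequence stands in for a genomic locus. The strand on which the conserved motif reads 5'-GGAAGCTCA-3' relative to FIG. 1 is not resolved here, so a reverse-complement helper is included for searching the opposite orientation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def region_between(seq: str, left_motif: str, right_motif: str = "ATGGCG"):
    """Return the sequence between the first left_motif and the next right_motif.

    Returns None if either landmark is missing. Landmarks are excluded.
    """
    seq = seq.upper()
    left = seq.find(left_motif)
    if left == -1:
        return None
    start = left + len(left_motif)
    right = seq.find(right_motif, start)
    if right == -1:
        return None
    return seq[start:right]


# Toy locus: conserved motif, a short intervening region, then ATG(GCG).
toy_locus = "TT" + "GGAAGCTCA" + "CCCTATAAAGGG" + "ATGGCG" + "AA"
promoter_region = region_between(toy_locus, "GGAAGCTCA")
```

For real loci the landmarks would be located in an annotated genome assembly, and the orientation of each landmark would need to be confirmed against the annotation.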
[00112] In certain embodiments, the H1 promoter is a mammalian promoter, e.g., an artiodactyla H1 promoter, a carnivora H1 promoter, a cetacea H1 promoter, a chiroptera H1 promoter, an insectivora H1 promoter, a lagomorpha H1 promoter, a marsupial H1 promoter, a pangolin H1 promoter, a perissodactyla H1 promoter, a primate H1 promoter, a rodent H1 promoter, or a xenarthra H1 promoter. In certain embodiments, the promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof.
[00113] In certain embodiments, the promoter comprises a nucleotide sequence having at least 85% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 90% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 95% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 96% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 97% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 98% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof. In certain embodiments, the promoter comprises a nucleotide sequence having at least 99% identity to a portion of any one of SEQ ID NOs: 1-82 and 242-521 or a functional fragment or variant thereof.
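The identity thresholds recited above can be checked with a simple calculation once two sequences have been aligned. The Python sketch below assumes the inputs are already aligned to equal length with "-" marking gaps, and it counts identity as matching columns divided by aligned columns; this section does not specify an alignment algorithm or an identity formula, so both choices are assumptions, as are the function names.

```python
THRESHOLDS = (85, 90, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical bases (gap columns count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")  # ignore columns that are gaps in both
    ]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list[int]:
    """Return the recited thresholds that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
```

Other identity conventions exist, for example dividing by the length of the shorter sequence, and they can give different values for the same pair of sequences.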
[00114] In certain embodiments, a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521). In certain embodiments, a functional fragment comprises a truncation of about 15 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521). In certain embodiments, a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521). In certain embodiments, a functional fragment comprises a truncation of about 25 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521). In certain embodiments, a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
In certain embodiments, a functional fragment comprises a truncation of about 35 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521). In certain embodiments, a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-82 and 242-521 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 1-82 and 242-521).
[00115] In certain embodiments, the functional fragment comprises at least one transcription factor binding site. Transcription factor binding sites can be identified by consensus, or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
[00116] In certain embodiments, the promoter comprises a TATA mutation. In certain embodiments, the TATA mutation is a TATAA → TCGAA mutation.
[00117] In certain embodiments, a nucleic acid comprising a promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence. In certain embodiments, the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof. In certain embodiments, the 6 bp fragment is 5'-GCCACC-3'.
[00118] In certain embodiments, a nucleic acid comprising a promoter described herein further comprises a terminator sequence. In certain embodiments, the terminator sequence comprises one of the terminator sequences in TABLE 2.
[00119] In certain embodiments, the compact promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
[00120] In certain embodiments, the compact promoter does not comprise a viral promoter and/or a synthetic promoter.
[00121] In certain embodiments, the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring mammalian promoter. In certain embodiments, the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring human promoter.
[00122] The expression level of a compact promoter can be determined by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line. In certain embodiments, the compact promoter is capable of expressing a luciferase reporter at a higher level than is a HSK TK promoter.
ii) Gar1 promoters
[00123] A custom Perl script was developed to compare the 5' transcriptional start sites of pol III genes with those of pol II genes. The results were filtered for pairs that are oriented in opposite directions (divergent transcription). One compact bidirectional promoter identified using this method was the Gar1 promoter. On one side, the Gar1 promoter expresses the GAR1 protein, which is involved with snoRNAs, rRNA processing, and telomerase activity. The GAR1 protein appears to be expressed in all tissues, suggesting that the Gar1 promoter can drive expression ubiquitously (https://www.proteinatlas.org/ENSG00000109534-GAR1/tissue). On the other side, it expresses a lncRNA (AC126283.1 or ENSG00000272795) of unknown function that is highly expressed in the testis.
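The filtering step described above can be sketched as follows. This Python example is not the Applicant's Perl script; it is a minimal illustration that assumes each gene is described by a chromosome, strand, transcription start site (TSS) coordinate, and polymerase class, and that a divergent pair is a minus-strand gene whose TSS lies upstream of a plus-strand gene's TSS within an arbitrary maximum gap. The gene names, coordinates, and the 500-base gap are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site coordinate
    pol: str     # "II" or "III"


def divergent_pairs(genes, max_gap: int = 500):
    """Return (pol III gene, pol II gene, gap) for divergently transcribed pairs."""
    pol3 = [g for g in genes if g.pol == "III"]
    pol2 = [g for g in genes if g.pol == "II"]
    pairs = []
    for a, b in product(pol3, pol2):
        if a.chrom != b.chrom or a.strand == b.strand:
            continue  # must share a chromosome and lie on opposite strands
        minus, plus = (a, b) if a.strand == "-" else (b, a)
        gap = plus.tss - minus.tss
        if 0 < gap <= max_gap:  # minus-strand TSS upstream: transcripts diverge
            pairs.append((a.name, b.name, gap))
    return pairs


# Toy annotation (made-up names and coordinates).
toy_genes = [
    Gene("polIII_toy", "chr1", "-", 1000, "III"),
    Gene("polII_toy", "chr1", "+", 1230, "II"),
    Gene("same_strand", "chr1", "-", 1100, "II"),
    Gene("too_far", "chr1", "+", 9000, "II"),
    Gene("convergent", "chr1", "+", 900, "II"),
]
hits = divergent_pairs(toy_genes)
```

The gap between the two start sites gives a rough upper bound on the size of the shared promoter region, which is why a small maximum gap favors compact bidirectional promoters.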
[00124] Accordingly, in certain embodiments, the promoter is a Gar1 promoter. In certain embodiments, the Gar1 promoter is a mammalian promoter, e.g., a human Gar1 promoter, a carnivora Gar1 promoter, a primate Gar1 promoter, or a rodent Gar1 promoter. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 83-189 or a fragment thereof.
[00125] In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 83-189 or a fragment thereof.
[00126] In some embodiments, the Gar1 promoter comprises a consensus sequence.
[00127] In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof, where the IUPAC nucleotide code is used. In certain embodiments, the promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 711-715 or a fragment thereof.
[00128] In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof. In certain embodiments, the Gar1 promoter comprises a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 711-715 or a fragment thereof.
[00129] In certain embodiments, a functional fragment comprises a truncation of from about 10 bases to about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189). In certain embodiments, a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189). In certain embodiments, a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189). In certain embodiments, a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189). In certain embodiments, a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
In certain embodiments, a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189). In certain embodiments, a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189). In certain embodiments, a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 83-189 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 83-189).
[00130] In certain embodiments, a functional fragment comprises a truncation of from about 10 bases to about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715). In certain embodiments, a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715). In certain embodiments, a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715). In certain embodiments, a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715). In certain embodiments, a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
In certain embodiments, a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715). In certain embodiments, a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715). In certain embodiments, a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 711-715 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 711-715).
[00131] In certain embodiments, the functional fragment comprises at least one transcription factor binding site. Transcription factor binding sites can be identified by consensus, or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
[00132] In certain embodiments, the Gar1 promoter comprises a TATA mutation. In certain embodiments, the TATA mutation is a TATAA → TCGAA mutation.
[00133] In certain embodiments, a nucleic acid comprising a Gar1 promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence. In certain embodiments, the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof. In certain embodiments, the 6 bp fragment is 5'-GCCACC-3'.
[00134] In certain embodiments, a nucleic acid comprising a Gar1 promoter described herein further comprises a terminator sequence. In certain embodiments, the terminator sequence comprises one of the terminator sequences in TABLE 2.
[00135] In certain embodiments, the Gar1 promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
[00136] In certain embodiments, the Gar1 promoter does not comprise a viral promoter and/or a synthetic promoter.
[00137] In certain embodiments, the Gar1 promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring mammalian promoter. In certain embodiments, the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identity to a naturally occurring human promoter.
[00138] The expression level of a Gar1 promoter can be determined by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line. In certain embodiments, the compact promoter is capable of expressing a luciferase reporter at a higher level than is a HSK TK promoter.
iii) Other bidirectional promoters
[00139] Using the custom Perl script described above, additional bidirectional promoters were identified that can be used according to the methods described herein. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the bidirectional promoter comprises the nucleotide sequence of any one of SEQ ID NOs: 190-226 or a fragment thereof.
[00140] In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 85% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 90% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 95% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 96% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 97% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 98% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof. In certain embodiments, the promoter is a bidirectional promoter comprising a nucleotide sequence having at least 99% identity to any one of SEQ ID NOs: 190-226 or a fragment thereof.
[00141] In certain embodiments, a functional fragment comprises a truncation of from about 10 bases to about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226). In certain embodiments, a functional fragment comprises a truncation of about 10 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226). In certain embodiments, a functional fragment comprises a truncation of about 20 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226). In certain embodiments, a functional fragment comprises a truncation of about 30 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226). In certain embodiments, a functional fragment comprises a truncation of about 40 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
In certain embodiments, a functional fragment comprises a truncation of about 50 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226). In certain embodiments, a functional fragment comprises a truncation of about 60 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226). In certain embodiments, a functional fragment comprises a truncation of about 70 bases at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 190-226 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identity to any one of SEQ ID NOs: 190-226).
[00142] In certain embodiments, the functional fragment comprises at least one transcription factor binding site. Transcription factor binding sites can be identified by consensus, or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83).
[00143] In certain embodiments, the promoter comprises a TATA mutation. In certain embodiments, the TATA mutation is a TATAA → TCGAA mutation.
[00144] In certain embodiments, the promoter is not an H1 promoter. In certain embodiments, the promoter is not one or more of the H1 promoters set forth in SEQ ID NOs: 1-82 and 242-521. In certain embodiments, the promoter is not one or more of the H1 promoters set forth in SEQ ID NOs: 1-82. In certain embodiments, the promoter is not one or more of the H1 promoters set forth in SEQ ID NOs: 242-521. In certain embodiments, the promoter is not one or more of an SRP-RPS29 promoter (SEQ ID NO: 219 or SEQ ID NO: 227), a mouse 7sk promoter (SEQ ID NO: 190), a 7sk1 promoter (SEQ ID NO: 228), a 7sk2 promoter (SEQ ID NO: 229), a 7sk3 promoter (SEQ ID NO: 230), an RMRP-CCDC107 promoter (SEQ ID NO: 207 or SEQ ID NO: 231), an ALOXE3 promoter (SEQ ID NO: 232), a CGB1 promoter (SEQ ID NO: 233), a CGB2 promoter (SEQ ID NO: 234), a Med16-1 promoter (SEQ ID NO: 235), a Med16-2 promoter (SEQ ID NO: 236), a DPP9-1 promoter (SEQ ID NO: 237), a DPP9-2 promoter (SEQ ID NO: 238), a DPP9-3 promoter (SEQ ID NO: 239), a SNORD13-C8orf41 promoter (SEQ ID NO: 240), and a THEM259 promoter (SEQ ID NO: 241).
[00145] In certain embodiments, a nucleic acid comprising a bidirectional promoter described herein further comprises a 5'UTR including at least a portion of a beta-globin 5'UTR sequence or a Kozak sequence. In certain embodiments, the 5'UTR includes the nucleotide sequence 5'-GCCGCCACC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof. In certain embodiments, the 6 bp fragment is 5'-GCCACC-3'.
[00146] In certain embodiments, a nucleic acid comprising a bidirectional promoter described herein further comprises a terminator sequence. In certain embodiments, the terminator sequence comprises one of the terminator sequences in TABLE 2. In certain embodiments, the bidirectional promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron).
[00147] In certain embodiments, the bidirectional promoter does not comprise a viral promoter and/or a synthetic promoter. In certain embodiments, the compact promoter does not comprise a F5tg83 promoter.
[00148] In certain embodiments, the bidirectional promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring mammalian promoter. In certain embodiments, the compact promoter comprises at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5% or 100% identity to a naturally occurring human promoter.
[00149] The expression level of a bidirectional promoter can be determined by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line. In certain embodiments, the compact promoter is capable of expressing a luciferase reporter at a higher level than is a HSK TK promoter.
B. Polyribonucleotides (e.g., Guide RNAs)
[00150] In aspects of the presently disclosed subject matter the term “polyribonucleotide” refers to polynucleotide polymers containing 50% or more ribose bases including unmodified and/or modified ribonucleotides. A “guide RNA” (“gRNA”) is a type of polyribonucleotide that includes a CRISPR RNA sequence (crRNA, also referred to as a “guide sequence” or “spacer”), and, in certain embodiments, a trans-activating CRISPR RNA sequence (tracrRNA). The tracrRNA, if present, binds to an endonuclease (e.g., a CRISPR enzyme such as Cas9) and the crRNA is complementary to a target sequence.
[00151] As used herein, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a gene editing endonuclease complex (e.g., a CRISPR complex). Full complementarity is not necessarily required, provided there is sufficient complementarity to cause hybridization and promote formation of a gene editing endonuclease complex (e.g., a CRISPR complex). A target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides. In certain embodiments, a target sequence is located in the nucleus or cytoplasm of a cell. In certain embodiments, the target sequence may be within an organelle of a eukaryotic cell, for example, mitochondrion or chloroplast. A sequence or template that may be used for recombination into the targeted locus comprising the target sequences is referred to as an “editing template,” “editing polynucleotide,” or “editing sequence.” In aspects of the presently disclosed subject matter, an exogenous template polynucleotide may be referred to as an editing template. In an aspect of the presently disclosed subject matter the recombination is homologous recombination.
[00152] In general, a guide sequence is any polyribonucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. In certain embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about 50% or more, about 60% or more, about 75% or more, about 80% or more, about 85% or more, about 90% or more, about 95% or more, about 97.5% or more, or about 99% or more. In certain embodiments, at least a portion of the polyribonucleotide comprises a nucleic acid sequence that is the reverse complement of any one of SEQ ID NOs: 538-710 and/or a nucleic acid that binds to any one of SEQ ID NOs: 538-710. In certain embodiments, the target sequence is selected from SEQ ID NOs: 538-710.
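By way of non-limiting illustration only, the following sketch derives the RNA reverse complement of a DNA target and computes a percent complementarity. It is a simplification under stated assumptions: the comparison is ungapped, both sequences have equal length, and only Watson-Crick pairs are counted, whereas the paragraph above contemplates optimal alignment with a suitable alignment algorithm. The function names and the example sequence are hypothetical and are not taken from SEQ ID NOs: 538-710.

```python
# Illustrative sketch only. Assumptions: ungapped, equal-length comparison,
# Watson-Crick pairing only. Names and sequences here are hypothetical.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def guide_for_target(target_dna: str) -> str:
    """Return the RNA reverse complement of a DNA target, written 5'->3'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(target_dna.upper()))


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percent of guide positions that match the perfect reverse complement."""
    if len(guide_rna) != len(target_dna):
        raise ValueError("this sketch compares equal-length sequences only")
    perfect = guide_for_target(target_dna)
    matches = sum(g == p for g, p in zip(guide_rna.upper(), perfect))
    return 100.0 * matches / len(perfect)


if __name__ == "__main__":
    target = "ACGTACGTACGTACGTACGT"  # hypothetical 20 nt DNA target
    guide = guide_for_target(target)
    print(guide, percent_complementarity(guide, target))
```

A guide with one mismatch against a 20 nt target would score 95% in this sketch, which falls within the "about 95% or more" tier recited above.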
[00153] Following administration of an endonuclease system, a portion of the polyribonucleotide (e.g., guide RNA) hybridizes to a target sequence of a hypoxia-induced gene. In certain embodiments, the polyribonucleotide directs the endonuclease to the target sequence of the hypoxia-induced gene. In certain embodiments the hypoxia-induced gene includes hypoxia-inducible factor-1-alpha (HIF1A) or hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1). Upon hybridization with the target sequence, the endonuclease can induce a break in the hypoxia-induced gene (e.g., HIF1A, HIF2A, BMPR2, or ALK1), which disrupts processing of the encoded mRNA transcript and expression of the encoded protein.
[00154] Optimal alignment of the polyribonucleotide to the target sequence may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In certain embodiments, a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In certain embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, or 12, or fewer nucleotides in length.
[00155] The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay. For example, the components of a CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by a Surveyor assay. Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will be known to those skilled in the art.
[00156] A guide sequence may be selected to target any target sequence. In certain embodiments, the target sequence is a sequence within a genome of a cell. Exemplary target sequences include those that are unique in the target genome. In certain embodiments, the target sequence is selected from SEQ ID NOs: 538-710.
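By way of non-limiting illustration only, whether a candidate target sequence is unique in a genome can be checked by counting exact occurrences on both strands, as in the sketch below. Exact string matching is an assumption made for brevity; it ignores near-matches that a practical off-target analysis would consider. The toy genome, the candidate sequences, and the function names are hypothetical.

```python
# Illustrative sketch only: exact-match uniqueness check on both strands.
# The toy "genome" and candidate targets are hypothetical placeholders.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_overlapping(haystack: str, needle: str) -> int:
    """Count occurrences of `needle`, allowing overlapping matches."""
    count, start = 0, haystack.find(needle)
    while start != -1:
        count += 1
        start = haystack.find(needle, start + 1)
    return count


def count_sites(genome: str, target: str) -> int:
    """Count exact occurrences of `target` on the forward and reverse strands."""
    genome, target = genome.upper(), target.upper()
    rc = reverse_complement(target)
    total = count_overlapping(genome, target)
    if rc != target:  # avoid double counting palindromic targets
        total += count_overlapping(genome, rc)
    return total


def is_unique(genome: str, target: str) -> bool:
    """True when the target occurs exactly once across both strands."""
    return count_sites(genome, target) == 1


if __name__ == "__main__":
    toy_genome = "TTGACCATGCAAGGTCAATT"
    for candidate in ("GACCATG", "TTGACC"):
        print(candidate, count_sites(toy_genome, candidate), is_unique(toy_genome, candidate))
```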
C. Endonucleases
[00157] The invention provides an endonuclease system having a nucleotide sequence encoding an endonuclease. In certain embodiments, the endonuclease can be any endonuclease that is capable of cleaving DNA to effect a single or double strand break at the intended locus. Non-limiting examples of Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), StCas9, NmCas9, GeoCas9, Cas10, Cas12, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, Cas12i, Cas14a, Cas14b, Cas14c, Cas , Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, and Csf4, homologs of any of the foregoing, or modified versions of any of the foregoing. These enzymes are known; for example, the amino acid sequence of S. pyogenes Cas9 protein may be found in the SwissProt database under accession number Q99ZW2. In certain embodiments, the CRISPR enzyme has DNA cleavage activity, such as Cas9. In certain embodiments the CRISPR enzyme is Cas9, and may be Cas9 from S. pyogenes or S. pneumoniae. In certain embodiments, the endonuclease is a MAD1, MAD2, MAD3, MAD4, MAD5, MAD6, MAD7, MAD8, MAD9, MAD10, MAD11, or MAD1 1 endonuclease (see, e.g., U.S. Patent No. 9,982,279). The DNA endonuclease can be a Cpf1 endonuclease, a homolog thereof, a recombinant of the naturally-occurring molecule thereof, a codon-optimized version thereof, a modified version thereof (e.g., a mutated variant such as a Nickase), and combinations of any of the foregoing. For example, in certain embodiments, the DNA endonuclease is a Cas9 or Cpf1 endonuclease that effects a single-strand break or double-strand break at a locus within or near a target sequence.
[00158] In certain embodiments, the DNA endonuclease is a Cas9 endonuclease. In certain embodiments, the Cas9 endonuclease is a recombinant Cas9, a codon-optimized Cas9, or a modified or mutated Cas9. The Cas9 endonuclease can be derived from a variety of bacterial species. For example, in certain embodiments, the Cas9 endonuclease is derived from Streptococcus thermophilus, Streptococcus pyogenes, Neisseria meningitidis, Staphylococcus aureus, or Treponema denticola. In certain embodiments, the Cas9 endonuclease is derived from Staphylococcus aureus (SaCas9). In certain embodiments, the Cas9 endonuclease is derived from Streptococcus pyogenes (SpCas9). Wild-type Cas9 has two active sites, RuvC and HNH nuclease domains, for cleaving DNA, one for each strand of the double helix. However, Nickase variants of Cas9 are readily available (e.g., Addgene, plasmid #48873) that are only capable of cleaving one strand of the DNA due to catalytic inactivation of the RuvC or HNH nuclease domains. Accordingly, in certain embodiments, the Cas9 endonuclease is a mutated SpCas9 endonuclease (e.g., a Nickase) and/or a codon-optimized version thereof.
[00159] In other embodiments, the DNA endonuclease is a Cpf1 endonuclease (e.g., a recombinant Cpf1, a codon-optimized Cpf1, or a modified or mutated Cpf1). The Cpf1 endonuclease can be derived from a variety of bacterial species. For example, in certain embodiments, the Cpf1 endonuclease is derived from Acidaminococcus bacteria or Lachnospiraceae bacteria. In certain embodiments, the Cpf1 endonuclease is a Lachnospiraceae bacterium ND2006 Cpf1.
[00160] In other embodiments, the DNA endonuclease is a MAD7 endonuclease (e.g., a recombinant MAD7, a codon-optimized MAD7, or a modified or mutated MAD7). MAD7 is a codon-optimized endonuclease derived from Eubacterium rectale (Inscripta, Boulder, CO). MAD7 is described in U.S. Patent No. 9,982,279.
[00161] In other embodiments, an RNA-targeting endonuclease is used. Exemplary RNA-targeting endonucleases include Cas13a, Cas13b, and Cas13d.
[00162] In certain embodiments, the endonuclease (e.g., a CRISPR enzyme) directs cleavage of one or both strands at the location of a target sequence, such as within the target sequence and/or within the complement of the target sequence. In certain embodiments, the endonuclease directs cleavage of one or both strands within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, or 500, or more base pairs from the first or last nucleotide of a target sequence. In certain embodiments, a vector encodes an endonuclease that is mutated with respect to a corresponding wild-type enzyme such that the mutated endonuclease lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence. For example, in certain embodiments, an endonuclease system comprises a nuclease-dead version of an endonuclease (e.g., Cas9 (dCas9)) (Qi et al. (2013) CELL 152, 1173-1183; Gilbert et al. (2013) CELL 154, 442-451; Larson et al. (2013) NATURE PROTOCOLS 8, 2180-2196; Fuller et al. (2014) ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 801, 773-781). Instead of inducing cleavage, a nuclease-dead endonuclease stays bound tightly to a target sequence. When targeted to an actively-transcribed gene, inhibition of pol II progression through a steric hindrance mechanism can lead to efficient transcriptional repression. Thus, use of a nuclease-dead nuclease can achieve therapeutic repression of a target gene without inducing a break in the target nucleotide sequence.
[00163] In general, the term “CRISPR system” as used herein refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus. In certain embodiments, one or more elements of a CRISPR system is derived from a type I, type II, or type III CRISPR system. In certain embodiments, one or more elements of a CRISPR system is derived from a particular organism comprising an endogenous CRISPR system, such as Streptococcus pyogenes. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system).
[00164] In certain embodiments, an enzyme coding sequence encoding an endonuclease (e.g., a CRISPR enzyme) is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, mouse, rat, rabbit, dog, or non-human primate. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, or 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database,” and these tables can be adapted in a number of ways. See Nakamura et al. (2000) Nucl. Acids Res. 28:292. Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, Pa.). 
In certain embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, or 50, or more, or all codons) in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
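By way of non-limiting illustration only, the codon replacement described above can be sketched as follows: each codon of a native coding sequence is translated with the standard genetic code and, where a preferred codon is listed for that amino acid, replaced with it, so that the amino acid sequence is unchanged. The `PREFERRED` mapping is a small placeholder assumed for this example; it is not taken from the disclosure, and in practice the preferences would be drawn from a codon usage table for the host of interest. The function names and the example sequence are likewise hypothetical.

```python
# Illustrative sketch only: synonymous codon replacement that preserves the
# encoded amino acid sequence. PREFERRED is a hypothetical, partial mapping
# used for demonstration; real preferences come from a codon usage table.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferred codons for a handful of amino acids.
PREFERRED = {"L": "CTG", "A": "GCC", "G": "GGC", "K": "AAG", "E": "GAG"}
# Sanity check: each preferred codon must encode the amino acid it is listed for.
assert all(CODON_TO_AA[codon] == aa for aa, codon in PREFERRED.items())


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) to one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize(cds: str) -> str:
    """Replace codons with preferred synonyms; keep the native codon otherwise."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(PREFERRED.get(CODON_TO_AA[codon], codon))
    return "".join(out)


if __name__ == "__main__":
    native = "ATGTTAGCAGGTAAATAA"  # hypothetical sequence encoding M-L-A-G-K-stop
    optimized = optimize(native)
    print(native, "->", optimized)
    print(translate(native) == translate(optimized))
```

Amino acids absent from the placeholder mapping keep their native codons, which corresponds to replacing "at least one" rather than all codons.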
[00165] In certain embodiments, the CRISPR enzyme is part of a fusion protein comprising one or more heterologous protein domains (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or more domains in addition to the CRISPR enzyme). A CRISPR enzyme fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains. Examples of protein domains that may be fused to a CRISPR enzyme include, without limitation, epitope tags, reporter gene sequences, and protein domains having one or more of the following activities: methylase activity, demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, RNA cleavage activity, and nucleic acid binding activity. Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), enhanced GFP (eGFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP). A CRISPR enzyme may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions. Additional domains that may form part of a fusion protein comprising a CRISPR enzyme are described in U.S. Patent Publication No. 2011/0059502, incorporated herein by reference. In certain embodiments, a tagged CRISPR enzyme is used to identify the location of a target sequence.
[00166] In an aspect of the presently disclosed subject matter, a reporter gene which includes but is not limited to GST, HRP, CAT, beta-galactosidase, beta-glucuronidase, luciferase, GFP, eGFP, HcRed, DsRed, CFP, YFP, and autofluorescent proteins including BFP, may be introduced into a cell to encode a gene product which serves as a marker by which to measure the alteration or modification of expression of the gene product. In certain embodiments, the DNA molecule encoding the gene product may be introduced into the cell via a vector. In certain embodiments, the gene product is luciferase. In certain embodiments, the expression of the gene product is decreased.
D. Vector Systems
[00167] Several aspects of the presently disclosed subject matter relate to vector systems comprising one or more vectors. Vectors can be designed for expression of CRISPR transcripts (e.g., nucleic acid transcripts, proteins, or enzymes) in prokaryotic or eukaryotic cells. For example, CRISPR transcripts can be expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression vectors), yeast cells, or mammalian cells. Suitable host cells are discussed further in Goeddel (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Alternatively, a recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.
[00168] In certain embodiments, a vector comprises one or more insertion sites, such as a restriction endonuclease recognition sequence (also referred to as a “cloning site”). In certain embodiments, one or more insertion sites (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or more insertion sites) are located upstream and/or downstream of one or more sequence elements of one or more vectors. When multiple different guide sequences are used, a single expression construct may be used to target endonuclease activity to multiple different, corresponding target sequences within a cell. For example, a single vector may comprise about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20, or more guide sequences. In certain embodiments, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or more such guide sequence-containing vectors may be provided, and optionally delivered to a cell. In certain embodiments, a vector comprises a regulatory element operably linked to an enzyme-coding sequence encoding an endonuclease, such as a CRISPR enzyme (e.g., a Cas protein).
[00169] Vectors may be introduced and propagated in a prokaryote. In certain embodiments, a prokaryote is used to amplify copies of a vector to be introduced into a eukaryotic cell or as an intermediate vector in the production of a vector to be introduced into a eukaryotic cell (e.g., amplifying a plasmid as part of a viral vector packaging system). In certain embodiments, a prokaryote is used to amplify copies of a vector and express one or more nucleic acids, such as to provide a source of one or more proteins for delivery to a host cell or host organism. Expression of proteins in prokaryotes is most often carried out in Escherichia coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins.
[00170] Fusion vectors add a number of amino acids to a protein encoded therein, such as to the amino terminus of the recombinant protein. Such fusion vectors may serve one or more purposes, such as: (i) to increase expression of recombinant protein; (ii) to increase the solubility of the recombinant protein; and (iii) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition sequences, include Factor Xa, thrombin, and enterokinase. Example fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway N.J.) that fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein.
[00171] Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. (1988) GENE 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif.).
[00172] In certain embodiments, a vector is a yeast expression vector. Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1 (Baldari, et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz (1982) CELL 30: 933-943), pJRY88 (Schultz et al. (1987) Gene 54: 113-123), pYES2 (Invitrogen Corporation, San Diego, Calif.), and picZ (Invitrogen Corp, San Diego, Calif.).
[00173] In certain embodiments, a vector is capable of driving expression of one or more sequences in mammalian cells using a mammalian expression vector. Examples of mammalian expression vectors include pCDM8 (Seed (1987) NATURE 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6: 187-195). When used in mammalian cells, the expression vector’s control functions are typically provided by one or more regulatory elements. For example, commonly used promoters are derived from polyoma, adenovirus 2, cytomegalovirus, simian virus 40, and others disclosed herein and known in the art. For other suitable expression systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16 and 17 of Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
[00174] In certain embodiments, the recombinant mammalian expression vector is capable of directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) GENES DEV. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) ADV. IMMUNOL. 43:235-275), promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) CELL 33:729-740; Queen and Baltimore (1983) CELL 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) PROC. NATL. ACAD. SCI. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) SCIENCE 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Publication No. 264,166). Developmentally-regulated promoters are also encompassed, e.g., the murine hox promoters (Kessel and Gruss (1990) SCIENCE 249: 374-379) and the alpha-fetoprotein promoter (Camper and Tilghman (1989) GENES DEV. 3:537-546).
[00175] In certain embodiments, a regulatory element is operably linked to one or more elements of a CRISPR system so as to drive expression of the one or more elements of the CRISPR system. In general, CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats), also known as SPIDRs (SPacer Interspersed Direct Repeats), constitute a family of DNA loci that are usually specific to a particular bacterial species. The CRISPR locus comprises a distinct class of interspersed short sequence repeats (SSRs) that were recognized in E. coli (Ishino et al. (1987) J. BACTERIOL., 169:5429-5433; and Nakata et al. (1989) J. BACTERIOL., 171:3553-3556). Similar interspersed SSRs have been identified in Haloferax mediterranei, Streptococcus pyogenes, Anabaena, and Mycobacterium tuberculosis (Groenen et al. (1993) MOL. MICROBIOL., 10: 1057-1065; Hoe et al. (1999) EMERG. INFECT. DIS., 5:254-263; Masepohl et al. (1996) BIOCHIM. BIOPHYS. ACTA 1307:26-30; and Mojica et al. (1995) MOL. MICROBIOL., 17:85-93). The CRISPR loci typically differ from other SSRs by the structure of the repeats, which have been termed short regularly spaced repeats (SRSRs) (Janssen et al. (2002) OMICS J. INTEG. BIOL., 6:23-33; and Mojica et al. (2000) MOL. MICROBIOL., 36:244-246). In general, the repeats are short elements that occur in clusters that are regularly spaced by unique intervening sequences with a substantially constant length (Mojica et al. (2000) MOL. MICROBIOL., 36:244-246). Although the repeat sequences are highly conserved between strains, the number of interspersed repeats and the sequences of the spacer regions typically differ from strain to strain (van Embden et al. (2000) J. BACTERIOL., 182:2393-2401). CRISPR loci have been identified in more than 40 prokaryotes (e.g., Jansen et al. (2002) MOL. MICROBIOL., 43: 1565-1575; and Mojica et al. (2005) J. MOL. EVOL. 60: 174-82) including, but not limited to Aeropyrum, Pyrobaculum, Sulfolobus, Archaeoglobus, Haloarcula, Methanobacterium, Methanococcus, Methanosarcina, Methanopyrus, Pyrococcus, Picrophilus, Thermoplasma, Corynebacterium, Mycobacterium, Streptomyces, Aquifex, Porphyromonas, Chlorobium, Thermus, Bacillus, Listeria, Staphylococcus, Clostridium, Thermoanaerobacter, Mycoplasma, Fusobacterium, Azoarcus, Chromobacterium, Neisseria, Nitrosomonas, Desulfovibrio, Geobacter, Myxococcus, Campylobacter, Wolinella, Acinetobacter, Erwinia, Escherichia, Legionella, Methylococcus, Pasteurella, Photobacterium, Salmonella, Xanthomonas, Yersinia, Treponema, and Thermotoga.
[00176] In certain embodiments, the AAV vector includes a nucleotide sequence having a length of from 3 kbp to 6 kbp (e.g., 3.1 kbp, 3.2 kbp, 3.3 kbp, 3.4 kbp, 3.5 kbp, 3.6 kbp, 3.7 kbp, 3.8 kbp, 3.9 kbp, 4.0 kbp, 4.1 kbp, 4.2 kbp, 4.3 kbp, 4.4 kbp, 4.5 kbp, 4.6 kbp, 4.7 kbp, 4.8 kbp, 4.9 kbp, 5.0 kbp, 5.1 kbp, 5.2 kbp, 5.3 kbp, 5.4 kbp, 5.5 kbp, 5.6 kbp, 5.7 kbp, 5.8 kbp, 5.9 kbp, or 6.0 kbp).
[00177] The disclosure provides rAAV vectors comprising an endonuclease system under the control of a suitable promoter (e.g., a compact bidirectional promoter) to direct the expression of the gRNA and endonuclease. The disclosure further provides a therapeutic composition comprising an rAAV vector comprising an endonuclease system under the control of a suitable promoter (e.g., a compact bidirectional promoter). A variety of rAAV vectors may be used to deliver the desired complement system gene to the appropriate cells and/or tissues and to direct its expression. More than 30 naturally-occurring serotypes of AAV from humans and non-human primates are known. Many natural variants of the AAV capsid exist, and an rAAV vector of the disclosure may be designed based on an AAV with properties specifically suited for expression in the cells and/or tissues relevant for the endonuclease system to be expressed.
[00178] In general, an rAAV vector is comprised of, in order, a 5' AAV ITR, a transgene or gene of interest encoding an endonuclease system operably linked to a sequence which regulates its expression in a target cell, and a 3' AAV ITR. In addition, the rAAV vector may have a polyadenylation sequence. Generally, rAAV vectors have one copy of the AAV ITR at each end of the transgene or gene of interest, in order to allow replication, packaging, and efficient integration into cell chromosomes. In certain embodiments, the transgene sequence encoding a complement system polypeptide (or a functional fragment or variant thereof) or a biologically active fragment thereof will be of about 2 kb to 5 kb in length (or alternatively, the transgene may additionally contain a “stuffer” or “filler” sequence to bring the total size of the nucleic acid sequence between the two ITRs to between about 2 kb and 5 kb).
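By way of non-limiting illustration only, the length of a "stuffer" or "filler" sequence needed to bring the nucleic acid between the two ITRs into the stated range can be computed as in the sketch below. The sketch treats "about 2 kb" and "about 5 kb" as exact bounds of 2,000 bp and 5,000 bp, which is an assumption made for simplicity; the function name and the example lengths are hypothetical.

```python
# Illustrative sketch only: size a stuffer so the sequence between the ITRs
# reaches the lower bound of the stated range. Treating "about 2 kb to 5 kb"
# as exactly 2,000-5,000 bp is an assumption made for this example.
MIN_BETWEEN_ITRS_BP = 2_000
MAX_BETWEEN_ITRS_BP = 5_000


def stuffer_length(
    insert_bp: int,
    minimum: int = MIN_BETWEEN_ITRS_BP,
    maximum: int = MAX_BETWEEN_ITRS_BP,
) -> int:
    """Return the stuffer length (bp) needed to reach `minimum`, or 0 if none."""
    if insert_bp < 0:
        raise ValueError("insert length cannot be negative")
    if insert_bp > maximum:
        raise ValueError("insert already exceeds the upper bound")
    return max(0, minimum - insert_bp)


if __name__ == "__main__":
    for insert in (1_500, 3_200):  # hypothetical insert lengths in bp
        print(insert, "bp insert ->", stuffer_length(insert), "bp stuffer")
```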
[00179] Recombinant AAV vectors of the present disclosure may be generated from a variety of AAVs. For example, ITRs from any AAV serotype are expected to have similar structures and functions with regard to replication, integration, excision, and transcriptional mechanisms. Examples of AAV serotypes include AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, and AAV12. In certain embodiments, the AAV vector is generated from serotype AAV1, AAV2, AAV4, AAV5, or AAV8. In certain embodiments, the AAV vector includes a targeting peptide that confers tropism to lung vascular cells (e.g., lung endothelial cells and/or lung artery smooth muscle cells). In certain embodiments, the AAV vector includes the targeting peptide ESGHGYF (SEQ ID NO: 533) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto. In certain embodiments, the AAV vector includes the targeting peptide GHGYF (SEQ ID NO: 534) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto. In certain embodiments, the AAV vector includes the targeting peptide CGFECVRQCPER (SEQ ID NO: 535) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto. In certain embodiments, the AAV vector includes the targeting peptide CGSPGWVRC (SEQ ID NO: 536) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto. 
In certain embodiments, the AAV vector includes the targeting peptide CARSKNKDC (SEQ ID NO: 537) or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
[00180] In certain embodiments, the targeting peptide includes from about 4 to about 25 amino acids, for example, about 4 to about 10 amino acids, about 4 to about 15 amino acids, about 4 to about 20 amino acids, about 5 to about 10 amino acids, about 5 to about 15 amino acids, or about 5 to about 20 amino acids. In certain embodiments, the targeting peptide includes an amino acid sequence of any one of SEQ ID NOs: 533-537 or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
[00181] In certain embodiments, the AAV comprises a targeting peptide that allows it to target a particular cell and/or tissue type. In certain embodiments, the targeting peptide comprises or consists of ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537), or an amino acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto, which confers tropism to lung vascular cells.
[00182] In particular embodiments, the rAAV vector is generated from serotype AAV2. In certain embodiments, the rAAV vector is an AAV-L1 vector, which is a modified AAV2 serotype that comprises a peptide insertion (ESGHGYF (SEQ ID NO: 533)) at position R588 of the VP1 protein, thereby allowing it to target pulmonary vascular cells (e.g., endothelial and/or smooth muscle cells).
[00183] In certain embodiments, the AAV serotypes include AAVrh8, AAVrh8R, or AAVrh10. It will also be understood that rAAV vectors of the disclosure may be chimeras of two or more serotypes selected from serotypes AAV1-AAV12, AAV-DJ, AAV-DJ8, AAV-DJ9, or other modified serotypes. The tropism of the vector may be altered by packaging the recombinant genome of one serotype into capsids derived from another AAV serotype. In certain embodiments, the ITRs of the rAAV virus may be based on the ITRs of, for example, any one of AAV1-12 and may be combined with an AAV capsid selected from any one of AAV1-12, AAV-DJ, AAV-DJ8, AAV-DJ9, or other modified serotypes. In certain embodiments, any AAV capsid serotype may be used with the vectors of the disclosure.
[00184] Examples of AAV serotypes include AAV1, AAV2, AAV-L1, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV-DJ, AAV-DJ8, AAV-DJ9, AAVrh8, AAVrh8R, and AAVrh10. In certain embodiments, the AAV capsid serotype is AAV2.
[00185] Desirable AAV fragments for assembly into vectors may include the Cap proteins, including the VP1, VP2, VP3 and hypervariable regions, the Rep proteins, including Rep78, Rep68, Rep52, and Rep40, and the sequences encoding these proteins. These fragments may be readily utilized in a variety of vector systems and host cells. Such fragments may be used alone, in combination with other AAV serotype sequences or fragments, or in combination with elements from other AAV or non-AAV viral sequences. As used herein, artificial AAV serotypes include, without limitation, AAV with a non-naturally occurring capsid protein. Such an artificial capsid may be generated by any suitable technique using a selected AAV sequence (e.g., a fragment of a vp1 capsid protein) in combination with heterologous sequences which may be obtained from a different selected AAV serotype, non-contiguous portions of the same AAV serotype, a non-AAV viral source, or a non-viral source. An artificial AAV serotype may be, without limitation, a pseudotyped AAV, a chimeric AAV capsid, a recombinant AAV capsid, or a “humanized” AAV capsid.
[00186] In certain embodiments, any of the vectors disclosed herein include a spacer, i.e., a DNA sequence interposed between the promoter and the Rep gene ATG start site. In certain embodiments, the spacer may be a random sequence of nucleotides, or alternatively, it may encode a gene product, such as a marker gene. In certain embodiments, the spacer may contain genes which typically incorporate start/stop and poly A sites. In certain embodiments, the spacer may be a non-coding DNA sequence from a prokaryote or eukaryote, a repetitive non-coding sequence, a coding sequence without transcriptional controls, or a coding sequence with transcriptional controls. In certain embodiments, the spacer is a phage ladder sequence or a yeast ladder sequence. In certain embodiments, the spacer is of a size sufficient to reduce expression of the Rep78 and Rep68 gene products, leaving the Rep52, Rep40 and Cap gene products expressed at normal levels. In certain embodiments, the length of the spacer may therefore range from about 10 bp to about 6 kbp, such as in the range of about 100 bp to about 6 kbp. In certain embodiments, the spacer is less than 2 kbp in length.
[00187] Numerous methods are known in the art for production of rAAV vectors, including transfection, stable cell line production, and infectious hybrid virus production systems which include adenovirus-AAV hybrids, herpesvirus-AAV hybrids (Conway, JE et al, (1997). Virology 71(11):8780-8789), and baculovirus-AAV hybrids. rAAV production cultures for the production of rAAV virus particles all require: 1) suitable host cells, including, for example, human-derived cell lines such as HeLa, A549, or 293 cells, or insect-derived cell lines such as SF-9, in the case of baculovirus production systems; 2) suitable helper virus function, provided by wild-type or mutant adenovirus (such as temperature-sensitive adenovirus), herpes virus, baculovirus, or a plasmid construct providing helper functions; 3) AAV Rep and Cap genes and gene products; 4) a transgene (such as a transgene comprising an endonuclease system) flanked by at least one AAV ITR sequence; and 5) suitable media and media components to support rAAV production. Suitable media known in the art may be used for the production of rAAV vectors. These media include, without limitation, media produced by Hyclone Laboratories and JRH including Modified Eagle Medium (MEM), Dulbecco's Modified Eagle Medium (DMEM), custom formulations such as those described in U.S. Patent No. 6,566,118, and Sf-900 II SFM media as described in U.S. Patent No. 6,723,551, each of which is incorporated herein by reference in its entirety, particularly with respect to custom media formulations for use in production of recombinant AAV vectors.
[00188] The rAAV particles can be produced using methods known in the art. See, e.g., U.S. Pat. Nos. 6,566,118; 6,989,264; and 6,995,006. In practicing the disclosure, host cells for producing rAAV particles include mammalian cells, insect cells, plant cells, microorganisms, and yeast. Host cells can also be packaging cells in which the AAV Rep and Cap genes are stably maintained in the host cell or producer cells in which the AAV vector genome is stably maintained. Exemplary packaging and producer cells are derived from 293, A549, and HeLa cells. AAV vectors are purified and formulated using standard techniques known in the art.
[00189] rAAV particles are generated by transfecting producer cells with a plasmid (cis-plasmid) containing an rAAV genome comprising a transgene flanked by the 145 nucleotide-long AAV ITRs and a separate construct expressing the AAV Rep and Cap genes in trans. In addition, adenovirus helper factors such as E1A, E1B, E2A, E4ORF6, and VA RNAs, etc. may be provided by either adenovirus infection or by transfecting a third plasmid providing adenovirus helper genes into the producer cells. Producer cells may be HEK293 cells. Packaging cell lines suitable for producing AAV vectors may be readily generated using available techniques (see, e.g., U.S. Pat. No. 5,872,005). The helper factors provided will vary depending on the producer cells used and whether the producer cells already carry some of these helper factors.
[00190] In certain embodiments, rAAV particles may be produced by a triple transfection method, such as the exemplary triple transfection method provided infra. Briefly, a plasmid containing a Rep gene and a Cap gene, along with a helper adenoviral plasmid, may be transfected (e.g., using the calcium phosphate method) into a cell line (e.g., HEK-293 cells), and virus may be collected and optionally purified.
[00191] In certain embodiments, rAAV particles may be produced by a producer cell line method, such as the exemplary producer cell line method provided infra (see also Martin et al. (2013) HUMAN GENE THERAPY METHODS 24:253-269). Briefly, a cell line (e.g., a HeLa cell line) may be stably transfected with a plasmid containing a Rep gene, a Cap gene, and a promoter-transgene sequence. Cell lines may be screened to select a lead clone for rAAV production, which may then be expanded to a production bioreactor and infected with an adenovirus (e.g., a wild-type adenovirus) as helper to initiate rAAV production. Virus may subsequently be harvested, adenovirus may be inactivated (e.g., by heat) and/or removed, and the rAAV particles may be purified.
[00192] rAAV vector particles of the disclosure may be harvested from rAAV production cultures by lysis of the host cells of the production culture or by harvest of the spent media from the production culture, provided the cells are cultured under conditions known in the art to cause release of rAAV particles into the media from intact cells, as described more fully in U.S. Patent No. 6,566,118. Suitable methods of lysing cells are also known in the art and include, for example, multiple freeze/thaw cycles, sonication, microfluidization, and treatment with chemicals, such as detergents and/or proteases.
[00193] In certain embodiments, the rAAV particles are purified. The term “purified” as used herein includes a preparation of rAAV particles devoid of at least some of the other components that may also be present where the rAAV particles naturally occur or are initially prepared from. Thus, for example, isolated rAAV particles may be prepared using a purification technique to enrich them from a source mixture, such as a culture lysate or production culture supernatant. Enrichment can be measured in a variety of ways, such as, for example, by the proportion of DNase-resistant particles (DRPs) or genome copies (gc) present in a solution, or by infectivity, or it can be measured in relation to a second, potentially interfering substance present in the source mixture, such as contaminants, including production culture contaminants or in-process contaminants, including helper virus, media components, and the like.
[00194] rAAV particles may be isolated or purified using one or more of the following purification steps: equilibrium centrifugation; flow-through anionic exchange filtration; tangential flow filtration (TFF) for concentrating the rAAV particles; rAAV capture by apatite chromatography; heat inactivation of helper virus; rAAV capture by hydrophobic interaction chromatography; buffer exchange by size exclusion chromatography (SEC); nanofiltration; rAAV capture by anionic exchange chromatography, cationic exchange chromatography, or affinity chromatography. These steps may be used alone, in various combinations, or in different orders. In certain embodiments, the method comprises all the steps in the order as described below. Methods to purify rAAV particles are found, for example, in Xiao et al. (1998) JOURNAL OF VIROLOGY 72:2224-2232; U.S. Patent Nos. 6,989,264 and 8,137,948; and WO 2010/148143.
III. Methods of preventing or treating PH or PAH
[00195] The disclosure provides methods of preventing or treating PH or PAH in a subject by administering the endonuclease system as herein described. In certain embodiments the endonuclease system is administered using an AAV vector. In certain embodiments, the AAV vector includes a nucleotide sequence encoding a promoter, a nucleotide sequence encoding a polyribonucleotide (e.g., a gRNA), and a nucleotide sequence encoding an endonuclease.
[00196] Upon administration of an endonuclease system described herein, all or a portion of a gRNA (e.g., a guide sequence) hybridizes with a target sequence of a hypoxia-induced gene (e.g., HIF1A, HIF2A, BMPR2, or ALK1) and the polyribonucleotide directs the endonuclease to the target sequence. Upon hybridization with the target sequence, the endonuclease can induce a break in the target DNA. The break in the target DNA disrupts processing of an encoded mRNA transcript and expression of the encoded protein, thereby to prevent or treat PH or PAH.
[00197] In certain embodiments, the endonuclease induces a break in a HIF1A, HIF2A, BMPR2, or ALK1 gene, which disrupts processing of the encoded HIF1A, HIF2A, BMPR2, or ALK1 mRNA transcript and expression of the encoded HIF1A, HIF2A, BMPR2, or ALK1 protein, thereby to prevent or treat PH or PAH.
[00198] The method of preventing or treating PH or PAH in a subject, in certain embodiments, includes co-administration of the AAV vector, as herein described, and an inhibitor of a hypoxia-induced gene. In certain embodiments, the inhibitor is a small molecule inhibitor. In certain embodiments, the small molecule inhibitor is selected from the group including belzutifan, PT2385, vadadustat, KC7F2, CAY10585, 2-Methoxyestradiol, SYP-5, PT2399, N-Acetylcysteine amide, IDF-11774, Lificiguat (YC-1), PX-478 2HCl, BAY 87-2243, C76 (Methyl-3-(2-(cyano(methylsulfonyl)methylene)hydrazino)thiophene-2-carboxylate), Roxadustat (FG-4592), Daprodustat (GSK1278863), Desidustat (ZYAN-1), Molidustat (Bay 85-3934), MK-8617, IOX-2, 2-methoxyestradiol, GN-44028, AKB-4924, FG-2216, and FG-4497. In certain embodiments, the AAV vector may be administered prophylactically in individuals identified as having an elevated risk of developing PH or PAH (FIG. 6). Clinical precursors used to anticipate the development of PH or PAH may include high blood pressure, left-sided heart disease (systolic LV failure, LV diastolic dysfunction, valvular diseases), lung disease (chronic obstructive pulmonary disease, interstitial lung disease, chronic thromboembolic pulmonary hypertension), chronic hemolytic anemia, sarcoidosis, chronic renal failure, connective tissue disease, liver cirrhosis, use of medication with a risk of producing PH or PAH as a side effect, genetic disorders or conditions (e.g., mutations in BMPR2, ALK1, or eukaryotic translation initiation factor 2 alpha kinase 4 (EIF2AK4)); a tumor; chronic pulmonary emboli; hypoxic pulmonary vasoconstriction; a parasitic infection or disease (e.g., schistosomiasis); viral infections of SARS-CoV-2 or HIV; drug-induced conditions (e.g., side effects from CML treatment caused by tyrosine kinase inhibitors); tobacco usage; or environmental precursors (e.g., high altitude).
In certain embodiments, the AAV vector and the small molecule inhibitor may be administered substantially simultaneously. In certain embodiments, the AAV vector and the small molecule inhibitor may be administered sequentially in any order. The AAV alone or in combination with the small molecule inhibitor may be administered subcutaneously, intradermally, intravenously, intraperitoneally, via inhalation, nasally, orally, intramuscularly, intracranially, via intrapulmonary route, via ophthalmic route, parenterally, rectally, vaginally, or via a transmucosal route. In certain embodiments, the endonuclease system may be administered at substantially the same time that hypoxia-induced effects are detected. In certain embodiments, the endonuclease system may be administered after hypoxia-induced effects are detected.
[00199] In certain embodiments, the AAV capsid is modified to improve therapy. The capsid may be modified using conventional molecular biology techniques. In certain embodiments, the capsid is modified for minimized immunogenicity, better stability and particle lifetime, efficient degradation, and/or accurate delivery of the endonuclease system to the nucleus. In certain embodiments, the modification or mutation is an amino acid deletion, insertion, substitution, or any combination thereof in a capsid protein. A modified polypeptide may comprise 1, 2, 3, 4, 5, up to 10, or more amino acid substitutions and/or deletions and/or insertions. A “deletion” may comprise the deletion of individual amino acids, deletion of small groups of amino acids such as 2, 3, 4, or 5 amino acids, or deletion of larger amino acid regions, such as the deletion of specific amino acid domains or other features. An “insertion” may comprise the insertion of individual amino acids, insertion of small groups of amino acids such as 2, 3, 4, or 5 amino acids, or insertion of larger amino acid regions, such as the insertion of specific amino acid domains or other features.
[00200] In certain embodiments, one or more amino acid substitutions are introduced into one or more of VP1, VP2, and VP3. In one aspect, a modified capsid protein comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 conservative or non-conservative substitutions relative to the wild-type polypeptide. In another aspect, the modified capsid polypeptide of the disclosure comprises modified sequences, wherein such modifications can include both conservative and non-conservative substitutions, deletions, and/or additions, and typically include peptides that share at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 87%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the corresponding wild-type capsid protein.
[00201] The methods of the invention also provide administering pharmaceutical compositions comprising an endonuclease system described herein and a pharmaceutically acceptable carrier. The pharmaceutical compositions may be suitable for any mode of administration described herein.
[00202] In certain embodiments, the pharmaceutical compositions comprising a nucleic acid described herein and a pharmaceutically acceptable carrier are suitable for administration to a human subject. Such carriers are well known in the art (see, e.g., Remington's Pharmaceutical Sciences, 15th Edition, pp. 1035-1038 and 1570-1580). Such pharmaceutically acceptable carriers can be sterile liquids, such as water and oil, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, and the like. Saline solutions and aqueous dextrose, polyethylene glycol (PEG) and glycerol solutions can also be employed as liquid carriers, particularly for injectable solutions. The pharmaceutical composition may further comprise additional ingredients, for example preservatives, buffers, tonicity agents, antioxidants and stabilizers, nonionic wetting or clarifying agents, viscosity-increasing agents, and the like. The pharmaceutical compositions described herein can be packaged in single unit dosages or in multi-dosage forms. The compositions are generally formulated as a sterile and substantially isotonic solution.
[00203] In certain embodiments, the nucleic acid comprising the endonuclease system and compact bidirectional promoter for use in the target cells as detailed above is formulated into a pharmaceutical composition intended for oral, inhalation, intranasal, intratracheal, intravenous, intramuscular, subcutaneous, intradermal, or parenteral routes of administration. Such formulation involves the use of a pharmaceutically and/or physiologically acceptable vehicle or carrier, such as buffered saline or other buffers, e.g., HEPES, to maintain pH at appropriate physiological levels, and, optionally, other medicinal agents, pharmaceutical agents, stabilizing agents, buffers, carriers, adjuvants, diluents, etc. For injection, the carrier will typically be a liquid. Exemplary physiologically acceptable carriers include sterile, pyrogen-free water and sterile, pyrogen-free, phosphate buffered saline. A variety of such known carriers are provided in U.S. Patent No. 7,629,322, incorporated herein by reference. In certain embodiments, the carrier is an isotonic sodium chloride solution. In certain embodiments, the carrier is a balanced salt solution. In certain embodiments, the carrier includes Tween. If the virus is to be stored long-term, it may be frozen in the presence of glycerol or Tween 20. In certain embodiments, the pharmaceutically acceptable carrier comprises a surfactant, such as perfluorooctane (Perfluoron liquid). Routes of administration may be combined, if desired.
[00204] The composition may be delivered in a volume of from about 0.1 µL to about 1 mL, including all numbers within the range, depending on the size of the area to be treated, the viral titer used, the route of administration, and the desired effect of the method. In certain embodiments, the volume is about 50 µL. In certain embodiments, the volume is about 70 µL. In certain embodiments, the volume is about 100 µL. In certain embodiments, the volume is about 125 µL. In certain embodiments, the volume is about 150 µL. In certain embodiments, the volume is about 175 µL. In certain embodiments, the volume is about 200 µL. In certain embodiments, the volume is about 250 µL. In certain embodiments, the volume is about 300 µL. In certain embodiments, the volume is about 450 µL. In certain embodiments, the volume is about 500 µL. In certain embodiments, the volume is about 600 µL. In certain embodiments, the volume is about 750 µL. In certain embodiments, the volume is about 850 µL. In certain embodiments, the volume is about 1000 µL. An effective concentration of a rAAV carrying a nucleic acid sequence encoding the desired transgene under the control of the cell-specific promoter sequence desirably ranges from about 10^7 to about 10^13 vector genomes per milliliter (vg/mL) (also called genome copies/mL (GC/mL)). The rAAV infectious units, in certain embodiments, are measured as described in S.K. McLaughlin et al. (1988) J. VIROL., 62: 1963, which is incorporated herein by reference.
[00205] In certain embodiments, the concentration in the target tissue is from about 1.5 x 10^9 vg/mL to about 1.5 x 10^12 vg/mL, such as from about 1.5 x 10^9 vg/mL to about 1.5 x 10^11 vg/mL. In certain embodiments, the effective concentration is about 2.5 x 10^10 vg to about 1.4 x 10^11. In certain embodiments, the effective concentration is about 1.4 x 10^8 vg/mL. In certain embodiments, the effective concentration is about 3.5 x 10^10 vg/mL. In certain embodiments, the effective concentration is about 5.6 x 10^11 vg/mL. In certain embodiments, the effective concentration is about 5.3 x 10^12 vg/mL. In certain embodiments, the effective concentration is about 1.5 x 10^12 vg/mL. In certain embodiments, the effective concentration is about 1.5 x 10^13 vg/mL. In certain embodiments, the effective dosage (total genome copies delivered) is from about 10^7 to 10^13 vector genomes. It is desirable that the lowest effective concentration of virus be utilized in order to reduce the risk of undesirable effects, such as toxicity. Still other dosages and administration volumes in these ranges may be selected by the attending physician, taking into account the physical state of the subject, such as a human, being treated, the age of the subject, the particular disorder and the degree to which the disorder, if progressive, has developed.
[00206] In certain embodiments, the vector is administered at a dose between 2.5 x 10^10 vg/kg and 1.4 x 10^11 vg/kg. In certain embodiments, the vectors are administered at a dose between 1.0 x 10^11 vg/kg and 1.5 x 10^13 vg/kg. In certain embodiments, the vectors are administered at a dose between 1.0 x 10^11 vg/kg and 1.5 x 10^12 vg/kg.
[00207] In certain embodiments, the vectors are administered at a dose of about 1.4 x 10^12. In certain embodiments, the vectors are administered at a dose of 1.4 x 10^12 vg/kg.
[00208] In certain embodiments, the pharmaceutical compositions of the disclosure comprise a pharmaceutically acceptable carrier. In certain embodiments, the pharmaceutical compositions of the disclosure comprise PBS. In certain embodiments, the pharmaceutical compositions of the disclosure comprise PLURONIC®. In certain embodiments, the pharmaceutical compositions of the disclosure comprise PBS, NaCl, and PLURONIC®. In certain embodiments, the vectors are administered by intravitreal injection in a solution of PBS with additional NaCl and PLURONIC®.
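The relationship between a weight-based dose (vg/kg), the total vector genomes delivered, and the volume drawn from a stock of known titer can be sketched as follows. This is a minimal illustration only: the 25 g body mass and the 1.0 x 10^13 vg/mL stock titer are hypothetical values chosen for the arithmetic and are not taken from the disclosure, and the function names are likewise illustrative.

```python
def total_vector_genomes(dose_vg_per_kg: float, body_mass_kg: float) -> float:
    """Total vector genomes (vg) for a weight-based dose."""
    return dose_vg_per_kg * body_mass_kg


def injection_volume_ul(total_vg: float, titer_vg_per_ml: float) -> float:
    """Volume in microliters needed to deliver total_vg from a stock of the given titer."""
    return total_vg / titer_vg_per_ml * 1000.0


# One dose recited in the text: 1.4 x 10^12 vg/kg.
dose_vg_per_kg = 1.4e12
# Hypothetical (assumed) values, for illustration only.
body_mass_kg = 0.025       # e.g., a 25 g animal
stock_titer_vg_ml = 1.0e13

total_vg = total_vector_genomes(dose_vg_per_kg, body_mass_kg)
volume_ul = injection_volume_ul(total_vg, stock_titer_vg_ml)
print(f"total: {total_vg:.2e} vg, volume: {volume_ul:.2f} uL")
```

Under these assumed inputs the computed volume can be compared against the delivery volumes recited above to check that a given titer is practical for the chosen route.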
IV. Kits
[00209] In certain embodiments, any of the endonuclease systems disclosed herein are assembled into a pharmaceutical, diagnostic, or research kit to facilitate their use in prophylactic, therapeutic, diagnostic, or research applications. A kit may include one or more containers housing any of the vectors including the endonuclease system disclosed herein and instructions for use.
[00210] The kit may be designed to facilitate use of the methods described herein by researchers and can take many forms. Each of the compositions of the kit, where applicable, may be provided in liquid form (e.g., in solution) or in solid form, (e.g., a dry powder). In certain cases, some of the compositions may be constitutable or otherwise processable (e.g., to an active form), for example, by the addition of a suitable solvent or other species (e.g., water or a cell culture medium), which may or may not be provided with the kit. As used herein, “instructions” can define a component of instruction and/or promotion, and typically involve written instructions on or associated with packaging of the disclosure. Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, e.g., audiovisual (e.g., videotape, DVD, etc.), internet, and/or web-based communications, etc. The written instructions may be in a form prescribed by a governmental agency regulating the manufacture, use, or sale of pharmaceuticals or biological products, which instructions can also reflect approval by the agency of manufacture, use, or sale for animal administration.
[00211] Throughout the description, where compositions are described as having, including, or comprising specific components, or where processes and methods are described as having, including, or comprising specific steps, it is contemplated that, additionally, there are compositions of the present invention that consist essentially of, or consist of, the recited components, and that there are processes and methods according to the present invention that consist essentially of, or consist of, the recited processing steps.
[00212] In the application, where an element or component is said to be included in and/or selected from a list of recited elements or components, it should be understood that the element or component can be any one of the recited elements or components, or the element or component can be selected from a group consisting of two or more of the recited elements or components.
[00213] Further, it should be understood that elements and/or features of a composition or a method described herein can be combined in a variety of ways without departing from the spirit and scope of the present invention, whether explicit or implicit herein. For example, where reference is made to a particular compound, that compound can be used in various embodiments of compositions of the present invention and/or in methods of the present invention, unless otherwise understood from the context. In other words, within this application, embodiments have been described and depicted in a way that enables a clear and concise application to be written and drawn, but it is intended and will be appreciated that embodiments may be variously combined or separated without parting from the present teachings and invention(s). For example, it will be appreciated that all features described and depicted herein can be applicable to all aspects of the invention(s) described and depicted herein.
[00214] Where the use of the term “about” is before a quantitative value, the present invention also includes the specific quantitative value itself, unless specifically stated otherwise. As used herein, the term “about” refers to a ±10% variation from the nominal value unless otherwise indicated or inferred.
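As an illustration of the ±10% convention for “about” stated above, the range implied by a nominal value can be computed as in the sketch below. The function names are illustrative, and the sketch assumes the default ±10% applies (that is, nothing "otherwise indicated or inferred").

```python
def about_range(nominal: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) bounds implied by 'about' a nominal value."""
    delta = abs(nominal) * tolerance
    return (nominal - delta, nominal + delta)


def is_about(value: float, nominal: float, tolerance: float = 0.10) -> bool:
    """True if value lies within the +/- tolerance band around nominal."""
    low, high = about_range(nominal, tolerance)
    return low <= value <= high


# "about 100 uL" covers roughly 90 to 110 uL under the default convention.
low, high = about_range(100.0)
print(low, high)
print(is_about(95.0, 100.0), is_about(80.0, 100.0))
```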
[00215] It should be understood that the order of steps or order for performing certain actions is immaterial so long as the present invention remains operable. Moreover, two or more steps or actions may be conducted simultaneously.
[00216] The use of any and all examples, or exemplary language herein, for example, “such as” or “including,” is intended merely to illustrate better the present invention and does not pose a limitation on the scope of the invention unless claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the present invention.
EXAMPLES
[00217] The following Examples are merely illustrative and are not intended to limit the scope or content of the invention in any way.
Example 1. In vitro characterization of editing performance using an endonuclease system with a compact promoter
[00218] Using an H1 compact promoter system to drive SpCas9 and gRNA expression, three regions in GFP were targeted and editing was compared with pX330, a widely used CRISPR plasmid. The CBh promoter (~800 bp) in pX330 is an engineered, strong Pol II promoter system that contains an enhancer and a hybrid intron, two additional elements that boost expression. The pX330 plasmid also contains the U6 promoter (a Pol III promoter). This demonstrates that the small endogenous mammalian promoter (H1) resulted in comparable editing activity to a strong promoter-enhancer, despite being less than half the size (FIG. 2). In addition, packaging of the constructs into an AAV resulted in robust editing (FIG. 3). No comparison to the CBh promoter of pX330 was shown because the pX330 system was far too big to package into a single AAV. AAV comprising the compact promoter were properly packaged with few disrupted virions (FIGs. 4A-4B), indicating that the promoter was sufficiently small to enable efficient packaging and editing.
Example 2. In vivo characterization of editing performance using an endonuclease system with a compact promoter
[00219] An endonuclease system having a compact promoter was used to characterize the efficiency of editing in vivo. The endonuclease system was packaged into an AAV9 vector and administered via injection to a transgenic eGFP mouse line. eGFP disruption in the liver and the heart, two organs for which AAV9 exhibits tropism (FIGs. 5A-5C), was targeted and quantified. The dosing levels (2 x 10^13 vg/kg) were several orders of magnitude lower than those described in other published reports using multiple viruses. As shown in FIG. 5A, eGFP-expressing mice that were administered the endonuclease system showed increasing levels of editing in the liver (FIG. 5B) and in the heart (FIG. 5C) at 14, 21, 28, and 42 days post administration. To the inventors’ knowledge, this was the first and only in vivo demonstration of gene-editing using SpCas9-gRNA delivered through a single AAV.
Example 3. Identification of target sites in the human HIF-2α
[00220] This Example describes identification of target nucleotide sequences in the human HIF-2α gene and characterization of HIF-2α transcript and protein levels.
[00221] Forty-two (42) Cas12a target sites in HIF-2α were identified, of which 15 are in regions expected to negatively impact mRNA splicing and protein expression. Fifteen (15) crRNA targets (SEQ ID NOs: 541, 545, 547, 549, 556, 650, 655, 666, and 695-701), a positive control, and a non-targeting control were cloned by synthesis into a pre-designed expression vector. Plasmid constructs were transfected into HeLa cells (Lipofectamine 3000) using conditions optimized for cell growth and plating, transfection efficiency, and plasmid concentration (transfection efficiency was used to determine the milestone threshold). HeLa cells were selected due to their robust expression of HIF-2α following activation by either hypoxia or small-molecule activators. At 48 h post-transfection, genomic DNA was extracted and amplified by PCR. PCR primers flanking the on-target site generated 150-500 bp single-band amplicons. Primer3 was used to identify 3 forward primers and 3 reverse primers for a given genomic locus, which enabled testing of 9 amplicons. PCR reactions were performed using a high-fidelity DNA polymerase (Platinum SuperFi II, Thermo Fisher), then analyzed on PAGE gels stained with SYBR-Gold and visualized with a UV transilluminator to identify single-band amplifications. This sensitive approach enabled the selection of highly specific amplicons, as agarose gel and ethidium bromide staining often fail to detect spurious amplifications.
Amplicons were column-purified and quantified prior to Sanger sequencing and TIDE (Tracking of Indels by Decomposition) or ICE analysis (Inference of CRISPR Edits) (Brinkman et al. (2014) NUCLEIC ACIDS RES 42, e168, doi: 10.1093/nar/gku936; Synthego Performance Analysis, ICE Analysis. 2019. v2.0. Synthego). Editing will be further verified in 2 samples with next generation amplicon sequencing (NGS) using slightly modified primers that append partial Illumina adapter sequences (TABLE 3). NGS reads will be analyzed using the command line utility CRISPResso (Pinello, L. et al. (2016) NAT BIOTECHNOL 34, 695-697, doi: 10.1038/nbt.3583). DNA editing and repair alleles will be confirmed by NGS in two additional human cell lines (A549 and HEK293). Two targets (plus 1 non-targeting control) will be advanced, based on (i) editing levels and (ii) repair alleles leading to protein disruption.
TABLE 3. Gene editing analysis
Figure imgf000060_0001
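The primer screen described in Example 3 pairs each of 3 forward primers with each of 3 reverse primers to give 9 candidate amplicons, with 150-500 bp as the accepted size window. A minimal sketch of that enumeration follows; the primer names and coordinates are hypothetical placeholders rather than values from the disclosure, and in practice the coordinates would come from a primer-design tool's output.

```python
from itertools import product

# Hypothetical primer coordinates on a reference locus (1-based, illustrative only).
# Forward value: position of the primer's 5' end; reverse value: last base of the amplicon.
forward_primers = {"F1": 100, "F2": 180, "F3": 260}
reverse_primers = {"R1": 420, "R2": 520, "R3": 700}

MIN_BP, MAX_BP = 150, 500  # amplicon size window stated in the text


def candidate_amplicons(forward: dict, reverse: dict) -> list[tuple[str, str, int]]:
    """Enumerate every forward/reverse pairing with its amplicon length in bp."""
    return [
        (f_name, r_name, r_end - f_start + 1)
        for (f_name, f_start), (r_name, r_end) in product(forward.items(), reverse.items())
    ]


all_pairs = candidate_amplicons(forward_primers, reverse_primers)
in_window = [p for p in all_pairs if MIN_BP <= p[2] <= MAX_BP]

print(len(all_pairs), "candidate amplicons;", len(in_window), "within the size window")
for f_name, r_name, length in in_window:
    print(f"{f_name}/{r_name}: {length} bp")
```

Size filtering is only a first pass; single-band specificity still has to be confirmed on a gel as described above.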
[00222] To confirm HIF-2α disruption, the 2 selected targets will be assessed for transcript processing and protein expression (TABLE 4). Plasmid constructs will be transfected into HeLa cells, followed by activation of HIF-2α expression with chemically-induced hypoxia (150 µM CoCl2) or dH2O vehicle control for 24 h. RNA will be extracted from cells and treated with DNase I. cDNA will be generated using an oligo(dT) primer (High-Capacity cDNA RT Kit, Thermo Fisher), followed by PCR with HIF-2α cDNA primers spanning the DNA target site, column-purification, and sequencing to detect mRNA sequence changes. For Western Blot, cells will be lysed in RIPA buffer containing a protease inhibitor cocktail. Protein (10-30 µg) will be separated by SDS-PAGE, transferred to PVDF, and blotted using primary antibodies against HIF-2α and β-actin. Blots will be probed with HRP-conjugated secondary antibodies and visualized using chemiluminescence. Protein bands corresponding to full-length HIF-2α will be quantified using ImageJ and normalized to β-actin. HIF-2α expression will be compared across treatment groups for statistical significance by ANOVA, and targets that reduce HIF-2α protein by > 50% will be advanced for AAV validation studies based on a target of 50% inhibition (a level that is protective in mouse models) multiplied by an expected plasmid transfection efficiency in HeLa cells determined above.
TABLE 4. HIF-2α transcript and protein analyses
Figure imgf000061_0001
[00223] The transfection efficiency determined in the optimization experiments will be used to determine the precise threshold criteria. For example, if the plasmid transfection efficiency in cells is determined to be 60%, the threshold for success will be a 30% overall reduction (60% transfection efficiency × 50% therapeutic threshold).
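The threshold arithmetic in the preceding paragraph can be written out as a short sketch. The function names are hypothetical, and fractions (0.60 for 60%) are used in place of percentages.

```python
def overall_reduction_threshold(transfection_efficiency: float,
                                therapeutic_threshold: float = 0.50) -> float:
    """Overall HIF-2a reduction required in a partially transfected culture.

    Both arguments are fractions between 0 and 1. The default of 0.50 is the
    50% inhibition target stated in the text.
    """
    if not 0.0 <= transfection_efficiency <= 1.0:
        raise ValueError("transfection_efficiency must be between 0 and 1")
    return transfection_efficiency * therapeutic_threshold


def meets_threshold(observed_reduction: float,
                    transfection_efficiency: float) -> bool:
    """True if the observed overall reduction reaches the scaled threshold."""
    return observed_reduction >= overall_reduction_threshold(transfection_efficiency)
```

With a transfection efficiency of 0.60, the required overall reduction is 0.30, matching the worked example in the text.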
Example 4. In vitro hypoxia model validation
[00224] Cas12a target sites will be packaged into an AAV vector and HIF-2a pathway disruption will be assessed following transduction of human pulmonary artery endothelial cells (hPAECs). AAV serotype specificity and transduction efficiency in hPAECs are not well established. AAV transduction will be optimized in hPAECs using an AAV serotype testing kit to test 12 AAV serotypes using a range of MOIs (1×10^3 to 1×10^5). The AAV serotype with the greatest transduction efficiency without visible cellular toxicity will be used for packaging. A minimal threshold of 30% transduction will be set for hPAECs.
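The serotype selection rule described above (greatest transduction efficiency without visible toxicity, subject to a 30% minimum) can be sketched as follows. The record type, field names, and any screening values passed to it are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScreenResult:
    serotype: str               # label of the tested serotype
    moi: float                  # multiplicity of infection used
    transduced_fraction: float  # fraction of cells transduced (0 to 1)
    visible_toxicity: bool      # True if cellular toxicity was observed


def select_serotype(results: List[ScreenResult],
                    min_transduction: float = 0.30) -> Optional[ScreenResult]:
    """Pick the non-toxic condition with the highest transduction.

    Returns None if no non-toxic condition reaches the minimum threshold.
    """
    candidates = [r for r in results if not r.visible_toxicity]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: r.transduced_fraction)
    return best if best.transduced_fraction >= min_transduction else None
```

Excluding toxic conditions before ranking means a high-efficiency but toxic serotype is never selected, which is how the sentence in the text reads.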
[00225] Two targeting constructs and one non-targeting control will be cloned into ITR-containing plasmids and sequenced to verify the expression cassette. Validated plasmids will be amplified and maxi-prepped to generate sufficient endotoxin-free material. Plasmids will be sequenced and digested with SmaI to verify ITR integrity prior to packaging. Small-scale preps will be produced, and crude lysate will be titered by dPCR on a QuantStudio 3D Digital PCR system. Crude AAV lysates will be added to the cell culture media of hPAECs and incubated for 48 hours. Cells will then be exposed for 24 hours to hypoxia (1% O2), to induce HIF-2a activation, or to normoxia. DNA will be isolated and Sanger-sequenced to verify editing, followed by confirmation by NGS (TABLE 5).
TABLE 5. Data Editing Analysis
Figure imgf000062_0001
[00226] HIF pathway suppression will be determined in hPAECs grown in either hypoxic or normoxic conditions for 24h. Pathway inhibition will be benchmarked against 20 µM C76, a selective HIF-2a translation inhibitor that reduces activated HIF-2a to wild-type levels. Due to rapid degradation of HIF-2a, RNA and protein will be harvested immediately. RNA will be extracted and cDNA will be generated as described above. cDNA samples and RT controls will be used as templates for qPCR reactions containing primer-probe sets for HIF-1a and HIF-2a, as well as HIF-2a transcriptional targets, DLL4, ANGPT2, VEGFA, FGF2, and TIE2 (refs. 54-58); and GAPDH and HIF-1a as endogenous controls (TABLE 6).
[00227] HIF-1a and HIF-2a proteins will be quantified as described above.
TABLE 6. HIF-2a Pathway Analysis
Figure imgf000062_0002
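The text does not state how the qPCR data will be quantified. As one possibility, the widely used comparative Ct (2^-ΔΔCt) calculation against an endogenous control such as GAPDH is sketched below. It assumes roughly 100% amplification efficiency for both assays; the function name and argument names are hypothetical.

```python
def relative_expression(ct_target_treated: float,
                        ct_reference_treated: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Fold change of a target transcript by the comparative Ct method.

    Each delta-Ct normalizes the target (e.g. a HIF-2a transcriptional
    target) to the endogenous reference (e.g. GAPDH) within one sample.
    The result is expression in the treated sample relative to the control.
    """
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)
```

A value below 1 indicates lower expression in the treated sample than in the control, which is the direction expected for HIF-2a targets after successful pathway suppression.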
Example 5. In vivo Hif2a targeting in a murine pulmonary hypertension model assessing a prophylactic intervention strategy
[00228] Mouse HIF-2a Cas12a target sites were identified bioinformatically and will be tested in mouse cells for HIF-2a pathway disruption, as described for human targets above. Validated targets will be packaged with AAV-L1, and HIF-2a targeting will be assessed in vivo in a hypoxia mouse model for PH, a tractable model dependent on HIF-2a activation that recapitulates many hallmarks of PAH pathology. To address potential issues relating to the window of time for therapeutic intervention, two approaches, referred to as the prophylactic and therapeutic strategies, can be used. For the prophylactic strategy, AAV will be administered 3 weeks prior to the initiation of exposure to hypoxic conditions (FIG. 6). For the therapeutic strategy, AAV will be administered concomitantly with exposure (FIG. 7), thereby increasing the stringency through a reduction in the therapeutic window prior to hypoxia. For both models, the duration of hypoxia will be 4 weeks, and DNA editing, mean PA pressure (mPAP), hematocrit, RV hypertrophy, and pulmonary arterial remodeling will be determined.
[00229] Cas12a target sites in mouse Hif-2a will be tested to identify a lead target site for in vivo studies (targets matching the rat gene will be noted for subsequent studies). Forty (40) Cas12a target sites were identified in Hif-2a, of which 18 are in regions expected to negatively impact mRNA splicing and/or protein expression. These sites will be tested as above, except that mouse MLE-12 cells will be used according to TABLE 7:
TABLE 7: HIF-2a Editing and Expression Analyses
Figure imgf000063_0001
[00230] The lead target (and non-targeting control) will be cloned into an ITR-containing vector, as described. Packaging is done by triple-transfection of HEK293 cells, as described above, with a few exceptions: an AAV-L1 cap plasmid is used, and in vivo preps are then purified by iodixanol gradient centrifugation and concentrated. Each vector is subjected to standardized assessments such as titering by digital PCR, endotoxin quantification, and analysis of AAV prep purity and of VP1, VP2, and VP3 capsid protein stoichiometry by polyacrylamide gel electrophoresis followed by silver staining or SYPRO Red staining. Preps are further imaged using electron microscopy to visualize full and empty vector particles and viral prep integrity.
[00231] Male C57BL/6 mice will be randomized at 5 weeks of age to receive 2.5×10^12 vg/kg AAV-L1-Hif-2a or control via retroorbital injection (note: female mice have milder phenotypes in the hypoxia model). After 3 weeks of normoxia (to enable AAV-mediated expression), half of the mice will be exposed to 10% O2 (hypoxia) for 4 weeks while the other half will remain at normoxia for 4 weeks. This treatment strategy will be benchmarked against C76, which will be administered daily by intraperitoneal injection at 12.5 mg/kg (or 0.5% DMSO vehicle control) throughout the exposure.
[00232] Mouse PA endothelial cells will be isolated from AAV-treated hypoxic mice as previously described (see, Dong, Q. G. et al. (1997) ARTERIOSCLER THROMB VASC BIOL 17, 1599-1604; Marelli-Berg et al. (2000) J IMMUNOL METHODS 244, 205-215). DNA will be extracted and amplified using PCR primers specific to the mouse Hif-2a sequence, and column-purified amplicons will be analyzed by NGS, as described (TABLE 8).
TABLE 8: Prophylactic Treatment
Figure imgf000064_0001
[00233] All in vivo assays have been described in previous publications (Abud et al. (2012) PROC NATL ACAD SCI U S A 109: 1239-1244; Yu et al. (1999) J CLIN INVEST 103: 691-696; Walker et al. (2016) PHYSIOL REP 4, doi:10.14814/phy2.12702):
(i) mPAP. mPAP will be calculated from closed-chest RV diastolic and systolic pressures measured in anesthetized mice via a 23-gauge needle filled with heparinized saline and connected to a pressure transducer. The threshold of pathologic mPAP will be > 20 mmHg;
(ii) Hematocrit. Prolonged exposure to hypoxia produces polycythemia. Blood will be collected from the LV and placed in EGTA-treated tubes. Plasma will be separated via low-speed centrifugation, and hematocrit will be measured using a microhematocrit capillary tube reader chart. The threshold of pathologic hematocrit in this model will be > 40%;
(iii) RV Hypertrophy. Under a dissecting microscope, the atria and extraneous vascular material will be removed from the heart. The RV wall will be separated from the LV and septum, and both portions will be quickly blotted dry and weighed. RV weight will be normalized to the combined weight of LV plus septum. The threshold of pathologic RV/(LV+S) in this model will be > 0.25;
(iv) Pulmonary Arterial Remodeling. A suture will be used to occlude the right lung, which will be removed for subsequent DNA analysis. The left lung will be inflated with 10% formalin, embedded in paraffin, and sectioned. It is expected that extension of smooth muscle into previously non-muscular vessels will be observed by confocal microscopy as an increase in small diameter vessels (<100 µm outer diameter) that are positive for smooth muscle-specific α-actin (SMA). Arterial medial thickness will be quantified by morphometric analysis of H&E-stained sections, while collagen deposition will be quantified from sections stained with Picrosirius red. The threshold of pathologic remodeling in this model will be > 40% SMA-positive vessels.
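The four pathologic thresholds listed in items (i) through (iv) can be collected into a single check, sketched below. The dictionary keys and function names are hypothetical, and the comparisons are strict (">") as printed in the text.

```python
# Thresholds as stated in items (i)-(iv); key names are illustrative only.
THRESHOLDS = {
    "mpap_mmhg": 20.0,         # mean pulmonary arterial pressure
    "hematocrit_pct": 40.0,    # hematocrit, percent
    "rv_over_lv_s": 0.25,      # RV / (LV + septum) weight ratio
    "sma_positive_pct": 40.0,  # percent SMA-positive small vessels
}


def rv_to_lv_plus_septum(rv_weight_mg: float, lv_plus_septum_mg: float) -> float:
    """RV weight normalized to the combined weight of LV plus septum."""
    if lv_plus_septum_mg <= 0:
        raise ValueError("LV plus septum weight must be positive")
    return rv_weight_mg / lv_plus_septum_mg


def pathologic_flags(measurements: dict) -> dict:
    """Map each endpoint to True when it exceeds its pathologic threshold."""
    return {name: measurements[name] > limit for name, limit in THRESHOLDS.items()}
```

Keeping each endpoint as a separate flag, rather than a single combined verdict, mirrors the way the endpoints are reported individually in the text.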
[00234] It is expected that PH phenotypes, RV Systolic Pressure, Hematocrit, RV Hypertrophy, and Pulmonary Arterial Remodeling will meet or exceed suppression due to C76 treatment.
Therapeutic Treatment Model
[00235] Male mice (N=72) will be randomized at 8 weeks of age according to the prophylactic table above. AAV injection and initiation of exposure will occur concomitantly and proceed for 4 weeks. Endpoints will be achieved if PH phenotypes, RV Systolic Pressure, Hematocrit, RV Hypertrophy, and Pulmonary Arterial Remodeling, meet or exceed suppression due to C76 treatment. Statistical significance across prophylactic and therapeutic strategies will be determined using ANOVA analysis.
Description of Procedures
[00236] Hypoxia Exposure. Hypoxia/normoxia exposures will be performed on C57BL/6 male mice starting at 8 weeks of age. Mice will be group housed (N=5 mice per cage) in filter-top cages that are placed in a hypoxia exposure chamber throughout the duration of exposure. Animals will be briefly (<5 min) removed from the chamber twice a week to enable changing of cages, replenishment of food and water, and determination of animal weights. The chamber will be flushed continuously with room air to prevent accumulation of carbon dioxide, and nitrogen will be injected as needed via a servo control system to maintain the chamber at 10% O2. All mice will be exposed to hypoxia continuously for 4 weeks. Control mice will be maintained in room air (normoxia) on racks next to the hypoxia chamber for the same duration and will be exposed to the same light/dark cycle and ambient temperatures.
Treatment Groups
[00237] AAV Injection (N=10 per group). Mice will receive a single dose of either AAV-L1-HIF-2a or AAV-L1-Control vector (2.5×10^12 viral genomes/kg body weight) via retroorbital vein injection at 5 weeks of age (prophylactic protocol) or 8 weeks of age (therapeutic protocol). For the prophylactic protocol, treatment will occur 3 weeks prior to initiation of hypoxia/normoxia exposure; while for the therapeutic protocol, treatment will occur concomitantly with initiation of exposure. Prior to injection, mice will be anesthetized with isoflurane using the drop technique and sufficient depth of anesthesia confirmed by a toe pinch. 50 µL of AAV (in saline) will be injected into the retroorbital vessels of anesthetized mice using a 26-30 gauge needle. Following injection, light pressure will be applied to the eye to control bleeding, and ophthalmic ointment will be applied to prevent eye drying. Animals will be immediately returned to their home cage and monitored for recovery from anesthesia before returning to the exposure chamber.
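The per-animal dose implied by the weight-based dosing and fixed injection volume above can be worked through in a short sketch. The dose and volume are taken from the text; the body weight passed in is an assumption supplied by the user, and the function names are hypothetical.

```python
DOSE_VG_PER_KG = 2.5e12      # viral genomes per kg body weight (from the text)
INJECTION_VOLUME_UL = 50.0   # injection volume in microliters (from the text)


def vector_genomes_for_mouse(body_weight_g: float) -> float:
    """Total viral genomes to deliver to one mouse of the given weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return DOSE_VG_PER_KG * body_weight_g / 1000.0


def required_titer_vg_per_ml(body_weight_g: float) -> float:
    """Vector concentration needed to deliver the dose in the fixed volume."""
    volume_ml = INJECTION_VOLUME_UL / 1000.0
    return vector_genomes_for_mouse(body_weight_g) / volume_ml
```

For an assumed 20 g mouse, this gives 5×10^10 viral genomes per animal, which in 50 µL corresponds to a titer of 1×10^12 vg/mL.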
[00238] C76/Vehicle Injection (N=8 per group). In both the prophylactic and therapeutic treatment protocols, a positive (C76) and negative (0.5% DMSO) control group will be included. Conscious mice will receive daily dosing of the HIF-2a translation inhibitor C76 (12.5 mg/kg) or DMSO vehicle control via intraperitoneal injection (100 µL). In the prophylactic protocol, mice will be dosed with C76 or DMSO throughout the entire 4-week exposure. In the therapeutic protocol, mice will be dosed only for the final 2 weeks of exposure.
AAV-mediated gene expression
[00239] Assessment of Pulmonary Hypertension. At the conclusion of exposure, pulmonary hypertension will be verified by measuring right ventricular diastolic and systolic pressures (hemodynamics) and by measuring right ventricular weight and hematocrit post-mortem. Mice will be anesthetized using pentobarbital (65 mg/kg i.p.) until they are fully sedated, as determined by a toe pinch, and then maintained under anesthesia throughout the duration of the procedure. The diaphragm will be visualized via a lateral incision of the abdomen, and a 23-gauge needle attached to a pressure transducer will be inserted into the central area of the right ventricle near the pulmonary artery by direct puncture. After pressure measurements are obtained (approximately 20 sec), the animal will be euthanized via exsanguination from the right ventricle to obtain blood for hematocrit measurement. At the conclusion of this measurement, the mice will be euthanized according to AVMA-approved methods. Hearts and lungs will be harvested from euthanized mice. mPAP will be determined using the equation: mPAP = 2/3 PADP + 1/3 PASP. Pulmonary hypertension will be defined as a resting mean pulmonary arterial pressure of > 20 mmHg.
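The mPAP equation and the pulmonary hypertension definition in the paragraph above can be expressed as a short sketch. PADP and PASP are taken here to be the diastolic and systolic pressures in mmHg, following the symbols in the text; the function names are hypothetical.

```python
def mean_pa_pressure(pasp_mmhg: float, padp_mmhg: float) -> float:
    """mPAP = 2/3 * PADP + 1/3 * PASP, all values in mmHg."""
    return (2.0 / 3.0) * padp_mmhg + (1.0 / 3.0) * pasp_mmhg


def is_pulmonary_hypertension(mpap_mmhg: float, threshold_mmhg: float = 20.0) -> bool:
    """True when resting mPAP exceeds the stated threshold (> 20 mmHg)."""
    return mpap_mmhg > threshold_mmhg
```

The diastolic pressure is weighted twice as heavily as the systolic pressure, so a rise in diastolic pressure moves the mean more than an equal rise in systolic pressure.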
INCORPORATION BY REFERENCE
[00240] The entire disclosure of each of the patent and scientific documents referred to herein is incorporated by reference for all purposes.
EQUIVALENTS
[00241] The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.

Claims

WHAT IS CLAIMED IS:
1. A non-naturally occurring endonuclease system comprising: a) a nucleotide sequence encoding a promoter; b) a nucleotide sequence encoding a polyribonucleotide (e.g., a guide RNA); and c) a nucleotide sequence encoding an endonuclease; wherein the promoter is operably linked to the sequence encoding the polyribonucleotide and to the sequence encoding the endonuclease, wherein a portion of the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene, and wherein the polyribonucleotide directs the endonuclease to the target sequence.
2. The endonuclease system of claim 1, wherein the promoter is a bidirectional promoter.
3. The endonuclease system of claim 1 or 2, wherein the promoter is an H1 promoter.
4. The endonuclease system of claim 3, wherein the H1 promoter is a bidirectional promoter comprising pol II and pol III activity.
5. The endonuclease system of any one of claims 1-4, wherein the promoter has a length of from 50 bp to 225 bp.
6. The endonuclease system of any one of claims 1-4, wherein the promoter has a length of from 50 bp to 200 bp.
7. The endonuclease system of any one of claims 1-4, wherein the promoter has a length of from 50 bp to 180 bp.
8. The endonuclease system of any one of claims 1-7, wherein the promoter comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 1-226, 242-521, and 171-175, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
9. The endonuclease system of any one of claims 1-8, wherein the DNA endonuclease is Cas9, Cas12, or MAD7.
10. The endonuclease system of claim 9, wherein the Cas9 endonuclease is selected from the group consisting of SpCas9, SaCas9, StCas9, NmCas9, and GeoCas9.
11. The endonuclease system of claim 9, wherein the Cas12 endonuclease is selected from the group consisting of Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, and Cas12i.
12. The endonuclease system of any one of claims 1-8, wherein the DNA endonuclease is selected from the group consisting of Cas14a, Cas14b, Cas14c, and Cas .
13. The endonuclease system of any one of claims 1-12, wherein the endonuclease is codon optimized for expression in a eukaryotic cell.
14. The endonuclease system of any one of claims 1-13, wherein the portion of the polyribonucleotide that hybridizes to the target sequence comprises a nucleotide sequence selected from any one of SEQ ID NOs: 538-710.
15. The endonuclease system of any one of claims 1-14, wherein the endonuclease system is incorporated into a single vector.
16. The endonuclease system of claim 15, wherein the single vector is a viral vector or a plasmid.
17. The endonuclease system of claim 16, wherein the single vector is an AAV vector.
18. The endonuclease system of claim 17, wherein the AAV vector comprises a nucleotide sequence having a length of from 3 kbp to 6 kbp.
19. The endonuclease system of claim 17 or 18, wherein the AAV vector is selected from the group consisting of AAV1, AAV2, and AAV5.
20. The endonuclease system of claim 19, wherein the AAV vector comprises a non-naturally occurring nucleotide sequence encoding a targeting peptide.
21. The endonuclease system of claim 20, wherein the targeting peptide confers cell type-specific tropism to the AAV vector.
22. The endonuclease system of claim 21, wherein the AAV vector confers tropism to a lung endothelial cell and/or a lung artery smooth muscle cell.
23. The endonuclease system of any one of claims 20-22, wherein the targeting peptide comprises from 5 to 25 amino acids.
24. The endonuclease system of claim 23, wherein the targeting peptide comprises an amino acid sequence of any one of SEQ ID NOs: 533-537 or an amino acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
25. The endonuclease system of claim 24, wherein the targeting peptide comprises an amino acid sequence comprising ESGHGYF (SEQ ID NO: 533), GHGYF (SEQ ID NO: 534), CGFECVRQCPER (SEQ ID NO: 535), CGSPGWVRC (SEQ ID NO: 536), or CARSKNKDC (SEQ ID NO: 537).
26. The endonuclease system of claim 25, wherein the AAV vector is AAV-L1.
27. The endonuclease system of any one of claims 1-26, wherein the hypoxia-induced gene comprises hypoxia-inducible factor-1-alpha (HIF1A), hypoxia-inducible factor-2-alpha (HIF2A), bone morphogenic protein receptor 2 (BMPR2), and activin receptor-like kinase 1 (ALK1).
28. A method of preventing or treating pulmonary hypertension (PH) or pulmonary arterial hypertension (PAH) in a subject in need thereof, the method comprising administering to the subject an adeno-associated viral (AAV) vector comprising: a) a nucleotide sequence encoding a promoter; b) a nucleotide sequence encoding a polyribonucleotide; and c) a nucleotide sequence encoding an endonuclease; wherein the promoter is operably linked to the sequence encoding the polyribonucleotide and to the sequence encoding the endonuclease, wherein a portion of the polyribonucleotide hybridizes with a target sequence of a hypoxia-induced gene in a cell of the subject, and wherein the polyribonucleotide directs the endonuclease to the target sequence.
29. The method of claim 28, wherein the promoter is a bidirectional promoter.
30. The method of claim 28 or 29, wherein the promoter is an H1 promoter.
31. The method of claim 30, wherein the H1 promoter is a bidirectional promoter comprising pol II and pol III activity.
32. The method of claim 31, wherein the pol II activity promotes expression of the endonuclease and the pol III activity promotes expression of the polyribonucleotide.
33. The method of any one of claims 28-32, wherein the promoter has a length of from 50 bp to 225 bp.
34. The method of claim 33, wherein the promoter has a length of from 50 bp to 200 bp.
35. The method of claim 34, wherein the promoter has a length of from 50 bp to 180 bp.
36. The method of any one of claims 28-35, wherein the promoter comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 1-226, 242-521, and 171-175, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
37. The method of any one of claims 28-36, wherein the endonuclease is Cas9, Cas12, or MAD7.
38. The method of claim 37, wherein the Cas9 endonuclease is selected from the group consisting of SpCas9, SaCas9, StCas9, NmCas9, and GeoCas9.
39. The method of claim 38, wherein the Cas12 endonuclease is selected from the group consisting of Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f1, Cas12g, Cas12h, and Cas12i.
40. The method of any one of claims 28-36, wherein the DNA endonuclease is selected from the group consisting of Cas14a, Cas14b, Cas14c, and Cas .
41. The method of any one of claims 28-40, wherein the endonuclease is codon optimized for expression in a eukaryotic cell.
42. The method of any one of claims 28-41, wherein the portion of the polyribonucleotide that hybridizes to the target sequence comprises a nucleotide sequence selected from any one of SEQ ID NOs: 538-710.
43. The method of any one of claims 28-42, wherein the AAV vector comprises a nucleotide sequence having a length of from 3 kbp to 6 kbp.
44. The method of any one of claims 28-43, wherein the AAV vector is selected from the group consisting of AAV1, AAV2, and AAV5.
45. The method of claim 44, wherein the AAV vector further comprises a non-naturally occurring nucleotide sequence encoding a targeting peptide.
46. The method of claim 45, wherein the targeting peptide confers cell type-specific tropism to the AAV vector.
47. The method of claim 46, wherein the AAV vector confers tropism to a lung endothelial cell and/or a lung artery smooth muscle cell.
48. The method of any one of claims 45-47, wherein the targeting peptide comprises 5 to 25 amino acids.
49. The method of claim 46, wherein the targeting peptide comprises an amino acid sequence of SEQ ID NO: 533 or an amino acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% identity thereto.
50. The method of claim 49, wherein the targeting peptide comprises an amino acid sequence comprising ESGHGYF (SEQ ID NO: 533).
51. The method of claim 50, wherein the AAV vector is AAV-L1.
52. The method of any one of claims 28-51, comprising administering the composition prophylactically, concurrently, or following onset of PH or PAH.
53. The method of any one of claims 28-52, wherein the hypoxia-induced gene comprises HIF1A, HIF2A, BMPR2, and ALK1.
54. The method of any one of claims 28-52, further comprising administering an inhibitor of the hypoxia-induced gene.
55. The method of claim 54, wherein the inhibitor is a small molecule inhibitor.
56. The method of claim 55, wherein the small molecule inhibitor is selected from the group including belzutifan, PT2385, vadadustat, KC7F2, CAY10585, 2-Methoxyestradiol, SYP-5, PT2399, N-Acetylcysteine amide, IDF-11774, Lificiguat (YC-1), PX-478 2HCl, BAY 87-2243, C76 (Methyl-3-(2-(cyano(methylsulfonyl)methylene)hydrazino)thiophene-2-carboxylate), Roxadustat (FG-4592), Daprodustat (GSK1278863), Desidustat (ZYAN-1), Molidustat (Bay 85-3934), MK-8617, IOX-2, 2-methoxyestradiol, GN-44028, AKB-4924, FG-2216, and FG-4497.
PCT/US2023/073368 2022-09-02 2023-09-01 Compact promoters for targeting hypoxia induced genes WO2024050548A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263403559P 2022-09-02 2022-09-02
US63/403,559 2022-09-02

Publications (3)

Publication Number Publication Date
WO2024050548A2 (en) 2024-03-07
WO2024050548A3 (en) 2024-04-11
WO2024050548A9 (en) 2024-05-23

Family

ID=90098790

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/073368 WO2024050548A2 (en) 2022-09-02 2023-09-01 Compact promoters for targeting hypoxia induced genes

Country Status (1)

Country Link
WO (1) WO2024050548A2 (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005035718A2 (en) * 2003-10-03 2005-04-21 Welgen, Inc. Biderectional promoters for small rna expression
US9790490B2 (en) * 2015-06-18 2017-10-17 The Broad Institute Inc. CRISPR enzymes and systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23861622

Country of ref document: EP

Kind code of ref document: A2