WO2022251096A1

WO2022251096A1 - Promoter sequence and related products and uses thereof

Info

Publication number: WO2022251096A1
Application number: PCT/US2022/030501
Authority: WO
Inventors: Sunghee CHAI; Markus Grompe
Original assignee: Oregon Health & Science University
Priority date: 2021-05-23
Filing date: 2022-05-23
Publication date: 2022-12-01

Abstract

Strong mini-promoters and methods of use thereof are disclosed.

Description

PROMOTER SEQUENCE AND RELATED PRODUCTS AND USES THEREOF

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 63/192,060, filed May 23, 2021. The foregoing application is incorporated by reference herein. This invention was made with government support under 5U01DK123608-02 awarded by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The government has certain rights in the invention.

COPYRIGHT NOTICE

© 2021 Oregon Health & Science University. A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. 37 CFR §1.71(d).

FIELD OF THE INVENTION

The present invention relates to the field of biotechnology. More specifically, the invention provides novel promoter sequences and related products and uses thereof.

BACKGROUND OF THE INVENTION

Several publications and patent documents are cited throughout the specification in order to describe the state of the art to which this invention pertains. Each of these citations is incorporated herein by reference as though set forth in full.

The most widely used strong ubiquitous promoters (CMV, CAG, etc.) in viral vectors are quite large (~1,000 bp). This can be extremely limiting in terms of the transgenes that can be expressed. For example, the AAV packaging limit is around 4.8 kb. After subtraction of the currently used promoters, this leaves only around 3.7 kb for the transgene in a single-stranded AAV and only around 1.2 kb for self-complementary recombinant AAVs. A need exists for smaller promoters that would allow for larger transgenes to be expressed.

SUMMARY OF THE INVENTION

In accordance with the present invention, promoters are provided. In certain embodiments, the promoter is derived from the human glucagon promoter. In certain embodiments, the promoter is a fragment of the human glucagon promoter, particularly a fragment comprising the 110 nucleotides 5’ from the transcription start site. In certain embodiments, the promoter comprises a sequence having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 2. In certain embodiments, the promoter comprises or consists of SEQ ID NO: 1 or SEQ ID NO: 2. In certain embodiments, the promoter comprises less than 150 nucleotides. In certain embodiments, the promoter is operably linked to a heterologous nucleic acid or transgene.

In accordance with another aspect of the instant invention, nucleic acid molecules comprising a promoter of the instant invention are also provided. In certain embodiments, the promoter is operably linked to a heterologous nucleic acid or transgene. In certain embodiments, the nucleic acid molecule is contained within a plasmid. In certain embodiments, the nucleic acid molecule is contained within a vector. In certain embodiments, the nucleic acid molecule is contained within a viral vector. In certain embodiments, the nucleic acid molecule is contained within an AAV vector. In certain embodiments, the nucleic acid molecule is contained within an AAV virion or an AAV capsid shell.

In accordance with another aspect of the instant invention, methods for enhancing or increasing the expression of a nucleic acid are provided. In certain embodiments, the method comprises expressing the nucleic acid molecule from a promoter of the instant invention. In certain embodiments, the method comprises introducing a nucleic acid molecule comprising a promoter of the instant invention operably linked to a nucleic acid to a cell, such as a mammalian cell or a human cell. Modified cells comprising a nucleic acid molecule comprising a promoter of the instant invention operably linked to a nucleic acid are also provided.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the nucleotide sequence (SEQ ID NO: 1) and organization of the GCG110 promoter. Nucleotides 1-110 of the nucleotide sequence are SEQ ID NO: 2. UTR: untranslated region.

Figure 2 provides a schematic of AAV vectors with different promoters (top). Images of the expression of the AAV vectors four days post transduction are also provided (bottom). ITR: inverted terminal repeat. Pm: promoter. mRFP: monomeric red fluorescent protein. WPRE: Woodchuck hepatitis virus posttranscriptional regulatory element. pA: SV40 polyA transcription termination signal. αTC1: mouse alpha cell line. MIN6: mouse insulinoma cells.

Figure 3 provides images of human islets transduced with AAVs with the indicated promoters at four days post transduction. Top row shows fluorescence. Bottom row provides microscopy images of the cells. CAG: strong synthetic promoter comprising the cytomegalovirus (CMV) early enhancer element and the promoter of the chicken beta-actin gene (Dou, et al. (2021) FEBS Open Bio, 11(1):95-104; Farokhimanesh, et al. (2010) Biotechnol. Prog., 26:505-511). INS84: 84 base pair promoter derived from human insulin promoter comprising the core promoter TATA box and the upstream CAAT box region located upstream of the transcription start +1 (WO 2020/219949).

Figures 4A and 4B provide flow cytometry graphs of islet cells transduced with AAVs expressing mRFP under the GCG110 promoter (Fig. 4A) or CAG promoter (Fig. 4B). Alpha cells (glucagon (GCG) positive) and beta cells (C-peptide (C-pep) positive) are indicated.

Figure 5 provides images of human hepatocytes expressing mRFP under the indicated promoters at the indicated days post transduction. CMV: cytomegalovirus promoter.

Figure 6 provides images of muscle tissue from mice injected with AAV expressing mRFP under control of the GCG110 promoter or the CAG promoter.

Figure 7 provides images of the indicated regions of the brain from mice injected with AAV expressing mRFP under control of the GCG110 promoter. Bottom panels provide contrast to show neurites.

Figure 8A provides schematics of two AAV constructs. The first AAV contains the GCG110 promoter (Pm) driving expression of mRFP. pA: polyadenylation signal sequence from SV40pA. WPRE: Woodchuck hepatitis virus posttranscriptional regulatory element. The second AAV contains the GCG110 promoter driving expression of a tri-cistronic construct composed of two transcription factors (Pdx1 and MafA) and a green fluorescent reporter eGFP. These three genes were fused by two ribosomal skipping sequences T2A and P2A. pA: polyadenylation signal sequence from bGHpA. ITR: inverted terminal repeat; I: intron; MRE: microRNA recognition elements. Figure 8B provides confocal microscopy images of MIN6 mouse insulinoma cells transduced with the indicated AAV constructs.

DETAILED DESCRIPTION OF THE INVENTION

Provided herein is a ubiquitous small promoter element capable of driving high yields of gene or nucleic acid expression. The promoter is shown to work in multiple mammalian cell types, including human and mouse cells. The promoter is also shown to cause high expression levels in cells of medical interest such as human hepatocytes and pancreatic islet cells. The promoter of the instant invention is comparable in strength to or stronger than the very strong ubiquitous promoters CMV and CAG while being significantly smaller in size (almost 10-fold).

Recombinant adeno-associated virus (rAAV) vectors have emerged as an efficient gene delivery tool. Discovery of various natural serotypes and the recent development of recombinant capsids have significantly advanced the transduction efficiency of rAAVs in a variety of cells and tissues. Current AAV vectors mainly rely on well-established promoters for gene expression. Specifically, CMV and CAG promoters are among the most frequently used strong promoters providing universal activity. The capacity of DNA packaging in AAV capsids is very limited (4.7 kb) and even less in self-complementary AAVs (about 2.3 kb). Hence, the large size of the existing strong promoters is a drawback in delivering genes and gene editing tools of large sizes, reaching the limits of the viral packaging capacity. The smaller size of the strong, universal promoters of the instant invention (e.g., 110 or 135 bp) allows for significantly greater capacity in AAV vectors.

In accordance with the instant invention, promoter sequences are provided. In certain embodiments, the promoter comprises a sequence of SEQ ID NO: 1, or a sequence having 80% identity to SEQ ID NO: 1. In certain embodiments, the promoter comprises a sequence of SEQ ID NO: 2, or a sequence having 80% identity to SEQ ID NO: 2. In certain embodiments, the promoter comprises SEQ ID NO: 1 or SEQ ID NO: 2. In certain embodiments, the promoter excludes tissue-specific transcription factor binding site sequences. In certain embodiments, the promoter has at least 80% sequence identity, at least 85% sequence identity, at least 90% sequence identity, at least 95% sequence identity, at least 98% sequence identity, or at least 99% sequence identity to SEQ ID NO: 1 or SEQ ID NO: 2. In certain embodiments, the promoter comprises about 135 nucleotides of which a 110 nucleotide segment at the 5'-end of the promoter provides a promoter region and a following 25 nucleotide segment provides a transcription start (+1) and a 5'-untranslated region (UTR). In certain embodiments, the promoter comprises less than 200 nucleotides, less than 150 nucleotides, less than 135 nucleotides, less than 130 nucleotides, less than 125 nucleotides, less than 120 nucleotides, less than 115 nucleotides, or less than 110 nucleotides. In certain embodiments, the promoter comprises at least 110 nucleotides, at least 115 nucleotides, at least 120 nucleotides, at least 125 nucleotides, at least 130 nucleotides, or at least 135 nucleotides. In certain embodiments, the promoter comprises about 110 to 135 nucleotides. In certain embodiments, the promoter comprises a fragment of the human glucagon promoter (e.g., the fragment comprises 200 or fewer nucleotides, 175 or fewer nucleotides, 150 or fewer nucleotides, 140 or fewer nucleotides, 135 or fewer nucleotides, 130 nucleotides or fewer, 125 nucleotides or fewer, 120 nucleotides or fewer, 115 or fewer nucleotides, or about 110 nucleotides, particularly wherein the fragment comprises at least 110 nucleotides and/or the fragment ends (3’) at the transcription start site or the fragment includes the 25 bp UTR of SEQ ID NO: 1 and the nucleotide sequence 5’ therefrom). In certain embodiments, the promoter comprises a sequence of 135 or fewer (e.g., 110) nucleotides from the human glucagon promoter. In certain embodiments, the promoter does not comprise a sequence of nucleotides from the human glucagon promoter that is more than 110 nucleotides 5’ of the transcription start site (+1). In certain embodiments, the promoter of the instant invention comprises the transcription start site (+1) of SEQ ID NO: 1.
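The 135-nucleotide layout described above (a 110-nucleotide promoter region followed by a 25-nucleotide segment carrying the transcription start and 5' UTR) can be illustrated with a minimal sketch. The sequence below is a hypothetical placeholder of the right length, not SEQ ID NO: 1, and the function name `split_promoter` is an assumption introduced only for illustration.

```python
# Hypothetical placeholder only: the actual SEQ ID NO: 1 is shown in Fig. 1
# and is not reproduced here. "N" stands for any nucleotide.
PLACEHOLDER_SEQ = "N" * 135

PROMOTER_LEN = 110  # 5'-end promoter region described in the text
UTR_LEN = 25        # transcription start (+1) plus 5' untranslated region


def split_promoter(seq, promoter_len=PROMOTER_LEN, utr_len=UTR_LEN):
    """Split a sequence into (promoter region, +1/5' UTR segment)."""
    expected = promoter_len + utr_len
    if len(seq) != expected:
        raise ValueError(f"expected {expected} nucleotides, got {len(seq)}")
    # The promoter region is the 5' portion; the UTR segment follows it.
    return seq[:promoter_len], seq[promoter_len:]


promoter_region, utr_segment = split_promoter(PLACEHOLDER_SEQ)
print(len(promoter_region), len(utr_segment))
```

Under these assumptions, the first element corresponds in position to nucleotides 1-110 and the second to nucleotides 111-135.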

In accordance with another aspect of the instant invention, nucleic acid molecules comprising a promoter of the instant invention are provided. In certain embodiments, the nucleic acid molecule comprises a promoter of the instant invention operably linked to a nucleic acid sequence, particularly a nucleic acid sequence to be expressed. In certain embodiments, the nucleic acid sequence operably linked to the promoter is a heterologous nucleic acid and/or transgene. Herein, the term “heterologous” is used for any combination of nucleic acid sequences that is not normally found operably linked in nature. In certain embodiments, a heterologous nucleic acid sequence operably linked to a promoter of the instant invention includes any sequence other than the one that naturally encodes glucagon, particularly human glucagon (e.g., Gene ID: 2641). In certain embodiments, the operably linked nucleic acid sequence comprises a coding sequence (e.g., encodes a polypeptide or protein). In certain embodiments, the operably linked nucleic acid sequence comprises a non-coding nucleic acid sequence (e.g., expresses RNA which does not encode a polypeptide or protein). In certain embodiments, the operably linked nucleic acid sequence encodes an inhibitory nucleic acid molecule (e.g., antisense RNA, interfering RNA, siRNA, shRNA, microRNA, etc.). In certain embodiments, the promoter of the instant invention is operably linked to more than one heterologous nucleic acid and/or transgene.

The invention also encompasses plasmids and vectors which comprise a promoter of the instant invention. In certain embodiments, the plasmid or vector comprises a nucleic acid molecule comprising a promoter of the instant invention. In certain embodiments, the plasmid or vector comprises a nucleic acid molecule comprising a promoter of the instant invention operably linked to a nucleic acid sequence, particularly a nucleic acid sequence to be expressed. In certain embodiments, the plasmid or vector comprises a nucleic acid molecule comprising a promoter of the instant invention operably linked to a nucleic acid sequence, wherein the nucleic acid sequence operably linked to the promoter is a heterologous nucleic acid and/or transgene.

The vector of the instant invention will typically be an expression vector. The vector may be a viral vector or a non-viral vector. Examples of viral vectors include, but are not limited to, adenoviral vectors, adeno-associated virus (AAV) vectors (e.g., of any serotype such as, without limitation AAV-1 to AAV-12, particularly AAV-1, AAV-2, AAV-4, AAV-5, AAV-6, AAV-7, AAV-8, or AAV-9; a hybrid AAV vector of two or more serotypes; or a recombinant AAV), lentiviral vectors and pseudo-typed lentiviral vectors (e.g., Ebola virus, vesicular stomatitis virus (VSV), and feline immunodeficiency virus (FIV)), herpes simplex virus vectors, vaccinia virus vectors, retroviral vectors, Moloney murine sarcoma virus, murine stem cell virus, human immunodeficiency virus, Semliki Forest virus, Sindbis virus, Venezuelan equine encephalitis virus, Kunjin virus, West Nile virus, dengue virus, vesicular stomatitis virus, measles virus, Newcastle disease virus, vaccinia virus, cytomegalovirus, or coxsackievirus. Viral vectors may be an integrating vector or a non-integrating vector. For example, some lentivirus vectors are integration-proficient while some AAV vectors can persist in cells in an episomal form. Viral vectors may provide transient, short term to long term expression of nucleic acids and/or transgenes.

In certain embodiments, the vector is an adeno-associated virus (AAV) vector. Recombinant viral (e.g., AAV) vectors have found broad utility for a variety of gene therapy applications. Their utility for such applications is due largely to the high efficiency of in vivo gene transfer achieved in a variety of organ contexts. AAV are known to infect a wide variety of cell types in vivo and in vitro by receptor-mediated endocytosis. Attesting to the overall safety of AAV vectors, infection with AAV leads to a minimal disease state in humans comprising mild flu-like symptoms.

As explained herein, the packaging capacity of AAV vectors is small. Due to their small size, the promoters of the instant invention allow for the insertion of a nucleic acid or transgene(s) of about 4.7 kb in single-stranded AAVs and about 2.3 kb in self-complementary AAVs. Previously, the most widely used strong ubiquitous promoters (CMV, CAG, etc.) used in rAAV vectors were quite large (~1,000 bp). Given the packaging limit of rAAV, this is limiting in terms of the nucleic acids or transgenes that can be expressed. After subtraction of large (~1,000 bp) promoters such as CMV and CAG, this leaves only around 3.7 kb for the nucleic acid or transgene in a single-stranded AAV and only around 1.2 kb for self-complementary rAAVs. The promoters of the instant invention allow for the insertion of nucleic acids or transgenes of about 4 kb in single-stranded AAVs and about 1.6 kb in self-complementary AAVs along with two inverted terminal repeats (ITRs) and a polyadenylation (poly A) transcription termination signal sequence. In certain embodiments, a single-stranded AAV comprises from about 3.75 kb to about 4.6 kb of nucleic acids and/or transgenes, about 3.8 kb to about 4.6 kb of nucleic acids and/or transgenes, about 3.9 kb to about 4.6 kb of nucleic acids and/or transgenes, or about 4.0 kb to about 4.6 kb of nucleic acids and/or transgenes. In certain embodiments, a self-complementary rAAV comprises from about 1.15 kb to about 2.0 kb of nucleic acids and/or transgenes, about 1.2 kb to about 2.0 kb of nucleic acids and/or transgenes, about 1.3 kb to about 2.0 kb of nucleic acids and/or transgenes, about 1.4 kb to about 2.0 kb of nucleic acids and/or transgenes, or about 1.5 kb to about 2.0 kb of nucleic acids and/or transgenes.
In certain embodiments, the viral vector comprises at least 3.7 kb of nucleic acids and/or transgenes, at least 3.8 kb of nucleic acid and/or transgenes, at least 3.9 kb of nucleic acids and/or transgenes, or at least 4.0 kb of nucleic acids and/or transgenes.
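The capacity argument above is simple subtraction, which the following minimal sketch makes explicit. It assumes the 4.7 kb single-stranded packaging figure quoted in the text and treats ITRs, polyadenylation signals, and other elements as an optional extra term, so its outputs are rough illustrations and will not match every figure in the text. The function name `remaining_capacity_bp` is hypothetical.

```python
def remaining_capacity_bp(packaging_limit_bp, promoter_bp, other_elements_bp=0):
    """Return the base pairs left for a transgene after fixed elements."""
    remaining = packaging_limit_bp - promoter_bp - other_elements_bp
    if remaining < 0:
        raise ValueError("fixed elements exceed the packaging limit")
    return remaining


SS_LIMIT_BP = 4700       # single-stranded AAV limit quoted in the text
LARGE_PROMOTER_BP = 1000  # approximate size of CMV or CAG per the text
GCG110_BP = 135           # promoter region plus +1/5' UTR per the text

with_large = remaining_capacity_bp(SS_LIMIT_BP, LARGE_PROMOTER_BP)
with_mini = remaining_capacity_bp(SS_LIMIT_BP, GCG110_BP)

print(with_large)              # room left with a ~1,000 bp promoter
print(with_mini)               # room left with the 135 bp promoter
print(with_mini - with_large)  # capacity gained by the swap
```

Passing a nonzero `other_elements_bp` (an assumed combined length for ITRs, polyA signal, and similar elements) lowers both results by the same amount and leaves the gain from the smaller promoter unchanged.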

In certain embodiments, the vector of the instant invention is a single-stranded AAV vector. In certain embodiments, the vector of the instant invention is a self-complementary AAV vector. In certain embodiments, the vector further comprises a control element or control sequence in addition to the promoter of the instant invention. For some applications, an expression construct may further comprise regulatory elements which serve to drive expression in a particular cell or tissue type. Such regulatory elements are known to those of skill in the art. The incorporation of tissue specific regulatory elements in the expression constructs of the present invention provides for at least partial tissue tropism.

The AAV vectors of the instant invention may encompass any vector that comprises or derives from components of AAV. In certain embodiments, the AAV vector is suitable to infect mammalian cells, including human cells. In certain embodiments, the AAV vector is suitable to infect any of a number of tissue types such as, without limitation: brain, heart, lung, skeletal muscle, liver, kidney, spleen, or pancreas. For example, the AAV serotype with the desired cell tropism may be selected or utilized. The AAV vector of the instant invention can be used to infect cells either in vitro or in vivo. In certain embodiments, the AAV vectors disclosed herein encode for a desired protein(s) or protein variant(s). The AAV vectors of the instant invention may be derived from one or more serotypes (e.g., “pseudotyped” AAV) or from various genomes (e.g., single-stranded or self-complementary).

The AAV vector is typically flanked by inverted terminal repeats (ITRs) (usually about 100-150 nucleotides in length). The AAV vector will generally comprise T-shaped hairpin structures at the 5’ and 3’ ends which form from a self-complementary region, thereby forming energetically stable double stranded regions. In addition to the promoter of the instant invention and the transgene or heterologous nucleic acid, AAV vectors may comprise at least part of the viral genome. Typically, the AAV particles of the present invention are produced from recombinant AAV viral vectors which are replication defective (e.g., lacking Rep and Cap genes). In certain embodiments, the AAV vectors lack all AAV coding sequences. In certain embodiments, the AAV vectors comprise a polyadenylation sequence.

Any of the nucleic acid molecules (e.g., promoter, nucleic acid molecule, vectors, AAV, etc.) of the present invention may be incorporated into compositions (e.g., pharmaceutical compositions). In certain embodiments, the compositions (e.g., pharmaceutical compositions) of the instant invention also contain a pharmaceutically acceptable carrier. Such carriers include any pharmaceutical agent that does not itself induce an immune response harmful to the individual receiving the composition, and which may be administered without undue toxicity. Pharmaceutically acceptable carriers include, but are not limited to, liquids such as water, saline, glycerol, sugars and ethanol. Pharmaceutically acceptable salts can also be included therein, for example, mineral acid salts such as hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates, malonates, benzoates, and the like. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles. A thorough discussion of pharmaceutically acceptable excipients is available in Remington's Pharmaceutical Sciences.

Methods of synthesizing AAV vectors of the instant invention are also encompassed herein. In certain embodiments, the methods comprise inserting the promoter of the instant invention into the AAV vectors. For example, the promoter of an AAV vector may be replaced with the promoter of the instant invention, thereby increasing the loading capacity of the AAV vector. In certain embodiments, the AAV vector is engineered from wild-type AAV by deleting viral genes such as the rep and cap genes and replacing these with the transgene and/or heterologous nucleic acid of interest under the control of the promoter of the instant invention. AAV may be synthesized in human cells (e.g., the human embryonic kidney cell line 293). For example, plasmids expressing the transgene and/or heterologous nucleic acid under the control of the promoter of the instant invention and a second plasmid supplying adenovirus helper functions (e.g., pAD5) along with a third plasmid containing the rep and cap genes (e.g., AAV-2 rep and cap genes) may be used to produce AAV vectors (e.g., AAV-2 vectors). Other AAV serotype cap genes (e.g., AAV-1, AAV-6, or AAV-8 cap genes) or recombinant capsid (e.g., KP1) may be expressed with other serotype rep genes and ITRs (e.g., AAV-2 rep gene and ITRs) to produce different vectors. AAV may be purified by repeated CsCl density gradient centrifugation and the titer of purified vectors determined by quantitative dot-blot hybridization.

Methods of enhancing expression of a nucleic acid molecule in a cell, such as a mammalian cell (e.g., human cell), are also encompassed by the instant invention. The methods may be performed in vitro or in vivo. In certain embodiments, the method comprises expressing the nucleic acid molecule from the promoter of the instant invention. The nucleic acid molecule may be coding (e.g., transgene) or non-coding. In certain embodiments, the cell is a pancreatic endocrine cell, a pancreatic endocrine alpha cell, a pancreatic endocrine beta cell, a beta cell in human islets, a hepatocyte, or a primary human hepatocyte. The nucleic acid molecules may be delivered to the cell by any means known in the art such as, without limitation: transfection, transduction, and infection. The methods can lead to transient or stable expression within host cells. The modified cells which comprise the promoter of the instant invention are also encompassed.

Definitions

The following definitions are provided to facilitate an understanding of the present invention.

The singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.

A “promoter” as used herein encompasses a DNA sequence that directs the binding of RNA polymerase and thereby promotes RNA synthesis (e.g., a minimal sequence sufficient to direct transcription). Promoters are generally located upstream or 5’ of the nucleic acid to be expressed. Promoter regions may comprise binding sites for transcription factors. Promoters and corresponding nucleic acid (RNA), protein or polypeptide expression may be ubiquitous (e.g., active in a wide range of cells, tissues and species) or cell-type specific, tissue-specific, or species specific. Promoters may be “constitutive” (e.g., continually active) or “inducible” (e.g., the promoter can be activated or deactivated by the presence or absence of biotic or abiotic factors).

A first nucleic acid sequence is “operably linked” with a second nucleic acid sequence when the first nucleic acid sequence is in a functional relationship with the second nucleic acid sequence. For example, a promoter is operably linked with a nucleic acid sequence when the promoter affects the transcription or expression of the nucleic acid sequence. In certain embodiments, operably linked nucleic acid sequences are generally contiguous. The distance to the operably linked nucleic acid may be variable, as long as the promoter of the present invention is capable of driving and/or regulating the transcription or expression of the operably linked nucleic acid.

“Pharmaceutically acceptable” indicates approval by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, and more particularly in humans.

A “carrier” refers to, for example, a diluent, adjuvant, preservative (e.g., thimerosal, benzyl alcohol), anti-oxidant (e.g., ascorbic acid, sodium metabisulfite), solubilizer (e.g., polysorbate 80), emulsifier, buffer (e.g., TrisHCl, acetate, phosphate), water, aqueous solutions, oils, bulking substance (e.g., lactose, mannitol), excipient, auxiliary agent or vehicle with which an active agent of the present invention is administered. Suitable pharmaceutical carriers are described in “Remington's Pharmaceutical Sciences” by E.W. Martin (Mack Publishing Co., Easton, PA); Gennaro, A. R., Remington: The Science and Practice of Pharmacy, (Lippincott, Williams and Wilkins); Lieberman, et al., Eds., Pharmaceutical Dosage Forms, Marcel Dekker, New York, N.Y.; and Kibbe, et al., Eds., Handbook of Pharmaceutical Excipients (3rd Ed.), American Pharmaceutical Association, Washington.

“Identity”, as used herein, refers to the percent identity between two polynucleotide or two polypeptide moieties. The term “substantial identity”, when referring to a nucleic acid, or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in about 90 to 100% of the aligned sequences. When referring to a polypeptide, or fragment thereof, the term “substantial identity” indicates that, when optimally aligned with appropriate gaps, insertions or deletions with another polypeptide, there is amino acid sequence identity in about 90 to 100% of the aligned sequences. The term “highly conserved” means at least 80% identity, preferably at least 90% identity, and more preferably, over 97% identity. In some cases, highly conserved may refer to 100% identity. Identity can be readily determined by one of skill in the art by, for example, the use of algorithms and computer programs known by those of skill in the art.

Alignments between sequences of nucleic acids or polypeptides may be performed using any of a variety of publicly or commercially available Multiple Sequence Alignment Programs, such as "Clustal W", accessible through Web Servers on the internet. Alternatively, Vector NTI utilities may also be used. There are also a number of algorithms known in the art that can be used to measure nucleotide sequence identity, including those contained in the programs described above.

A polynucleotide or polypeptide has a certain percent “sequence identity” to another polynucleotide or polypeptide, meaning that, when aligned, that percentage of bases or amino acids are the same when comparing the two sequences. Sequence similarity can be determined in a number of different manners. To determine sequence identity, sequences can be aligned using the methods and computer programs, including BLAST, available over the world wide web at ncbi.nlm.nih.gov/BLAST/. Another alignment algorithm is FASTA, available in the Genetics Computing Group (GCG) package (Madison, WI), a wholly owned subsidiary of Oxford Molecular Group, Inc. Other techniques for alignment are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, Calif., USA. Of particular interest are alignment programs that permit gaps in the sequence. The Smith-Waterman is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70: 173-187 (1997). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. See J. Mol. Biol. 48: 443-453 (1970).
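As a minimal illustration of the percent identity concept, the sketch below counts matching positions in two sequences that have already been aligned (equal length, with "-" marking gaps). It does not perform the alignment itself and is not how BLAST, FASTA, or GAP report identity; those programs apply their own conventions for gaps and alignment length. The function name and the choice to divide by the full alignment length are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns where both sequences carry the same residue.

    Assumes the inputs are already aligned to equal length; gap columns
    count toward the alignment length but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * matches / len(a)


# Toy sequences, not taken from SEQ ID NO: 1 or SEQ ID NO: 2.
print(percent_identity("ACGT", "ACGA"))    # 3 of 4 columns match
print(percent_identity("AC-GT", "ACTGT"))  # 4 of 5 columns match
```

A threshold such as "at least 80% sequence identity" could then be checked by comparing the returned value against 80.0, bearing in mind that a different denominator convention would give a different number.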

With reference to nucleic acids of the invention, the term “isolated nucleic acid” is sometimes used. This term, when applied to DNA, refers to a DNA molecule that is separated from sequences with which it is immediately contiguous (in the 5' and 3' directions) in the naturally occurring genome of the organism from which it originates. For example, the “isolated nucleic acid” may comprise a DNA or cDNA molecule inserted into a vector, such as a plasmid or virus vector, or integrated into the DNA of a prokaryote or eukaryote. With respect to RNA molecules of the invention, the term “isolated nucleic acid” primarily refers to an RNA molecule encoded by an isolated DNA molecule as defined above. Alternatively, the term may refer to an RNA molecule that has been sufficiently separated from RNA molecules with which it would be associated in its natural state (i.e., in cells or tissues), such that it exists in a “substantially pure” form.

The term “vector” refers to a vehicle or a carrier nucleic acid molecule (e.g., RNA or DNA) into which a nucleic acid sequence can be inserted for introduction into a host cell, where it may be replicated. A “non-viral vector” refers to any vector that does not comprise a virus. A vector may be DNA or RNA and may be single- or double-stranded. A vector may comprise at least one origin of DNA replication and/or at least one selectable marker gene. A vector may comprise one or more transgenes or heterologous nucleic acids. Examples of vectors include, without limitation: a plasmid, cosmid, bacteriophage, bacterial artificial chromosome (BAC), or virus. Vectors may be used to transduce, transform, or infect a cell.

An “expression vector” is a specialized vector that contains a gene or nucleic acid sequence with the necessary regulatory regions (e.g., promoter) needed for expression, such as in a host cell.

The following example is for illustration only. In light of this disclosure, those of skill in the art will recognize that variations of these examples and other embodiments of the disclosed subject matter are enabled without undue experimentation.

EXAMPLE

Methods

Human pancreatic islets from non-diabetic donors were obtained from the Integrated Islet Distribution Program (IIDP) at City of Hope. Human hepatocytes were obtained from Oregon Health and Science University. Mouse anti-glucagon (GCG) and rat anti-C-peptide (C-pep) antibodies were used to label alpha and beta cells, respectively. Protocols for human islet medium and medium for primary culture hepatocytes are well established. The full-length glucagon promoter GCG650 (650 base pairs) was obtained from the University of Pennsylvania. GCG324 and GCG110 are deletion derivatives of GCG650. For construction of GCG110, 135 bp oligonucleotides corresponding to the top and bottom strands of the mini promoter were synthesized (Integrated DNA Technologies (IDT)) and hybridized to form a duplex molecule. Subsequently, the GCG110 fragment was cloned into an AAV vector containing an intron and mRFP. For AAV vectors with control promoters, CAG, CMV, and INS84 were cloned into the same vector backbone as AAV-GCG110 in the place of GCG110. Viral particles of AAV were produced using a polyethylenimine (PEI)-mediated transfection method with the transplasmids (plasmids containing the AAV vector), a plasmid encoding recombinant capsid KP1, and pAD5 (Adeno helper plasmid) in HEK293 cells. After incubation for 6 days, culture supernatants were collected for concentration of AAV particles using the polyethylene glycol 8000 (PEG 8000) precipitation method. For transduction of cells and islets, a multiplicity of infection (moi) of 10⁵ was applied.
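The dosing arithmetic implied by an moi of 10⁵ can be sketched as follows. This is only an illustration: it assumes that moi here means vector genomes per cell, and the cell count and stock titer are hypothetical values, not figures from the experiments described.

```python
def vector_genomes_needed(cell_count, moi):
    """Total vector genomes required to reach the given moi."""
    return cell_count * moi


def stock_volume_ul(cell_count, moi, titer_vg_per_ul):
    """Microliters of vector stock needed, given a titer in genomes per microliter."""
    if titer_vg_per_ul <= 0:
        raise ValueError("titer must be positive")
    return vector_genomes_needed(cell_count, moi) / titer_vg_per_ul


MOI = 10**5               # moi stated in the Methods
CELLS = 50_000            # hypothetical number of cells per well
TITER_VG_PER_UL = 10**9   # hypothetical stock titer

print(vector_genomes_needed(CELLS, MOI))
print(stock_volume_ul(CELLS, MOI, TITER_VG_PER_UL))
```

With these assumed inputs, the required dose scales linearly with the number of cells, and the volume scales inversely with the stock titer.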

Summary of the GCG110 Promoter

GCG110 is a mini promoter driven by the RNA polymerase II complex for transcription of genes and other DNA elements, such as in a viral or plasmid expression vector. As shown in Fig. 1, GCG110 consists of 135 nucleotides, of which the 110 base pair (bp) segment at the 5'-end serves as a promoter and the following 25 bp segment as the transcription start (+1) and the 5' untranslated region (UTR). The 135 bp sequence is derived from the human glucagon promoter, located directly upstream of the proglucagon gene. The 110 bp sequence contains the core promoter with TATA box for the binding of RNA polymerase and the G1 box that defines the glucagon gene specific expression unit.

Construct of an AAV vector Containing the GCG110 Promoter and Expression in Mouse Cultured Cells

To test the activity of the GCG110 promoter, an adeno-associated virus (AAV) expression vector was used. The GCG110 promoter was cloned upstream of a reporter gene encoding monomeric red fluorescent protein (mRFP) (Fig. 2). Two control promoters were also cloned into the same AAV vector backbone in place of GCG110. The control promoters are the full-length (GCG650) and the half-length (GCG324) glucagon promoters, which are 650 bp and 324 bp in length, respectively. All three AAV constructs were used to produce AAV particles. The mouse cell lines αTC1 (alpha cells) and MIN6 (insulinoma beta cells) were transduced with the AAVs and visualized at day 4 after transduction. GCG110 showed strong promoter activity in both mouse cell lines, and its activity was higher than that of the GCG650 or GCG324 promoters (Fig. 2). As shown in Fig. 2, the activity of the full-length glucagon promoter is highly regulated, and significant expression is observed only in alpha cells of pancreatic islets or in αTC1 cells. In contrast, GCG110 does not show alpha cell-specific expression, as it is active in both alpha (αTC1) and beta (MIN6) cells. Surprisingly, despite lacking enhancer regions, GCG110 is substantially stronger than the GCG650 promoter or the GCG324 promoter (Fig. 2).

Universal Activity of the GCG110 Promoter in Human Islet Cells

As shown in Fig. 3, human islets were transduced with AAVs expressing the mRFP transgene under the GCG110 promoter or one of two control promoters, CAG and INS84 (a universal mini promoter). At day 4 after transduction, the overall intensity of mRFP was high in all transduced islets. mRFP expression from GCG110 was as strong as or stronger than that from CAG, indicating that GCG110 promoter activity is at least comparable to that of the CAG promoter in human islets.

To identify the islet cell population(s) expressing mRFP under the GCG110 promoter, the islets were dissociated and the two major cell populations, i.e., alpha and beta cells, were immunolabelled with anti-glucagon (alpha cell marker) and anti-C-peptide (beta cell marker) antibodies for flow cytometric analysis. As shown in Figs. 4A and 4B, both GCG110 and CAG showed universal promoter activity, with mRFP expression in all cell types in relation to their cell mass in islets.

GCG110 Promoter in Human Hepatocytes

The promoter activity of GCG110 was also tested in primary cultures of human hepatocytes, with other universal promoters as controls. As shown in Fig. 5, human hepatocytes were plated and transduced with AAVs expressing the mRFP transgene under the control of the respective promoter. Images of mRFP expression in the transduced cells were taken at different time points. CAG is known to be an early promoter and showed mRFP expression as early as day 3, while the other promoters did not show such robust early activity. At day 4, however, all cell groups showed mRFP expression at similar intensity. GCG110 was as active as CAG by day 6, indicating that GCG110 is active in hepatocytes and that its promoter strength is comparable to that of CAG and higher than that of CMV and INS84 in these cells.

GCG110 activity in mouse skeletal muscle

Adult mice were intramuscularly injected with AAV expressing an mRFP transgene from the GCG110 promoter. Confocal microscopy was performed 4 weeks after the AAV injection on muscle tissue that had been fixed with 4% paraformaldehyde. Nuclei of cells were stained with Hoechst 33342. As seen in Figure 6, GCG110 shows strong promoter activity in these muscle cells. AAVs with the CAG promoter expressing mRFP were tested in the same experiment to compare promoter activity. As seen in Figure 6, the GCG110 promoter has similar activity to the CAG promoter.

GCG110 activity in mouse brain

AAV expressing an mRFP transgene from the GCG110 promoter was injected into the brain of an adult mouse (10 weeks old). Three weeks after the injection, the brain was fixed in 4% paraformaldehyde and embedded in optimal cutting temperature compound (OCT compound) for cryosectioning and confocal microscopy. As seen in Figure 7, different regions of the brain along the injection site show mRFP expression. The lower panels of Figure 7 show the neurons expressing mRFP in white to show neurites. GCG110 expressed mRFP in areas of the hippocampus and thalamus.

Transgene expression from GCG110

Figure 8A provides schematics of two AAV constructs. The first AAV contains the GCG110 promoter driving expression of mRFP (0.7 kb). The second AAV contains the GCG110 promoter driving expression of a tri-cistronic construct composed of two transcription factors (Pdx1 and MafA) and a green fluorescent reporter, eGFP (2.8 kb). These three genes were fused by two ribosomal skipping sequences, T2A and P2A. The second AAV also contains a microRNA recognition element (MRE), which contains a DNA sequence complementary to a microRNA sequence. More specifically, the MRE was used to downregulate expression of the Pdx1-MafA-eGFP transcript, through MRE-microRNA interaction, in cells expressing a cognate microRNA.
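To put these insert sizes in context, the arithmetic below is a rough Python sketch of how much packaging room remains once the promoter and transgene are accounted for. It uses the approximate 4.8 kb AAV packaging limit and the roughly 1,000 bp size of conventional promoters stated in the Background, together with the 135-nucleotide GCG110 fragment and the 0.7 kb and 2.8 kb inserts described above. The function name is hypothetical, and the calculation deliberately ignores the intron, the MRE, the polyadenylation signal, and the inverted terminal repeats, whose sizes are not given here.

```python
# Rough packaging budget; all sizes in base pairs.
PACKAGING_LIMIT_BP = 4800        # approximate AAV limit stated in the Background
GCG110_BP = 135                  # 110 bp promoter + 25 bp start site / 5' UTR
CONVENTIONAL_PROMOTER_BP = 1000  # approximate size of CMV/CAG per the Background


def remaining_capacity(promoter_bp: int, transgene_bp: int,
                       limit_bp: int = PACKAGING_LIMIT_BP) -> int:
    """Base pairs left for other vector elements (negative means over the limit)."""
    return limit_bp - promoter_bp - transgene_bp


inserts = {"mRFP": 700, "Pdx1-MafA-eGFP": 2800}
for name, size in inserts.items():
    with_gcg110 = remaining_capacity(GCG110_BP, size)
    with_large = remaining_capacity(CONVENTIONAL_PROMOTER_BP, size)
    print(f"{name}: {with_gcg110} bp left with GCG110, "
          f"{with_large} bp left with a ~1 kb promoter")
```

Under these assumptions, replacing a promoter of about 1 kb with GCG110 frees 865 bp regardless of the insert, which matters most for larger payloads such as the tri-cistronic construct.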

The AAVs were packaged with the recombinant capsid KP1. Three days after the AAV transduction, MIN6 cells were fixed and processed for immunofluorescence detection of the transgenes. The fusion proteins Pdx1-T2A and MafA-P2A were visualized by immunostaining of the 2A peptide using a primary mouse monoclonal antibody against the 2A peptide (Novus Biologicals, 3H4) and a secondary goat anti-mouse antibody conjugated with Alexa Fluor®-555.

Figure 8B provides confocal microscopic images of MIN6 mouse insulinoma cells that were transduced with the AAVs. Nuclear staining with DAPI shows both transduced and untransduced cells. The transgene mRFP is expressed in most cells. The tri-cistronic AAV expresses all three proteins, i.e., Pdx1-T2A, MafA-P2A, and eGFP. While these proteins are translated from a tri-cistronic mRNA, their expression levels differ due to ribosome skipping at the 2A sequences. Consequently, the transduced cells show different levels of 2A-fused proteins and eGFP. Co-localization of red and green signal indicates expression of all three proteins in the same cell.

While certain of the preferred embodiments of the present invention have been described and specifically exemplified above, it is not intended that the invention be limited to such embodiments. It will be obvious to those having skill in the art that many changes may be made to the details of the above-described embodiments without departing from the underlying principles of the invention. The scope of the present invention should, therefore, be determined only by the following claims.

Claims

WHAT IS CLAIMED IS

1. A nucleic acid molecule comprising a promoter operably linked to a heterologous nucleic acid sequence, wherein said promoter comprises a sequence having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 2.

2. The nucleic acid molecule of claim 1, wherein said promoter has at least 95% sequence identity to SEQ ID NO: 1 or SEQ ID NO: 2.

3. The nucleic acid molecule of claim 1, wherein said promoter comprises SEQ ID NO: 1 or SEQ ID NO: 2.

4. The nucleic acid molecule of any one of claims 1-3, wherein said promoter comprises less than 150 nucleotides.

5. An expression vector comprising the nucleic acid molecule of any one of claims 1-4.

6. The expression vector of claim 5, wherein the expression vector comprises a plasmid, an adeno-associated virus (AAV) vector, single-stranded AAV, self-complementary AAV, adenovirus, Moloney murine sarcoma virus, murine stem cell virus, human immunodeficiency virus, Semliki Forest virus, Sindbis virus, Venezuelan equine encephalitis virus, Kunjin virus, West Nile virus, dengue virus, vesicular stomatitis virus, measles virus, Newcastle disease virus, vaccinia virus, cytomegalovirus, or coxsackievirus.

7. The expression vector of claim 5 or claim 6, further comprising at least 3.7 kb of transgene(s).

8. The expression vector of claim 6, which is a single-stranded AAV.

9. The expression vector of claim 6, which is a self-complementary AAV.

10. A method of enhancing expression of a nucleic acid in a cell, said method comprising introducing a nucleic acid molecule of any one of claims 1-4 into said cell.

11. The method of claim 10, wherein said cell is a mammalian cell.

12. The method of claim 11, wherein the mammalian cell is a human cell.

13. The method of claim 10, wherein the cell is a human hepatocyte cell, a human islet cell, a HEK293 cell, an alpha-TC1 cell, or an insulinoma beta cell.

14. A modified cell comprising a nucleic acid molecule of any one of claims 1-4.