WO2001034625A1 - Isolation and characterization of human nf-e4 - Google Patents

Info

Publication number
WO2001034625A1
WO2001034625A1 PCT/US2000/030988 US0030988W WO0134625A1 WO 2001034625 A1 WO2001034625 A1 WO 2001034625A1 US 0030988 W US0030988 W US 0030988W WO 0134625 A1 WO0134625 A1 WO 0134625A1
Authority
WO
WIPO (PCT)
Prior art keywords
expression
polypeptide
globin
cell
nucleic acid
Prior art date
Application number
PCT/US2000/030988
Other languages
French (fr)
Other versions
WO2001034625A9 (en
Inventor
Stephen M. Jane
John M. Cunningham
Wenlai Zhou
David R. Clouston
Original Assignee
St. Jude Children's Research Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by St. Jude Children's Research Hospital
Priority to AU14831/01A (AU1483101A)
Publication of WO2001034625A1
Publication of WO2001034625A9

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4702 Regulators; Modulating activity
    • C07K14/4703 Inhibitors; Suppressors
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4702 Regulators; Modulating activity
    • C07K14/4705 Regulators; Modulating activity stimulating, promoting or activating activity
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2500/00 Screening for compounds of potential therapeutic value
    • G01N2500/20 Screening for compounds of potential therapeutic value cell-free systems

Definitions

  • the present invention concerns the use of fetal globin to supplement defective adult globin for the treatment of hemoglobinopathies.
  • the invention concerns the identification of a novel developmental stage-specific and tissue-restricted protein NF-E4 that, when associated with the ubiquitous transcription factor CP2, induces gene expression from the stage selector element of the proximal fetal globin promoter.
  • the invention provides the isolated polypeptides, nucleic acids encoding these polypeptides, expression vectors, transfected cells, screening assays, and strategies for using NF-E4 to induce expression of fetal globin and reduce expression of defective adult globin, e.g., to treat hemoglobinopathies such as β-thalassemia and sickle-cell anemia.
  • the human β-globin cluster is the classic paradigm of a multigene locus.
  • the globin genes (ε, Gγ, Aγ, δ, β) are expressed at high level throughout ontogeny in a stringently regulated developmental and tissue-specific pattern.
  • the embryonic globin gene (ε) is expressed in the yolk sac, the major site of erythropoiesis.
  • the first switch in globin subtype occurs as the fetal globin genes (Gγ, Aγ) become the dominant transcripts in the erythropoietic cells of the fetal liver.
  • This expression pattern persists until birth, when the switch from fetal to adult (β) globin synthesis occurs, coincident with the bone marrow becoming the predominant erythropoietic organ (Stamatoyannopoulos and Nienhuis, In The Molecular Basis of Blood Diseases, 2nd Edition, In Stamatoyannopoulos et al.
  • LCR Locus Control Region
  • HSs act cooperatively as a holocomplex which focuses the vast enhancing potential of the LCR to a single globin gene at any given time point during ontogeny (Wijgerde et al, Nature, 377:209-213, 1995).
  • in murine fetal liver cells transgenic for the β-globin locus, the LCR flip-flops back and forth between the γ- and β-globin genes at the time of the fetal/adult switch.
  • the stability of the γ-gene/LCR interaction decreases and β-globin becomes the predominantly transcribed gene.
  • a stage selector element in the chick β-globin promoter was essential for the preferential interaction of that promoter with the locus enhancer during adult erythropoiesis.
  • the activity of the promoter element was mediated through the binding of a putative stage-specific factor, designated NF-E4 or nuclear factor-erythroid 4 (Gallarda et al, Genes & Dev., 3:1845-1859, 1989; Yang and Engel, J. Biol. Chem., 269: 13-10079, 1994).
  • Promoter sequences and stage-specific factors have also been shown to be critical for correct developmental regulation of the human and murine β-globin clusters.
  • mice carrying a transgene of the human locus lacking the LCR, or ES cells in which the native LCR has been removed by homologous recombination still display appropriate temporal patterns of globin expression, albeit at reduced levels (Starck et al, Blood, 84:1656-1665, 1994; Epner et al, Mol. Cell, 2:447-455, 1998).
  • deletion of the human γ-gene promoter or its substitution with a non-developmentally specific erythroid promoter in transgenic mice abolishes the correct temporal profile of both γ- and β-gene expression (Anderson et al, Mol. Biol. Cell, 4:1077-1085, 1993; Sabatino et al, Mol. Cell. Biol., 18:6634-6640, 1998).
  • Two regions of the γ-promoter appear to be responsible for its competitive advantage in the fetal erythroid environment.
  • the first, the CACCC box, binds a recently described member of the Kruppel family, fetal-Kruppel-like factor (FKLF) (Asano et al, Mol. Cell. Biol., 19:3571-3579, 1999). Expression of this gene is detectable in fetal liver and to a lesser extent adult bone marrow, but its functional effects appear to predominantly involve the ε- and γ-globin genes.
  • the second region in the γ-promoter was defined in transfection studies in the K562 cell line, a model of fetal erythropoiesis (Lozzio and Lozzio, Blood, 45:321-334, 1975). In these experiments, an 18 bp stage selector element (SSE) immediately 5' of the TATA box was sufficient for preferential transcription from the γ-promoter when in competition with the β-promoter for a single enhancer element (HS2) from the LCR (Jane et al, EMBO J., 11:2961-2969, 1992).
  • Biochemical purification of the stage selector protein (SSP) revealed that the ubiquitously expressed transcription factor CP2 (also known as LBP-1c/LSF) formed a major component of the SSP binding activity (Kim et al, Proc. Natl. Acad. Sci. USA, 84:6025-6029, 1987; Lim et al, Mol. Cell. Biol., 12:828-835, 1992; Yoon et al, Mol. Cell. Biol., 14:1776-1785, 1994; Jane et al, EMBO J., 14:97-105, 1995).
  • Antiserum to CP2 ablated the SSP/SSE complex in electrophoretic mobility shift assays (EMSA).
  • Hemoglobinopathies present a major health problem to individuals suffering from them. To date, no effective methods of treatment address the underlying deficiency of functional hemoglobin proteins. By providing a molecular mechanism to harness fetal and embryonic globin expression and reduce defective adult globin expression, the present invention overcomes hemoglobinopathies. In addition, the invention also permits regulation of fetal and embryonic gene expression, in the form of an inhibitory polypeptide.
  • the invention provides an isolated NF-E4 polypeptide, in particular, human NF-E4, but also other species variants (orthologs) of human NF-E4.
  • the polypeptide is a fetal and embryonic globin-gene expression promoting polypeptide, e.g., having an apparent molecular weight of 22 kD by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).
  • the polypeptide can inhibit fetal or embryonic globin gene expression and, e.g., has an apparent molecular weight of 14 kD by SDS-PAGE.
  • the polypeptide comprises an amino acid sequence as depicted in SEQ ID NO:8; particularly the polypeptide comprises an amino acid sequence as depicted in SEQ ID NO:3; more particularly, the polypeptide comprises an amino acid sequence as depicted in SEQ ID NO:2.
  • the polypeptides of the invention can be fusion polypeptides, i.e., containing non-NF-E4 sequences.
  • the invention further provides an isolated nucleic acid encoding an NF-E4 polypeptide, e.g. , as set forth above.
  • the isolated nucleic acid comprises a nucleotide sequence as depicted in SEQ ID NO:1 from about nucleotide 421 to about nucleotide 657; in a more particular embodiment, the sequence includes from about nucleotide 121 to about nucleotide 657.
  • the nucleic acid may be provided lacking an internal translation start site, for example, lacking a codon for methionine located at a position corresponding to nucleotide 421 (SEQ ID NO: 1).
  • the invention also provides a nucleic acid of at least 10 nucleotides that hybridizes under stringent conditions to a nucleic acid having a coding sequence as depicted in SEQ ID NO: 1.
  • a nucleic acid may be a coding sequence, an anti-sense nucleic acid, a ribozyme, a PCR primer, or an oligonucleotide probe.
  • An expression vector of the invention comprises the nucleic acid operably associated with an expression control sequence.
  • the expression vector is a viral vector.
  • a host cell comprising the expression vector of the invention and an associated method for expressing an NF-E4 polypeptide. This method comprises propagating the host cell under conditions that permit expression of NF-E4 from the expression vector.
  • the invention provides a transgenic animal comprising the nucleic acid of the invention operably associated with an expression control sequence, which transgenic animal is capable of expressing the NF-E4 polypeptide at a level sufficient to modulate globin gene expression.
  • the transgenic animal is also transgenic for a human β-globin cluster locus control region (LCR).
  • the invention thus provides a method of screening for compounds that modulate globin expression by affecting stage selector protein activity (SSP) in a cell.
  • the method comprises determining if a test compound contacted with a recombinant NF-E4 polypeptide modulates the activity of the NF-E4 polypeptide. If the candidate compound modulates the activity of the NF-E4 polypeptide, it is a candidate for regulation of SSP activity.
  • the recombinant NF-E4 polypeptide is in a cell-free system; alternatively, the recombinant NF-E4 polypeptide is expressed in a host cell.
  • a test compound is evaluated for the ability to induce NF-E4 positive (or negative) regulator expression (or activity) in a test cell or animal.
  • the invention provides a method for inducing or increasing expression of fetal and embryonic globin in a cell.
  • This method comprises increasing the activity of positive regulator (e.g., 22 kD) NF-E4 in the cell, for example, by introducing an expression vector comprising a nucleic acid encoding a positive regulator NF-E4 polypeptide operably associated with an expression control sequence into the cell.
  • positive regulator NF-E4 polypeptide simultaneously causes reduction in adult globin (e.g., β-globin) gene expression.
  • a method for inhibiting expression of fetal globin in a cell comprises increasing the activity of negative regulator (e.g., 14 kD) NF-E4 in the cell, for example, by introducing an expression vector comprising a nucleic acid encoding a 14 kD NF-E4 polypeptide operably associated with an expression control sequence into the cell.
  • the present invention also provides a novel method of treatment for various hemoglobinopathies in mammals, including β-thalassemia and sickle-cell anemia.
  • increased fetal and embryonic globin gene expression is achieved by providing enhanced positive regulator NF-E4 activity either in the form of polypeptide or nucleic acid directing expression of said polypeptide in a mammal.
  • a mammal in need of such treatment is human.
  • the ability of enforced expression of NF-E4 to simultaneously suppress β-globin gene expression provides a particularly advantageous method of treatment for sickle-cell disease due to the dual beneficial effects of enhanced fetal globin expression and reduction of synthesis of defective βS-globin.
  • in conjunction with the method of treatment, also disclosed herein is the use of NF-E4 nucleic acids and/or polypeptides in the manufacture of various medicaments useful for treatment of hemoglobinopathies in mammals.
  • FIG. 1 Diagrammatic representation of the MSCV-NF-E4-HA retroviral vector.
  • the vector consists of the MSCV retroviral backbone containing NF-E4 coding sequence tagged at the COOH-terminus with a hemagglutinin epitope (HA), followed by an IRES from the encephalomyocarditis virus linked to the GFP cDNA.
  • FIG. 1 Western analysis of native K562 cells. Nuclear extract from K562 cells was resolved on a 12% polyacrylamide gel, transferred to PVDF membrane and probed with polyclonal anti-NF-E4 antisera. The migration of molecular weight standards is indicated.
  • Figure 4 Expression of NF-E4 in primary tissues.
  • First strand cDNA transcribed from poly A(+) RNA from multiple primary tissues or cell lines was used as a template to PCR amplify a product using primers specific for S14.
  • Samples were then diluted and re-amplified to give comparable band intensities for the same number of amplification cycles of S14 RNA within the linear range of the assay. This represented 20, 25, and 30 cycles.
  • comparable amounts of cDNA from multiple primary human tissues were PCR amplified using primers specific for NF-E4. These primers span a 1.8 kb intron and thus discriminate between mRNA- and genomic DNA-derived signal. Cycle numbers were chosen to represent the linear range of amplification, in this case, 30 and 35 cycles.
  • FIG. 1 Northern analysis shows that enforced expression of NF-E4 induces ⁇ -globin gene expression.
  • K562 cells were transduced with either MSCV-NF-E4-HA or MSCV (blank vector) retrovirus and GFP-positive cells obtained by FACS. Cells were resorted and cultured in oligoclonal pools. Ten μg of total RNA from five MSCV-NF-E4-HA pools (lanes 3-7) or two MSCV pools (lanes 1 and 2) were analyzed with ⁇ -gene and NF-E4 probes. GAPDH served as the control.
  • FIGS. 6A and 6B Northern analysis demonstrates that enforced expression of NF-E4 induces ⁇ -globin gene expression.
  • K562 cells were transduced with either the MSCV-NF-E4-HA vector or MSCV retrovirus (blank vector), and GFP-positive cells obtained by FACS. Cells were resorted and cultured in oligoclonal pools. Ten μg of total RNA from (A) four MSCV pools (lanes 1-4) or (B) five MSCV-NF-E4-HA pools (lanes 5-9) was analyzed with ⁇ -gene and NF-E4 probes. GAPDH served as the control.
  • Figures 7A and 7B. Northern analysis shows that a dominant negative NF-E4 reduces γ-globin expression in K562 cells. Arrows indicate bands for NF-E4, GAPDH (a control marker), and γ-globin.
  • Lanes 1 and 2 show extracts from K562 cells infected with empty MSCV vector.
  • Lanes 1-6 show different pools of K562 cells infected with MSCV retrovirus containing truncated NF-E4-HA fusion polypeptide construct.
  • FIG. 1 Genomic structure of NF-E4 gene.
  • the NF-E4 gene was identified by BLAST search on a segment of human X-chromosome deposited with GenBank (AC002416).
  • the present invention advantageously permits, for the first time, development of strategies for therapeutic induction of fetal and embryonic globin in hemoglobinopathies.
  • This invention is based, in part, on identification and cloning of a novel gene, NF-E4, isolated utilizing a part of CP2, the ubiquitously expressed component of the stage selector protein (SSP), in a yeast two-hybrid screen.
  • NF-E4 gene encodes the tissue-restricted component of the SSP which together with CP2 forms a heterodimeric complex involved in the regulation of fetal γ-globin genes and embryonic ε-globin genes.
  • NF-E4 is expressed in fetal liver, cord blood, and bone marrow, and in the K562 and HEL cell lines, which constitutively express the fetal globin genes. Enforced expression of NF-E4 in K562 cells and primary erythroid progenitors induces endogenous fetal globin gene expression.
  • the present invention is further based, in part, on the unexpected discovery that a truncated form of NF-E4 negatively regulates NF-E4 function.
  • a smaller 14 kD peptide species found on Western blots from K562, bone marrow, and cord blood inhibits fetal globin expression.
  • the size of this species corresponds to initiation from the internal AUG of the 22 kD NF-E4 coding sequence.
  • a retrovirus containing the NF-E4 cDNA truncated to this AUG and tagged at the 3' end with an HA epitope generated a peptide which on Western analysis with anti-HA antisera co-migrates with the native 14 kD species. Enforced expression of this smaller peptide resulted in significant reduction in γ-globin gene expression in K562 cells, suggesting that it functions as a dominant negative regulator.
  • the invention advantageously provides positive and negative regulators of both fetal and adult globin gene expression.
  • Manipulation of the activity of these regulators permits (1) identification of compounds that can regulate globin gene expression, (2) evaluation and diagnosis of various conditions, and (3) intervention in diseases or disorders that involve defects in globin function, e.g. , hemoglobinopathies.
  • NF-E4 is useful to reactivate endogenous fetal genes expressing fetal and embryonic globin.
  • Such reactivation of the fetal genes protects a patient with diseased β-globin genes, such as individuals suffering from sickle-cell anemia, β-thalassemia, and certain types of cancer, from the harm caused through the pathogenesis of these diseases.
  • the ability of NF-E4 expression to differentially and simultaneously affect both fetal and adult globin levels provides an additional benefit for treatment of hemoglobinopathies characterized by expression of a defective adult globin (e.g., sickle-cell disease).
  • Identification of human NF-E4 provides a novel method for screening compounds useful in the therapy of globin disorders, including sickle-cell disease and β-thalassemia. Such screens can be performed in vitro, e.g., by using erythroid cell differentiation assays termed Burst Forming Unit-erythroid assays (BFUe).
  • primary hematopoietic cells such as bone marrow and peripheral cells, from patients with sickle-cell disease or β-thalassemia, or better still, host cells or transgenic animals that express NF-E4 and, optimally, a reporter gene under control of the β-globin locus control region (LCR), are grown in culture in the presence and absence of NF-E4 inducing agents.
  • the level of induction of the globin or reporter genes can be measured immunologically, by RNA analysis, or by detecting reporter gene activity (e.g., luminescence or color formation).
  • the initial in vitro compound screening can be performed in a high-throughput format, and the NF-E4 inducing agents which demonstrate the strongest effect on globin gene expression can be further evaluated in vivo (e.g., using transgenic animals).
  • the most active physiologically permissible compounds can be then used to prepare medicaments for treatment of patients suffering from hemoglobinopathies.
  • NF-E4 polypeptide is a polypeptide that is bound by NF-E4 antisera or an NF-E4-specific antibody, e.g., as described in the Examples, infra. NF-E4 exists in "positive regulator" and "negative regulator" forms. "Positive regulator NF-E4 polypeptide" is exemplified by NF-E4 polypeptide having an apparent or predicted molecular weight of 22 kD (a "22 kD NF-E4 polypeptide").
  • "Negative regulator NF-E4 polypeptide" is exemplified by a dominant negative truncated carboxy-terminal fragment of 22 kD NF-E4 polypeptide, said fragment having an apparent or predicted molecular weight of 14 kD (a "14 kD NF-E4 polypeptide").
  • NF-E4 homodimerizes and also binds to CP2 leading to formation of a heteromeric complex which induces expression of a coding sequence operably associated with a stage selector element (SSE) associated with a promoter.
  • NF-E4 inhibits expression of a coding sequence operably associated with an SSE associated with a promoter.
  • NF-E4 is recognized by antisera generated against a peptide LKTOSALEQTPQQLPSLHLSQG (SEQ ID NO:8) conjugated to KLH.
  • the NF-E4 polypeptide comprises this amino acid sequence.
  • NF-E4 comprises an amino acid sequence as depicted in SEQ ID NO:3.
  • NF-E4 comprises an amino acid sequence as depicted in SEQ ID NO:2.
  • NF-E4 polypeptides also include various fusion polypeptides as defined below, including N-terminal or C-terminal fusions with "tags", such as a hexahistidine (His6) tag or a hemagglutinin (HA) tag exemplified infra, and covalent conjugates generated chemically.
  • NF-E4 is modified so that it does not contain any N-terminal methionine residues; in still another embodiment, it does not contain any methionine residues (i.e., it lacks the methionine codon at position 421 in SEQ ID NO:1, see Figure 1).
  • NF-E4 also includes polypeptides that are substantially homologous to NF-E4, as depicted in SEQ ID NO:2 or 3.
  • a "nucleic acid encoding human NF-E4" can be a DNA or RNA molecule. In one embodiment, the nucleic acid has at least about 50%, preferably at least about 75%, and more preferably at least about 90% sequence identity to a coding sequence depicted in Figure 1 (SEQ ID NO:1).
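The identity thresholds named above (at least about 50%, 75%, and 90%) can be illustrated with a minimal sketch. It assumes two sequences that are already aligned, of equal length, and ungapped; the function names and the toy sequences are hypothetical and are not taken from SEQ ID NO:1. In practice, sequence identity would be computed with an alignment program, which this sketch does not replace.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the three thresholds named in the text."""
    if identity >= 90.0:
        return "at least about 90%"
    if identity >= 75.0:
        return "at least about 75%"
    if identity >= 50.0:
        return "at least about 50%"
    return "below about 50%"


# Toy sequences (hypothetical): one mismatch in ten positions.
reference = "ATGGCCAAGT"
candidate = "ATGGCTAAGT"
identity = percent_identity(reference, candidate)
print(identity, identity_tier(identity))
```

The hard cut-offs are a simplification: the text qualifies each threshold with "about", so a borderline value would need the tolerance defined elsewhere in the description.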
  • An NF-E4 "coding sequence" can either be a coding sequence for a 22 kD polypeptide (which initiates from a CTG codon, position 121 in SEQ ID NO:1; see Figure 1) or a 14 kD polypeptide (which initiates from an ATG codon, position 421 in SEQ ID NO:1; see Figure 1).
  • a nucleic acid encoding NF-E4 hybridizes under conditions set forth in detail below to a nucleic acid having a nucleotide sequence corresponding to one of the foregoing coding sequences.
  • a nucleic acid encoding human NF-E4 comprises at least a 10 nucleotide sequence, preferably about a 15 nucleotide sequence, and more preferably at least about a 20 nucleotide sequence that is identical to a sequence in a coding region of SEQ ID NO: 1 (see Figure 1).
  • the coding sequence corresponds to the 22 kD NF-E4 coding sequence, but is modified to omit an internal ATG/AUG codon at position 421 in SEQ ID NO: 1 (and thus does not encode an internal methionine in the NF-E4 polypeptide); such a construct cannot express the 14 kD NF-E4.
  • the coding sequence is modified to encode the 22 kD NF-E4 with an N-terminal methionine. In yet another specific embodiment, the coding sequence is modified to both omit an internal methionine and encode an N-terminal methionine.
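The nucleotide coordinates given for SEQ ID NO:1 (a CTG start at position 121, an internal ATG at position 421, and a coding region running to about nucleotide 657) can be checked with simple codon arithmetic. The sketch below assumes 1-based inclusive coordinates and that position 657 is the last nucleotide of the final sense codon; neither assumption is stated in the text, and the function name is hypothetical.

```python
CTG_START = 121   # start of the 22 kD coding sequence, per the text
ATG_START = 421   # internal start of the 14 kD coding sequence, per the text
CODING_END = 657  # "about nucleotide 657"; assumed to end the last sense codon


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the inclusive 1-based span start..end."""
    length = end - start + 1
    if length <= 0 or length % 3:
        raise ValueError("span is empty or not a whole number of codons")
    return length // 3


full_length = codon_count(CTG_START, CODING_END)  # codons from the CTG start
truncated = codon_count(ATG_START, CODING_END)    # codons from the internal ATG
upstream = (ATG_START - CTG_START) // 3           # codons preceding the internal ATG

# The internal ATG must be in frame with the CTG start for the
# 14 kD form to be a C-terminal fragment of the 22 kD form.
in_frame = (ATG_START - CTG_START) % 3 == 0

print(full_length, truncated, upstream, in_frame)
```

Under these assumptions the spans give 179 and 79 codons, which agrees with the residue counts stated in the description for the positive and negative regulator forms.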
  • nucleic acid of at least 10 nucleotides that hybridizes under stringent conditions to a nucleic acid having a coding sequence as depicted in SEQ ID NO:1 refers to a full-length coding sequence for NF-E4, or an oligonucleotide such as a PCR primer, a probe (which may be labeled), a triple-helix-forming oligo, an antisense nucleic acid, or a ribozyme.
  • “Embryonic globin” is an oxygen-binding globin that forms hemoglobin, and that is usually expressed in a developmentally regulated fashion, in particular in the yolk sac of an embryo. ε-Globin is an example of embryonic globin.
  • “Fetal globin” is an oxygen-binding globin that forms hemoglobin, and that is usually expressed in a developmentally regulated fashion, in particular in fetal cells. γ-Globin is an example of fetal globin.
  • “Adult globin” is an oxygen-binding globin that forms hemoglobin, and that is usually expressed in a developmentally regulated fashion in adult cells.
  • β-Globin is an example of adult globin.
  • an isolated nucleic acid means that the referenced material is removed from the environment in which it is normally found.
  • an isolated biological material can be free of cellular components, i.e., components of the cells in which the material is found or produced.
  • an isolated nucleic acid includes a PCR product, an isolated mRNA, a cDNA, or a restriction fragment.
  • an isolated nucleic acid is preferably excised from the chromosome in which it may be found, and more preferably is no longer joined to non-regulatory, non-coding regions, or to other genes, located upstream or downstream of the gene contained by the isolated nucleic acid molecule when found in the chromosome.
  • the isolated nucleic acid lacks one or more introns.
  • Isolated nucleic acid molecules include sequences inserted into plasmids, cosmids, artificial chromosomes, and the like.
  • a recombinant nucleic acid is an isolated nucleic acid.
  • An isolated protein may be associated with other proteins or nucleic acids, or both, with which it associates in the cell, or with cellular membranes if it is a membrane-associated protein.
  • An isolated organelle, cell, or tissue is removed from the anatomical site in which it is found in an organism.
  • An isolated material may be, but need not be, purified.
  • purified refers to material that has been isolated under conditions that reduce or eliminate the presence of unrelated materials, i.e., contaminants, including native materials from which the material is obtained.
  • a purified protein is preferably substantially free of other proteins or nucleic acids with which it is associated in a cell; a purified nucleic acid molecule is preferably substantially free of proteins or other unrelated nucleic acid molecules with which it can be found within a cell.
  • substantially free is used operationally, in the context of analytical testing of the material.
  • purified material substantially free of contaminants is at least 50% pure; more preferably, at least 90% pure, and more preferably still at least 99% pure. Purity can be evaluated by chromatography, gel electrophoresis, immunoassay, composition analysis, biological assay, and other methods known in the art.
  • nucleic acids can be purified by precipitation, chromatography (including preparative solid phase chromatography, oligonucleotide hybridization, and triple helix chromatography), ultracentrifugation, and other means.
  • Polypeptides and proteins can be purified by various methods including, without limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, precipitation and salting-out chromatography, extraction, and countercurrent distribution.
  • the polypeptide can be produced in a recombinant system in which the protein contains an additional sequence tag that facilitates purification, such as, but not limited to, a polyhistidine sequence, or a sequence that specifically binds to an antibody, such as FLAG and GST.
  • the polypeptide can then be purified from a crude lysate of the host cell by chromatography on an appropriate solid-phase matrix.
  • antibodies produced against the protein or against peptides derived therefrom can be used as purification reagents.
  • Cells can be purified by various techniques, including centrifugation, matrix separation (e.g., nylon wool separation), panning and other immunoselection techniques, depletion (e.g., complement depletion of contaminating cells), and cell sorting (e.g., fluorescence activated cell sorting [FACS]). Other purification methods are possible.
  • a purified material may contain less than about 50%, preferably less than about 75%, and most preferably less than about 90%, of the cellular components with which it was originally associated. The term "substantially pure" indicates the highest degree of purity which can be achieved using conventional purification techniques known in the art.
  • the term “about” or “approximately” means within 20%, preferably within 10%, and more preferably within 5% of a given value or range.
  • the term “about” means within about a log (i.e., an order of magnitude) preferably within a factor of two of a given value, depending on how quantitative the measurement.
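The two senses of "about" defined above (a percentage band around a value, or a fold range for less quantitative measurements) can be expressed as simple checks. This is a minimal sketch; the function names and the numeric example (24 versus 22) are illustrative assumptions and do not come from the text.

```python
def within_about(measured: float, reference: float, tolerance: float = 0.20) -> bool:
    """True if measured lies within +/- tolerance (as a fraction) of reference.

    The text gives 20%, preferably 10%, and more preferably 5%.
    """
    return abs(measured - reference) <= tolerance * abs(reference)


def within_fold(measured: float, reference: float, fold: float = 10.0) -> bool:
    """True if measured is within a multiplicative factor `fold` of reference.

    fold=10 corresponds to "about a log"; fold=2 to "a factor of two".
    """
    if measured <= 0 or reference <= 0 or fold < 1:
        raise ValueError("values must be positive and fold must be at least 1")
    ratio = measured / reference
    return 1.0 / fold <= ratio <= fold


# Illustrative values only: an estimate of 24 against a nominal 22.
print(within_about(24.0, 22.0))        # 20% band
print(within_about(24.0, 22.0, 0.05))  # 5% band
print(within_fold(5.0, 1.0))           # within an order of magnitude
print(within_fold(5.0, 1.0, fold=2.0)) # within a factor of two
```

Which of the two checks applies depends, as the text says, on how quantitative the underlying measurement is.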
  • sample refers to a biological material which can be tested for the presence of NF-E4 protein or NF-E4 nucleic acids, e.g., to evaluate a gene therapy or expression in a transgenic animal.
  • samples can be obtained from any source, including tissue, bone marrow, blood and blood cells (particularly circulating hematopoietic stem cells, for possible detection of protein or nucleic acids); pleural effusions; cerebrospinal fluid (CSF); ascites fluid; and cell culture.
  • Non-human animals include, without limitation, laboratory animals such as mice, rats, rabbits, hamsters, guinea pigs, etc.; domestic animals such as dogs and cats; and, farm animals such as sheep, goats, pigs, horses, and cows, and especially such animals made transgenic with human NF-E4.
  • italics indicate a nucleic acid molecule (e.g., NF-E4 cDNA, gene, etc.); normal text indicates the polypeptide or protein.
  • the present invention advantageously provides NF-E4 protein, including fragments, derivatives, and analogs of NF-E4; NF-E4 nucleic acids, including oligonucleotide primers and probes, and NF-E4 regulatory sequences; NF-E4-specific antibodies; and related methods of using these materials to detect the expression of NF-E4 proteins or nucleic acids, and in screens for agonists and antagonists of NF-E4.
  • the following sections of the application which are delineated by headings (in bold) and sub-headings (in bold italics), relating to these aspects of the invention, are provided for clarity, and not by way of limitation.
  • NF-E4 NF-E4 polypeptides are defined above.
  • the positive regulator NF-E4 comprises about 179 amino acids; in a specific embodiment, it has 179 amino acids. This form of NF-E4 has a calculated molecular weight of about 22 kD.
  • the negative regulator NF-E4 is a truncated form of the positive regulator variant, having about 79 amino acid residues, corresponding to the C-terminus of the positive regulator polypeptide. In a specific embodiment, the negative regulator polypeptide has 79 amino acids.
  • NF-E4 of the invention can be characterized by specific binding to an anti-NF-E4 antibody, as described below.
  • NF-E4 can also be characterized by its expression pattern (polypeptide and mRNA) in hematopoietic tissues (liver, cord blood, and bone marrow) and in tumor cell lines that constitutively express NF-E4, e.g., K562 cells.
  • NF-E4 fragments, derivatives, and analogs can be characterized by one or more of the characteristics of NF-E4 protein.
  • an NF-E4 fragment also termed herein an NF-E4 peptide, can have an amino acid sequence corresponding to SEQ ID NO:8. In another embodiment, it can have a sequence corresponding to a C-terminus of the positive regulator polypeptide, and in particular a fragment having SEQ ID NO:3.
  • Analogs and derivatives of NF-E4 of the invention have the same or homologous characteristics of NF-E4 as set forth above.
  • a truncated form of NF-E4 can be provided. Such a truncated form includes NF-E4 with a deletion.
  • the derivative is functionally active, i.e., capable of exhibiting one or more functional activities associated with a full-length, wild-type NF-E4 of the invention.
  • an NF-E4 chimeric fusion protein can be prepared in which the NF-E4 portion of the fusion protein has one or more characteristics of NF-E4.
  • Such fusion proteins include fusions of NF-E4 polypeptide with a marker polypeptide, such as FLAG, a hexahistidine (His6) tag, glutathione-S-transferase (GST), or hemagglutinin (HA).
  • NF-E4 can also be fused with a unique phosphorylation site for labeling.
  • NF-E4 can be expressed as a fusion with a bacterial protein, such as β-galactosidase.
  • NF-E4 analogs can be made by altering encoding nucleic acid sequences by substitutions, additions or deletions that provide for functionally similar molecules, i.e., molecules that perform one or more NF-E4 functions.
  • an analog of NF-E4 is a sequence-conservative variant of NF-E4.
  • an analog of NF-E4 is a function-conservative variant.
  • an analog of NF-E4 is an allelic variant of the human protein, or a mutant form of NF-E4.
  • Still another analog of NF-E4 is a substantially homologous NF-E4 from another species, e.g., mouse or chicken.
  • Sequence-conservative variants of a polynucleotide sequence are those in which a change of one or more nucleotides in a given codon position results in no alteration in the amino acid encoded at that position.
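As a hedged illustration of the sequence-conservative definition above, the following minimal Python sketch translates two coding sequences with the standard genetic code and reports whether they encode the same amino acids. The function names `translate` and `is_sequence_conservative` and the toy sequences are assumptions made for illustration only; they are not taken from the application and are not NF-E4 sequences.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a coding DNA sequence (sense strand, 5' to 3') codon by codon."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_sequence_conservative(reference: str, variant: str) -> bool:
    """True if both sequences have the same length and encode identical residues."""
    if len(reference) != len(variant):
        return False
    return translate(reference) == translate(variant)


if __name__ == "__main__":
    # Toy example (hypothetical sequences): CTG -> TTA and AAA -> AAG are synonymous.
    print(is_sequence_conservative("ATGCTGAAA", "ATGTTAAAG"))
```

In this toy case both sequences translate to Met-Leu-Lys, so the variant counts as sequence-conservative; changing CTG to CCG would introduce proline and fail the check.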
  • “Function-conservative variants” are those in which a given amino acid residue in a protein or enzyme has been changed without altering the overall conformation and function of the polypeptide, including, but not limited to, replacement of an amino acid with one having similar properties (such as, for example, polarity, hydrogen bonding potential, acidic, basic, hydrophobic, aromatic, and the like).
  • Amino acids with similar properties are well known in the art. For example, arginine, histidine and lysine are hydrophilic-basic amino acids and may be interchangeable. Similarly, isoleucine, a hydrophobic amino acid, may be replaced with leucine, methionine or valine.
  • Amino acids other than those indicated as conserved may differ in a protein or enzyme so that the percent protein or amino acid sequence similarity between any two proteins of similar function may vary and may be, for example, from 70% to 99% as determined according to an alignment scheme such as by the Cluster Method, wherein similarity is based on the MEGALIGN algorithm.
  • a “function-conservative variant” also includes a polypeptide or enzyme which has at least 60% amino acid identity as determined by BLAST or FASTA algorithms, preferably at least 75%, most preferably at least 85%, and even more preferably at least 90%, and which has the same or substantially similar properties or functions as the native or parent protein or enzyme to which it is compared.
  • mutant and mutation mean any detectable change in genetic material, e.g. DNA, or any process, mechanism, or result of such a change. This includes gene mutations, in which the structure (e.g. DNA sequence) of a gene is altered, any gene or DNA arising from any mutation process, and any expression product (e.g. protein or enzyme) expressed by a modified gene or DNA sequence.
  • variant may also be used to indicate a modified or altered gene, DNA sequence, enzyme, cell, etc., i.e., any kind of mutant.
  • homologous in all its grammatical forms and spelling variations refers to the relationship between proteins that possess a "common evolutionary origin,” including proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous proteins from different species (e.g., myosin light chain, etc.) (Reeck et al, Cell 50:667, 1987). Such proteins (and their encoding genes) have sequence homology, as reflected by their sequence similarity, whether in terms of percent similarity or the presence of specific residues or motifs at conserved positions.
  • sequence similarity in all its grammatical forms refers to the degree of identity or correspondence between nucleic acid or amino acid sequences of proteins that may or may not share a common evolutionary origin (see Reeck et al., supra).
  • homologous, when modified with an adverb such as “highly,” may refer to sequence similarity and may or may not relate to a common evolutionary origin.
  • two DNA sequences are "substantially homologous" or “substantially similar” when the encoded polypeptides are at least 35-40% similar as determined by one of the algorithms disclosed herein, preferably at least about 60%, and most preferably at least about 90 or 95% in a highly conserved domain, or, for alleles, across the entire amino acid sequence.
  • Sequence comparison algorithms include BLAST (BLAST P, BLAST N, BLAST X), FASTA, DNA Strider, the GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wisconsin) pileup program, etc. using the default parameters provided with these algorithms.
  • An example of such a sequence is an allelic or species variant of the specific NF-E4 genes of the invention.
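The identity thresholds above (for example, at least 60% amino acid identity) can be made concrete with a small calculation. The sketch below is not BLAST, FASTA, or MEGALIGN; it is a simplified stand-in that assumes the two sequences have already been aligned to equal length with `-` as the gap character. The function name `percent_identity` and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 60.0) -> bool:
    """True if percent identity reaches the given threshold (default 60%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-residue peptides differing at one position.
    print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

A real comparison would first produce the alignment with one of the named algorithms using its default parameters; only the final counting step is shown here.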
  • NF-E4 derivatives include, but are by no means limited to, phosphorylated NF-E4, myristylated NF-E4, methylated NF-E4, and other NF-E4 proteins that are chemically modified.
  • NF-E4 derivatives also include labeled variants, e.g., radio-labeled with iodine (or, as pointed out above, phosphorous; see EP372707B); a detectable molecule, such as but by no means limited to biotin, a chelating group complexed with a metal ion, a chromophore or fluorophore, a gold colloid, or a particle such as a latex bead; or attached to a water soluble polymer.
  • the present invention contemplates analysis and isolation of a gene encoding a functional or mutant NF-E4, including a full-length, or naturally occurring form of NF-E4, and any antigenic fragments thereof from any source. It further contemplates expression of functional or mutant NF-E4 protein for evaluation, diagnosis, or, particularly, therapy.
  • conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature.
  • PCR polymerase chain reaction
  • nucleic acid molecule refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules”); or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules”); or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix; or "protein nucleic acids” (PNA) formed by conjugating bases to an amino acid backbone; or nucleic acids containing modified bases, for example thiouracil, thio-guanine and fluoro-uracil.
  • PNA protein nucleic acids
  • Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible.
  • nucleic acid molecule and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear (e.g., restriction fragments) or circular DNA molecules, plasmids, and chromosomes.
  • sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).
  • a "recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.
  • a "polynucleotide” or “nucleotide sequence” is a series of nucleotide bases (also called “nucleotides”) in DNA and RNA, and means any chain of two or more nucleotides.
  • a nucleotide sequence typically carries genetic information, including the information used by cellular machinery to make proteins and enzymes. These terms include double or single stranded genomic and cDNA, RNA, any synthetic and genetically manipulated polynucleotide, and both sense and anti-sense polynucleotide (although only sense strands are being represented herein). This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids.
  • the polynucleotides herein may be flanked by natural regulatory
  • nucleic acids may also be modified by many means known in the art.
  • Non-limiting examples of such modifications include methylation, "caps”, substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoroamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.).
  • Polynucleotides may contain one or more additional covalently linked moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and alkylators.
  • the polynucleotides may be derivatized by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage.
  • the polynucleotides herein may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.
  • a "coding sequence” or a sequence “encoding” an expression product, such as a RNA, polypeptide, protein, or enzyme is a minimum nucleotide sequence that, when expressed, results in the production of that RNA, polypeptide, protein, or enzyme, i.e., the nucleotide sequence encodes an amino acid sequence for that polypeptide, protein or enzyme.
  • a coding sequence for a protein may include a start codon (usually ATG, though as shown herein, alternative start codons can be used) and a stop codon.
  • gene also called a "structural gene” means a DNA sequence that codes for a particular sequence of amino acids, which comprise all or part of one or more proteins or enzymes, and may include regulatory (non-transcribed) DNA sequences, such as promoter sequences, which determine for example the conditions under which the gene is expressed.
  • the transcribed region of the gene may include untranslated regions, including introns, 5'-untranslated region (UTR), and 3'-UTR, as well as the coding sequence.
  • a "promoter sequence” is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence.
  • the promoter sequence is bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background.
  • a transcription initiation site (conveniently defined for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.
  • a coding sequence is “under the control of” or “operably (or operatively) associated with” transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced (if it contains introns) and translated into the protein encoded by the coding sequence.
  • the terms "express” and “expression” mean allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence.
  • a DNA sequence is expressed in or by a cell to form an "expression product" such as mRNA or a protein. The expression product itself , e.g.
  • an expression product can be characterized as intracellular, extracellular or secreted.
  • intracellular means something that is inside a cell.
  • extracellular means something that is outside a cell.
  • a substance is "secreted” by a cell if it appears in significant measure outside the cell, from somewhere on or inside the cell.
  • transfection means the introduction of a heterologous nucleic acid into a cell.
  • transformation means the introduction of a heterologous gene, DNA or RNA sequence to a host cell, so that the host cell will express the introduced gene or sequence to produce a desired product.
  • the introduced gene or sequence may also be called a “cloned” or “heterologous” gene or sequence, and may include regulatory or control sequences, such as start, stop, promoter, signal, secretion, or other sequences used by a cell's genetic machinery.
  • the gene or sequence may include nonfunctional sequences or sequences with no known function.
  • a host cell that receives and expresses introduced DNA or RNA has been
  • the DNA or RNA introduced to a host cell can come from any source, including cells of the same genus or species as the host cell, or cells of a different genus or species.
  • vector means the vehicle by which a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g. transcription and translation) of the introduced sequence.
  • vectors include plasmids, phages, viruses, etc.; they are discussed in greater detail below.
  • Vectors typically comprise the DNA of a transmissible agent, into which heterologous DNA is inserted.
  • a common way to insert one segment of DNA into another segment of DNA involves the use of enzymes called restriction enzymes that cleave DNA at specific sites (specific groups of nucleotides) called restriction sites.
  • a "cassette” refers to a DNA coding sequence or segment of DNA that codes for an expression product that can be inserted into a vector at defined restriction sites.
  • the cassette restriction sites are designed to ensure insertion of the cassette in the proper reading frame.
  • foreign DNA is inserted at one or more restriction sites of the vector DNA, and then is carried by the vector into a host cell along with the transmissible vector DNA.
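The idea of a restriction enzyme cleaving DNA at specific restriction sites can be sketched in a few lines. The example below is an assumption-laden illustration: it uses the EcoRI recognition sequence GAATTC (cut between G and A on the strand shown), treats the DNA as a linear molecule, and looks at one strand only. The function names and the test sequence are hypothetical and are not taken from the application.

```python
from typing import List


def find_sites(dna: str, site: str = "GAATTC") -> List[int]:
    """Return the 0-based start positions of every occurrence of a restriction site."""
    seq = dna.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut a linear sequence at each site; cut_offset is the cut position within the site.

    With the defaults this models EcoRI on the strand shown (G^AATTC).
    """
    seq = dna.upper()
    fragments = []
    previous = 0
    for position in find_sites(seq, site):
        cut = position + cut_offset
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


if __name__ == "__main__":
    # Hypothetical 20-nt sequence containing two EcoRI sites.
    example = "AAGAATTCTTGGGAATTCCC"
    print(find_sites(example))
    print(digest(example))
```

A cassette flanked by two such sites would correspond to the middle fragment of the digest, which is the piece that could then be ligated into a vector cut with the same enzyme.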
  • a segment or sequence of DNA having inserted or added DNA, such as an expression vector can also be called a "DNA construct.”
  • a common type of vector is a "plasmid", which generally is a self-contained molecule of double-stranded DNA, usually of bacterial origin, that can readily accept additional (foreign) DNA and which can readily be introduced into a suitable host cell.
  • a plasmid vector often contains coding DNA and promoter DNA and has one or more restriction sites suitable for inserting foreign DNA.
  • Coding DNA is a DNA sequence that encodes a particular amino acid sequence for a particular protein or enzyme.
  • Promoter DNA is a DNA sequence which initiates, regulates, or otherwise mediates or controls the expression of the coding DNA.
  • Promoter DNA and coding DNA may be from the same gene or from different genes, and may be from the same or different organisms.
  • a large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts.
  • Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, WI), pRSET or pREP plasmids (Invitrogen, San Diego, CA), or pMAL plasmids (New England Biolabs, Beverly, MA), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art.
  • Recombinant cloning vectors will often include one or more replication systems for cloning or expression, one or more markers for selection in the host, e.g. antibiotic resistance, and one or more expression cassettes.
  • host cell means any cell of any organism that is selected, modified, transformed, grown, or used or manipulated in any way, for the production of a substance by the cell, for example the expression by the cell of a gene, a DNA or RNA sequence, a protein or an enzyme.
  • Host cells can further be used for screening or other assays, as described infra.
  • Host cells can be cultured cells in vitro or one or more cells in a non-human animal, e.g., a transgenic animal or transiently transfected animal.
  • expression system means a host cell and compatible vector under suitable conditions, e.g. for the expression of a protein coded for by foreign DNA carried by the vector and introduced to the host cell.
  • Common expression systems include E. coli host cells and plasmid vectors, insect host cells and Baculovirus vectors, and mammalian host cells, including but not limited to K562 cells, HEL cells, MEL cells, COS-1 cells, C2C12 cells, CHO cells, HeLa cells, 293T (human kidney cells), mouse primary myoblasts, and NIH 3T3 cells.
  • heterologous refers to a combination of elements not naturally occurring.
  • heterologous DNA refers to DNA not naturally located in the cell, or in a chromosomal site of the cell.
  • a heterologous gene is a gene in which the regulatory control sequences are not found naturally in association with the coding sequence.
  • an NF-E4 gene is heterologous to the vector DNA in which it is inserted for cloning or expression, and it is heterologous to a host cell containing such a vector, in which it is expressed, e.g., a K562 cell.
  • a nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al, supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization.
  • low stringency hybridization conditions corresponding to a Tm (melting temperature) of 55 °C
  • Tm melting temperature
  • Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40% formamide, with 5x or 6x SSC.
  • High stringency hybridization conditions correspond to the highest Tm, e.g., 50% formamide, 5x or 6x SSC.
  • SSC is 0.15 M NaCl, 0.015 M Na-citrate.
  • Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.
  • the appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences.
  • the relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA.
  • a minimum length for a hybridizable nucleic acid is at least about 10 nucleotides; preferably at least about 15 nucleotides; and more preferably the length is at least about 20 nucleotides.
  • standard hybridization conditions refers to a Tm of 55 °C, and utilizes conditions as set forth above.
  • the Tm is 60 °C; in a more preferred embodiment, the Tm is 65 °C.
  • high stringency refers to hybridization and/or washing conditions at 68 °C in 0.2X SSC, at 42 °C in 50% formamide, 4X SSC, or under conditions that afford levels of hybridization equivalent to those observed under either of these two conditions.
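The text above relates Tm to sequence length and composition without giving a formula. As a rough, hedged illustration only, the sketch below applies the Wallace rule of thumb for short oligonucleotides, Tm ≈ 2 °C × (A + T) + 4 °C × (G + C). This rule is a swapped-in approximation and is not stated in the application; it ignores salt and formamide concentration, so it does not model the stringency conditions listed. The function names and the minimum-length default of 10 nucleotides (taken from the length guidance above) are illustrative assumptions.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C for a short DNA oligonucleotide (Wallace rule).

    Assumption: a rule-of-thumb estimate, reasonable only for short oligos.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def meets_minimum_length(oligo: str, minimum: int = 10) -> bool:
    """True if the oligo is at least `minimum` nucleotides long."""
    return len(oligo) >= minimum


if __name__ == "__main__":
    # Hypothetical 20-mer with 50% GC content.
    probe = "ATGCATGCATGCATGCATGC"
    print(wallace_tm(probe), meets_minimum_length(probe))
```

The estimate rises with both length and GC content, which matches the qualitative statement that longer and more similar sequences form hybrids with higher Tm.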
  • oligonucleotide refers to a nucleic acid, generally of at least 10, preferably at least 15, and more preferably at least 20 nucleotides, preferably no more than 100 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule encoding a gene, mRNA, cDNA, or other nucleic acid of interest.
  • Oligonucleotides can be labeled, e.g., with ³²P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated.
  • a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid.
  • oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full-length or a fragment of NF-E4, or to detect the presence of nucleic acids encoding NF-E4.
  • an oligonucleotide of the invention can form a triple helix with a NF-E4 DNA molecule.
  • oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.
  • the present invention provides antisense nucleic acids (including ribozymes), which may be used to inhibit expression of NF-E4 of the invention.
  • An "antisense nucleic acid” is a single stranded nucleic acid molecule which, on hybridizing under cytoplasmic conditions with complementary bases in an RNA or DNA molecule, inhibits the latter's role. If the RNA is a messenger RNA transcript, the antisense nucleic acid is a countertranscript or mRNA-interfering complementary nucleic acid.
  • antisense broadly includes RNA-RNA interactions, RNA-DNA interactions, triple helix interactions, ribozymes and RNase-H mediated arrest.
  • Antisense nucleic acid molecules can be encoded by a recombinant gene for expression in a cell (e.g., U.S. Patent No. 5,814,500; U.S. Patent No. 5,811,234), or alternatively they can be prepared synthetically (e.g., U.S. Patent No. 5,780,607).
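Because an antisense nucleic acid acts by hybridizing to complementary bases in its target, designing one starts from the reverse complement of the target region. The sketch below is a minimal illustration under stated assumptions: the target is given 5' to 3' in either RNA (U) or DNA (T) letters, the antisense strand is returned as unmodified DNA, and the name `antisense_dna` and the six-base target are hypothetical rather than drawn from the application.

```python
# Watson-Crick partner, written as DNA, for each base of an RNA or DNA target.
_PARTNER = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(target: str) -> str:
    """Return the DNA reverse complement (5' to 3') of an RNA or DNA target."""
    seq = target.upper()
    unknown = set(seq) - set(_PARTNER)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return "".join(_PARTNER[base] for base in reversed(seq))


if __name__ == "__main__":
    # Hypothetical target fragment 5'-AUGGCC-3'.
    print(antisense_dna("AUGGCC"))
```

The backbone chemistries discussed in the surrounding text (phosphorothioates, morpholino backbones, and so on) change the linkages between residues, not the base sequence computed here.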
  • synthetic oligonucleotides envisioned for this invention include oligonucleotides that contain phosphorothioates, phosphotriesters, methyl phosphonates, short chain alkyl, or cycloalkyl intersugar linkages or short chain heteroatomic or heterocyclic intersugar linkages.
  • U.S. Patent No. 5,637,684 describes phosphoramidate and phosphorothioamidate oligomeric compounds.
  • oligonucleotides having morpholino backbone structures U.S. Pat. No. 5,034,506.
  • the phosphodiester backbone of the oligonucleotide may be replaced with a polyamide backbone, the bases being bound directly or indirectly to the aza nitrogen atoms of the polyamide backbone (Nielsen et al., Science 254:1497, 1991).
  • oligonucleotides may contain substituted sugar moieties comprising one of the following at the 2' position: OH, SH, SCH3, F, OCN, O(CH2)nNH2 or O(CH2)nCH3 where n is from 1 to about 10; C1 to C10 lower alkyl, substituted lower alkyl, alkaryl or aralkyl; Cl; Br; CN; CF3; OCF3; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; SOCH3; SO2CH3; ONO2; NO2; N3; NH2; heterocycloalkyl; heterocycloalkaryl; aminoalkylamino; polyalkylamino; substituted silyl; a fluorescein moiety; an RNA cleaving group; a reporter group; an intercalator; a group for improving the pharmacokinetic properties of an an
  • Oligonucleotides may also have sugar mimetics such as cyclobutyls or other carbocyclics in place of the pentofuranosyl group.
  • Nucleotide units having nucleosides other than adenosine, cytidine, guanosine, thymidine and uridine, such as inosine, may be used in an oligonucleotide molecule.
  • a gene encoding NF-E4 can be isolated from any source, particularly from a human cDNA or genomic library. Methods for obtaining NF-E4 gene are well known in the art, as described above (see, e.g., Sambrook et al, 1989, supra).
  • the DNA may be obtained by standard procedures known in the art from cloned DNA (e.g., a DNA "library”), and preferably is obtained from a cDNA library prepared from tissues with high level expression of the protein (e.g., an embryonic or fetal hematopoietic cell library, since these are the cells that evidence highest levels of expression of NF-E4), by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA (e.g., DNA having a sequence as depicted in GenBank accession no.
  • a portion of an NF-E4 gene exemplified infra can be purified and labeled to prepare a labeled probe, and the generated DNA may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, Science 196:180, 1977; Grunstein and Hogness, Proc. Natl. Acad. Sci. U.S.A. 72:3961, 1975). Those DNA fragments with substantial homology to the probe, such as an allelic variant from another individual, will hybridize. In a specific embodiment, highest stringency hybridization conditions are used to identify a homologous NF-E4 gene.
  • the gene encodes a protein product having the isoelectric, electrophoretic, amino acid composition, partial or complete amino acid sequence, antibody binding activity, or ligand binding profile of NF-E4 protein as disclosed herein.
  • the presence of the gene may be detected by assays based on the physical, chemical, immunological, or functional properties of its expressed product.
  • DNA sequences which encode substantially the same amino acid sequence as an NF-E4 gene may be used in the practice of the present invention. These include but are not limited to allelic variants, species variants, sequence conservative variants, and functional variants. Amino acid substitutions may also be introduced to substitute an amino acid with a particularly preferable property. For example, a Cys may be introduced as a potential site for disulfide bridges with another Cys.
  • NF-E4 derivatives and analogs of the invention can be produced by various methods known in the art. The manipulations which result in their production can occur at the gene or protein level.
  • the cloned NF-E4 gene sequence can be modified by any of numerous strategies known in the art (Sambrook et al, 1989, supra). The sequence can be cleaved at appropriate sites with restriction endonuclease(s), followed by further enzymatic modification if desired, isolated, and ligated in vitro.
  • the NF-E4-encoding nucleic acid sequence can be mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to create variations in coding regions and/or form new restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification.
  • NF-E4 is mutated to eliminate the methionine codon found internally (about amino acid 101) in the positive regulator polypeptide, thus producing a sequence lacking an internal translation initiation site.
  • modifications can also be made to introduce restriction sites and facilitate cloning the NF-E4 gene into an expression vector.
  • Any technique for mutagenesis known in the art can be used, including but not limited to, in vitro site-directed mutagenesis (Hutchinson, C., et al., J. Biol. Chem. 253:6551, 1978; Zoller and Smith, DNA 3:479-488, 1984; Oliphant et al., Gene 44:177, 1986; Hutchinson et al., Proc. Natl. Acad. Sci. U.S.A. 83:710, 1986), use of TAB linkers (Pharmacia), etc.
  • PCR techniques are preferred for site directed mutagenesis (see Higuchi, 1989, "Using PCR to Engineer DNA", in PCR Technology: Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press, Chapter 6, pp. 61-70).
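The site-directed change described above, eliminating an internal methionine codon so that the sequence lacks an internal translation initiation site, amounts to replacing one codon in the coding sequence. The sketch below shows only that in silico step on a toy sequence. The function name `replace_codon`, the choice of ATC (isoleucine) as the replacement, and the 15-nt example are all illustrative assumptions; the example is not the NF-E4 sequence, and the application does not specify a replacement codon here.

```python
def replace_codon(coding_dna: str, residue_number: int, new_codon: str,
                  expected_codon: str = "ATG") -> str:
    """Replace the codon for a 1-based residue number in a coding sequence.

    Raises ValueError if the codon at that position is not `expected_codon`,
    which guards against editing the wrong site.
    """
    if residue_number < 1:
        raise ValueError("residue_number is 1-based and must be positive")
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    seq = coding_dna.upper()
    start = (residue_number - 1) * 3
    found = seq[start:start + 3]
    if found != expected_codon.upper():
        raise ValueError(
            f"expected {expected_codon} at residue {residue_number}, found {found!r}"
        )
    return seq[:start] + new_codon.upper() + seq[start + 3:]


if __name__ == "__main__":
    # Toy coding sequence (hypothetical): residue 3 is an internal ATG.
    toy = "CTGGCCATGAAATAA"
    print(replace_codon(toy, residue_number=3, new_codon="ATC"))
```

In practice the mutated sequence would be introduced with mutagenic PCR primers spanning the changed codon; the helper only computes what the edited sequence should look like.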
  • the identified and isolated gene can then be inserted into an appropriate cloning vector.
  • vector-host systems known in the art may be used. Possible vectors include, but are not limited to, plasmids or modified viruses, but the vector system must be compatible with the host cell used. Examples of vectors include, but are not limited to, E. coli, bacteriophages such as lambda derivatives, or plasmids such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, etc.
  • the insertion into a cloning vector can, for example, be accomplished by ligating the DNA fragment into a cloning vector which has complementary cohesive termini.
  • the ends of the DNA molecules may be enzymatically modified.
  • any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences.
  • Recombinant molecules can be introduced into host cells via transformation, transfection, infection, electroporation, etc., so that many copies of the gene sequence are generated.
  • the cloned gene is contained on a shuttle vector plasmid, which provides for expansion in a cloning cell, e.g., E. coli, and facile purification for subsequent insertion into an appropriate expression cell line, if such is desired.
  • a shuttle vector, which is a vector that can replicate in more than one type of organism, can be prepared for replication in both E. coli and Saccharomyces cerevisiae by linking sequences from an E. coli plasmid with sequences from the yeast 2μ plasmid.
  • nucleotide sequence coding for NF-E4, or antigenic fragment, derivative or analog thereof, or a functionally active derivative, including a chimeric protein, thereof can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence.
  • a nucleic acid encoding NF-E4 of the invention can be operationally associated with a promoter in an expression vector of the invention. Both cDNA and genomic sequences can be cloned and expressed under control of such regulatory sequences.
  • Such vectors can be used to express functional or functionally inactivated NF-E4 polypeptides.
  • the necessary transcriptional and translational signals can be provided on a recombinant expression vector.
  • Potential host-vector systems include but are not limited to mammalian cell systems transfected with expression plasmids or infected with virus (e.g., vaccinia virus, adenovirus, adeno-associated virus, herpes virus, etc.); insect cell systems infected with virus (e.g., baculovirus); microorganisms such as yeast containing yeast vectors; or bacteria transformed with bacteriophage, DNA, plasmid DNA, or cosmid DNA.
  • the expression elements of vectors vary in their strengths and specificities. Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used.
  • NF-E4 protein may be controlled by any promoter/enhancer element known in the art, but these regulatory elements must be functional in the host selected for expression. Promoters which may be used to control NF-E4 gene expression include, but are not limited to, cytomegalovirus (CMV) promoter (U.S. Patent Nos. 5,385,839 and No.
  • Soluble forms of the protein can be obtained by collecting culture fluid, or solubilizing inclusion bodies, e.g., by treatment with detergent, and if desired sonication or other mechanical processes, as described above.
  • the solubilized or soluble protein can be isolated using various techniques, such as polyacrylamide gel electrophoresis (PAGE), isoelectric focusing, 2-dimensional gel electrophoresis, chromatography (e.g., ion exchange, affinity, immunoaffinity, and sizing column chromatography), centrifugation, differential solubility, immunoprecipitation, or by any other standard technique for the purification of proteins.
  • a wide variety of host/expression vector combinations may be employed in expressing the DNA sequences of this invention.
  • Useful expression vectors may consist of segments of chromosomal, non-chromosomal and synthetic DNA sequences.
  • Suitable vectors include derivatives of SV40 and known bacterial plasmids, e.g., E. coli plasmids ColE1, pCR1, pBR322, pMal-C2, pET, pGEX (Smith et al., Gene, 67:31-40, 1988), pMB9 and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2μ plasmid or derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or other expression control sequences; and the like.
  • Preferred vectors are viral vectors, such as lentiviruses, retroviruses, herpes viruses, adenoviruses, adeno-associated viruses, vaccinia virus, baculovirus, and other recombinant viruses with desirable cellular tropism.
  • a gene encoding a functional or mutant NF-E4 protein or polypeptide domain fragment thereof can be introduced in vivo, ex vivo, or in vitro using a viral vector or through direct introduction of DNA.
  • Expression in targeted tissues can be effected by targeting the transgenic vector to specific cells, such as with a viral vector or a receptor ligand, or by using a tissue-specific promoter, or both. Targeted gene delivery is described in International
  • Viral vectors commonly used for in vivo or ex vivo targeting and therapy procedures are DNA-based vectors and retroviral vectors. Methods for constructing and using viral vectors are known in the art (see, e.g., Miller and Rosman, BioTechniques, 7:980-990, 1992).
  • the viral vectors are replication defective, that is, they are unable to replicate autonomously in the target cell.
  • the genome of the replication defective viral vectors which are used within the scope of the present invention lacks at least one region which is necessary for the replication of the virus in the infected cell. These regions can either be eliminated (in whole or in part) or be rendered non-functional by any technique known to a person skilled in the art.
  • These techniques include the total removal, substitution (by other sequences, in particular by the inserted nucleic acid), partial deletion or addition of one or more bases to an essential (for replication) region.
  • Such techniques may be performed in vitro (on the isolated DNA) or in situ, using the techniques of genetic manipulation or by treatment with mutagenic agents.
  • the replication defective virus retains the sequences of its genome which are necessary for encapsidating the viral particles.
  • DNA viral vectors include an attenuated or defective DNA virus, such as but not limited to herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, adeno-associated virus (AAV), and the like.
  • Defective viruses, which entirely or almost entirely lack viral genes, are preferred. Defective virus is not infective after introduction into a cell.
  • Use of defective viral vectors allows for administration to cells in a specific, localized area, without concern that the vector can infect other cells. Thus, a specific tissue can be specifically targeted.
  • particular vectors include, but are not limited to, a defective herpes virus 1 (HSV1) vector (Kaplitt et al, Molec. Cell. Neurosci.
  • viral vectors commercially, including but by no means limited to Avigen, Inc. (Alameda, CA; AAV vectors), Cell Genesys (Foster City, CA; retroviral, adenoviral, AAV vectors, and lentiviral vectors), Clontech (retroviral and baculoviral vectors), Genovo, Inc.
  • an appropriate immunosuppressive treatment is employed in conjunction with the viral vector, e.g., adenovirus vector, to avoid immuno-deactivation of the viral vector and transfected cells.
  • immunosuppressive cytokines, such as interleukin-12 (IL-12), interferon-γ (IFN-γ), or anti-CD4 antibody, can be administered to block humoral or cellular immune responses to the viral vectors (see, e.g., Wilson, Nature Medicine, 1995).
  • a viral vector that is engineered to express a minimal number of antigens.
  • Adenovirus vectors are eukaryotic DNA viruses that can be modified to efficiently deliver a nucleic acid of the invention to a variety of cell types.
  • Various serotypes of adenovirus exist. Of these serotypes, preference is given, within the scope of the present invention, to using type 2 or type 5 human adenoviruses (Ad 2 or Ad 5) or adenoviruses of animal origin (see WO94/26914).
  • adenoviruses of animal origin which can be used within the scope of the present invention include adenoviruses of canine, bovine, murine (example: Mav1, Beard et al., Virology 75 (1990) 81), ovine, porcine, avian, and simian (example: SAV) origin.
  • the adenovirus of animal origin is a canine adenovirus, more preferably a CAV2 adenovirus (e.g., Manhattan or A26/61 strain [ATCC VR-800]).
  • replication defective adenovirus and minimum adenovirus vectors have been described (WO94/26914, WO95/02697, WO94/28938, WO94/28152, WO94/12649, WO95/02697, WO96/22378).
  • the replication defective recombinant adenoviruses according to the invention can be prepared by any technique known to the person skilled in the art (Levrero et al., Gene 101:195, 1991; EP 185 573; Graham, EMBO J. 3:2917, 1984; Graham et al., J. Gen. Virol. 36:59, 1977). Recombinant adenoviruses are recovered and purified using standard molecular biological techniques, which are well known to one of ordinary skill in the art.
  • Adeno-associated viruses are DNA viruses of relatively small size which can integrate, in a stable and site-specific manner, into the genome of the cells which they infect. They are able to infect a wide spectrum of cells without inducing any effects on cellular growth, morphology or differentiation, and they do not appear to be involved in human pathologies.
  • the AAV genome has been cloned, sequenced and characterized. The use of vectors derived from the AAVs for transferring genes in vitro and in vivo has been described (see WO 91/18088; WO 93/09239; US 4,797,368, US 5,139,941, EP 488 528).
  • the replication defective recombinant AAVs according to the invention can be prepared by cotransfecting a plasmid containing the nucleic acid sequence of interest flanked by two AAV inverted terminal repeat (ITR) regions, and a plasmid carrying the AAV encapsidation genes (rep and cap genes), into a cell line which is infected with a human helper virus (for example an adenovirus).
  • Retrovirus vectors. In another embodiment, the gene can be introduced in a retroviral vector, e.g., as described in Anderson et al., U.S. Patent No. 5,399,346; Mann et al., 1983, Cell 33:153; Temin et al., U.S. Patent No. 4,650,764; Temin et al., U.S. Patent No. 4,980,289; Markowitz et al., 1988, J. Virol. 62:1120; Temin et al., U.S. Patent No. 5,124,263; EP 453242, EP178220; Bernstein et al. Genet. Eng.
  • the retroviruses are integrating viruses which infect dividing cells.
  • the retrovirus genome includes two LTRs, an encapsidation sequence and three coding regions (gag, pol and env).
  • the gag, pol and env genes are generally deleted, in whole or in part, and replaced with a heterologous nucleic acid sequence of interest.
  • vectors can be constructed from different types of retrovirus, such as HIV, MoMuLV ("murine Moloney leukaemia virus"), MSV ("murine Moloney sarcoma virus"), HaSV ("Harvey sarcoma virus"), SNV ("spleen necrosis virus"), RSV ("Rous sarcoma virus") and Friend virus.
  • Suitable packaging cell lines have been described in the prior art, in particular the cell line PA317 (US 4,861,719); the PsiCRIP cell line (WO 90/02806) and the GP+envAm-12 cell line (WO 89/07150).
  • the recombinant retroviral vectors can contain modifications within the LTRs for suppressing transcriptional activity as well as extensive encapsidation sequences which may include a part of the gag gene (Bender et al., J. Virol. 61:1639, 1987).
  • Recombinant retroviral vectors are purified by standard techniques known to those having ordinary skill in the art.
  • Retroviral vectors can be constructed to function as infectious particles or to undergo a single round of transfection.
  • the virus is modified to retain all of its genes except for those responsible for oncogenic transformation properties, and to express the heterologous gene.
  • Non-infectious viral vectors are manipulated to destroy the viral packaging signal, but retain the structural genes required to package the co-introduced virus engineered to contain the heterologous gene and the packaging signals.
  • the viral particles that are produced are not capable of producing additional virus.
  • Retrovirus vectors can also be introduced by DNA viruses, which permits one cycle of retroviral replication and amplifies transfection efficiency (see WO 95/22617, WO 95/26411, WO 96/39036, WO 97/19182).
  • Lentivirus vectors can be used as agents for the direct delivery and sustained expression of a transgene in several tissue types, including brain, retina, muscle, liver and blood.
  • the vectors can efficiently transduce dividing and nondividing cells in these tissues, and maintain long-term expression of the gene of interest.
  • Lentiviral packaging cell lines are available and known generally in the art. They facilitate the production of high-titer lentivirus vectors for gene therapy.
  • An example is a tetracycline-inducible VSV-G pseudotyped lentivirus packaging cell line which can generate virus particles at titers greater than 10^6 IU/ml for at least 3 to 4 days (Kafri et al., J. Virol., 73:576-584, 1999).
  • the vector produced by the inducible cell line can be concentrated as needed for efficiently transducing nondividing cells in vitro and in vivo.
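The titer figure above lends itself to a quick dosing calculation. The following is a minimal sketch, assuming that a user picks a target multiplicity of infection (MOI, infectious units per cell); the MOI, the cell count, and the function name are illustrative assumptions and do not come from the specification.

```python
def vector_volume_ml(cell_count, moi, titer_iu_per_ml):
    """Volume of vector stock (ml) needed to reach a target MOI.

    cell_count: number of target cells (hypothetical input)
    moi: desired infectious units per cell (assumption, not from the source)
    titer_iu_per_ml: stock titer, e.g. 1e6 IU/ml as quoted for the packaging line
    """
    if titer_iu_per_ml <= 0:
        raise ValueError("titer must be positive")
    if cell_count < 0 or moi < 0:
        raise ValueError("cell count and MOI must be non-negative")
    # infectious units required, divided by units available per ml
    return cell_count * moi / titer_iu_per_ml


# Example: 1e5 cells at an assumed MOI of 5 with a 1e6 IU/ml stock
volume = vector_volume_ml(1e5, 5, 1e6)
```

If the computed volume is impractically large for a given culture, that is one situation in which concentrating the vector stock, as the text mentions, would help.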
  • Non-viral vectors can be introduced in vivo by lipofection, as naked DNA, or with other transfection facilitating agents (peptides, polymers, etc.).
  • Synthetic cationic lipids can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. U.S.A. 84:7413-7417, 1987; Felgner and Ringold, Science 337:387-388, 1989; see Mackey et al., Proc. Natl. Acad. Sci. U.S.A.
  • lipid compounds and compositions for transfer of nucleic acids are described in International Patent Publications WO95/18863 and WO96/17823, and in U.S. Patent No. 5,459,127.
  • Lipids may be chemically coupled to other molecules for the purpose of targeting (see Mackey et al, supra).
  • Targeted peptides, e.g., hormones or neurotransmitters, and proteins such as antibodies, or non-peptide molecules could be coupled to liposomes chemically.
  • a nucleic acid in vivo, such as a cationic oligopeptide (e.g., International Patent Publication WO95/21931), peptides derived from DNA binding proteins (e.g., International Patent Publication WO96/25508), or a cationic polymer (e.g., International Patent Publication WO95/21931).
  • naked DNA vectors for gene therapy can be introduced into the desired host cells by methods known in the art, e.g., electroporation, microinjection, cell fusion, DEAE dextran, calcium phosphate precipitation, use of a gene gun, or use of a DNA vector transporter (see, e.g., Wu et al., J. Biol. Chem. 267:963-967, 1992; Wu and Wu, J. Biol. Chem. 263:14621-14624, 1988; Hartmut et al., Canadian Patent Application No. 2,012,311, filed March 15, 1990; Williams et al., Proc. Natl. Acad. Sci.
  • Antibodies to NF-E4 are useful, inter alia, for diagnostics and intracellular regulation of NF-E4 activity, as set forth below.
  • NF-E4 polypeptides produced recombinantly or by chemical synthesis, and fragments or other derivatives or analogs thereof, including fusion proteins may be used as an immunogen to generate antibodies that recognize the NF-E4 polypeptide.
  • Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. Such an antibody is specific for human NF-E4; it may recognize a mutant form of NF-E4, or wild-type NF-E4.
  • Various procedures known in the art may be used for the production of polyclonal antibodies to NF-E4 polypeptide or derivative or analog thereof.
  • various host animals can be immunized by injection with the NF-E4 polypeptide, or a derivative (e.g., fragment or fusion protein) thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc.
  • the NF-E4 polypeptide or fragment thereof can be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH).
  • a peptide comprising SEQ ID NO:8 is conjugated to KLH and used to immunize rabbits.
  • Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guérin) and Corynebacterium parvum.
  • any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (Nature 256:495-497, 1975), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72, 1983; Cote et al., Proc. Natl. Acad. Sci. USA, 80:2026-2030, 1983), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in
  • monoclonal antibodies can be produced in germ-free animals (International Patent Publication No. WO 89/12690, published 28 December 1989).
  • techniques developed for the production of "chimeric antibodies" (Morrison et al., J. Bacteriol., 159:870, 1984; Neuberger et al., Nature, 312:604-608, 1984; Takeda et al.
  • human or humanized chimeric antibodies are preferred for use in therapy of human diseases or disorders (described infra), since the human or humanized antibodies are much less likely than xenogenic antibodies to induce an immune response, in particular an allergic response, themselves.
  • Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques.
  • such fragments include but are not limited to: the F(ab')2 fragment which can be produced by pepsin digestion of the antibody molecule; the Fab' fragments which can be generated by reducing the disulfide bridges of the F(ab')2 fragment; and the Fab fragments which can be generated by treating the antibody molecule with papain and a reducing agent.
  • Patent 4,946,778 can be adapted to produce NF-E4 polypeptide-specific single chain antibodies.
  • An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al, Science 246:1275-1281, 1989) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for an NF-E4 polypeptide, or its derivatives, or analogs.
  • screening for or testing with the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.
  • antibody binding is detected by detecting a label on the primary antibody.
  • the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody.
  • the secondary antibody is labeled.
  • Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of an NF-E4 polypeptide, one may assay generated hybridomas for a product which binds to an NF-E4 polypeptide fragment containing such epitope.
  • For selection of an antibody specific to an NF-E4 polypeptide from a particular species of animal, one can select on the basis of positive binding with NF-E4 polypeptide expressed by or isolated from cells of that species of animal.
  • the foregoing antibodies can be used in methods known in the art relating to the localization and activity of the NF-E4 polypeptide, e.g., for Western blotting, imaging NF-E4 polypeptide in situ, measuring levels thereof in appropriate physiological samples, etc. using any of the detection techniques mentioned above or known in the art.
  • Such antibodies can also be used in assays for ligand binding, e.g., as described in US Patent No. 5,679,582.
  • Antibody binding generally occurs most readily under physiological conditions, e.g., pH of between about 7 and 8, and physiological ionic strength.
  • a carrier protein in the buffer solutions stabilizes the assays.
  • Perturbation of optimal conditions, e.g., increasing or decreasing ionic strength, temperature, or pH, or adding detergents or chaotropic salts, will decrease binding stability.
  • antibodies that agonize or antagonize the activity of NF-E4 polypeptide can be generated.
  • intracellular single chain Fv antibodies can be used to regulate (inhibit) NF-E4 (Marasco et al, Proc. Natl. Acad. Sci.
  • nucleotide sequences derived from the gene encoding NF-E4, and peptide sequences derived from NF-E4, are useful targets to identify drugs that are effective in treating hemoglobin disorders.
  • Drug targets include without limitation (i) isolated nucleic acids derived from the gene encoding NF-E4 and (ii) isolated peptides and polypeptides derived from NF-E4 polypeptides.
  • identification and isolation of NF-E4 provides for development of screening assays, particularly for high throughput screening of molecules that up- or down-regulate the activity of NF-E4, e.g., by permitting expression of NF-E4 in quantities greater than can be isolated from natural sources, or in indicator cells that are specially engineered to indicate the activity of NF-E4 expressed after transfection or transformation of the cells. Accordingly, the present invention contemplates methods for identifying specific ligands of NF-E4 using various screening assays known in the art.
  • Any screening technique known in the art can be used to screen for NF-E4 agonists or antagonists.
  • the present invention contemplates screens for small molecule ligands or ligand analogs and mimics, as well as screens for natural ligands that bind to and agonize or antagonize NF-E4 in vivo.
  • Such agonists or antagonists may, for example, interfere in the phosphorylation or dephosphorylation of NF-E4, with resulting effects on NF-E4 function.
  • natural products libraries can be screened using assays of the invention for molecules that agonize or antagonize NF-E4 activity.
  • Test compounds are screened from large libraries of synthetic or natural compounds. Numerous means are currently used for random and directed synthesis of saccharide, peptide, and nucleic acid based compounds. Synthetic compound libraries are commercially available from Maybridge Chemical Co.
  • Candidate agents are added to in vitro cell cultures of hemopoietic cells, prepared by known methods in the art (D. Metcalf, Hemopoietic Colonies, In Vitro Cloning of Normal and Leukemic Cells, particularly at Ch. 3, Springer-Verlag, 1977), and the levels of NF-E4 RNA are measured. If the amount of RNA in cultures which include the agent rises, compared to the control cultures which are devoid of the agent, the agent is chosen as a candidate for in vivo testing.
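The selection rule just described (an agent is kept when NF-E4 RNA rises relative to agent-free control cultures) can be sketched as a simple comparison of means. This is only an illustration: the function name, the data layout, and the `min_fold` cutoff are assumptions, and the specification does not prescribe a particular statistic or threshold.

```python
from statistics import mean


def select_candidates(treated, control, min_fold=1.0):
    """Return agents whose mean NF-E4 RNA level exceeds the control mean.

    treated: dict mapping agent name -> list of replicate RNA measurements
    control: list of replicate measurements from cultures without any agent
    min_fold: hypothetical cutoff; 1.0 means any rise over control qualifies
    """
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control RNA level must be positive")
    selected = {}
    for agent, values in treated.items():
        fold = mean(values) / baseline  # fold change relative to control
        if fold > min_fold:
            selected[agent] = fold
    return selected


# Example with made-up measurements (arbitrary units)
hits = select_candidates(
    {"agent_A": [2.0, 2.0], "agent_B": [0.5, 0.5]},
    control=[1.0, 1.0],
)
```

In practice a stricter cutoff or a significance test across replicates would likely be preferred over a bare comparison of means; the sketch only mirrors the rule as worded.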
  • Various in vitro systems can be used to analyze the effects of a new compound on NF-E4 expression.
  • Human bone marrow or peripheral blood mononuclear cells are isolated by standard techniques known in the art. They are plated in methylcellulose in the presence of the cytokines SCF, IL-3, IL-6, and erythropoietin (semisolid culture) or in tissue culture media with the same cytokines (liquid culture). Each experiment is performed in triplicate at five different dilutions of compound on at least two different patient samples. At about 12 to 14 days, colonies are counted to exclude direct cytotoxic effects; morphological analysis is performed to exclude adverse effects on erythroid differentiation.
  • BFUe colonies are plucked from the semisolid media (cells are lysed in the liquid culture) and assayed by Reverse Transcriptase Polymerase Chain Reaction (RT-PCR) for NF-E4 and globin subtype expression, using RNA obtained from the cells.
  • antibodies are used to detect expression of NF-E4, and the ratio of the 22 kD to 14 kD NF-E4 polypeptides, as well as transcription factors including but not limited to GATA1-3, NF-E2, and EKLF.
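The experimental design stated above (triplicate cultures, five dilutions of compound, at least two patient samples) can be enumerated to see how many cultures one compound requires. The sketch below is illustrative only; the sample labels and dilution values are hypothetical placeholders, not values from the specification.

```python
from itertools import product


def culture_plan(samples, dilutions, replicates=3):
    """List every culture needed for one compound.

    samples: patient sample identifiers (the text requires at least two)
    dilutions: compound dilutions (the text specifies five)
    replicates: cultures per condition (triplicate in the text)
    """
    if len(samples) < 2:
        raise ValueError("at least two patient samples are required")
    if len(dilutions) != 5:
        raise ValueError("five dilutions of compound are expected")
    return [
        {"sample": s, "dilution": d, "replicate": r}
        for s, d, r in product(samples, dilutions, range(1, replicates + 1))
    ]


# Hypothetical labels and dilution factors
plan = culture_plan(["patient_1", "patient_2"], [1, 10, 100, 1000, 10000])
n_cultures = len(plan)  # 2 samples x 5 dilutions x 3 replicates
```

With the minimum of two patient samples this comes to 30 cultures per compound for each culture format (semisolid or liquid), before controls are added.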
  • a recombinant NF-E4 activity system can be constructed.
  • a host cell is modified to contain either globin genes or a reporter gene operably associated with the ⁇ -globin or ⁇ -globin promoter from the ⁇ -globin control locus.
  • the host cell can be a human hematopoietic cell containing the endogenous NF-E4 gene, including the native NF-E4 expression control sequences; or the host cell can be a non-human cell modified to express human NF-E4 constitutively or under control of its native expression control sequences.
  • Compounds are tested for the ability to promote or inhibit NF-E4 activity, which is evaluated on the level of expression of ⁇ - or ⁇ -globin, or both, or the reporter gene.
  • Compounds that modulate NF-E4 activity can be tested in cells that constitutively express NF-E4, e.g., K562 cells or modified cells.
  • Compounds that modulate NF-E4 expression can be tested in cells in which the NF-E4 coding sequence is operably associated with an NF-E4 expression control sequence. In this latter embodiment, NF-E4 expression can be tested directly.
  • Reporter genes for use in the invention encode enzymatically, spectroscopically or immunologically detectable proteins, including, but by no means limited to, chloramphenicol acetyl transferase (CAT), β-galactosidase (β-gal), luciferase, green fluorescent protein (GFP), alkaline phosphatase, and derivatives thereof.
  • Intact cells or whole animals expressing a gene encoding NF-E4 can be used in screening methods to identify candidate drugs.
  • a permanent cell line is established.
  • cells (including without limitation mammalian, insect, yeast, or bacterial cells) are transiently programmed to express an NF-E4 gene by introduction of appropriate DNA or mRNA, e.g., using the vector systems described above.
  • Identification of candidate compounds can be achieved using any suitable assay, including without limitation (i) assays that measure selective binding of test compounds to NF-E4; (ii) assays that measure the ability of a test compound to modify (i.e., inhibit or enhance) a measurable activity or function of NF-E4; and (iii) assays that measure the ability of a compound to modify (i.e., inhibit or enhance) the transcriptional activity of sequences derived from the promoter (i.e., regulatory) regions of the NF-E4 gene.
  • Transgenic mammals can be prepared for evaluating the molecular mechanisms of NF-E4. Such mammals provide excellent models for screening or testing drug candidates.
  • the term "transgenic" usually refers to an animal whose germ line and somatic cells contain the transgene of interest, i.e., NF-E4.
  • transient transgenic animals can be created by the ex vivo or in vivo introduction of an expression vector of the invention. Both types of "transgenic" animals are contemplated for use in the present invention, e.g., to evaluate the effect of a test compound on NF-E4 expression or activity.
  • human NF-E4, or NF-E4 and , or both, "knock-in” mammals can be prepared for evaluating the molecular biology of this system in greater detail than is possible with human subjects. It is also possible to evaluate compounds or diseases on "knockout" animals, e.g., to identify a compound that can compensate for a defect in NF-E4 activity. Both technologies permit manipulation of single units of genetic information in their natural position in a cell genome and to examine the results of that manipulation in the background of a terminally differentiated organism.
  • Although rats and mice, as well as rabbits, are most frequently employed as transgenic animals, particularly for laboratory studies of protein function and gene regulation in vivo, any animal can be employed in the practice of the invention.
  • double transgenic animals, e.g., for NF-E4 and the β-locus control regions, can be prepared by mating the corresponding single transgenic animals.
  • Various transgenic animals containing non-native β-globin locus control regions have been generated (see Taboit-Dameron et al., Transgenic Res., 1999, 8:223-35; Osborne et al., J. Virol.
  • a "knock-in" mammal is a mammal in which an endogenous gene is substituted with a heterologous gene (Roemer et al, New Biol., 3:331, 1991).
  • the heterologous gene is "knocked-in" to a locus of interest, either the subject of evaluation (in which case the gene may be a reporter gene; see Elefanty et al., Proc. Natl. Acad. Sci. USA, 95:11897, 1998) of expression or function of a homologous gene, thereby linking the heterologous gene expression to transcription from the appropriate promoter. This can be achieved by homologous recombination, transposon (Westphal and Leder, Curr.
  • a "knockout mammal” is a mammal (e.g., mouse) that contains within its genome a specific gene that has been inactivated by the method of gene targeting (see, e.g., US Patents No. 5,777,195 and No. 5,616,491).
  • a knockout mammal includes both a heterozygous knockout (i.e., one defective allele and one wild-type allele) and a homozygous knockout (i.e., two defective alleles).
  • Preparation of a knockout mammal requires first introducing a nucleic acid construct that will be used to suppress expression of a particular gene into an undifferentiated cell type termed an embryonic stem (ES) cell.
  • This cell is then injected into a mammalian embryo.
  • a mammalian embryo with an integrated cell is then implanted into a foster mother for the duration of gestation.
  • Zhou, et al. (Genes and Development, 9:2623-34, 1995) describe PPCA knockout mice.
  • knockout refers to partial or complete suppression of the expression of at least a portion of a protein encoded by an endogenous DNA sequence in a cell.
  • knockout construct refers to a nucleic acid sequence that is designed to decrease or suppress expression of a protein encoded by endogenous DNA sequences in a cell.
  • the nucleic acid sequence used as the knockout construct is typically comprised of (1) DNA from some portion of the gene (exon sequence, intron sequence, and/or promoter sequence) to be suppressed and (2) a marker sequence used to detect the presence of the knockout construct in the cell.
  • the knockout construct is inserted into a cell, and integrates with the genomic DNA of the cell in such a position so as to prevent or interrupt transcription of the native DNA sequence.
  • Such insertion usually occurs by homologous recombination (i.e., regions of the knockout construct that are homologous to endogenous DNA sequences hybridize to each other when the knockout construct is inserted into the cell and recombine so that the knockout construct is incorporated into the corresponding position of the endogenous DNA).
  • the knockout construct nucleic acid sequence may comprise (1) a full or partial sequence of one or more exons and/or introns of the gene to be suppressed, (2) a full or partial promoter sequence of the gene to be suppressed, or (3) combinations thereof.
  • the knockout construct is inserted into an embryonic stem cell (ES cell) and is integrated into the ES cell genomic DNA, usually by the process of homologous recombination. This ES cell is then injected into, and integrates with, the developing embryo.
  • disruption of the gene refers to insertion of a nucleic acid sequence into one region of the native DNA sequence
  • a nucleic acid construct can be prepared containing a DNA sequence encoding an antibiotic resistance gene which is inserted into the DNA sequence that is complementary to the DNA sequence
  • the DNA will be at least about 1 kilobase (kb) in length and preferably 3-4 kb in length, thereby providing sufficient complementary sequence for recombination when the construct is introduced.
  • Transgenic constructs can be introduced into the genomic DNA of the ES cells, into the male pronucleus of a fertilized oocyte by microinjection, or by any methods known in the art, e.g., as described in U.S. Patent Nos. 4,736,866 and 4,870,009, and by Hogan et al., Transgenic Animals: A Laboratory Manual, 1986, Cold Spring Harbor.
  • a transgenic founder animal can be used to breed other transgenic animals; alternatively, a transgenic founder may be cloned to produce other transgenic animals.
  • a mammal in which two or more genes have been knocked out or knocked in, or both.
  • Such mammals can be generated by repeating the procedures set forth herein for generating each knockout construct, or by breeding two mammals, each with a single gene knocked out, to each other, and screening for those with the double knockout genotype.
  • Regulated knockout animals can be prepared using various systems, such as the tet-repressor system (see US Patent No. 5,654,168) or the Cre-Lox system (see US Patents No. 4,959,317 and No. 5,801,030).
  • Agents according to the invention may be identified by screening in high-throughput assays, including without limitation cell-based or cell-free assays. It will be appreciated by those skilled in the art that different types of assays can be used to detect different types of agents. Several methods of automated assays have been developed in recent years so as to permit screening of tens of thousands of compounds in a short period of time. Such high-throughput screening methods are particularly preferred. The use of high-throughput screening assays to test for agents is greatly facilitated by the availability of large amounts of purified polypeptides, as provided by the invention.
  • the present invention permits, for the first time, an effective molecular treatment for various hemoglobinopathies, including β-thalassemia and sickle-cell anemia.
  • the subjects to which the present invention is applicable may be any mammalian or vertebrate species, which include, but are not limited to, cows, horses, sheep, pigs, fowl (e.g., chickens), goats, cats, dogs, hamsters, mice, rats, monkeys, rabbits, chimpanzees, and humans.
  • the subject is a human.
  • Positive regulation of expression of genes encoding fetal and embryonic globin and simultaneous negative regulation of genes encoding defective adult globin can be achieved by delivery of positive regulator NF-E4 polypeptide (e.g., the 22 kD polypeptide), by gene therapy, including providing a vector that expresses positive regulator NF-E4 in target (erythropoietic) cells or modifying target cells by introduction of a heterologous promoter in the NF-E4 gene that provides for amplification of expression of the endogenous NF-E4 gene.
  • the polypeptide or gene therapeutic can be delivered as a pharmaceutical composition, i.e., a mixture or admixture of the polypeptide or vector with a pharmaceutically acceptable carrier or excipient.
  • pharmaceutically acceptable refers to molecular entities and compositions that are physiologically tolerable and do not typically produce an allergic or similar untoward reaction, such as gastric upset, dizziness and the like, when administered to a human.
  • pharmaceutically acceptable means approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, and more particularly in humans.
  • carrier refers to a diluent, adjuvant, excipient, or vehicle with which the compound is administered.
  • Such pharmaceutical carriers can be sterile liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like.
  • Water or aqueous saline solutions and aqueous dextrose and glycerol solutions are preferably employed as carriers, particularly for injectable solutions. Suitable pharmaceutical carriers are described in "Remington's Pharmaceutical Sciences" by E.W. Martin.
  • therapeutically effective amount is used herein to mean an amount sufficient to reduce by at least about 15 percent, preferably by at least 50 percent, more preferably by at least 90 percent, and most preferably prevent, a clinically significant deficit in the activity, function and response of the host. Alternatively, a therapeutically effective amount is sufficient to cause an improvement in a clinically significant condition in the host.
  • a therapeutically effective polypeptide or gene therapy is a therapy that results in expression of a sufficient level of fetal or embryonic, or both, globins so that there is a measurable improvement in the pathological condition.
  • a measurable improvement is one in which the level of blood oxygenation increases; or a symptom of hemoglobinopathy is reduced perceptibly.
  • a therapeutically effective polypeptide or gene therapy is a therapy that results in a reduction in the level of expression of a fetal or embryonic globin mRNA or protein.
  • a therapy results in an improvement of a condition associated with overexpression of the fetal or embryonic globins.
  • compositions containing NF-E4 polypeptide for use in accordance with the present invention can be formulated in any conventional manner using one or more physiologically acceptable carriers or excipients.
  • the polypeptide or functionally active fragments thereof, and their physiologically acceptable salts and solvents, can be formulated for administration by inhalation or insufflation (either through the mouth or the nose) or oral, buccal, parenteral or rectal administration.
  • the therapeutics can take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinized maize starch, polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate).
  • Liquid preparations for oral administration can take the form of, for example, solutions, syrups or suspensions, or they can be presented as a dry product for constitution with water or other suitable vehicle before use.
  • Such liquid preparations can be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p- hydroxybenzoates or sorbic acid).
  • the preparations can also contain buffer salts, flavoring, coloring and sweetening agents as appropriate.
  • Preparations for oral administration can be suitably formulated to give controlled release of the active compound.
  • the therapeutics can take the form of tablets or lozenges formulated in conventional manner.
  • the therapeutics according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas.
  • the therapeutics can be formulated for parenteral administration (i.e., intravenous or intramuscular) by injection, via, for example, bolus injection or continuous infusion.
  • Formulations for injection can be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative.
  • the compositions can take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and can contain formulatory agents such as suspending, stabilizing and/or dispersing agents.
  • the active ingredient can be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.
  • the therapeutics can also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.
  • the therapeutics can also be formulated as a depot preparation.
  • Such long acting formulations can be administered by implantation (for example, subcutaneously or intramuscularly) or by intramuscular injection.
  • the compounds can be formulated with suitable polymeric or hydrophobic materials (for example, as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
  • composition if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents.
  • the composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder.
  • Oral formulation can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, etc.
  • the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampoule or sachet indicating the quantity of active agent.
  • an ampoule of sterile diluent can be provided so that the ingredients may be mixed prior to administration.
  • the invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the vaccine formulations of the invention.
  • Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.
  • compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient.
  • the pack may for example comprise metal or plastic foil, such as a blister pack.
  • the pack or dispenser device may be accompanied by instructions for administration.
  • Composition comprising a compound of the invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an appropriate container, and labelled for treatment of an indicated condition.
  • a vector for expression of NF-E4 or alternatively a vector for introduction of a heterologous expression regulatory sequence to amplify endogenous NF-E4 expression (see U.S. Patent No. 5,733,761), can be delivered to treat or prevent a disease or disorder associated with a hemoglobinopathy.
  • the therapeutic vector comprises a nucleic acid that expresses NF-E4 in a suitable host.
  • a vector has a promoter operably linked to the coding sequence for NF-E4.
  • the promoter can be inducible or constitutive and, optionally, tissue-specific.
  • a nucleic acid molecule is used in which the NF-E4 sequences and any other desired sequences are flanked by regions that promote homologous recombination at a desired site in the genome, thus providing for intrachromosomal expression of the synthebody (Koller and Smithies, Proc. Natl. Acad. Sci. USA, 1989, 86:8932-8935; Zijlstra et al, Nature, 1989, 342:435-438).
  • Delivery of the vector into a patient may be either direct, in which case the patient is directly exposed to the vector or a delivery complex, or indirect, in which case, cells are first transformed with the vector in vitro then transplanted into the patient. These two approaches are known, respectively, as in vivo and ex vivo gene therapy.
  • the form and amount of therapeutic nucleic acid envisioned for use depends on the type of disease and the severity of the desired effect, patient state, etc., and can be determined by one skilled in the art.
  • EXAMPLE 1 Induction of Human Fetal Globin Gene Expression by a
  • Yeast two-hybrid screen The cDNA sequence encoding the COOH- terminal 242 amino acids (aa 260-502) of CP2 was inserted into the yeast expression vector pGBT9 (Clontech). The resultant plasmid encodes a hybrid protein containing the DNA-binding domain of GAL4 fused in-frame to CP2 residues.
  • the yeast reporter strain, HF7C, rendered competent by the lithium acetate method was sequentially transformed with this vector and a plasmid cDNA library derived from K562 cells constructed in the yeast expression vector pACT2 (Fields and Song, Nature, 340:245-246, 1989).
  • the cDNAs in this vector were fused with the GAL4 transactivation domain.
  • the yeast were plated on leucine-/tryptophan-/histidine- plates and incubated at 30°C for 4 days. Potential protein interactions were indicated by activation of the histidine reporter gene and growth on these plates, and by activation of the second reporter gene, β-galactosidase, and a positive X-gal assay.
  • Library plasmids were rescued from yeast clones using the acid-washed glass beads procedure and electroporated into the competent E. coli strain MC1061.
  • Sequencing reactions were performed using the Taq Dyedeoxy terminator cycle sequencing method (Applied Biosystems) and analyzed on an Applied Biosystems 373 automated sequencer.
  • Yeast one-hybrid assay A concatamer of four copies of the SSE was cloned into the EcoRI/SalI sites of the yeast vector, pLacZ. As a control, a four-copy concatamer of the direct repeat elements (DRE) from the proximal γ-promoter was also cloned into this vector. Small-scale transformations of each vector were performed into the Saccharomyces cerevisiae strain YM4271, which is auxotrophic for histidine, uracil, leucine and tryptophan. Prior to transfection, the vector was linearized with NotI to allow genomic integration into the ura3-52 site which confers auxotrophy to uracil allowing selection of transformants.
  • a single yeast colony which displayed no basal reporter gene activity was chosen for subsequent experiments. This colony was expanded and transformed with pACT-CP2, pACT106 or pACT117. The transformants were selected on minimal medium lacking leucine and uracil, and colonies were lifted on filters and assayed for β-galactosidase activity.
  • Mammalian two-hybrid assay Mammalian expression vectors containing the dimerization domain of CP2 fused in-frame to the GAL4DB and NF-E4 fused in-frame to the VP16AD were generated. These plasmids were co-transfected with pG5CAT, a reporter construct with 5 GAL4 DNA binding sites linked to the chloramphenicol acetyltransferase (CAT) gene, into 293 cells using calcium phosphate precipitation. Vectors lacking either CP2 or NF-E4 were transfected as controls. After 48 hours, cells were harvested and whole cell lysate was prepared.
  • CAT activity was measured using the CAT enzyme-linked immunosorbent assay (ELISA) according to the manufacturer's instructions (Boehringer Mannheim).
  • 5' RACE (rapid amplification of 5' cDNA ends) A Marathon 5' RACE cDNA library was constructed from mRNA from K562 cells according to the manufacturer's instructions (Clontech).
  • Nested PCR was performed using the following vector- and gene-specific primers: gene-specific primer 1 5'-CCCTTGGCTCAGATGAAGCGATGGTAGT-3' (SEQ ID NO:4), gene-specific primer 2 5'-TGGCCTGCAGGGCCCCAGTAGGT-3' (SEQ ID NO:5), vector-specific primer 1 5'-CCATCCTAATACGACTCACTATAGGGC-3' (SEQ ID NO:7), vector-specific primer 2 5'-ACTCACTATAGGGCTCGAGCGGC-3' (SEQ ID NO:8).
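The primer set above can be sanity-checked with a few lines of Python. The following is only an illustrative sketch: the sequences are copied from the text, while the helper names (`gc_fraction`, `reverse_complement`, `primer_summary`) are hypothetical and not part of the original disclosure. It reports primer length and GC fraction, and shows that the two vector-specific primers overlap, as expected for a nested pair.

```python
# Primer sequences copied from the text above, written 5'->3'.
PRIMERS = {
    "gene-specific primer 1": "CCCTTGGCTCAGATGAAGCGATGGTAGT",
    "gene-specific primer 2": "TGGCCTGCAGGGCCCCAGTAGGT",
    "vector-specific primer 1": "CCATCCTAATACGACTCACTATAGGGC",
    "vector-specific primer 2": "ACTCACTATAGGGCTCGAGCGGC",
}

# Translation table for DNA base complements.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_summary(primers: dict) -> dict:
    """Length and GC fraction for each named primer (hypothetical helper)."""
    return {
        name: {"length": len(seq), "gc": round(gc_fraction(seq), 3)}
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, stats in primer_summary(PRIMERS).items():
        print(name, stats)
```

The inner vector-specific primer begins with the last 14 bases of the outer one, so the second round of amplification is anchored slightly downstream of the first on the vector side.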
  • PCR conditions were as follows: 95°C - 1 minute, 1 cycle; 94°C - 10 seconds, 68°C - 2 minutes, 30 cycles; 68°C - 5 minutes, 1 cycle.
  • Nested PCR was performed under identical conditions except that the cycle number was reduced to 20. PCR products were electrophoresed on 1% agarose, blotted onto nitrocellulose, and probed with internal gene-specific oligonucleotides. Final PCR products were cloned into the TOPO 2.1 vector according to the manufacturer's instructions (Invitrogen) and sequenced.
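The cycling conditions described above can be written down as a small data structure, which makes the difference between the first-round and nested programs explicit. This is a minimal sketch, not an instrument protocol: the class and function names are hypothetical, and the computed times count programmed hold steps only, ignoring ramp times between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


@dataclass(frozen=True)
class Stage:
    steps: tuple   # steps executed in order within one cycle
    cycles: int    # number of times the steps are repeated


def make_program(amplification_cycles: int) -> tuple:
    """Three-stage program as described in the text; only the cycle count varies."""
    return (
        Stage((Step(95, 60),), 1),                                  # initial denaturation
        Stage((Step(94, 10), Step(68, 120)), amplification_cycles),  # denature, anneal/extend
        Stage((Step(68, 300),), 1),                                 # final extension
    )


def hold_seconds(program: tuple) -> int:
    """Total programmed hold time, excluding temperature ramps."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


FIRST_ROUND = make_program(30)  # 30 cycles in the first round
NESTED = make_program(20)       # nested round: identical except for 20 cycles

if __name__ == "__main__":
    print(hold_seconds(FIRST_ROUND) / 60, "min (first round, holds only)")
    print(hold_seconds(NESTED) / 60, "min (nested round, holds only)")
```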
  • the NF-E4 coding region was cloned into the retroviral vector plasmid MSCV-HA at a unique XhoI or EcoRI site.
  • This bicistronic vector contains (i) an amphotropic retrovirus murine stem cell virus (MSCV) 5' long terminal repeat (LTR); (ii) a hemagglutinin (HA) epitope tag with the NF-E4 coding sequence in-frame either 5' or 3' to the tag; (iii) the encephalomyocarditis internal ribosomal entry site (IRES); (iv) the green fluorescent protein (GFP) cDNA, and (v) the MSCV 3' LTR ( Figure 2).
  • the plasmid was co-transfected with an amphotropic packaging plasmid into 293T cells by calcium phosphate precipitation.
  • the supernatant containing amphotropic particles was harvested, filtered and added to K562 or MEL cells every 12 hours for three days. The cells were allowed to recover for 72 hours and then analyzed for GFP expression by flow cytometry. The highest expressing 10% of cells were sterilely sorted, expanded, resorted, and subsequently expanded in oligoclonal pools. A biological titer of the supernatant on NIH3T3 cells was equivalent to 1x10^6 cfu/ml.
  • CD34+ cells Human cord blood was provided by the Bone Marrow Donor Institute.
  • CD34+ cells were isolated using a MiniMACS magnetic cell sorting system (Miltenyi Biotec Inc). Cells were then cultured in IMDM supplemented with IL-3 (10 ng/ml), SCF (300 ng/ml), IL-6 (50 ng/ml), G-CSF (10 ng/ml), Flt3 (300 ng/ml), and anti-TGF-β1 (100 ng/ml) for 72 hours at 37°C.
  • the beads were resuspended in 200 µl binding buffer (10 mM Tris-HCl, pH 7.9/500 mM KCl/0.1 mM EDTA/150 µg/ml BSA/0.1% Nonidet P-40/10% glycerol) and incubated for 1 hour at room temperature with 35S-methionine-labeled NF-E4. After extensive washing, retained proteins were eluted by boiling in SDS loading buffer and analyzed by SDS-PAGE and autoradiography.
  • NF-E4 antisera An NF-E4 peptide having the sequence LKTDSALEQTPQQLPSLHLSQG (SEQ ID NO:8) was synthesized and conjugated to KLH.
  • the peptide-KLH conjugate was prepared in complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA). Rabbits were primed with the immunogen-CFA mixture, then boosted three times at monthly intervals with the immunogen-IFA mixture.
  • Anti-sera were screened for reactivity with E. coli produced NF-E4-GST fusion proteins by ELISA and Western blotting. Positive sera were used in Western analysis, immunoprecipitation, and function ablation assays.
  • Extract preparation, immunoprecipitation and electrophoretic mobility shift assays Nuclear extracts were prepared by the method of Dignam as previously described (Dignam, Methods Enzymol., 182:194-203, 1990). Highly purified SSP was obtained by fractionating crude extract over heparin-Sepharose and DNA affinity columns as described previously (Jane et al., EMBO J., 14:97-105, 1995).
  • nuclear extracts were initially precleared with normal rabbit serum (10 µg/ml) and then incubated with preimmune serum or antisera to CP2 or NF-E4 overnight at 4°C.
  • EMSAs were performed by incubating varying amounts of nuclear extract with 10^5 cpm of 32P-dCTP end-labelled double-stranded oligonucleotides encoding the SSE region of the γ-globin promoter in a 20 µl reaction containing 500 ng of poly[d(I-C)], 6 mM MgCl2, 16.5 mM KCl and 100 µg of bovine serum albumin.
  • 3 µl of pre-immune serum or rabbit anti-mouse CP2 or NF-E4 antibody were preincubated for 10 minutes with the binding reaction, prior to addition of the probe.
  • First strand cDNA was prepared from 2 µg of mRNA from primary erythroid progenitors using random hexamers.
  • RNA under the same PCR conditions.
  • the PCR conditions were as follows: 95°C for 1 minute followed by various cycles of 94°C for 30 seconds, 60°C for 30 seconds and
  • PCR primer sequences were as follows:
  • S14 sense 5'-GGCAGACCGAGATGAATCCTCA-3' (SEQ ID NO:9);
  • CP2 formed a major component of the stage selector protein (SSP).
  • the protein dimerization domain of CP2 has been mapped to the 242 amino acid residues at its carboxy terminus (Shirra et al, Mol. Cell. Biol., 14:5076-5087, 1994; Uv et al, Mol. Cell. Biol., 14:4020-4031, 1994).
  • a 17-amino acid stretch (aa 292-309) is essential for protein-protein interactions.
  • a cDNA sequence encoding the COOH-terminal 242 amino acids (aa 260-502) of CP2 was inserted into the yeast expression vector pGBT9.
  • the resultant plasmid encodes a hybrid protein containing the DNA-binding domain of GAL4 fused to CP2 residues 260-502.
  • the yeast reporter strain HF7C was transformed with this vector and an expression library derived from K562 cell line cDNAs fused to the sequences encoding the GAL4 transactivation domain.
  • This library was chosen because K562, a human cell line, is a model of fetal erythropoiesis, constitutively expressing the ε- and γ-globin genes but not the adult β-globin genes (Rowley et al., Leukemia Res., 8:45-54, 1984).
  • a strain containing 4 concatamerized direct repeat elements (DRE) from the proximal γ-promoter linked to the same reporter was transfected as a control. After transfection, both strains were plated on media lacking leucine and uracil and the resultant colonies were assayed for β-galactosidase activity. A positive result was observed with c106 but not with c117 or CP2 using the SSE binding site. No enzymatic activity was observed with the DRE binding sites with any of the three plasmids. Based on these findings, we postulated that c106 was a strong candidate for the partner protein of CP2 in the SSP complex and subsequently refer to it as NF-E4.
  • the NF-E4 gene encodes a 22 kD protein which initiates at a CUG codon.
  • 5' RACE using K562 cell cDNA to obtain a full-length clone.
  • a 966 bp fragment was generated using nested gene and vector-specific primers.
  • Comparison of this sequence with the databases using the BLAST algorithm revealed a high degree of homology with a sequence from a bacterial artificial chromosome (BAC) containing a region of the human X-chromosome.
  • Sequence analysis revealed a long open reading frame (ORF) contiguous with that defined in the original yeast two-hybrid GAL4AD/c106 fusion vector (Figure 1).
  • Although an initiation codon (AUG) was observed beginning at nucleotide 421, several observations suggested that translation of full-length NF-E4 might not start at this AUG. Firstly, the NF-E4 reading frame remains open for an additional 115 codons upstream of the first AUG before an in-frame termination codon is encountered. Secondly, the predicted size of the protein from the first AUG is markedly less than that suggested by the studies reported here. Finally, a CUG codon, preceded by a Kozak sequence and a termination codon, is present in the correct reading frame 100 codons upstream of the first AUG. Translation from this codon would generate a protein with a predicted molecular weight of approximately 22 kD.
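The reasoning above (walk upstream from the first AUG in the same reading frame, note where the frame closes, and look for alternative start codons in between) can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, the demonstration sequence is a made-up toy and not the NF-E4 sequence, and the final arithmetic assumes the text is read as placing the CUG exactly 100 codons upstream of the AUG at nucleotide 421.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def upstream_in_frame(seq: str, aug_index: int):
    """Scan upstream of an AUG, codon by codon, in the same reading frame.

    Returns (stop_index, cug_indices) using 0-based positions:
    stop_index is the first in-frame stop codon met while walking upstream
    (None if the frame stays open to the 5' end), and cug_indices lists the
    in-frame CUG codons between that stop and the AUG, in 5'->3' order.
    """
    seq = seq.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("aug_index does not point at an AUG codon")
    cugs = []
    pos = aug_index - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            return pos, sorted(cugs)
        if codon == "CUG":
            cugs.append(pos)
        pos -= 3
    return None, sorted(cugs)


def upstream_codon_position(aug_nt: int, codons_upstream: int) -> int:
    """1-based nucleotide position of a codon lying N codons upstream of aug_nt."""
    return aug_nt - 3 * codons_upstream


if __name__ == "__main__":
    # Toy sequence (NOT NF-E4): stop, CUG, two filler codons, AUG, codon, stop.
    toy = "UAA" + "CUG" + "GCU" * 2 + "AUG" + "GGG" + "UAA"
    print(upstream_in_frame(toy, 12))
    # Assumption: CUG exactly 100 codons upstream of the AUG at nucleotide 421.
    print(upstream_codon_position(421, 100))
```

Applied to a real cDNA sequence, the same scan would list every in-frame CUG between the upstream termination codon and the AUG; deciding which one is used still depends on sequence context (such as a Kozak sequence) and on experimental evidence of the kind described in the text.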
  • NF-E4 sequence Comparison of the NF-E4 sequence with a human genomic clone isolated in our laboratory confirmed the presence of the single AUG in the mid-region of the sequence and the upstream CUG and termination codon.
  • the dominant protein species translated from the AUG-containing NF-E4 vectors had a molecular weight of 22 kD, identical to that predicted and observed with the CUG-initiated NF-E4. No increase in efficiency of transcription/translation was observed with either AUG-containing construct.
  • a Murine Stem Cell Virus (MSCV)-based retroviral vector containing the NF-E4 cDNA (Hawley et al., Gene Therapy, 1:136-138, 1994).
  • This bicistronic vector contains the green fluorescent protein (GFP) cDNA linked by the encephalomyocarditis internal ribosome entry site (IRES) to the NF-E4 cDNA tagged at its COOH-terminus with the hemagglutinin epitope (HA) (Figure 2).
  • K562 cells were transduced with this virus (MSCV-NF-E4-HA) or the parental virus carrying the GFP cDNA alone (MSCV) and after 5 days GFP-positive cells were selected by FACS.
  • Western analysis with an anti-HA antibody demonstrated a band in extract from the MSCV-NF-E4-HA transduced cells, which co-migrated with recombinant NF-E4-HA generated in bacteria. No corresponding band was observed in K562 cells transduced with MSCV alone. Further support for CUG initiated translation was obtained with the generation of polyclonal antiserum to the native NF-E4 protein.
  • a dominant band of 22 kD was observed in Western analysis of native K562 cells, consistent with initiation at the CUG codon ( Figure 3). A second minor band was observed at approximately 14 kD, which could reflect initiation at the downstream AUG.
  • NF-E4 interacts with CP2 in vitro and in vivo.
  • GST-chromatographic assays Glutathione-S-transferase alone (GST), GST fused in-frame with full-length endophilin (GST-END), or GST fused in-frame with full-length CP2 (GST-CP2) were coupled to glutathione-Sepharose beads and incubated under stringent conditions with 35S-methionine-labeled in vitro transcribed/translated NF-E4. Specific retention of NF-E4 was observed with the GST-CP2 beads, but not on control GST or GST-END beads.
  • Nuclear extract from K562 cells was immunoprecipitated with either anti-CP2 antiserum or preimmune serum and blotted with anti-NF-E4 antiserum. Immunoprecipitation and blotting with anti-NF-E4 antiserum served as the positive control. A specific band of 22 kD was observed after immunoprecipitation with 8 µl or 4 µl of anti-CP2 antiserum. No band was observed with preimmune (PI) serum derived from NF-E4 or CP2 inoculated rabbits. This finding indicates that CP2 and NF-E4 form a physiological complex in vivo.
  • NF-E4 is a component of the SSP complex.
  • EMSA electrophoretic mobility shift assay
  • EMSA using an SSE probe revealed the presence of a new complex in nuclear extract derived from MSCV-HA-NF-E4-transduced cells which co-migrated with native SSP. No complex was observed in the line transduced with MSCV alone. Addition of anti-CP2 antisera to this EMSA ablated the SSP-SSE complex.
  • GST-NF-E4 GST fused in-frame with full-length NF-E4
  • To determine the tissue distribution of NF-E4 expression, we initially performed Northern analysis on mRNA derived from K562 cells. Despite the demonstration of the presence of NF-E4 mRNA (by RT-PCR) and protein (by Western and EMSA) in these cells, we were unable to detect a signal with a variety of NF-E4 cDNA probes.
  • NF-E4 is expressed in fetal liver, cord blood and bone marrow. No expression was observed from a variety of other organs including colon, heart, spleen, kidney, liver, lymph node, and thymus ( Figure 4).
  • In RT-PCR analysis of cell lines, expression was demonstrated with mRNA derived from the fetal and erythroid cell lines K562 and HEL, and the embryonic kidney cell line 293T. No product was amplifiable from a variety of other lines including Jurkat, CEM, MCF7, DU528, SY5Y, and COS.
  • Enforced expression of NF-E4 in K562 cells and cord blood progenitors induces fetal globin gene expression.
  • an MSCV-based vector was utilized containing the NF-E4 cDNA tagged with the HA epitope at the NH2-terminus (MSCV-NF-E4-HA).
  • K562 cells were transduced with this vector or the parent GFP-containing vector (MSCV) and then sorted twice for green fluorescence by FACS and expanded in oligoclonal pools. All pools were subsequently shown to contain more than 99% GFP-positive cells by FACS analysis.
  • MSCV-HA-NF-E4 transduced cells showed a significant upregulation (5- to 10-fold) of γ-globin gene expression compared to pools from the MSCV transduced cells (Figure 5). Expression of the housekeeping gene (GAPDH) was unchanged between pools.
  • GFP-positive cells were sorted at day 2 and expanded in pools in erythropoietin, IL-3, IL-6, GM-CSF, and SCF. In this setting more than 80% of the colonies derived are erythroid in nature.
  • Embryonic ε-globin gene expression is also induced by NF-E4.
  • the gene was isolated from a yeast two-hybrid screen of a K562 cell cDNA library using CP2, the previously identified ubiquitous component of the SSP, as the bait.
  • NF-E4 is essential for DNA binding of the SSP, as demonstrated by the disruption of the SSP/SSE complex induced by NF-E4 antiserum and the activation in the yeast one-hybrid assay induced by NF-E4. Based on GST chromatographic assays and previous UV cross-linking data, it appears that the SSP is composed of two molecules of NF-E4 linked to a single molecule of CP2.
  • NF-E4 may also exist in a truncated form initiated from a downstream AUG (Bruening and Pelletier, J.
  • NF-E4 mRNA and/or protein in fetal liver, bone marrow, and cord blood raises the question of the developmental stage specificity of the SSP complex.
  • This finding is analogous to the expression pattern observed for another stage-specific globin regulatory factor, EKLF, which is present at both mRNA and protein level in yolk sac, fetal liver, and adult bone marrow (Donze et al. , J. Biol. Chem., 270:1955-1959, 1995).
  • NF-E4 also displays a high degree of selectivity as it induces fetal and embryonic, but not adult globin gene expression.
  • the lack of β-globin gene induction is observed in the context of K562 cells, in which constitutive β-globin gene expression is absent, and MEL cells, in which high levels of β-globin gene expression are observed (for further discussion of the effect of NF-E4 on adult globin gene expression see Example 4, infra).
  • induction of fetal and embryonic globin is observed only in cells in which these genes are normally transcribed.
  • NF-E4 and FKLF have highly restricted patterns of expression and offer promise for both pharmacological manipulation and gene therapy.
  • EXAMPLE 2 14 kD NF-E4 Peptide Initiated off an Integral Methionine at Nucleotide 421 Acts as a Negative Regulator of NF-E4 Activity
  • Murine Stem Cell Virus (MSCV)-based retroviral vector containing the NF-E4 cDNA truncated to methionine 421 and tagged at the 3' end with a hemagglutinin (HA) epitope was constructed. This virus was transfected into K562 cells and nuclear extract prepared. Western analysis of this extract with anti-HA antisera confirmed the presence of a 14 kD peptide species. Additional Western analysis was performed on extract derived from human cord blood and bone marrow progenitors. These studies demonstrated that in cord blood, the full-length NF-E4 peptide is about two to five times as abundant as the 14 kD species. In contrast, bone marrow has a ratio of full-length to truncated form of about 1:1.
  • K562 cells were transduced with the MSCV retrovirus carrying the truncated cDNA.
  • This virus also contains the Green Fluorescent Protein (GFP) cDNA allowing selection of infected cells by FACS. Pools of transduced K562 cells were selected and analyzed for NF-E4 and γ-globin gene expression. Western analysis of GFP-positive pools with anti-HA antisera confirmed the presence of the truncated NF-E4 species. Northern analysis of these pools showed a dramatic reduction in γ-globin gene expression compared with control pools transduced with the parental MSCV retrovirus (Figures 7A and 7B).
  • the genomic NF-E4 is located in GenBank Accession No. AC002416. It includes three exons and two introns and significant 5' and 3' flanking sequences. Exon I starts at nucleotide 108,464 (according to the GenBank numbering) and ends at 108,799. It contains nucleotides 1-336. Intron I has 1,812 base pairs. Exon II starts at nucleotide 110,611 and ends at nucleotide 110,950; it contains nucleotides 339-676. Intron II has 3,775 base pairs.
  • Exon III starts at position 114,725 and ends at position 114,996; it contains nucleotides 677-966 (the 20 nucleotide truncation of this exon in the genomic sequence is likely to be an artifact of the genomic sequence, as this segment is present in PCR products from genomic DNA).
  • NF-E4 in the producer cell line was verified by immunoblotting with anti-HA antiserum. Isolation and retroviral transduction of human CD34+ cells. Human cord blood was provided by the Bone Marrow Donor Institute. CD34+ cells were isolated using a MiniMACS magnetic cell sorting system (Miltenyi Biotec Inc.).
  • BSA deionized bovine serum albumin
  • insulin 5 μg/ml; Sigma
  • transferrin 100 μg/ml; BRL
  • low-density lipoprotein 10 μg/ml; Sigma
  • 10⁻⁴ M β-mercaptoethanol BBL
  • rhIL-3 10 ng/ml; R&D
  • rhIL-6 10 ng/ml; R&D
  • hSCF recombinant human stem cell factor
  • Flt-3 300 ng/ml; R&D
  • Non-tissue culture-treated 35-mm-diameter dishes were coated with RetroNectin CH286 solution (TaKaRa Biochemicals, Shiga, Japan) at the concentration of 20 μg/cm² for 2 hours at room temperature and then blocked with 2% BSA fraction V (Fisher Scientifics) for 30 min at room temperature (Moritz et al., Blood, 88:855-862, 1996).
  • the coated dishes were preloaded with virus supernatant from RD18 producer lines (2 ml/well) for 30 min, after which the supernatant was removed.
  • RNase protection analysis was performed using an Ambion RNase protection assay kit according to the manufacturer's instructions. Probes used in these studies were as described previously (Morley et al., Blood, 78:1355-1363, 1991). Probe input was 10⁶ cpm/sample for γ- and β-globin probes and 0.25 x 10⁶ cpm/sample for the 18S probe.
  • To extend the functional studies of NF-E4, we generated stable FLYRD 18 producer cell lines containing either MSCV or MSCV-HA-NF-E4 (Cosset et al., J. Virol., 69:7430-7436, 1995). Supernatants from these lines were used to transduce CD34+ cells derived from human cord blood (see Materials and Methods). The cells were then cultured for 12 days in differentiation medium, and GFP- and glycophorin A-positive cells were separated by FACS. RNA was prepared from these cells and analyzed by RNase protection assay. The most striking difference between the MSCV-NF-E4 and control MSCV pools was the reduction in β-globin gene expression in the NF-E4-transduced pools.
  • NF-E4 in cord blood progenitors may have significant implications in genetic therapy of sickle-cell disease with the dual beneficial effects of enhanced fetal globin expression and reduction of βs-globin synthesis.
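
By way of illustration only, the exon coordinates quoted above for GenBank Accession No. AC002416 can be checked with simple interval arithmetic. The following is a minimal sketch, not part of the original disclosure; it assumes inclusive, 1-based numbering, and the function names are hypothetical. Under that assumption the computed intron sizes come out one base shorter than the stated 1,812 and 3,775 base pairs, which may simply reflect a different counting convention, and the genomic and cDNA lengths of Exons II and III do not coincide exactly.

```python
# Coordinates as quoted in the text (GenBank AC002416 numbering).
# Inclusive, 1-based numbering is an assumption made for this sketch.
# Each entry: (name, genomic_start, genomic_end, cdna_start, cdna_end)
EXONS = [
    ("Exon I", 108_464, 108_799, 1, 336),
    ("Exon II", 110_611, 110_950, 339, 676),
    ("Exon III", 114_725, 114_996, 677, 966),
]


def span_length(start, end):
    """Length of an inclusive interval [start, end]."""
    return end - start + 1


def exon_report(exons):
    """Return (name, genomic_length, cdna_length) for each exon."""
    return [
        (name, span_length(g_start, g_end), span_length(c_start, c_end))
        for name, g_start, g_end, c_start, c_end in exons
    ]


def intron_lengths(exons):
    """Number of bases lying strictly between consecutive exons."""
    return [nxt[1] - prev[2] - 1 for prev, nxt in zip(exons, exons[1:])]


if __name__ == "__main__":
    for name, genomic, cdna in exon_report(EXONS):
        print(f"{name}: genomic {genomic} bp, cDNA {cdna} nt")
    print("Intron lengths:", intron_lengths(EXONS))
```

Such a check is useful mainly for flagging where the genomic deposit and the cDNA numbering should be re-examined against the sequence listing itself.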

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Toxicology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention concerns the use of fetal globin to supplement defective adult globin genes and gene therapy for the treatment of hemoglobinopathies. In particular, the invention concerns the identification of a developmental stage-specific and tissue-restricted protein NF-E4 that, when associated with the ubiquitous transcription factor CP2, induces fetal globin gene expression from the stage selector element of the proximal γ-promoter. NF-E4 is expressed in fetal liver, cord blood and bone marrow and in the K562 and HEL cell lines, which constitutively express the fetal globin genes. Enforced expression of NF-E4 in K562 cells or primary erythroid progenitors induces endogenous fetal and embryonic globin gene expression and represses adult globin gene expression. The invention provides the isolated NF-E4 polypeptides (in particular, a 22 kD positive regulator polypeptide and a 14 kD dominant negative truncated polypeptide), nucleic acids encoding these polypeptides, expression vectors, screening assays, and strategies for using NF-E4 to modulate globin expression, e.g., to treat hemoglobinopathies such as β-thalassemia and sickle-cell anemia.

Description

ISOLATION AND CHARACTERIZATION OF HUMAN NF-E4
The research leading to the present invention was supported, in part, by National Institutes of Health Grant No. PO1 HL53749-03. Thus, the United States Government has certain rights in this invention.
FIELD OF THE INVENTION
The present invention concerns the use of fetal globin to supplement defective adult globin for the treatment of hemoglobinopathies. In particular, the invention concerns the identification of a novel developmental stage-specific and tissue-restricted protein NF-E4 that, when associated with the ubiquitous transcription factor CP2, induces gene expression from the stage selector element of the proximal fetal globin promoter. The invention provides the isolated polypeptides, nucleic acids encoding these polypeptides, expression vectors, transfected cells, screening assays, and strategies for using NF-E4 to induce expression of fetal globin and reduce expression of defective adult globin, e.g., to treat hemoglobinopathies such as β-thalassemia and sickle-cell anemia.
BACKGROUND OF THE INVENTION
The human β-globin cluster is the classic paradigm of a multigene locus. The globin genes (ε, Gγ, Aγ, δ, β) are expressed at high level throughout ontogeny in a stringently regulated developmental and tissue-specific pattern. From conception until the fifth week of gestation, the embryonic globin gene (ε) is expressed in the yolk sac, the major site of erythropoiesis. After this time, the first switch in globin subtype occurs, as the fetal globin genes (Gγ, Aγ) become the dominant transcripts in the erythropoietic cells of the fetal liver. This expression pattern persists until birth, when the switch from fetal to adult (β) globin synthesis occurs, coincident with the bone marrow becoming the predominant erythropoietic organ (Stamatoyannopoulos and Nienhuis, In The Molecular Basis of Blood Diseases, 2nd Edition, In Stamatoyannopoulos et al. (eds.), 1994, pp. 107-156; Orkin, Eur. J. Biochem., 231:271-281, 1995; Jane and Cunningham, B. J. Hematol., 102:415-422, 1998).
Studies of the human β-globin locus in transgenic mice, and in patients carrying the Hispanic and Dutch thalassemic mutations, have revealed that the key regulatory sequences required for high level globin expression reside 6-20 kb upstream of the ε-gene (van der Ploeg et al., Nature, 283:637-642, 1980; Kioussis et al., Nature, 306:662-664, 1983; Vanin et al., Cell, 35:701-709, 1983; Taramelli et al., Nucleic Acids Res., 14:7017-7029, 1986; Forrester et al., Genes & Dev., 4:1637-1649, 1990). These sequences, characterized by the presence of five DNase I hypersensitivity sites (5'HS1-5), are known as the Locus Control Region (LCR) (Tuan et al., Proc. Natl. Acad. Sci. USA, 82:6384-6388, 1985; Forrester et al., Proc. Natl. Acad. Sci. USA, 83:1359-1363, 1986; Grosveld et al., Cell, 51:975-985, 1987).
Studies suggest that the HSs act cooperatively as a holocomplex which focuses the vast enhancing potential of the LCR to a single globin gene at any given time point during ontogeny (Wijgerde et al, Nature, 377:209-213, 1995). In murine fetal liver cells transgenic for the β-globin locus, the LCR flip-flops back and forth between the γ- and β-globin genes at the time of the fetal/adult switch. As the cellular transcription factor milieu changes to favour adult globin expression, the stability of the γ-gene/LCR interaction decreases and β-globin becomes the predominantly transcribed gene.
Competition between globin genes for a single regulatory sequence was first proposed as a mechanism of developmental regulation by Choi and Engel (Cell, 55:17-26, 1988). In these studies, a stage selector element (SSE) in the chick β-globin promoter was essential for the preferential interaction of that promoter with the locus enhancer during adult erythropoiesis. The activity of the promoter element was mediated through the binding of a putative stage-specific factor, designated NF-E4 or nuclear factor-erythroid 4 (Gallarda et al., Genes & Dev., 3:1845-1859, 1989; Yang and Engel, J. Biol. Chem., 269: 13-10079, 1994). Promoter sequences and stage-specific factors have also been shown to be critical for correct developmental regulation of the human and murine β-globin clusters. Mice carrying a transgene of the human locus lacking the LCR, or ES cells in which the native LCR has been removed by homologous recombination, still display appropriate temporal patterns of globin expression, albeit at reduced levels (Starck et al., Blood, 84:1656-1665, 1994; Epner et al., Mol. Cell, 2:447-455, 1998). Conversely, deletion of the human γ-gene promoter or its substitution with a non-developmentally specific erythroid promoter in transgenic mice abolishes the correct temporal profile of both γ- and β-gene expression (Anderson et al., Mol. Biol. Cell, 4:1077-1085, 1993; Sabatino et al., Mol. Cell. Biol., 18:6634-6640, 1998).
Two regions of the γ-promoter appear to be responsible for its competitive advantage in the fetal erythroid environment. The first, the CACCC box, binds a recently described member of the Kruppel family, fetal-Kruppel-like factor (FKLF) (Asano et al., Mol. Cell. Biol., 19:3571-3579, 1999). Expression of this gene is detectable in fetal liver and to a lesser extent adult bone marrow, but its functional effects appear to predominantly involve the ε- and γ-globin genes. The second region in the γ-promoter was defined in transfection studies in the K562 cell line, a model of fetal erythropoiesis (Lozzio and Lozzio, Blood, 45:321-334, 1975). In these experiments, an 18 bp SSE immediately 5' of the TATA box was sufficient for preferential transcription from the γ-promoter when in competition with the β-promoter for a single enhancer element (HS2) from the LCR (Jane et al., EMBO J., 11:2961-2969, 1992). Analogous to the chicken cluster, the activity of this SSE was dependent upon the binding of a stage-specific factor, the fetal/erythroid-specific stage selector protein (SSP) (Jane et al., EMBO J., 14:97-105, 1995). Several lines of evidence support the importance of the SSP/SSE interaction in the developmental regulation of globin gene expression. Evolutionary phylogenetic footprinting studies demonstrate absolute conservation of the SSP binding site in species with a distinct stage of fetal globin expression of the γ-genes and loss in species where the γ-genes are embryonic (Tagle et al., J. Mol. Biol., 203:439-455, 1988; Gumucio et al., In The Regulation of Hemoglobin Switching, Stamatoyannopoulos and Nienhuis, 1991, pp. 277-289). Multiple SSP binding sites have also been defined in phylogenetic footprints in the ε-promoter, HS2 and HS3 (Gumucio et al., Proc. Natl. Acad. Sci. USA, 90:6018-6022, 1993). The formation of a new binding site for the SSP by the -202(C>G) HPFH mutation also lends credence to a role for this factor in γ-gene regulation (Jane et al., Mol. Cell. Biol., 13:3272-3281, 1993).
Biochemical purification of the SSP revealed that the ubiquitously expressed transcription factor CP2 (also known as LBP-1c/LSF) formed a major component of the SSP binding activity (Kim et al., Proc. Natl. Acad. Sci. USA, 84:6025-6029, 1987; Lim et al., Mol. Cell. Biol., 12:828-835, 1992; Yoon et al., Mol. Cell. Biol., 14:1776-1785, 1994; Jane et al., EMBO J., 14:97-105, 1995). Antiserum to CP2 ablated the SSP/SSE complex in electrophoretic mobility shift assays (EMSA). It also reacted with highly purified chicken NF-E4 factor in Western analysis, indicating that CP2 is part of this developmental complex and that this complex is conserved in evolution (Yang and Engel, supra, 1994). However, CP2 alone was incapable of binding to the SSE, suggesting that the SSP consisted of a heteromeric complex between CP2 and an unknown factor that provided the tissue-specificity and DNA binding activity of the complex. The presence of this factor (named NF-E4 after the putative chick complex) was confirmed in EMSA and UV cross-linking experiments and its molecular weight was estimated as 40-45 kD (Jane et al., supra, 1995).
In summary, despite progress in understanding developmental regulation of globin gene expression, a key feature of this process remained unknown: i.e., the molecular identity and functional properties of the developmental stage-specific factor NF-E4. Identification and characterization of this factor at the protein and nucleic acid level would permit more complete evaluation of its function in regulating globin gene expression. Most importantly, this knowledge would allow regulation of the expression of globin genes (e.g., increased expression of fetal globin genes) in subjects suffering from hemoglobinopathies.
Thus, until the present invention, there was a long-felt and unsatisfied need in the art to identify, isolate, and molecularly characterize NF-E4, particularly human NF-E4. There was a further need to harness this factor to better understand and ultimately treat hemoglobinopathies. The present invention addresses these and other needs in the art, as set forth herein.
SUMMARY OF THE INVENTION
Hemoglobinopathies present a major health problem to individuals suffering from them. To date, no effective methods of treatment address the underlying deficiency of functional hemoglobin proteins. By providing a molecular mechanism to harness fetal and embryonic globin expression and reduce defective adult globin expression, the present invention overcomes hemoglobinopathies. In addition, the invention also permits regulation of fetal and embryonic gene expression, in the form of an inhibitory polypeptide.
In one embodiment, the invention provides an isolated NF-E4 polypeptide, in particular, human NF-E4, but also other species variants (orthologs) of human NF-E4. In a specific embodiment, the polypeptide is a fetal and embryonic globin-gene expression promoting polypeptide, e.g., having an apparent molecular weight of 22 kD by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). In another embodiment, the polypeptide can inhibit fetal or embryonic globin gene expression and, e.g., has an apparent molecular weight of 14 kD by SDS-PAGE. In a more specific embodiment, the polypeptide comprises an amino acid sequence as depicted in SEQ ID NO:8; particularly the polypeptide comprises an amino acid sequence as depicted in SEQ ID NO:3; more particularly, the polypeptide comprises an amino acid sequence as depicted in SEQ ID NO:2. The polypeptides of the invention can be fusion polypeptides, i.e., containing non-NF-E4 sequences.
As a corollary, the invention further provides an isolated nucleic acid encoding an NF-E4 polypeptide, e.g., as set forth above. In a specific embodiment, the isolated nucleic acid comprises a nucleotide sequence as depicted in SEQ ID NO:1 from about nucleotide 421 to about nucleotide 657; in a more particular embodiment, the sequence includes from about nucleotide 121 to about nucleotide 657.
The nucleic acid may be provided lacking an internal translation start site, for example, lacking a codon for methionine located at a position corresponding to nucleotide 421 (SEQ ID NO: 1).
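By way of illustration only, the nucleotide ranges recited above can be related to one another with simple arithmetic. The following is a minimal sketch, not part of the original disclosure: it treats the "about" positions 121, 421, and 657 of SEQ ID NO:1 as exact, inclusive, 1-based coordinates (an assumption), and the function names are hypothetical. It checks that the internal start at position 421 lies in the same reading frame as the start at position 121, and counts the codons in each span; whether a span includes a termination codon is not stated in this passage and is not assumed here.

```python
# Positions taken from the passage above; treating the "about" values as
# exact, inclusive, 1-based coordinates is an assumption of this sketch.
START_22KD = 121   # start of the longer recited span
START_14KD = 421   # internal translation start site
SPAN_END = 657     # last nucleotide of both recited spans

CODON_SIZE = 3


def span_length(first_nt, last_nt):
    """Number of nucleotides in the inclusive span [first_nt, last_nt]."""
    return last_nt - first_nt + 1


def same_frame(pos_a, pos_b):
    """True if two positions are separated by a whole number of codons."""
    return (pos_b - pos_a) % CODON_SIZE == 0


def codon_count(first_nt, last_nt):
    """Whole codons in the span; raises if the span is not a multiple of 3."""
    length = span_length(first_nt, last_nt)
    if length % CODON_SIZE != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // CODON_SIZE


if __name__ == "__main__":
    print("Internal start in frame:", same_frame(START_22KD, START_14KD))
    print("Codons, 121-657:", codon_count(START_22KD, SPAN_END))
    print("Codons, 421-657:", codon_count(START_14KD, SPAN_END))
```

An in-frame internal start is what one would expect if the shorter product is a carboxy-terminal portion of the longer one, and removing the codon at position 421 would eliminate that second initiation site without shifting the frame of the longer coding sequence.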
The invention also provides a nucleic acid of at least 10 nucleotides that hybridizes under stringent conditions to a nucleic acid having a coding sequence as depicted in SEQ ID NO: 1. Such a nucleic acid may be a coding sequence, an anti-sense nucleic acid, a ribozyme, a PCR primer, or an oligonucleotide probe. An expression vector of the invention comprises the nucleic acid operably associated with an expression control sequence. In a specific aspect, the expression vector is a viral vector. Also provided is a host cell comprising the expression vector of the invention and an associated method for expressing an NF-E4 polypeptide. This method comprises propagating the host cell under conditions that permit expression of NF-E4 from the expression vector.
In another embodiment, the invention provides a transgenic animal comprising the nucleic acid of the invention operably associated with an expression control sequence, which transgenic animal is capable of expressing the NF-E4 polypeptide at a level sufficient to modulate globin gene expression. In a more particular embodiment, the transgenic animal is also transgenic with a human β-globin cluster locus control region (LCR).
Having identified the SSP-mediated molecular mechanism which controls the developmental stage-specific globin gene expression, the invention thus provides a method of screening for compounds that modulate globin expression by affecting stage selector protein activity (SSP) in a cell. The method comprises determining if a test compound contacted with a recombinant NF-E4 polypeptide modulates the activity of the NF-E4 polypeptide. If the candidate compound modulates the activity of the NF-E4 polypeptide, it is a candidate for regulation of SSP activity. In a specific embodiment, the recombinant NF-E4 polypeptide is in a cell-free system; alternatively, the recombinant NF-E4 polypeptide is expressed in a host cell. Alternatively, a test compound is evaluated for the ability to induce NF-E4 positive (or negative) regulator expression (or activity) in a test cell or animal.
Most advantageously, the invention provides a method for inducing or increasing expression of fetal and embryonic globin in a cell. This method comprises increasing the activity of positive regulator (e.g., 22 kD) NF-E4 in the cell, for example, by introducing an expression vector comprising a nucleic acid encoding a positive regulator NF-E4 polypeptide operably associated with an expression control sequence into the cell. In a specific embodiment, the increased activity of the positive regulator NF-E4 polypeptide simultaneously causes reduction in adult globin (e.g., β-globin) gene expression.
Also provided is a method for inhibiting expression of fetal globin in a cell. This method comprises increasing the activity of negative regulator (e.g., 14 kD) NF-E4 in the cell, for example, by introducing an expression vector comprising a nucleic acid encoding a 14 kD NF-E4 polypeptide operably associated with an expression control sequence into the cell.
By providing in vivo methods for increasing fetal and embryonic globin gene expression, the present invention also provides a novel method of treatment for various hemoglobinopathies in mammals, including β-thalassemia and sickle-cell anemia. According to this method, increased fetal and embryonic globin gene expression is achieved by providing enhanced positive regulator NF-E4 activity either in the form of polypeptide or nucleic acid directing expression of said polypeptide in a mammal. In a preferred embodiment, a mammal in need of such treatment is human. In a specific embodiment, the ability of enforced expression of NF-E4 to simultaneously suppress β-globin gene expression provides a particularly advantageous method of treatment for sickle-cell disease due to the dual beneficial effects of enhanced fetal globin expression and reduction of synthesis of defective βs-globin.
In conjunction with the method of treatment, also disclosed herein is the use of NF-E4 nucleic acids and/or polypeptides in the manufacturing of various medicaments useful for treatment of hemoglobinopathies in mammals. These and other aspects and advantages of the invention will be better understood by reference to the Drawings, Detailed Description, and Examples.
DESCRIPTION OF THE DRAWINGS
Figure 1. Nucleotide and amino acid sequence of human NF-E4, a 22 kD protein that initiates at a CUG codon. The potential CUG is underlined and bolded. The 5' and 3' termination codons are marked with asterisks.
Figure 2. Diagrammatic representation of the MSCV-NF-E4-HA retroviral vector. The vector consists of the MSCV retroviral backbone containing NF-E4 coding sequence tagged at the COOH-terminus with a hemagglutinin epitope (HA), followed by an IRES from the encephalomyocarditis virus linked to the GFP cDNA.
Figure 3. Western analysis of native K562 cells. Nuclear extract from K562 cells was resolved on a 12% polyacrylamide gel, transferred to PVDF membrane and probed with polyclonal anti-NF-E4 antisera. The migration of molecular weight standards is indicated.
Figure 4. Expression of NF-E4 in primary tissues. First strand cDNA transcribed from poly A(+) RNA from multiple primary tissues or cell lines was used as a template to PCR amplify a product using primers specific for S14. Samples were then diluted and re-amplified to give comparable band intensities for the same number of amplification cycles of S14 RNA within the linear range of the assay. This represented 20, 25, and 30 cycles. Based on the S14 quantification, comparable amounts of cDNA from multiple primary human tissues were PCR amplified using primers specific for NF-E4. These primers span a 1.8 kb intron and thus discriminate between mRNA- and genomic DNA-derived signal. Cycle numbers were chosen to represent the linear range of amplification, in this case, 30 and 35 cycles.
Figure 5. Northern analysis shows that enforced expression of NF-E4 induces γ-globin gene expression. K562 cells were transduced with either MSCV-NF-E4-HA or MSCV (blank vector) retrovirus and GFP-positive cells obtained by FACS. Cells were resorted and cultured in oligoclonal pools. Ten μg of total RNA from five MSCV-NF-E4-HA pools (lanes 3-7) or two MSCV pools (lanes 1 and 2) were analyzed with γ-gene and NF-E4 probes. GAPDH served as the control.
Figures 6A and 6B. Northern analysis demonstrates that enforced expression of NF-E4 induces ε-globin gene expression. K562 cells were transduced with either the MSCV-NF-E4-HA vector or MSCV retrovirus (blank vector), and GFP-positive cells obtained by FACS. Cells were resorted and cultured in oligoclonal pools. Ten μg of total RNA from (A) four MSCV pools (lanes 1-4) or (B) five MSCV-NF-E4-HA pools (lanes 5-9) was analyzed with ε-gene and NF-E4 probes. GAPDH served as the control.
Figures 7A and 7B. Northern analysis shows that a dominant negative 14 kD form of NF-E4 reduces γ-globin expression in K562 cells. Arrows indicate bands for NF-E4, GAPDH (a control marker), and γ-globin. (A) Lanes 1 and 2 show extracts from K562 cells infected with empty MSCV vector. (B) Lanes 1-6 show different pools of K562 cells infected with MSCV retrovirus containing truncated NF-E4-HA fusion polypeptide construct.
Figure 8. Genomic structure of the NF-E4 gene. The NF-E4 gene was identified by BLAST search on a segment of human X-chromosome deposited with GenBank (AC002416).
DETAILED DESCRIPTION
The present invention advantageously permits, for the first time, development of strategies for therapeutic induction of fetal and embryonic globin in hemoglobinopathies. This invention is based, in part, on identification and cloning of a novel gene, NF-E4, isolated utilizing a part of CP2, the ubiquitously expressed component of the stage selector protein (SSP), in a yeast two-hybrid screen. This NF-E4 gene encodes the tissue-restricted component of the SSP which together with CP2 forms a heterodimeric complex involved in the regulation of fetal γ-globin genes and embryonic ε-globin genes. The two components of the SSP co-immunoprecipitate, and antisera to NF-E4 ablates binding of the SSP to the γ-promoter. NF-E4 is expressed in fetal liver, cord blood, and bone marrow, and in the K562 and HEL cell lines, which constitutively express the fetal globin genes. Enforced expression of NF-E4 in K562 cells and primary erythroid progenitors induces endogenous fetal globin gene expression. Thus, by enhancing the activity of NF-E4, one can increase fetal globin gene expression even in cells already expressing it. In addition, as demonstrated herein, the enforced expression of NF-E4 (e.g., in cord blood progenitors) can simultaneously lead to a significant reduction in adult globin (i.e., β-globin) gene expression.
The present invention is further based, in part, on the unexpected discovery that a truncated form of NF-E4 negatively regulates NF-E4 function. In particular, a smaller 14 kD peptide species found on Western blots from K562, bone marrow, and cord blood inhibits fetal globin expression. The size of this species corresponds to initiation from the internal AUG of the 22 kD NF-E4 coding sequence. A retrovirus containing the NF-E4 cDNA truncated to this AUG and tagged at the 3' end with an HA epitope generated a peptide which on Western analysis with anti-HA antisera co-migrates with the native 14 kD species. Enforced expression of this smaller peptide resulted in significant reduction in γ-globin gene expression in K562 cells, suggesting that it functions as a dominant negative regulator.
Thus, the invention advantageously provides positive and negative regulators of both fetal and adult globin gene expression. Manipulation of the activity of these regulators permits (1) identification of compounds that can regulate globin gene expression, (2) evaluation and diagnosis of various conditions, and (3) intervention in diseases or disorders that involve defects in globin function, e.g., hemoglobinopathies.
In particular, NF-E4 is useful to reactivate endogenous fetal genes expressing fetal and embryonic globin. Such reactivation of the fetal genes protects a patient with diseased β-globin genes, such as individuals suffering from sickle-cell anemia, β-thalassemia, and certain types of cancer from the harm caused through the pathogenesis of these diseases. According to the instant invention, the ability of NF-E4 expression to differentially and simultaneously affect both fetal and adult globin levels provides an additional benefit for treatment of hemoglobinopathies characterized by expression of a defective adult globin (e.g., sickle-cell disease).
Thus, enhancement of NF-E4 activity in a patient suffering from sickle-cell disease would not only cause an increase in the level of "good" fetal globin but will also reduce synthesis of "harmful" βs-globin.
Identification of human NF-E4 provides a novel method for screening compounds useful in the therapy of globin disorders, including sickle-cell disease and β-thalassemia. Such screens can be performed in vitro, e.g., by using erythroid cell differentiation assays termed Burst Forming Unit-erythroid assays (BFUe). In these assays, primary hematopoietic cells, such as bone marrow and peripheral cells, from patients with sickle-cell disease or β-thalassemia, or better still, host cells or transgenic animals that express NF-E4 and, optimally, a reporter gene under control of the β-globin locus control region (LCR), are grown in culture in the presence and absence of NF-E4 inducing agents. The level of induction of the globin or reporter genes can be measured immunologically, by RNA analysis, or by detecting reporter gene activity (e.g., luminescence or color formation). The initial in vitro compound screening can be performed in a high-throughput format, and the NF-E4 inducing agents which demonstrate the strongest effect on globin gene expression can be further evaluated in vivo (e.g., using transgenic animals). The most active physiologically permissible compounds can then be used to prepare medicaments for treatment of patients suffering from hemoglobinopathies.
An "NF-E4" polypeptide is a polypeptide that is bound by an NF-E4 antiserum or NF-E4-specific antibody, e.g., as described in the Examples, infra. NF-E4 exists in "positive regulator" and "negative regulator" forms. "Positive regulator NF-E4 polypeptide" is exemplified by NF-E4 polypeptide having an apparent or predicted molecular weight of 22 kD (a "22 kD NF-E4 polypeptide"). "Negative regulator NF-E4 polypeptide" is exemplified by a dominant negative truncated carboxy-terminal fragment of 22 kD NF-E4 polypeptide, said fragment having an apparent or predicted molecular weight of 14 kD (a "14 kD NF-E4 polypeptide"). In its positive regulator form, NF-E4 homodimerizes and also binds to CP2 leading to formation of a heteromeric complex which induces expression of a coding sequence operably associated with a stage selector element (SSE) associated with a promoter. In contrast, in its negative regulator form, NF-E4 inhibits expression of a coding sequence operably associated with an SSE associated with a promoter. NF-E4 is recognized by antisera generated against a peptide LKTOSALEQTPQQLPSLHLSQG (SEQ ID NO:8) conjugated to KLH. In a specific embodiment, the NF-E4 polypeptide comprises this amino acid sequence. In another specific embodiment, NF-E4 comprises an amino acid sequence as depicted in SEQ ID NO:3. In yet another specific embodiment, NF-E4 comprises an amino acid sequence as depicted in SEQ ID NO:2. NF-E4 polypeptides also include various fusion polypeptides as defined below, including N-terminal or C-terminal fusions with "tags", such as a hexahistidine (His6) tag or a hemagglutinin (HA) tag exemplified infra, and covalent conjugates generated chemically. In still another embodiment, NF-E4 is modified so that it does not contain any N-terminal methionine residues; in still another embodiment, it does not contain any methionine residues (i.e., it lacks the methionine codon at position 421 in SEQ ID NO:1, see Figure 1).
The term "NF-E4" also includes polypeptides that are substantially homologous to NF-E4, as depicted in SEQ ID NO:2 or 3. A "nucleic acid encoding human NF-E4" can be a DNA or RNA molecule. In one embodiment, the nucleic acid has at least about 50%, preferably at least about 75%, and more preferably at least about 90% sequence identity to a coding sequence depicted in Figure 1 (SEQ ID NO:1). An NF-E4 "coding sequence" can either be a coding sequence for a 22 kD polypeptide (which initiates from a CTG codon, position 121 in SEQ ID NO:1; see Figure 1) or a 14 kD polypeptide (which initiates from an ATG codon, position 421 in SEQ ID NO:1; see Figure 1). Alternatively, a nucleic acid encoding NF-E4 hybridizes under conditions set forth in detail below to a nucleic acid having a nucleotide sequence corresponding to one of the foregoing coding sequences. In still another embodiment, a nucleic acid encoding human NF-E4 comprises at least a 10 nucleotide sequence, preferably about a 15 nucleotide sequence, and more preferably at least about a 20 nucleotide sequence that is identical to a sequence in a coding region of SEQ ID NO:1 (see Figure 1). In a specific embodiment, the coding sequence corresponds to the 22 kD NF-E4 coding sequence, but is modified to omit an internal ATG/AUG codon at position 421 in SEQ ID NO:1 (and thus does not encode an internal methionine in the NF-E4 polypeptide); such a construct cannot express the 14 kD NF-E4. In another specific embodiment, the coding sequence is modified to encode the 22 kD NF-E4 with an N-terminal methionine. In yet another specific embodiment, the coding sequence is modified to both omit an internal methionine and encode an N-terminal methionine.
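By way of illustration only, the sequence identity thresholds recited above (about 50%, 75%, and 90%) can be evaluated for two sequences that have already been aligned. The following is a minimal sketch, not part of the original disclosure: the function names are hypothetical, the example sequences in the usage note are invented and are not taken from SEQ ID NO:1, and the choice to exclude gapped columns from the denominator is an assumption, since identity can be defined in more than one way.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of two pre-aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped; this
    denominator convention is an assumption of the sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def identity_tier(identity):
    """Map a percent identity onto the thresholds recited in the text."""
    if identity >= 90.0:
        return "at least about 90%"
    if identity >= 75.0:
        return "at least about 75%"
    if identity >= 50.0:
        return "at least about 50%"
    return "below about 50%"


if __name__ == "__main__":
    # Invented toy sequences, for demonstration only.
    score = percent_identity("ATGC", "ATGA")
    print(score, identity_tier(score))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the final counting step.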
A "nucleic acid of at least 10 nucleotides that hybridizes under stringent conditions to a nucleic acid having a coding sequence as depicted in SEQ ID NO:1" refers to a full-length coding sequence for NF-E4, or an oligonucleotide such as a PCR primer, a probe (which may be labeled), a triple-helix-forming oligo, an antisense nucleic acid, or a ribozyme.
"Embryonic globin" is an oxygen-binding globin that forms hemoglobin, and that is usually expressed in a developmentally regulated fashion, in particular in the yolk sac of an embryo. ε-Globin is an example of embryonic globin.
"Fetal globin" is an oxygen-binding globin that forms hemoglobin, and that is usually expressed in a developmentally regulated fashion, in particular in fetal cells. γ-Globin is an example of fetal globin. "Adult globin" is an oxygen-binding globin that forms hemoglobin, and that is usually expressed in a developmentally regulated fashion in adult cells. β-Globin is an example of adult globin.
General Definitions
As used herein, the term "isolated" means that the referenced material is removed from the environment in which it is normally found. Thus, an isolated biological material can be free of cellular components, i.e., components of the cells in which the material is found or produced. In the case of nucleic acid molecules, an isolated nucleic acid includes a PCR product, an isolated mRNA, a cDNA, or a restriction fragment. In another embodiment, an isolated nucleic acid is preferably excised from the chromosome in which it may be found, and more preferably is no longer joined to non-regulatory, non-coding regions, or to other genes, located upstream or downstream of the gene contained by the isolated nucleic acid molecule when found in the chromosome. In yet another embodiment, the isolated nucleic acid lacks one or more introns. Isolated nucleic acid molecules include sequences inserted into plasmids, cosmids, artificial chromosomes, and the like. Thus, in a specific embodiment, a recombinant nucleic acid is an isolated nucleic acid. An isolated protein may be associated with other proteins or nucleic acids, or both, with which it associates in the cell, or with cellular membranes if it is a membrane-associated protein. An isolated organelle, cell, or tissue is removed from the anatomical site in which it is found in an organism. An isolated material may be, but need not be, purified.
The term "purified" as used herein refers to material that has been isolated under conditions that reduce or eliminate the presence of unrelated materials, i.e., contaminants, including native materials from which the material is obtained. For example, a purified protein is preferably substantially free of other proteins or nucleic acids with which it is associated in a cell; a purified nucleic acid molecule is preferably substantially free of proteins or other unrelated nucleic acid molecules with which it can be found within a cell. As used herein, the term "substantially free" is used operationally, in the context of analytical testing of the material. Preferably, purified material substantially free of contaminants is at least 50% pure; more preferably, at least 90% pure, and more preferably still at least 99% pure. Purity can be evaluated by chromatography, gel electrophoresis, immunoassay, composition analysis, biological assay, and other methods known in the art.
Methods for purification are well-known in the art. For example, nucleic acids can be purified by precipitation, chromatography (including preparative solid phase chromatography, oligonucleotide hybridization, and triple helix chromatography), ultracentrifugation, and other means. Polypeptides and proteins can be purified by various methods including, without limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, precipitation and salting-out chromatography, extraction, and countercurrent distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein contains an additional sequence tag that facilitates purification, such as, but not limited to, a polyhistidine sequence, or a sequence that specifically binds to an antibody, such as FLAG and GST. The polypeptide can then be purified from a crude lysate of the host cell by chromatography on an appropriate solid-phase matrix. Alternatively, antibodies produced against the protein or against peptides derived therefrom can be used as purification reagents. Cells can be purified by various techniques, including centrifugation, matrix separation (e.g., nylon wool separation), panning and other immunoselection techniques, depletion (e.g., complement depletion of contaminating cells), and cell sorting (e.g., fluorescence activated cell sorting [FACS]). Other purification methods are possible. A purified material may contain less than about 50%, preferably less than about 75%, and most preferably less than about 90%, of the cellular components with which it was originally associated. The term "substantially pure" indicates the highest degree of purity which can be achieved using conventional purification techniques known in the art.
In a specific embodiment, the term "about" or "approximately" means within 20%, preferably within 10%, and more preferably within 5% of a given value or range. Alternatively, especially in biological systems, the term "about" means within about a log (i.e., an order of magnitude), preferably within a factor of two, of a given value, depending on how quantitative the measurement is.
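The two readings of "about" given above (a percentage window, or a multiplicative window for biological measurements) can be illustrated with a minimal sketch. The function names `within_about` and `within_fold` are hypothetical and are introduced here only for illustration; the default values are the broadest ranges stated in the definition.

```python
def within_about(value, reference, tolerance=0.20):
    """Return True if value lies within +/- tolerance (a fraction) of reference.

    tolerance=0.20 is the broadest window in the definition; 0.10 and 0.05
    correspond to the preferred and more preferred windows.
    """
    return abs(value - reference) <= tolerance * abs(reference)


def within_fold(value, reference, fold=10.0):
    """Return True if value is within a multiplicative factor `fold` of reference.

    fold=10.0 corresponds to "about a log" (one order of magnitude);
    fold=2.0 corresponds to the preferred "factor of two".
    Both quantities must be positive for a ratio to be meaningful.
    """
    if value <= 0 or reference <= 0:
        raise ValueError("value and reference must be positive")
    ratio = value / reference
    return 1.0 / fold <= ratio <= fold


# Illustrative values only (not data from the application).
print(within_about(22.0, 20.0))        # 10% away from the reference
print(within_fold(3.0, 1.0, fold=2.0)) # three-fold away from the reference
```

Which window applies depends, as the definition notes, on how quantitative the underlying measurement is.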
A "sample" as used herein refers to a biological material which can be tested for the presence of NF-E4 protein or NF-E4 nucleic acids, e.g., to evaluate a gene therapy or expression in a transgenic animal. Such samples can be obtained from any source, including tissue, bone marrow, blood and blood cells (particularly circulating hematopoietic stem cells, for possible detection of protein or nucleic acids); pleural effusions; cerebrospinal fluid (CSF); ascites fluid; and cell culture.
Non-human animals include, without limitation, laboratory animals such as mice, rats, rabbits, hamsters, guinea pigs, etc.; domestic animals such as dogs and cats; and farm animals such as sheep, goats, pigs, horses, and cows, and especially such animals made transgenic with human NF-E4.
The use of italics indicates a nucleic acid molecule (e.g., NF-E4 cDNA, gene, etc.); normal text indicates the polypeptide or protein.
Thus, the present invention advantageously provides NF-E4 protein, including fragments, derivatives, and analogs of NF-E4; NF-E4 nucleic acids, including oligonucleotide primers and probes, and NF-E4 regulatory sequences; NF-E4-specific antibodies; and related methods of using these materials to detect the expression of NF-E4 proteins or nucleic acids, and in screens for agonists and antagonists of NF-E4. The following sections of the application, which are delineated by headings (in bold) and sub-headings (in bold italics), relating to these aspects of the invention, are provided for clarity, and not by way of limitation.
NF-E4
NF-E4 polypeptides are defined above. The positive regulator NF-E4 comprises about 179 amino acids; in a specific embodiment, it has 179 amino acids. This form of NF-E4 has a calculated molecular weight of about 22 kD. The negative regulator NF-E4 is a truncated form of the positive regulator variant, having about 79 amino acid residues, corresponding to the C-terminus of the positive regulator polypeptide. In a specific embodiment, the negative regulator polypeptide has 79 amino acids.
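The stated lengths of the two forms are consistent with the start-codon positions given in the definitions (CTG at position 121 and ATG at position 421 of SEQ ID NO:1). The short sketch below works through that arithmetic. It assumes, for illustration only, that both start codons lie in the same reading frame and share the same stop codon, as implied by the 14 kD form being a carboxy-terminal fragment of the 22 kD form; the function name `residues_skipped` is hypothetical.

```python
# Start-codon positions in SEQ ID NO:1, as stated in the text.
CTG_START = 121        # initiates the 22 kD (positive regulator) polypeptide
ATG_START = 421        # initiates the 14 kD (negative regulator) polypeptide
FULL_LENGTH_AA = 179   # residues in the positive regulator form


def residues_skipped(upstream_start, downstream_start):
    """Number of codons between two in-frame start positions."""
    offset = downstream_start - upstream_start
    if offset % 3 != 0:
        raise ValueError("start codons are not in the same reading frame")
    return offset // 3


skipped = residues_skipped(CTG_START, ATG_START)   # (421 - 121) / 3
truncated_length = FULL_LENGTH_AA - skipped        # residues left in the short form

print(skipped, truncated_length)
```

Under these assumptions the internal ATG lies 100 codons downstream of the CTG start, which leaves 79 residues for the truncated form, in agreement with the figure given above.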
As noted above, NF-E4 of the invention can be characterized by specific binding to an anti-NF-E4 antibody, as described below.
NF-E4 can also be characterized by its expression pattern (polypeptide and mRNA) in hematopoietic tissues (liver, cord blood, and bone marrow) and in tumor cell lines that constitutively express NF-E4, e.g., K562 cells.
NF-E4 fragments, derivatives, and analogs can be characterized by one or more of the characteristics of NF-E4 protein. For example, an NF-E4 fragment, also termed herein an NF-E4 peptide, can have an amino acid sequence corresponding to SEQ ID NO:8. In another embodiment, it can have a sequence corresponding to a C-terminus of the positive regulator polypeptide, and in particular a fragment having SEQ ID NO:3. Analogs and derivatives of NF-E4 of the invention have the same or homologous characteristics of NF-E4 as set forth above. For example, a truncated form of NF-E4 can be provided. Such a truncated form includes NF-E4 with a deletion. In a specific embodiment, the derivative is functionally active, i.e., capable of exhibiting one or more functional activities associated with a full-length, wild-type NF-E4 of the invention. Alternatively, an NF-E4 chimeric fusion protein can be prepared in which the NF-E4 portion of the fusion protein has one or more characteristics of NF-E4. Such fusion proteins include fusions of NF-E4 polypeptide with a marker polypeptide, such as FLAG, a hexahistidine (His6) tag, glutathione-S-transferase (GST), or hemagglutinin (HA). NF-E4 can also be fused with a unique phosphorylation site for labeling. In another embodiment, NF-E4 can be expressed as a fusion with a bacterial protein, such as β-galactosidase.
NF-E4 analogs can be made by altering encoding nucleic acid sequences by substitutions, additions or deletions that provide for functionally similar molecules, i.e., molecules that perform one or more NF-E4 functions. In a specific embodiment, an analog of NF-E4 is a sequence-conservative variant of NF-E4. In another embodiment, an analog of NF-E4 is a function-conservative variant. In yet another embodiment, an analog of NF-E4 is an allelic variant of the human protein, or a mutant form of NF-E4. Still another analog of NF-E4 is a substantially homologous NF-E4 from another species, e.g., mouse or chicken.
"Sequence-conservative variants" of a polynucleotide sequence are those in which a change of one or more nucleotides in a given codon position results in no alteration in the amino acid encoded at that position.
"Function-conservative variants" are those in which a given amino acid residue in a protein or enzyme has been changed without altering the overall conformation and function of the polypeptide, including, but not limited to, replacement of an amino acid with one having similar properties (such as, for example, polarity, hydrogen bonding potential, acidic, basic, hydrophobic, aromatic, and the like). Amino acids with similar properties are well known in the art. For example, arginine, histidine and lysine are hydrophilic-basic amino acids and may be interchangeable. Similarly, isoleucine, a hydrophobic amino acid, may be replaced with leucine, methionine or valine. Such changes are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide. Amino acids other than those indicated as conserved may differ in a protein or enzyme so that the percent protein or amino acid sequence similarity between any two proteins of similar function may vary and may be, for example, from 70% to 99% as determined according to an alignment scheme such as by the Cluster Method, wherein similarity is based on the MEGALIGN algorithm. A "function-conservative variant" also includes a polypeptide or enzyme which has at least 60% amino acid identity as determined by BLAST or FASTA algorithms, preferably at least 75%, most preferably at least 85%, and even more preferably at least 90%, and which has the same or substantially similar properties or functions as the native or parent protein or enzyme to which it is compared.
The terms "mutant" and "mutation" mean any detectable change in genetic material, e.g. DNA, or any process, mechanism, or result of such a change. This includes gene mutations, in which the structure (e.g. DNA sequence) of a gene is altered, any gene or DNA arising from any mutation process, and any expression product (e.g. protein or enzyme) expressed by a modified gene or DNA sequence. The term "variant" may also be used to indicate a modified or altered gene, DNA sequence, enzyme, cell, etc., i.e., any kind of mutant.
As used herein, the term "homologous" in all its grammatical forms and spelling variations refers to the relationship between proteins that possess a "common evolutionary origin," including proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous proteins from different species (e.g., myosin light chain, etc.) (Reeck et al., Cell 50:667, 1987). Such proteins (and their encoding genes) have sequence homology, as reflected by their sequence similarity, whether in terms of percent similarity or the presence of specific residues or motifs at conserved positions.
Accordingly, the term "sequence similarity" in all its grammatical forms refers to the degree of identity or correspondence between nucleic acid or amino acid sequences of proteins that may or may not share a common evolutionary origin (see Reeck et al., supra). However, in common usage and in the instant application, the term "homologous," when modified with an adverb such as "highly," may refer to sequence similarity and may or may not relate to a common evolutionary origin.
In a specific embodiment, two DNA sequences are "substantially homologous" or "substantially similar" when the encoded polypeptides are at least 35-40% similar as determined by one of the algorithms disclosed herein, preferably at least about 60%, and most preferably at least about 90 or 95% in a highly conserved domain, or, for alleles, across the entire amino acid sequence. Sequence comparison algorithms include BLAST (BLAST P, BLAST N, BLAST X), FASTA, DNA Strider, the GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wisconsin) pileup program, etc. using the default parameters provided with these algorithms. An example of such a sequence is an allelic or species variant of the specific NF-E4 genes of the invention. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system.
NF-E4 derivatives include, but are by no means limited to, phosphorylated NF-E4, myristylated NF-E4, methylated NF-E4, and other NF-E4 proteins that are chemically modified. NF-E4 derivatives also include labeled variants, e.g., radio-labeled with iodine (or, as pointed out above, phosphorus; see EP372707B); a detectable molecule, such as but by no means limited to biotin, a chelating group complexed with a metal ion, a chromophore or fluorophore, a gold colloid, or a particle such as a latex bead; or attached to a water soluble polymer.
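The percentage thresholds used above in defining "substantially homologous" sequences presuppose a way of scoring two aligned sequences. The following is a minimal sketch of one simple scoring convention, percent identity over two sequences that have already been aligned. It is not the method used by BLAST, FASTA, or the GCG package, each of which has its own alignment and scoring scheme; the function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two pre-aligned sequences.

    Columns in which either sequence has a gap are skipped; other tools
    may instead count them in the denominator, so values can differ.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy sequences, not taken from any SEQ ID NO in the application.
print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```

Note that the definition above speaks of similarity, which in protein comparisons usually also credits conservative substitutions; plain identity, as computed here, is the stricter of the two measures.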
Chemical modification of the biologically active component or components of NF-E4 may provide additional advantages under certain circumstances. See U.S. Patent No. 4,179,337, Davis et al., issued December 18, 1979. For a review, see Abuchowski et al., in Enzymes as Drugs (J.S. Holcenberg and J. Roberts, eds.), 1981, pp. 367-383. A review article describing protein modification and fusion proteins is found in Francis, Focus on Growth Factors, 1992, 3:4-10, Mediscript: Mountview Court, Friern Barnet Lane, London N20 0LD, UK.
Cloning and Expression of NF-E4
The present invention contemplates analysis and isolation of a gene encoding a functional or mutant NF-E4, including a full-length, or naturally occurring form of NF-E4, and any antigenic fragments thereof from any source. It further contemplates expression of functional or mutant NF-E4 protein for evaluation, diagnosis, or, particularly, therapy. In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein "Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization [B.D. Hames & S.J. Higgins eds. (1985)]; Transcription And Translation [B.D. Hames & S.J. Higgins, eds. (1984)]; Animal Cell Culture [R.I. Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal, A Practical Guide To Molecular Cloning (1984); F.M. Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).
Molecular Biology - Definitions
"Amplification" of DNA as used herein denotes the use of polymerase chain reaction (PCR) to increase the concentration of a particular DNA sequence within a mixture of DNA sequences. For a description of PCR see Saiki et al., Science, 239:487, 1988.
A "nucleic acid molecule" refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules"); or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules"); or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix; or "protein nucleic acids" (PNA) formed by conjugating bases to an amino acid backbone; or nucleic acids containing modified bases, for example thiouracil, thio-guanine and fluoro-uracil. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear (e.g., restriction fragments) or circular DNA molecules, plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant DNA molecule" is a DNA molecule that has undergone a molecular biological manipulation.
A "polynucleotide" or "nucleotide sequence" is a series of nucleotide bases (also called "nucleotides") in DNA and RNA, and means any chain of two or more nucleotides. A nucleotide sequence typically carries genetic information, including the information used by cellular machinery to make proteins and enzymes. These terms include double or single stranded genomic and cDNA, RNA, any synthetic and genetically manipulated polynucleotide, and both sense and anti-sense polynucleotide (although only sense strands are being represented herein). This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids. The polynucleotides herein may be flanked by natural regulatory (expression control) sequences, or may be associated with heterologous sequences, including promoters, internal ribosome entry sites (IRES) and other ribosome binding site sequences, enhancers, response elements, suppressors, signal sequences, polyadenylation sequences, introns, 5'- and 3'-non-coding regions, and the like. The nucleic acids may also be modified by many means known in the art. Non-limiting examples of such modifications include methylation, "caps", substitution of one or more of the naturally occurring nucleotides with an analog, and internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoroamidates, carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.). Polynucleotides may contain one or more additional covalently linked moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.), intercalators (e.g., acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron, oxidative metals, etc.), and alkylators. The polynucleotides may be derivatized by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage. Furthermore, the polynucleotides herein may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.
A "coding sequence" or a sequence "encoding" an expression product, such as a RNA, polypeptide, protein, or enzyme, is a minimum nucleotide sequence that, when expressed, results in the production of that RNA, polypeptide, protein, or enzyme, i.e., the nucleotide sequence encodes an amino acid sequence for that polypeptide, protein or enzyme. A coding sequence for a protein may include a start codon (usually ATG, though as shown herein, alternative start codons can be used) and a stop codon. The term "gene", also called a "structural gene" means a DNA sequence that codes for a particular sequence of amino acids, which comprise all or part of one or more proteins or enzymes, and may include regulatory (non-transcribed) DNA sequences, such as promoter sequences, which determine for example the conditions under which the gene is expressed. The transcribed region of the gene may include untranslated regions, including introns, 5'-untranslated region (UTR), and 3'-UTR, as well as the coding sequence.
A "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined, for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.
A coding sequence is "under the control of" or "operably (or operatively) associated with" transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced (if it contains introns) and translated into the protein encoded by the coding sequence. The terms "express" and "expression" mean allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence. A DNA sequence is expressed in or by a cell to form an "expression product" such as mRNA or a protein. The expression product itself, e.g. the resulting mRNA or protein, may also be said to be "expressed" by the cell. An expression product can be characterized as intracellular, extracellular or secreted. The term "intracellular" means something that is inside a cell. The term "extracellular" means something that is outside a cell. A substance is "secreted" by a cell if it appears in significant measure outside the cell, from somewhere on or inside the cell.
The term "transfection" means the introduction of a heterologous nucleic acid into a cell. The term "transformation" means the introduction of a heterologous gene, DNA or RNA sequence to a host cell, so that the host cell will express the introduced gene or sequence to produce a desired product. The introduced gene or sequence may also be called a "cloned" or "heterologous" gene or sequence, and may include regulatory or control sequences, such as start, stop, promoter, signal, secretion, or other sequences used by a cell's genetic machinery. The gene or sequence may include nonfunctional sequences or sequences with no known function. A host cell that receives and expresses introduced DNA or RNA has been "transformed" and is a "transformant" or a "clone." The DNA or RNA introduced to a host cell can come from any source, including cells of the same genus or species as the host cell, or cells of a different genus or species.
The terms "vector", "cloning vector" and "expression vector" mean the vehicle by which a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g. transcription and translation) of the introduced sequence. Vectors include plasmids, phages, viruses, etc.; they are discussed in greater detail below.
Vectors typically comprise the DNA of a transmissible agent, into which heterologous DNA is inserted. A common way to insert one segment of DNA into another segment of DNA involves the use of enzymes called restriction enzymes that cleave DNA at specific sites (specific groups of nucleotides) called restriction sites. A "cassette" refers to a DNA coding sequence or segment of DNA that codes for an expression product that can be inserted into a vector at defined restriction sites. The cassette restriction sites are designed to ensure insertion of the cassette in the proper reading frame. Generally, foreign DNA is inserted at one or more restriction sites of the vector DNA, and then is carried by the vector into a host cell along with the transmissible vector DNA. A segment or sequence of DNA having inserted or added DNA, such as an expression vector, can also be called a "DNA construct." A common type of vector is a "plasmid", which generally is a self-contained molecule of double-stranded DNA, usually of bacterial origin, that can readily accept additional (foreign) DNA and which can readily be introduced into a suitable host cell. A plasmid vector often contains coding DNA and promoter DNA and has one or more restriction sites suitable for inserting foreign DNA. Coding DNA is a DNA sequence that encodes a particular amino acid sequence for a particular protein or enzyme. Promoter DNA is a DNA sequence which initiates, regulates, or otherwise mediates or controls the expression of the coding DNA. Promoter DNA and coding DNA may be from the same gene or from different genes, and may be from the same or different organisms. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts.
Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, WI), pRSET or pREP plasmids (Invitrogen, San Diego, CA), or pMAL plasmids (New England Biolabs, Beverly, MA), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art. Recombinant cloning vectors will often include one or more replication systems for cloning or expression, one or more markers for selection in the host, e.g. antibiotic resistance, and one or more expression cassettes.
The term "host cell" means any cell of any organism that is selected, modified, transformed, grown, or used or manipulated in any way, for the production of a substance by the cell, for example the expression by the cell of a gene, a DNA or RNA sequence, a protein or an enzyme. Host cells can further be used for screening or other assays, as described infra. Host cells can be cultured cells in vitro or one or more cells in a non-human animal, e.g., a transgenic animal or transiently transfected animal.
The term "expression system" means a host cell and compatible vector under suitable conditions, e.g. for the expression of a protein coded for by foreign DNA carried by the vector and introduced to the host cell. Common expression systems include E. coli host cells and plasmid vectors, insect host cells and Baculovirus vectors, and mammalian host cells, including but not limited to K562 cells, HEL cells, MEL cells, COS-1 cells, C2C12 cells, CHO cells, HeLa cells, 293T (human kidney cells), mouse primary myoblasts, and NIH 3T3 cells.
The term "heterologous" refers to a combination of elements not naturally occurring. For example, heterologous DNA refers to DNA not naturally located in the cell, or in a chromosomal site of the cell. A heterologous gene is a gene in which the regulatory control sequences are not found naturally in association with the coding sequence. In the context of the present invention, an NF-E4 gene is heterologous to the vector DNA in which it is inserted for cloning or expression, and it is heterologous to a host cell containing such a vector, in which it is expressed, e.g., a K562 cell.
A nucleic acid molecule is "hybridizable" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a Tm (melting temperature) of 55 °C, can be used (e.g., 5x SSC, 0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5x SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher Tm, e.g., 40% formamide, with 5x or 6x SSC. High stringency hybridization conditions correspond to the highest Tm, e.g., 50% formamide, 5x or 6x SSC. SSC is 0.15 M NaCl, 0.015 M Na-citrate.
Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridization with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). A minimum length for a hybridizable nucleic acid is at least about 10 nucleotides; preferably at least about 15 nucleotides; and more preferably the length is at least about 20 nucleotides. In a specific embodiment, the term "standard hybridization conditions" refers to a Tm of 55 °C, and utilizes conditions as set forth above. In a preferred embodiment, the Tm is 60 °C; in a more preferred embodiment, the Tm is 65 °C. In a specific embodiment, "high stringency" refers to hybridization and/or washing conditions at 68 °C in 0.2x SSC, at 42 °C in 50% formamide, 4x SSC, or under conditions that afford levels of hybridization equivalent to those observed under either of these two conditions.
As used herein, the term "oligonucleotide" refers to a nucleic acid, generally of at least 10, preferably at least 15, and more preferably at least 20 nucleotides, preferably no more than 100 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule encoding a gene, mRNA, cDNA, or other nucleic acid of interest. Oligonucleotides can be labeled, e.g., with 32P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated. In one embodiment, a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid. In another embodiment, oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full-length or a fragment of NF-E4, or to detect the presence of nucleic acids encoding NF-E4. In a further embodiment, an oligonucleotide of the invention can form a triple helix with a NF-E4 DNA molecule. Generally, oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.
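The dependence of Tm on salt, formamide, base composition, and hybrid length described above can be illustrated numerically. The sketch below uses a widely quoted empirical approximation for long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(% formamide) - 600/L; it is an assumption here that this corresponds to the equations referred to at the cited pages, and the approximation is usually stated to hold only for moderate salt concentrations. The function names, the treatment of the citrate as the trisodium salt, and the probe parameters are hypothetical and chosen only for illustration.

```python
import math

# 1x SSC composition as given in the text.
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def sodium_molarity(ssc_fold):
    """Approximate Na+ molarity of an n-fold SSC solution.

    Assumes the citrate is the trisodium salt, so each citrate
    contributes three sodium ions (an assumption, not from the text).
    """
    return ssc_fold * (SSC_1X_NACL_M + 3 * SSC_1X_CITRATE_M)


def estimate_tm(gc_percent, length_nt, na_molar, formamide_percent=0.0):
    """Empirical Tm estimate (degrees C) for DNA:DNA hybrids over ~100 nt."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)


# Hypothetical 500-nt probe with 50% G+C in 1x SSC.
na = sodium_molarity(1.0)
no_formamide = estimate_tm(50.0, 500, na)
with_formamide = estimate_tm(50.0, 500, na, formamide_percent=50.0)
print(round(no_formamide, 1), round(with_formamide, 1))
```

In this approximation each percent of formamide lowers the estimated Tm by 0.63 °C, which is why formamide-containing buffers allow stringent hybridization at lower incubation temperatures, such as the 42 °C condition mentioned above.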
The present invention provides antisense nucleic acids (including ribozymes), which may be used to inhibit expression of NF-E4 of the invention. An "antisense nucleic acid" is a single stranded nucleic acid molecule which, on hybridizing under cytoplasmic conditions with complementary bases in an RNA or DNA molecule, inhibits the latter's role. If the RNA is a messenger RNA transcript, the antisense nucleic acid is a countertranscript or mRNA-interfering complementary nucleic acid. As presently used, "antisense" broadly includes RNA-RNA interactions, RNA-DNA interactions, triple helix interactions, ribozymes and RNase-H mediated arrest. Antisense nucleic acid molecules can be encoded by a recombinant gene for expression in a cell (e.g., U.S. Patent No. 5,814,500; U.S. Patent No. 5,811,234), or alternatively they can be prepared synthetically (e.g., U.S. Patent No. 5,780,607).
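The base-pairing relationship between an antisense nucleic acid and its target can be sketched as follows. This is a minimal illustration, assuming a DNA antisense oligonucleotide designed against an RNA target; the function name and the six-base example sequence are hypothetical and are not NF-E4 sequence.

```python
# Watson-Crick partners, written so that the output is a DNA oligonucleotide.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(target_rna):
    """Return the DNA oligo (5'->3') complementary to an RNA target given 5'->3'."""
    # Reverse the target so the result is also written 5'->3', then complement.
    return "".join(_COMPLEMENT[base] for base in reversed(target_rna.upper()))


# Hypothetical toy target: 5'-AUGGCC-3' pairs with 5'-GGCCAT-3'.
print(antisense_dna("AUGGCC"))
```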
Specific non-limiting examples of synthetic oligonucleotides envisioned for this invention include oligonucleotides that contain phosphorothioates, phosphotriesters, methyl phosphonates, short chain alkyl, or cycloalkyl intersugar linkages or short chain heteroatomic or heterocyclic intersugar linkages. Most preferred are those with CH2-NH-O-CH2, CH2-N(CH3)-O-CH2, CH2-O-N(CH3)-CH2, CH2-N(CH3)-N(CH3)-CH2 and O-N(CH3)-CH2-CH2 backbones (where phosphodiester is O-PO2-O-CH2). US Patent No. 5,677,437 describes heteroaromatic oligonucleoside linkages. Nitrogen linkers or groups containing nitrogen can also be used to prepare oligonucleotide mimics (U.S. Patents Nos. 5,792,844 and 5,783,682). US Patent No. 5,637,684 describes phosphoramidate and phosphorothioamidate oligomeric compounds. Also envisioned are oligonucleotides having morpholino backbone structures (U.S. Pat. No. 5,034,506). In other embodiments, such as the peptide-nucleic acid (PNA) backbone, the phosphodiester backbone of the oligonucleotide may be replaced with a polyamide backbone, the bases being bound directly or indirectly to the aza nitrogen atoms of the polyamide backbone (Nielsen et al., Science 254:1497, 1991). Other synthetic oligonucleotides may contain substituted sugar moieties comprising one of the following at the 2' position: OH, SH, SCH3, F, OCN, O(CH2)nNH2 or O(CH2)nCH3 where n is from 1 to about 10; C1 to C10 lower alkyl, substituted lower alkyl, alkaryl or aralkyl; Cl; Br; CN; CF3; OCF3; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; SOCH3; SO2CH3; ONO2; NO2; N3; NH2; heterocycloalkyl; heterocycloalkaryl; aminoalkylamino; polyalkylamino; substituted silyl; a fluorescein moiety; an RNA cleaving group; a reporter group; an intercalator; a group for improving the pharmacokinetic properties of an oligonucleotide; or a group for improving the pharmacodynamic properties of an oligonucleotide, and other substituents having similar properties.
Oligonucleotides may also have sugar mimetics such as cyclobutyls or other carbocyclics in place of the pentofuranosyl group. Nucleotide units having nucleosides other than adenosine, cytidine, guanosine, thymidine and uridine, such as inosine, may be used in an oligonucleotide molecule.
NF-E4 Nucleic Acids
A gene encoding NF-E4, whether genomic DNA or cDNA, can be isolated from any source, particularly from a human cDNA or genomic library. Methods for obtaining an NF-E4 gene are well known in the art, as described above (see, e.g., Sambrook et al., 1989, supra). The DNA may be obtained by standard procedures known in the art from cloned DNA (e.g., a DNA "library"), and preferably is obtained from a cDNA library prepared from tissues with high level expression of the protein (e.g., an embryonic or fetal hematopoietic cell library, since these are the cells that evidence highest levels of expression of NF-E4), by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA (e.g., DNA having a sequence as depicted in GenBank accession no. AC-002416 from about base 108,464 to about base 115,014), or fragments thereof, purified from the desired cell (see, for example, Sambrook et al., 1989, supra; Glover, D.M. (ed.), 1985, DNA Cloning: A Practical Approach, IRL Press, Ltd., Oxford, U.K. Vol. I, II). Clones derived from genomic DNA may contain regulatory and intron DNA regions in addition to coding regions; clones derived from cDNA will not contain intron sequences. Whatever the source, the gene should be molecularly cloned into a suitable vector for propagation of the gene. Identification of the specific DNA fragment containing the desired NF-E4 gene may be accomplished in a number of ways. For example, a portion of an NF-E4 gene exemplified infra can be purified and labeled to prepare a labeled probe, and the generated DNA may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, Science 196:180, 1977; Grunstein and Hogness, Proc. Natl. Acad. Sci. U.S.A. 72:3961, 1975). Those DNA fragments with substantial homology to the probe, such as an allelic variant from another individual, will hybridize. In a specific embodiment, highest stringency hybridization conditions are used to identify a homologous NF-E4 gene.
Further selection can be carried out on the basis of the properties of the gene, e.g., if the gene encodes a protein product having the isoelectric, electrophoretic, amino acid composition, partial or complete amino acid sequence, antibody binding activity, or ligand binding profile of NF-E4 protein as disclosed herein. Thus, the presence of the gene may be detected by assays based on the physical, chemical, immunological, or functional properties of its expressed product.
Other DNA sequences which encode substantially the same amino acid sequence as an NF-E4 gene may be used in the practice of the present invention. These include but are not limited to allelic variants, species variants, sequence conservative variants, and functional variants. Amino acid substitutions may also be introduced to substitute an amino acid with a particularly preferable property. For example, a Cys may be introduced as a potential site for disulfide bridges with another Cys.
The genes encoding NF-E4 derivatives and analogs of the invention can be produced by various methods known in the art. The manipulations which result in their production can occur at the gene or protein level. For example, the cloned NF-E4 gene sequence can be modified by any of numerous strategies known in the art (Sambrook et al, 1989, supra). The sequence can be cleaved at appropriate sites with restriction endonuclease(s), followed by further enzymatic modification if desired, isolated, and ligated in vitro. In the production of the gene encoding a derivative or analog of NF-E4, care should be taken to ensure that the modified gene remains within the same translational reading frame as the NF-E4 gene, uninterrupted by translational stop signals, in the gene region where the desired activity is encoded.
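The reading-frame requirement described above can be checked mechanically. The following is a minimal sketch, assuming the modified coding region is supplied as a DNA string beginning at the first codon; the function name and the toy sequences are hypothetical and are not NF-E4 sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_without_internal_stops(coding_seq):
    """True if the sequence is a whole number of codons with no premature stop."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        # A length that is not a multiple of three implies a frame shift.
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # A stop codon in the final position is expected; earlier ones interrupt the frame.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


print(in_frame_without_internal_stops("ATGGCCTAA"))     # intact frame
print(in_frame_without_internal_stops("ATGTAAGCCTAA"))  # premature stop
print(in_frame_without_internal_stops("ATGGCCTA"))      # shifted frame
```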
Additionally, the NF-E4-encoding nucleic acid sequence can be mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to create variations in coding regions and/or form new restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification. In a specific embodiment, NF-E4 is mutated to eliminate the methionine codon found internally (about amino acid 101) in the positive regulator polypeptide, thus producing a sequence lacking an internal translation initiation site. Such modifications can also be made to introduce restriction sites and facilitate cloning the NF-E4 gene into an expression vector. Any technique for mutagenesis known in the art can be used, including but not limited to, in vitro site-directed mutagenesis (Hutchinson, C., et al., J. Biol. Chem. 253:6551, 1978; Zoller and Smith, DNA 3:479-488, 1984; Oliphant et al., Gene 44:177, 1986; Hutchinson et al., Proc. Natl. Acad. Sci. U.S.A. 83:710, 1986), use of TAB linkers (Pharmacia), etc. PCR techniques are preferred for site directed mutagenesis (see Higuchi, 1989, "Using PCR to Engineer DNA", in PCR Technology: Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press, Chapter 6, pp. 61-70).
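Elimination of an internal methionine codon, as in the specific embodiment above, amounts at the sequence level to a single codon substitution. The sketch below illustrates this on a hypothetical four-codon toy sequence, not on NF-E4; the function name and the choice of CTG (leucine) as the replacement codon are assumptions made purely for illustration.

```python
def replace_codon(coding_seq, aa_position, new_codon):
    """Replace the codon at a 1-based amino acid position in a coding sequence."""
    if len(new_codon) != 3:
        raise ValueError("replacement must be exactly one codon (3 bases)")
    start = 3 * (aa_position - 1)
    if aa_position < 1 or start + 3 > len(coding_seq):
        raise ValueError("position lies outside the coding sequence")
    return coding_seq[:start] + new_codon + coding_seq[start + 3:]


# Toy example: the internal ATG at codon 3 is replaced; length and frame are unchanged.
toy = "ATGGCCATGAAA"
mutant = replace_codon(toy, 3, "CTG")
print(mutant)
```

Because exactly three bases are exchanged for three, the downstream reading frame is preserved.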
The identified and isolated gene can then be inserted into an appropriate cloning vector. A large number of vector-host systems known in the art may be used. Possible vectors include, but are not limited to, plasmids or modified viruses, but the vector system must be compatible with the host cell used. Examples of vectors include, but are not limited to, E. coli, bacteriophages such as lambda derivatives, or plasmids such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, etc. The insertion into a cloning vector can, for example, be accomplished by ligating the DNA fragment into a cloning vector which has complementary cohesive termini. However, if the complementary restriction sites used to fragment the DNA are not present in the cloning vector, the ends of the DNA molecules may be enzymatically modified. Alternatively, any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences.
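When planning insertion by complementary cohesive termini, it is useful to know where a given restriction endonuclease recognition sequence occurs in the fragment and in the vector. A minimal sketch follows, assuming sequences are handled as plain strings; the function name and the toy sequence are hypothetical, while GAATTC and GGATCC are the well known EcoRI and BamHI recognition sequences.

```python
def find_sites(dna, site):
    """Return 0-based start positions of every occurrence of a recognition site."""
    dna, site = dna.upper(), site.upper()
    hits = []
    i = dna.find(site)
    while i != -1:
        hits.append(i)
        # Resume one base further on so overlapping occurrences are not missed.
        i = dna.find(site, i + 1)
    return hits


toy_fragment = "AAGAATTCGGGAATTC"
print(find_sites(toy_fragment, "GAATTC"))  # EcoRI sites
print(find_sites(toy_fragment, "GGATCC"))  # BamHI sites (none in this toy sequence)
```

A site that is absent from the cloning vector indicates that the ends must be modified or linkers added, as described above.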
Recombinant molecules can be introduced into host cells via transformation, transfection, infection, electroporation, etc., so that many copies of the gene sequence are generated. Preferably, the cloned gene is contained on a shuttle vector plasmid, which provides for expansion in a cloning cell, e.g., E. coli, and facile purification for subsequent insertion into an appropriate expression cell line, if such is desired. For example, a shuttle vector, which is a vector that can replicate in more than one type of organism, can be prepared for replication in both E. coli and Saccharomyces cerevisiae by linking sequences from an E. coli plasmid with sequences from the yeast 2μ plasmid.
Expression of NF-E4 Polypeptides
The nucleotide sequence coding for NF-E4, or antigenic fragment, derivative or analog thereof, or a functionally active derivative, including a chimeric protein, thereof, can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. Thus, a nucleic acid encoding NF-E4 of the invention can be operationally associated with a promoter in an expression vector of the invention. Both cDNA and genomic sequences can be cloned and expressed under control of such regulatory sequences. Such vectors can be used to express functional or functionally inactivated NF-E4 polypeptides.
The necessary transcriptional and translational signals can be provided on a recombinant expression vector. Potential host-vector systems include but are not limited to mammalian cell systems transfected with expression plasmids or infected with virus (e.g., vaccinia virus, adenovirus, adeno-associated virus, herpes virus, etc.); insect cell systems infected with virus (e.g., baculovirus); microorganisms such as yeast containing yeast vectors; or bacteria transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA. The expression elements of vectors vary in their strengths and specificities. Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used.
Expression of NF-E4 protein may be controlled by any promoter/enhancer element known in the art, but these regulatory elements must be functional in the host selected for expression. Promoters which may be used to control NF-E4 gene expression include, but are not limited to, cytomegalovirus (CMV) promoter (U.S. Patent Nos. 5,385,839 and 5,168,062), the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., Cell 22:787-797, 1980), the herpes thymidine kinase promoter (Wagner et al., Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445, 1981), the regulatory sequences of the metallothionein gene (Brinster et al., Nature 296:39-42, 1982); prokaryotic expression vectors such as the β-lactamase promoter (Villa-Komaroff et al., Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731, 1978), or the tac promoter (DeBoer et al., Proc. Natl. Acad. Sci. U.S.A. 80:21-25, 1983); see also "Useful proteins from recombinant bacteria" in Scientific American, 242:74-94, 1980; promoter elements from yeast or other fungi such as the Gal 4 promoter, the ADC (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline phosphatase promoter; and transcriptional control regions that exhibit hematopoietic tissue specificity, in particular: beta-globin gene control region which is active in myeloid cells (Magram et al., Nature 315:338-340, 1985; Kollias et al., Cell 46:89-94, 1986), hematopoietic stem cell differentiation factor promoters, erythropoietin receptor promoter (Maouche et al., Blood, 15:2557, 1991), etc.
Soluble forms of the protein can be obtained by collecting culture fluid, or solubilizing inclusion bodies, e.g., by treatment with detergent, and if desired sonication or other mechanical processes, as described above. The solubilized or soluble protein can be isolated using various techniques, such as polyacrylamide gel electrophoresis (PAGE), isoelectric focusing, 2-dimensional gel electrophoresis, chromatography (e.g., ion exchange, affinity, immunoaffinity, and sizing column chromatography), centrifugation, differential solubility, immunoprecipitation, or by any other standard technique for the purification of proteins.
Vectors
A wide variety of host/expression vector combinations may be employed in expressing the DNA sequences of this invention. Useful expression vectors, for example, may consist of segments of chromosomal, non-chromosomal and synthetic DNA sequences. Suitable vectors include derivatives of SV40 and known bacterial plasmids, e.g., E. coli plasmids ColE1, pCR1, pBR322, pMal-C2, pET, pGEX (Smith et al., Gene, 67:31-40, 1988), pMB9 and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2μ plasmid or derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or other expression control sequences; and the like. Preferred vectors are viral vectors, such as lentiviruses, retroviruses, herpes viruses, adenoviruses, adeno-associated viruses, vaccinia virus, baculovirus, and other recombinant viruses with desirable cellular tropism. Thus, a gene encoding a functional or mutant NF-E4 protein or polypeptide domain fragment thereof can be introduced in vivo, ex vivo, or in vitro using a viral vector or through direct introduction of DNA. Expression in targeted tissues can be effected by targeting the transgenic vector to specific cells, such as with a viral vector or a receptor ligand, or by using a tissue-specific promoter, or both. Targeted gene delivery is described in International Patent Publication WO 95/28494, published October 1995.
Viral vectors commonly used for in vivo or ex vivo targeting and therapy procedures are DNA-based vectors and retroviral vectors. Methods for constructing and using viral vectors are known in the art (see, e.g., Miller and Rosman, BioTechniques, 7:980-990, 1992). Preferably, the viral vectors are replication defective, that is, they are unable to replicate autonomously in the target cell. In general, the genome of the replication defective viral vectors which are used within the scope of the present invention lacks at least one region which is necessary for the replication of the virus in the infected cell. These regions can either be eliminated (in whole or in part) or be rendered non-functional by any technique known to a person skilled in the art. These techniques include the total removal, substitution (by other sequences, in particular by the inserted nucleic acid), partial deletion or addition of one or more bases to an essential (for replication) region. Such techniques may be performed in vitro (on the isolated DNA) or in situ, using the techniques of genetic manipulation or by treatment with mutagenic agents. Preferably, the replication defective virus retains the sequences of its genome which are necessary for encapsidating the viral particles.
DNA viral vectors include an attenuated or defective DNA virus, such as but not limited to herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, adeno-associated virus (AAV), and the like. Defective viruses, which entirely or almost entirely lack viral genes, are preferred. Defective virus is not infective after introduction into a cell. Use of defective viral vectors allows for administration to cells in a specific, localized area, without concern that the vector can infect other cells. Thus, a specific tissue can be specifically targeted. Examples of particular vectors include, but are not limited to, a defective herpes virus 1 (HSV1) vector (Kaplitt et al., Molec. Cell. Neurosci. 2:320-330, 1991), defective herpes virus vector lacking a glycoprotein L gene (Patent Publication RD 371005 A), or other defective herpes virus vectors (International Patent Publication No. WO 94/21807, published September 29, 1994; International Patent Publication No. WO 92/05263, published April 2, 1994); an attenuated adenovirus vector, such as the vector described by Stratford-Perricaudet et al. (J. Clin. Invest. 90:626-630, 1992; see also La Salle et al., Science 259:988-990, 1993); and a defective adeno-associated virus vector (Samulski et al., J. Virol. 61:3096-3101, 1987; Samulski et al., J. Virol. 63:3822-3828, 1989; Lebkowski et al., Mol. Cell. Biol. 8:3988-3996, 1988). Various companies produce viral vectors commercially, including but by no means limited to Avigen, Inc. (Alameda, CA; AAV vectors), Cell Genesys (Foster City, CA; retroviral, adenoviral, AAV vectors, and lentiviral vectors), Clontech (retroviral and baculoviral vectors), Genovo, Inc. (Sharon Hill, PA; adenoviral and AAV vectors), Genvec (adenoviral vectors), IntroGene (Leiden, Netherlands; adenoviral vectors), Molecular Medicine (retroviral, adenoviral, AAV, and herpes viral vectors), Norgen (adenoviral vectors), Oxford BioMedica (Oxford, United Kingdom; lentiviral vectors), and Transgene (Strasbourg, France; adenoviral, vaccinia, retroviral, and lentiviral vectors).
Preferably, for in vivo administration, an appropriate immunosuppressive treatment is employed in conjunction with the viral vector, e.g., adenovirus vector, to avoid immuno-deactivation of the viral vector and transfected cells. For example, immunosuppressive cytokines, such as interleukin-12 (IL-12), interferon-γ (IFN-γ), or anti-CD4 antibody, can be administered to block humoral or cellular immune responses to the viral vectors (see, e.g., Wilson, Nature Medicine, 1995). In that regard, it is advantageous to employ a viral vector that is engineered to express a minimal number of antigens.
Adenovirus vectors. Adenoviruses are eukaryotic DNA viruses that can be modified to efficiently deliver a nucleic acid of the invention to a variety of cell types. Various serotypes of adenovirus exist. Of these serotypes, preference is given, within the scope of the present invention, to using type 2 or type 5 human adenoviruses (Ad 2 or Ad 5) or adenoviruses of animal origin (see WO94/26914). Those adenoviruses of animal origin which can be used within the scope of the present invention include adenoviruses of canine, bovine, murine (example: Mav1, Beard et al., Virology 75 (1990) 81), ovine, porcine, avian, and simian (example: SAV) origin. Preferably, the adenovirus of animal origin is a canine adenovirus, more preferably a CAV2 adenovirus (e.g., Manhattan or A26/61 strain [ATCC VR-800]). Various replication defective adenovirus and minimum adenovirus vectors have been described (WO94/26914, WO95/02697, WO94/28938, WO94/28152, WO94/12649, WO96/22378). The replication defective recombinant adenoviruses according to the invention can be prepared by any technique known to the person skilled in the art (Levrero et al., Gene 101:195, 1991; EP 185 573; Graham, EMBO J. 3:2917, 1984; Graham et al., J. Gen. Virol. 36:59, 1977). Recombinant adenoviruses are recovered and purified using standard molecular biological techniques, which are well known to one of ordinary skill in the art.
Adeno-associated viruses. The adeno-associated viruses (AAV) are DNA viruses of relatively small size which can integrate, in a stable and site-specific manner, into the genome of the cells which they infect. They are able to infect a wide spectrum of cells without inducing any effects on cellular growth, morphology or differentiation, and they do not appear to be involved in human pathologies. The AAV genome has been cloned, sequenced and characterized. The use of vectors derived from the AAVs for transferring genes in vitro and in vivo has been described (see WO 91/18088; WO 93/09239; US 4,797,368, US 5,139,941, EP 488 528). The replication defective recombinant AAVs according to the invention can be prepared by cotransfecting a plasmid containing the nucleic acid sequence of interest flanked by two AAV inverted terminal repeat (ITR) regions, and a plasmid carrying the AAV encapsidation genes (rep and cap genes), into a cell line which is infected with a human helper virus (for example an adenovirus). The AAV recombinants which are produced are then purified by standard techniques.
Retrovirus vectors. In another embodiment the gene can be introduced in a retroviral vector, e.g., as described in Anderson et al., U.S. Patent No. 5,399,346; Mann et al., 1983, Cell 33:153; Temin et al., U.S. Patent No. 4,650,764; Temin et al., U.S. Patent No. 4,980,289; Markowitz et al., 1988, J. Virol. 62:1120; Temin et al., U.S. Patent No. 5,124,263; EP 453242, EP178220; Bernstein et al., Genet. Eng. 7 (1985) 235; McCormick, BioTechnology 3 (1985) 689; International Patent Publication No. WO 95/07358, published March 16, 1995, by Dougherty et al.; and Kuo et al., 1993, Blood 82:845. The retroviruses are integrating viruses which infect dividing cells. The retrovirus genome includes two LTRs, an encapsidation sequence and three coding regions (gag, pol and env). In recombinant retroviral vectors, the gag, pol and env genes are generally deleted, in whole or in part, and replaced with a heterologous nucleic acid sequence of interest. These vectors can be constructed from different types of retrovirus, such as HIV, MoMuLV ("murine Moloney leukaemia virus"), MSV ("murine Moloney sarcoma virus"), HaSV ("Harvey sarcoma virus"), SNV ("spleen necrosis virus"), RSV ("Rous sarcoma virus") and Friend virus. Suitable packaging cell lines have been described in the prior art, in particular the cell line PA317 (US 4,861,719); the PsiCRIP cell line (WO 90/02806) and the GP+envAm-12 cell line (WO 89/07150). In addition, the recombinant retroviral vectors can contain modifications within the LTRs for suppressing transcriptional activity as well as extensive encapsidation sequences which may include a part of the gag gene (Bender et al., J. Virol. 61:1639, 1987). Recombinant retroviral vectors are purified by standard techniques known to those having ordinary skill in the art.
Retroviral vectors can be constructed to function as infectious particles or to undergo a single round of transfection. In the former case, the virus is modified to retain all of its genes except for those responsible for oncogenic transformation properties, and to express the heterologous gene. Non-infectious viral vectors are manipulated to destroy the viral packaging signal, but retain the structural genes required to package the co-introduced virus engineered to contain the heterologous gene and the packaging signals. Thus, the viral particles that are produced are not capable of producing additional virus.
Retrovirus vectors can also be introduced by DNA viruses, which permits one cycle of retroviral replication and amplifies transfection efficiency (see WO 95/22617, WO 95/26411, WO 96/39036, WO 97/19182).
Lentivirus vectors. In another embodiment, lentiviral vectors can be used as agents for the direct delivery and sustained expression of a transgene in several tissue types, including brain, retina, muscle, liver and blood. The vectors can efficiently transduce dividing and nondividing cells in these tissues, and maintain long-term expression of the gene of interest (for a review, see Naldini, Curr. Opin. Biotechnol., 9:457-63, 1998; see also Zufferey et al., J. Virol., 72:9873-80, 1998). Lentiviral packaging cell lines are available and known generally in the art. They facilitate the production of high-titer lentivirus vectors for gene therapy. An example is a tetracycline-inducible VSV-G pseudotyped lentivirus packaging cell line which can generate virus particles at titers greater than 10^6 IU/ml for at least 3 to 4 days (Kafri et al., J. Virol., 73:576-584, 1999). The vector produced by the inducible cell line can be concentrated as needed for efficiently transducing nondividing cells in vitro and in vivo.
Non-viral vectors. In another embodiment, the vector can be introduced in vivo by lipofection, as naked DNA, or with other transfection facilitating agents (peptides, polymers, etc.). Synthetic cationic lipids can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. U.S.A. 84:7413-7417, 1987; Felgner and Ringold, Science 337:387-388, 1989; see Mackey et al., Proc. Natl. Acad. Sci. U.S.A. 85:8027-8031, 1988; Ulmer et al., Science 259:1745-1748, 1993). Useful lipid compounds and compositions for transfer of nucleic acids are described in International Patent Publications WO95/18863 and WO96/17823, and in U.S. Patent No. 5,459,127. Lipids may be chemically coupled to other molecules for the purpose of targeting (see Mackey et al., supra). Targeted peptides, e.g., hormones or neurotransmitters, and proteins such as antibodies, or non-peptide molecules could be coupled to liposomes chemically.
Other molecules are also useful for facilitating transfection of a nucleic acid in vivo, such as a cationic oligopeptide (e.g., International Patent Publication WO95/21931), peptides derived from DNA binding proteins (e.g., International Patent Publication WO96/25508), or a cationic polymer (e.g., International Patent Publication WO95/21931).
It is also possible to introduce the vector in vivo as a naked DNA plasmid. Naked DNA vectors for gene therapy can be introduced into the desired host cells by methods known in the art, e.g., electroporation, microinjection, cell fusion, DEAE dextran, calcium phosphate precipitation, use of a gene gun, or use of a DNA vector transporter (see, e.g., Wu et al., J. Biol. Chem. 267:963-967, 1992; Wu and Wu, J. Biol. Chem. 263:14621-14624, 1988; Hartmut et al., Canadian Patent Application No. 2,012,311, filed March 15, 1990; Williams et al., Proc. Natl. Acad. Sci. USA 88:2726-2730, 1991). Receptor-mediated DNA delivery approaches can also be used (Curiel et al., Hum. Gene Ther. 3:147-154, 1992; Wu and Wu, J. Biol. Chem. 262:4429-4432, 1987). US Patent Nos. 5,580,859 and 5,589,466 disclose delivery of exogenous DNA sequences, free of transfection facilitating agents, in a mammal. Recently, a relatively low voltage, high efficiency in vivo DNA transfer technique, termed electrotransfer, has been described (Mir et al., C.R. Acad. Sci., 321:893, 1998; WO 99/01157; WO 99/01158; WO 99/01175).
Antibodies to NF-E4
Antibodies to NF-E4 are useful, inter alia, for diagnostics and intracellular regulation of NF-E4 activity, as set forth below. According to the invention, NF-E4 polypeptides produced recombinantly or by chemical synthesis, and fragments or other derivatives or analogs thereof, including fusion proteins, may be used as an immunogen to generate antibodies that recognize the NF-E4 polypeptide. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. Such an antibody is specific for human NF-E4; it may recognize a mutant form of NF-E4, or wild-type NF-E4.
Various procedures known in the art may be used for the production of polyclonal antibodies to NF-E4 polypeptide or derivative or analog thereof. For the production of antibody, various host animals can be immunized by injection with the NF-E4 polypeptide, or a derivative (e.g., fragment or fusion protein) thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc. In one embodiment, the NF-E4 polypeptide or fragment thereof can be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH). In a specific embodiment, exemplified infra, a peptide comprising SEQ ID NO:8 is conjugated to KLH and used to immunize rabbits. Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guérin) and Corynebacterium parvum.
For preparation of monoclonal antibodies directed toward the NF-E4 polypeptide, or fragment, analog, or derivative thereof, any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (Nature 256:495-497, 1975), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72, 1983; Cote et al., Proc. Natl. Acad. Sci. USA, 80:2026-2030, 1983), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96, 1985). In an additional embodiment of the invention, monoclonal antibodies can be produced in germ-free animals (International Patent Publication No. WO 89/12690, published 28 December 1989). In fact, according to the invention, techniques developed for the production of "chimeric antibodies" (Morrison et al., J. Bacteriol., 159:870, 1984; Neuberger et al., Nature, 312:604-608, 1984; Takeda et al., Nature, 314:452-454, 1985) by splicing the genes from a mouse antibody molecule specific for an NF-E4 polypeptide together with genes from a human antibody molecule of appropriate biological activity can be used; such antibodies are within the scope of this invention. Such human or humanized chimeric antibodies are preferred for use in therapy of human diseases or disorders (described infra), since the human or humanized antibodies are much less likely than xenogenic antibodies to induce an immune response, in particular an allergic response, themselves.
Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques. For example, such fragments include but are not limited to: the F(ab')2 fragment which can be produced by pepsin digestion of the antibody molecule; the Fab' fragments which can be generated by reducing the disulfide bridges of the F(ab')2 fragment, and the Fab fragments which can be generated by treating the antibody molecule with papain and a reducing agent.
According to the invention, techniques described for the production of single chain antibodies (U.S. Patent Nos. 5,476,786 and 5,132,405 to Huston; U.S. Patent 4,946,778) can be adapted to produce NF-E4 polypeptide-specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., Science 246:1275-1281, 1989) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for an NF-E4 polypeptide, or its derivatives, or analogs.
In the production and use of antibodies, screening for or testing with the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of an NF-E4 polypeptide, one may assay generated hybridomas for a product which binds to an NF-E4 polypeptide fragment containing such epitope. For selection of an antibody specific to an NF-E4 polypeptide from a particular species of animal, one can select on the basis of positive binding with NF-E4 polypeptide expressed by or isolated from cells of that species of animal. The foregoing antibodies can be used in methods known in the art relating to the localization and activity of the NF-E4 polypeptide, e.g., for Western blotting, imaging NF-E4 polypeptide in situ, measuring levels thereof in appropriate physiological samples, etc. using any of the detection techniques mentioned above or known in the art. Such antibodies can also be used in assays for ligand binding, e.g., as described in US Patent No. 5,679,582.
Antibody binding generally occurs most readily under physiological conditions, e.g., pH of between about 7 and 8, and physiological ionic strength. The presence of a carrier protein in the buffer solutions stabilizes the assays. While there is some tolerance of perturbation of optimal conditions, e.g., increasing or decreasing ionic strength, temperature, or pH, or adding detergents or chaotropic salts, such perturbations will decrease binding stability. In a specific embodiment, antibodies that agonize or antagonize the activity of NF-E4 polypeptide can be generated. In particular, intracellular single chain Fv antibodies can be used to regulate (inhibit) NF-E4 (Marasco et al., Proc. Natl. Acad. Sci. USA, 1993, 90:7884-7893; see generally, Chen, Mol. Med. Today, 1997, 3:160-167; Spitz et al., Anticancer Res., 1996, 16:3415-22; Indolfi et al., Nat. Med., 1996, 2:634-5; Kijma et al., Pharmacol. Ther., 1995, 68:247-267). Such antibodies can be tested using the assays described infra for identifying ligands.
Screening and Chemistry
According to the present invention, nucleotide sequences derived from the gene encoding NF-E4, and peptide sequences derived from NF-E4, are useful targets to identify drugs that are effective in treating hemoglobin disorders. Drug targets include without limitation (i) isolated nucleic acids derived from the gene encoding NF-E4 and (ii) isolated peptides and polypeptides derived from NF-E4 polypeptides.
In particular, identification and isolation of NF-E4 provides for development of screening assays, particularly for high throughput screening of molecules that up- or down-regulate the activity of NF-E4, e.g., by permitting expression of NF-E4 in quantities greater than can be isolated from natural sources, or in indicator cells that are specially engineered to indicate the activity of NF-E4 expressed after transfection or transformation of the cells. Accordingly, the present invention contemplates methods for identifying specific ligands of NF-E4 using various screening assays known in the art.
Any screening technique known in the art can be used to screen for NF-E4 agonists or antagonists. The present invention contemplates screens for small molecule ligands or ligand analogs and mimics, as well as screens for natural ligands that bind to and agonize or antagonize NF-E4 in vivo. Such agonists or antagonists may, for example, interfere in the phosphorylation or dephosphorylation of NF-E4, with resulting effects on NF-E4 function. For example, natural products libraries can be screened using assays of the invention for molecules that agonize or antagonize NF-E4 activity.
Knowledge of the primary sequence of NF-E4, and the similarity of that sequence with proteins of known function, can provide an initial clue as to the inhibitors or antagonists of the protein. Identification and screening of antagonists is further facilitated by determining structural features of the protein, e.g., using X-ray crystallography, neutron diffraction, nuclear magnetic resonance spectrometry, and other techniques for structure determination. These techniques provide for the rational design or identification of agonists and antagonists.
Another approach uses recombinant bacteriophage to produce large libraries. Using the "phage method" (Scott and Smith, Science, 249:386-390, 1990; Cwirla et al., Proc. Natl. Acad. Sci. USA, 87:6378-6382, 1990; Devlin et al., Science, 249:404-406, 1990), very large libraries can be constructed (10⁶-10⁸ chemical entities). A second approach uses primarily chemical methods, of which the Geysen method (Geysen et al., Molecular Immunology 23:709-715, 1986; Geysen et al., J. Immunologic Method 102:259-274, 1987) and the method of Fodor et al. (Science 251:767-773, 1991) are examples. Furka et al. (14th International Congress of Biochemistry, Volume #5, Abstract FR:013, 1988; Furka, Int. J. Peptide Protein Res. 37:487-493, 1991), Houghton (U.S. Patent No. 4,631,211, issued December 1986) and Rutter et al. (U.S. Patent No. 5,010,175, issued April 23, 1991) describe methods to produce a mixture of peptides that can be tested as agonists or antagonists.
In another aspect, synthetic libraries (Needels et al., Proc. Natl. Acad. Sci. USA 90:10700-4, 1993; Ohlmeyer et al., Proc. Natl. Acad. Sci. USA 90:10922-10926, 1993; Lam et al., International Patent Publication No. WO 92/00252; Kocis et al., International Patent Publication No. WO 9428028) and the like can be used to screen for NF-E4 ligands according to the present invention.
Test compounds are screened from large libraries of synthetic or natural compounds. Numerous means are currently used for random and directed synthesis of saccharide, peptide, and nucleic acid based compounds. Synthetic compound libraries are commercially available from Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex (Princeton, NJ), Brandon Associates (Merrimack, NH), and Microsource (New Milford, CT). A rare chemical library is available from Aldrich (Milwaukee, WI). Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available from, e.g., Pan Laboratories (Bothell, WA) or MycoSearch (NC), or are readily producible. Additionally, natural and synthetically produced libraries and compounds are readily modified through conventional chemical, physical, and biochemical means (Blondelle et al., TIBTECH, 14:60, 1996).
In Vitro Screening Methods
Candidate agents are added to in vitro cell cultures of hemopoietic cells, prepared by known methods in the art (D. Metcalf, Hemopoietic Colonies, In Vitro Cloning of Normal and Leukemic Cells, particularly at Ch. 3, Springer-Verlag, 1977), and the levels of NF-E4 RNA are measured. If the amount of RNA in cultures which include the agent rises, compared to the control cultures which are devoid of the agent, the agent is chosen as a candidate for in vivo testing.
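The selection rule in the preceding paragraph can be sketched as follows. This is only a minimal illustration, assuming NF-E4 RNA levels have already been quantified as numbers in arbitrary units; the function name `is_in_vivo_candidate` and the `min_fold` cutoff are hypothetical and are not taken from the specification.

```python
from statistics import mean


def is_in_vivo_candidate(treated_rna, control_rna, min_fold=1.0):
    """Return True when mean NF-E4 RNA in agent-treated cultures rises
    above the mean of control cultures that lack the agent.

    treated_rna, control_rna: lists of RNA measurements (arbitrary units).
    min_fold: hypothetical cutoff; 1.0 means "any rise over control".
    """
    if not treated_rna or not control_rna:
        raise ValueError("both treated and control measurements are required")
    control_mean = mean(control_rna)
    if control_mean <= 0:
        raise ValueError("control RNA level must be positive")
    fold_change = mean(treated_rna) / control_mean
    return fold_change > min_fold
```

In practice a stricter `min_fold` or a statistical test across replicates would likely be preferred over a bare comparison of means.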
Various in vitro systems can be used to analyze the effects of a new compound on NF-E4 expression. Human bone marrow or peripheral blood mononuclear cells are isolated by standard techniques known in the art. They are plated in methylcellulose in the presence of the cytokines SCF, IL-3, IL-6, and erythropoietin (semisolid culture) or in tissue culture media with the same cytokines (liquid culture). Each experiment is performed in triplicate at five different dilutions of compound on at least two different patient samples. At about 12 to 14 days, colonies are counted to exclude direct cytotoxic effects; morphological analysis is performed to exclude adverse effects on erythroid differentiation. BFUe colonies are plucked from the semisolid media (cells are lysed in the liquid culture) and assayed by Reverse Transcriptase Polymerase Chain Reaction (RT-PCR) for NF-E4 and globin subtype expression, using RNA obtained from the cells. At the same time, antibodies are used to detect expression of NF-E4, and the ratio of the 22 kD to 14 kD NF-E4 polypeptides, as well as transcription factors including but not limited to GATA1-3, NF-E2, and EKLF.
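The experimental design above (triplicates, five dilutions, at least two patient samples) and the 22 kD to 14 kD ratio can be laid out as a short sketch. The function names, the sample labels, and the dilution values used in the usage note are hypothetical placeholders, and band intensities are assumed to be already quantified.

```python
from itertools import product


def build_plate_plan(patient_samples, dilutions, replicates=3):
    """Enumerate every culture required by the design: each patient
    sample at each compound dilution, repeated `replicates` times."""
    if len(set(patient_samples)) < 2:
        raise ValueError("at least two different patient samples are required")
    if len(set(dilutions)) != 5:
        raise ValueError("five different dilutions are required")
    return [
        {"sample": sample, "dilution": dilution, "replicate": rep}
        for sample, dilution, rep in product(
            patient_samples, dilutions, range(1, replicates + 1)
        )
    ]


def isoform_ratio(band_22kd, band_14kd):
    """Ratio of the 22 kD to the 14 kD NF-E4 polypeptide signal."""
    if band_14kd <= 0:
        raise ValueError("14 kD signal must be positive")
    return band_22kd / band_14kd
```

For example, `build_plate_plan(["P1", "P2"], [1, 10, 100, 1000, 10000])` yields 2 x 5 x 3 = 30 cultures per compound.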
Alternatively, a recombinant NF-E4 activity system can be constructed. In this recombinant system, a host cell is modified to contain either globin genes or a reporter gene operably associated with the γ-globin or ε-globin promoter from the β-globin control locus. Optionally, the host cell can be a human hematopoietic cell containing the endogenous NF-E4 gene, including the native NF-E4 expression control sequences; or the host cell can be a non-human cell modified to express human NF-E4 constitutively or under control of its native expression control sequences. Compounds are tested for the ability to promote or inhibit NF-E4 activity, which is evaluated on the level of expression of γ- or ε-globin, or both, or the reporter gene. Compounds that modulate NF-E4 activity can be tested in cells that constitutively express NF-E4, e.g., K562 cells or modified cells. Compounds that modulate NF-E4 expression can be tested in cells in which the NF-E4 coding sequence is operably associated with an NF-E4 expression control sequence. In this latter embodiment, NF-E4 expression can be tested directly.
Reporter genes for use in the invention encode enzymatically, spectroscopically or immunologically detectable proteins, including, but by no means limited to, chloramphenicol acetyl transferase (CAT), β-galactosidase (β-gal), luciferase, green fluorescent protein (GFP), alkaline phosphatase, and derivatives thereof.
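A readout from the recombinant reporter system described above might be scored along the following lines. This is an illustrative sketch only: the two-fold cutoff, the function name `classify_compound`, and the use of a vehicle-only control as the baseline are assumptions, not requirements of the invention.

```python
def classify_compound(treated_signal, vehicle_signal, fold_cutoff=2.0):
    """Label a compound by its effect on reporter (or globin) expression
    relative to a vehicle-only control.

    fold_cutoff is a hypothetical threshold: a signal at least this many
    times the control counts as promotion, and a signal at or below
    1/fold_cutoff of the control counts as inhibition.
    """
    if vehicle_signal <= 0 or treated_signal < 0:
        raise ValueError("signals must be non-negative, with a positive control")
    if fold_cutoff <= 1:
        raise ValueError("fold_cutoff must be greater than 1")
    fold = treated_signal / vehicle_signal
    if fold >= fold_cutoff:
        return "promotes"
    if fold <= 1.0 / fold_cutoff:
        return "inhibits"
    return "no clear effect"
```

The same comparison applies whichever detectable reporter is used, provided treated and control signals are measured in the same units.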
In Vivo Screening Methods
Intact cells or whole animals expressing a gene encoding NF-E4 can be used in screening methods to identify candidate drugs. In one series of embodiments, a permanent cell line is established.
Alternatively, cells (including without limitation mammalian, insect, yeast, or bacterial cells) are transiently programmed to express an NF-E4 gene by introduction of appropriate DNA or mRNA, e.g., using the vector systems described above. Identification of candidate compounds can be achieved using any suitable assay, including without limitation (i) assays that measure selective binding of test compounds to NF-E4; (ii) assays that measure the ability of a test compound to modify (i.e., inhibit or enhance) a measurable activity or function of NF-E4; and (iii) assays that measure the ability of a compound to modify (i.e., inhibit or enhance) the transcriptional activity of sequences derived from the promoter (i.e., regulatory) regions of the NF-E4 gene.
In Vivo Testing Using Transgenic Animals
Transgenic mammals can be prepared for evaluating the molecular mechanisms of NF-E4. Such mammals provide excellent models for screening or testing drug candidates. The term "transgenic" usually refers to an animal whose germ line and somatic cells contain the transgene of interest, i.e., NF-E4. However, transient transgenic animals can be created by the ex vivo or in vivo introduction of an expression vector of the invention. Both types of "transgenic" animals are contemplated for use in the present invention, e.g., to evaluate the effect of a test compound on NF-E4 expression or activity. Thus, human NF-E4, or NF-E4 and , or both, "knock-in" mammals can be prepared for evaluating the molecular biology of this system in greater detail than is possible with human subjects. It is also possible to evaluate compounds or diseases on "knockout" animals, e.g., to identify a compound that can compensate for a defect in NF-E4 activity. Both technologies permit manipulation of single units of genetic information in their natural position in a cell genome and examination of the results of that manipulation in the background of a terminally differentiated organism.
Although rats and mice, as well as rabbits, are most frequently employed as transgenic animals, particularly for laboratory studies of protein function and gene regulation in vivo, any animal can be employed in the practice of the invention. Moreover, double transgenic animals, e.g., for NF-E4 and the β-locus control regions, can be prepared by mating the corresponding single transgenic animals. Various transgenic animals containing non-native β-globin locus control regions have been generated (see Taboit-Dameron et al., Transgenic Res., 1999, 8:223-35; Osborne et al., J. Virol., 1999, 73:5490-6; Zhu et al., Blood, 1999, 93:3540-9; Tanimoto et al., Nature, 1999, 398:344-8; Bunger et al., Mol. Cell. Biol., 1999, 19:3062-72).
A "knock-in" mammal is a mammal in which an endogenous gene is substituted with a heterologous gene (Roemer et al., New Biol., 3:331, 1991). Preferably, the heterologous gene is "knocked-in" to a locus of interest, either the subject of evaluation (in which case the gene may be a reporter gene; see Elefanty et al., Proc. Natl. Acad. Sci. USA, 95:11897, 1998) of expression or function of a homologous gene, thereby linking the heterologous gene expression to transcription from the appropriate promoter. This can be achieved by homologous recombination, transposon (Westphal and Leder, Curr. Biol., 7:530, 1997), using mutant recombination sites (Araki et al., Nucleic Acids Res., 25:868, 1997) or PCR (Zhang and Henderson, Biotechniques, 25:784, 1998). See also, Coffman, Semin. Nephrol., 17:404, 1997; Esther et al., Lab. Invest., 74:953, 1996; Murakami et al., Blood Press. Suppl., 2:36, 1996.
A "knockout mammal" is a mammal (e.g., mouse) that contains within its genome a specific gene that has been inactivated by the method of gene targeting (see, e.g., US Patents No. 5,777,195 and No. 5,616,491). A knockout mammal includes both a heterozygous knockout (i.e., one defective allele and one wild-type allele) and a homozygous knockout (i.e., two defective alleles). Preparation of a knockout mammal requires first introducing a nucleic acid construct that will be used to suppress expression of a particular gene into an undifferentiated cell type termed an embryonic stem (ES) cell. This cell is then injected into a mammalian embryo. A mammalian embryo with an integrated cell is then implanted into a foster mother for the duration of gestation. Zhou et al. (Genes and Development, 9:2623-34, 1995) describe PPCA knockout mice.
The term "knockout" refers to partial or complete suppression of the expression of at least a portion of a protein encoded by an endogenous DNA sequence in a cell. The term "knockout construct" refers to a nucleic acid sequence that is designed to decrease or suppress expression of a protein encoded by endogenous DNA sequences in a cell. The nucleic acid sequence used as the knockout construct is typically comprised of (1) DNA from some portion of the gene (exon sequence, intron sequence, and/or promoter sequence) to be suppressed and (2) a marker sequence used to detect the presence of the knockout construct in the cell. The knockout construct is inserted into a cell, and integrates with the genomic DNA of the cell in such a position so as to prevent or interrupt transcription of the native DNA sequence. Such insertion usually occurs by homologous recombination (i.e., regions of the knockout construct that are homologous to endogenous DNA sequences hybridize to each other when the knockout construct is inserted into the cell and recombine so that the knockout construct is incorporated into the corresponding position of the endogenous DNA). The knockout construct nucleic acid sequence may comprise (1) a full or partial sequence of one or more exons and/or introns of the gene to be suppressed, (2) a full or partial promoter sequence of the gene to be suppressed, or (3) combinations thereof. Typically, the knockout construct is inserted into an embryonic stem cell (ES cell) and is integrated into the ES cell genomic DNA, usually by the process of homologous recombination. This ES cell is then injected into, and integrates with, the developing embryo.
The phrases "disruption of the gene" and "gene disruption" refer to insertion of a nucleic acid sequence into one region of the native DNA sequence (usually one or more exons) and/or the promoter region of a gene so as to decrease or prevent expression of that gene in the cell as compared to the wild-type or naturally occurring sequence of the gene. By way of example, a nucleic acid construct can be prepared containing a DNA sequence encoding an antibiotic resistance gene which is inserted into the DNA sequence that is complementary to the DNA sequence (promoter and/or coding region) to be disrupted. When this nucleic acid construct is then transfected into a cell, the construct will integrate into the genomic DNA. Thus, many progeny of the cell will no longer express the gene at least in some cells, or will express it at a decreased level, as the DNA is now disrupted by the antibiotic resistance gene.
Generally, for homologous recombination, the DNA will be at least about 1 kilobase (kb) in length and preferably 3-4 kb in length, thereby providing sufficient complementary sequence for recombination when the construct is introduced. Transgenic constructs can be introduced into the genomic DNA of the ES cells, into the male pronucleus of a fertilized oocyte by microinjection, or by any methods known in the art, e.g., as described in U.S. Patent Nos. 4,736,866 and 4,870,009, and by Hogan et al., Transgenic Animals: A Laboratory Manual, 1986, Cold Spring Harbor. A transgenic founder animal can be used to breed other transgenic animals; alternatively, a transgenic founder may be cloned to produce other transgenic animals.
Included within the scope of this invention is a mammal in which two or more genes have been knocked out or knocked in, or both. Such mammals can be generated by repeating the procedures set forth herein for generating each knockout construct, or by breeding two mammals, each with a single gene knocked out, to each other, and screening for those with the double knockout genotype. Regulated knockout animals can be prepared using various systems, such as the tet-repressor system (see US Patent No. 5,654,168) or the Cre-Lox system (see US Patents No. 4,959,317 and No. 5,801,030).
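The breeding route to a double knockout can be illustrated with a simple Mendelian enumeration. The sketch below assumes, purely for illustration, that the two targeted genes are unlinked (independent assortment) and that all genotypes are equally viable; "+" denotes a wild-type allele and "-" a knockout allele, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """All equally likely gametes for a genotype given as a tuple of
    allele pairs, one pair per gene, e.g. (("+", "-"), ("+", "-"))."""
    return list(product(*genotype))


def offspring_distribution(parent1, parent2):
    """Expected genotype frequencies among offspring of two parents."""
    g1, g2 = gametes(parent1), gametes(parent2)
    counts = {}
    for a, b in product(g1, g2):
        # Combine one allele per gene from each gamete; sort so "+-" == "-+".
        child = tuple("".join(sorted(pair)) for pair in zip(a, b))
        counts[child] = counts.get(child, 0) + 1
    total = len(g1) * len(g2)
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two homozygous single knockouts: gene 1 knocked out in one parent,
# gene 2 knocked out in the other.
single_ko_gene1 = (("-", "-"), ("+", "+"))
single_ko_gene2 = (("+", "+"), ("-", "-"))
double_het = (("+", "-"), ("+", "-"))

f1 = offspring_distribution(single_ko_gene1, single_ko_gene2)
f2 = offspring_distribution(double_het, double_het)
```

Under these assumptions every first-generation animal is a double heterozygote, and intercrossing those animals gives the homozygous double knockout in 1 of 16 offspring, which is why screening of the progeny is needed.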
High-Throughput Screen
Agents according to the invention may be identified by screening in high-throughput assays, including without limitation cell-based or cell-free assays. It will be appreciated by those skilled in the art that different types of assays can be used to detect different types of agents. Several methods of automated assays have been developed in recent years so as to permit screening of tens of thousands of compounds in a short period of time. Such high-throughput screening methods are particularly preferred. The use of high-throughput screening assays to test for agents is greatly facilitated by the availability of large amounts of purified polypeptides, as provided by the invention.
Therapy
As noted above, by providing for increasing (or decreasing) fetal and embryonic globin gene expression and (at least in some cases) simultaneously decreasing adult globin gene expression by harnessing the activity of NF-E4, the present invention permits, for the first time, an effective molecular treatment for various hemoglobinopathies, including β-thalassemia and sickle-cell anemia. The subjects to which the present invention is applicable may be any mammalian or vertebrate species, which include, but are not limited to, cows, horses, sheep, pigs, fowl (e.g., chickens), goats, cats, dogs, hamsters, mice, rats, monkeys, rabbits, chimpanzees, and humans. In a preferred embodiment, the subject is a human. Positive regulation of expression of genes encoding fetal and embryonic globin and simultaneous negative regulation of genes encoding defective adult globin can be achieved by delivery of positive regulator NF-E4 polypeptide (e.g., the 22 kD polypeptide), by gene therapy, including providing a vector that expresses positive regulator NF-E4 in target (erythropoietic) cells or modifying target cells by introduction of a heterologous promoter in the NF-E4 gene that provides for amplification of expression of the endogenous NF-E4 gene.
The polypeptide or gene therapeutic can be delivered as a pharmaceutical composition, i.e., a mixture or admixture of the polypeptide or vector with a pharmaceutically acceptable carrier or excipient. The phrase "pharmaceutically acceptable" refers to molecular entities and compositions that are physiologically tolerable and do not typically produce an allergic or similar untoward reaction, such as gastric upset, dizziness and the like, when administered to a human. Preferably, as used herein, the term "pharmaceutically acceptable" means approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, and more particularly in humans. The term "carrier" refers to a diluent, adjuvant, excipient, or vehicle with which the compound is administered. Such pharmaceutical carriers can be sterile liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like. Water or aqueous solution saline solutions and aqueous dextrose and glycerol solutions are preferably employed as carriers, particularly for injectable solutions. Suitable pharmaceutical carriers are described in "Remington's Pharmaceutical Sciences" by E.W. Martin. The phrase "therapeutically effective amount" is used herein to mean an amount sufficient to reduce by at least about 15 percent, preferably by at least 50 percent, more preferably by at least 90 percent, and most preferably prevent, a clinically significant deficit in the activity, function and response of the host. Alternatively, a therapeutically effective amount is sufficient to cause an improvement in a clinically significant condition in the host.
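The graded definition of a "therapeutically effective amount" given above can be expressed as a small calculation. This is a sketch under stated assumptions: the deficit is taken to be a single positive number measured before and after treatment, and the function names and tier labels are hypothetical shorthand for the "at least about 15 percent", "at least 50 percent", "at least 90 percent", and "prevent" levels in the text.

```python
def deficit_reduction_percent(baseline_deficit, treated_deficit):
    """Percent reduction of a clinically significant deficit after treatment."""
    if baseline_deficit <= 0:
        raise ValueError("baseline deficit must be positive")
    return 100.0 * (baseline_deficit - treated_deficit) / baseline_deficit


def effectiveness_tier(reduction_percent):
    """Map a percent reduction onto the tiers named in the definition."""
    if reduction_percent >= 100:
        return "most preferred (deficit prevented)"
    if reduction_percent >= 90:
        return "more preferred (at least 90 percent)"
    if reduction_percent >= 50:
        return "preferred (at least 50 percent)"
    if reduction_percent >= 15:
        return "effective (at least about 15 percent)"
    return "below threshold"
```

For instance, a deficit that falls from 40 units to 10 units is a 75 percent reduction, which lands in the "preferred" tier.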
A therapeutically effective polypeptide or gene therapy is a therapy that results in expression of a sufficient level of fetal or embryonic, or both, globins so that there is a measurable improvement in the pathological condition. A measurable improvement is one in which the level of blood oxygenation increases; or a symptom of hemoglobinopathy is reduced perceptibly. These improvements can be evaluated both quantitatively, using medical testing apparatus available to physicians, or by using molecular techniques to evaluate expression of fetal and/or embryonic and/or adult globin mRNA or protein; and qualitatively, by evaluating the patient's perception of his or her condition.
Each of these methods can also be used to deliver the negative regulator NF-E4 (e.g., the 14 kD polypeptide), e.g., to suppress inappropriate fetal or embryonic globin expression, which may be associated with inappropriate expression of NF-E4 or expression of a mutant variant of NF-E4. In this embodiment, a therapeutically effective polypeptide or gene therapy is a therapy that results in a reduction in the level of expression of a fetal or embryonic globin mRNA or protein. Preferably, such therapy results in an improvement of a condition associated with overexpression of the fetal or embryonic globins.
Polypeptide Administration
Therapeutic compositions containing NF-E4 polypeptide for use in accordance with the present invention can be formulated in any conventional manner using one or more physiologically acceptable carriers or excipients.
Thus, the polypeptide (or functionally active fragments thereof) and their physiologically acceptable salts and solvates can be formulated for administration by inhalation or insufflation (either through the mouth or the nose) or oral, buccal, parenteral or rectal administration.
For oral administration, the therapeutics can take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinized maize starch, polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). The tablets can be coated by methods well known in the art. Liquid preparations for oral administration can take the form of, for example, solutions, syrups or suspensions, or they can be presented as a dry product for constitution with water or other suitable vehicle before use. Such liquid preparations can be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The preparations can also contain buffer salts, flavoring, coloring and sweetening agents as appropriate.
Preparations for oral administration can be suitably formulated to give controlled release of the active compound.
For buccal administration the therapeutics can take the form of tablets or lozenges formulated in conventional manner.
For administration by inhalation, the therapeutics according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit can be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator can be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch. Pulmonary delivery of polypeptides is an effective route of administration.
The therapeutics (including in this case, gene therapeutics) can be formulated for parenteral administration (i.e., intravenous or intramuscular) by injection, via, for example, bolus injection or continuous infusion. Formulations for injection can be presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added preservative. The compositions can take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and can contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the active ingredient can be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use. The therapeutics can also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.
In addition to the formulations described previously, the therapeutics can also be formulated as a depot preparation. Such long acting formulations can be administered by implantation (for example, subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds can be formulated with suitable polymeric or hydrophobic materials (for example, as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
The composition, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents. The composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder. Oral formulation can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, etc.
Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampoule or sachette indicating the quantity of active agent. Where the composition is administered by injection, an ampoule of sterile diluent can be provided so that the ingredients may be mixed prior to administration.
The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the vaccine formulations of the invention. Associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration.
The compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient. The pack may for example comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration. Compositions comprising a compound of the invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an appropriate container, and labelled for treatment of an indicated condition.
Gene Therapy
In a specific embodiment, a vector for expression of NF-E4, or alternatively a vector for introduction of a heterologous expression regulatory sequence to amplify endogenous NF-E4 expression (see U.S. Patent No. 5,733,761), can be delivered to treat or prevent a disease or disorder associated with a hemoglobinopathy.
Any of the methods for gene therapy available in the art can be used according to the present invention. Exemplary methods are described below.
For general reviews of the methods of gene therapy, see Goldspiel et al., Clinical Pharmacy, 1993, 12:488-505; Wu and Wu, Biotherapy, 1991, 3:87-95; Tolstoshev, Ann. Rev. Pharmacol. Toxicol., 1993, 32:573-596; Mulligan, Science, 1993, 260:926-932; Morgan and Anderson, Ann. Rev. Biochem., 1993, 62:191-217; and May, TIBTECH, 1993, 11:155-215. Methods commonly known in the art of recombinant DNA technology which can be used are described in Ausubel et al. (eds.), 1993, Current Protocols in Molecular Biology, John Wiley & Sons, NY; Kriegler, 1990, Gene Transfer and Expression, A Laboratory Manual, Stockton Press, NY; and in Chapters 12 and 13, Dracopoli et al. (eds.), 1994, Current Protocols in Human Genetics, John Wiley & Sons, NY.
In one aspect, the therapeutic vector, e.g., any of the vectors described above, comprises a nucleic acid that expresses NF-E4 in a suitable host. In particular, such a vector has a promoter operably linked to the coding sequence for NF-E4. The promoter can be inducible or constitutive and, optionally, tissue-specific. In another embodiment, a nucleic acid molecule is used in which the NF-E4 sequences and any other desired sequences are flanked by regions that promote homologous recombination at a desired site in the genome, thus providing for intrachromosomal expression of NF-E4 (Koller and Smithies, Proc. Natl. Acad. Sci. USA, 1989, 86:8932-8935; Zijlstra et al., Nature, 1989, 342:435-438).
Delivery of the vector into a patient may be either direct, in which case the patient is directly exposed to the vector or a delivery complex, or indirect, in which case, cells are first transformed with the vector in vitro then transplanted into the patient. These two approaches are known, respectively, as in vivo and ex vivo gene therapy.
The form and amount of therapeutic nucleic acid envisioned for use depends on the type of disease and the severity of the desired effect, patient state, etc., and can be determined by one skilled in the art.
EXAMPLES
The present invention will be better understood by reference to the following Examples, which are provided by way of illustration of the invention and are not intended to limit it.
EXAMPLE 1: Induction of Human Fetal Globin Gene Expression by a Novel Erythroid Factor, NF-E4
Using the protein dimerization domain of CP2 as "bait" in a yeast two-hybrid library screen, this Example reports the cloning and characterization of human NF-E4, the tissue-restricted component of the SSP. We demonstrate that enforced expression of this factor in fetal/erythroid cells induces fetal and embryonic globin expression.
Materials and Methods
Yeast two-hybrid screen. The cDNA sequence encoding the COOH-terminal 242 amino acids (aa 260-502) of CP2 was inserted into the yeast expression vector pGBT9 (Clontech). The resultant plasmid encodes a hybrid protein containing the DNA-binding domain of GAL4 fused in-frame to CP2 residues. The yeast reporter strain, HF7C, rendered competent by the lithium acetate method, was sequentially transformed with this vector and a plasmid cDNA library derived from K562 cells constructed in the yeast expression vector pACT2 (Fields and Song, Nature, 340:245-246, 1989). The cDNAs in this vector were fused with the GAL4 transactivation domain. The yeast were plated on leucine-/tryptophan-/histidine- plates and incubated at 30°C for 4 days. Potential protein interactions were indicated by activation of the histidine reporter gene, shown by growth on these plates, and by activation of the second reporter gene, β-galactosidase, shown by a positive X-gal assay. Library plasmids were rescued from yeast clones using the acid-washed glass beads procedure and electroporated into the competent E. coli strain MC1061.
Sequencing reactions were performed using the Taq Dyedeoxy terminator cycle sequencing method (Applied Biosystems) and analyzed on an Applied Biosystems 373 automated sequencer.
Yeast one-hybrid assay. A concatamer of four copies of the SSE was cloned into the EcoRI/SalI sites of the yeast vector, pLacZ. As a control, a four copy concatamer of the direct repeat elements (DRE) from the proximal β-promoter was also cloned into this vector. Small scale transformations of each vector were performed into the Saccharomyces cerevisiae strain YM4271 which is auxotrophic for histidine, uracil, leucine and tryptophan. Prior to transfection, the vector was linearized with NotI to allow genomic integration into the ura3-52 site which confers auxotrophy to uracil allowing selection of transformants. A single yeast colony which displayed no basal reporter gene activity was chosen for subsequent experiments. This colony was expanded and transformed with pACT-CP2, pACT106 or pACT117. The transformants were selected on minimal medium lacking leucine and uracil and colonies were lifted on filters and assayed for β-galactosidase activity.
Mammalian two-hybrid assay. Mammalian expression vectors containing the dimerization domain of CP2 fused in-frame to the GAL4DB and NF-E4 fused in-frame to the VP16AD were generated. These plasmids were co-transfected with pG5CAT, a reporter construct with 5 GAL4 DNA binding sites linked to the chloramphenicol acetyltransferase (CAT) gene, into 293 cells using calcium phosphate precipitation. Vectors lacking either CP2 or NF-E4 were transfected as controls. After 48 hours, cells were harvested and whole cell lysate was prepared. CAT activity was measured using the CAT enzyme-linked immunosorbent assay (ELISA) according to the manufacturer's instructions (Boehringer Mannheim).
5' RACE (rapid amplification of 5' cDNA ends). A Marathon 5' RACE cDNA library was constructed from mRNA from K562 cells according to the manufacturer's instructions (Clontech). Nested PCR was performed using the following vector- and gene-specific primers: gene-specific primer 1 5'-CCCTTGGCTCAGATGAAGCGATGGTAGT-3' (SEQ ID NO:4), gene-specific primer 2 5'-TGGCCTGCAGGGCCCCAGTAGGT-3' (SEQ ID NO:5), vector-specific primer 1 5'-CCATCCTAATACGACTCACTATAGGGC-3' (SEQ ID NO:7), vector-specific primer 2 5'-ACTCACTATAGGGCTCGAGCGGC-3' (SEQ ID NO:8). PCR conditions were as follows: 95°C - 1 minute, 1 cycle; 94°C - 10 seconds, 68°C - 2 minutes, 30 cycles; 68°C - 5 minutes, 1 cycle. Nested PCR was performed under identical conditions except that the cycle number was reduced to 20. PCR products were electrophoresed on 1% agarose, blotted onto nitrocellulose, and probed with internal gene-specific oligonucleotides. Final PCR products were cloned into the TOPO 2.1 vector according to the manufacturer's instructions (Invitrogen) and sequenced.
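As an illustration only, the 5' RACE cycling conditions stated above can be written down as a small data structure and the programmed hold time totalled. The following Python sketch is not part of the described method; the names Step, Stage, race_program and hold_seconds are hypothetical, and the totals ignore ramp times between temperatures, which depend on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


@dataclass(frozen=True)
class Stage:
    steps: tuple    # ordered Step objects making up one cycle
    cycles: int     # number of times the steps are repeated


def race_program(cycles):
    """Cycling program from the text: 95 C for 1 min (1 cycle); 94 C for 10 s
    and 68 C for 2 min (`cycles` cycles); 68 C for 5 min (1 cycle)."""
    return (
        Stage((Step(95, 60),), 1),
        Stage((Step(94, 10), Step(68, 120)), cycles),
        Stage((Step(68, 300),), 1),
    )


def hold_seconds(program):
    """Total programmed hold time in seconds, excluding ramp times."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


primary = race_program(30)   # first-round PCR
nested = race_program(20)    # nested PCR, cycle number reduced to 20
print(f"primary: {hold_seconds(primary) / 60:.1f} min")
print(f"nested:  {hold_seconds(nested) / 60:.1f} min")
```

Under these assumptions the first round holds for 4260 seconds (71 minutes) and the nested round for 2960 seconds, so the two-step extension at 68°C dominates the run time.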
Generation of MSCV-based amphotropic retroviral supernatant and transduction of mammalian cell lines. The NF-E4 coding region was cloned into the retroviral vector plasmid MSCV-HA at a unique XhoI or EcoRI site. This bicistronic vector contains (i) an amphotropic retrovirus murine stem cell virus (MSCV) 5' long terminal repeat (LTR); (ii) a hemagglutinin (HA) epitope tag with the NF-E4 coding sequence in-frame either 5' or 3' to the tag; (iii) the encephalomyocarditis internal ribosomal entry site (IRES); (iv) the green fluorescent protein (GFP) cDNA, and (v) the MSCV 3' LTR (Figure 2). The plasmid was co-transfected with an amphotropic packaging plasmid into 293T cells by calcium phosphate precipitation. After 48 hours the supernatant containing amphotropic particles was harvested, filtered and added to K562 or MEL cells every 12 hours for three days. The cells were allowed to recover for 72 hours and then analyzed for GFP expression by flow cytometry. The highest expressing 10% of cells were sterilely sorted, expanded, resorted, and subsequently expanded in oligoclonal pools. A biological titer of the supernatant on NIH3T3 cells was equivalent to 1 x 10^6 cfu/ml.
Isolation and retroviral transduction of human CD34+ cells. Human cord blood was provided by the Bone Marrow Donor Institute. CD34+ cells were isolated using a MiniMACS magnetic cell sorting system (Miltenyi Biotec Inc). Cells were then cultured in IMDM supplemented with IL-3 (10 ng/ml), SCF (300 ng/ml), IL-6 (50 ng/ml), G-CSF (10 ng/ml), Flt3 (300 ng/ml), and anti-TGF-β1 (100 ng/ml) for 72 hours at 37°C. Cells were then harvested and washed in Hanks media prior to being replated on RetroNectin coated plates in the presence of amphotropic retroviral supernatant in IMDM and the cytokines listed above. Supernatant was changed every 12 hours for three days and then cells were washed in Hanks media and expanded for 2 days in BFU-E mix containing IL-3, IL-6, SCF and GM-CSF at 100 ng/ml and erythropoietin (5 U/ml). Cells were then analyzed for GFP expression by flow cytometry and positive cells were sterilely sorted into pools and expanded for a further 5 days in BFU-E mix. The pools were then harvested and RNA and cDNA prepared.
Expression of glutathione S-transferase (GST) fusion proteins and affinity chromatography. CP2 and NF-E4 cDNAs were cloned in-frame with the GST coding sequence in the pGEX vectors (Pharmacia). The GST fusion proteins were expressed in the E. coli strain BL21. Fusion proteins were purified on glutathione-Sepharose (Pharmacia) and their integrity confirmed with Coomassie Blue staining after SDS-PAGE. For in vitro protein-protein interaction assays, 1 μg of GST or GST fusion protein was incubated for 1 hour at 4°C with 10 μl glutathione-Sepharose beads, which had been preblocked with 0.5% milk. After extensive washing, the beads were resuspended in 200 μl binding buffer (10 mM Tris-HCl, pH 7.9/500 mM KCl/0.1 mM EDTA/150 μg/ml BSA/0.1% Nonidet P-40/10% glycerol) and incubated for 1 hour at room temperature with 35S methionine-labeled NF-E4. After extensive washing, retained proteins were eluted by boiling in SDS loading buffer and analyzed by SDS-PAGE and autoradiography.
Preparation of anti-NF-E4 antisera. An NF-E4 peptide having the sequence LKTDSALEQTPQQLPSLHLSQG (SEQ ID NO:8) was synthesized and conjugated to KLH. The peptide-KLH conjugate was prepared in complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA). Rabbits were primed with the immunogen-CFA mixture, then boosted three times at monthly intervals with the immunogen-IFA mixture.
Anti-sera were screened for reactivity with E. coli produced NF-E4-GST fusion proteins by ELISA and Western blotting. Positive sera were used in Western analysis, immunoprecipitation, and function ablation assays.
Extract preparation, immunoprecipitation and electrophoretic mobility shift assays (EMSA). Nuclear extracts were prepared by the method of Dignam as previously described (Dignam, Methods Enzymol., 182:194-203, 1990). Highly purified SSP was obtained by fractionating crude extract over heparin-Sepharose and DNA affinity columns as described previously (Jane et al., EMBO J., 14:97-105, 1995). For immunoprecipitation studies, nuclear extracts were initially precleared with normal rabbit serum (10 μg/ml) and then incubated with preimmune serum or antisera to CP2 or NF-E4 overnight at 4°C. A 50% slurry of protein G Sepharose was added and incubated at 4°C for 1 hour. The mixture was then centrifuged at 3000 x g for 1 minute and the pellet was washed in 50 mM Tris-HCl, pH 7.9 containing 150 mM NaCl prior to being resuspended in SDS loading buffer. Samples were subjected to SDS-PAGE, transferred to polyvinylidene difluoride (PVDF) membranes and blotted with antisera to NF-E4. Signal detection was achieved with the ECL system as per the manufacturer's instructions (Amersham Pharmacia).
EMSAs were performed by incubating varying amounts of nuclear extract with 10^5 cpm of 32P-dCTP end-labelled double-stranded oligonucleotides encoding the SSE region of the γ-globin promoter in a 20 μl reaction containing 500 ng of poly [d(I-C)], 6 mM MgCl2, 16.5 mM KCl and 100 μg of bovine serum albumin. For antibody studies, 3 μl of pre-immune serum or rabbit anti-mouse CP2 or NF-E4 antibody were preincubated for 10 minutes with the binding reaction, prior to addition of the probe. After incubation on ice for 15 min and 25°C for 15 min, samples were electrophoresed on a 4% non-denaturing polyacrylamide gel in 0.5 x Tris-Borate-EDTA buffer for 90 min at 10 V/cm.
RT-PCR and Northern analysis. First strand cDNA was prepared from 2 μg of mRNA from primary erythroid progenitors using random hexamers. Each cDNA sample was appropriately diluted to give similar amplification of S14 RNA under the same PCR conditions. The PCR conditions were as follows: 95°C for 1 minute followed by various cycles of 94°C for 30 seconds, 60°C for 30 seconds and 72°C for 1 minute with a final extension at 72°C for 5 minutes. The PCR primer sequences were as follows:
S14 sense 5'-GGCAGACCGAGATGAATCCTCA-3' (SEQ ID NO:9);
S14 antisense 5'-CAGGTCCAGGGGTCTTGGTCC-3' (SEQ ID NO:10);
NF-E4 sense 5'-ACCCGGGAGGGGCTCCGGTCTT-3' (SEQ ID NO:11);
NF-E4 antisense 5'-CCCTTGGCTCAGATGAAGCGATGGTAGT-3' (SEQ ID NO:12);
γ-globin sense 5'-AAGCTCCTAGTCCAGACGCCA-3' (SEQ ID NO:13);
γ-globin antisense 5'-GGCCACTCCAGTCACCATCTT-3' (SEQ ID NO:14);
β-globin sense 5'-AGGAGAAGTCTGCCGTTACTGC-3' (SEQ ID NO:15);
β-globin antisense 5'-CATAACAGCATCAGGAGTGG-3' (SEQ ID NO:16).
All PCR products were electrophoresed on 1.5% agarose gels, transferred to nitrocellulose membrane and analyzed by Southern blot using 32P radiolabeled internal oligonucleotides as probes. Membranes were then autoradiographed for 2 hours at -70°C. Northern analysis of K562 pools was performed as described previously (Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, New York, USA, 1989).
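For readers who wish to check the RT-PCR primers listed above, the following Python sketch tabulates the length and GC fraction of each sequence. It is an illustrative aid, not part of the described method; the helper names gc_fraction and reverse_complement are hypothetical, and the dictionary keys use "gamma" and "beta" in place of the Greek letters.

```python
# Primer sequences copied from the RT-PCR primer list above (5' to 3').
PRIMERS = {
    "S14 sense": "GGCAGACCGAGATGAATCCTCA",
    "S14 antisense": "CAGGTCCAGGGGTCTTGGTCC",
    "NF-E4 sense": "ACCCGGGAGGGGCTCCGGTCTT",
    "NF-E4 antisense": "CCCTTGGCTCAGATGAAGCGATGGTAGT",
    "gamma-globin sense": "AAGCTCCTAGTCCAGACGCCA",
    "gamma-globin antisense": "GGCCACTCCAGTCACCATCTT",
    "beta-globin sense": "AGGAGAAGTCTGCCGTTACTGC",
    "beta-globin antisense": "CATAACAGCATCAGGAGTGG",
}


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, e.g. to place an antisense
    primer on the sense strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
```

The NF-E4 antisense primer has the same sequence as gene-specific primer 1 used for 5' RACE, which can be confirmed by comparing the two strings directly.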
Results
Isolation of CP2 interacting proteins from a K562 cDNA library.
Previous studies had demonstrated that the ubiquitous transcription factor, CP2, formed a major component of the stage selector protein (SSP). The protein dimerization domain of CP2 has been mapped to the 242 amino acid residues at its carboxy terminus (Shirra et al., Mol. Cell. Biol., 14:5076-5087, 1994; Uv et al., Mol. Cell. Biol., 14:4020-4031, 1994). Within this region, a 17-amino acid stretch (aa 292-309) is essential for protein-protein interactions. A cDNA sequence encoding the COOH-terminal 242 amino acids (aa 260-502) of CP2 was inserted into the yeast expression vector pGBT9. The resultant plasmid (GAL4CP2-260) encodes a hybrid protein containing the DNA-binding domain of GAL4 fused to CP2 residues 260-502. The yeast reporter strain HF7C was transformed with this vector and an expression library derived from K562 cell line cDNAs fused to the sequences encoding the GAL4 transactivation domain. This library was chosen as K562, a human cell line, is a model of fetal erythropoiesis, constitutively expressing the ε- and γ-globin genes but not the adult β-globin genes (Rowley et al., Leukemia Res., 8:45-54, 1984). In addition, abundant SSP binding activity is evident in nuclear extract from these cells, which was the source for the biochemical purification of the CP2 component of the SSP. From 5 x 10^6 clones screened we isolated 100 clones that appeared to interact with the CP2 bait. From this collection we identified about 40 clones that encoded CP2, an expected result in view of the protein's ability to homodimerize. Another 50 clones represented previously identified false positives from the two-hybrid screen. Of the 10 remaining clones, eight corresponded to known genes or ESTs (expressed sequence tags) whose tissue distribution suggested that they were unlikely to represent the tissue-restricted component of the SSP. Only two clones, c106 and c117, were novel and hence their further evaluation was prioritized.
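The clone counts reported for the two-hybrid screen can be tallied as a simple consistency check. The Python sketch below is illustrative only; the function name tally_screen is hypothetical, and the CP2 count is taken as exactly 40 although the text states "about 40".

```python
def tally_screen(interacting, cp2_clones, false_positives, known_genes):
    """Return the number of clones left after removing CP2 clones and known
    false positives, and the number of those that are novel."""
    remaining = interacting - cp2_clones - false_positives
    novel = remaining - known_genes
    return {"remaining": remaining, "novel": novel}


screened = 5_000_000          # 5 x 10^6 clones screened
result = tally_screen(
    interacting=100,          # clones that appeared to interact with the bait
    cp2_clones=40,            # "about 40" in the text; assumed exact here
    false_positives=50,       # previously identified false positives
    known_genes=8,            # known genes or ESTs among the remainder
)
print(result)
print(f"interacting fraction: {100 / screened:.1e}")
```

With these figures, 10 clones remain after the CP2 and false-positive clones are set aside, of which two are novel, matching the counts given for c106 and c117.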
As the DNA binding activity of the SSP appears to depend on the presence of the CP2 partner protein, we evaluated the ability of c106 and c117 to bind to the SSE in the yeast one-hybrid assay. A yeast strain, YM4271, containing 4 concatamerized SSE sites linked to a LacZ reporter gene was generated and transfected with the K562 cDNA library plasmids encoding the GAL4ADc106, GAL4ADc117 or GAL4ADCP2 fusion proteins. A strain containing 4 concatamerized direct repeat elements (DRE) from the proximal β-promoter linked to the same reporter was transfected as a control. After transfection, both strains were plated on media lacking leucine and uracil and resultant colonies assayed for β-galactosidase activity. A positive result was observed with c106 but not with c117 or CP2 using the SSE binding site. No enzymatic activity was observed with the DRE binding sites with any of the three plasmids. Based on these findings, we postulated that c106 was a strong candidate for the partner protein of CP2 in the SSP complex and subsequently refer to it as NF-E4.
To validate the interaction between CP2 and NF-E4 in a eukaryotic expression system, we employed the mammalian two-hybrid assay. Mammalian expression vectors containing the dimerization domain of CP2 fused in-frame to the GAL4DB and NF-E4 fused in-frame to the VP16AD were generated. These plasmids were co-transfected with pG5CAT, a reporter construct with 5 GAL4 DNA binding sites linked to the chloramphenicol acetyltransferase (CAT) gene. Vectors lacking either CP2 or NF-E4 were transfected as controls. A marked induction of CAT activity was observed only in the presence of both proteins.
The NF-E4 gene encodes a 22 kD protein which initiates at a CUG codon.
To facilitate further studies of NF-E4, we employed 5' RACE using K562 cell cDNA to obtain a full-length clone. A 966 bp fragment was generated using nested gene and vector-specific primers. Comparison of this sequence with the databases using the BLAST algorithm revealed a high degree of homology with a sequence from a bacterial artificial chromosome (BAC) containing a region of the human X-chromosome. Sequence analysis revealed a long open reading frame (ORF) contiguous with that defined in the original yeast two-hybrid GAL4AD/c106 fusion vector (Figure 1). Although one potential initiation codon (AUG) was observed beginning at nucleotide 421, several observations suggested that translation of full-length NF-E4 might not start at this AUG. Firstly, the NF-E4 reading frame remains open for an additional 115 codons upstream of the first AUG before an in-frame termination codon is encountered. Secondly, the predicted size of the protein from the first AUG is markedly less than that suggested by the studies reported here. Finally, a CUG codon is preceded by a Kozak sequence and a termination codon is present in the correct reading frame 100 codons upstream of the first AUG. Translation from this CUG codon would generate a protein with a predicted molecular weight of approximately 22 kD.
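The reasoning above (an open reading frame that stays open upstream of the first AUG, with an in-frame CUG as a candidate start) can be illustrated by a short scan of a transcript. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the toy sequence is invented for the example and is not the NF-E4 sequence, and no attempt is made to score Kozak context.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def upstream_inframe_codons(mrna, aug_index):
    """Walk upstream from the AUG at aug_index, one codon at a time in the
    same reading frame, until an in-frame stop codon or the 5' end is reached.
    Returns (index, codon) pairs in 5' to 3' order."""
    codons = []
    i = aug_index - 3
    while i >= 0:
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append((i, codon))
        i -= 3
    return codons[::-1]


def candidate_cug_starts(mrna, aug_index):
    """Indices of in-frame CUG codons lying between the nearest upstream
    in-frame stop codon and the AUG."""
    return [i for i, codon in upstream_inframe_codons(mrna, aug_index)
            if codon == "CUG"]


# Toy transcript (hypothetical): 5' GG | UAA GCC CUG GCA GCU AUG GGC UAA 3'
toy = "GG" + "UAA" + "GCC" + "CUG" + "GCA" + "GCU" + "AUG" + "GGC" + "UAA"
aug = toy.find("AUG")
print("open codons upstream of AUG:", len(upstream_inframe_codons(toy, aug)))
print("in-frame CUG positions:", candidate_cug_starts(toy, aug))
```

In the toy transcript, four codons remain open upstream of the AUG before the in-frame UAA, and one of them is a CUG. Whether such a codon is actually used for initiation has to be established experimentally, as described in the text.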
Many studies have documented translation of human mRNAs from initiation codons other than AUG, most often CUG (Peabody, J. Biol. Chem., 264:5031-5035, 1989; Kozak, Proc. Natl. Acad. Sci. USA, 87:8301-8305, 1990; Mehdi et al., Gene, 91:173-178, 1990). These include transcription factors such as TEF-1 and Krox-24 and proto-oncogenes such as c-myc, Int2, and others (Xiao et al., Cell, 65:551-568, 1991; Acland et al., Nature, 343:662-665, 1990; Hann et al., Genes & Dev., 6:1229-1240, 1992; Lemaire et al., Mol. Cell. Biol., 10:3456-3467, 1990). Comparison of the NF-E4 sequence with a human genomic clone isolated in our laboratory confirmed the presence of the single AUG in the mid-region of the sequence and the upstream CUG and termination codon. To determine whether effective translation was achieved from this CUG, we subcloned the NF-E4 cDNA into the in vitro transcription/translation vector PSP72 and generated 35S methionine-labeled protein. We also generated NF-E4 clones in which the CUG was mutated to an AUG but retained the native Kozak sequence or was mutated to an AUG with a consensus Kozak sequence to determine whether these sequence changes resulted in an enhanced efficiency in translation (Kozak, Nucleic Acids Res., 15:8125-8145, 1987). The dominant protein species translated from the AUG containing NF-E4 vectors had a molecular weight of 22 kD, identical to that predicted and observed with the CUG initiated NF-E4. No increase in efficiency of transcription/translation was observed with either AUG containing construct.
To determine whether CUG initiated translation could occur in vivo, we generated a Murine Stem Cell Virus (MSCV) based retroviral vector containing the NF-E4 cDNA (Hawley et al., Gene Therapy, 1:136-138, 1994). This bicistronic vector contains the green fluorescent protein (GFP) cDNA linked by the encephalomyocarditis internal ribosome entry site (IRES) to the NF-E4 cDNA tagged at its COOH-terminus with the hemagglutinin epitope (HA) (Figure 2). K562 cells were transduced with this virus (MSCV-NF-E4-HA) or the parental virus carrying the GFP cDNA alone (MSCV) and after 5 days GFP-positive cells were selected by FACS. Western analysis with an anti-HA antibody demonstrated a band in extract from the MSCV-NF-E4-HA transduced cells, which co-migrated with recombinant NF-E4-HA generated in bacteria. No corresponding band was observed in K562 cells transduced with MSCV alone. Further support for CUG initiated translation was obtained with the generation of polyclonal antiserum to the native NF-E4 protein. A dominant band of 22 kD was observed in Western analysis of native K562 cells, consistent with initiation at the CUG codon (Figure 3). A second minor band was observed at approximately 14 kD, which could reflect initiation at the downstream AUG.
NF-E4 interacts with CP2 in vitro and in vivo.
To confirm the interaction between CP2 and full-length NF-E4, we utilized GST-chromatographic assays. Glutathione-S-transferase alone (GST), GST fused in-frame with full-length endophilin (GST-END), or GST fused in-frame with full-length CP2 (GST-CP2) were coupled to glutathione Sepharose beads and incubated under stringent conditions with 35S methionine-labeled in vitro transcribed/translated NF-E4. Specific retention of NF-E4 was observed with the GST-CP2 beads, but not on control GST or GST-END beads. To confirm this interaction in an in vivo setting, co-immunoprecipitation studies were performed. Nuclear extract from K562 cells was immunoprecipitated with either anti-CP2 antiserum or preimmune serum and blotted with anti-NF-E4 antiserum. Immunoprecipitation and blotting with anti-NF-E4 antiserum served as the positive control. A specific band of 22 kD was observed after immunoprecipitation with 8 μl or 4 μl of anti-CP2 antiserum. No band was observed with preimmune (PI) serum derived from NF-E4 or CP2 inoculated rabbits. This finding indicates that CP2 and NF-E4 form a physiological complex in vivo.
NF-E4 is a component of the SSP complex.
To confirm that NF-E4 contributed to the formation of the SSP, we examined the effect of anti-NF-E4 antiserum on the SSP/SSE interaction in an electrophoretic mobility shift assay (EMSA). Addition of either anti-CP2 or anti-NF-E4 antisera to crude K562 cell nuclear extract specifically ablated the formation of the SSP/SSE complex, leaving the Sp1/SSE complex unaltered. Addition of the respective preimmune (PI) sera had no effect.
To determine whether de novo expression of NF-E4 in a CP2-expressing NF-E4 null cell line could result in the formation of the SSP complex, we transduced the human sarcoma cell line HT1080 with an MSCV-based retrovirus containing the NF-E4 cDNA tagged at the NH2-terminus with HA (MSCV-HA-NF-E4). This cell line contains abundant CP2 but no NF-E4 at the RNA or protein level. HT1080 cells transduced with MSCV alone served as the control. GFP-positive cells were selected by FACS, and expression of NF-E4 was confirmed by Western analysis. EMSA using an SSE probe revealed the presence of a new complex in nuclear extract derived from MSCV-HA-NF-E4-transduced cells which co-migrated with native SSP. No complex was observed in the line transduced with MSCV alone. Addition of anti-CP2 antisera to this EMSA ablated the SSP-SSE complex.
Further evidence for the role of NF-E4 in formation of the SSP complex was derived from Western analysis of an SSP fraction purified by heparin-Sepharose and DNA affinity chromatography. In this experiment, both the 22 kD and the lower molecular mass NF-E4 species were detected in crude and purified samples. One confounding result from our previous UV cross-linking studies was the prediction that the molecular weight of the CP2 partner protein in the SSP was 40-45 kD. In view of this, we examined whether the 22 kD NF-E4 protein could form homodimeric complexes in GST chromatography experiments. Glutathione-S-transferase alone (GST) or GST fused in-frame with full-length NF-E4 (GST-NF-E4) were coupled to glutathione Sepharose beads and incubated under stringent conditions with 35S methionine-labeled in vitro transcribed/translated NF-E4. Specific retention of NF-E4 was observed with the GST-NF-E4 beads, but not on control GST beads. This finding coupled with our UV cross-linking data suggests that the SSP complex is composed of two NF-E4 molecules linked to a molecule of CP2.
NF-E4 demonstrates a highly restricted pattern of expression.
To determine the tissue distribution of NF-E4 expression we initially performed Northern analysis on mRNA derived from K562 cells. Despite the demonstration of the presence of NF-E4 mRNA (by RT-PCR) and protein (by Western and EMSA) in these cells, we were unable to detect a signal with a variety of NF-E4 cDNA probes. Additional analysis of multi-tissue Northern blots also failed to detect a signal in a variety of tissues, as did RNAse protection analysis on mRNA from tissues and cell lines. Therefore we proceeded to define the expression pattern of NF-E4 using RT-PCR. Based on the genomic sequence, we designed primers from exons I and II, which are separated by an 1800 bp intron. The identity of the correct size PCR product was confirmed by Southern analysis using an internal primer as a probe. As shown in Figure 4, NF-E4 is expressed in fetal liver, cord blood and bone marrow. No expression was observed from a variety of other organs including colon, heart, spleen, kidney, liver, lymph node, and thymus (Figure 4).
In RT-PCR analysis of cell lines, expression was demonstrated with mRNA derived from the fetal and erythroid cell lines K562 and HEL, and the embryonic kidney cell line 293T. No product was amplifiable from a variety of other lines including Jurkat, CEM, MCF7, DU528, SY5Y, and COS.
To confirm the expression of NF-E4 in cord blood and bone marrow at protein level, Western analysis was performed with nuclear extract derived from these sources, using K562 nuclear extract as a control. A dominant band of 22 kD was observed in all extracts. In addition, the previously recognized smaller species was also detected in all samples.
Enforced expression of NF-E4 in K562 cells and cord blood progenitors induces fetal globin gene expression.
To examine the functional role of NF-E4 in globin gene expression, an MSCV-based vector was utilized containing the NF-E4 cDNA tagged with the HA epitope at the NH2-terminus (MSCV-HA-NF-E4). K562 cells were transduced with this vector or the parent GFP-containing vector (MSCV) and then sorted twice for green fluorescence by FACS and expanded in oligoclonal pools. All pools were subsequently shown to contain more than 99% GFP-positive cells by FACS analysis. Northern analysis of pools derived from MSCV-HA-NF-E4 transduced cells showed a significant upregulation (5-10-fold) of γ-globin gene expression compared to pools from the MSCV transduced cells (Figure 5). Expression of the housekeeping gene (GAPDH) was unchanged between pools. To extend the functional studies of NF-E4, we utilized the same retroviral supernatants to transduce CD34+ cells derived from human cord blood (see Methods). GFP-positive cells were sorted at day 2 and expanded in pools in erythropoietin, IL-3, IL-6, GM-CSF, and SCF. In this setting more than 80% of the colonies derived are erythroid in nature. At day 7, individual pools were analyzed by semi-quantitative RT-PCR for γ-globin gene expression. Increased γ-globin gene expression was observed in pools transduced with MSCV-NF-E4 compared with those transduced with the control MSCV retrovirus. After normalization for expression of the housekeeping gene S14, the increase in γ-globin gene expression was approximately 2-fold.
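The normalization to the housekeeping gene S14 described above amounts to a ratio of ratios. The Python sketch below shows one way such a fold change could be computed from band intensities; it is an assumption about the arithmetic, not a description of how the reported value was derived, and the function name and the intensity values are hypothetical.

```python
def normalized_fold_change(target_test, ref_test, target_ctrl, ref_ctrl):
    """Fold change of a target signal (e.g. gamma-globin) in test versus
    control pools, each first divided by its reference signal (e.g. S14)."""
    if min(ref_test, ref_ctrl, target_ctrl) <= 0:
        raise ValueError("reference and control signals must be positive")
    return (target_test / ref_test) / (target_ctrl / ref_ctrl)


# Hypothetical band intensities in arbitrary units (not data from the text).
fold = normalized_fold_change(
    target_test=200.0,   # gamma-globin signal, NF-E4 transduced pool
    ref_test=100.0,      # S14 signal, NF-E4 transduced pool
    target_ctrl=100.0,   # gamma-globin signal, control MSCV pool
    ref_ctrl=100.0,      # S14 signal, control MSCV pool
)
print(f"normalized fold change: {fold:.1f}")
```

Dividing by the reference signal first means that a difference in cDNA input between pools changes both numerator and denominator and cancels out of the final ratio.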
Embryonic ε-globin gene expression is also induced by NF-E4.
Previous reports have suggested a potential role for the SSP in embryonic gene regulation. We therefore examined the effects of enforced expression of NF-E4 on ε-globin gene expression. Total RNA from oligoclonal pools of K562 cells transduced with either the MSCV (lanes 1-4) or MSCV-HA-NF-E4 (lanes 5-9) retrovirus was analyzed by Northern blot (Figures 6A and 6B). Comparable results to those observed for γ-globin gene expression were obtained, with significant induction of ε-gene expression in the cells transduced with MSCV-HA-NF-E4. No evidence of β-globin gene activation was observed in any pool.
To evaluate the specificity of NF-E4 activity, we transduced a murine erythroleukemia cell line (MEL) with either MSCV or MSCV-HA-NF-E4 and sorted for GFP expression. Oligoclonal pools were analyzed by Northern blotting. No change in murine βmaj expression was observed. Expression of murine εy and βH1 was undetectable in all clones.
Discussion
This report details the molecular cloning and characterisation of human NF-E4, a novel gene encoding the tissue-restricted component of the SSP complex. The gene was isolated from a yeast two-hybrid screen of a K562 cell cDNA library using CP2, the previously identified ubiquitous component of the SSP, as the bait. NF-E4 is essential for DNA binding of the SSP, as demonstrated by the disruption of the SSP/SSE complex induced by NF-E4 antiserum and the activation in the yeast one-hybrid assay induced by NF-E4. Based on GST chromatographic assays and previous UV cross-linking data, it appears that the SSP is composed of two molecules of NF-E4 linked to a single molecule of CP2.
Analysis of the NF-E4 cDNA and protein sequence revealed no homology to known genes or ESTs. Specifically, no known DNA binding, protein dimerization, or trans-activation domains were evident. Protein translation appears to commence at a CUG codon as evidenced by the in vitro transcription-translation assays, analysis of in vivo translation using retroviral vectors, and the molecular weight of the native protein. Non-AUG initiation has been previously reported for a variety of mammalian proteins, most often from CUG codons (Boeck and Kolakofsky, EMBO J., 13:3608-3617, 1994; Kozak, Proc. Natl. Acad. Sci. USA, 87:8301-8305, 1990; Peabody, J. Biol. Chem., 264:5031-5035, 1989). Studies of some mammalian proteins and of viral mRNA in mammalian cells have also demonstrated initiation mediated by GUG, ACG, AUA and AUU (Mehdi et al., Gene, 91:173-178, 1990). Although our evidence for CUG initiation of native NF-E4 is indirect, it is significant that none of these alternate initiating codons are present either 5' or within 125 bp 3' of the CUG. It is also significant that, like many of the other factors with non-AUG initiated isoforms, NF-E4 may also exist in a truncated form initiated from a downstream AUG (Bruening and Pelletier, J. Biol. Chem., 271:8646-8654, 1996; Packham et al., Biochem. J., 328:807-813, 1997). This is evidenced by the smaller species observed in Western analyses and immunoprecipitation experiments with anti-NF-E4 antisera. Although it is possible that this smaller species represents a proteolytic cleavage product of NF-E4, the generation of a co-migrating protein from an MSCV retrovirus carrying the NF-E4 cDNA truncated to this ATG suggests that this species represents the product of alternate translation initiation. Many proteins are known to initiate at non-AUG codons (Prats et al., Proc. Natl. Acad. Sci. USA, 86:1836-1840, 1989; Mellentin et al., Cell, 58:77-83, 1989; Lemaire et al., Mol. Cell. Biol., 10:3456-3467, 1990; Xiao et al., Cell, 65:551-568, 1991; Saris et al., EMBO J., 10:655-664, 1991; Hann et al., Genes & Dev., 6:1229-1240, 1992; Nagpal et al., Proc. Natl. Acad. Sci. USA, 89:2718-2722, 1992; Bruening and Pelletier, J. Biol. Chem., 271:8646-8654, 1996). The use of non-AUG initiation codons in many of these proteins plays a key regulatory role. For example, the CUG- and AUG-initiated isoforms of the steroid receptor binding protein Bag-1, and the proto-oncogenes Int2 and Hck-1 differ in their subcellular localization (Packham et al., Biochem. J., 328:807-813, 1997; Acland et al., Nature, 343:662-665, 1990; Lock et al., Mol. Cell. Biol., 11:4363-4370, 1991). In addition, the CUG-initiated isoform of Bag-1 interacts with different protein partners and consequently has a unique functional role (Froesch et al., J. Biol. Chem., 273:11660-11666, 1998). Interestingly, the smaller NF-E4 peptide was immunoprecipitated with anti-NF-E4 antisera, but not co-immunoprecipitated with anti-CP2 antisera. This suggests that there may be intrinsic functional differences between these species (see also Example 2, infra).
The demonstration of NF-E4 mRNA and/or protein in fetal liver, bone marrow, and cord blood raises the question of the developmental stage specificity of the SSP complex. This finding is analogous to the expression pattern observed for another stage-specific globin regulatory factor, EKLF, which is present at both mRNA and protein level in yolk sac, fetal liver, and adult bone marrow (Donze et al., J. Biol. Chem., 270:1955-1959, 1995). Despite this, β-gene expression is only observed in definitive erythroid cells and mice nullizygous for this factor demonstrate no abnormalities in primitive erythropoiesis (Perkins et al., Nature, 375:318-322, 1995; Nuez et al., Nature, 375:316-318, 1995). The mechanism underlying this selectivity remains unknown, although the increase in γ-globin gene transcription observed in the fetal livers of EKLF-/- mice transgenic for the human globin locus suggests that it may be influenced by promoter competition (Wijgerde et al., Genes & Dev., 10:2894-2902, 1996; Perkins et al., Proc. Natl. Acad. Sci. USA, 93:12267-12271, 1996). Functionally, NF-E4 also displays a high degree of selectivity as it induces fetal and embryonic, but not adult globin gene expression. The lack of β-globin gene induction is observed in the context of K562 cells, in which constitutive β-globin gene expression is absent, and MEL cells, in which high levels of β-globin gene expression are observed (for further discussion of the effect of NF-E4 on adult globin gene expression see Example 4, infra). In contrast, induction of fetal and embryonic globin is observed only in cells in which these genes are normally transcribed. This suggests that the chromatin modifications associated with fetal/embryonic gene silencing cannot be altered simply by changing the transcription factor milieu.
This observation is comparable to the effects of enforced expression of EKLF, which fails to induce β-globin gene expression in cells in which the adult globin gene is constitutively silent (Donze et al, J. Biol. Chem., 270:1955-1959, 1995). As low levels of fetal globin expression are detectable in adult bone marrow, the presence of some NF-E4 in this tissue is not surprising. However, the levels observed appear similar to those observed in cord blood, in which the expression of the γ-globin genes is appreciably greater. This finding may indicate that another fetal factor, such as FKLF, is also required for the developmental specificity of γ-globin gene expression (Asano et al, Mol. Cell. Biol., 19:3571-3579, 1999).
Our previous studies have suggested that the enhancer activity of the LCR is influenced by the developmentally-specific factors EKLF and the SSP (Amrolia et al, J. Biol. Chem., 273:13593-13598, 1998). This effect is not mediated by direct protein-protein interactions between the stage-specific promoter-bound factors and factors bound to the LCR (Gallarda et al, Genes & Dev., 3:1845-1859, 1989). An alternate model is that EKLF and the SSP alter promoter structure, rendering it more amenable to the effects of the LCR enhanceosome. This model is supported by the loss of HS formation in the β-globin promoter in EKLF nullizygous mice and by the recent demonstration of complex formation between EKLF and the SWI-SNF chromatin remodeling factors (Wijgerde et al, supra, 1996; Armstrong et al, Cell, 95:93-104, 1998). Studies addressing the role of NF-E4 in chromatin modification are currently in progress.
The ultimate goal of defining factors which activate fetal globin is their potential for therapeutic intervention in the hemoglobinopathies. Patients with β-thalassemia and sickle-cell disease who also inherit genetic mutations that prolong fetal globin expression after birth have a significantly ameliorated clinical course (Poncz et al, In McLean, N. (ed.), Oxford Surveys of Eukaryotic Genes, Oxford University Press, Oxford, UK, 1989, pp. 163-203). To date, three factors which augment γ-globin gene expression in cell lines have been identified. The first of these, the helix-loop-helix protein Id2, is ubiquitously expressed and likely to have pleiotropic effects on gene regulation (Holmes et al, Mol. Cell. Biol., 19:4182-4190, 1999). In contrast, NF-E4 and FKLF have highly restricted patterns of expression and offer promise for both pharmacological manipulation and gene therapy. Studies of these genes in mouse models of human hemoglobin switching (Gaensler et al, Proc. Natl. Acad. Sci. USA, 90:11381-11385, 1993) and hemoglobinopathies (Ciavatta et al, Proc. Natl. Acad. Sci. USA, 92:9259-9263, 1995; Paszty et al, Science, 278:876-878, 1997; Ryan et al, Science, 278:873-876, 1997) represent the next step in the evaluation of these factors as therapeutic tools.
EXAMPLE 2: 14 kD NF-E4 Peptide Initiated off an Internal Methionine at Nucleotide 421 Acts as a Negative Regulator of NF-E4 Activity
The initial studies with the anti-NF-E4 antisera revealed the presence of a 14 kD species in addition to the predicted protein at 22 kD (see Example 1, supra). Examination of the nucleotide sequence revealed the presence of an internal methionine at nucleotide 421 that could serve as the initiation codon for the smaller peptide (Figure 1).
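The reading-frame relationship behind this observation can be checked with simple arithmetic. The following is a minimal sketch, assuming the 1-based, inclusive cDNA coordinates recited in the claims for SEQ ID NO:1 (nucleotides 121 to 657 for the CUG-initiated sequence and 421 to 657 for the shorter one). The function and constant names are illustrative only, and whether the final triplet of each span is a stop codon is not stated here, so the codon counts are upper bounds on peptide length.

```python
# cDNA coordinates (1-based, inclusive) as recited in the claims for SEQ ID NO:1.
CUG_START = 121      # first nucleotide of the CUG-initiated coding span
INTERNAL_ATG = 421   # first nucleotide of the internal methionine codon
SPAN_END = 657       # last nucleotide of both recited spans


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive span."""
    return end - start + 1


def in_frame(pos_a: int, pos_b: int) -> bool:
    """True when two codon start positions lie in the same reading frame."""
    return (pos_b - pos_a) % 3 == 0


full_nt = span_length(CUG_START, SPAN_END)
short_nt = span_length(INTERNAL_ATG, SPAN_END)

print(in_frame(CUG_START, INTERNAL_ATG))  # True
print(full_nt, full_nt // 3)              # 537 179
print(short_nt, short_nt // 3)            # 237 79
```

Because nucleotide 421 lies 300 nucleotides (100 codons) downstream of nucleotide 121, initiation at the internal methionine would yield a peptide corresponding to the carboxy-terminal portion of the longer reading frame.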
To validate this, a Murine Stem Cell Virus (MSCV)-based retroviral vector containing the NF-E4 cDNA truncated to methionine 421 and tagged at the 3' end with a hemagglutinin (HA) epitope was constructed. This virus was transfected into K562 cells and nuclear extract was prepared. Western analysis of this extract with anti-HA antisera confirmed the presence of a 14 kD peptide species. Additional Western analysis was performed on extract derived from human cord blood and bone marrow progenitors. These studies demonstrated that in cord blood, the full-length NF-E4 peptide is about two to five times as abundant as the 14 kD species. In contrast, bone marrow has a ratio of full-length to truncated form of about 1:1.
To assess the functional role of the smaller NF-E4 peptide species, K562 cells were transduced with the MSCV retrovirus carrying the truncated cDNA. This virus also contains the Green Fluorescent Protein (GFP) cDNA, allowing selection of infected cells by FACS. Pools of transduced K562 cells were selected and analyzed for NF-E4 and γ-globin gene expression. Western analysis of GFP-positive pools with anti-HA antisera confirmed the presence of the truncated NF-E4 species. Northern analysis of these pools showed a dramatic reduction in γ-globin gene expression compared with control pools transduced with the parental MSCV retrovirus (Figures 7A and 7B). These findings suggest that the 14 kD form of NF-E4 can function as a dominant negative, inhibiting γ-globin gene expression. Based on these results, we propose that NF-E4 exists in two forms in erythroid cells. The full-length peptide functions in the SSP complex as an inducer of γ-globin gene expression in fetal/erythroid cells (see Example 1, supra). In contrast, the smaller 14 kD species plays a negative regulatory role and consequently is more abundant in bone marrow than in cord blood progenitors.
EXAMPLE 3: Genomic Organization of NF-E4
Although no known expression regulatory or transcription factors were homologous to human NF-E4, BLAST searches of the GenBank sequence database discovered a previously uncharacterized and unannotated DNA stretch from the human X-chromosome that contains the genomic NF-E4 gene. Figure 8 shows the NF-E4 genomic structure.
The genomic NF-E4 is located in GenBank Accession No. AC002416. It includes three exons and two introns and significant 5' and 3' flanking sequences. Exon I starts at nucleotide 108,464 (according to the GenBank numbering) and ends at 108,799. It contains nucleotides 1-336. Intron I has 1,812 base pairs. Exon II starts at nucleotide 110,611 and ends at nucleotide 110,950; it contains nucleotides 339-676. Intron II has 3,775 base pairs. Exon III starts at position 114,725 and ends at position 114,996; it contains nucleotides 677-966 (the 20-nucleotide truncation of this exon in the genomic sequence is likely to be an artifact of the genomic sequence, as this segment is present in PCR products from genomic DNA).
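The interval arithmetic behind these figures can be reproduced as follows. This is a minimal sketch that assumes the quoted GenBank positions are 1-based and inclusive. The helper names are illustrative, and the sketch only restates arithmetic on the coordinates given above; it does not consult the GenBank record.

```python
# Exon coordinates in GenBank AC002416 as quoted in the text (1-based, inclusive).
EXONS = {
    "I": (108_464, 108_799),
    "II": (110_611, 110_950),
    "III": (114_725, 114_996),
}


def inclusive_length(start: int, end: int) -> int:
    """Number of bases in a 1-based inclusive interval."""
    return end - start + 1


def start_to_end_distance(upstream: tuple, downstream: tuple) -> int:
    """Distance from the last base of one exon to the first base of the next."""
    return downstream[0] - upstream[1]


for name, (start, end) in EXONS.items():
    print(name, inclusive_length(start, end))
# I 336
# II 340
# III 272

print(start_to_end_distance(EXONS["I"], EXONS["II"]))    # 1812
print(start_to_end_distance(EXONS["II"], EXONS["III"]))  # 3775
```

Under this reading, the quoted intron sizes (1,812 and 3,775 base pairs) equal the distance from the end of one exon to the start of the next, so the number of bases lying strictly between the exons is one fewer in each case. Exon III spans 272 genomic bases, shorter than the 290 cDNA nucleotides (677-966) assigned to it, in line with the truncation noted in the text.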
EXAMPLE 4: Enforced Expression of Positive Regulator NF-E4 in Cord Blood Progenitors Induces γ-Globin and Represses β-Globin Gene Expression
Materials and Methods
Generation of RD18 producer cell lines. Amphotropic supernatant generated in 293T cells (as described above) was used to transduce the FLYRD18 packaging cell line (Cosset et al, J. Virol., 69:7430-7436, 1995). Briefly, fresh filtered supernatant from 293T cells was added every 12 hours for 3 days to RD18 cells plated at a density of 10³ cells. Subsequently, the top 20% of GFP-positive RD18 cells were obtained by fluorescence-activated cell sorting (FACS) and cultured until confluent. Amphotropic supernatant harvested from these plates was used to transduce CD34+ progenitors. The expression of NF-E4 in the producer cell line was verified by immunoblotting with anti-HA antiserum.
Isolation and retroviral transduction of human CD34+ cells. Human cord blood was provided by the Bone Marrow Donor Institute. CD34+ cells were isolated using a MiniMACS magnetic cell sorting system (Miltenyi Biotec Inc.). Cells were then cultured overnight with expansion medium, which contains 1% deionized bovine serum albumin (BSA; Stem Cell Technologies), insulin (5 μg/ml; Sigma), transferrin (100 μg/ml; BRL), low-density lipoprotein (10 μg/ml; Sigma), 10⁻⁴ M β-mercaptoethanol (BRL), recombinant human interleukin-3 (rhIL-3; 10 ng/ml; R&D), rhIL-6 (10 ng/ml; R&D), recombinant human stem cell factor (hSCF, 300 ng/ml; R&D), and Flt-3 (300 ng/ml; R&D). Non-tissue culture-treated 35-mm-diameter dishes were coated with RetroNectin CH286 solution (TaKaRa Biochemicals, Shiga, Japan) at a concentration of 20 μg/cm² for 2 hours at room temperature and then blocked with 2% BSA fraction V (Fisher Scientific) for 30 min at room temperature (Moritz et al, Blood, 88:855-862, 1996). The coated dishes were preloaded with virus supernatant from RD18 producer lines (2 ml/well) for 30 min, after which the supernatant was removed. Another 1 ml of supernatant was added along with expanded CD34+ cells at less than 5 x 10⁵ cells/dish; then 1 ml of expansion medium (in which the amount of each ingredient was doubled) was added and mixed well. Cells were cultured for 24 hours and then harvested and carefully washed three times with a large volume of 1x phosphate-buffered saline. The cells were then cultured at 10⁵ cells/ml in Iscove modified differentiation medium containing 30% fetal calf serum (HyClone), 1% deionized BSA (Stem Cell Technologies), recombinant human stem cell factor (100 ng/ml; R&D), rhIL-3 (0.1 pg/ml; R&D), human erythropoietin (10 U/ml; Amgen), insulin (10 μg/ml; Sigma), 10⁻⁴ M β-mercaptoethanol (BRL), 10⁻⁶ M hydrocortisone (Stem Cell Technologies), and penicillin-streptomycin and glutamine (1:1,000; BRL) as previously described (Sui et al, Blood, 92:1142-1149, 1998). At day 12, GFP- and glycophorin A-positive cells were isolated by FACS, and RNA was prepared using RNAzol (Tel-Test Inc.).
RNase protection. RNase protection analysis was performed using an Ambion RNase protection assay kit according to the manufacturer's instructions. Probes used in these studies were as described previously (Morley et al, Blood, 78:1355-1363, 1991). Probe input was 10⁶ cpm/sample for γ- and β-globin probes and 0.25 x 10⁶ cpm/sample for the 18S probe.
All other methods were as described in Example 1, supra.
Results and Discussion
To extend the functional studies of NF-E4, we generated stable FLYRD18 producer cell lines containing either MSCV or MSCV-HA-NF-E4 (Cosset et al, J. Virol., 69:7430-7436, 1995). Supernatants from these lines were used to transduce CD34+ cells derived from human cord blood (see Materials and Methods). The cells were then cultured for 12 days in differentiation medium, and GFP- and glycophorin A-positive cells were separated by FACS. RNA was prepared from these cells and analyzed by RNase protection assay. The most striking difference between the MSCV-NF-E4 and control MSCV pools was the reduction in β-globin gene expression in the NF-E4-transduced pools. After normalization for the housekeeping gene 18S, the reduction in β-globin gene expression induced by NF-E4 was approximately 10-fold. As the signal detected with the γ-globin gene probe in this assay was intense in both MSCV and NF-E4 pools, we examined various dilutions of RNA to determine whether a difference was evident at lower concentrations. Analysis in the linear portion of the assay revealed a two-fold increase in γ-globin gene expression in pools transduced with MSCV-NF-E4 compared with those transduced with the control MSCV retrovirus. The suppression of β-gene and augmentation of γ-gene expression observed in primary cord blood progenitors are of interest, as this is a cell population in which γ- and β-globin are normally expressed concurrently. This finding suggests that in the context of promoter competition, the presence of NF-E4 is sufficient to alter the balance between the promoters, favoring transcription of the fetal gene. This finding is reminiscent of the studies of EKLF nullizygous mice carrying the human β-globin locus yeast artificial chromosome (β-YAC mice), in which enhanced γ-globin gene expression is observed in the fetal liver due to the diminution of effective competition from the inactive promoter (Perkins et al, Proc. Natl. Acad. Sci. USA, 93:12267-12271, 1996; Wijgerde et al, Genes & Dev., 10:2894-2902, 1996). The difference in magnitude between β-globin suppression and γ-globin gene activation in the cord blood progenitors in the setting of enforced expression of NF-E4 is intriguing. One interpretation is that although the γ-promoter has the competitive advantage over the β-promoter in the presence of NF-E4, allowing its preferential interaction with the LCR, other factors necessary for optimal γ-globin gene expression are diminishing as the switch from fetal to adult globin expression progresses. This interpretation is consistent with our previous findings, which suggest that binding of the SSP confers only weak transcriptional activation in the setting of promoter competition (Jane et al, EMBO J., 11:2961-2969, 1992). Studies of the effects of enforced expression of NF-E4 in earlier developmental stages in the β-YAC transgenic line will further address this issue (Gaensler et al, Proc. Natl. Acad. Sci. USA, 90:11381-11385, 1993).
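The normalization to 18S described above amounts to a ratio of ratios. The following is a minimal sketch of that calculation. The function names are illustrative and the band intensities are hypothetical placeholder values in arbitrary units, chosen only to show the arithmetic; they are not measurements from this study.

```python
def normalized_signal(target: float, loading_control: float) -> float:
    """Target band intensity divided by the loading-control (e.g. 18S) intensity."""
    if loading_control <= 0:
        raise ValueError("loading-control signal must be positive")
    return target / loading_control


def fold_change(test_target: float, test_control: float,
                ref_target: float, ref_control: float) -> float:
    """Normalized signal in the test pool relative to the reference pool."""
    return (normalized_signal(test_target, test_control)
            / normalized_signal(ref_target, ref_control))


# Hypothetical intensities (arbitrary units), NOT data from the text.
ratio = fold_change(test_target=120.0, test_control=1000.0,
                    ref_target=1500.0, ref_control=1250.0)
print(f"{ratio:.2f}")      # 0.10, i.e. a ten-fold reduction for these inputs
print(f"{1 / ratio:.1f}")  # 10.0
```

A ratio below 1 indicates lower expression in the test pool than in the reference pool. Comparisons of this kind are meaningful only when both signals fall in the linear range of the assay, which is why the γ-globin comparison above relied on diluted RNA.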
The ability of enforced expression of NF-E4 in cord blood progenitors to suppress β-globin gene expression may have significant implications in genetic therapy of sickle-cell disease with the dual beneficial effects of enhanced fetal globin expression and reduction of βs-globin synthesis.
* * *
The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and the accompanying figures. Such modifications are intended to fall within the scope of the appended claims.
It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, are approximate, and are provided for description.
All patents, patent applications, publications, and other materials cited herein are hereby incorporated herein by reference in their entireties.

Claims

WHAT IS CLAIMED:
1. An isolated NF-E4 polypeptide.
2. The isolated polypeptide of claim 1 which is human NF-E4.
3. The isolated polypeptide of claim 1 which has an apparent molecular weight of 22 kD by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).
4. The isolated polypeptide of claim 1 which has an apparent molecular weight of 14 kD by SDS-PAGE.
5. The isolated polypeptide of claim 1 which comprises an amino acid sequence as depicted in SEQ ID NO: 8.
6. The isolated polypeptide of claim 5 which comprises an amino acid sequence as depicted in SEQ ID NO:3.
7. The isolated polypeptide of claim 6 which comprises an amino acid sequence as depicted in SEQ ID NO:2.
8. An isolated nucleic acid encoding an NF-E4 polypeptide.
9. The isolated nucleic acid of claim 8 wherein the NF-E4 is human NF-E4.
10. The isolated nucleic acid of claim 8, wherein the NF-E4 polypeptide has a predicted molecular weight of approximately 22 kD.
11. The isolated nucleic acid of claim 10, wherein a coding sequence translated into the NF-E4 polypeptide starts with a CTG or CUG codon.
12. The isolated nucleic acid of claim 11, which comprises a nucleotide sequence as depicted in SEQ ID NO:1 from nucleotide 121 to nucleotide 657.
13. The nucleic acid of claim 12, which lacks a codon for methionine at position corresponding to nucleotide 421.
14. The nucleic acid of claim 8, wherein the NF-E4 polypeptide has a predicted molecular weight of approximately 14 kD.
15. The nucleic acid of claim 14, which comprises a nucleotide sequence as depicted in SEQ ID NO:1 from nucleotide 421 to nucleotide 657.
16. A nucleic acid of at least 10 nucleotides that hybridizes under stringent conditions to a nucleic acid having a coding sequence as depicted in SEQ ID NO:1.
17. An expression vector comprising the nucleic acid of claim 10 operably associated with an expression control sequence.
18. The expression vector of claim 17 which is a viral vector.
19. The expression vector of claim 17, wherein the nucleic acid lacks a codon for methionine at position corresponding to NF-E4 nucleotide 421.
20. A host cell comprising the expression vector of claim 17.
21. A method for expressing an NF-E4 polypeptide comprising propagating the host cell of claim 20 under conditions that permit expression of NF-E4 from the expression vector.
22. An expression vector comprising the nucleic acid of claim 14 operably associated with an expression control sequence.
23. The expression vector of claim 22 which is a viral vector.
24. A host cell comprising the expression vector of claim 22.
25. A method for expressing an NF-E4 polypeptide comprising propagating the host cell of claim 24 under conditions that permit expression of NF-E4 from the expression vector.
26. A transgenic animal comprising the nucleic acid of claim 8 operably associated with an expression control sequence, which transgenic animal is capable of expressing the NF-E4 polypeptide at a level sufficient to modulate fetal globin gene expression.
27. The transgenic animal of claim 26 further comprising a human β-globin locus control region.
28. The transgenic animal of claim 26 wherein the NF-E4 polypeptide is human.
29. The transgenic animal of claim 26 further expressing a defective β-globin gene.
30. The transgenic animal of claim 26 expressing a positive regulator NF-E4 polypeptide, wherein the NF-E4 polypeptide is expressed at a level sufficient to induce or enhance fetal globin gene expression.
31. The transgenic animal of claim 30 expressing the positive regulator NF-E4 polypeptide at a level sufficient to reduce adult globin gene expression.
32. The transgenic animal of claim 30, wherein the positive regulator NF-E4 polypeptide has a predicted molecular weight of approximately 22 kD.
33. The transgenic animal of claim 26 expressing a negative regulator NF-E4 polypeptide, wherein the NF-E4 polypeptide is expressed at a level sufficient to reduce fetal globin gene expression.
34. The transgenic animal of claim 33, wherein the negative regulator NF-E4 polypeptide has a predicted molecular weight of approximately 14 kD.
35. A method of screening for a compound that is a candidate for regulation of stage selector protein activity (SSP) in a cell, which method comprises determining if a test compound contacted with a recombinant NF-E4 polypeptide modulates the activity of the NF-E4 polypeptide, wherein if the candidate compound modulates the activity of the NF-E4 polypeptide it is a candidate for regulation of SSP activity.
36. The method according to claim 35, wherein the recombinant NF-E4 polypeptide is in a cell-free system.
37. The method according to claim 35, wherein the recombinant NF-E4 polypeptide is expressed in a host cell.
38. A method for inducing or increasing expression in a cell of fetal globin, or embryonic globin, or both, which method comprises increasing the activity of a positive regulator NF-E4 in the cell.
39. The method according to claim 38, further resulting in reduction of adult globin gene expression.
40. The method according to claim 38, wherein the positive regulator NF-E4 polypeptide has a predicted molecular weight of approximately 22 kD.
41. The method according to claim 40, wherein the level of 22 kD NF-E4 is at least about two-fold greater than the level of 14 kD NF-E4 in the cell.
42. The method according to claim 38, which method comprises introducing an expression vector comprising a nucleic acid encoding the positive regulator NF-E4 polypeptide operably associated with an expression control sequence into the cell.
43. The method according to claim 42, wherein the expression vector is a viral vector.
44. The method according to claim 38, wherein the cell expresses a defective adult globin.
45. A method for inhibiting expression of fetal globin in a cell, which method comprises increasing the activity of a negative regulator NF-E4 in the cell.
46. The method according to claim 45, wherein the negative regulator NF-E4 polypeptide has a predicted molecular weight of approximately 14 kD.
47. The method according to claim 46, wherein the level of 22 kD NF-E4 is about the same as or less than the level of 14 kD NF-E4 in the cell.
48. The method according to claim 45, which method comprises introducing an expression vector comprising a nucleic acid encoding the negative regulator NF-E4 polypeptide operably associated with an expression control sequence into the cell.
49. The method according to claim 48, wherein the expression vector is a viral vector.
50. A pharmaceutical composition comprising the NF-E4 polypeptide of claim 1, 2, 3, or 4.
51. A pharmaceutical composition comprising a positive regulator NF-E4 polypeptide.
52. A pharmaceutical composition comprising a negative regulator NF-E4 polypeptide.
53. A pharmaceutical composition comprising the NF-E4 nucleic acid of any one of claims 8, 9, 10, or 14 operably associated with an expression control sequence.
54. The pharmaceutical composition of claim 53 wherein the expression vector is a viral vector.
55. A pharmaceutical composition comprising the 22 kD NF-E4 expression vector of claim 17.
56. The pharmaceutical composition of claim 55 wherein the expression vector is a viral vector.
57. The pharmaceutical composition of claim 55 wherein the expression vector lacks a codon for methionine at position corresponding to NF-E4 nucleotide 421.
58. A pharmaceutical composition comprising the 14 kD NF-E4 expression vector of claim 22.
59. The pharmaceutical composition of claim 58 wherein the expression vector is a viral vector.
60. A method for modulating globin expression in a mammal, which method comprises administering to the mammal a therapeutically effective amount of the pharmaceutical composition according to any one of claims 50-59.
61. A method for treating a hemoglobinopathy in a mammal, which method comprises administering to the mammal the pharmaceutical composition of any one of claims 50-51 or 53-57 in an amount effective to induce or increase expression of fetal globin, or embryonic globin, or both.
62. The method of claim 61 wherein the mammal is human.
63. The method of claim 61 wherein the hemoglobinopathy is selected from the group consisting of β-thalassemia and sickle-cell disease.
64. The method of claim 61 further resulting in decreased expression of adult globin.
65. Use of the NF-E4 polypeptide of claim 1, 2, 3, or 4 or the NF-E4 nucleic acid of claim 8, 9, 10, or 14 in the manufacture of a medicament useful for treating a hemoglobinopathy.
PCT/US2000/030988 1999-11-12 2000-11-13 Isolation and characterization of human nf-e4 WO2001034625A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU14831/01A AU1483101A (en) 1999-11-12 2000-11-13 Isolation and characterization of human nf-e4

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16500499P 1999-11-12 1999-11-12
US60/165,004 1999-11-12

Publications (2)

Publication Number Publication Date
WO2001034625A1 (en) 2001-05-17
WO2001034625A9 (en) 2002-08-15

Family

ID=22597006

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/030988 WO2001034625A1 (en) 1999-11-12 2000-11-13 Isolation and characterization of human nf-e4

Country Status (2)

Country Link
AU (1) AU1483101A (en)
WO (1) WO2001034625A1 (en)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE GENEMBL [online] 1998, CHEN ET AL., XP002939027, accession no. STN Database accession no. AC002416 *
GALLARDA ET AL.: "The beta-globin stage selector element factor is erythroid-specific promoter/enhancer binding protein NF-E4", GENES AND DEVELOPMENT, vol. 3, no. 12A, December 1989 (1989-12-01), pages 1845 - 1859, XP002939028 *
ZHOU ET AL.: "Induction of human fetal globin gene expression by a novel erythroid factor NF-E4", MOL. CELL. BIOL., vol. 20, no. 20, October 2000 (2000-10-01), pages 7662 - 7672, XP002939026 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007140335A2 (en) * 2006-05-25 2007-12-06 The Regents Of The University Of Michigan Screening methods and transgenic animals for treatment of beta-globin related diseases and conditions
WO2007140335A3 (en) * 2006-05-25 2008-02-21 Univ Michigan Screening methods and transgenic animals for treatment of beta-globin related diseases and conditions
WO2008061303A1 (en) * 2006-11-21 2008-05-29 Melbourne Health Method of treatment and prophylaxis

Also Published As

Publication number Publication date
AU1483101A (en) 2001-06-06
WO2001034625A9 (en) 2002-08-15

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
COP Corrected version of pamphlet

Free format text: PAGES 1/8-8/8, DRAWINGS, REPLACED BY NEW PAGES 1/7-7/7; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase