EP1087985A1

EP1087985A1 - Exons 4 and 7 encode separate transactivating and chromatin localizing domains in esx

Info

Publication number: EP1087985A1
Application number: EP99928601A
Authority: EP
Inventors: Christopher C. Benz; Gary K. Scott; Chuan-Hsiung Chang; Yesu Chao
Original assignee: University of California
Current assignee: University of California
Priority date: 1998-06-16
Filing date: 1999-06-15
Publication date: 2001-04-04
Also published as: CA2331266A1; WO1999065929A1; AU4563599A

Abstract

This invention identifies two domains of ESX, a member of the ETS transcription regulator family, that provide particularly effective targets useful in screening for ESX modulators. One of these domains, ESX exon 4, is a potent transactivator and can be used in constructs to up- or downregulate genes or cDNAs, particularly genes or cDNAs under the control of a promoter containing an Ets element. Another of these domains, exon 7, is capable of acetylation, and the level of acetylation can be used in assays for abnormal ESX regulation.

Description

EXONS 4 AND 7 ENCODE SEPARATE TRANSACTIVATING AND CHROMATIN LOCALIZING DOMAINS IN ESX

CROSS-REFERENCE TO RELATED APPLICATIONS

This claims benefit under 35 U.S.C. §119 of provisional patent application USSN 60/089,409, filed on June 16, 1998 and is a continuation-in-part of USSN 08/978,217, filed on November 25, 1997, which claims benefit under 35 U.S.C. §119 of provisional patent application USSN 60/031,504, filed on November 27, 1996, all of which are herein incorporated by reference, in their entirety, for all purposes.

FIELD OF THE INVENTION

This invention pertains to the field of oncology. In particular, this invention pertains to the discovery of domains of a transcription factor gene that provide novel targets for modulators of that gene and that are implicated in the etiology of human epithelial cancers, including breast cancer, and other malignancies including gastric, ovarian, and lung adenocarcinomas.

BACKGROUND OF THE INVENTION

Many cancers are believed to result from a series of genetic alterations leading to progressive disordering of normal cellular growth mechanisms (Nowell (1976) Science 194: 23; Foulds (1958) J. Chronic Dis. 8: 2). The deletion or multiplication of copies of whole chromosomes, chromosomal segments, or specific regions of the genome is common (see, e.g., Smith et al. (1991) Breast Cancer Res. Treat., 18: Suppl. 1: 5-14; van de Vijver & Nusse (1991) Biochim. Biophys. Acta 1072: 33-50; Sato et al. (1990) Cancer Res., 50: 7184-7189). In particular, the amplification and deletion of DNA sequences containing proto-oncogenes and tumor-suppressor genes, respectively, are frequently characteristic of tumorigenesis. Dutrillaux et al. (1990) Cancer Genet. Cytogenet., 49: 203-217. As an example, overexpression of the HER2/neu (c-erbB-2) proto-oncogene product is found in approximately 20-30% of primary breast cancers and in a similar fraction of human gastric, ovarian, and lung carcinomas. For many of these malignancies, this overexpressed membrane growth factor receptor (p185^HER2) is associated with HER2 gene amplification, more aggressive tumor growth, and reduced patient survival. Maguire & Greene (1989) Semin. Oncol. 16: 148-155; Singleton & Strickler (1992) Pathol. Annu. 1: 165-190; Tripathy & Benz (1993) in Oncogenes and Tumor Suppressor Genes in Human Malignancies (Benz and Liu, eds.) pp. 15-60, Kluwer, Boston. In approximately 10-20% of HER2-overexpressing breast tumors, some gastric, and virtually all HER2-positive lung cancers, HER2 mRNA and protein overexpression occur in the absence of increased gene copy number, suggesting that HER2 transcriptional dysregulation may be a fundamental defect of clinical significance in these malignancies. Berger et al. (1988) Cancer Res. 48: 1238-1243; Kameda et al. (1990) Cancer Res. 50: 8002-8009; Kern et al. (1990) Cancer Res. 50: 5184-5191; King et al. (1989) Cancer Res. 49: 4185-4191; Slamon et al. (1989) Science 244: 707-712; Tandon et al. (1989) J. Clin. Oncol. 7: 1120-1128. It has been speculated that a primary defect leading to dysregulated HER2 transcription might also predispose to the in vivo development of gene amplification and stable acquisition of a more malignant tumor cell phenotype. Kameda et al., supra.; King et al., supra.; Hynes et al. (1989) J. Biol. Chem. 39: 167-173; Kraus et al. (1987) EMBO J. 6: 605-610; Pasleau et al. (1993) Oncogene 8: 849-854.

Recently, a previously unrecognized response element similar to those recognized by the ets transcriptional regulator family was identified within both the human HER2 and murine neu promoters. Scott et al. (1994) J. Biol. Chem. 269: 19848-19858. The ets multigene family of transcriptional regulators includes more than thirty known members that are involved in early embryonic development and late tissue maturation, directing stage-specific and tissue-restricted programs of gene expression. The ETS transcription factors, which are recognizable primarily by their 85 amino acid ETS DNA-binding domain, are dispersed across all metazoan lineages into distinct subfamilies. Ets genes can produce malignancies in humans and other vertebrates when overexpressed or rearranged into chimeras retaining the ETS domain.

Members of the Ets family of transcription factors have been shown to play important roles in regulating gene programs critical for normal organismal development and cellular differentiation, while fusion proteins involving Ets family members arising from chromosomal translocations are thought to account for a significant fraction of all human leukemias and lymphomas as well as virtually all Ewing's sarcomas and Primitive Neuro-Ectodermal Tumors (PNET), otherwise known as small round cell bone and soft tissue sarcomas of childhood (reviewed in Crepieux et al. (1994) Critical Reviews in Oncogenesis, 5: 615-638; Dittmer and Nordheim (1998) Biochim. Biophys. Acta, 1377: F1-11; Hromas and Klemsz (1994) Int. J. Hematol., 59: 257-265; Sharrocks et al. (1997) Int. J. Biochem. Cell Biol., 29: 1371-1387; Wasylyk and Nordheim (1997) Transcription Factors in Eukaryotes. Papavassiliou AG (ed) Springer-Verlag: Heidelberg, Germany, pp. 253-286). Searching for Ets factors potentially involved in human mammary gland development and malignancy, we recently cloned and characterized a novel Ets factor, ESX (Epithelial-restricted with Serine boX; HUGO/GDB:6837498), which was found to be transcriptionally upregulated in a subset of early breast tumors and breast cancer cell lines where it was postulated to be a candidate transactivator of the Ets responsive proto-oncogene, ErbB2 (USSN 08/978,217, Chang et al. (1997) Oncogene, 14: 1617-1622). Four other groups have since published on the potential biological and developmental importance of this epithelium-specific Ets factor (variably named ESE-1, ELF-3, Jen, or ERT) in non-mammary epithelial systems (Andreoli et al. (1997) Nucleic Acids Res., 25: 4287-4295; Choi et al. (1998) J. Biol. Chem., 273: 110-117; Oettgen et al. (1997) Mol. Cell. Biol., 17: 4419-4433; Tymms et al. (1997) Oncogene, 15: 2449-2462). While its expression profile suggests that ESX is associated with development of both simple and stratified epithelium (Andreoli et al., supra.; Choi et al., supra.; Chang et al., supra.; Oettgen et al. (1997) supra.; Tymms et al., supra.), detailed studies were first performed in the latter and these showed that ESX is unique among transcription factors generally, and Ets factors specifically, for its restricted expression in terminally differentiated epidermal cells (Andreoli et al., supra.; Choi et al., supra.). In stratified epithelium, ESX is thought to transactivate such genes as the transforming growth factor-β type II receptor (TGF-βRII), Endo-A/keratin-8, and several markers of epidermal cell differentiation including transglutaminase 3, SPRR2A, and profilaggrin (Andreoli et al., supra.; Choi et al., supra.; Oettgen et al. (1997) supra.; Tymms et al., supra.).

Because most, if not all, cancers involve dysregulation of gene expression, a need exists for information as to transcription factors and other regulatory moieties that are involved in mediating the dysregulation. More particularly, knowledge of particular domains or subunits within transcription factors that provide good targets for the development of modulators of transcription factor activity is helpful in developing methods and compositions for use in diagnosing and treating cancers. The present invention fulfills this and other needs.

SUMMARY OF THE INVENTION

This invention pertains to discoveries regarding the structural and functional organization of ESX (epithelial-restricted with serine box), a member of the ETS family of genes. In particular, this invention is premised, in part, on the discovery that exons 4 and 7 encode separate transactivating and chromatin localizing domains in ESX.

In one embodiment, this invention provides methods of screening for a modulator of ESX activity. The methods involve providing a target such as a nucleic acid encoding a polypeptide of ESX exon 7 (or exon 4), and/or polypeptide comprising ESX exon 7 (or exon 4); contacting the nucleic acid or polypeptide with the test agent; and detecting binding of the test agent to the nucleic acid or polypeptide wherein a test agent that binds to said nucleic acid or polypeptide modulates ESX activity. In one preferred embodiment, the target is a nucleic acid encoding a polypeptide of ESX exon 7, while in another preferred embodiment the polypeptide comprises ESX exon 7. In preferred embodiments, the polypeptide is not a full-length ESX polypeptide. The nucleic acid or polypeptide can be labeled with a detectable label (e.g., radioisotope, enzyme, fluorescent molecule, quantum dot, chemiluminescent molecule, bioluminescent molecules, colloidal metals, etc.) and preferred detectable labels are fluorescent labels. Preferred detection methods include immunoassays, particularly immunoassays utilizing an antibody that specifically binds to a polypeptide of ESX exon 7.

This invention also provides methods of activating transcription of a gene or a cDNA. The methods involve contacting the "target" gene or cDNA with a construct comprising a DNA binding domain attached to a polypeptide comprising exon 4 of ESX. In a preferred embodiment, the polypeptide is not a full-length ESX polypeptide. In one embodiment, the polypeptide comprising exon 4 of ESX contains one or more mutations of amino acids at positions 155, 154, 152, 150, 145, 135, 131, and 132. The polypeptide can also comprise conservative substitutions in the ESX exon(s). Alternatively, or in addition, the polypeptide can comprise an exon 4 of ESX having a carboxyl terminal deletion of exon 4 of up to 27 amino acids and/or an amino terminal deletion of exon 4 of up to 5 amino acids. In one particularly preferred embodiment, the ESX exon 4 region contains both an amino terminal and a carboxyl terminal deletion of exon 4, leaving the exon 4 amino acids 134 through 147. The transactivation is most effective against target genes/cDNAs under the control of a promoter (preferably an epithelial gene promoter) having an endogenous or engineered Ets response element. The gene or cDNA can be an endogenous gene or cDNA or present in a transfected (e.g., transiently transfected) vector. A preferred DNA binding domain is a GAL4 DNA binding domain.

This invention also provides constructs comprising a nucleic acid (e.g., DNA) binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX. Also provided are nucleic acids encoding such constructs. Preferred constructs (and their corresponding nucleic acids) are constructs comprising a DNA binding domain attached to a polypeptide comprising exon 4 of ESX. Any of the exon 4 polypeptides described herein are suitable. The nucleic acid binding domain is either chemically conjugated to the ESX polypeptide or the two are expressed as a fusion protein.

In still another embodiment, this invention provides an affinity matrix comprising a solid support attached to a polypeptide comprising exon 4 or exon 7 of ESX as described herein. The polypeptide is preferably not a full-length ESX. Preferred solid supports include glass, plastic, metal, ceramic, gels and aerogels. Also provided are kits for performing any of the methods described herein.

Preferred kits comprise a container containing one or more of the constructs selected from the group consisting of: a polypeptide that is not a full-length ESX and that comprises ESX exon 4 or ESX exon 7, a DNA binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX, a nucleic acid encoding a polypeptide that is not a full-length ESX and that comprises ESX exon 4 or ESX exon 7, and a nucleic acid encoding a DNA binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX.

This invention also provides methods of detecting dysregulation of an ESX gene (e.g., in an organism). The methods involve detecting the degree of acetylation of ESX in a biological sample; and comparing said degree of acetylation of ESX in the biological sample with the degree of acetylation in a control sample from a normal healthy tissue, wherein a difference in the degree of acetylation of ESX in said biological sample with the degree of acetylation in said control sample indicates dysregulation of an ESX gene. In preferred embodiments, the difference is a statistically significant difference (e.g. at least a 1.5 fold difference, more preferably at least a two-fold difference, and most preferably at least a 5-fold or even at least a 10-fold difference). The detecting can be by use of an antibody that specifically binds to an acetylated ESX and not to an unacetylated ESX or vice versa. In one embodiment, the statistically significant difference is indicative of an epithelial cancer (e.g., human breast cancer). In one embodiment the healthy tissue comprises normal human mammary epithelial cells. In one embodiment the statistically significant difference is indicative of an unfavorable prognosis. The method can further involve selecting an appropriate treatment regime.
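The comparison of acetylation levels described above can be illustrated with a minimal sketch. It assumes that acetylation has already been quantified as a positive number in arbitrary units for both the biological sample and the control; the function names and the measurement scale are hypothetical and are not taken from this disclosure. Only the fold thresholds (1.5, 2, 5, and 10) come from the text.

```python
def acetylation_fold_difference(sample_level: float, control_level: float) -> float:
    """Return the fold difference (always >= 1) between two acetylation levels.

    Both an increase and a decrease relative to the control count as a
    difference, so the ratio is inverted when the sample is below the control.
    """
    if sample_level <= 0 or control_level <= 0:
        raise ValueError("acetylation levels must be positive")
    ratio = sample_level / control_level
    return ratio if ratio >= 1.0 else 1.0 / ratio


def indicates_dysregulation(sample_level: float, control_level: float,
                            min_fold: float = 1.5) -> bool:
    """True when the fold difference meets the chosen threshold.

    The default of 1.5 is the least stringent threshold named in the text;
    2, 5, or 10 can be passed for the more preferred embodiments.
    """
    return acetylation_fold_difference(sample_level, control_level) >= min_fold
```

A fold threshold alone does not establish statistical significance; in practice replicate measurements and an appropriate statistical test would accompany such a comparison.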

Also provided are methods of inhibiting growth or proliferation of neoplastic cells. The methods involve administering to said cells an effective amount of an agent that inhibits activity of exon 4 or exon 7. The neoplastic cells can comprise a cancer in an organism. The method can involve transfecting cells of the mammal with a vector expressing an antisense ESX nucleic acid that specifically binds to a nucleic acid of exon 4 or exon 7. Various agents include, but are not limited to, an exon 4 mutein and an exon 7 mutein.

This invention also provides methods of depressing transcription of a gene or a cDNA. The methods involve contacting the gene or cDNA with a recombinantly expressed polypeptide comprising an exon 4 of ESX where the polypeptide is not a full-length ESX and lacks a DNA binding domain.

This invention also provides probes for detection or localization of proteins that bind to ESX. These probes comprise a detectable label attached to a polypeptide comprising ESX exon 4 wherein the polypeptide is not a full length ESX polypeptide. Probes are also provided for detection or localization of proteins or nucleic acids that bind to ESX. These probes comprise a detectable label attached to a polypeptide comprising ESX exon 7 wherein said polypeptide is not a full length ESX polypeptide. The probes can be isolated, recombinantly expressed, or chemically synthesized. Preferred detectable labels include radioisotopes, enzymes, fluorescent molecules, chemiluminescent molecules, quantum dots, bioluminescent molecules, and colloidal metals.

Definitions

The term "antibody" refers to a polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof which specifically bind and recognize an analyte (antigen). The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively. An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (V_L) and variable heavy chain (V_H) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'₂, a dimer of Fab which itself is a light chain joined to V_H-C_H1 by a disulfide bond. The F(ab)'₂ may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)'₂ dimer into an Fab' monomer. The Fab' monomer is essentially an Fab with part of the hinge region (see, Fundamental Immunology, Third Edition, W.E. Paul, ed., Raven Press, N.Y. 1993). While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by utilizing recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv).

An "anti-ESX antibody" is an antibody or antibody fragment that specifically binds a polypeptide encoded by the ESX gene, cDNA, or a subsequence thereof.

A "chimeric antibody" is an antibody molecule in which (a) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.

An "immunoassay" is an assay that utilizes an antibody to specifically bind an analyte. The immunoassay is characterized by the use of specific binding properties of a particular antibody to isolate, target, and/or quantify the analyte.

The terms "isolated", "purified", or "biologically pure" refer to material which is substantially or essentially free from components which normally accompany it as found in its native state.

The term "nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.

The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.

A "label" is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available (e.g., the peptide of SEQ ID NO 2 can be made detectable, e.g., by incorporating a radio-label into the peptide, and used to detect antibodies specifically reactive with the peptide).

As used herein a "nucleic acid probe" is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, a probe may include natural (i.e. A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, for example, probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be understood by one of skill in the art that probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. The probes are preferably directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of the select sequence or subsequence.

A "labeled nucleic acid probe" is a nucleic acid probe that is bound, either covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label such that the presence of the probe may be detected by detecting the presence of the label bound to the probe.

The term "target nucleic acid" refers to a nucleic acid (often derived from a biological sample), to which a nucleic acid probe is designed to specifically hybridize. It is either the presence or absence of the target nucleic acid that is to be detected, or the amount of the target nucleic acid that is to be quantified. The target nucleic acid has a sequence that is complementary to the nucleic acid sequence of the corresponding probe directed to the target. The term target nucleic acid may refer to the specific subsequence of a larger nucleic acid to which the probe is directed or to the overall sequence (e.g., gene or mRNA) whose expression level it is desired to detect. The difference in usage will be apparent from context. "Subsequence" refers to a sequence of nucleic acids or amino acids that comprise a part of a longer sequence of nucleic acids or amino acids (e.g., polypeptide) respectively.

The term "recombinant" when used with reference to a cell, or nucleic acid, or vector, indicates that the cell, or nucleic acid, or vector, has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.

The term "identical" in the context of two nucleic acids or polypeptide sequences refers to the residues in the two sequences which are the same when aligned for maximum correspondence. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad. Sci. USA 85: 2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by inspection. An additional algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra.). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a word length (W) of 11, the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Nat'l. Acad. Sci. USA, 89: 10915-10919), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
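The seed-and-extend procedure described above can be sketched in simplified form. The following is an illustration only and is not the BLAST program itself: it assumes exact word matches as seeds (ignoring the neighborhood threshold T), ungapped extension, and a flat match reward and mismatch penalty set to the M=5 and N=-4 values mentioned in the text. All function names are hypothetical.

```python
def find_seeds(query, subject, w):
    """Yield (query_pos, subject_pos) for every exact word match of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            yield i, j


def extend(query, subject, qi, si, step, start_score, x_drop, match=5, mismatch=-4):
    """Extend without gaps from (qi, si) in direction step (+1 or -1).

    Stops when the score falls x_drop below its maximum, when the cumulative
    score reaches zero or below, or when either sequence ends.
    Returns (best cumulative score, residues added at that best score).
    """
    score = best = start_score
    best_len = k = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += match if query[qi] == subject[si] else mismatch
        k += 1
        if score > best:
            best, best_len = score, k
        if best - score >= x_drop or score <= 0:
            break
        qi += step
        si += step
    return best, best_len


def hsp_from_seed(query, subject, qi, si, w, x_drop, match=5, mismatch=-4):
    """Grow one seed into a high scoring pair.

    Returns (score, query_start, query_end) with query_end exclusive.
    """
    seed_score = w * match  # seeds are exact matches in this sketch
    score, right = extend(query, subject, qi + w, si + w, +1,
                          seed_score, x_drop, match, mismatch)
    score, left = extend(query, subject, qi - 1, si - 1, -1,
                         score, x_drop, match, mismatch)
    return score, qi - left, qi + w + right
```

For example, `hsp_from_seed("ACGTACGTTT", "GGACGTACGAAA", 0, 2, 4, 10)` extends the seed `ACGT` rightward through three further matches and stops once three consecutive mismatches have pulled the score 12 below its maximum, which exceeds the X value of 10.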

The BLAST algorithm performs a statistical analysis of the similarity between two sequences; see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90: 5873-5877. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to an ESX nucleic acid if the smallest sum probability in a comparison of the test nucleic acid to an ESX nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001. Where the test nucleic acid encodes an ESX polypeptide, it is considered similar to a specified ESX nucleic acid if the comparison results in a smallest sum probability of less than about 0.5, and more preferably less than about 0.2.

The term "substantial identity" or "substantial similarity" in the context of a polypeptide indicates that a polypeptide comprises a sequence with at least 70% sequence identity to a reference sequence, or preferably 80%, or more preferably 85% sequence identity to the reference sequence, or most preferably 90% identity over a comparison window of about 10-20 amino acid residues. An indication that two polypeptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a polypeptide is substantially identical to a second polypeptide, for example, where the two peptides differ only by a conservative substitution.
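The percent-identity criterion over a comparison window can be illustrated as follows. This sketch assumes the two sequences have already been aligned, are of equal length, and contain no gaps; it also assumes that a sequence qualifies when at least one window meets the threshold, which is one possible reading of the definition. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def window_identities(a: str, b: str, window: int = 10) -> list:
    """Percent identity for every window of the given size along the alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must be of equal length")
    if window < 1 or window > len(a):
        raise ValueError("window must be between 1 and the sequence length")
    return [percent_identity(a[i:i + window], b[i:i + window])
            for i in range(len(a) - window + 1)]


def substantially_identical(a: str, b: str, window: int = 10,
                            threshold: float = 70.0) -> bool:
    """True when some comparison window reaches the identity threshold.

    The 70% default is the least stringent value in the text; 80, 85, or 90
    can be passed for the more preferred embodiments.
    """
    return max(window_identities(a, b, window)) >= threshold
```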

An indication that two nucleic acid sequences are substantially identical is that the polypeptide which the first nucleic acid encodes is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.

Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions.

"Bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.

The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA).

The term "stringent conditions" refers to conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. (As the target sequences are generally present in excess, at T_m, 50% of the probes are occupied at equilibrium). Typically, stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
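The numerical ranges given above for typical stringent conditions can be collected into a small check. This is a sketch under stated assumptions: the T_m is supplied by the user rather than computed, "about" is treated as an exact bound, probes of 50 nucleotides or fewer are treated as short, and the effect of formamide or other destabilizing agents is ignored. The function names are hypothetical.

```python
def stringent_temperature(tm_celsius: float, offset: float = 5.0) -> float:
    """Hybridization temperature about 5 degrees C below the melting point."""
    return tm_celsius - offset


def meets_typical_stringency(sodium_molar: float, ph: float,
                             temp_celsius: float, probe_length: int) -> bool:
    """Check conditions against the typical ranges quoted in the text.

    Sodium 0.01 to 1.0 M, pH 7.0 to 8.3, and a minimum temperature of
    30 C for short probes or 60 C for probes longer than 50 nucleotides.
    """
    min_temp = 30.0 if probe_length <= 50 else 60.0
    return (0.01 <= sodium_molar <= 1.0
            and 7.0 <= ph <= 8.3
            and temp_celsius >= min_temp)
```

Because stringent conditions are sequence-dependent, passing this check is only a coarse screen and does not show that a given probe hybridizes specifically.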

The phrases "specifically binds to a protein" or "specifically immunoreactive with", when referring to an antibody, refer to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind preferentially to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to a protein under such conditions requires an antibody that is selected for its specificity for a particular protein. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity. For determination of specific binding of an anti-ESX antibody, an immunoprecipitation assay is preferred. Under appropriate conditions, an antibody that specifically binds to an ESX polypeptide will immunoprecipitate ESX, but not other ETS transcription factors.

A "conservative substitution", when describing a protein, refers to a change in the amino acid composition of the protein that does not substantially alter the protein's activity. Thus, "conservatively modified variations" of a particular amino acid sequence refers to amino acid substitutions of those amino acids that are not critical for protein activity or substitution of amino acids with other amino acids having similar properties (e.g., acidic, basic, positively or negatively charged, polar or non-polar, etc.) such that the substitutions of even critical amino acids do not substantially alter activity.
Conservative substitution tables providing functionally similar amino acids are well known in the art. The following six groups each contain amino acids that are conservative substitutions for one another:

1) Alanine (A), Serine (S), Threonine (T);

2) Aspartic acid (D), Glutamic acid (E);

3) Asparagine (N), Glutamine (Q);

4) Arginine (R), Lysine (K);

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).

See also, Creighton (1984) Proteins, W.H. Freeman and Company. One of skill in the art will appreciate that the above-identified substitutions are not the only possible conservative substitutions. For example, one may regard all charged amino acids as conservative substitutions for each other whether they are positive or negative (see, e.g., Figures 2B, 2C, and 2D). In addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids in an encoded sequence are also "conservatively modified variations".
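The six substitution groups listed above lend themselves to a simple lookup. The sketch below encodes only those six groups; as the text notes, other groupings are possible, so a result of False here means only that the change falls outside this particular table. The function names are hypothetical, and the two sequences passed to `classify_substitutions` are assumed to be aligned and of equal length.

```python
# The six groups from the text, using one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # Alanine, Serine, Threonine
    set("DE"),    # Aspartic acid, Glutamic acid
    set("NQ"),    # Asparagine, Glutamine
    set("RK"),    # Arginine, Lysine
    set("ILMV"),  # Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True when two different residues fall in the same group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # no substitution took place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str) -> list:
    """List (1-based position, old, new, is_conservative) for each difference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be of equal length")
    return [(pos, ref, var, is_conservative_substitution(ref, var))
            for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
            if ref != var]
```

For instance, comparing `MKDL` with `MRDW` reports the K to R change at position 2 as conservative and the L to W change at position 4 as not conservative under this table.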

The terms human "esx" or human "ESX gene or cDNA" are used interchangeably to refer to the human esx gene, which is a transcription factor gene that is also involved in the etiology of cancers, for example, epithelial cancers. The esx gene is determined to be a member of the ETS gene family by significant homology between the ESX DNA binding domain and the DNA binding domain of other members of the ETS family. ESX, however, is distinct from previously known ETS genes because of 5 non-conservative substitutions in the ETS consensus sequence. Nevertheless, ESX is still recognized to belong to the ETS family because ESX contains 27 identical amino acid residues among the 38 recognized consensus residues making up the ETS DNA binding domain (i.e., greater than 50% sequence identity, more preferably greater than 60% sequence identity and most preferably greater than 70% sequence identity in the ETS consensus sequence). Similarly the terms mouse or murine ESX genes or cDNAs refer to the mouse or murine ESX genes or cDNAs respectively.

A "gene product", as used herein, refers to a nucleic acid whose presence, absence, quantity, or nucleic acid sequence is indicative of a presence, absence, quantity, or nucleic acid composition of the gene. Gene products thus include, but are not limited to, an mRNA transcript, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA or subsequences of any of these nucleic acids. Polypeptides expressed by the gene or subsequences thereof are also gene products. The particular type of gene product will be evident from the context of the usage of the term.
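The identity figure cited above (27 identical residues among the 38 ETS consensus residues) can be checked with simple arithmetic. The sketch below is illustrative only; the function names `percent_identity` and `tiers_met` are hypothetical, and the 50%, 60% and 70% tiers are the thresholds recited in the paragraph above.

```python
def percent_identity(identical: int, total: int) -> float:
    """Percent of consensus positions at which the residues are identical."""
    if total <= 0 or not 0 <= identical <= total:
        raise ValueError("need 0 <= identical <= total and total > 0")
    return 100.0 * identical / total


def tiers_met(pct: float, tiers=(50.0, 60.0, 70.0)):
    """Return the identity thresholds (in percent) that pct strictly exceeds."""
    return [t for t in tiers if pct > t]


# Counts stated in the text: 27 identical residues out of 38 consensus residues.
esx_pct = percent_identity(27, 38)
print(f"ESX identity to ETS consensus: {esx_pct:.1f}%")
print("thresholds exceeded:", tiers_met(esx_pct))
```

Under these counts the identity is approximately 71%, which exceeds all three recited thresholds.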

An "abnormal esx gene or cDNA" refers to an esx gene or cDNA that encodes an increased or decreased amount of ESX polypeptide, a non-functional ESX polypeptide, or an ESX polypeptide of substantially reduced functionality. Animal cells having non-functional, or reduced functionality, ESX polypeptides are characterized by a decrease in ESX-mediated transcriptional regulation. In a cancer cell, this relaxation of ESX-mediated regulation can result in a decrease in neoplastic cell proliferation. Similarly, "abnormal ESX gene product" refers to a nucleic acid encoding a non-functional or reduced functionality ESX polypeptide or the non-functional or reduced functionality ESX polypeptide itself. Abnormal esx genes or gene products include, for example, esx genes or subsequences altered by mutations (e.g., insertions, deletions, point mutations, etc.), splicing errors, premature termination codons, missing initiators, etc. Abnormal ESX polypeptides include polypeptides expressed by abnormal esx genes or nucleic acid gene products or subsequences thereof. Abnormal expression of esx genes includes underexpression (as compared to the "normal" healthy population) of ESX, e.g., through partial or complete inactivation, haploinsufficiency, etc.

The terms "rodent" and "rodents" refer to all members of the phylogenetic order Rodentia including any and all progeny of all future generations derived therefrom. The term "murine" refers to any and all members of the family Muridae, including rats and mice.

A "therapeutic lead compound" refers to a compound that has a particular characteristic activity, e.g., an activity that is therapeutically useful. While the compound itself may not be suitable as a therapeutic, the compound provides a basis or starting point for the creation and/or screening of analogues for similar desired activity (e.g., for ESX modulatory activity).

The term "test agent" (used interchangeably herein with "candidate agent", "test compound" and "test composition") refers to an element, molecule, or composition whose effect (e.g., on ESX activity) it is desired to assay. The "test composition" can be any molecule or mixture of molecules, optionally in a suitable carrier.

A "polypeptide comprising exon X of ESX" (where X is the exon number) refers to a polypeptide encoded by exon X of ESX. In some instances, particularly where the exon is present in a construct that is not a full-length ESX, the exon can include deletions and/or mutations that transactivation activity of this domain.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the nucleotide (SEQ ID NO:1) and deduced amino acid (SEQ ID NO:2) sequences of a human ESX cDNA.

Figures 2A through 2E show the amino acid sequence of the human ESX polypeptide and the domain homologies of the ESX polypeptide as compared to other members of the ETS transcription factor family. Figure 2A shows the amino acid sequence corresponding to the longest open reading frame in the human ESX cDNA (SEQ ID NO:2). Highlighted regions (boxed, bold font) are homologous to domains of other ETS transcription factors; these include the A-region/Pointed domain (amino acids 64-103), the serine-rich box (amino acids 188-238), and the ETS DNA binding domain (amino acids 274-354). Four regions that are not homologous to other Ets transcription factor domains are unboxed. Figure 2B presents a comparison of the A-region/Pointed domain of ESX (SEQ ID NO:17) to that encoded by the human ETS-1 gene (SEQ ID NO:18). Consensus residues most highly conserved among Ets family members are shown (Lautenberger et al. (1992) Oncogene 7: 1713-1719). Conservative substitutions are indicated by (+). Figure 2C shows the similarity between the ESX serine box (SEQ ID NO:19) and that of SOX4 (SEQ ID NO:20). A portion of the ESX serine box (SEQ ID NO:21) is shown in a helical wheel model to demonstrate clustering of serine residues opposite a hydrophobic helical face (boxed residues). Figure 2D shows the amino acid identity and similarity within the ETS DNA binding domain of the two related subfamily members, ESX (SEQ ID NO:22) and Elf-1 (SEQ ID NO:23). Consensus residues in this domain are the most highly conserved among all Ets family members (Janknecht and Nordheim (1993) Biochim. Biophys. Acta 1155: 346-356). Conservative (•) and non-conservative (*) substitutions found in ESX relative to the consensus residues (SEQ ID NOs:24-29) and their locations within known structural components of the ETS domain are shown (Werner et al. (1995) Cell 83: 761-771; Kodandapani et al. (1996) Nature 380: 457-460). Figure 2E illustrates the human ESX protein sequence (SEQ ID NO:2) showing the residues encoded by exon 4 (bold), the residues conserved in all Topo-I proteins (•), the Topo-I homologous fragment (4) and the Lysine¹⁴⁵ critical for transactivation (circled and bolded K).

Figure 3 illustrates the murine ESX (mESX) genomic organization and gene product.

Figure 4 shows the human ESX (hESX) (cDNA = SEQ ID NO:1, amino acid = SEQ ID NO:2) exon/intron junctions. The bold sequences contain the "transactivating domain" as mapped by GAL4 fusion studies.

Figure 5 shows the mouse ESX (mESX) (SEQ ID NO:16) and human ESX (hESX) (SEQ ID NO:2) primary structure and domain homologies.

Figure 6 shows the conserved elements in the mouse ESX (mESX) (SEQ ID NO:30) and human ESX (hESX) (SEQ ID NO:2) proximal promoter.

Figure 7 illustrates the mouse ESX (mESX) and human ESX (hESX) genomic DNA structure.

Figures 8A through 8D show the results of DNA binding and transactivation by recombinant ESX gene product, as well as chromosomal localization and copy number of the ESX gene. Figure 8A shows specific DNA-binding of full-length (42 kDa) recombinantly expressed ESX to an oligonucleotide sequence (TA5) containing the Ets responsive element (GGAA) from the HER2/neu promoter. Five different competing unlabeled (cold) oligonucleotides containing specific mutations in the wild-type (WT) TA5 (SEQ ID NO:32) sequence, m1-m5 (SEQ ID NOs:33-37), were added at 50-fold molar excess; gel lanes containing the excess cold competitors are labeled. Figure 8B shows a DNase-I hypersensitivity site and footprint produced by ESX on the antisense strand of an Ets response element in the HER2/neu promoter. The antisense strand sequence (SEQ ID NO:38) as shown (-40 bp to -26 bp upstream of the major transcriptional start site in the HER2/neu promoter) is marked with an asterisk at the hypersensitivity site within the Ets response element (GGAA on sense strand). Figures 8C and 8D show the induction of CAT activity from two different ETS-responsive reporter constructs (p3TA5-BLCAT5, pHER2-CAT) in COS cells cotransfected with an ESX expression plasmid (pcDNAI-ESX). Mutant reporter plasmids (p3TA5P-BLCAT5, pHER2m-CAT) are identical to their normal counterparts except for alterations in the Ets response element within the TA5 sequence (GGAA to GAGA and GGAA to TTAA, respectively).

Figure 9 illustrates mapping of the hESX activation domain. The varying hESX deletion constructs and their transactivation activity are shown.

Figure 10 shows a comparison of exon-encoded mouse (m) and human (h) ESX amino acid sequences. The 371 amino acid sequences encoded by genomic exons 2-9 were determined by comparing cloned mouse and human cDNA sequences, with the 7 exon boundaries mapped (arrows) after comparison between mouse and human genomic sequences, as described in Methods. Amino acid identities (vertical lines) and similarities (:, .) are as indicated.

Figure 11 illustrates nucleotide sequences and consensus response elements conserved between mouse and human ESX promoters. Comparison of aligned mouse and human genomic sequences revealed 83% nucleotide identity (vertical lines) between the 0.4 kb of upstream sequences shown above. Sequence numbering is relative to a putative transcriptional start site (+1) within a conserved pyrimidine-rich type Inr (box with arrow) located ~75 bp downstream from a conserved CCAAT sequence (box); this putative site was also identified as the most 5'-terminal nucleotide from a hESX cDNA clone and agrees with a previously determined hESX transcriptional start site (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433). The locations of conserved consensus response elements for Ets, AP-2, SP1/GC box, USF, Oct, and NF-κB are indicated by the horizontal bars; of the pair of Ets elements, the 5' element (GGAA) appears displaced by 3 nucleotides while the 3' element (TTCC) is positionally conserved in both promoters. As described in Methods, this 0.4 kb of mESX promoter sequence was cloned into pGL2-Basic to produce the mESX-luc reporter construct described in Example 6.

Figures 12A and 12B show the organization and features of the 11 kb Hind III ESX genomic fragment and a comparison of Ets domain protein homologies showing exon-intron junctions. Figure 12A shows regions of identity between the 11 kb Hind III ESX genomic fragment and the three contigs THC13038 (identical also to regions of UEV-1, accession no. U49278), THC209687 and THC203540, each identified by the respective bracketed region. Coding ESX exons are shown as shaded boxes while noncoding 5' and 3' exonic sequences are shown as open boxes. ESX transcription start, translation initiation and translation termination are indicated by an arrow, ATG and TGA respectively.

The polyadenylation signal for the 2.2 kb ESX transcript is indicated by AATAAA(2.2kb), with presumptive polyadenylation signals for the 4.1 kb ESX transcript in THC203540 noted by ATTAAA(4.1kb) and GATAAA(4.1kb). The presumptive UEV-1 polyadenylation signal in THC213038 (also in corresponding genomic sequence) is noted by AATAAA(UEV). The 350 bp of ESX promoter with >80% homology to that of mouse is shown as a darkly stippled box while the 1150 bp region with <50% homology is shown as a lightly stippled box. Regions homologous to Alu sequences and containing CpG islands are indicated. Figure 12B shows a protein sequence alignment of the Ets DNA-binding domain from Ets family members whose genomic structure has been determined. The arrow (if applicable) indicates the position(s) of exon-intron junctions. Where exon-intron information for a particular Ets factor exists in different species (m=mouse, r=rat, c=chicken, d=Drosophila) only the human prototype is shown. Numbers flanking the protein sequences reference their location within the full-length protein. Percent protein homology to ESX, ERF and ETS-1 was determined as the percentage of identical amino acids within the corresponding Ets domains. References for the genomic structures are: human/mouse ERF (de Castro et al. (1997) Genomics, 42: 227-235; Liu et al. (1997) Oncogene, 14: 1445-1451), human/mouse/rat/chicken ETS1 (Bellacosa et al. (1994) J. Virol., 68: 2320-2330; Jorcyk et al. (1991) Oncogene, 6: 523-532; Watson et al. (1988) Virology, 164: 99-105), human/mouse/chicken/Drosophila ETS2 (Pribyl et al. (1988) Dev. Biol., 111: 45-53; Watson et al. (1990) Oncogene 5: 1521-1527), Drosophila PNT (Klambt C. (1993) Development, 117: 163-176), human ERM (Monte et al. (1996) Genomics, 35: 236-240), human TEL (Baens et al. (1996) Genome Res., 6: 404-413), mouse ELK1 (Grevin et al. (1996) Gene, 174: 185-188), mouse PU.1 (Moreau-Gachelin (1989) Oncogene, 4: 1449-1456) and Spi-B (Chen et al. (1998) Gene, 207: 209-218).

Figure 13 shows that the transactivating capacity of ESX localizes to exon 4. Top panel presents reporter (luciferase) activity induced by a series of C-terminally deleted ESX constructs fused to the GAL4 DNA-binding domain (G) and shown relative to G-VP16(413/490) induced activity, which served as a positive control. Bracketed numbers specify positions of the flanking amino acids in the ESX or VP16 fusion constructs. Middle panel presents the activity induced by various fragments of ESX fused to the GAL4 DNA-binding domain and shown relative to G-VP16(413/490). Activity of an ESX construct harboring the double mutations, S131R/S132A, and its normal counterpart, G-ESX(1/156), are displayed in the bottom two rows of this panel. Schematic at bottom of figure depicts approximate domain positioning within ESX including the Pointed domain, SOX box, A/T hook, and Ets domain shown above boundary lines specifying the location of the 8 coding exons (2-9) in ESX.

Figures 14A and 14B show the results of a mutational analysis of exon 4. Figure 14A shows reporter (luciferase) activity induced by exon 4 constructs containing single or double amino acid substitutions fused to the GAL4 DNA binding domain (G) and expressed as a percentage of the activity induced by the unmodified exon 4 G-ESX(129-159) construct. For each exon 4 mutational construct, the position(s) of the substituted amino acid(s) relative to unmodified exon 4 amino acids shown on the first row are indicated (A = alanine, P = proline and Q = glutamine). The bottom four rows represent truncated exon 4 constructs fused to the GAL4 DNA-binding domain (G). Positions of the terminal amino acids of the truncated ESX constructs are shown in brackets. Figure 14B shows mutational analysis of the FXXφφ motif within exon 4 (F = phenylalanine, X = any amino acid, φ = hydrophobic amino acid). Top panel aligns the FXXφφ motif within ESX, VP16, p65NF-κB and p53 proteins. The reporter (luciferase) activity induced by exon 4 mutations within the FXXφφ motif is expressed as a percentage of the activity induced by the unmodified exon 4 G-ESX(129-159) construct. The position(s) of alanine substitution(s) in the various exon 4 constructs are shown. Negative and positive controls consisted of the double VP16 mutant (F479A/L483A) and the unmodified VP16 fusion construct G-VP16(413-490), respectively, and their activities are shown in the bottom two rows.

Figures 15A and 15B show evidence for α-helical secondary structure within the ESX exon 4 domain. Figure 15A shows a helical wheel projection of the 13 amino acids (aa 134-146) from the acidic core transactivating domain of exon 4, demonstrating their amphipathic distribution. Figure 15B shows CD spectra of a 25 amino acid exon 4 peptide (aa 131-155) recorded at six different methanol/water concentrations. Insert gives the percent α-helical content of the peptide at each methanol concentration.

Figures 16A and 16B show cell line dependent squelching of heterologous promoters by ESX exon 4. Figure 16A: In transiently transfected SKBr3 cells, expression of a GAL4(DBD)-exon 4 fusion construct, G-ESX(129-159), is shown to suppress the activity of two GAL4(DBD)-independent luciferase reporters, the SV40 early promoter and a synthetic promoter containing three tandem copies of the erbB2/HER2 promoter's Ets response element (Ets triple repeat). The level of G-ESX(129-159) induced squelching is comparable to that induced by expression of the positive control construct containing the VP16 transactivation domain, G-VP16(413-490). Activity is presented relative to luciferase activity following transfection of the GAL4(DBD) negative control expression construct (G). Similar squelching results were obtained in transfected COS-7 cells (data not shown). Figure 16B: Transient transfection into either COS-7 or SKBr3 cells using a full-length ESX construct, pcDNAI-ESX, produced >4 fold upregulation of the Ets responsive reporter (Ets triple repeat) in the COS-7 cells but squelching (to <0.25 relative activity) of this same reporter in SKBr3 cells. Transfection of these cells with similar pcDNAI-ESX expression constructs either deleted of the exon 4 domain, pcDNAI-ESXΔ(129-159), or bearing double mutations in this domain, pcDNAI-ESX(D134A/E135A), produced no significant difference in reporter activity relative to cells transfected with pcDNAI (empty vector) alone.

Figure 17 schematically illustrates how the exon 4 domain is capable of multiple protein-protein interactions. Given the link between ESX overexpression and erbB2 amplification/overexpression, and the ability of ESX to bind and transactivate the erbB2 promoter, it is likely that this domain predisposes this oncogene to both overexpression and amplification (unscheduled DNA replication) as illustrated in the figure.

DETAILED DESCRIPTION

This invention pertains to the discovery of a transcription factor associated with the etiology of cancers, including epithelial cancers. This transcription factor, referred to as ESX (for epithelial-restricted with serine box), is located at chromosome 1q32 in a region known to be amplified in 50% of early breast cancers. ESX is heregulin-inducible and overexpressed in HER2/neu activated breast cancer cells. Tissue hybridization suggests that ESX becomes overexpressed at an early stage of human breast cancer development known as ductal carcinoma in situ (DCIS).

More particularly, this invention pertains to the discovery that ESX exons 4 and 7 encode separate transactivating and chromatin localizing domains in ESX.

Comparison of cloned murine and human cDNAs and genomic sequences (including intron-exon mapping), along with creation of GAL4 DBD and GST fusion proteins with full-length or partial ESX sequences, has revealed that ESX contains the following unique structural and functional domains in addition to its defining 85 amino acid (C-terminal) DNA-binding domain.

The first domain characterized herein is a 33 amino acid transactivating domain (exon 4-encoded), with transactivating potency comparable to VP16 when fused to a GAL4-DBD (in a mammalian cell 2-hybrid-type experiment), and which, when fused to GST in a "pull-down" assay, is able to bind specifically to both TATAA Binding Protein (TBP) and the major subunit of Replication Protein A (RPA). It is believed that the ability of exon 4 to potentially induce DNA replication by recruiting RPA (binding to the major subunit but neither of the 2 minor RPA subunits), as has been shown for a few other transcription factors, has not yet been reported for any Ets factor. Since different specific exon 4 residues are critical for each of these binding functions (as described in Example 7), they can be separated to generate reagents with one or the other function or to design methods of inhibiting either or both of these functions. This domain also has 50% homology to the PDZ domain in the Notch-interacting Dishevelled (Dsh) gene product and, in another region, significant homology to the highly conserved core domain of topoisomerase-I (Topo I). Using purified recombinant ESX we have been able to show that the full-length ESX protein has Topo I-like supercoil-relaxing activity not present in other Ets factors. Thus, it is likely that this exon 4 domain is designed for multiple protein-protein interactions, specifying critical tissue- and development-specific gene regulatory functions. Given the link between ESX overexpression and erbB2 amplification/overexpression, and the ability of ESX to bind and transactivate the erbB2 promoter, it is likely that this domain predisposes this oncogene to both overexpression and amplification (unscheduled DNA replication) in a fashion schematically represented in Figure 17.
With this bifunctional property, in vitro systems employing ESX or portions thereof can be developed that both transcribe and replicate from the same DNA template.

The second domain characterized herein is a typical bipartite nuclear localization signal (Example 6), located in a domain (exon 7-encoded) also having strong homology to the A/T hook domain of HMG-Y. This sequence-nonspecific DNA-binding motif recognizes and stabilizes architecturally irregular DNA structures like the H-DNA form thought to be present within the GGA mirror-repeat and nuclear matrix binding region of the erbB2 proximal promoter (adjacent to the ESX binding EBS). This domain has now been shown to result in the nuclear sublocalizing predilection of ESX for the matrix-chromatin fraction, unlike other transcription factors (e.g., AP-2) and other related Ets factors (e.g., Elf-1). Clustered lysine (K) residues in this domain are homologous in number and position to the functionally important acetylation sites known to be present in the A/T hook domain of HMG-Y and in the DNA-binding domain of p53; and our studies using the active histone acetyl transferase (HAT) domain from the nuclear co-regulatory factor, pCAF, have now shown that exon 7 of ESX not only binds pCAF/HAT but is also acetylated by pCAF/HAT to a degree exceeding that of p53 but not quite as completely as histones. Without being bound to a particular theory, it is predicted that acetylation of ESX exon 7 results in its altered function (e.g., DNA-binding, perhaps also its nuclear sublocalization and its protein-protein interactions) and defines at least two populations of intracellular ESX, acetylated vs. non-acetylated ESX, for which specific antibodies and other reagents can be designed. Since the A/T hook domain from HMG-Y has been shown to be chimerically fused by chromosomal translocation in human tumors, this same domain of ESX could be genetically rearranged and involved in human tumorigenesis.

I. Uses of the ESX exons 4 and 7.

As indicated above, the ESX gene of this invention is a transcription factor gene. Defects in the expression of this gene are associated with the onset of various cancers (e.g., cancers of the ovary, bladder, head and neck, and colon, etc.), particularly with epithelial cancers, including breast cancer among others.

It was a discovery of this invention that exon 4 is a strong transactivator and that exon 7 is a bipartite nuclear localization signal (Example 7) that is a sequence-nonspecific DNA-binding motif that recognizes and stabilizes architecturally irregular DNA structures. In addition, exon 7 is subject to acetylation and thus appears to be intimately involved in transcription regulation. ESX exons 4 and 7 (the nucleic acid or the encoded protein) are thus good targets for agents that are capable of modulating (upregulating or downregulating) ESX activity. Thus, in one embodiment, this invention provides methods of screening for potential regulators of ESX activity by screening for agents capable of specifically binding to the ESX exon 4 or 7 nucleic acid or expressed protein.

In addition, because ESX exon 7 is subject to acetylation, a measure of the relative acetylation (e.g., ratio of acetylated to unacetylated exon 7) provides a measure of the degree of ESX activation. Abnormal levels of ESX (exon 7) acetylation (e.g., as compared to that found in a healthy tissue) indicate abnormal regulation of ESX and indicate a predilection to the development of cancer.
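The relative acetylation measure described above can be expressed as a simple ratio compared against a healthy-tissue reference. The sketch below is illustrative only: the function names, the input quantities (arbitrary signal units for acetylated and unacetylated exon 7), and in particular the two-fold cutoff are hypothetical assumptions, since the text does not specify any numerical threshold for "abnormal".

```python
def acetylation_ratio(acetylated: float, unacetylated: float) -> float:
    """Ratio of acetylated to unacetylated exon 7 signal (arbitrary units)."""
    if acetylated < 0 or unacetylated <= 0:
        raise ValueError("signals must be non-negative, unacetylated must be > 0")
    return acetylated / unacetylated


def is_abnormal(sample_ratio: float, healthy_ratio: float,
                fold_threshold: float = 2.0) -> bool:
    """Flag a sample whose ratio differs from the healthy reference.

    fold_threshold is a hypothetical cutoff chosen for illustration;
    a real assay would need an empirically validated value.
    """
    if healthy_ratio <= 0 or fold_threshold <= 1:
        raise ValueError("healthy_ratio must be > 0 and fold_threshold > 1")
    fold = sample_ratio / healthy_ratio
    return fold >= fold_threshold or fold <= 1.0 / fold_threshold


# Example with made-up signal values.
healthy = acetylation_ratio(10.0, 10.0)
sample = acetylation_ratio(30.0, 10.0)
print(is_abnormal(sample, healthy))
```

The symmetric test (at or above the cutoff, or at or below its reciprocal) reflects that the text treats both elevated and reduced acetylation relative to healthy tissue as potentially abnormal.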

ESX exon 4 is shown herein to be a potent transactivator comparable to VP16. When coupled (e.g., chemically conjugated or expressed as a fusion protein) to a DNA binding domain, exon 4 is capable of inducing transcription of a target gene or cDNA. The target gene or cDNA can be a heterologous gene (e.g., introduced through homologous recombination or in a non-integrated vector and/or expression cassette) or an endogenous gene. Transactivation is most effective when the target gene or cDNA is under the control of a promoter having an ETS response element (e.g., an epithelial gene promoter).

The polypeptides of exons 4 and/or 7 can also be used to prepare an affinity matrix for isolation of nucleic acids and/or polypeptides that interact with ESX in these domains. Thus, for example, the exon 4 and/or exon 7 polypeptide can be attached to a solid support and then contacted, under appropriate conditions, with a target "sample" (e.g., a cell lysate). Polypeptides that bind to the exon 4 polypeptide and/or polypeptides or nucleic acids that bind to the exon 7 polypeptide will be retained on the support-bound exon polypeptide while the remainder of the target "sample" is washed off. The bound target can then be released from the affinity matrix.

Labeled exon 4 or exon 7 polypeptides can be used to probe cells, tissues, etc. for targets that interact with these polypeptides. In preferred embodiments, the exon 4 and/or exon 7 polypeptides are labeled with a detectable label. Then a "sample" can be probed in vivo, ex vivo, or in situ to identify regions in which the probes are localized and/or to identify molecules that interact with these probes.

Cells and/or tissues expressing the ESX gene may be used to monitor acetylation levels of ESX polypeptides in a wide variety of contexts. For example, where the effects of a drug on ESX expression are to be determined, the drug will be administered to the cell. Acetylation levels, or expression products, will be assayed as described below and the results compared to results from organisms, tissues, or cells similarly treated, but without the drug being tested.

These uses are intended to be illustrative and not limiting. Other uses, e.g., as suggested herein are within the purview of this invention.

II. The ESX gene and cDNA.

A) The human ESX gene.

Figure 1 provides both nucleic acid (SEQ ID NO:1) and polypeptide (SEQ ID NO:2) sequence listings for the human ESX cDNA of this invention. In addition, the human genomic sequence is provided herein in the sequence listing (SEQ ID NO:39). The sequence of human ESX consists of an open reading frame of 1113 nucleotides; an additional 161 and 703 nucleotides of 5'- and 3'-flanking sequence are presented in SEQ ID NO:3. The open reading frame of human ESX cDNA encodes a putative protein of 371 amino acids with a predicted molecular weight of 41428 Daltons.
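The figures stated above are mutually consistent, as a short arithmetic check shows. This sketch is illustrative only; it assumes the 1113-nucleotide count covers the 371 sense codons without a stop codon, which the text does not state explicitly, and the variable names are hypothetical.

```python
# Values stated in the text.
ORF_NT = 1113            # open reading frame length, nucleotides
FLANK_5_NT = 161         # 5'-flanking sequence, nucleotides
FLANK_3_NT = 703         # 3'-flanking sequence, nucleotides
PROTEIN_AA = 371         # putative protein length, amino acids
PREDICTED_MW_DA = 41428  # predicted molecular weight, Daltons

# Three nucleotides per codon: 1113 nt should divide evenly into codons.
codons, remainder = divmod(ORF_NT, 3)

# Combined length of the open reading frame plus both flanking sequences.
total_nt = FLANK_5_NT + ORF_NT + FLANK_3_NT

# Average mass per residue implied by the predicted molecular weight.
mean_residue_da = PREDICTED_MW_DA / PROTEIN_AA

print(codons, remainder)          # codon count and leftover nucleotides
print(total_nt)                   # combined sequence length in nucleotides
print(round(mean_residue_da, 1))  # mean residue mass in Daltons
```

The open reading frame divides into exactly 371 codons, matching the stated protein length, and the implied mean residue mass of roughly 112 Da is in the range typical of proteins.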

B) The murine ESX gene.

A 7.8 kb mESX genomic clone was isolated that contains ~2.9 kb of promoter upstream of ~4.9 kb of DNA incorporating at least 9 exons (see Figure 3 and SEQ ID NO:15). These exons specify a full-length transcript of about 2 kb, with exons 2-9 encoding the 371 amino acid mESX protein. Comparison of the mouse and human ESX sequences revealed the following structural and/or functional domains within a 42 kDa ESX protein conserved between mouse and human: an exon 3 encoded POINTED/A-region, found in a small subset of all ETS genes; an amphipathic helix and serine-rich box encoded by exons 5 and 6; a nucleoplasmin-type nuclear targeting sequence encoded by exon 7; and a helix-turn-helix ETS DNA binding domain encoded by exons 8 and 9.

The proximal promoter region of mESX (350 bp upstream of the transcriptional start site, see Figure 6) is 83% homologous to the hESX promoter. Conserved putative response elements within this region include ETS, AP-2, SP1, USF, Oct, and NF-κB binding sites which are believed to regulate ESX induction. A conserved CCAAT box lies about 80 bp upstream of the pyrimidine rich Inr element which specifies ESX transcript initiation. Unlike hESX, mESX lacks a TATA box.

C) Isolation of cDNA and/or probes.

The nucleic acids (e.g., ESX cDNA, or subsequences (probes)) of the present invention are cloned, or amplified by in vitro methods, such as the polymerase chain reaction (PCR), the ligase chain reaction (LCR), the transcription-based amplification system (TAS), and the self-sustained sequence replication system (SSR). A wide variety of cloning and in vitro amplification methodologies are well-known to persons of skill. Examples of these techniques and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology 152 Academic Press, Inc., San Diego, CA (Berger); Sambrook et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, (Sambrook et al.); Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1994 Supplement) (Ausubel); Cashion et al., U.S. patent number 5,017,478; and Carr, European Patent No. 0,246,864. Examples of techniques sufficient to direct persons of skill through in vitro amplification methods are found in Berger, Sambrook, and Ausubel, as well as Mullis et al. (1987) U.S. Patent No. 4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, CA (1990) (Innis); Arnheim & Levinson (October 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3: 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874; Lomeli et al. (1989) J. Clin. Chem., 35: 1826; Landegren et al. (1988) Science, 241: 1077-1080; Van Brunt (1990) Biotechnology, 8: 291-294; Wu and Wallace (1989) Genomics, 4: 560; and Barringer et al. (1990) Gene, 89: 117. In one preferred embodiment, the human ESX cDNA can be isolated by routine cloning methods.
The cDNA sequence provided in SEQ ID NO:1 can be used to provide probes that specifically hybridize to the ESX gene, in a genomic DNA sample, or to the ESX mRNA, in a total RNA sample (e.g., in a Southern blot). Once the target ESX nucleic acid is identified (e.g., in a Southern blot), it can be isolated according to standard methods known to those of skill in the art (see, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor Laboratory; Berger and Kimmel (1987) Methods in Enzymology, Vol. 152: Guide to Molecular Cloning Techniques, San Diego: Academic Press, Inc.; or Ausubel et al. (1987) Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York). Methods of screening human cDNA libraries for the ESX gene are provided in Example 1.

In another preferred embodiment, the human ESX cDNA can be isolated by amplification methods such as polymerase chain reaction (PCR). In a preferred embodiment, the ESX sequence is amplified from a cDNA sample (e.g., double stranded placental cDNA (Clontech)) using the primers 5' ESX-DBD (5'-CCGGGACATCCTCATCCACCC-3' (SEQ ID NO:13)) and 3' ESX-DBD (5'-GTACCTCATGGCCCGGCTCAG-3' (SEQ ID NO:14)). Preferred amplification conditions include 10x PCR buffer (500 mM KCl, 100 mM Tris, pH 8.3 at room temperature, 15 mM MgCl₂, 0.1% gelatin) with the amplification run for about 34 cycles at 94°C for 30 sec, 58°C for 30 sec and 72°C for 60 sec.
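The amplification parameters above lend themselves to a quick sanity check. The following Python sketch is illustrative only and makes several assumptions not stated in the text: that any whitespace inside the printed primer sequences is not part of the sequence, that the 10x buffer is used at a 1x final concentration, and that only the stated hold times (not ramping, initial denaturation, or final extension) contribute to run time. All function and variable names are hypothetical.

```python
# Values taken from the text; whitespace in a printed primer is assumed
# to be a typesetting artifact and is stripped by clean().
PRIMERS = {
    "5' ESX-DBD": "CCGGGACATCCTCA TCCACCC",
    "3' ESX-DBD": "GTACCTCATGGCCCGGCTCAG",
}
TEN_X_BUFFER = {"KCl_mM": 500.0, "Tris_mM": 100.0,
                "MgCl2_mM": 15.0, "gelatin_pct": 0.1}
CYCLES = 34
STEP_SECONDS = {"denature_94C": 30, "anneal_58C": 30, "extend_72C": 60}


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a primer sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the cleaned sequence."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def dilute(buffer: dict, fold: float = 10.0) -> dict:
    """Final concentrations after diluting a stock buffer `fold` times."""
    return {name: value / fold for name, value in buffer.items()}


def total_hold_seconds(cycles: int, steps: dict) -> int:
    """Sum of programmed hold times across all cycles (no ramp time)."""
    return cycles * sum(steps.values())


for name, seq in PRIMERS.items():
    print(name, len(clean(seq)), "nt, GC =", round(gc_fraction(seq), 3))
print(dilute(TEN_X_BUFFER))
print(total_hold_seconds(CYCLES, STEP_SECONDS) / 60, "min of hold time")
```

Under these assumptions both primers are 21-mers with the same GC fraction, the 1x reaction contains 50 mM KCl and 1.5 mM MgCl₂, and the 34 cycles account for 68 minutes of programmed hold time.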

Similarly, using the nucleic acid sequence provided herein (e.g., SEQ ID NO:15), one of ordinary skill can routinely isolate the mouse ESX gene, mRNA or cDNA. However, in a preferred embodiment, the mouse ESX sequence is amplified from a nucleic acid sample (e.g., gDNA or cDNA) using primers readily derived from the sequence listings provided herein. Suitable primers include, but are not limited to, primers (e.g., 20 mers) corresponding to the 5' and 3' termini of the murine ESX cDNA as described above.

D) Labeling of nucleic acid probes.

Where the ESX cDNA or its subsequences (e.g., exon 4 or 7) are to be used as nucleic acid probes, it is often desirable to label the nucleic acids with detectable labels. The labels may be incorporated by any of a number of means well known to those of skill in the art. However, in a preferred embodiment, the label is simultaneously incorporated during the amplification step in the preparation of the sample nucleic acids. Thus, for example, polymerase chain reaction (PCR) with labeled primers or labeled nucleotides will provide a labeled amplification product. In another preferred embodiment, transcription amplification using a labeled nucleotide (e.g., fluorescein-labeled UTP and/or CTP) incorporates a label into the transcribed nucleic acids.

Alternatively, a label may be added directly to an original nucleic acid sample (e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the amplification is completed. Means of attaching labels to nucleic acids are well known to those of skill in the art and include, for example nick translation or end-labeling (e.g. with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore).

Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

It will be recognized that fluorescent labels are not to be limited to single species organic molecules, but include inorganic molecules, multi-molecular mixtures of organic and/or inorganic molecules, crystals, heteropolymers, and the like. Thus, for example, CdSe-CdS core-shell nanocrystals enclosed in a silica shell can be easily derivatized for coupling to a biological molecule (Bruchez et al. (1998) Science, 281: 2013-2016). Similarly, highly fluorescent quantum dots (zinc sulfide-capped cadmium selenide) have been covalently coupled to biomolecules for use in ultrasensitive biological detection (Warren and Nie (1998) Science, 281: 2016-2018).

Means of detecting such labels are well known to those of skill in the art. Thus, for example, radiolabels may be detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label.

III. Antibodies to ESX polypeptide(s).

Antibodies are raised to the ESX polypeptides of the present invention (e.g. exon 4, exon 7, specific anti-acetylated ESX, specific anti-unacetylated ESX, etc.), including individual, allelic, strain, or species variants, and fragments thereof, both in their naturally occurring (full-length) forms and in recombinant forms. Additionally, antibodies are raised to these polypeptides in either their native configurations or in non-native configurations. Anti-idiotypic antibodies can also be generated. Many methods of making antibodies are known to persons of skill. The following discussion is presented as a general overview of the techniques available; however, one of skill will recognize that many variations upon the following methods are known.

A) Antibody Production.

A number of immunogens are used to produce antibodies specifically reactive with ESX polypeptides. Recombinant or synthetic polypeptides of 10 amino acids in length, or greater, selected from amino acid sub-sequences of ESX (e.g., SEQ ID NO: 1, exon 4, exon 7) are the preferred polypeptide immunogen (antigen) for the production of monoclonal or polyclonal antibodies. In one class of preferred embodiments, an immunogenic peptide conjugate is also included as an immunogen. Naturally occurring polypeptides are also used either in pure or impure form.

Recombinant polypeptides are expressed in eukaryotic or prokaryotic cells (as described below) and purified using standard techniques. The polypeptide, or a synthetic version thereof, is then injected into an animal capable of producing antibodies. Either monoclonal or polyclonal antibodies can be generated for subsequent use in immunoassays to measure the presence and quantity of the polypeptide.

Methods of producing polyclonal antibodies are known to those of skill in the art. In brief, an immunogen (antigen), preferably a purified polypeptide, a polypeptide coupled to an appropriate carrier (e.g., GST, keyhole limpet hemocyanin, etc.), or a polypeptide incorporated into an immunization vector such as a recombinant vaccinia virus (see, U.S. Patent No. 4,722,848) is mixed with an adjuvant and animals are immunized with the mixture. The animal's immune response to the immunogen preparation is monitored by taking test bleeds and determining the titer of reactivity to the polypeptide of interest. When appropriately high titers of antibody to the immunogen are obtained, blood is collected from the animal and antisera are prepared. Further fractionation of the antisera to enrich for antibodies reactive to the polypeptide is performed where desired (see, e.g., Coligan (1991) Current Protocols in Immunology Wiley/Greene, NY; and Harlow and Lane (1989) Antibodies: A Laboratory Manual, Cold Spring Harbor Press, NY). Antibodies, including binding fragments and single chain recombinant versions thereof, against predetermined fragments of ESX polypeptides are raised by immunizing animals, e.g., with conjugates of the fragments with carrier proteins as described above. Typically, the immunogen of interest is a peptide of at least about 5 amino acids, more typically the peptide is 10 amino acids in length, preferably, the fragment is 15 amino acids in length and more preferably the fragment is 20 amino acids in length or greater. The peptides are typically coupled to a carrier protein (e.g., as a fusion protein), or are recombinantly expressed in an immunization vector. Antigenic determinants on peptides to which antibodies bind are typically 3 to 10 amino acids in length.

One particularly preferred immunogen is illustrated in Example 1. In this example, a peptide fragment consisting of the sixteen carboxy-terminal amino acids of ESX was used as an ESX antigen in rabbits. An amino-terminal cysteine was introduced to allow coupling of the peptide to a carrier protein (KLH). Anti-ESX antibodies were obtained by affinity purification of total IgG from immunized rabbits using an affinity column to which the ESX carboxyl terminal peptide fragment was bound.

Monoclonal antibodies are prepared from cells secreting the desired antibody. These antibodies are screened for binding to normal or modified polypeptides, or screened for agonistic or antagonistic activity, e.g., activity mediated through an ESX protein. Specific monoclonal and polyclonal antibodies will usually bind with a K_D of at least about 0.1 mM, more usually at least about 50 µM, and most preferably at least about 1 µM or better.
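The construction of the Example 1 peptide immunogen described above (the sixteen carboxy-terminal residues with an added amino-terminal cysteine for carrier coupling) can be sketched in Python as follows. This is a minimal illustration only: the function name is hypothetical, and the demonstration sequence is an arbitrary placeholder, not the ESX sequence.

```python
def c_terminal_immunogen(protein_seq, n_residues=16, add_cys=True):
    """Return the last n_residues of a protein, optionally with an
    amino-terminal cysteine added for coupling to a carrier such as KLH."""
    seq = protein_seq.strip().upper()
    if len(seq) < n_residues:
        raise ValueError("protein is shorter than the requested peptide")
    peptide = seq[-n_residues:]
    return "C" + peptide if add_cys else peptide


if __name__ == "__main__":
    # Placeholder one-letter sequence for demonstration; NOT the ESX protein.
    placeholder = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    print(c_terminal_immunogen(placeholder))
```

With the defaults, the returned peptide is seventeen residues long: the added cysteine followed by the sixteen carboxy-terminal residues of the input.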

In some instances, it is desirable to prepare monoclonal antibodies from various mammalian hosts, such as mice, rodents, primates, humans, etc. Descriptions of techniques for preparing such monoclonal antibodies are found in, e.g., Stites et al. (eds.) Basic and Clinical Immunology (4th ed.) Lange Medical Publications, Los Altos, CA, and references cited therein; Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed.) Academic Press, New York, NY; and Kohler and Milstein (1975) Nature 256: 495-497. Summarized briefly, this method proceeds by injecting an animal with an immunogen. The animal is then sacrificed and cells taken from its spleen, which are fused with myeloma cells. The result is a hybrid cell or "hybridoma" that is capable of reproducing in vitro. The population of hybridomas is then screened to isolate individual clones, each of which secretes a single antibody species to the immunogen. In this manner, the individual antibody species obtained are the products of immortalized and cloned single B cells from the immune animal generated in response to a specific site recognized on the immunogenic substance.

Alternative methods of immortalization include transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other methods known in the art. Colonies arising from single immortalized cells are screened for production of antibodies of the desired specificity and affinity for the antigen, and yield of the monoclonal antibodies produced by such cells is enhanced by various techniques, including injection into the peritoneal cavity of a vertebrate (preferably mammalian) host. The polypeptides and antibodies of the present invention are used with or without modification, and include chimeric antibodies such as humanized murine antibodies.

Other suitable techniques involve selection of libraries of recombinant antibodies in phage or similar vectors (see, e.g., Huse et al. (1989) Science 246: 1275-1281; and Ward, et al (1989) Nature 341: 544-546; and Vaughan et al. (1996) Nature Biotechnology, 14: 309-314).

Frequently, the polypeptides and antibodies will be labeled by joining, either covalently or non-covalently, a substance which provides for a detectable signal. A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionucleotides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, chemiluminescent moieties, magnetic particles, and the like. Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241. Also, recombinant immunoglobulins may be produced (see, e.g., Cabilly, U.S. Patent No. 4,816,567; and Queen et al. (1989) Proc. Nat'l Acad. Sci. USA 86: 10029-10033).

The antibodies of this invention are also used for affinity chromatography in isolating ESX polypeptides. Columns are prepared, e.g., with the antibodies linked to a solid support, e.g., particles, such as agarose, Sephadex, or the like, where a cell lysate is passed through the column, washed, and treated with increasing concentrations of a mild denaturant, whereby purified ESX polypeptides are released.

The antibodies can be used to screen expression libraries for particular expression products such as normal or abnormal human ESX protein. Usually the antibodies in such a procedure are labeled with a moiety allowing easy detection of presence of antigen by antibody binding.

Antibodies raised against ESX polypeptides can also be used to raise anti-idiotypic antibodies. These are useful for detecting or diagnosing various pathological conditions related to the presence of the respective antigens.

B) Human or humanized (chimeric) antibody production.

The anti-ESX antibodies of this invention can also be administered to an organism (e.g., a human patient) for therapeutic purposes (e.g., to block the action of an ESX polypeptide or as targeting molecules when conjugated or fused to effector molecules such as labels, cytotoxins, enzymes, growth factors, drugs, etc.). Antibodies administered to an organism other than the species in which they are raised are often immunogenic. Thus, for example, murine antibodies administered to a human often induce an immunologic response against the antibody (e.g., the human anti-mouse antibody (HAMA) response) on multiple administrations. The immunogenic properties of the antibody are reduced by altering portions, or all, of the antibody into characteristically human sequences thereby producing chimeric or human antibodies, respectively.

i) Humanized (chimeric) antibodies.

Humanized (chimeric) antibodies are immunoglobulin molecules comprising a human and non-human portion. More specifically, the antigen combining region (or variable region) of a humanized chimeric antibody is derived from a non-human source (e.g., murine) and the constant region of the chimeric antibody (which confers biological effector function to the immunoglobulin) is derived from a human source. The humanized chimeric antibody should have the antigen binding (e.g., anti-ESX polypeptide) specificity of the non-human antibody molecule and the effector function conferred by the human antibody molecule. A large number of methods of generating chimeric antibodies are well known to those of skill in the art (see, e.g., U.S. Patent Nos: 5,502,167, 5,500,362, 5,491,088, 5,482,856, 5,472,693, 5,354,847, 5,292,867, 5,231,026, 5,204,244, 5,202,238, 5,169,939, 5,081,235, 5,075,431, and 4,975,369).

In general, the procedures used to produce these chimeric antibodies consist of the following steps (the order of some steps may be interchanged): (a) identifying and cloning the correct gene segment encoding the antigen binding portion of the antibody molecule; this gene segment (known as the VDJ, variable, diversity and joining regions for heavy chains or VJ, variable, joining regions for light chains, or simply as the V or Variable region) may be in either the cDNA or genomic form; (b) cloning the gene segments encoding the constant region or desired part thereof; (c) ligating the variable region with the constant region so that the complete chimeric antibody is encoded in a transcribable and translatable form; (d) ligating this construct into a vector containing a selectable marker and gene control regions such as promoters, enhancers and poly(A) addition signals; (e) amplifying this construct in a host cell (e.g., bacteria); (f) introducing the DNA into eukaryotic cells (transfection), most often mammalian lymphocytes;

Antibodies of several distinct antigen binding specificities have been manipulated by these protocols to produce chimeric proteins (e.g., anti-TNP: Boulianne et al. (1984) Nature, 312: 643; and anti-tumor antigens: Sahagan et al. (1986) J. Immunol, 137: 1066). Likewise several different effector functions have been achieved by linking new sequences to those encoding the antigen binding region. Some of these include enzymes (Neuberger et al. (1984) Nature 312: 604), immunoglobulin constant regions from another species and constant regions of another immunoglobulin chain (Sharon et al (1984) Nature 309: 364; Tan et al., (1985) J. Immunol 135: 3565-3567).

In one preferred embodiment, a recombinant DNA vector is used to transfect a cell line that produces an anti-ESX antibody. The novel recombinant DNA vector contains a "replacement gene" to replace all or a portion of the gene encoding the immunoglobulin constant region in the cell line (e.g., a replacement gene may encode all or a portion of a constant region of a human immunoglobulin, a specific immunoglobulin class, or an enzyme, a toxin, a biologically active peptide, a growth factor, inhibitor, or a linker peptide to facilitate conjugation to a drug, toxin, or other molecule, etc.), and a "target sequence" which allows for targeted homologous recombination with immunoglobulin sequences within the antibody producing cell.

In another embodiment, a recombinant DNA vector is used to transfect a cell line that produces an antibody having a desired effector function, (e.g., a constant region of a human immunoglobulin) in which case, the replacement gene contained in the recombinant vector may encode all or a portion of a region of an anti-ESX antibody and the target sequence contained in the recombinant vector allows for homologous recombination and targeted gene modification within the antibody producing cell. In either embodiment, when only a portion of the variable or constant region is replaced, the resulting chimeric antibody may define the same antigen and/or have the same effector function yet be altered or improved so that the chimeric antibody may demonstrate a greater antigen specificity, greater affinity binding constant, increased effector function, or increased secretion and production by the transfected antibody producing cell line, etc. Regardless of the embodiment practiced, the processes of selection for integrated DNA (via a selectable marker), screening for chimeric antibody production, and cell cloning, can be used to obtain a clone of cells producing the chimeric antibody. Thus, a piece of DNA which encodes a modification for a monoclonal antibody can be targeted directly to the site of the expressed immunoglobulin gene within a B-cell or hybridoma cell line. DNA constructs for any particular modification may be used to alter the protein product of any monoclonal cell line or hybridoma. Such a procedure circumvents the costly and time consuming task of cloning both heavy and light chain variable region genes from each B-cell clone expressing a useful antigen specificity. In addition to circumventing the process of cloning variable region genes, the level of expression of chimeric antibody should be higher when the gene is at its natural chromosomal location rather than at a random position.

Detailed methods for preparation of chimeric (humanized) antibodies can be found in U.S. Patent 5,482,856.

ii) Human antibodies.

In another embodiment, this invention provides for fully human anti-ESX antibodies. Human antibodies consist entirely of characteristically human polypeptide sequences. The human anti-ESX antibodies of this invention can be produced using a wide variety of methods (see, e.g., Larrick et al., U.S. Pat. No. 5,001,065, for review). In one preferred embodiment, the human anti-ESX antibodies of the present invention are usually produced initially in trioma cells. Genes encoding the antibodies are then cloned and expressed in other cells, particularly, nonhuman mammalian cells.

The general approach for producing human antibodies by trioma technology has been described by Ostberg et al. (1983), Hybridoma 2: 361-367, Ostberg, U.S. Pat. No. 4,634,664, and Engelman et al., U.S. Pat. No. 4,634,666. The antibody-producing cell lines obtained by this method are called triomas because they are descended from three cells: two human and one mouse. Triomas have been found to produce antibody more stably than ordinary hybridomas made from human cells. Preparation of trioma cells requires an initial fusion of a mouse myeloma cell line with unimmunized human peripheral B lymphocytes. This fusion generates a xenogenic hybrid cell containing both human and mouse chromosomes (see, Engelman, supra.). Xenogenic cells that have lost the capacity to secrete antibodies are selected. Preferably, a xenogenic cell is selected that is resistant to 8-azaguanine. Cells possessing resistance to 8-azaguanine are unable to propagate on hypoxanthine-aminopterin-thymidine (HAT) or azaserine-hypoxanthine (AH) media.

The capacity to secrete antibodies is conferred by a further fusion between the xenogenic cell and B-lymphocytes immunized against an ESX polypeptide or an epitope thereof. The B-lymphocytes are obtained from the spleen, blood or lymph nodes of a human donor. If antibodies against a specific antigen or epitope are desired, it is preferable to use that antigen or epitope thereof as the immunogen rather than ESX polypeptide. Alternatively, B-lymphocytes are obtained from an unimmunized individual and stimulated with an ESX polypeptide, or an epitope thereof, in vitro. In a further variation, B-lymphocytes are obtained from an infected, or otherwise immunized individual, and then hyperimmunized by exposure to an ESX polypeptide for about seven to fourteen days, in vitro.

The immunized B-lymphocytes prepared by one of the above procedures are fused with a xenogenic hybrid cell by well known methods. For example, the cells are treated with 40-50% polyethylene glycol of MW 1000-4000, at about 37°C for about 5-10 min. Cells are separated from the fusion mixture and propagated in media selective for the desired hybrids. When the xenogenic hybrid cell is resistant to 8-azaguanine, immortalized trioma cells are conveniently selected by successive passage of cells on HAT or AH medium. Other selective procedures are, of course, possible depending on the nature of the cells used in fusion. Clones secreting antibodies having the required binding specificity are identified by assaying the trioma culture medium for the ability to bind to an ESX polypeptide or an epitope thereof. Triomas producing human antibodies having the desired specificity are subcloned by the limiting dilution technique and grown in vitro in culture medium, or are injected into selected host animals and grown in vivo.

The trioma cell lines obtained are then tested for the ability to bind an ESX polypeptide or an epitope thereof. Antibodies are separated from the resulting culture medium or body fluids by conventional antibody-fractionation procedures, such as ammonium sulfate precipitation, DEAE cellulose chromatography and affinity chromatography.

Although triomas are genetically stable they do not produce antibodies at very high levels. Expression levels can be increased by cloning antibody genes from the trioma into one or more expression vectors, and transforming the vector into a cell line such as the cell lines typically used for expression of recombinant or humanized immunoglobulins. As well as increasing yield of antibody, this strategy offers the additional advantage that immunoglobulins are obtained from a cell line that does not have a human component, and does not therefore need to be subjected to the especially extensive viral screening required for human cell lines.

The genes encoding the heavy and light chains of immunoglobulins secreted by trioma cell lines are cloned according to methods, including the polymerase chain reaction, known in the art (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor, N.Y., 1989; Berger & Kimmel, Methods in Enzymology, Vol. 152: Guide to Molecular Cloning Techniques, Academic Press, Inc., San Diego, Calif., 1987; Co et al. (1992) J. Immunol., 148: 1149). For example, genes encoding heavy and light chains are cloned from a trioma's genomic DNA or cDNA produced by reverse transcription of the trioma's RNA. Cloning is accomplished by conventional techniques including the use of PCR primers that hybridize to the sequences flanking or overlapping the genes, or segments of genes, to be cloned.

Typically, recombinant constructs comprise DNA segments encoding a complete human immunoglobulin heavy chain and/or a complete human immunoglobulin light chain of an immunoglobulin expressed by a trioma cell line. Alternatively, DNA segments encoding only a portion of the primary antibody genes are produced, which portions possess binding and/or effector activities. Other recombinant constructs contain segments of trioma cell line immunoglobulin genes fused to segments of other immunoglobulin genes, particularly segments of other human constant region sequences (heavy and/or light chain). Human constant region sequences can be selected from various reference sources, including but not limited to those listed in Kabat et al. (1987), Sequences of Proteins of Immunological Interest, U.S. Department of Health and Human Services.

In addition to the DNA segments encoding anti-ESX immunoglobulins or fragments thereof, other substantially homologous modified immunoglobulins can be readily designed and manufactured utilizing various recombinant DNA techniques known to those skilled in the art such as site-directed mutagenesis (see, e.g., Gillman & Smith (1979) Gene, 8: 81-97; Roberts et al. (1987) Nature, 328: 731-734). Such modified segments will usually retain antigen binding capacity and/or effector function. Moreover, the modified segments are usually not so far changed from the original trioma genomic sequences as to prevent hybridization to these sequences under stringent conditions. Because, like many genes, immunoglobulin genes contain separate functional regions, each having one or more distinct biological activities, the genes may be fused to functional regions from other genes to produce fusion proteins (e.g., immunotoxins) having novel properties or novel combinations of properties.

The recombinant polynucleotide constructs will typically include an expression control sequence operably linked to the coding sequences, including naturally-associated or heterologous promoter regions. Preferably, the expression control sequences will be eukaryotic promoter systems in vectors capable of transforming or transfecting eukaryotic host cells. Once the vector has been incorporated into the appropriate host, the host is maintained under conditions suitable for high level expression of the nucleotide sequences, and the collection and purification of the human anti-ESX immunoglobulins.

These expression vectors are typically replicable in the host organisms either as episomes or as an integral part of the host chromosomal DNA. Commonly, expression vectors will contain selection markers, e.g., ampicillin-resistance or hygromycin-resistance, to permit detection of those cells transformed with the desired DNA sequences.

In general, prokaryotes can be used for cloning the DNA sequences encoding a human anti-ESX immunoglobulin chain. E. coli is one prokaryotic host particularly useful for cloning the DNA sequences of the present invention. Microbes, such as yeast are also useful for expression. Saccharomyces is a preferred yeast host, with suitable vectors having expression control sequences, an origin of replication, termination sequences and the like as desired. Typical promoters include 3-phosphoglycerate kinase and other glycolytic enzymes. Inducible yeast promoters include, among others, promoters from alcohol dehydrogenase 2, isocytochrome C, and enzymes responsible for maltose and galactose utilization. Mammalian cells are a particularly preferred host for expressing nucleotide segments encoding immunoglobulins or fragments thereof (see, e.g., Winnacker (1987) From Genes to Clones, VCH Publishers, N.Y.). A number of suitable host cell lines capable of secreting intact heterologous proteins have been developed in the art, and include CHO cell lines, various COS cell lines, HeLa cells, L cells and myeloma cell lines. Preferably, the cells are nonhuman. Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter, an enhancer (Queen et al. (1986) Immunol. Rev. 89: 49), and necessary processing information sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Preferred expression control sequences are promoters derived from endogenous genes, cytomegalovirus, SV40, adenovirus, bovine papillomavirus, and the like (see, e.g., Co et al. (1992) J. Immunol, 148: 1149).

The vectors containing the DNA segments of interest can be transferred into the host cell by well-known methods, depending on the type of cellular host. For example, calcium chloride transfection is commonly utilized for prokaryotic cells, whereas calcium phosphate treatment, electroporation, lipofection, biolistics or viral-based transfection may be used for other cellular hosts. Other methods used to transform mammalian cells include the use of polybrene, protoplast fusion, liposomes, electroporation, and microinjection (see, generally, Sambrook et al., supra).

Once expressed, human anti-ESX immunoglobulins of the invention can be purified according to standard procedures of the art, including HPLC purification, fraction column chromatography, gel electrophoresis and the like (see, generally, Scopes (1982) Protein Purification, Springer-Verlag, NY). Detailed protocols for the production of human antibodies can be found in U.S. Patent 5,506,132.

Other approaches include in vitro immunization of human blood. In this approach, human blood lymphocytes capable of producing human antibodies are produced. Human peripheral blood is collected from the patient and is treated to recover mononuclear cells. The suppressor T-cells then are removed and remaining cells are suspended in a tissue culture medium to which is added the antigen and autologous serum and, preferably, a nonspecific lymphocyte activator. The cells then are incubated for a period of time so that they produce the specific antibody desired. The cells then can be fused to human myeloma cells to immortalize the cell line, thereby to permit continuous production of antibody (see U.S. Patent 4,716,111). In another approach, mouse-human hybridomas which produce human anti-ESX are prepared (see, e.g., U.S. Patent 5,506,132). Other approaches include immunization of mice transformed to express human immunoglobulin genes, and phage display screening (Vaughan et al. supra.).

IV. Production of ESX polypeptides.

A) De novo chemical synthesis.

The ESX proteins or subsequences thereof (e.g. polypeptides corresponding to exons 4 or 7) may be synthesized using standard chemical peptide synthesis techniques. Where the desired subsequences are relatively short (e.g., when a particular antigenic determinant is desired) the molecule may be synthesized as a single contiguous polypeptide. Where larger molecules are desired, subsequences can be synthesized separately (in one or more units) and then fused by condensation of the amino terminus of one molecule with the carboxyl terminus of the other molecule thereby forming a peptide bond.

Solid phase synthesis in which the C-terminal amino acid of the sequence is attached to an insoluble support followed by sequential addition of the remaining amino acids in the sequence is the preferred method for the chemical synthesis of the polypeptides of this invention. Techniques for solid phase synthesis are described by Barany and Merrifield, Solid-Phase Peptide Synthesis; pp. 3-284 in The Peptides: Analysis, Synthesis, Biology. Vol. 2: Special Methods in Peptide Synthesis, Part A., Merrifield, et al. (1963) J. Am. Chem. Soc., 85: 2149-2156, and Stewart et al. (1984) Solid Phase Peptide Synthesis, 2nd ed. Pierce Chem. Co., Rockford, Ill.

B) Recombinant expression.

In a preferred embodiment, the ESX proteins or subsequences thereof (e.g. polypeptides of exons 4 or 7), are synthesized using recombinant DNA methodology. Generally this involves creating a DNA sequence that encodes the fusion protein, placing the DNA in an expression cassette under the control of a particular promoter, expressing the protein in a host, isolating the expressed protein and, if required, renaturing the protein.

DNA encoding the ESX proteins or subsequences of this invention can be prepared by any suitable method as described above, including, for example, cloning and restriction of appropriate sequences or direct chemical synthesis by methods such as the phosphotriester method of Narang et al. (1979) Meth. Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979) Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981) Tetra. Lett., 22: 1859-1862; and the solid support method of U.S. Patent No. 4,458,066. Chemical synthesis produces a single stranded oligonucleotide. This may be converted into double stranded DNA by hybridization with a complementary sequence, or by polymerization with a DNA polymerase using the single strand as a template. One of skill would recognize that while chemical synthesis of DNA is limited to sequences of about 100 bases, longer sequences may be obtained by the ligation of shorter sequences.
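The two ideas in the preceding paragraph, deriving the complementary strand for a synthesized oligonucleotide and dividing a longer sequence into pieces of no more than about 100 bases for later ligation, can be sketched in Python as follows. The function names are hypothetical, and the simple end-to-end split is a simplifying assumption; it does not model the overlaps or sticky ends that a practical assembly scheme would typically use.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_into_oligos(seq, max_len=100):
    """Divide a sequence into consecutive pieces of at most max_len bases."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [seq[i:i + max_len] for i in range(0, len(seq), max_len)]


if __name__ == "__main__":
    # Arbitrary placeholder sequence for demonstration; not an ESX sequence.
    target = "ATGGCCACGTTGCA" * 18
    for piece in split_into_oligos(target):
        print(len(piece), piece[:12] + "...", reverse_complement(piece)[:12] + "...")
```

Joining the pieces in order reproduces the original sequence, which corresponds to the ligation step described in the text.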

Alternatively, subsequences may be cloned and the appropriate subsequences cleaved using appropriate restriction enzymes. The fragments may then be ligated to produce the desired DNA sequence.

In one embodiment, ESX proteins of this invention can be cloned using DNA amplification methods such as polymerase chain reaction (PCR). Thus, for example, the nucleic acid sequence or subsequence is PCR amplified, using a sense primer containing one restriction site (e.g., NdeI) and an antisense primer containing another restriction site (e.g., HindIII). This will produce a nucleic acid encoding the desired ESX sequence or subsequence and having terminal restriction sites. This nucleic acid can then be easily ligated into a vector containing a nucleic acid encoding the second molecule and having the appropriate corresponding restriction sites. Suitable PCR primers can be determined by one of skill in the art using the sequence information provided in SEQ ID NOs: 1 and 3. Appropriate restriction sites can also be added to the nucleic acid encoding the ESX protein or protein subsequence by site-directed mutagenesis. The plasmid containing the ESX sequence or subsequence is cleaved with the appropriate restriction endonuclease and then ligated into the vector encoding the second molecule according to standard methods.
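By way of illustration only, the primer design described above can be sketched as follows. The sketch assumes the standard recognition sequences for NdeI (CATATG) and HindIII (AAGCTT); the template shown is a hypothetical placeholder and is not taken from SEQ ID NO: 1 or 3, and the function names, the 18-base annealing length, and the 5' "GCGC" clamp are likewise assumptions chosen for the example.

```python
# Recognition sequences for the two enzymes named in the text.
RESTRICTION_SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(template: str, sense_site: str = "NdeI",
                   antisense_site: str = "HindIII",
                   anneal_len: int = 18, clamp: str = "GCGC"):
    """Build sense and antisense primers carrying 5' restriction sites.

    Each primer is: clamp + restriction site + template-annealing region.
    The clamp and annealing length are illustrative assumptions.
    """
    template = template.upper()
    if len(template) < anneal_len:
        raise ValueError("template shorter than annealing length")
    # Both sites are palindromic, so checking one strand is sufficient.
    for name in (sense_site, antisense_site):
        if RESTRICTION_SITES[name] in template:
            raise ValueError(f"template contains an internal {name} site")
    sense = clamp + RESTRICTION_SITES[sense_site] + template[:anneal_len]
    antisense = (clamp + RESTRICTION_SITES[antisense_site]
                 + reverse_complement(template[-anneal_len:]))
    return sense, antisense


# Hypothetical placeholder sequence, NOT an ESX sequence.
PLACEHOLDER_TEMPLATE = "ATGGCTGCAACCTGTGAGATTAGCAACATTTTCAGCAACTACTTCAGTGCG"
sense, antisense = design_primers(PLACEHOLDER_TEMPLATE)
print("sense    :", sense)
print("antisense:", antisense)
```

In practice the annealing region would be chosen from the actual ESX sequence and checked for melting temperature and reading frame, neither of which this sketch addresses.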

The nucleic acid sequences encoding ESX proteins or protein subsequences may be expressed in a variety of host cells, including E. coli, other bacterial hosts, yeast, and various higher eukaryotic cells such as the COS, CHO and HeLa cell lines and myeloma cell lines. As the ESX proteins are typically found in eukaryotes, a eukaryote host is preferred. The recombinant protein gene will be operably linked to appropriate expression control sequences for each host. For E. coli this includes a promoter such as the T7, trp, or lambda promoters, a ribosome binding site and preferably a transcription termination signal. For eukaryotic cells, the control sequences will include a promoter and preferably an enhancer derived from immunoglobulin genes, SV40, cytomegalovirus, etc., and a polyadenylation sequence, and may include splice donor and acceptor sequences.

The plasmids of the invention can be transferred into the chosen host cell by well-known methods such as calcium chloride transformation for E. coli and calcium phosphate treatment or electroporation for mammalian cells. Cells transformed by the plasmids can be selected by resistance to antibiotics conferred by genes contained on the plasmids, such as the amp, gpt, neo and hyg genes.

Once expressed, the recombinant ESX proteins can be purified according to standard procedures of the art, including ammonium sulfate precipitation, affinity columns, column chromatography, gel electrophoresis and the like (see, generally, R. Scopes, (1982) Protein Purification, Springer-Verlag, N.Y.; Deutscher (1990) Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y.). Substantially pure compositions of at least about 90 to 95% homogeneity are preferred, and 98 to 99% or more homogeneity are most preferred. Once purified, partially or to homogeneity as desired, the polypeptides may then be used (e.g., as immunogens for antibody production).

One of skill in the art would recognize that after chemical synthesis, biological expression, or purification, the ESX protein(s) may possess a conformation substantially different than the native conformations of the constituent polypeptides. In this case, it may be necessary to denature and reduce the polypeptide and then to cause the polypeptide to re-fold into the preferred conformation. Methods of reducing and denaturing proteins and inducing re-folding are well known to those of skill in the art (See, Debinski et al. (1993) J. Biol. Chem., 268: 14065-14070; Kreitman and Pastan (1993) Bioconjug. Chem., 4: 581-585; and Buchner, et al. (1992) Anal. Biochem., 205: 263-270). Debinski et al., for example, describes the denaturation and reduction of inclusion body proteins in guanidine-DTE. The protein is then refolded in a redox buffer containing oxidized glutathione and L-arginine.

One of skill would recognize that modifications can be made to the ESX proteins without diminishing their biological activity. Some modifications may be made to facilitate the cloning, expression, or incorporation of the targeting molecule into a fusion protein. Such modifications are well known to those of skill in the art and include, for example, a methionine added at the amino terminus to provide an initiation site, or additional amino acids (e.g., poly His) placed on either terminus to create conveniently located restriction sites or termination codons or purification sequences.

V. Detection of ESX acetylation.

As indicated above, abnormal (e.g., altered or deficient) expression of the human ESX gene is believed to be a causal factor in the development of various cancers (e.g., head, neck, breast, ovary, bladder, colon, etc.). In particular, the data provided herein establish the importance of the ESX gene in the etiology of carcinomas, including epithelial cancers such as breast cancer. ESX becomes overexpressed at an early stage of breast cancer known as ductal carcinoma in situ, making abnormal expression of ESX a marker for early detection of cancers. Of course, early detection can be critical to treatment efficacy. It is believed that abnormal expression of the ESX gene influences transcription of genes that are regulated by the ESX transcription factor. Thus, it is desirable to screen for and identify abnormal ESX activity.

ESX, in particular exon 7, is subject to acetylation. Moreover, without being bound to a particular theory, it is believed that the acetylation of exon 7 (polypeptide) provides a measure of activation of ESX and abnormal acetylation indicates abnormal ESX activation. Thus, in one embodiment, it is desired to assay for ESX acetylation.

A) Sample collection and processing.

Acetylation of the ESX gene product (e.g. exon 7) is preferably detected and/or quantified in a biological sample. As used herein, a biological sample is a sample of biological tissue or fluid that, in a healthy and/or pathological state, contains an ESX nucleic acid or polypeptide. Such samples include, but are not limited to, sputum, amniotic fluid, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues such as frozen sections taken for histological purposes. Often, a sample will be obtained from a cancerous or precancerous tissue. Although the sample is typically taken from a human patient, the assays can be used to detect ESX genes or gene products in samples from any mammal, such as dogs, cats, sheep, cattle, and pigs.

The sample may be pretreated as necessary by dilution in an appropriate buffer solution or concentrated, if desired. Any of a number of standard aqueous buffer solutions, employing one of a variety of buffers, such as phosphate, Tris, or the like, at physiological pH can be used.

B) Control for physiological state.

As explained herein, expression levels of the ESX gene vary with the developmental and reproductive state of the organism. Thus, for example, in mice, ESX expression is induced early in fetal development (e.g., greater than about 7 days), is substantially diminished or lost during lactation, and dramatically increases post-weaning. In light of this variation, it will be appreciated that abnormal levels of ESX expression, e.g. as characterized by acetylation state, will be determined relative to a control reflecting the sex, developmental state of the animal or human, preferably the reproductive state, and/or preferably the particular tissue or cell type as well. Thus, in a preferred embodiment, controls will be matched for one or more of these factors according to standard methods known to those of skill in the art.
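By way of illustration only, matching controls to a test sample on the factors listed above might be sketched as follows. The record layout, field names, and example values are hypothetical assumptions introduced for this example; the text does not prescribe any particular data structure or matching procedure.

```python
from dataclasses import dataclass


# Hypothetical record layout; the field names are illustrative assumptions.
@dataclass(frozen=True)
class SampleInfo:
    sample_id: str
    species: str
    sex: str
    developmental_state: str
    reproductive_state: str
    tissue: str


# Factors named in the text on which a control may be matched.
MATCH_FIELDS = ("species", "sex", "developmental_state",
                "reproductive_state", "tissue")


def matched_controls(test, candidates, fields=MATCH_FIELDS):
    """Return the candidates that agree with the test sample on every field."""
    return [c for c in candidates
            if all(getattr(c, f) == getattr(test, f) for f in fields)]


test_sample = SampleInfo("T1", "mouse", "female", "adult",
                         "post-weaning", "mammary")
controls = [
    SampleInfo("C1", "mouse", "female", "adult", "post-weaning", "mammary"),
    SampleInfo("C2", "mouse", "female", "adult", "lactating", "mammary"),
    SampleInfo("C3", "mouse", "male", "adult", "not applicable", "mammary"),
]
print([c.sample_id for c in matched_controls(test_sample, controls)])
```

Passing a shorter `fields` tuple relaxes the match to "one or more of these factors", at the cost of a less closely matched control group.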

C) ESX polypeptide assays.

The expression of the human ESX gene can also be detected and/or quantified by detecting or quantifying ESX acetylation of the expressed ESX polypeptide or a subunit thereof (e.g. exon 7). The acetylated ESX polypeptides (e.g. ratio of acetylated to non-acetylated ESX) can be detected and quantified by any of a number of means well known to those of skill in the art. These may include analytic biochemical methods such as electrophoresis, capillary electrophoresis, high performance liquid chromatography (HPLC), thin layer chromatography (TLC), hyperdiffusion chromatography, and the like, or various immunological methods such as fluid or gel precipitin reactions, immunodiffusion (single or double), immunoelectrophoresis, radioimmunoassay (RIA), enzyme-linked immunosorbent assays (ELISAs), immunofluorescent assays, western blotting, and the like.

In a particularly preferred embodiment, the acetylated and/or non-acetylated ESX polypeptides are detected in an electrophoretic protein separation, more preferably in a two-dimensional electrophoresis, while in a most preferred embodiment, the acetylated and/or non-acetylated ESX polypeptides are detected using an immunoassay.

As used herein, an immunoassay is an assay that utilizes an antibody to specifically bind to the analyte (acetylated or non-acetylated ESX polypeptide). The immunoassay is thus characterized by detection of specific binding of an acetylated or non-acetylated ESX polypeptide to an anti-ESX antibody as opposed to the use of other physical or chemical properties to isolate, target, and quantify the analyte.

i) Electrophoretic assays.

As indicated above, the acetylated and/or non-acetylated ESX polypeptides in a biological sample can be determined using electrophoretic methods. Means of detecting proteins using electrophoretic techniques are well known to those of skill in the art (see generally, Scopes (1982) Protein Purification, Springer-Verlag, N.Y.; Deutscher, (1990) Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc., N.Y.).

ii) Immunological binding assays.

In a preferred embodiment, the acetylated and/or non-acetylated ESX polypeptides are detected and/or quantified using any of a number of well recognized immunological binding assays (see, e.g., U.S. Patents 4,366,241; 4,376,110; 4,517,288; and 4,837,168). For a review of the general immunoassays, see also Asai (1993) Methods in Cell Biology Volume 37: Antibodies in Cell Biology, Academic Press, Inc. New York; Stites and Terr (1991) Basic and Clinical Immunology 7th Edition. Immunological binding assays (or immunoassays) typically utilize a "capture agent" to specifically bind to and often immobilize the analyte (in this case ESX polypeptide or subsequence). The capture agent is a moiety that specifically binds to the analyte. In a preferred embodiment, the capture agent is an antibody that specifically binds acetylated or non-acetylated ESX polypeptide(s). The antibody (anti-ESX) may be produced by any of a number of means well known to those of skill in the art as described above.

Immunoassays also often utilize a labeling agent to specifically bind to and label the binding complex formed by the capture agent and the analyte. The labeling agent may itself be one of the moieties comprising the antibody/analyte complex. Thus, the labeling agent may be a labeled ESX polypeptide or a labeled anti-ESX antibody. Alternatively, the labeling agent may be a third moiety, such as another antibody, that specifically binds to the antibody/ESX complex. In a preferred embodiment, the labeling agent is a second human ESX antibody bearing a label. Alternatively, the second ESX antibody may lack a label, but it may, in turn, be bound by a labeled third antibody specific to antibodies of the species from which the second antibody is derived. The second antibody can be modified with a detectable moiety, such as biotin, to which a third labeled molecule can specifically bind, such as enzyme-labeled streptavidin.

Other proteins capable of specifically binding immunoglobulin constant regions, such as protein A or protein G, may also be used as the labeling agent. These proteins are normal constituents of the cell walls of streptococcal bacteria. They exhibit a strong non-immunogenic reactivity with immunoglobulin constant regions from a variety of species (see, generally Kronval, et al. (1973) J. Immunol., 111: 1401-1406, and Akerstrom, et al. (1985) J. Immunol., 135: 2589-2542).

Throughout the assays, incubation and/or washing steps may be required after each combination of reagents. Incubation steps can vary from about 5 seconds to several hours, preferably from about 5 minutes to about 24 hours. However, the incubation time will depend upon the assay format, analyte, volume of solution, concentrations, and the like. Usually, the assays will be carried out at ambient temperature, although they can be conducted over a range of temperatures, such as 10°C to 40°C.

a) Non-competitive assay formats.

Immunoassays for detecting acetylated and/or non-acetylated ESX polypeptide may be either competitive or noncompetitive. Noncompetitive immunoassays are assays in which the amount of captured analyte (in this case ESX) is directly measured. In one preferred "sandwich" assay, for example, the capture agent (anti-ESX antibodies) can be bound directly to a solid substrate where they are immobilized. These immobilized antibodies then capture ESX present in the test sample. The ESX thus immobilized is then bound by a labeling agent, such as a second human ESX antibody bearing a label. Alternatively, the second ESX antibody may lack a label, but it may, in turn, be bound by a labeled third antibody specific to antibodies of the species from which the second antibody is derived. The second antibody can be modified with a detectable moiety, such as biotin, to which a third labeled molecule can specifically bind, such as enzyme-labeled streptavidin.

b) Competitive assay formats.

In competitive assays, the amount of analyte (acetylated or non-acetylated ESX) present in the sample is measured indirectly by measuring the amount of an added (exogenous) analyte (acetylated or non-acetylated ESX) displaced (or competed away) from a capture agent (anti-ESX antibody) by the analyte present in the sample. In one competitive assay, a known amount of, in this case, non-acetylated ESX is added to the sample and the sample is then contacted with a capture agent, in this case an antibody that specifically binds ESX. The amount of non-acetylated ESX bound to the antibody is inversely proportional to the concentration of non-acetylated ESX present in the sample.

In a particularly preferred embodiment, the antibody is immobilized on a solid substrate. The amount of ESX bound to the antibody may be determined either by measuring the amount of acetylated or non-acetylated ESX present in an ESX/antibody complex, or alternatively by measuring the amount of remaining uncomplexed ESX. The amount of ESX may be detected by providing a labeled ESX molecule.
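By way of illustration only, reading a sample concentration from a competitive assay might be sketched as follows. The sketch assumes a standard curve of known ESX concentrations against bound labeled-ESX signal, in which signal falls as sample ESX rises, and uses simple linear interpolation between standards. The standard values, units, and function name are hypothetical; the text does not specify a curve-fitting method.

```python
def interpolate_concentration(standards, signal):
    """Estimate analyte concentration from a competitive-assay signal.

    `standards` is a sequence of (concentration, bound_label_signal) pairs.
    In a competitive format the signal must decrease as concentration
    increases; linear interpolation between neighbours is an assumption.
    """
    pts = sorted(standards)  # ascending concentration
    for (_, s0), (_, s1) in zip(pts, pts[1:]):
        if s1 >= s0:
            raise ValueError("signal must strictly decrease with concentration")
    if not pts[-1][1] <= signal <= pts[0][1]:
        raise ValueError("signal lies outside the range of the standards")
    for (c0, s0), (c1, s1) in zip(pts, pts[1:]):
        if s1 <= signal <= s0:
            frac = (s0 - signal) / (s0 - s1)
            return c0 + frac * (c1 - c0)


# Hypothetical standard curve: (concentration, bound labeled-ESX signal).
STANDARDS = [(0.0, 100.0), (10.0, 60.0), (50.0, 20.0), (100.0, 10.0)]

print(interpolate_concentration(STANDARDS, 80.0))
print(interpolate_concentration(STANDARDS, 40.0))
```

A higher bound-label signal maps to a lower estimated sample concentration, which is the inverse relationship the competitive format relies on.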

A hapten inhibition assay is another preferred competitive assay. In this assay a known analyte, in this case acetylated or non-acetylated ESX, is immobilized on a solid substrate. A known amount of anti-(acetylated or non-acetylated) ESX antibody is added to the sample, and the sample is then contacted with the immobilized ESX. In this case, the amount of anti-ESX antibody bound to the immobilized ESX is inversely proportional to the amount of ESX present in the sample. Again the amount of immobilized antibody may be detected by detecting either the immobilized fraction of antibody or the fraction of the antibody that remains in solution. Detection may be direct where the antibody is labeled or indirect by the subsequent addition of a labeled moiety that specifically binds to the antibody as described above.

c) Other assay formats.

In a particularly preferred embodiment, Western blot (immunoblot) analysis is used to detect and quantify the presence or ratio of acetylated and/or non-acetylated ESX in the sample. The technique generally comprises separating sample proteins by gel electrophoresis on the basis of molecular weight, transferring the separated proteins to a suitable solid support (such as a nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample with the antibodies that specifically bind ESX. The anti-ESX antibodies specifically bind to ESX on the solid support. These antibodies may be directly labeled or alternatively may be subsequently detected using labeled antibodies (e.g., labeled sheep anti-mouse antibodies) that specifically bind to the anti-ESX.

Other assay formats include liposome immunoassays (LIA), which use liposomes designed to bind specific molecules (e.g., antibodies) and release encapsulated reagents or markers. The released chemicals are then detected according to standard techniques (see, Monroe et al. (1986) Amer. Clin. Prod. Rev. 5:34-41).

d) Scoring of the assay.

The assays of this invention are scored (as positive or negative for acetylated or non-acetylated ESX polypeptide) according to standard methods well known to those of skill in the art. The particular method of scoring will depend on the assay format and choice of label. For example, a Western Blot assay can be scored by visualizing the colored product produced by the enzymatic label. A clearly visible colored band or spot at the correct molecular weight is scored as a positive result, while the absence of a clearly visible spot or band is scored as a negative. In a preferred embodiment, a positive test will show a signal intensity (e.g., acetylated ESX polypeptide quantity) at least twice that of the background and/or control and more preferably at least 3 times or even at least 5 times greater than the background and/or negative control.

e) Reduction of non-specific binding.

One of skill in the art will appreciate that it is often desirable to reduce non-specific binding in immunoassays. Particularly, where the assay involves an antigen or antibody immobilized on a solid substrate it is desirable to minimize the amount of non-specific binding to the substrate. Means of reducing such non-specific binding are well known to those of skill in the art. Typically, this involves coating the substrate with a proteinaceous composition. In particular, protein compositions such as bovine serum albumin (BSA), nonfat powdered milk, and gelatin are widely used with powdered milk being most preferred.
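By way of illustration only, the signal-to-background scoring criteria described above under "Scoring of the assay" (a positive signal at least twice, more preferably at least 3 times or at least 5 times, that of the background or negative control) might be sketched as follows. The function name and return convention are assumptions introduced for this example, and the intensity values are arbitrary.

```python
# Fold-over-background tiers named in the text, most stringent first.
FOLD_TIERS = (5.0, 3.0, 2.0)


def score_assay(signal: float, background: float):
    """Score a measured signal against background or negative control.

    Returns (is_positive, highest_tier_met). A result below the 2-fold
    tier is scored negative and returns (False, None).
    """
    if background <= 0:
        raise ValueError("background must be positive")
    fold = signal / background
    for tier in FOLD_TIERS:
        if fold >= tier:
            return True, tier
    return False, None


# Arbitrary example intensities (e.g., densitometry units).
print(score_assay(10.0, 4.0))   # 2.5-fold over background
print(score_assay(30.0, 5.0))   # 6-fold over background
print(score_assay(3.0, 2.0))    # 1.5-fold over background
```

Effective blocking of non-specific binding lowers the background term and therefore directly affects which tier a given signal reaches.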

D) Labels.

The particular label or detectable group used in the assay is not a critical aspect of the invention, so long as it does not significantly interfere with the specific binding of the antibody used in the assay. The detectable group can be any material having a detectable physical or chemical property. Such detectable labels have been well-developed in the field of immunoassays and, in general, most any label useful in such methods can be applied to the present invention. Thus, a label is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include magnetic beads (e.g. Dynabeads™), fluorescent dyes (e.g., fluorescein isothiocyanate, Texas red, rhodamine, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horse radish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g. polystyrene, polypropylene, latex, etc.) beads. The label may be coupled directly or indirectly to the desired component of the assay according to methods well known in the art. As indicated above, a wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions.

Non-radioactive labels are often attached by indirect means. Generally, a ligand molecule (e.g., biotin) is covalently bound to the molecule. The ligand then binds to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or covalently bound to a signal system, such as a detectable enzyme, a fluorescent compound, or a chemiluminescent compound. A number of ligands and anti-ligands can be used. Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and cortisol, it can be used in conjunction with the labeled, naturally occurring anti-ligands. Alternatively, any haptenic or antigenic compound can be used in combination with an antibody.

The molecules can also be conjugated directly to signal generating compounds, e.g., by conjugation with an enzyme or fluorophore. Enzymes of interest as labels will primarily be hydrolases, particularly phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases. Fluorescent compounds include fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, etc. Chemiluminescent compounds include luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labeling or signal producing systems which may be used, see, U.S. Patent No. 4,391,904.

Means of detecting labels are well known to those of skill in the art. Thus, for example, where the label is a radioactive label, means for detection include a scintillation counter or photographic film as in autoradiography. Where the label is a fluorescent label, it may be detected by exciting the fluorochrome with the appropriate wavelength of light and detecting the resulting fluorescence. The fluorescence may be detected visually, by means of photographic film, by the use of electronic detectors such as charge coupled devices (CCDs) or photomultipliers and the like. Similarly, enzymatic labels may be detected by providing the appropriate substrates for the enzyme and detecting the resulting reaction product. Finally simple colorimetric labels may be detected simply by observing the color associated with the label. Thus, in various dipstick assays, conjugated gold often appears pink, while various conjugated beads appear the color of the bead.

Some assay formats do not require the use of labeled components. For instance, agglutination assays can be used to detect the presence of the target antibodies. In this case, antigen-coated particles are agglutinated by samples comprising the target antibodies. In this format, none of the components need be labeled and the presence of the target antibody is detected by simple visual inspection.

E) Substrates.

As mentioned above, depending upon the assay, various components, including the antigen, target antibody, or anti-human antibody, may be bound to a solid surface. Many methods for immobilizing biomolecules to a variety of solid surfaces are known in the art. For instance, the solid surface may be a membrane (e.g., nitrocellulose), a microtiter dish (e.g., PVC, polypropylene, or polystyrene), a test tube (glass or plastic), a dipstick (e.g. glass, PVC, polypropylene, polystyrene, latex, and the like), a microcentrifuge tube, or a glass or plastic bead. The desired component may be covalently bound or noncovalently attached through nonspecific bonding.

A wide variety of organic and inorganic polymers, both natural and synthetic, may be employed as the material for the solid surface. Illustrative polymers include polyethylene, polypropylene, poly(4-methylbutene), polystyrene, polymethacrylate, poly(ethylene terephthalate), rayon, nylon, poly(vinyl butyrate), polyvinylidene difluoride (PVDF), silicones, polyformaldehyde, cellulose, cellulose acetate, nitrocellulose, and the like. Other materials which may be employed include paper, glasses, ceramics, metals, metalloids, semiconductive materials, cements or the like. In addition, substances that form gels, such as proteins (e.g., gelatins), lipopolysaccharides, silicates, agarose and polyacrylamides, can be used. Polymers which form several aqueous phases, such as dextrans, polyalkylene glycols or surfactants, such as phospholipids, long chain (12-24 carbon atoms) alkyl ammonium salts and the like are also suitable. Where the solid surface is porous, various pore sizes may be employed depending upon the nature of the system.

In preparing the surface, a plurality of different materials may be employed, particularly as laminates, to obtain various properties. For example, protein coatings, such as gelatin can be used to avoid non-specific binding, simplify covalent conjugation, enhance signal detection or the like.

If covalent bonding between a compound and the surface is desired, the surface will usually be polyfunctional or be capable of being polyfunctionalized. Functional groups which may be present on the surface and used for linking can include carboxylic acids, aldehydes, amino groups, cyano groups, ethylenic groups, hydroxyl groups, mercapto groups and the like. The manner of linking a wide variety of compounds to various surfaces is well known and is amply illustrated in the literature (see, e.g., Chibata (1978) Immobilized Enzymes, Halsted Press, New York, and Cuatrecasas (1970) J. Biol. Chem. 245: 3059).

In addition to covalent bonding, various methods for noncovalently binding an assay component can be used. Noncovalent binding is typically nonspecific adsorption of a compound to the surface. Typically, the surface is blocked with a second compound to prevent nonspecific binding of labeled assay components. Alternatively, the surface is designed such that it nonspecifically binds one component but does not significantly bind another. For example, a surface bearing a lectin such as Concanavalin A will bind a carbohydrate containing compound but not a labeled protein that lacks glycosylation. Various solid surfaces for use in noncovalent attachment of assay components are reviewed in U.S. Patent Nos. 4,447,576 and 4,254,082.

F) Evaluation of ESX acetylation levels and/or abnormal expression.

One of skill will appreciate that abnormal expression levels or abnormal expression products (e.g., mutated transcripts, truncated or non-sense polypeptides) are identified by comparison to normal expression levels and normal expression products. Normal levels of expression or normal expression products can be determined for any particular population, subpopulation, or group of organisms according to standard methods well known to those of skill in the art. Typically this involves identifying healthy organisms and/or tissues (i.e. organisms and/or tissues without ESX expression dysregulation or neoplastic growth) and measuring expression levels of the ESX gene (as described herein) or sequencing the gene, mRNA, or reverse transcribed cDNA, to obtain typical (normal) sequence variations. Application of standard statistical methods used in molecular genetics permits determination of baseline levels of expression, and normal gene products as well as significant deviations from such baseline levels.

Preferably, normal levels of expression are determined using a control organism or tissue that is in a physiological milieu that is similar to that of the test sample. For example, ESX expression can be influenced by age of the organism, pregnancy, menopause, and day of menstrual cycle, among other factors. Therefore, it is preferred to choose as a control tissue one that is at a similar stage as the tissue being tested for abnormal ESX expression. For example, a tissue known to be healthy can be obtained from the same organism from which the test tissue is obtained.
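By way of illustration only, comparing a test sample's acetylation level to a baseline established from matched healthy controls might be sketched as follows. The sketch assumes the acetylation level is summarized as the ratio of acetylated to non-acetylated ESX and uses a z-score with a cutoff of 2 as the measure of significant deviation; both the statistic and the cutoff are assumptions chosen for the example, and all numerical values are hypothetical.

```python
from statistics import mean, stdev


def acetylation_ratio(acetylated: float, non_acetylated: float) -> float:
    """Ratio of acetylated to non-acetylated ESX signal."""
    if non_acetylated <= 0:
        raise ValueError("non-acetylated signal must be positive")
    return acetylated / non_acetylated


def deviation_from_baseline(test_value, control_values, z_cutoff=2.0):
    """Return (z_score, is_abnormal) for a test value against controls.

    The baseline is the mean of the control values; spread is the sample
    standard deviation. The z-score cutoff is an illustrative assumption.
    """
    if len(control_values) < 2:
        raise ValueError("at least two control values are required")
    mu = mean(control_values)
    sd = stdev(control_values)
    if sd == 0:
        raise ValueError("controls show no variation; z-score undefined")
    z = (test_value - mu) / sd
    return z, abs(z) >= z_cutoff


# Hypothetical ratios from matched healthy control tissue.
CONTROL_RATIOS = [1.0, 1.2, 0.8, 1.0]

test_ratio = acetylation_ratio(acetylated=40.0, non_acetylated=20.0)
z, abnormal = deviation_from_baseline(test_ratio, CONTROL_RATIOS)
print(f"ratio={test_ratio:.2f} z={z:.2f} abnormal={abnormal}")
```

With only a handful of controls the estimated spread is unstable, so in practice a larger, physiologically matched control set would be needed before any cutoff is meaningful.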

VI. Detection kits.

The present invention also provides for kits for the diagnosis of organisms (e.g., patients) with a predisposition (at risk) for carcinomas, including epithelial cancers. The kits preferably include one or more reagents for determining the presence or absence or degree of acetylation of ESX, for quantifying expression of the ESX gene, or for detecting an abnormal ESX gene (amplified or rearranged), or expression products of an abnormal ESX gene. Preferred reagents include nucleic acid probes that specifically bind to the normal ESX gene, cDNA, or subsequence thereof, probes that specifically bind to abnormal ESX gene (e.g., ESX genes containing premature truncations, insertions, or deletions), antibodies that specifically bind to normal ESX polypeptides (e.g. acetylated or non-acetylated) or subsequences thereof, or antibodies that specifically bind to abnormal ESX polypeptides or subsequences thereof. The antibody or hybridization probe may be free or immobilized on a solid support such as a test tube, a microtiter plate, a dipstick and the like. The kit may also contain instructional materials teaching the use of the antibody or hybridization probe in an assay for the detection of a predisposition for ESX.

The kits may include alternatively, or in combination with any of the other components described herein, an anti-ESX antibody. The antibody can be monoclonal or polyclonal. The antibody can be conjugated to another moiety such as a label and/or it can be immobilized on a solid support (substrate).

The kit(s) may also contain a second antibody for detection of ESX polypeptide/antibody complexes or for detection of hybridized nucleic acid probes. The kit may contain appropriate reagents for detection of labels, positive and negative controls, washing solutions, dilution buffers and the like.

VII. Gene and/or cDNA activation or squelching using ESX

A) Gene and/or cDNA upregulation (transactivation).

In another embodiment, this invention provides methods of activating (upregulating) or inactivating (downregulating) gene expression. It is demonstrated herein (e.g. Example 7) that ESX, in particular exon 4, is a potent transactivator when attached to a nucleic acid (e.g. DNA) binding domain. Thus, constructs comprising ESX, ESX exon 4, or variants thereof, attached (e.g. chemically conjugated or recombinantly expressed) to a nucleic acid binding domain can be used to target and upregulate selected genes. The target genes (or cDNAs) can be endogenous or heterologous genes. They can be integrated into the host genome or can be non-integrated (e.g. extra-chromosomal).

In preferred embodiments, the target genes/cDNAs are under the control of a promoter having an ETS response element. The promoter can be a naturally occurring promoter having such a response element (e.g. an epithelial gene promoter) or it can be a promoter engineered to contain such a response element.

Epithelial gene promoters are known to those of skill in the art. Indeed, ESX contains one such promoter and it is unique among transcription factors generally, and Ets factors specifically, for its restricted expression in terminally differentiated epidermal cells (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Choi et al. (1998) J. Biol. Chem., 273: 110-117). In stratified epithelium, ESX is thought to transactivate such genes as the transforming growth factor-β type II receptor (TGF-βRII), Endo-A/keratin-8, and several markers of epidermal cell differentiation including transglutaminase 3, SPRR2A, and profilaggrin (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Choi et al. (1998) J. Biol. Chem., 273: 110-117; Oettgen et al. (1997) Mol. Cell. Biol., 17: 4419-4433; Tymms et al. (1997) Oncogene, 15, 2449-2462).

Nucleic acid binding proteins (domains) include, but are not limited to, DNA binding proteins such as Fis, LacI, lambda cI, lambda cro, LexA, TrpR, ArgR, AraC, CRP, FNR, OxyR, IHF, GalR, MalT, LRP, SoxR, SoxS, sigma factors, chi, T4 MotA, P1 RepA, p53, NF-kappa-B, GAL4, and the like. A large number of nucleic acid binding proteins are described in the TransFac database (see also (1997) Nucleic Acids Res. 25(1) 265-268).

Methods of coupling the nucleic acid binding domain to the ESX polypeptide are well known to those of skill in the art. Chemical conjugation is preferably by way of a linker. A "linker" as used herein, is a molecule that is used to join the targeting molecule to the effector molecule. The linker is capable of forming covalent bonds to both the DNA binding domain and to the ESX polypeptide. Suitable linkers are well known to those of skill in the art and include, but are not limited to, straight or branched-chain carbon linkers, heterocyclic carbon linkers, or peptide connectors. The linkers may be joined to the constituent amino acids through their side groups (e.g., through a disulfide linkage to cysteine). However, in a preferred embodiment, the connectors will be joined to the alpha carbon amino and carboxyl groups of the terminal amino acids.

Many procedures and linker molecules for attachment of various polypeptides are known (see, e.g., European Patent Application No. 188,256; U.S. Patent Nos. 4,545,985, 4,894,443, 4,671,958, 4,659,839, 4,414,148, 4,699,784, 4,680,338, 4,569,789, and 4,589,071; and Borlinghaus et al. (1987) Cancer Res. 47: 4071-4075; Waldmann (1991) Science, 252: 1657).

In a preferred embodiment, the nucleic acid binding domain and the ESX polypeptide are expressed as a fusion protein. Methods of preparing fusion proteins are well known to those of skill in the art and are illustrated in Example 7.

B) Gene and/or cDNA down regulation (squelching).

In another embodiment, the ESX polypeptide (e.g. exon 4 polypeptide) can be used to down regulate gene expression. It is demonstrated herein that exon 4 polypeptides are capable of squelching (downregulating) activity of a target gene or cDNA. Squelching occurs when a potent transactivator reduces the expression of a gene or cDNA (e.g. a co-transfected reporter plasmid in a transient transfection assay), with the resulting decline in activity believed to be due to sequestration of general transcription factors (GTFs) and reduction in their effective concentration (Natesan et al. (1997) Nature, 390: 349-350). Without being bound to a particular theory, it is believed that squelching is mediated by high-affinity binding of ESX exon 4 polypeptide to a limiting component of the basic transcriptional machinery, TATA-binding protein (TBP). When TBP is sequestered by excess ESX (e.g. exon 4) unbound to DNA, squelching of TBP-dependent gene expression can occur. Thus provision of a cell with excess ESX exon 4 (e.g. transfecting the cell with an ESX exon 4-expressing vector) can down-regulate a gene or cDNA. As with transactivation, preferred target genes/cDNAs have a native or engineered Ets response element.

VIII. Probes.

In another embodiment, the ESX polypeptides of ESX exon 4 and/or ESX exon 7 can be used as probes to identify in vivo, in vitro, or in situ naturally occurring molecules or test agents that interact with these ESX domains. Thus, for example, the polypeptide expressed by ESX exon 4 interacts with other intracellular proteins (e.g., TATA-binding protein (TBP)). Similarly, ESX exon 7 can interact with proteins (e.g. histone acetyl transferase (HAT) domain from the nuclear co-regulatory factor, pCAF, (pCAF/HAT)) and nucleic acids (it is a sequence-nonspecific DNA-binding motif that recognizes and stabilizes architecturally irregular DNA structures like the H-DNA form thought to be present within the GGA mirror-repeat and nuclear matrix binding region of the erbB2 proximal promoter). Thus, ESX exon 4 and/or ESX exon 7 can be used to probe organisms, tissues and cells for binding proteins and/or nucleic acids. In a preferred embodiment, this is accomplished simply by labeling the ESX exon 4 or exon 7 polypeptide with a detectable label, treating the sample animal, tissue, or cell with the labeled polypeptide, and detecting or isolating the ESX probe to localize and/or isolate molecules interacting with the probe.

IX. Affinity matrix for isolating exon 4 and/or exon 7 binding molecules.

Alternatively, the ESX exon 4 and/or exon 7 polypeptides can be used, e.g., in an affinity matrix (e.g. affinity column) to isolate targets (e.g. proteins or nucleic acids that interact with these ESX domains). Briefly, in one embodiment, affinity chromatography involves immobilizing (e.g. on a solid support) polypeptides comprising the ESX exon 4 and/or exon 7 polypeptides. Cells, cellular lysate, cellular homogenate, or other samples are then contacted with the immobilized polypeptide, which binds those components of the sample that interact with that polypeptide. The remaining material is then washed away and the bound molecule(s) can then be released from the ESX polypeptide(s) for further use. Methods of performing affinity chromatography are well known to those of skill in the art (see, e.g., U.S. Patent Nos: 5,710,254, 5,491,096, 5,278,061, 5,110,907, 4,985,144, 4,385,991, 3,983,001, etc.).

Suitable matrix materials include, but are not limited to, paper, glasses, ceramics, gels, aerogels, metals, metalloids, polyacryloylmorpholide, various plastics and plastic copolymers such as Nylon™, Teflon™, polyethylene, polypropylene, poly(4-methylbutene), polystyrene, polystyrene/latex, polymethacrylate, poly(ethylene terephthalate), rayon, nylon, poly(vinyl butyrate), polyvinylidene difluoride (PVDF), silicones, polyformaldehyde, cellulose, cellulose acetate, nitrocellulose, and the like, and other materials generally known to be suitable for use in affinity columns (e.g. HPLC columns).

X. ESX modulation/therapeutics.

The ESX polypeptide appears to be an extremely strong gene transactivator, as revealed by GAL4 fusion studies showing that the ESX amino acid sequences encoded by ESX exon 4 are as powerful as the transactivating sequences of VP16, one of the strongest transactivators known and most often used as a positive control in GAL4 fusion studies. These studies indicate that ESX is most likely "turning on" rather than "turning off" all the genes under its control (e.g., growth factor receptors such as erbB2, and extracellular matrix proteases such as MMPs and uPA). Up-regulation of ESX will therefore turn on (e.g., transactivate) genes under ESX control, while down-regulation of ESX will turn off genes under ESX control.

A) Screening for ESX modulation.

As indicated earlier, ESX controls a number of functions including, but not limited to, remodeling of ductal epithelium and regulation of gene programs involved with this process (e.g. extracellular matrix degradation, apoptosis, etc.). In particular, control of extracellular matrix degradation or apoptosis appears to be essential for enhanced tumor cell invasion and metastasis. Modulation of such functions is useful in both a research and a therapeutic context. Thus, in one embodiment, this invention provides methods of screening for agents that modulate (e.g., up-regulate (turn on or increase) or down-regulate (turn off or decrease)) ESX expression or ESX polypeptide activity.

In a preferred embodiment, such methods involve contacting a cell or an isolated system (e.g. a solution) containing an endogenous or heterologous ESX exon 4 or exon 7 DNA or polypeptide with the agent that is to be screened for ESX modulatory activity and detecting binding of that agent to ESX exon 4 and/or exon 7 nucleic acid or polypeptide. Alternatively, a change in ESX acetylation (e.g. exon 7 acetylation) can be detected.

Methods of assaying for binding interactions are well known to those of skill in the art. Such methods include, for example, DNA bending assays (see, e.g., Wechsler and Dang (1992) Proc. Natl. Acad. Sci. USA, 89: 7635-7639 with modifications to prevent anomalous results described by McCormick et al. (1996) Proc. Natl. Acad. Sci. USA, 93: 14434-14439), and more traditional binding assays such as transcription factor binding assays (see, e.g., U.S. Patents 5,350,835 and 5,563,036). It was a discovery of this invention that the minimal ESX domain necessary for ESX-mediated transactivation is encoded by exon 4 (aa 129-159), an acidic domain containing a central lysine residue (K-145). Subsequent mutations of this domain have established that the central K-145 is essential and provides nearly 1000-fold transactivation potency (relative to a neutral residue placed there). A database search revealed that the exon 4-encoded domain is homologous to the essential core domain of all known Topoisomerase I molecules (cf. Stewart et al. (1996) J. Biol. Chem. 271: 7602-7608; Pommier (1996) Sem. Oncology 23: 3-10). Human Topo-I is a critical intracellular target for the newest and most exciting family of camptothecin-like anticancer agents (like Topotecan, CPT-11, 9AC, etc.; see reviews).

This information not only provides important data regarding the molecular transactivation mechanism of ESX, but it suggests that this particular ESX domain may be used to search for or screen (from libraries, e.g., combinatorial libraries of synthetic chemicals and/or natural products) for even newer and more effective and selective anticancer agents. Existing Topo-I agents target a very different, C-terminal conserved domain in the Topo-I enzyme. Prior to this invention there was no specific function attributed to the highly conserved Topo-I Core domain which is homologous to the ESX transactivation domain.

These data also shed light on the functioning of Topo-I (and new ways to inhibit it) as they do on the functioning of ESX. In this regard, this invention provides, in one embodiment, methods of screening for a therapeutic lead compound. The methods involve (i) providing a nucleic acid encoding a polypeptide of ESX exon 4 or a polypeptide sequence of ESX exon 4; (ii) contacting the compound to the nucleic acid or polypeptide sequence; and (iii) detecting binding of the compound to the nucleic acid or polypeptide sequence. Compounds that specifically bind to the exon 4 nucleic acid and/or polypeptide are expected to provide lead compounds for therapeutic evaluation and/or development. Suitable binding assays are described herein and are also well known to those of skill in the art.

B) ESX Modulators for screening.

Virtually any compound can be screened for ESX modulatory activity. However, it will be appreciated that some compounds are expected to show ESX modulatory activity and these compounds may be preferentially screened. Such compounds include, but are not limited to, compounds that specifically target and bind to ESX exon 4 and/or exon 7 nucleic acids or polypeptides (e.g., ESX muteins, or ESX antisense molecules).

i) ESX muteins.

It was a discovery of this invention that full-length ESX bends DNA by as much as 80 degrees upon DNA-binding. In contrast, when only the DNA-binding portion of ESX (see, Fig. 5), or of any other ETS protein, is assessed, only 6-20 degrees of DNA bending is observed (as reported by NMR and X-ray crystallography studies on other truncated ETS proteins). This indicates that a mutated version of a full DNA bending ESX construct can act as a "dominant-negative" transcription factor, or can be fused to a known repression module, to produce an agent that will silence ESX regulated genes and turn off potential gene programs necessary for tumor cell invasion and metastasis. Using the sequence information provided herein (e.g., Fig. 5), ESX polypeptide variants can be routinely produced.

For example, it is demonstrated herein that the central K¹⁴⁵ of exon 4 (aa 129-159) is essential for ESX transactivation activity and provides nearly 1000-fold transactivation potency (relative to a neutral residue placed there). The mutation of K¹⁴⁵ to a neutral residue will provide an inactivating (competitive) mutein. Methods of making other such polypeptide variants (muteins) are well known to those of skill (see, e.g., U.S. Patents 5,486,463, 5,422,260, 5,116,943, 4,752,585, 4,518,504). Screening of such polypeptides (e.g., in DNA binding assays or for competitive inhibition of full-length normal ESX polypeptides) can be accomplished with only routine experimentation. Using high-throughput methods, as described herein, literally thousands of agents can be screened in only a day or two.

ii) Antisense molecules.

ESX gene expression can be downregulated or entirely inhibited by the use of antisense molecules. An "antisense sequence or antisense nucleic acid" is a nucleic acid that is complementary to the coding ESX mRNA nucleic acid sequence or a subsequence thereof. Binding of the antisense molecule to the ESX mRNA interferes with normal translation of the ESX polypeptide.
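The complementarity relationship just described can be illustrated with a minimal Python sketch. The target fragment below is a made-up RNA sequence used purely as a placeholder (it is not ESX sequence), and the function names are hypothetical; the sketch only shows how an antisense strand is derived as the reverse complement of a chosen mRNA subsequence.

```python
# Illustrative only: the target is a made-up RNA fragment, NOT ESX sequence.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_mrna: str) -> str:
    """Return the antisense RNA (reverse complement), written 5'->3'."""
    target = target_mrna.upper()
    unknown = set(target) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # Pair each base, reading the target from its 3' end to its 5' end.
    return "".join(COMPLEMENT[base] for base in reversed(target))


def is_fully_complementary(oligo: str, target_mrna: str) -> bool:
    """True if the oligo pairs base-for-base with the target (antiparallel)."""
    return antisense_rna(oligo) == target_mrna.upper()


hypothetical_target = "AUGGCUGCAACC"  # placeholder subsequence
oligo = antisense_rna(hypothetical_target)
print(oligo)
print(is_fully_complementary(oligo, hypothetical_target))
```

In practice the target subsequence would be taken from the ESX mRNA itself (e.g. within exon 4 or exon 7), and a DNA or modified-backbone oligonucleotide would use the corresponding base alphabet.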

Thus, in accordance with preferred embodiments of this invention, preferred antisense molecules include oligonucleotides and oligonucleotide analogs that are hybridizable with ESX messenger RNA (preferably with exon 4 or exon 7). This relationship is commonly denominated as "antisense." The oligonucleotides and oligonucleotide analogs are able to inhibit the function of the RNA, either its translation into protein, its translocation into the cytoplasm, or any other activity necessary to its overall biological function. The failure of the messenger RNA to perform all or part of its function results in a reduction or complete inhibition of expression of ESX polypeptides. In the context of this invention, the term "oligonucleotide" refers to a polynucleotide formed from naturally-occurring bases and/or cyclofuranosyl groups joined by native phosphodiester bonds. This term effectively refers to naturally-occurring species or synthetic species formed from naturally-occurring subunits or their close homologs. The term "oligonucleotide" may also refer to moieties which function similarly to oligonucleotides, but which have non naturally-occurring portions. Thus, oligonucleotides may have altered sugar moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing species which are known for use in the art. In accordance with some preferred embodiments, at least one of the phosphodiester bonds of the oligonucleotide has been substituted with a structure which functions to enhance the ability of the compositions to penetrate into the region of cells where the RNA whose activity is to be modulated is located. It is preferred that such substitutions comprise phosphorothioate bonds, methyl phosphonate bonds, or short chain alkyl or cycloalkyl structures. In accordance with other preferred embodiments, the phosphodiester bonds are substituted with structures which are, at once, substantially non-ionic and non-chiral, or with structures which are chiral and enantiomerically specific. Persons of ordinary skill in the art will be able to select other linkages for use in the practice of the invention.

Oligonucleotides may also include species which include at least some modified base forms. Thus, purines and pyrimidines other than those normally found in nature may be so employed. Similarly, modifications on the furanosyl portions of the nucleotide subunits may also be effected, as long as the essential tenets of this invention are adhered to. Examples of such modifications are 2'-O-alkyl- and 2'-halogen-substituted nucleotides. Some specific examples of modifications at the 2' position of sugar moieties which are useful in the present invention are OH, SH, SCH₃, F, OCH₃, OCN, O(CH₂)ₙNH₂ or O(CH₂)ₙCH₃, where n is from 1 to about 10, and other substituents having similar properties.

Such oligonucleotides are best described as being functionally interchangeable with natural oligonucleotides or synthesized oligonucleotides along natural lines, but which have one or more differences from natural structure. All such analogs are comprehended by this invention so long as they function effectively to hybridize with messenger RNA of ESX to inhibit the function of that RNA.

The oligonucleotides in accordance with this invention preferably comprise from about 3 to about 50 subunits. It is more preferred that such oligonucleotides and analogs comprise from about 8 to about 25 subunits and still more preferred to have from about 12 to about 20 subunits. As will be appreciated, a subunit is a base and sugar combination suitably bound to adjacent subunits through phosphodiester or other bonds. The oligonucleotides used in accordance with this invention may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors, including Applied Biosystems. Any other means for such synthesis may also be employed; however, the actual synthesis of the oligonucleotides is well within the talents of the routineer. It is also well known to prepare other oligonucleotides such as phosphorothioates and alkylated derivatives.

iii) Combinatorial libraries (e.g., small organic molecules).

Conventionally, new chemical entities with useful properties are generated by identifying a chemical compound (called a "lead compound") with some desirable property or activity, creating variants of the lead compound, and evaluating the property and activity of those variant compounds. However, the current trend is to shorten the time scale for all aspects of drug discovery. Because of the ability to test large numbers quickly and efficiently, high throughput screening (HTS) methods are replacing conventional lead compound identification methods.

In one preferred embodiment, high throughput screening methods involve providing a library containing a large number of potential therapeutic compounds (candidate compounds). Such "combinatorial chemical libraries" are then screened in one or more assays, as described below, to identify those library members (particular chemical species or subclasses) that display a desired characteristic activity. The compounds thus identified can serve as conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical "building blocks" such as reagents. For example, a linear combinatorial chemical library such as a polypeptide (e.g., mutein) library is formed by combining a set of chemical building blocks called amino acids in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds can be synthesized through such combinatorial mixing of chemical building blocks. For example, one commentator has observed that the systematic, combinatorial mixing of 100 interchangeable chemical building blocks results in the theoretical synthesis of 100 million tetrameric compounds or 10 billion pentameric compounds (Gallop et al. (1994) J. Med. Chem. 37(9): 1233-1250). Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide libraries (see, e.g., U.S. Patent 5,010,175, Furka (1991) Int. J. Pept. Prot. Res., 37: 487-493, Houghton et al. (1991) Nature, 354: 84-88). Peptide synthesis is by no means the only approach envisioned and intended for use with the present invention. Other chemistries for generating chemical diversity libraries can also be used. Such chemistries include, but are not limited to: peptoids (PCT Publication No WO 91/19735, 26 Dec. 1991), encoded peptides (PCT Publication WO 93/20242, 14 Oct. 1993), random bio-oligomers (PCT Publication WO 92/00091, 9 Jan. 1992), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al. (1993) Proc. Nat. Acad. Sci. USA 90: 6909-6913), vinylogous polypeptides (Hagihara et al. (1992) J. Amer. Chem. Soc. 114: 6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann et al. (1992) J. Amer. Chem. Soc. 114: 9217-9218), analogous organic syntheses of small compound libraries (Chen et al. (1994) J. Amer. Chem. Soc. 116: 2661), oligocarbamates (Cho et al. (1993) Science 261: 1303), and/or peptidyl phosphonates (Campbell et al. (1994) J. Org. Chem. 59: 658). See, generally, Gordon et al. (1994) J. Med. Chem. 37: 1385, nucleic acid libraries (see, e.g., Stratagene Corp.), peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology, 14(3): 309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al. (1996) Science, 274: 1520-1522, and U.S. Patent 5,593,853), and small organic molecule libraries (see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33, isoprenoids U.S. Patent 5,569,588, thiazolidinones and metathiazanones U.S. Patent 5,549,974, pyrrolidines U.S. Patents 5,525,735 and 5,519,134, morpholino compounds U.S. Patent 5,506,337, benzodiazepines 5,288,514, and the like). Devices for the preparation of combinatorial libraries are commercially available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, Bedford, MA).
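The library-size figures quoted above follow from simple counting: with b interchangeable building blocks and compounds of length n, there are b to the power n linear combinations. The following Python sketch, in which the function names and the three-letter toy building-block set are illustrative assumptions, reproduces that arithmetic and enumerates a very small library explicitly.

```python
from itertools import product
from typing import Iterator, Sequence


def library_size(building_blocks: int, length: int) -> int:
    """Number of linear compounds of a given length: blocks ** length."""
    return building_blocks ** length


def enumerate_library(blocks: Sequence[str], length: int) -> Iterator[str]:
    """Yield every linear combination of the blocks (feasible for tiny sets only)."""
    for combo in product(blocks, repeat=length):
        yield "".join(combo)


# Figures cited in the text: 100 building blocks.
print(library_size(100, 4))  # tetramers: 100 million
print(library_size(100, 5))  # pentamers: 10 billion

# Toy example with three hypothetical building blocks labeled A, B, C.
toy = list(enumerate_library("ABC", 2))
print(len(toy), toy)
```

The same count applied to the 20 common amino acids gives 20 to the power n peptides of length n, which is why exhaustive enumeration quickly becomes impractical and screening throughput matters.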

A number of well known robotic systems have also been developed for solution phase chemistries. These systems include automated workstations like the automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.) which mimic the manual synthetic operations performed by a chemist. Any of the above devices are suitable for use with the present invention. The nature and implementation of modifications to these devices (if any) so that they can operate as discussed herein will be apparent to persons skilled in the relevant art. In addition, numerous combinatorial libraries are themselves commercially available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences, Columbia, MD, etc.).

C) High-throughput screening.

Any of the assays for compounds modulating ESX gene expression and/or ESX protein activity (e.g., binding activity) described herein are amenable to high throughput screening. Preferred assays thus detect enhancement or inhibition of ESX gene transcription, inhibition or enhancement of ESX polypeptide expression, inhibition or enhancement of DNA binding by ESX polypeptide, or inhibition or enhancement of expression of native genes (or reporter genes) under control of the ESX polypeptide.

High throughput assays for the presence, absence, or quantification of particular nucleic acids or protein products are well known to those of skill in the art. Binding assays and reporter gene assays are similarly well known. Thus, for example, U.S. Patent 5,559,410 discloses high throughput screening methods for proteins, U.S. Patent 5,585,639 discloses high throughput screening methods for nucleic acid binding (i.e., in arrays), while U.S. Patents 5,576,220 and 5,541,061 disclose high throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems typically automate entire procedures including all sample and reagent pipetting, liquid dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for the assay. These configurable systems provide high throughput and rapid start up as well as a high degree of flexibility and customization. The manufacturers of such systems provide detailed protocols for the various high throughput assays. Thus, for example, Zymark Corp. provides technical bulletins describing screening systems for detecting the modulation of gene transcription, ligand binding, and the like.

XI. In vivo administration of ESX modulators.

The ESX polypeptides, ESX polypeptide subsequences (e.g. exon 4 and/or exon 7), anti-ESX antibodies, anti-ESX antibody-effector (e.g., enzyme, toxin, hormone, growth factor, drug, etc.) conjugates or fusion proteins, or other ESX modulators of this invention are useful for parenteral, topical, oral, or local administration, such as by aerosol or transdermally, for prophylactic and/or therapeutic treatment. The pharmaceutical compositions can be administered in a variety of unit dosage forms depending upon the method of administration. For example, unit dosage forms suitable for oral administration include powder, tablets, pills, capsules and lozenges. It is recognized that the ESX polypeptides and related compounds described herein, when administered orally, must be protected from digestion. This is typically accomplished either by complexing the protein with a composition to render it resistant to acidic and enzymatic hydrolysis or by packaging the protein in an appropriately resistant carrier such as a liposome. Means of protecting proteins from digestion are well known in the art.

The pharmaceutical compositions of this invention are particularly useful for topical administration to cancers, in particular epithelial cancers, and their precursors (such as ductal carcinoma in situ, DCIS). In another embodiment, the compositions are useful for parenteral administration, such as intravenous administration or administration into a body cavity or lumen of an organ. The compositions for administration will commonly comprise a solution of the ESX polypeptide, antibody or antibody chimera/fusion dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions are sterile and generally free of undesirable matter. These compositions may be sterilized by conventional, well known sterilization techniques. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like, for example, sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate and the like. The concentration of chimeric molecule in these formulations can vary widely, and will be selected primarily based on fluid volumes, viscosities, body weight and the like in accordance with the particular mode of administration selected and the patient's needs.

Thus, a typical pharmaceutical composition for intravenous administration would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per day may be used, particularly when the drug is administered to a secluded site and not into the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher dosages are possible in topical administration. Actual methods for preparing parenterally administrable compositions will be known or apparent to those skilled in the art and are described in more detail in such publications as Remington's Pharmaceutical Science, 15th ed., Mack Publishing Company, Easton, Pennsylvania (1980). The compositions containing the present ESX polypeptides, antibodies or antibody chimera/fusions, or a cocktail thereof (i.e., with other proteins), can be administered for therapeutic treatments. To treat an epithelial cancer characterized by overexpression of ESX, one can administer an anti-ESX antibody or an abnormal ESX protein that is not biologically active. Such inactive ESX polypeptides can, for example, interfere with binding of native ESX polypeptide to its DNA binding site, or to RNA polymerase or other protein through which the ESX transcription factor activity is mediated.

In therapeutic applications, compositions are administered to a patient suffering from a disease (e.g., an epithelial cancer) in an amount sufficient to cure or at least partially arrest the disease and its complications. An amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts effective for this use will depend upon the severity of the disease and the general state of the patient's health. Single or multiple administrations of the compositions may be administered depending on the dosage and frequency as required and tolerated by the patient. In any event, the composition should provide a sufficient quantity of the proteins of this invention to effectively treat the patient.

Among various uses of the ESX polypeptides, polypeptide subsequences, anti-ESX antibodies and anti-ESX-effector chimeras/fusions of the present invention is treatment of a variety of disease conditions, including cancers such as cancers of the breast, head, neck, ovary, bladder, colon, and the like.

XII. Cellular transformation and gene therapy.

The present invention provides packageable human ESX nucleic acids (e.g. nucleic acids encoding exon 4, and/or exon 4 attached to a nucleic acid binding domain) for the transformation of cells in vitro and in vivo. These packageable nucleic acids can be inserted into any of a number of well known vectors for the transfection and transformation of target cells and organisms as described below. The nucleic acids are transfected into cells, ex vivo or in vivo, through the interaction of the vector and the target cell. The nucleic acid, under the control of a promoter, then expresses the ESX protein construct, thereby upregulating or downregulating the target gene. For treatment of conditions characterized by excessive ESX expression, the ESX construct can be one that downregulates the target gene (e.g. a polypeptide comprising exon 4 without a nucleic acid binding domain). Conversely, in conditions characterized by under expression of a target gene, the cell(s) can be transfected with a construct comprising a DNA binding domain specific to a domain adjacent to or in proximity of the target gene attached to an ESX exon 4 transactivator. This will be most useful where the target gene promoter contains an Ets response element. Such gene therapy procedures have been used to correct acquired and inherited genetic defects, cancer, and viral infection in a number of contexts. The ability to express artificial genes in humans facilitates the prevention and/or cure of many important human diseases, including many diseases which are not amenable to treatment by other therapies. As an example, in vivo expression of cholesterol-regulating genes, genes which selectively block the replication of HIV, and tumor-suppressing genes in human patients dramatically improves the treatment of heart disease, AIDS, and cancer, respectively.
For a review of gene therapy procedures, see Anderson (1992) Science 256: 808-813; Nabel and Felgner (1993) TIBTECH 11: 211-217; Mitani and Caskey (1993) TIBTECH 11: 162-166; Mulligan (1993) Science 926-932; Dillon (1993) TIBTECH 11: 167-175; Miller (1992) Nature 357: 455-460; Van Brunt (1988) Biotechnology 6(10): 1149-1154; Vigne (1995) Restorative Neurology and Neuroscience 8: 35-36; Kremer and Perricaudet (1995) British Medical Bulletin 51(1) 31-44; Haddada et al. (1995) in Current Topics in Microbiology and Immunology, Doerfler and Böhm (eds) Springer-Verlag, Heidelberg Germany; and Yu et al. (1994) Gene Therapy 1: 13-26.

Delivery of the gene or genetic material into the cell is the first critical step in gene therapy treatment of disease. A large number of delivery methods are well known to those of skill in the art. Such methods include, for example, liposome-based gene delivery (Debs and Zhu (1993) WO 93/24640; Mannino and Gould-Fogerite (1988) BioTechniques 6(7): 682-691; Rose U.S. Pat No. 5,279,833; Brigham (1991) WO 91/06309; and Felgner et al. (1987) Proc. Natl. Acad. Sci. USA 84: 7413-7414), and replication-defective retroviral vectors harboring a therapeutic polynucleotide sequence as part of the retroviral genome (see, e.g., Miller et al. (1990) Mol. Cell. Biol. 10: 4239; Kolberg (1992) J. NIH Res. 4: 43, and Cornetta et al. (1991) Hum. Gene Ther. 2: 215). Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al. (1992) J. Virol. 66(5): 2731-2739; Johann et al. (1992) J. Virol. 66(5): 1635-1640; Sommerfelt et al. (1990) Virol. 176: 58-59; Wilson et al. (1989) J. Virol. 63: 2374-2378; Miller et al. (1991) J. Virol. 65: 2220-2224; Wong-Staal et al., PCT/US94/05700, Rosenburg and Fauci (1993) in Fundamental Immunology, Third Edition, Paul (ed), Raven Press, Ltd., New York and the references therein, and Yu et al. (1994) Gene Therapy supra).

AAV-based vectors are also used to transduce cells with target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and in in vivo and ex vivo gene therapy procedures (see, West et al. (1987) Virology 160: 38-47; Carter et al. (1989) U.S. Patent No. 4,797,368; Carter et al. WO 93/24641 (1993); Kotin (1994) Human Gene Therapy 5: 793-801; Muzyczka (1994) J. Clin. Invest. 94: 1351 and Samulski (supra) for an overview of AAV vectors). Construction of recombinant AAV vectors is described in a number of publications, including Lebkowski, U.S. Pat. No. 5,173,414; Tratschin et al. (1985) Mol. Cell. Biol. 5(11): 3251-3260; Tratschin et al. (1984) Mol. Cell. Biol. 4: 2072-2081; Hermonat and Muzyczka (1984) Proc. Natl. Acad. Sci. USA 81: 6466-6470; McLaughlin et al. (1988) and Samulski et al. (1989) J. Virol. 63: 3822-3828. Cell lines that can be transformed by rAAV include those described in Lebkowski et al. (1988) Mol. Cell. Biol. 8: 3988-3996.

A) Ex vivo transformation of cells.

Ex vivo cell transformation for diagnostics, research, or for gene therapy (e.g., via re-infusion of the transformed cells into the host organism) is well known to those of skill in the art. In a preferred embodiment, cells are isolated from the subject organism, transfected with the construct(s) of this invention, and re-infused back into the subject organism (e.g., patient). Various cell types suitable for ex vivo transformation are well known to those of skill in the art. Particularly preferred cells are progenitor or stem cells (see, e.g., Freshney et al. (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New York, and the references cited therein for a discussion of how to isolate and culture cells from patients).

For some embodiments, stem cells are used in ex vivo procedures for cell transformation and gene therapy. One advantage of using stem cells, for some applications, is that they can be differentiated into other cell types in vitro, or can be introduced into a mammal (such as the donor of the cells) where they will engraft in the bone marrow. Methods for differentiating CD34⁺ cells in vitro into clinically important immune cell types using cytokines such as GM-CSF, IFN-γ and TNF-α are known (see, Inaba et al. (1992) J. Exp. Med. 176, 1693-1702).

Stem cells are isolated for transduction and differentiation using known methods. For example, in mice, bone marrow cells are isolated by sacrificing the mouse and cutting the leg bones with a pair of scissors. Stem cells are isolated from bone marrow cells by panning the bone marrow cells with antibodies which bind unwanted cells, such as CD4⁺ and CD8⁺ (T cells), CD45⁺ (panB cells), GR-1 (granulocytes), and Iad (differentiated antigen presenting cells). For an example of this protocol see, Inaba et al. (1992) J. Exp. Med. 176, 1693-1702.

In humans, bone marrow aspirations from iliac crests are performed, e.g., under general anesthesia in the operating room. The bone marrow aspirate is approximately 1,000 ml in quantity and is collected from the posterior iliac bones and crests. If the total number of cells collected is less than about 2 x 10⁸/kg, a second aspiration using the sternum and anterior iliac crests in addition to posterior crests is performed. During the operation, two units of irradiated packed red cells are administered to replace the volume of marrow taken by the aspiration. Human hematopoietic progenitor and stem cells are characterized by the presence of a CD34 surface membrane antigen. This antigen is used for purification, e.g., on affinity columns which bind CD34. After the bone marrow is harvested, the mononuclear cells are separated from the other components by means of Ficoll gradient centrifugation. This is performed by a semi-automated method using a cell separator (e.g., a Baxter Fenwal CS3000+ or Terumo machine). The light density cells, composed mostly of mononuclear cells, are collected and the cells are incubated in plastic flasks at 37°C for 1.5 hours. The adherent cells (monocytes, macrophages and B-Cells) are discarded. The non-adherent cells are then collected and incubated with a monoclonal anti-CD34 antibody (e.g., the murine antibody 9C5) at 4°C for 30 minutes with gentle rotation. The final concentration for the anti-CD34 antibody is 10 μg/ml. After two washes, paramagnetic microspheres (Dyna Beads, supplied by Baxter Immunotherapy Group, Santa Ana, California) coated with sheep antimouse IgG (Fc) antibody are added to the cell suspension at a ratio of 2 cells/bead. After a further incubation period of 30 minutes at 4°C, the rosetted cells with magnetic beads are collected with a magnet. Chymopapain (supplied by Baxter Immunotherapy Group, Santa Ana, California) at a final concentration of 200 U/ml is added to release the beads from the CD34⁺ cells. Alternatively, and preferably, an affinity column isolation procedure can be used which binds to CD34, or to antibodies bound to CD34 (see, the examples below). See, Ho et al. (1995) Stem Cells 13 (suppl. 3): 100-105. See also, Brenner (1993) Journal of Hematotherapy 2: 7-17.

In another embodiment, hematopoietic stem cells are isolated from fetal cord blood. Yu et al. (1995) Proc. Natl. Acad. Sci. USA, 92: 699-703 describe a preferred method of transducing CD34⁺ cells from human fetal cord blood using retroviral vectors.

For some purposes, non-stem cells are preferred for ex vivo treatments using ESX nucleic acids. For example, where it is desirable to have the ESX product expressed transiently, mortal cells that do not differentiate are preferred carriers of ESX nucleic acids.

B) In vivo transformation.

Vectors (e.g., retroviruses, adenoviruses, liposomes, etc.) containing therapeutic nucleic acids can be administered directly to the organism for transduction of cells in vivo. Administration is by any of the routes normally used for introducing a molecule into ultimate contact with blood or tissue cells. The packaged nucleic acids are administered in any suitable manner, preferably with pharmaceutically acceptable carriers. Suitable methods of administering such packaged nucleic acids are available and well known to those of skill in the art, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective reaction than another route.

Pharmaceutically acceptable carriers are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, there is a wide variety of suitable formulations of pharmaceutical compositions of the present invention.

Formulations suitable for oral administration can consist of (a) liquid solutions, such as an effective amount of the packaged nucleic acid suspended in diluents, such as water, saline or PEG 400; (b) capsules, sachets or tablets, each containing a predetermined amount of the active ingredient, as liquids, solids, granules or gelatin; (c) suspensions in an appropriate liquid; and (d) suitable emulsions. Tablet forms can include one or more of lactose, sucrose, mannitol, sorbitol, calcium phosphates, corn starch, potato starch, tragacanth, microcrystalline cellulose, acacia, gelatin, colloidal silicon dioxide, croscarmellose sodium, talc, magnesium stearate, stearic acid, and other excipients, colorants, fillers, binders, diluents, buffering agents, moistening agents, preservatives, flavoring agents, dyes, disintegrating agents, and pharmaceutically compatible carriers. Lozenge forms can comprise the active ingredient in a flavor, usually sucrose and acacia or tragacanth, as well as pastilles comprising the active ingredient in an inert base, such as gelatin and glycerin or sucrose and acacia emulsions, gels, and the like containing, in addition to the active ingredient, carriers known in the art.

The packaged nucleic acids, alone or in combination with other suitable components, can be made into aerosol formulations (i.e., they can be "nebulized") to be administered via inhalation. Aerosol formulations can be placed into pressurized acceptable propellants, such as dichlorodifluoromethane, propane, nitrogen, and the like.

Suitable formulations for rectal administration include, for example, suppositories, which consist of the packaged nucleic acid with a suppository base. Suitable suppository bases include natural or synthetic triglycerides or paraffin hydrocarbons. In addition, it is also possible to use gelatin rectal capsules which consist of a combination of the packaged nucleic acid with a base, including, for example, liquid triglycerides, polyethylene glycols, and paraffin hydrocarbons.

Formulations suitable for parenteral administration, such as, for example, by intraarticular (in the joints), intravenous, intramuscular, intradermal, intraperitoneal, and subcutaneous routes, include aqueous and non-aqueous, isotonic sterile injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives. In the practice of this invention, compositions can be administered, for example, by intravenous infusion, orally, topically, intraperitoneally, intravesically or intrathecally. Parenteral administration and intravenous administration are the preferred methods of administration. The formulations of packaged nucleic acid can be presented in unit-dose or multi-dose sealed containers, such as ampoules and vials.

Injection solutions and suspensions can be prepared from sterile powders, granules, and tablets of the kind previously described. Cells transduced by the packaged nucleic acid as described above in the context of ex vivo therapy can also be administered intravenously or parenterally as described above.

The dose administered to a patient, in the context of the present invention, should be sufficient to effect a beneficial therapeutic response in the patient over time. The dose will be determined by the efficacy of the particular vector employed and the condition of the patient, as well as the body weight or surface area of the patient to be treated. The size of the dose also will be determined by the existence, nature, and extent of any adverse side-effects that accompany the administration of a particular vector, or transduced cell type, in a particular patient.

In determining the effective amount of the vector to be administered in the treatment or prophylaxis, the physician evaluates circulating plasma levels of the vector, vector toxicities, progression of the disease, and the production of anti-vector antibodies. In general, the dose equivalent of a naked nucleic acid from a vector is from about 1 mg to 100 mg for a typical 70 kilogram patient, and doses of vectors which include a retroviral particle are calculated to yield an equivalent amount of therapeutic nucleic acid.

For administration, inhibitors and transduced cells of the present invention can be administered at a rate determined by the LD-50 of the inhibitor, vector, or transduced cell type, and the side-effects of the inhibitor, vector or cell type at various concentrations, as applied to the mass and overall health of the patient. Administration can be accomplished via single or divided doses.

In a preferred embodiment, prior to infusion, blood samples are obtained and saved for analysis. Between 1 x 10⁸ and 1 x 10¹² transduced cells are infused intravenously over 60-200 minutes. Vital signs and oxygen saturation by pulse oximetry are closely monitored. Blood samples are obtained 5 minutes and 1 hour following infusion and saved for subsequent analysis. Leukopheresis, transduction and reinfusion are repeated every 2 to 3 months. After the first treatment, infusions can be performed on an outpatient basis at the discretion of the clinician. If the reinfusion is given on an outpatient basis, the participant is monitored for at least 4, and preferably 8, hours following the therapy.

Transduced cells are prepared for reinfusion according to established methods. See, Abrahamsen et al. (1991) J. Clin. Apheresis, 6: 48-53; Carter et al. (1988) J. Clin. Apheresis, 4: 113-117; Aebersold et al. (1988) J. Immunol. Meth., 112: 1-7; Muul et al. (1987) J. Immunol. Methods, 101: 171-181 and Carter et al. (1987) Transfusion 27: 362-365. After a period of about 2-4 weeks in culture, the cells should number between 1 x 10⁸ and 1 x 10¹². In this regard, the growth characteristics of cells vary from patient to patient and from cell type to cell type. About 72 hours prior to reinfusion of the transduced cells, an aliquot is taken for analysis of phenotype, and percentage of cells expressing the therapeutic agent.

EXAMPLES

The following examples are offered to illustrate, but not to limit the present invention.

Example 1: Cloning and Expression of a Human ESX Gene.

This example describes the isolation of a complete human ESX cDNA sequence that encodes a putative protein of 371 amino acids. Briefly, a highly conserved eight amino acid motif within the carboxy (C)-terminal region of the ETS domain was identified and this motif was used to search a database of human epithelium expressed sequence tags (ESTs). The database (dbEST) contained >250,000 largely anonymous ESTs (Lennon et al. (1996) Genomics 33: 151-152). This search identified a partial cDNA sequence from fetal liver-spleen (GenBank locus T78501). Within this same database were found two other unidentified but nearly identical partial sequences from normal mammary epithelium (GenBank locus R73021) and adult pancreas (GenBank locus T27397). Human placental polyA+ mRNA was used to generate a full-length cDNA sequence.

Experimental procedures.

Cloning of EST cDNA.

The Basic Local Alignment Search Tool (BLAST) was used to search a database of expressed sequence tags (EST) using nucleotides derived from human Ets-2 that encode a highly conserved eight amino acid motif within the carboxy terminal region of the ETS domain (MNYEKLSR). The BLAST algorithm is described in Altschul et al. (1990) J. Mol. Biol. 215: 403. This search identified a partial cDNA sequence from fetal liver-spleen (GenBank locus T78501) as a putative new member of the Ets family that was named ESX. Made available by the I.M.A.G.E. Consortium and commercially obtained (Research Genetics, Inc.), this 1.1 kb partial cDNA sequence derived from fetal liver-spleen contains a polyA tail, approximately 0.7 kb of 3' untranslated sequence and a 5' region encoding the C-terminal 126 amino acids of ESX. Re-sequencing of T78501 revealed several errors in its original GenBank sequence that would have disrupted the reading frame. A 5' RACE procedure (Frohman (1990) RACE: Rapid amplification of cDNA ends, p 28 in PCR Protocols: A guide to methods and applications, Innis, et al., Eds. Academic Press, San Diego, CA) was performed with the Marathon cDNA amplification kit (Clontech Laboratories, Inc.), using placental polyA mRNA, to clone the remaining 5' portion of ESX cDNA, which was estimated to be approximately 0.8 kb. Automated DNA sequencing of three independent clones of the expected length yielded identical results and 5' cDNA termination sites within 30 bases of one another. Melding these sequences with the amended T78501 sequence produced the open reading frame as shown in SEQ ID NO:1. To identify ESX domain homologies, BLAST searches of the SWISS-PROT and PIR protein databases were performed.
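The motif-based screen described above can be illustrated with a minimal sketch. It is not BLAST and does not reproduce the search actually performed; it simply slides the eight-residue motif MNYEKLSR (taken from the text) along already-translated protein sequences and reports windows within an assumed mismatch tolerance. The function name `find_motif`, the `max_mismatches` parameter, and the three toy sequences are hypothetical and are not real ESTs or GenBank entries.

```python
MOTIF = "MNYEKLSR"  # conserved ETS-domain motif quoted in the text


def find_motif(seq, motif=MOTIF, max_mismatches=1):
    """Return (0-based position, window, mismatch count) for every window
    of seq that differs from motif at no more than max_mismatches sites."""
    hits = []
    k = len(motif)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Hypothetical toy sequences, invented for illustration only.
toy_db = {
    "toy_exact": "AAGGMNYEKLSRALRY",
    "toy_one_off": "PLKMNYDKLSRGG",
    "toy_none": "MSTTPLLAAGGHHW",
}

hits_by_name = {name: find_motif(seq) for name, seq in toy_db.items()}

for name, hits in hits_by_name.items():
    print(name, hits)
```

A real screen of a nucleotide EST database would additionally require six-frame translation and a statistical score for each hit, which is what BLAST provides.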

ESX polypeptide production, DNA binding assay, and DNA footprinting assay.

Using primers incorporating the initiating methionine or the termination codon of ESX and designed with NheI and HindIII sites, respectively, PCR amplification was performed on double stranded placental cDNA (Clontech) to produce a full-length ESX cDNA product which was subsequently cloned into the NheI and HindIII sites of a pRSETA His-tag expression plasmid (Invitrogen). Following sequence verification, an ESX expression clone in BL21(DE3)pLysS cells was used to produce ESX protein following 8M urea bacterial extraction, purification on ProBond resin (Invitrogen), and dialysis against PBS containing 10% glycerol. SDS polyacrylamide gel analysis indicated a 42 kDa protein with >90% purity.

Electrophoretic mobility shift assay (EMSA) was performed as previously described (Scott et al. (1994) J. Biol. Chem. 269: 19848-19858), using approximately one ng of ESX protein per condition and 0.3 pmol of end-labeled TA5 probe (± cold competitor). TA5 is a duplexed 31-mer oligonucleotide from the HER2/neu promoter, extending from -50 bp to -20 bp relative to the major transcriptional start site, that includes an Ets response element. DNase I footprinting was performed on a 125 bp BssHII/SmaI fragment from the HER2/neu promoter, labeled on the antisense strand at the SmaI site. Reactions contained ~10 ng of ESX protein with 1 unit of DNase I acting for 1 min at room temperature. Reaction products containing ESX were electrophoresed on a 6% denaturing gel alongside a control reaction lane (minus ESX, lane C).

Trans-activation of Ets-responsive gene expression by ESX.

Cultured COS cells were transiently cotransfected by calcium phosphate precipitation as previously described (Scott et al. (1994) J. Biol. Chem. 269: 19848-19858) using pcDNAI/Amp (Invitrogen) to express full-length ESX protein and either the thymidine kinase minimal promoter-CAT vector (pBLCAT5, from American Type Culture Collection) enhanced with 3 tandem (head-to-tail) upstream copies of TA5 (p3TA5-BLCAT5) or a 700 bp AflII/NcoI fragment from the HER2/neu promoter (containing two other putative Ets response elements upstream of the TA5 sequence) inserted into pCAT-Basic (Promega) to give pHER2-CAT. Mutant reporter plasmids, p3TA5P-BLCAT5 and pHER2m-CAT, were similarly constructed, with the former possessing a GGAA to GAGA mutation within each of the tandem repeats and the latter retaining the two upstream promoter response elements intact but possessing a GGAA to TTAA Ets response element mutation within the TA5 sequence. Transfections, using 0.5 μg of reporter and 5 μg of expression plasmid, were repeated at least three times, with the mean values (±SD) of CAT reporter activity (arbitrary units) as shown.

Chromosomal localization.

Metaphase chromosomal localization and interphase copy number of ESX were determined by FISH analysis with a genomic ESX P1 clone, using a previously described technique (Stokke et al. (1995) Genomics 26: 134-137).

Northern hybridization.

Total cellular RNA was prepared by guanidinium isothiocyanate extraction (pH 5.5) as described previously (Scott et al., supra) and blotted onto nylon membranes following electrophoresis through 1% formaldehyde agarose gels (~20 μg per lane). All blots were probed with a randomly primed 400 bp cDNA fragment from the C-terminal ESX coding region, and given final washes at 65°C in 0.2x SSC. Short exposure of the autoradiograph was used to demonstrate HRG induction of ESX in the overexpressing SK-BR-3 cells.

Detection of ESX expression by in situ hybridization.

ESX sense and antisense riboprobes for in situ hybridization were generated by ³⁵S-labeling and run-off transcription using T7 or T3 RNA polymerase, respectively, from pT7T3 (Pharmacia) containing a 700 bp fragment of 3' untranslated ESX cDNA. Using previously described techniques (Wilkinson (1992) In situ hybridization: a practical approach, IRL Press, Oxford), tissue hybridization and autoradiography were performed on thin sections of paraffin-embedded samples of normal mammary epithelium (n=3) and DCIS breast tumors (n=10). Samples were chosen according to their previously determined HER2/neu overexpression and amplification status (Liu et al. (1992) Oncogene 7: 1027-1032) and for their RNA integrity and comparable levels of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) expression, as determined by preliminary in situ hybridization with an antisense probe for GAPDH. The hybridizations show only the antisense riboprobe signals resulting from ESX transcripts in the underlying hematoxylin-counterstained epithelial cells. ESX sense riboprobe was used to control for non-specific hybridization and autoradiography background signal using adjacent sections from each sample. The density of this background signal (from sense riboprobe) was nearly identical for the representative samples shown in this figure, representing less than one-tenth the antisense riboprobe signal density over the epithelial cells and comparable to that over the acellular stromal component of each sample.

Preparation of anti-ESX antiserum.

A peptide fragment consisting of the sixteen carboxy-terminal amino acids of ESX was synthesized for use as an ESX antigen in rabbits. An amino-terminal cysteine was introduced to allow coupling of the peptide to a carrier protein (KLH). To obtain anti-ESX antibodies, total IgG from immunized rabbits was affinity purified on a column to which the ESX carboxy-terminal peptide fragment was bound.

Results and Discussion.

Cloning of a human ESX cDNA.

The nucleotide and deduced amino acid sequences of a human ESX cDNA are shown in Figure 1. The cDNA includes an open reading frame that encodes a 371 amino acid ESX protein as shown in Figure 2A. The C-terminal ETS DNA binding domain of ESX (aa 274-354) contains 27 of the 38 most highly conserved (consensus) residues found in the DNA-binding domain of all Ets family members (Figure 2D). This domain in ESX has its greatest homology with the Drosophila E74/human Elf-1 subfamily (nearly 50% identity, 70% similarity), although ESX has no homology with E74/Elf-1 outside the Ets DNA binding domain. The most obvious structural differences distinguishing ESX from other Ets family members are the five non-conservative changes in its DNA-binding domain consensus residues, including three within the first helix (α1) that enhance basicity in a region likely to make critical contact with the minor groove phosphate backbone of bound DNA (Werner et al. (1995) Cell 83: 761-771; Kodandapani et al. (1996) Nature 380: 456-460). Therefore, ESX may be assigned to the E74/Elf-1 subfamily on the basis of its sequence homology within the ETS domain (Lautenberger et al. (1992) Oncogene 7: 1713-1719; Laudet et al. (1993) Biochem. Biophys. Res. Commun. 190: 8-14; Degnan et al. (1993) Nucl. Acids Res. 21: 3479-3484; Wasylyk et al. (1993) Eur. J. Biochem. 211: 7-18; Janknecht and Nordheim (1993) Biochim. Biophys. Acta 1155: 346-356). In contrast to its two other subfamily members, however, ESX possesses an amino (N)-terminal A-region or Pointed domain, a helix-loop-helix structure that has been conserved from Drosophila to humans and retained within subfamilies remote to E74/Elf-1 (Lautenberger et al., supra; Wasylyk et al., supra; Klambt (1993) Development 117: 163-176). The A-region in ESX (aa 64-103) is most similar to that found in Ets-1 (aa 69-106) with 65% similarity and 40% identity, including 7 of 9 consensus A-region residues (Figure 2B).
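The identity and similarity percentages quoted above are the kind of figures obtained by counting matched columns in a pairwise alignment. The following is a minimal sketch of that arithmetic only; the source does not state which program or scoring scheme was used. The conservative-substitution groups in `SIMILAR_GROUPS` are an assumption made for illustration, and the two aligned strings are invented toy peptides, not ESX, E74, Elf-1 or Ets-1 sequence.

```python
# Assumed conservative-substitution groups (illustrative only; programs
# based on scoring matrices define "similarity" differently).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_similarity(a, b):
    """Percent identity and percent similarity over the ungapped columns
    of two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # gap columns are skipped, not counted as mismatches
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in g and y in g for g in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Hypothetical aligned toy peptides.
pct_identity, pct_similarity = identity_similarity("KLSRALRYYY", "KMSRALRHFY")
print(pct_identity, pct_similarity)
```

By the same arithmetic, 27 of 38 conserved consensus residues corresponds to about 71%.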

Additional features within ESX highlight the known plasticity of Ets proteins in regions outside of their ETS domain, reflecting >500 million years of evolutionary recombination and exon shuffling (Lautenberger et al., supra; Laudet et al., supra; Degnan et al., supra; Wasylyk et al., supra). ESX has one of the shortest C-terminal tails (16 aa) of all Ets family members. While this terminal sequence has no significant homology to any known eukaryotic gene product, it is over 50% identical and 85% similar to a highly conserved element within the Ross River (aa 194-207) and Semliki Forest (aa 197-210) virus-encoded nsP1 protein, which is required for membrane-bound initiation of RNA synthesis, replication and the subsequent pathogenicity of these New World RNA alphaviruses (Strauss and Strauss (1994) Microbiological Rev. 58: 491-562). Contained within the N-terminal flanking region of the ESX DNA-binding domain is a serine-rich track of 51 residues (aa 188-238) that is 35% identical to the conserved polyserine transactivating domain of the lymphocyte-restricted HMG-box protein, SOX4 (aa 370-420) (VandeWetering et al. (1993) EMBO J. 12: 3847-3854). Polyserine domains are known to act as strong transactivators, presumably, as in the case of p65 NF-κB (aa 530-560), by forming amphipathic helical structures in which the serines are clustered opposite a hydrophobic face (Seipel et al. (1992) EMBO J. 11: 4961-4968; Schmitz and Baeuerle (1991) EMBO J. 10: 3805-3817), as shown in a helical wheel model of the serine box in ESX (Figure 2C).
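A helical wheel of the kind referred to above projects successive residues of an ideal α-helix onto a circle at 100° intervals (3.6 residues per turn). The sketch below shows that projection and a simple way to check whether serines and hydrophobic residues fall on opposite faces. The 18-residue peptide is a hypothetical sequence designed for illustration, not the ESX serine box, and the function names are assumptions.

```python
import math


def wheel_angles(seq, step_deg=100.0):
    """Angular position (degrees) of each residue on an ideal alpha-helical
    wheel, assuming 100 degrees of rotation per residue."""
    return [(i * step_deg) % 360.0 for i in range(len(seq))]


def face_direction(seq, residues, step_deg=100.0):
    """Mean direction (degrees) and resultant length (0 to 1) of the wheel
    positions occupied by any residue listed in `residues`."""
    xs = ys = 0.0
    n = 0
    for i, aa in enumerate(seq):
        if aa in residues:
            theta = math.radians(i * step_deg)
            xs += math.cos(theta)
            ys += math.sin(theta)
            n += 1
    if n == 0:
        raise ValueError("none of the requested residues occur in seq")
    x, y = xs / n, ys / n
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)


def angular_separation(a_deg, b_deg):
    """Smallest angle between two directions, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


# Hypothetical amphipathic toy peptide (not ESX sequence).
toy_helix = "SALKSLESALKSLLSSLE"
ser_dir, ser_len = face_direction(toy_helix, "S")
hyd_dir, hyd_len = face_direction(toy_helix, "LIVFMW")
separation = angular_separation(ser_dir, hyd_dir)
print(round(ser_dir), round(hyd_dir), round(separation))
```

A separation approaching 180° together with resultant lengths near 1 indicates that the two residue classes cluster on opposing faces, which is the amphipathic arrangement described in the text.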

ESX binding to and transactivation of HER2/neu Ets response element.

Earlier studies have demonstrated that the HER2/neu oncogene, which is activated by overexpression in >40% of DCIS early breast tumors (Liu et al., supra), contains a highly conserved Ets responsive element in its proximal promoter (Scott et al., supra). Therefore, an oligonucleotide (TA5) containing the Ets response element from HER2/neu was used to assess DNA-binding and transactivation by ESX. Bacterially expressed full-length ESX demonstrates high-affinity, sequence-specific binding to TA5 by electrophoretic mobility shift assay (EMSA), as shown in Figure 8A. Unlike EMSA results for other Ets proteins known to contain flanking regions that restrict DNA-binding (Jonsen et al. (1996) Mol. Cell. Biol. 16: 2065-2073), full-length ESX binds DNA with comparable affinity to that of truncated ESX (aa 271-371), consisting primarily of the ESX DNA-binding domain. As seen with other Ets factors, DNA probes with mutations in the GGAA Ets core of TA5 fail to compete against TA5 for ESX binding, while those with mutations flanking the GGAA core are relatively effective at competing for ESX binding. To confirm that ESX binds DNA in an Ets-like manner, ESX footprinting was performed on a larger HER2/neu promoter fragment overlapping the TA5 sequence and its GGAA core response element. Characteristic of DNA-bound Ets proteins, ESX produces a DNase-I hypersensitive site embedded within a footprint on the antisense strand of the core response element (Figure 8B).

The transactivating potential of ESX was then determined by cotransfecting COS cells with an ESX expression plasmid and either of two different Ets-responsive reporter genes: a minimal promoter construct enhanced by 3 tandem head-to-tail copies of TA5 from the HER2/neu promoter, or ~0.7 kb of the wild-type HER2/neu promoter driving the chloramphenicol acetyl transferase (CAT) reporter gene. Exogenously introduced ESX significantly increases CAT expression from both constructs, but only when the core Ets response element is intact and not mutated, confirming the Ets-specific transactivating potential of ESX (Figure 8C).
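Reporter results of this type are summarized as the mean (±SD) of replicate transfections, and transactivation can then be expressed as a fold change relative to reporter alone. The following is a minimal sketch of that summary. All numbers are made-up placeholder values in arbitrary units, not data from this example, and the function name `summarize` and the choice of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Mean and sample standard deviation of replicate reporter readings."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    return mean(replicates), stdev(replicates)


# Placeholder CAT activities (arbitrary units) from three hypothetical
# transfections each; these are NOT measurements from the source.
reporter_alone = [1.0, 2.0, 3.0]
reporter_plus_esx = [8.0, 10.0, 12.0]

base_mean, base_sd = summarize(reporter_alone)
esx_mean, esx_sd = summarize(reporter_plus_esx)
fold_activation = esx_mean / base_mean

print(f"reporter alone: {base_mean:.1f} +/- {base_sd:.1f}")
print(f"reporter + ESX: {esx_mean:.1f} +/- {esx_sd:.1f}")
print(f"fold activation: {fold_activation:.1f}")
```

Applying the same summary to a reporter carrying a mutated Ets response element would be expected, per the text, to give a fold activation near 1.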

Chromosomal localization.

To obtain further insight into the evolutionary mechanisms of Ets dispersion during the metazoan radiation of this multigene family, we mapped the chromosomal location of the human ESX gene and found that the gene is located next to an unrelated subfamily member. About 10 of the known human Ets genes have been chromosomally mapped and half of these occur as a tandem linkage of dissimilar subfamily members at two general loci (21q22 for Ets2, Erg, and GABPa; 11q23 for Ets1 and Fli1), supporting a proposed model in which duplication of an ancestral Ets was followed by duplication and transposition of the Ets pair to another chromosome (Lautenberger et al., supra; Laudet et al., supra; Degnan et al., supra; Wasylyk et al., supra).

An ESX clone isolated from an arrayed P1 library was used to map ESX to chromosome 1q32 by fluorescence in situ hybridization (FISH) (Figure 8D). Since SAP1 (also known as ELK4, a member of the SAP/Elk/Net subfamily) was recently mapped to 1q32 (Shipley et al. (1994) Genomics 23: 710-711; Giovane et al. (1995) Genomics 29: 769-772), ESX and SAP1 now represent the third known set of tandemly linked human Ets genes. While the chromosomal location of Elf-1 (subfamily homolog of ESX) is not presently known, it is tempting to speculate that it will be linked to another SAP/Elk/Net subfamily member, in accordance with the evolutionary model for the generation of the Ets-1/Fli-1 and Ets-2/Erg loci.

Southern blotting suggested the presence of excess ESX gene copies in several breast cancer cell lines known for their amplification of HER2/neu (e.g. SK-BR-3, BT-474). Therefore, FISH analysis was also performed on these cells. As shown in Figure 8D, ESX amplification in these cell lines results predominantly from an increase in chromosome 1q copy number (aneusomy). While gene amplification is not thought to be a common mechanism by which Ets proto-oncogenes become activated (Wasylyk et al., supra; Janknecht and Nordheim, supra), multiple copies of DNA sequences mapping across the 1q32 locus can be observed in about 50% of early breast tumors (Isola et al. (1995) Am. J. Pathol. 147: 905-911). Apart from two other more centromeric proto-oncogenes on this chromosome arm, SKI at 1q22-24 and TRK at 1q23-24 (Chaganti et al. (1986) Cytogenet. Cell Genet. 43: 181-186; Morris et al. (1991) Oncogene 6: 1093-1095), ESX and SAP1 represent likely oncogene candidates accounting for this 1q amplification in human breast tumors.

Expression of ESX.

Many human Ets genes exhibit a tissue-restricted pattern of expression, with some family members showing greater tissue specificity than others (Wasylyk et al., supra; Janknecht and Nordheim, supra). Northern blots of normal human tissue demonstrate that ESX mRNA expression is restricted to tissues of epithelial origin, with little if any expression detectable in testes, ovary, brain, skeletal muscle, or lympho-hematopoietic tissues (spleen, thymus, white blood cells). PEA3, by comparison, the only other epithelium-restricted Ets, is expressed in a subset (5 of 9) of the ESX-positive tissues (data not shown); expression of both in normal heart leaves open the question of which component of this tissue (endo-, myo-, or pericardial) is the source of the ESX and PEA3 transcripts.

When a panel of human breast cancer cell lines was compared for ESX expression with normal human mammary epithelial cells (HMEC), ESX mRNA was increased in the HER2/neu-positive tumor lines and not increased in the HER2/neu-negative lines. Two immortalized but non-transformed mammary cell lines (HBL100, MCF10A) expressed ESX mRNA at levels similar to or below that of HMEC. To explore the possible relationship between ESX overexpression and HER2/neu activation, ESX mRNA was measured in cultured SK-BR-3 cells after treatment with the ligand heregulin-β1 1-244 (HRG), known to initiate mitogenic signaling in these cells by activation of HER2/neu receptor tyrosine kinase in association with ErbB3 (Holmes et al. (1992) Science 256: 1205-1210; Li et al. (1996) Oncogene 12: 2473-2477). ESX mRNA increased within 15 min of HRG treatment, achieving peak levels between 60 and 120 min. These results indicate that ESX induction is an immediate early gene response to HER2/neu activation, supporting a signaling link between ESX and HER2/neu gene function.

Since HER2/neu activation occurs early during human breast tumorigenesis and with development of DCIS, evidence of early ESX overexpression was screened for by in situ hybridization in DCIS tumor samples previously characterized as HER2/neu-positive with regard to amplification and overexpression relative to that of normal breast epithelium (Liu et al., supra). ESX expression was restricted to normal and malignant mammary ductal epithelium with no ESX expression detectable in breast stroma, including its reticulo-endothelial cell and inflammatory/lymphocytic cell components. Consistent with ESX overexpression observed in HER2/neu amplified breast cancer lines, ESX transcript levels in HER2/neu-positive DCIS were markedly increased relative to that of normal breast epithelium. These tissue hybridization studies indicate that overexpression of ESX, as with HER2/neu, may occur early during development of human breast tumors.

Since ESX can transactivate the HER2/neu promoter, one potential mechanistic link may be explored by interfering with transcriptional regulation at the Ets response element on this promoter (Noonberg et al. (1994) Gene 149: 123-126). Also, preliminary studies suggest that activated HER2/neu increases Ets-mediated gene expression via Ras signaling and that this can lead to feedback upregulation of Ets transcription (Galang et al. (1996) J. Biol. Chem. 271: 7992-7998; O'Hagan et al. (1996) Amer. Assoc. Cancer Res. 37: 3575). Thus, there is compelling rationale to establish the prevalence and mechanistic role of ESX overexpression in breast tumors as well as other human malignancies of epithelial origin.

Anti-ESX antibodies.

In a Western blot analysis, anti-ESX polyclonal antibodies prepared as described above specifically recognized purified recombinant ESX protein (~42 kD), as well as a similar sized protein in whole cell extracts. The intensity of the ESX band in samples prepared from whole cell extracts was correlated with cellular ESX mRNA levels.

The anti-ESX antibodies also function to immunoprecipitate a single ~42 kD ESX protein band from 35S metabolically labeled cells.

Example 2: Cloning and analysis of murine ESX.

A λFIXII genomic library from strain 129 mouse DNA was screened using a 5' cDNA probe from hESX to isolate a clone from which a 7,751 bp fragment was subcloned into Bluescript and sequenced. A fully encoding mESX cDNA clone was derived from total RNA of 129 mouse ES cells by reverse transcription PCR (RT-PCR) using specific primers extending 5' and 3' from the putative ATG-start and TAA-stop codons, respectively, of the genomic sequence. A Bluescript subclone containing this 1,116 bp mESX cDNA was similarly sequenced. All sequencing was performed on an ABI Prism Automated DNA sequencer (model 377) using 3'-dye labeled ddNTP terminators. The full length mouse ESX genomic sequence is provided in SEQ ID NO: 11.

Alignment of genomic and cDNA mESX sequences as well as comparison of mESX vs hESX homologous sequences were used to determine exon and intron boundaries (see, Fig. 7). Conserved murine and human promoter elements as well as putative amino acid domain homologies were identified from PIR-protein, SWISS-PROT, and PROSITE databases by GCG computer search (Genetics Computer Group, Wisconsin Package 3.0, Madison, WI). A 7.8 kb mESX genomic clone was isolated that contains ~2.9 kb of promoter upstream of ~4.9 kb of DNA incorporating at least 9 exons. These specify a full-length transcript of ~2 kb, with exons 2-9 encoding the 371 amino acid mESX protein (see Fig. 3).

The following putative structural and/or functional domains within the 42 kDa ESX protein were conserved between mouse and human (Fig. 4):

An exon 3 encoded POINTED/A-region found in a small subset of all Ets;

An amphipathic helix and serine rich box encoded by exons 5 and 6;

A nucleoplasmin-type nuclear targeting sequence encoded by exon 7; and

A helix-turn-helix Ets DNA-binding domain encoded by exons 8 and 9.

A comparison of the human ESX and mouse ESX genomic DNA structure is shown in Figure 6.

The proximal promoter region of mESX (350 bp upstream of transcriptional start site) was 83% homologous to the hESX promoter (Fig. 5). Conserved putative response elements within this region include Ets, AP-2, SP1, USF, Oct, and NF-κB binding sites. A conserved CCAAT box lies ~80 bp upstream of the pyrimidine-rich Inr element which specifies ESX transcript initiation. Unlike hESX, mESX lacks a TATA box.

The comparison of mESX and hESX genomic and cDNA sequences supports a modular model of ESX primary structure in which putative protein domains, first suggested by homology with other proteins, are now shown to be highly conserved and derived from individual exons or exon pairs.

Example 3: Embryo and mammary epithelial cell expression of ESX.

Whole mount analysis of mammary gland morphology was performed as described by Smith (1996) Breast Cancer Res. Treat., 39: 21-31. Endogenous mESX transcripts were detected by Northern blotting using a 5' specific mESX cDNA probe. Mouse embryos exhibited progressive induction of mESX transcription after 7 days of age, with 17 day levels approximately 10-fold higher than those of 11 day old embryos. ESX mRNA, undetectable in virgin mouse mammary glands, was induced during pregnancy in association with progressing ductal morphogenesis, branching and lobuloalveolar differentiation. ESX then declined to undetectable levels during lactation, but increased dramatically within 3 days of weaning, when milk secretion stops, alveolar epithelium involutes by apoptosis, and glandular remodeling occurs, leaving a more mature ductal epithelium system ready for subsequent pregnancy. These data suggest that ESX has a primary role in directing ductal epithelial proliferation and migration in preparation for lobuloalveolar differentiation.

Example 4: Transgenic hESX model.

MMTV-hESX transgenic mice were produced by implanting foster mothers with fertilized eggs microinjected with a full-length hESX expression construct, driven by the MMTV LTR and containing the polyA signaling and splicing sequence from SV40. (The MMTV promoter is well described (Huang et al. (1981) Cell, 27: 245-255). In addition, the use of MMTV-LTR for targeted expression of transgenes to the mammary gland of mice and other animals is described in detail in Webster and Muller (1994) Sem. Cancer Biol., 5: 69-76.) hESX transgene expression was detected using a probe specific for the SV40 polyA sequence and confirmed by nested RT-PCR analysis using 5' primers specific for hESX and 3' primers specific for the SV40 polyA sequence.

Founder (F₀) lines created as described in Example 3 were tested for transgene presence. Fourteen of forty-one animals carried the transgene. The founder animals were then mated, and 15 day pregnant F₁ females were tested for mammary gland expression of hESX mRNA. Total RNA was extracted from the mammary glands of 15 day pregnant MMTV-hESX transgenic F₁ mice. A Northern blot of 10 μg of the RNA was probed for sequences specific to the SV40 polyA-containing hESX transcript.

Mammary gland morphology in an MMTV-hESX expressing transgenic mouse appeared abnormal, showing retardation of lobuloalveolar development during pregnancy (15 day, first pregnancy). This morphologic abnormality suggests that failure to turn off ESX in progenitor epithelial cells and alveolar buds leads to continued ductal growth with interrupted mammary gland maturation.

Example 5: ESX is a transcriptional activator.

To prove that ESX upregulates genes (vs. transcriptionally repressing them), many different hESX-Gal4 fusion constructs were produced in which the DNA-binding domain (DBD) of the yeast Gal4 was chimerically expressed with various portions of human ESX (see, Fig. 9) (for a general description of the method see, e.g., White and Parker (1993) Analysis of cloned factors, In Transcription Factors: A Practical Approach; D.S. Latchman, ed.; IRL Press at Oxford Univ. Press, Oxford). These fusion constructs were then co-transfected into human breast cancer cells along with a Gal4-binding luciferase reporter expression construct in order to find either an ESX transactivating or repressing domain. A similar Gal4-VP16 construct was used as positive control since the VP16 transactivating domain from Herpes Simplex virus is acknowledged to be one of the strongest of all known transactivators. ESX transactivated as strongly as VP16 (see, Fig. 9), and the minimal ESX domain necessary for this activity is encoded by exon 4 (aa 129-159), an acidic domain containing a central lysine residue (K-145). Subsequent mutations of this domain established that the central K-145 is essential and provides nearly 1000-fold transactivation potency (relative to a neutral residue placed there).

A database search revealed that the exon 4-encoded domain is homologous to an essential core domain of all known Topoisomerase I molecules (Stewart et al. (1996) J. Biol. Chem. 271: 7602-7608; Pommier (1996) Sem. Oncology 23: 3-10). Since human Topo-I is a critical intracellular target for the newest family of camptothecin-like anticancer agents (such as Topotecan, CPT-11, 9-AC, etc.; see reviews), this information not only provides important clues as to the molecular transactivation mechanism of ESX, but also indicates that this particular ESX domain may be used to screen libraries of chemicals or natural products for newer, more effective and more selective anticancer agents. Existing Topo-I agents target a very different, C-terminal conserved domain in the Topo-I enzyme; as yet, there is no specific function assigned to the highly conserved Topo-I core domain which is homologous to the ESX transactivation domain.

Example 6: The epithelium-specific Ets transcription factor ESX is associated with mammary gland development and involution.

In this example, in order to study mammary gland expression of the epithelium-restricted Ets factor, ESX, mouse cDNA and genomic sequences were cloned and a ~350 bp proximal promoter region with >80% mouse-human homology was identified that mediates ESX induction by serum, heregulin (HRG) or epidermal growth factor (EGF). ESX mRNA expression progressively increases during embryonic mouse development from day 7, is detectable in virgin mammary glands and shows little if any change during pregnancy, then declines to barely detectable levels following 3 days of lactation. Similarly, cultured HC11 cells from midpregnant mouse mammary epithelium show an increase in ESX expression upon reaching lactogenic competency (in the presence of EGF or HRG), with a decline to barely detectable levels upon exposure to lactogenic hormones which induce milk protein (β-casein) expression. In contrast, involuting mouse and rat mammary glands show maximal ESX expression. High ESX levels are also seen in the involuting ventral prostate gland of rats. These findings, including the persistence of upregulated ESX in fully regressed mammary glands, suggest that ESX expression can be induced by soluble growth factors and is maximally upregulated in those partially committed epithelial cells that are destined to survive both the apoptotic and remodelling phases of glandular involution.

Introduction.

Ets transcription factors regulate stage- and tissue-specific gene programs in fetal development and are overexpressed or rearranged in a variety of vertebrate and human malignancies (reviewed in Wasylyk and Nordheim (1997) Ets transcription factors: partners in the integration of signal responses. In Transcription Factors in Eukaryotes (Papavassiliou AG, ed.) pp. 253-286, Springer-Verlag GmbH & Co. KG., Heidelberg, Germany; Hromas and Klemsz (1994) Int. J. Hematol. 59: 257-265; and Macleod et al. (1992) Trends Biochem. Sci. 17: 251-256). We recently cloned and characterized a novel 42 kDa Ets factor, ESX (Epithelial-restricted with Serine boX), which is transcriptionally upregulated in a subset of early breast tumors and breast cancer cell lines and is thought to transactivate the Ets responsive mammary gland oncogene, erbB2 (Scott et al. (1994) J. Biol. Chem. 269: 19848-19858; Chang et al. (1997) Oncogene 14: 1617-1622). Subsequent to this report, four groups have published on the potential biological and developmental importance of this epithelium-specific Ets factor (variably named ESE-1, Elf-3, Jen, or ERT; now identified ESX in HUGO/GDB:6837498) in non-mammary epithelial systems, where ESX is thought to transactivate such genes as the transforming growth factor-β type II receptor (TGF-βRII), Endo-A/keratin-8, and several markers of epidermal cell differentiation including transglutaminase 3, SPRR2A, and profilaggrin (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Tymms et al. (1997) Oncogene 15: 2449-2462; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295; Choi S-G et al. (1998) J. Biol. Chem. 273: 110-117).

While its expression profile suggests that ESX is associated with development of both simple and stratified epithelium (Davies and Garrod (1997) BioEssays 19: 699-704), detailed studies have been performed only in the latter and these have shown that ESX is unique among transcription factors generally, and Ets factors specifically, for its restricted expression in the most terminally differentiated of epidermal cells (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295). A limited in situ analysis of normal human mammary tissue demonstrated low but heterogeneous levels of ESX transcript expression restricted to the polarized simple epithelium of ductules and terminal ductal-lobular units (5). To evaluate ESX expression during all differentiation stages of mammary epithelium, mouse ESX cDNA and genomic sequences were first cloned and compared to their human counterparts, and then used to study postnatal rodent models of mammary gland development. The inductive influences controlling ESX expression were explored by transient transfection of an ESX promoter-reporter construct into a breast cancer cell line (SKBr3) responsive to heregulin-β (HRG) and epidermal growth factor (EGF). RNA samples probed for ESX expression were derived from different stages of cultured HC11 mouse mammary epithelial cells, first made competent for lactogenesis and then hormonally induced to synthesize β-casein (Ball et al. (1988) EMBO J. 7: 2089-2095; Marte et al. (1995) Mol. Endocrinol. 9: 14-23). These results were compared to ESX Northern blots of virgin, pregnant, lactating, and involuting mouse mammary glands. Lastly, mouse and rat mammary glands collected during involution were also compared to the involuting ventral prostate gland of rats to demonstrate that maximal induction of ESX occurs during this stage of glandular regression, suggesting an association with epithelial apoptosis.

Methods.

Comparison of murine and human ESX genomic and cDNA sequences.

A λFIXII 129SV mouse genomic library (Stratagene) was screened using a 5' cDNA probe from hESX (5; Genbank accession number U66894) to isolate a clone from which a 7.75 kb BamHI fragment was subcloned into pBluescript SK (Stratagene). Upon full sequencing this genomic clone was found to contain 3.6 kb of sequence upstream from the ATG-start codon (beginning exon 2), about 2.9 kb upstream of the transcriptional start site. The deduced organization of 9 exons (8 coding) and 8 introns spanning 4.9 kb of genomic sequence was subsequently found to be similar to that reported by Tymms et al. (1997) Oncogene 15: 2449-2462. This mESX genomic sequence was compared to a previously isolated and fully sequenced 1.8 kb BglII-BglII human genomic clone containing 1.5 kb of hESX promoter sequence upstream of exon 1 and the 5' half of intron 1. A 1.1 kb Bluescript subclone encoding the entire mESX cDNA was derived from 129SV mouse ES cell total RNA by RT-PCR using specific primers extending 5' and 3' from the respective ATG-start and TAA-stop codons in the genomic sequence, and the entire cDNA subclone was sequenced. All sequencing was performed on an ABI Prism Automated DNA Sequencer (model 377) using 3'-dye labelled ddNTP terminators. Computer alignments of genomic and cDNA mESX and hESX sequences were performed, and comparison of genomic and cDNA mESX sequences was used to determine exon and intron boundaries. Conserved murine and human promoter elements as well as putative amino acid domain homologies were identified from PIR-protein, SWISS-PROT and PROSITE databases by GCG computer search (Genetics Computer Group, Wisconsin Package 3.0, Madison, WI).

Growth factors and tissue culture conditions.

Recombinant human EGF was commercially obtained (Sigma). Recombinant human HRG isoforms were kindly provided (Amgen; β1 isoform 177-228) or commercially obtained (NeoMarkers; full-length β1 isoform), with no significant difference in activity detected between the truncated and full-length β1 isoforms. SKBr3, MCF-7 and MDA-435 breast cancer cell lines (Chang et al. (1997) Oncogene 14: 1617-1622; Daly et al. (1997) Cancer Res. 57: 3804-3811), and NIH3T3 mouse fibroblasts, were all maintained in DMEM medium (Life Technologies, Inc.) supplemented with 10% fetal calf serum (FCS). HC11 cells, derived from midpregnant BALB/c mouse mammary gland tissue, were maintained in culture using a growth medium consisting of RPMI-1640, 10% heat-inactivated fetal calf serum, 5 μg/ml bovine insulin, and either 2 nM HRG or 2 nM EGF (Ball et al. (1988) EMBO J. 7: 2089-2095; Marte et al. (1995) Mol. Endocrinol. 9: 14-23). HC11 cells were induced into lactogenic competency by culturing them in growth media and then maintaining them at confluency for 3 d (Marte et al. (1995) Mol. Endocrinol. 9: 14-23). These competent cultures were then induced to terminally differentiate and produce β-casein by incubation for 1-6 d in DIP induction medium (RPMI-1640, 5 μg/ml ovine prolactin, 5 μg/ml insulin, and 1 μM dexamethasone).

Northern blot analysis of cell and tissue RNA samples.

RNA samples included commercial blots of polyA-RNA from 7d to 17d mouse embryos (Clontech) and total RNA extracted from HC11 cell cultures, excised mouse (BALB/c) and rat (Sprague Dawley) inguinal mammary glands (virgin, pregnant, lactating, and involuting), and excised rat ventral prostate glands (pre- or post-castration). Extractions of total RNA were performed on snap frozen (liquid nitrogen) cell pellets or excised glands (Bielke et al. (1997) Cell Death and Differentiation 4: 114-124), using either the guanidinium isothiocyanate or Trizol (Gibco BRL) methods. When indicated, polyA-enriched RNA from mammary or prostate tissue was prepared using oligo dT-cellulose (Boehringer Mannheim). Either 10 μg of total RNA or 5 μg poly(A)-enriched RNA/sample were electrophoresed into 1% agarose gels and transferred onto either nylon (Zeta-probe, BioRad) or nitrocellulose filters which were then UV crosslinked using a Stratalinker 1800 (Stratagene). After ethidium bromide or acridine orange staining to quantitate transfer of 18S and 28S RNA, filters were hybridized with a randomly primed and α-³²P-dATP labelled 300 bp cDNA fragment from the N-terminal mESX coding region, and given final washes at 65° C in 0.2x SSC prior to autoradiography.

ESX promoter activation in transient transfection assay.

Luciferase (luc) reporter constructs (in pGL2-Basic Vector; Promega) containing either 0.4 kb (-349 bp to +61 bp) of mESX proximal promoter having >80% homology to hESX (mESX-luc) or 1.1 kb (-349 bp to +704 bp) of proximal promoter with additional 5' untranslated sequence up to the ATG initiation codon (mESX_L-luc), were constructed by PCR amplification from the murine genomic clone. Transient transfection of mESX-luc reporter (1 μg DNA) in 6 μl of Lipofectamine (Gibco BRL) in serum-free medium (SFM) was performed into replicate tissue culture wells containing 60% confluent (~1x10⁵) cells. After 5 h, the Lipofectamine-containing media was replaced with SFM and 12 h later cell cultures were induced with 10% serum-containing media + growth factor (HRG or EGF) at the indicated concentrations. The transiently transfected cell cultures were then harvested at 0-24 h following serum + growth factor induction, extracts prepared and luciferase activity measured as recommended by the vendor (Promega).

Results.

We isolated a 7.8 kb mESX genomic clone containing 4.9 kb of sequence specifying 9 exons (exons 2-9, coding) and 8 introns, consistent with the recently described genomic structure of mESX (Tymms et al. (1997) Oncogene 15: 2449-2462). Alignment of this genomic sequence with that determined from the 1.1 kb cDNA clone allowed us to compare the primary structures of murine and human ESX as shown in Figure 10. This comparison of the primary structures reveals 87% amino acid identity, and also maps the 7 exon boundaries within the encoded 371 amino acids of ESX.
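The amino acid identity figure reported for mouse versus human ESX is a ratio of matching residues to compared residues in a pairwise alignment. The following is a minimal sketch of that calculation, not the GCG procedure used here; it assumes the two sequences are already aligned with `-` as the gap character and that gap columns are excluded from the denominator, a convention the text does not specify. The short sequences are hypothetical toy strings, not ESX sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Assumption: gap columns ('-') are skipped rather than counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment (illustration only, not real ESX residues).
toy_mouse = "MAATCEISNV-FSNYF"
toy_human = "MAATCEISNIFSSNYF"
print(round(percent_identity(toy_mouse, toy_human), 1))
```

Counting gap columns as mismatches instead would lower the reported value, so the convention should be stated whenever identity figures from different sources are compared.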

Comparing the 2.9 kb of mESX promoter-containing sequence with that of a formerly cloned 1.5 kb hESX promoter-containing genomic fragment (Chang et al. (1997) Oncogene 14: 1617-1622), and aligning both with reference to exon 1 and a previously determined hESX 5'UTR sequence (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433), showed <50% bp homology between the most upstream genomic sequences (-1500 bp to -350 bp). In contrast, the proximal ESX promoter regions (-350 bp to +50 bp) showed 83% homology at the nucleotide level between mouse and human genes, demonstrated in Figure 11. The notable features in this proximal promoter region include conservation of 6 different consensus response elements (Ets, AP-2, SP1/GC box, USF, Oct, NF-κB), a CCAAT box at -75 bp, and a putative pyrimidine-rich initiator element (Inr) capable of specifying transcript initiation from the TATA-less murine promoter.
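Locating consensus elements such as a CCAAT box or an Ets core in a promoter sequence amounts to scanning both strands for short motifs. The sketch below illustrates the idea only; it is not the GCG search used in this work. The motif table is an assumption limited to two well-known cores (the literal CCAAT pentamer and the GGAA core common to Ets binding sites), and the test sequence is a hypothetical string, not ESX promoter sequence.

```python
import re

# Assumed, deliberately minimal motif table: literal core sequences only.
MOTIFS = {"CCAAT box": "CCAAT", "Ets core": "GGAA"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motifs(seq: str, motifs: dict = MOTIFS) -> list:
    """Return (name, strand, 0-based start) for every motif hit on either strand."""
    seq = seq.upper()
    hits = []
    for name, core in motifs.items():
        patterns = [("+", core)]
        rc = reverse_complement(core)
        if rc != core:  # avoid double-counting palindromic cores
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            # A lookahead finds overlapping occurrences as well.
            for match in re.finditer(f"(?={pattern})", seq):
                hits.append((name, strand, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical sequence for illustration.
for hit in find_motifs("TTGGAAGTCCAATGCTTCCA"):
    print(hit)
```

Short cores like these occur frequently by chance, so hits from such a scan are candidates only; conservation between mouse and human at the same position is what lends them weight in the comparison above.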

To verify that this homologous region of the proximal promoter can confer growth factor induced transcriptional upregulation of mESX, as is known to occur in hESX overexpressing breast cancer cells (Chang et al. (1997) Oncogene 14: 1617-1622), the activities of two different mESX promoter-reporter constructs (0.4 kb mESX-luc and 1.1 kb mESX_L-luc) were assessed by transient transfection into cultured cells expressing negligible (NIH3T3), low (MCF-7, MDA-435) or high (SKBr3) levels of endogenous ESX (Id.). Since no significant differences in promoter activity were observed between the 1.1 kb mESX_L-luc and the 0.4 kb mESX-luc constructs, the smaller mESX-luc construct was used for all subsequent experiments. The negligible and low ESX expressing cell lines consistently showed minimal reporter activity unresponsive to culture stimulation by serum + growth factors (NIH3T3, MDA-435) or estradiol (MCF-7). In contrast, mouse NIH3T3 cells engineered to overexpress human ErbB receptor pairs (ErbB2 + ErbB1 = NE2/1 cells; ErbB2 + ErbB3 = NE2/3 cells; ErbB2 + ErbB4 = NE2/4 cells) and with intracellular signaling upon appropriate ErbB ligand stimulation, showed ligand inducible increases in mESX promoter activity in the presence of serum. NE2/1 cells produced mESX-luc reporter upregulation in response to EGF while NE2/3 and NE2/4 cells responded similarly to HRG (data not shown), demonstrating the functionality of this ectopic mESX promoter within mouse cells activated by human ErbB receptors.
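The nominal sizes of the two reporter inserts can be checked against the coordinates given in the Methods (-349 bp to +61 bp, and -349 bp to +704 bp). The sketch below assumes the usual promoter numbering in which +1 is the first transcribed base and there is no position 0; the text does not state this convention explicitly, and the function name is illustrative.

```python
def promoter_fragment_length(start: int, end: int) -> int:
    """Length in bp of a fragment given in transcription-start-relative coordinates.

    Assumption: +1 is the first transcribed base and position 0 does not exist,
    so a fragment spanning the start site is one base shorter than end - start + 1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # correct for the missing position 0
    return length


short_insert = promoter_fragment_length(-349, 61)   # the 0.4 kb construct
long_insert = promoter_fragment_length(-349, 704)   # the 1.1 kb construct
print(short_insert, long_insert, long_insert - short_insert)
```

Under this assumption the inserts come to 410 bp and 1053 bp, in line with the rounded 0.4 kb and 1.1 kb labels, and they differ by 643 bp of additional 5' untranslated sequence.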

SKBr3 cells, which overexpress ErbB2 and also moderately express ErbB1 and ErbB3 receptors, were used to study mESX promoter induction since they were known to produce an immediate increase in endogenous ESX transcripts following culture exposure to HRG (5). Treatment of SKBr3 for various intervals (0-24 h) produced serum and growth factor (HRG, EGF) inducible increases in mESX-luc reporter activity. Serum supplementation alone produced a 3- to 4-fold maximal induction of promoter activity which peaked within 8 h of treatment and then declined to near serum-free basal promoter activity by 24 h. When these SKBr3 cells were treated with 1 nM HRG in addition to serum supplementation, a 7-fold peak induction over basal promoter activity was observed at 8 h, and promoter activity was still elevated nearly 4-fold over basal levels 24 h after treatment. HRG concentrations from 0.1 nM to 2 nM produced comparable enhancements in mESX promoter activity after 8 h, ranging from 2- to 3-fold over the peak activity produced by serum alone. At this same time point, concentrations of EGF up to 4 nM (in serum-supplemented media) also enhanced mESX promoter activity to a similar (although slightly lesser) degree as HRG (Figure 3B). In the absence of serum, neither HRG nor EGF produced any significant mESX promoter induction. Insulin (5 μg/ml + serum supplementation) had no significant impact on mESX promoter activity in SKBr3 cells.

Epithelial-specific ESX mRNA expression has been shown for various mouse tissues after fetal day 17, but not during earlier embryonic development or during adult mouse mammary gland differentiation (Tymms et al. (1997) Oncogene 15: 2449-2462). As shown on the Figure 4 Northern blot, mouse embryos exhibit progressive induction of a 2.2 kb ESX transcript after fetal day 7, with 17 day transcript levels about 10-fold higher than those of 11 day old embryos, which is consistent with the earliest onset of epithelial differentiation and progressive fetal growth of epithelial organs and tissues. Of interest, prior to day 17, embryos show no detectable evidence of the alternatively spliced larger ESX transcript (3.8-4.1 kb) noted in later stage fetal and adult organs and malignant tissues (Chang et al. (1997) Oncogene 14: 1617-1622; Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Tymms et al. (1997) Oncogene 15: 2449-2462; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295).

Postembryonic mammary gland expression of ESX was evaluated in 3 separate experiments where RNA was isolated from mouse glands taken at various stages of differentiation including that of virgin, pregnant, lactating, and involuting mammary glands. In general, a basal level of ESX expression was seen in virgin and first-pregnancy glands, which declined to undetectable levels after 2-3 days of lactation, and then increased to maximal levels following weaning and involution. In a representative stage-specific Northern blot profile of ESX expression, RNA samples from 8-12 day involuting glands revealed persistently high ESX expression. These later time points are beyond the active phases of mammary gland involution and after most of the molecular and histologic correlates of apoptosis and tissue remodelling have already peaked (Bielke et al. (1997) Cell Death and Differentiation 4: 114-124; Strange et al. (1992) Development 115: 49-58; Wilde et al. (1997) Mammary apoptosis: physiological regulation and molecular determinants. In Biological Signalling and the Mammary Gland (Wilde CJ, Peaker M and Taylor E, eds.), pp 103-114, Hannah Research Institute, Ayr, Scotland). A fully regressed mouse mammary gland resected 8 weeks after weaning also showed maximal ESX expression comparable to peak transcript levels observed within the first 12 days of involution.

A panel of rat mammary gland RNA samples was also probed and confirmed this stage-specific profile of ESX expression (Figure 6). Rat mammary ESX expression was basal during pregnancy, undetectable during lactation, and showed re-induction to maximal levels within 3 days of weaning and involution. Given the fact that normal prostate expresses ESX (5) and that regressing prostatic tissue shows morphological and biochemical features similar to involuting mammary tissue (14), we looked for changes in ESX expression during castration-induced involution of the adult rat ventral prostate. As with the rodent mammary tissue, rat prostate expression of ESX appears highest during glandular involution.

Detailed studies in stratified epithelium have shown that ESX expression is restricted to the most terminally differentiated epidermal keratinocytes (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295). Since ESX transcripts decline to undetectable levels during lactation, when the mammary gland is composed of fully differentiated secretory epithelium, we tried to simulate this in vivo observation using cultured HC11 cells which can be induced into lactogenic competency on exposure to growth factors, and then hormonally stimulated to differentiate and produce the milk protein, β-casein (Ball et al. (1988) EMBO J. 7: 2089-2095; Marte et al. (1995) Mol. Endocrinol. 9: 14-23). Proliferating HC11 cells express basal levels of ESX until they reach confluence and a state of lactogenic competence (2-3 d after culture confluence), at which point ESX expression increases dramatically. Upon growth factor (HRG or EGF) withdrawal and administration of lactogenic hormones (DIP induction medium), these competent and terminally differentiating cells express increasing amounts of β-casein while ESX transcript levels fall concurrently to basal levels.

Discussion.

The >30 known metazoan members of the Ets family of transcription factors are recognized for their roles in embryonic development and tissue maturation where they direct stage-specific and tissue-restricted programs of gene expression, targeted by a highly conserved ~85 amino acid Ets DNA binding domain (Wasylyk and Nordheim (1997) Ets transcription factors: partners in the integration of signal responses. In Transcription Factors in Eukaryotes (Papavassiliou AG, ed.) pp. 253-286, Springer-Verlag GmbH & Co. KG., Heidelberg, Germany; Hromas and Klemsz (1994) Int. J. Hematol. 59: 257-265; Macleod et al. (1992) Trends Biochem. Sci. 17: 251-256). As a new member of this family, the 371 amino acid ESX transactivator possesses a typical Ets DNA binding domain located in its C-terminal region (Chang et al. (1997) Oncogene 14: 1617-1622; Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Tymms et al. (1997) Oncogene 15: 2449-2462; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295). Outside of this conserved DNA binding domain, ESX contains several other structural motifs not found in other Ets proteins, a situation thought to result from >500 million years of evolutionary recombination and exon shuffling (Id.). The present findings support a domain-based modular structure for ESX as shown by the overall high degree of amino acid sequence homology (87% identity) between mouse and human ESX, and the fact that all putative domains within ESX are encoded by only one or two exons (Figure 1). In addition to the exon 8- and 9-encoded Ets DNA binding domain, these other structural modules include the exon 3-encoded Pointed (B-region) domain, the exon 5- and 6-encoded amphipathic helix and serine-rich box, and the exon 7-encoded bipartite nuclear targeting sequence (Chang et al. (1997) Oncogene 14: 1617-1622).

Similar patterns of epithelial-specific ESX mRNA expression have been noted in human and mouse tissues (Tymms et al. (1997) Oncogene 15: 2449-2462), suggesting common mechanisms of transcriptional control and promoter regulation in both the murine and human genes. Results from this study indicate that mouse and human ESX promoters share a high degree of homology (83% nucleotide identity) over a relatively short region extending ~0.4 kb upstream from the putative transcriptional start site (+1) and just beyond a conserved pair of Ets binding sites adjacent to an AP-2 consensus response element (Figure 2). The mESX promoter lacks the TATA box sequence present in hESX (-41 bp). However, both promoters have a typical CCAAT box located ~75 bp upstream of a conserved pyrimidine-rich type Inr, making it likely that both mESX and hESX function as TATA-less promoters. No significant differences in promoter activity were observed between the 1.1 kb mESX_L-luc and the 0.4 kb mESX-luc constructs, suggesting that the 0.7 kb of 5' untranslated region (UTR) between the Inr and ATG initiation codon (beginning exon 2) does not contain strong promoter regulatory elements. In addition to conserved Ets and AP-2 response elements, both the murine and human ESX proximal promoters share consensus elements for SP1/GC, USF, Oct, and NF-κB. Any combination of these response elements could account for the development- and tissue-specific profile of ESX expression common to both mouse and human tissues (Tymms et al. (1997) Oncogene 15: 2449-2462). These same response elements likely contribute to the differential upregulation of ESX promoter activity observed between high (SKBr3) and low (MCF-7, MDA-435) ESX expressing cell lines, and in SKBr3 and ErbB receptor overexpressing NIH3T3 cells (NE2/1, NE2/3, NE2/4) upon exposure to serum and growth factors (HRG, EGF). HRG, in particular, appears to synergistically enhance ESX proximal promoter activity 2- to 3-fold over the primary 3-fold stimulatory effect of serum supplementation alone, consistent with our previous report of HRG induced upregulation of ESX mRNA in cultured SKBr3 cells (Chang et al. (1997) Oncogene 14: 1617-1622). Additional studies are underway to determine the mechanisms and response elements mediating serum, HRG, and EGF induction of ESX promoter activity in these cells. The dramatic changes in ESX mRNA levels observed during normal mammary epithelial differentiation in vitro and in vivo may also be mediated by these same growth factor responsive promoter elements.
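The fold-induction figures quoted for the reporter assays are ratios of luciferase activity in treated cultures to activity in a reference condition (serum-free basal, or serum alone). A minimal sketch of that arithmetic follows. The luminometer readings are hypothetical values chosen only to fall in the ranges described in the text (serum roughly 3- to 4-fold over basal, serum plus HRG roughly 7-fold over basal); they are not data from this study.

```python
def fold_induction(treated: float, reference: float) -> float:
    """Ratio of treated reporter activity to reference reporter activity."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return treated / reference


# Hypothetical luciferase readings in arbitrary light units at 8 h.
basal = 1000.0            # serum-free medium
serum_only = 3500.0       # 10% serum
serum_plus_hrg = 7000.0   # 10% serum + 1 nM HRG

print(fold_induction(serum_only, basal))           # serum vs basal
print(fold_induction(serum_plus_hrg, basal))       # serum + HRG vs basal
print(fold_induction(serum_plus_hrg, serum_only))  # HRG effect over serum alone
```

Keeping the reference condition explicit matters here because the text reports HRG effects both relative to basal activity and relative to the serum-only peak.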

Mammary epithelium requires not only membrane activated ErbB receptor family members for normal ductal development (Xie et al. (1997) Mol. Endocrinology 11: 1766-1781) but also the ErbB receptor ligands, EGF and HRG, which are potent in vivo stimulators of mammary epithelial proliferation and differentiation (Vonderhaar (1987) J. Cell. Physiol. 132: 581-584; Coleman et al. (1988) Dev. Biol. 127: 304-315; Jones et al. (1996) Cell Growth Diff. 7: 1031-1038). The in vivo situation can be simulated in vitro using HC11 cell cultures, in which both HRG and EGF are mitogenic and either can be used to promote HC11 lactogenic competency, a state of commitment essential for subsequent hormonal induction of terminal differentiation and milk expression (Ball et al. (1988) EMBO J. 7: 2089-2095; Marte et al. (1995) Mol. Endocrinol. 9: 14-23). The mechanisms associated with lactogenic competency are incompletely understood but are partially mediated by responses to increased cell-cell interactions and to a reorganized extracellular matrix (Chammas et al. (1994) J. Cell Science 107: 1031-1040; Lochter and Bissell (1997) Mammary gland biology and the wisdom of extracellular matrix. In Biological Signalling and the Mammary Gland (Wilde CJ, Peaker M and Taylor E, eds.), pp 77-92, Hannah Research Institute, Ayr, Scotland). Our present study demonstrates that in HC11 cells, growth factor promoted lactogenic competency is associated with a dramatic upregulation in ESX expression.

The changes in ESX expression associated with in vitro induction of HC11 terminal differentiation mimicked some but not all the features of ESX transcript profiles observed during in vivo mammary gland development. Pregnancy represents a developmental stage in which epithelial cell proliferation and increasing commitment to terminal differentiation occur. Unlike the ESX upregulation observed when proliferating HC11 cells become lactogenically competent, glands from sexually mature virgin and 1st-pregnancy mice showed no significant variation in their level of ESX expression. However, with in vivo terminal differentiation of mouse and rat mammary epithelium into milk producing lobuloalveolar units, there was a marked decline in ESX expression (Figures 5 and 6) consistent with the fall in ESX transcript levels observed with hormonal induction of β-casein expression in competent HC11 cells. This dramatic decline in ESX expression upon terminal differentiation of mammary epithelial cells in vitro and in vivo is in unique contrast to stratified epithelial systems where ESX expression is upregulated and restricted to the most terminally differentiated forms of epidermal keratinocytes (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295).

ESX may now be added to a small but growing list of epithelial genes known to be repressed during lactogenesis and then dramatically upregulated with weaning and initiation of mammary gland involution (Bielke et al. (1997) Cell Death and Differentiation 4: 114-124; Strange et al. (1992) Development 115: 49-58; Wilde et al. (1997) Mammary apoptosis: physiological regulation and molecular determinants. In Biological Signalling and the Mammary Gland (Wilde CJ, Peaker M and Taylor E, eds.), pp 103-114, Hannah Research Institute, Ayr, Scotland; Marti et al. (1995) Cell Death and Differentiation 2: 277-283; Lund et al. (1996) Development 122: 181-193; Lochter and Bissell (1997) Mammary gland biology and the wisdom of extracellular matrix. In Biological Signalling and the Mammary Gland (Wilde CJ, Peaker M and Taylor E, eds.), pp 77-92, Hannah Research Institute, Ayr, Scotland). Increasing ESX transcript levels are evident in the involuting mammary glands of both mouse and rat beginning as early as 1-2 days after weaning. In the rat gland this induction reaches peak levels within 4 days, while in the mouse gland expression is maximal by 8 days and remains high for at least 8 weeks, a point when apoptosis and remodelling are completed and the gland is fully regressed. The persistence of high ESX transcript levels in fully regressed mammary glands suggests that the involutional induction of ESX is occurring in newly committed epithelial cells that are destined to survive both apoptotic and remodelling phases of involution. Future in situ analysis will address the possibility that ESX upregulation occurs in a subpopulation of partially committed and pluripotential ductal epithelium poised to regenerate a fully differentiated milk producing gland with the next cycle of pregnancy and lactation. 
Molecular markers that potentially distinguish virgin mammary epithelium from partially or terminally differentiated ductal-lobular elements are of both biological and medical interest, as they might ultimately serve to identify women whose breast tissue is more or less vulnerable to malignant transformation (Chepko and Smith (1997) Tissue & Cell 29: 239-253; Russo and Russo (1997) Endocrine-Related Cancer 4: 7-21).

Like the mammary gland, prostatic tissue is subject to involutional changes and its epithelium regresses in a reversible manner following surgical castration or pharmacologically induced androgen ablation. While the regressing ventral prostate shows morphological and biochemical features of epithelial apoptosis analogous to those of involuting mammary gland, unlike the latter it shows little evidence of tissue remodelling, with only slight induction of ECM proteinases of the matrix metalloproteinase (MMP) and serine protease families (Bielke et al. (1997) Cell Death and Differentiation 4: 114-124). In contrast, the transition from lactating to involuting mammary gland is well characterized by two distinct phases of apoptosis, an early proteinase-independent phase and a prominent proteinase-dependent later stage (Lund et al. (1996) Development 122: 181-193). In the initial phase (days 1-3 after weaning), the gland's alveoli and supporting mesenchyme remain largely intact, but chromatin cleavage and DNA laddering become detectable along with induction of the same apoptosis-associated genes upregulated during prostatic involution (Bielke et al. (1997) Cell Death and Differentiation 4: 114-124; Strange et al. (1992) Development 115: 49-58; Wilde et al. (1997) Mammary apoptosis: physiological regulation and molecular determinants. In Biological Signalling and the Mammary Gland (Wilde CJ, Peaker M and Taylor E, eds.), pp 103-114, Hannah Research Institute, Ayr, Scotland; Marti et al. (1995) Cell Death and Differentiation 2: 277-283; Lund et al. (1996) Development 122: 181-193; Lochter and Bissell (1997) Mammary gland biology and the wisdom of extracellular matrix. In Biological Signalling and the Mammary Gland (Wilde CJ, Peaker M and Taylor E, eds.), pp 77-92, Hannah Research Institute, Ayr, Scotland; Nishi et al. (1996) Prostate 28: 139-152). 
During the second stage (days 3-10 after weaning), massive apoptotic cell loss (~50% of the gland's cellularity and >95% of all alveolar epithelium) results in collapse and dissolution of all milk producing glands, necessitating a much more extensive protease-mediated ECM remodelling process than that required by the involuting prostate. Despite these differences between involuting breast and prostate glands, ESX transcript levels increased in both in a similar manner. Maximal upregulation of prostatic ESX occurred within 2-4 days of hormonal ablation, concurrent with increases in other apoptosis-associated prostatic transcripts (e.g. sulphated glycoprotein-2, tissue transglutaminase, p53, DDC-4, TGF-β1, TGF-βRII) previously demonstrated in these same RNA samples (Bielke et al. (1997) Cell Death and Differentiation 4: 114-124) or by other groups (Nishi et al. (1996) Prostate 28: 139-152).

While a number of ECM proteases are known to be transcriptionally regulated by Ets factors (Higashino et al. (1995) Oncogene 10: 1461-1463; D'Orazio et al. (1997) Gene 201: 179-187), the early and comparable extent of ESX upregulation observed during involution of prostate and mammary glands and the persistence of upregulated ESX in fully regressed mammary glands suggest that this Ets family transactivator may be regulating other genes in addition to proteases in cells destined to survive both the apoptotic and remodelling phases of glandular involution. In this regard, transglutaminase 3 and TGF-βRII are two of the few genes identified to date as being transcriptionally upregulated by ESX (Oettgen et al. (1997) Mol. Cell. Biol. 17: 4419-4433; Andreoli et al. (1997) Nucleic Acid Res. 25: 4287-4295; Choi S-G et al. (1998) J. Biol. Chem. 273: 110-117). The former is closely related to tissue transglutaminase which, along with TGF-βRII, is upregulated in concert with ESX during involution. Thus, our findings should not only stimulate the search for ESX regulated genes associated with involution and apoptosis, but also provide greater incentive to identify ECM and growth factor sensitive response elements within the ESX promoter that account for its transcriptional upregulation during prostate and mammary gland involution.

Example 7: Exon 4-encoded acidic domain in the epithelium-restricted Ets factor, ESX, confers potent transactivating capacity and binds to TATA-binding protein (TBP)

The Ets gene family has a complex evolutionary history with many family members known to regulate genetic programs essential for differentiation and development and some known for their involvement in human tumorigenesis. To understand the biological properties associated with a recently described epithelium-restricted Ets factor, ESX, an 11 kb fragment from the 1q32.2 genomically localized human gene was cloned and analyzed. Upstream of the ESX promoter region in this genomic fragment lies the terminal exon of a newly identified gene that encodes a ubiquitin-conjugating enzyme variant, UEV-1. Tissues expressing ESX produce a primary 2.2 kb transcript along with a 4.1 kb secondary transcript arising by alternate poly(A) site selection and uniquely recognized by a genomic probe from the 3' terminal region of the 11 kb clone. Endogenous expression of ESX results in a 42 kDa nuclear protein having 5-fold greater affinity for the chromatin-nuclear matrix compartment as compared to other endogenous transcription factors like AP-2 and the homologous Ets factor, ELF-1. Exon mapping of the modular structure inferred from ESX cDNA and construction of GAL4(DBD)-ESX expression constructs were used to identify a transactivating domain encoded by exon 4 having comparable potency to the acidic transactivation domain of the viral transcription factor, VP16. This exon 4-encoded 31 amino acid domain in ESX was shown by mutation and deletion analysis to possess a 13 residue acidic transactivation core which, based on modeling and circular dichroism analysis, is predicted to form an amphipathic α-helical secondary structure. Using recombinant GST-ESX (exon 4) fusion proteins in an in vitro pull-down assay, this ESX transactivation domain was shown to bind specifically to one component of the general transcription machinery, TATA-binding protein (TBP). 
Transient transfection experiments confirmed the ability of this TBP-binding transactivation domain in ESX to squelch heterologous promoters independent of any promoter binding as efficiently as the transactivation domain from VP16.

Materials and Methods.

Cell lines, recombinant ESX protein, and anti-ESX polyclonal antibody.

Human breast cancer cell lines SKBr3, MDA-468, MDA-231, ZR-75-1 and the African green monkey kidney line, COS-7, were obtained from the American Type Culture Collection (Manassas, VA) and maintained as recommended. Recombinant ESX protein was prepared and purified as previously described (Chang et al. (1997) Oncogene, 14: 1617-1622). To produce an affinity purified anti-ESX polyclonal, a synthetic 17 amino acid ESX peptide (N-terminal cysteine plus 16 C-terminal amino acids of ESX) coupled to KLH was injected into a New Zealand White SPF female rabbit (Animal Pharm Services, Inc.). Total IgG was obtained from 8 week post-immunization rabbit serum using a commercial Protein A purification kit (Pierce). Antibody purification from the total IgG was performed using the antigenic 17 amino acid ESX peptide coupled to an Affi-Gel 10 matrix according to the manufacturer's recommendations (BioRad).

Northern and Western analyses.

For Northern blotting, total cell RNA (10 μg/sample lane) prepared from freshly harvested cells by the guanidinium isothiocyanate method was electrophoresed into 1% agarose gels and transferred onto membranes that were then hybridized with ³²P-labeled probes, washed and autoradiographed as previously described (Chang et al. (1997) Oncogene, 14: 1617-1622, Example 6). The 5' HindIII probe was prepared from a 431 bp HindIII-PstI fragment, the 3' HindIII probe was prepared from a 349 bp ScaI-HindIII fragment, and the ESX probe was prepared from ESX cDNA. For Western blotting, whole-cell or nuclear extracts were boiled in sample loading buffer (1% SDS, 20% glycerol, 100 mM DTT, 50 mM Tris pH 6.8) and then electrophoresed into 9% sodium dodecyl sulphate polyacrylamide gels (SDS-PAGE) and transferred onto membranes (Immobilon-P, Millipore) by electroblotting at room temperature (Hoefer, 250 mA, 1.5 h). Protein-bound membranes were blocked in PBS containing 5% dried milk and 0.1% Tween 20, incubated in the same buffer with anti-ESX polyclonal (above) or commercially obtained antibodies that recognize ELF-1, p65NF-κB, Sp1, or AP-2 (Santa Cruz Biotechnology), and then incubated with a secondary IgG antibody conjugated to horseradish peroxidase (Sigma). Protein bands were visualized using the SuperSignal chemiluminescent substrate (Pierce).

Isolation and sequencing of the 11 kb ESX genomic fragment.

The ESX-containing 11 kb HindIII fragment identified in Southern blots was isolated from HindIII digestion fragments of the P1 clone originally used to chromosomally map ESX (Chang et al. (1997) Oncogene, 14: 1617-1622) and was subcloned into the HindIII site of pcDNAI/Amp (Invitrogen). All DNA manipulations were carried out by using standard methods (Ausubel et al. (1989) Current Protocols in Molecular Biology. John Wiley & Sons: New York; Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York). The 11 kb HindIII clone was further analysed by subcloning genomic BamHI, BglII, and StuI fragments into pBluescript (Stratagene), and the resulting plasmid clones were sequenced with primers derived from the human ESX cDNA (Chang et al. (1997) supra.). The 5'-upstream region was sequenced by primer walking. All genomic DNA fragments were sequenced from both directions by automated DNA sequencing with an ABI sequencer.

GAL4 constructs encoding deleted and mutated ESX fusion proteins.

All GAL4(DBD)-ESX chimeras were constructed by cloning a set of deletion ESX PCR products in frame with the DNA-binding domain (aa 1-147) of GAL4 into a pM vector (Clontech) for the expression of ESX fusion proteins in transiently transfected SKBr3, MCF-7 and COS-7 cells. Since the GAL4(DBD)-ESX chimeras produced relatively similar reporter gene results in each of the 3 cell lines, only the SKBr3 data are shown for the mapping of ESX's transactivation. The plasmid pM contains the DNA-binding domain (DBD) of the yeast GAL4 protein driven by the SV40 early promoter. The GAL4(DBD)-ESX full-length fusion constructs, and all the N- and C-terminal deletion constructs of ESX, were constructed by PCR amplification of the appropriate fragment of hESX derived from the plasmid pcDNAI/Amp-ESX (Chang et al. (1997) supra.), using Pfu polymerase (Stratagene). PCR primers containing an EcoRI site (5' primer) and HindIII site (3' primer) were used to create in-frame PCR fragments for cloning between the EcoRI and HindIII sites of pM to produce plasmids encoding GAL4(DBD)-ESX(1-30), GAL4(DBD)-ESX(1-63), GAL4(DBD)-ESX(1-103), GAL4(DBD)-ESX(1-128), GAL4(DBD)-ESX(1-156), GAL4(DBD)-ESX(1-268), GAL4(DBD)-ESX(55-103), GAL4(DBD)-ESX(55-128), GAL4(DBD)-ESX(55-156), GAL4(DBD)-ESX(55-268), GAL4(DBD)-ESX(104-156), GAL4(DBD)-ESX(104-159), GAL4(DBD)-ESX(104-199), GAL4(DBD)-ESX(104-229), GAL4(DBD)-ESX(129-156), GAL4(DBD)-ESX(129-159), GAL4(DBD)-ESX(129-199), GAL4(DBD)-ESX(129-229) and GAL4(DBD)-ESX(157-268). All PCR product inserts were verified by restriction mapping and sequencing. The luciferase reporter gene construct pG5E1bLUC (Chang and Gralla (1993) Mol. Cell. Biol., 13: 7469-7475) was used to determine the transactivation activities of GAL4(DBD)-ESX fusion constructs. pG5E1bLUC is a reporter plasmid with five GAL4 response elements, an inverted CCAAT box, and a TATA element controlling the luciferase (LUC) reporter gene. The internal control plasmid pCH110 (Pharmacia) was used to monitor transfection efficiency.

All GAL4(DBD)-ESX(129-159) fusion protein point mutations were introduced by a two-step PCR protocol using two complementary mutagenic primers, two flanking primers and Pfu polymerase as previously described (Ausubel et al. (1989) supra.). Briefly, two first-round PCR reactions were performed, each with one mutagenic primer plus one flanking primer. The two products from these first-round reactions were gel-purified and used as template in a second PCR reaction employing only the flanking primers. Sequences for these mutagenic primers are available upon request. The integrity of all fusion constructs was confirmed by DNA sequencing.

Transient transfection assays.

All plasmids used for transfection assays were prepared using plasmid Maxi Kits from Qiagen. Transient transfection experiments were carried out with Lipofectamine reagent (Gibco BRL) according to the manufacturer's recommendations. Briefly, subconfluent cells in 60 mm diameter plates were transfected with a total of 4 μg DNA consisting of 0.5 μg of reporter plasmid, 2.5 μg of expression plasmid and 1 μg of the internal control plasmid. The composition of transfected plasmids is described in each figure legend. Following transfection (48 h), cells were washed twice with PBS and harvested. Cell extracts were prepared and luciferase assays were carried out using a luciferase assay kit (Promega) and quantitated using a Turner Designs luminometer (model TD-20e). Transfection efficiency was monitored by measuring the β-galactosidase activity from a co-transfected pCH110 plasmid (Pharmacia). For each experiment, at least three independent transfections were performed and results are displayed as the mean plus SEM luciferase activity in relative light units. Additionally, the expression level for all GAL4(DBD)-ESX fusion constructs was verified by Western blotting of lysates from transfected cells probed with a monoclonal antibody against the GAL4 DNA binding domain (Santa Cruz Biotechnology).
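
As an illustration of how replicate reporter readings of this kind can be summarized, the following is a minimal Python sketch that computes a mean and SEM from independent transfections. It assumes that each luciferase reading is divided by the β-galactosidase reading from the same plate; the text states only that β-galactosidase activity was used to monitor transfection efficiency, so this normalization step, the function names, and the numerical readings are all illustrative assumptions and not data or procedures from this study.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_beta_gal(luciferase_rlu, beta_gal_activity):
    """Divide each luciferase reading by the beta-gal reading from the same plate.

    Assumption: the source only says beta-gal was used to monitor transfection
    efficiency; dividing by it is one common way to apply that control.
    """
    if len(luciferase_rlu) != len(beta_gal_activity):
        raise ValueError("each luciferase reading needs a matching beta-gal reading")
    return [luc / gal for luc, gal in zip(luciferase_rlu, beta_gal_activity)]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate transfections."""
    if len(values) < 2:
        raise ValueError("SEM needs at least two replicates")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical readings from three independent transfections (not study data).
luciferase_rlu = [1200.0, 1500.0, 1800.0]
beta_gal_activity = [1.0, 1.0, 1.0]

normalized = normalize_to_beta_gal(luciferase_rlu, beta_gal_activity)
average, sem = mean_and_sem(normalized)
print(f"mean = {average:.1f}, SEM = {sem:.1f} (normalized relative light units)")
```

The SEM here is the sample standard deviation divided by the square root of the number of replicates, which is why at least two transfections are required.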

GST pull-down assays.

Plasmid pGEX-6P-1-ESX(129-159), which encodes GST-exon 4 (amino acids 129-159), and plasmid pGEX-6P-1-ESX(129-159)L143P, which encodes GST-exon 4 with the leucine to proline substitution at position 143, were constructed by cloning a PCR-amplified EcoRI-XhoI fragment containing residues 129 to 159 from pM-hESX(129-159) and pM-hESX(129-159)L143P, respectively, into the pGEX-6P-1 vector (Pharmacia) cleaved at the EcoRI and XhoI sites. Glutathione S-transferase (GST), GST-exon 4 and GST-exon 4(L143P) proteins were expressed in E. coli BL21 (Novagen) and purified by using the Bulk GST Purification Module in accordance with the manufacturer's protocol (Pharmacia). The quality and quantity of the proteins were verified by SDS-PAGE followed by Coomassie staining. Following rebinding of purified GST, GST-exon 4 and GST-exon 4(L143P) to glutathione-Sepharose beads, binding reactions with commercially available TBP (Promega) and TFIIB (Promega) were performed in binding buffer (130 mM NaCl, 50 mM Tris pH 7.5, 1 mM DTT, 0.1% NP-40) at 4°C. The glutathione GST, GST-exon 4 and GST-exon 4(L143P) beads were washed extensively in binding buffer followed by elution in binding buffer adjusted to 1 M NaCl for TFIIB or successive elutions at 0.5 M, 0.8 M, 1.3 M and 2.0 M NaCl for TBP. Aliquots of the NaCl eluates were subjected to SDS-PAGE, electrophoretically transferred to nitrocellulose membranes, probed with an antibody to either TBP (Santa Cruz Biotechnology) or TFIIB (Promega) and analyzed for immunoreactivity with an enhanced-chemiluminescence detection system (Pierce).

Circular dichroism spectroscopy.

A 25 amino acid peptide from exon 4 of ESX, 131-SSSDELSWIIELLEKDGMAFQEALD-155 (SEQ ID NO: ), was synthesized on an automated synthesizer and purified by high-performance liquid chromatography. Secondary structural predictions were carried out using the Chou-Fasman algorithm (Chou and Fasman, 1978). Solutions of the peptide were prepared for circular dichroism (CD) measurements in 10 mM Tris HCl, pH 7.4 mixed with 0-50% methanol in increments of 10%; the final peptide concentration was 0.1 mg/ml. Samples were studied at room temperature in a circular quartz cell of 1 mm pathlength. Data were recorded over the range 192-260 nm in a Jasco model 720 spectropolarimeter (Jasco Inc., Easton, MD), continuously flushed with dry nitrogen. CD readings in millidegrees were converted to mean residue ellipticity [θ]r in degrees cm² dmol⁻¹, using a mean residue mass of 113 Da. The α-helix content was estimated by comparison with a reference set of 33 proteins using the variable selection method (VARSLC) (Johnson (1990) Proteins, 1, 205-214).

Results.
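
For the circular dichroism measurements described in the Materials and Methods above, the conversion from millidegrees to mean residue ellipticity can be sketched as follows. This is a minimal illustration using the standard relation [θ] = (θ in millidegrees × mean residue mass) / (10 × pathlength in cm × concentration in mg/ml), with the stated values of 113 Da, 1 mm pathlength and 0.1 mg/ml. The function name, parameter names, and the example reading of -10 millidegrees are assumptions for illustration and are not data from this study.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_mass_da, path_cm, conc_mg_per_ml):
    """Convert an observed CD signal (millidegrees) to mean residue ellipticity.

    Result is in degrees cm^2 dmol^-1, from the standard relation
    [theta] = theta_mdeg * MRW / (10 * path_cm * conc_mg_per_ml).
    """
    if path_cm <= 0 or conc_mg_per_ml <= 0:
        raise ValueError("pathlength and concentration must be positive")
    return theta_mdeg * mean_residue_mass_da / (10.0 * path_cm * conc_mg_per_ml)


# Conditions stated in the text: 113 Da mean residue mass, 1 mm cell, 0.1 mg/ml peptide.
MEAN_RESIDUE_MASS_DA = 113.0
PATH_CM = 0.1
CONC_MG_PER_ML = 0.1

# Hypothetical instrument reading (not a measurement from this study).
example_reading_mdeg = -10.0
converted = mean_residue_ellipticity(
    example_reading_mdeg, MEAN_RESIDUE_MASS_DA, PATH_CM, CONC_MG_PER_ML
)
print(f"[theta] = {converted:.0f} deg cm^2 dmol^-1")
```

Under these conditions each millidegree corresponds to about 1130 degrees cm² dmol⁻¹, so the same scale factor applies at every wavelength in the 192-260 nm scan.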

Organization of the ESX gene locus: exon-intron structure and putative UEV gene upstream of the ESX promoter.

Cloning of the ESX gene locus was undertaken to further our goal of understanding the functional domains of the ESX protein. Southern blot examination of normal as well as several human breast cancer cell lines with 5' and 3' ESX cDNA probes identified a single 11 kb HindIII band. This 11 kb fragment was isolated, subcloned and sequenced from the HindIII digestion fragments of the P1 clone initially used to localize the ESX gene to 1q32 (Chang et al. (1997) supra.). A computer search of this genomic sequence (submitted to GenBank) for known ESX cDNA sequences localized the primary 2.2 kb ESX transcript (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Chang et al. (1997) supra.; Choi et al. (1998) J. Biol. Chem., 273: 110-117; Oettgen et al. (1997) Mol. Cell. Biol., 17: 4419-4433) to 9 exons spanning ~5 kb of the genomic DNA, with exons 8 and 9 coding for the C-terminally located Ets DNA binding domain (Figure 12A). The exonic sequences are identical to the published human cDNA sequence (GenBank Accession No. U66894). The exon-intron splice sites are in good agreement with splice site consensus sequences (Table 1), and agree precisely in their positioning with those obtained for the mouse homolog of ESX (Example 6, Tymms et al. (1997) Oncogene, 15: 2449-2462). Figure 12B compares the predicted ESX amino acid sequence and location of exon-intron junctions within the Ets DNA-binding domain to that of other Ets factors whose genomic organization is published. Both amino acid identities and exon-intron junction alignments confirm the well established evolutionary conservation of ETS-1 subfamily members (ETS-1, ETS-2, PNT) across species ranging from Drosophila to human (Klämbt C. (1993) Development, 111: 163-176; Watson et al. (1988) Virology, 164: 99-105). 
In contrast to this situation in the ETS-1 subfamily, ESX and ERF demonstrate only 43% amino acid identity within the Ets domain but share a single and identically placed exon-intron boundary that terminates the α3 helix of this domain (Chang et al. (1997) supra.; Figure 12B).

As described in Example 6, the proximal 350 bp of the ESX promoter is >80% identical to that of the corresponding mouse promoter region, while the mouse and human promoter sequences between -1500 bp and -350 bp share <50% identity. Two longer Alu elements are located upstream of the region of diverging human-mouse promoter homology while two shorter Alu elements are located within introns 2 and 8. Incidentally noted were two small CpG islands of ~100 bp size near exons 1 and 7 (Figure 12A).

Table 1. Sizes of exons and introns and sequences at exon/intron junctions of the human ESX gene.

Exon  Size   Sequence at exon/intron junctions         Intron  Size   aa
no.   (bp)   5' splice donor . . . 3' splice acceptor  no.     (bp)   interrupted

1     112    ACTCCG/gtagga . . . ccacag/GTAGCC         1       422
2     171    GTACAG/gtgggt . . . ttgcag/AGAAGG         2       657    Glu-55
3     222    ACCTCA/gtgagt . . . tgtcag/CTTCCA         3       165    Thr-129
4     93     CCTTTG/gtgaga . . . tcccag/ACCAGG         4       203    Asp-160
5     120    CCGCAG/gtgaga . . . ccccag/GGACTG         5       187    Gly-200
6     90     CCAGCG/gtgagt . . . ccacag/ATGGTT         6       145    Asp-230
7     117    AGCACG/gtgagc . . . ttgcag/CGCCCA         7       530    Ala-269
8     196    CATGAG/gtgagc . . . accccag/GTACTA        8       869    Arg-334
9     797
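
The exon and intron sizes in Table 1 can be cross-checked against the figures quoted in the text with a short calculation. The following Python sketch, offered only as an illustrative consistency check, sums the tabulated sizes to obtain the genomic span of the 9 exons and compares the length of each internal coding exon, in whole codons, with the spacing between the interrupted residues listed in the table. The variable names are assumptions introduced here.

```python
# Sizes in bp, copied from Table 1 (exons 1-9 and introns 1-8).
EXON_BP = [112, 171, 222, 93, 120, 90, 117, 196, 797]
INTRON_BP = [422, 657, 165, 203, 187, 145, 530, 869]

# Residue interrupted at the end of exons 2-8, as listed in Table 1.
INTERRUPTED_RESIDUE = {2: 55, 3: 129, 4: 160, 5: 200, 6: 230, 7: 269, 8: 334}

exonic_bp = sum(EXON_BP)
intronic_bp = sum(INTRON_BP)
genomic_span_bp = exonic_bp + intronic_bp

print(f"exonic sequence: {exonic_bp} bp")
print(f"intronic sequence: {intronic_bp} bp")
print(f"genomic span of exons 1-9: {genomic_span_bp} bp")

# For exons 3-8, the number of whole codons in the exon should equal the
# distance between the residues interrupted at its two flanking junctions.
codon_check = {}
for exon_no in range(3, 9):
    whole_codons = EXON_BP[exon_no - 1] // 3
    residues_spanned = INTERRUPTED_RESIDUE[exon_no] - INTERRUPTED_RESIDUE[exon_no - 1]
    codon_check[exon_no] = (whole_codons, residues_spanned)
    print(f"exon {exon_no}: {whole_codons} codons, {residues_spanned} residues spanned")
```

The total of about 5.1 kb agrees with the statement that the 9 exons span roughly 5 kb of genomic DNA, and the 93 bp of exon 4 corresponds to the 31 residue domain (aa 129-159) discussed in this Example.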

Surprisingly, a BLAST search of ESTs compiled by TIGR identified a 1.3 kb tentative human contig (THC213038) with complete identity in its 3' terminal sequence to a 589 bp region ~4 kb upstream of the ESX transcription initiation site (Figure 12A), suggesting the presence of a terminal exon from another gene in this upstream region. As well, a 5'-TG-3' dinucleotide in the genomic sequence immediately upstream of the homology break with THC213038 suggested the presence of a splice acceptor site at this location. A BLAST search of GenBank revealed that the THC213038 sequence was identical to the terminal 3' untranslated region of a recently described ubiquitin-conjugating E2 enzyme variant, UEV-1 (Sancho et al. (1998) Mol. Cell. Biol., 18: 576-589). Northern analysis using a genomic probe from this 5' HindIII region overlapping with THC213038 demonstrated a widespread pattern of expression for this putative UEV gene in contrast to the known epithelium-restricted pattern of ESX expression. While this Northern blot result is consistent with the published pattern of UEV-1 expression (Sancho et al. (1998) supra.), the reported chromosome location of UEV-1 on 20q13.2 is not consistent with the well documented mapping of ESX to 1q32.2 (Chang et al. (1997) supra.; Oettgen et al. (1997) Genomics, 45: 456-457; Tymms et al. (1997) Oncogene, 15, 2449-2462), although one reported UEV-1 cDNA clone (MAC4) did identify a 1q locus when mapped by fluorescence in situ hybridization (Sancho et al. (1998) supra.).

Poly(A) site selection produces either 2.2 kb or 4.1 kb ESX transcripts.

A BLAST search for ESTs in the 900 bp region between the termination of the primary 2.2 kb ESX transcript and the downstream 3' HindIII site identified two contigs, THC203540 and THC209689 (Figure 12A). These overlap by 20 bp, which is insufficient to meet TIGR's 40 bp overlap criterion for contiguousness, but are otherwise identical to the genomic sequence from this region. Northern blots probed with a 349 bp fragment from the 3' HindIII region identified a 4.1 kb transcript identical in position and expression pattern to that obtained with cDNA probes from the 2.2 kb ESX transcript. THC203540 extends 300 bp beyond the 3' HindIII site and terminates with the variant and weak polyadenylation signals ATTAAA and GATAAA, respectively. Together, therefore, THC203540 and THC209689 contain 1.2 kb of sequence that undoubtedly comprises the 3' untranslated portion of the larger 4.1 kb ESX transcript previously reported (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Chang et al. (1997) supra.; Example 6; Oettgen et al. (1997) Mol. Cell. Biol., 17: 4419-4433; Tymms et al. (1997) Oncogene, 15, 2449-2462) and now thought to arise by alternate poly(A) site selection. Additionally, all the clones contributing to these two THCs were derived from epithelial sources, consistent with the reported epithelial expression pattern of ESX. It is interesting to note that a run of 12 consecutive A's in the 3' untranslated region of the 4.1 kb ESX transcript is apparently responsible for the oligo dT priming that generated 3 of the 4 clones contained in THC209689 and possibly explains the 2.7 kb ESX cDNA described in an earlier report (Choi et al. (1998) J. Biol. Chem., 273: 110-117).

Chromatin-nuclear matrix association by endogenous ESX protein.

Using an affinity purified rabbit polyclonal antibody directed against the 17 C-terminal amino acids of human ESX, Western analyses on whole cell extracts from human breast cancer cell lines expressing high (SKBr3, MDA-468), medium (ZR-75-1) or low (MDA-231) levels of the 2.2 kb ESX transcript showed that these cells express proportional amounts of a ~42 kDa ESX protein, although this polyclonal also recognizes several other proteins whose epitopes were competed by the C-terminal ESX peptide. The possibility was explored that ESX might reside within a nuclear compartment since some transcription factors (e.g. Sp1, p65NF-κB, steroid hormone receptors) are known to be preferentially retained in a high-salt resistant chromatin-pellet fraction of the nucleus containing the nuclear matrix (Raziuddin et al. (1997) J. Biol. Chem., 272: 15715-15720; Van Wijnen et al. (1993) Biochemistry, 32: 8397-8402). SKBr3 nuclei were used to produce NE (nuclear extract) and NP (nuclear pellet) fractions which were compared by Western analyses for their relative abundance of ESX, Sp1, p65NF-κB, AP-2 and the ESX-homologous Ets factor, ELF-1. Cell nuclei were extracted in a standard Dignam buffer (0.42 M NaCl) with the extract partitioned from the chromatin-nuclear matrix residue by high-speed centrifugation (NE fraction). Following removal of the Dignam extract, the chromatin-nuclear matrix pellet was then completely solubilized by heating (80°C) in RIPA buffer (NP fraction). Multiple Western blots were prepared by loading gels with identical NE and NP sample volumes, and then blotted with specific antibodies. As known components of the nuclear matrix, p65NF-κB and Sp1 served as positive controls. Upon scanning densitometry measurement of their Western bands, p65NF-κB and Sp1 were found to have NP/NE ratios of 1:1.7 and 1:3, respectively, consistent with previous studies confirming their general propensity for retention by the chromatin-nuclear matrix (Raziuddin et al. (1997) supra.; Van Wijnen et al. (1993) supra.). In contrast, AP-2 and ELF-1 were found to have NP/NE ratios of 1:12 and 1:15, demonstrating their lack of chromatin-nuclear matrix association. 
As shown by Western blots and indicated by an NP/NE ratio of 1:2.5, ESX appeared to be preferentially retained in the chromatin-nuclear matrix fraction. Given the considerable homology (~50% identity and 70% similarity) in amino acid sequences between ESX and ELF-1 in their Ets DNA-binding domains (Chang et al. (1997) supra.), the 5-fold greater chromatin-nuclear matrix affinity of ESX relative to ELF-1 probably reflects the strong localizing function of another domain positioned outside of the ESX DNA-binding domain.

ESX transactivating domain encoded by exon 4.

To look for ESX transactivating domains, a series of C-terminally deleted ESX sequences were fused 3' to the DNA binding domain (DBD) of GAL4 and these GAL4(DBD)-ESX fusion constructs were tested for their capacity to transactivate a GAL4(DBD) responsive reporter in transiently transfected SKBr3 cells. As shown in Figure 13, C-terminal deletions of ESX which preserved exon 4 (aa 129-159) demonstrated a transactivating capacity comparable to the GAL4(DBD)-VP16 positive control, whereas deletions lacking exon 4 produced near background reporter activity. We had initially suggested that the serine-rich box of ESX (aa 189-229), encoded by exons 5 and 6 and homologous to the polyserine transactivating domain of the SOX4 gene, might serve as the ESX transactivating domain (Chang et al. (1997) supra.). However, the results shown in Figure 13 indicate that the serine-rich box of ESX has no detectable transactivating activity, at least in the context of these GAL4(DBD) assays. The suppression from optimal reporter activity observed with those GAL4(DBD)-ESX fusion constructs containing the highly basic exon 7 (aa 230-268) domain might be explained by its putative HMGl-like A/T hook (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Oettgen et al. (1997) Mol. Cell. Biol., 17: 4419-4433). This exon 7-encoded DNA-binding capacity could result in sequestration of the GAL4(DBD)-ESX fusion proteins into a chromatin-nuclear matrix compartment that would diminish their exposure to the co-transfected reporter gene. To define more precisely the ESX transactivating domain, GAL4(DBD) fusion constructs were examined that contained only exon 3 (aa 55-128), exon 4 (aa 129-159), exons 3-4 (aa 55-159), exons 4-5 (aa 129-199), exons 4-6 (aa 129-229) and exons 4-7 (aa 129-268). The results shown in Figure 13 establish that ESX exon 4 autonomously encodes a potent transactivating domain of comparable strength to VP16. 
Exon 3, encoding the A-region or Pointed domain, exhibited no transactivating potential, in agreement with previous studies among ETS-1 subfamily members possessing this conserved domain (Schneikert et al., 1992). Addition of exon 3, exon 5, or exons 5-6 to the exon 4 transactivation domain produced nearly equivalent activity as exon 4 alone, although addition of exon 7 attenuated this activity. Finally, to demonstrate the essential requirement of exon 4 for transactivation, two fusion constructs lacking exon 4, GAL4(DBD)-exons 5-9 and GAL4(DBD)-exons 5-7, were tested and found to possess no significant transactivating capacity (Figure 13).

Acidic core and candidate transactivating motifs in exon 4.

Activation domains are typically characterized by the nature of their amino acid composition; thus, with 7 D or E residues out of 31 (22%), the exon 4 domain may be classified as an acidic activation domain. The relevance of these acidic residues to transactivation was determined by alanine substitutions of these residues into the GAL4(DBD)-exon 4 construct as either single or double mutations. Figure 14A demonstrates that while mutation of the two C-terminally located acidic residues (E152A/D155A) had no impact upon transactivation, mutations in the other acidic pairs D134A/E135A and E144A/D146A severely diminished transactivating capacity (to <5% control activity), while the single mutation E141A reduced reporter activity by ~60%. Interestingly, the single D134A mutation eliminated exon 4 transactivating capacity while mutation of the adjacent acidic residue (E135A) left the transactivating capacity of exon 4 undiminished. Mutation of K145, the only basic amino acid within exon 4, to Q had no significant effect on exon 4 transactivation, while the double mutation K145Q/D146A reduced transactivating capacity to 20% of control (Figure 14A).
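The acidic composition stated above can be checked with a short calculation. The sketch below is illustrative only: it assumes that the seven acidic positions named in the mutagenesis series (134, 135, 141, 144, 146, 152 and 155) are the seven D or E residues referred to, and that exon 4 spans aa 129-159 as stated. The function and variable names are hypothetical.

```python
# Illustrative check of the acidic (D/E) content of exon 4.
# Assumption: the seven positions below, taken from the mutations named in
# the text, are the seven acidic residues of exon 4 (aa 129-159).
EXON4_START, EXON4_END = 129, 159
ACIDIC_POSITIONS = {134: "D", 135: "E", 141: "E", 144: "E",
                    146: "D", 152: "E", 155: "D"}


def acidic_fraction(start, end, acidic):
    """Return (acidic count, domain length, fraction) for an inclusive range."""
    length = end - start + 1
    count = sum(1 for pos in acidic if start <= pos <= end)
    return count, length, count / length


count, length, fraction = acidic_fraction(EXON4_START, EXON4_END, ACIDIC_POSITIONS)
print(f"{count} acidic residues of {length} = {fraction:.1%}")
```

The ratio 7/31 is about 22.6%, which the text reports as 22%.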

Interestingly, an N-terminal 6 amino acid element of exon 4 (129-TSSSED-134) harbors two overlapping casein kinase II consensus sequences (S/TXXD/E). Since transactivation by Myb and PU.1 is known to be modulated by casein kinase II phosphorylation (Oelgeschlager et al. (1995) Mol. Cell. Biol., 15: 5966-5974; Pongubala et al. (1993) Science, 259: 1622-1625), the mutations S131R and S132A were introduced into the GAL4(DBD)-ESX (1-156) fusion construct to address the possibility that ESX activation is dependent upon serine phosphorylation within exon 4. As shown in Figure 13, replacement of these exon 4 serines (S131R/S132A) had no significant impact on transactivation relative to the unmutated GAL4(DBD)-ESX (1-156) fusion control, suggesting that serine phosphorylation is not involved in transactivation by exon 4. Also of interest, the motif 150-FXXφφ-154 (X = any amino acid, φ = any hydrophobic amino acid) present in the C-terminal region of exon 4 is a known recognition element for TAFII31 that is also critical for VP16 transactivation, suggesting that it generally functions as a recognition motif within acidic transactivators (Uesugi et al. (1997) Science, 277, 1310-1313). Therefore, mutational analysis of this motif was performed to establish its relevance to exon 4 transactivating capacity. When incorporated into the GAL4(DBD)-exon 4 (129-159) fusion construct, neither the single F150A mutation nor the double F150A/L154A mutation resulted in a significant (>20%) reduction in ESX transactivating capacity (Figure 5B). As a positive control, inactivation of this motif by introduction of F479A/L483A double mutations into the GAL4(DBD)-VP16 fusion construct resulted in complete loss of VP16 transactivating capacity, as previously described (Id.). Thus, despite the presence of a similar FXXφφ motif in the acidic activation domains of VP16 and ESX exon 4, only the VP16 domain appears to depend on this recognition element.
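The two overlapping casein kinase II consensus sequences in the 129-TSSSED-134 element can be located with a simple pattern scan. The following is a minimal sketch, not part of the described experiments; the function name and the use of a lookahead regular expression to report overlapping matches are choices made here for illustration.

```python
import re

# Casein kinase II consensus as given in the text: S/T-X-X-D/E.
# The lookahead makes the scan report overlapping matches.
CK2_MOTIF = re.compile(r"(?=([ST]..[DE]))")


def find_ck2_sites(peptide, first_residue_number):
    """Return (residue number of the S/T, matched tetrapeptide) for each site."""
    return [(first_residue_number + m.start(), m.group(1))
            for m in CK2_MOTIF.finditer(peptide)]


sites = find_ck2_sites("TSSSED", 129)
for position, tetrapeptide in sites:
    print(position, tetrapeptide)
```

Under these assumptions the scan reports two sites, beginning at residues 130 (SSSE) and 131 (SSED), which overlap within the six-residue element.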

Since mutations in the two C-terminal acidic residues of ESX exon 4 as well as within the C-terminally located FXXφφ motif produced no significant loss of transactivation capacity, progressive C-terminal deletions in exon 4 were undertaken to delineate a minimal ESX transactivation domain. Stop codons were introduced at amino acid positions 158, 148 and 145 to generate GAL4(DBD)-exon 4 fusion constructs with exon 4 peptide lengths of 27, 19 and 16 amino acids, respectively. The two longer N-terminal exon 4 constructs had reduced though still quite potent transactivating capacities (88% and 52% of control activity), while the 16 amino acid exon 4 fusion construct (129-144) with deleted D146 residue exhibited only 13% of control activity (Figure 14A). Lastly, a 13 amino acid GAL4(DBD)-exon 4 fusion construct (aa 134-147) with deleted N-terminal serines and possessing only the most essential of the exon 4 acidic residues showed 45% of control exon 4 activity, suggesting that much of the exon 4 transactivating capacity results from this acidic central core of 13 amino acids.

α-helical structure of acidic core in ESX transactivating domain.

While the ability of acidic domains to support transactivation has been well established, the structural basis for this transactivating function remains largely unclear (Uesugi et al. (1997) supra.). Previous structural studies of the acidic activation domains of GCN4 and p65NF-κB by circular dichroism (CD) and nuclear magnetic resonance (NMR) spectroscopy revealed that these domains, while unstructured in aqueous solution at neutral pH, assumed respectively β sheet under acidic conditions and α-helical conformation in less polar solvents (Schmitz et al. (1994) J. Biol. Chem., 269: 25613-25620; Van Hoy et al. (1993) Cell, 72: 587-594). Based on protein modeling algorithms, the 13 amino acid acidic core of the ESX exon 4 transactivation domain, 134-DELSWIIELLEKD-146 (SEQ ID NO: __), is predicted to form an α-helical structure (Chou and Fasman. (1978) Ann. Rev. Biochem., 47: 251-276). A helical wheel projection of these 13 residues suggests its amphipathic character, with hydrophilic residues D134, S137, E141, E144 and K145 distributed on one face and hydrophobic residues L136, I139, I140 and L143 on the other (Figure 15A). To assess the impact of a helix-destabilizing mutation in the middle of exon 4 on its transactivating capacity, L143 was mutated to the helix-breaking amino acid proline. As shown in Figure 145A, the L143P mutation was incapable of transactivation, suggesting the importance of this exon 4 helical structure for its transactivating function. CD spectroscopy, which allows sensitive detection of secondary structure elements in proteins and peptides, was used to probe the structure of the 25 amino acid exon 4 transactivation domain (aa 131 to 155). The spectral profile obtained for the 25-residue peptide in the presence of 50% methanol is shown in Figure 15B. In the absence of methanol the spectral minimum close to 200 nm shows the main secondary structural feature to be a random coil, although the signal approaching 222 nm indicates a small degree of α-helix. With increasing methanol concentration, however, there was a strengthening of the α-helical signal at 222 nm and a shift of the lower wavelength minimum towards 208 nm, clearly indicating greater α-helical structure in the more hydrophobic environment. This observation that the α-helical content of the exon 4 domain depends on hydrophobic conditions suggests that hydrophobic interactions in the full-length ESX protein serve to stabilize the exon 4 helical structure and its transactivating function.
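The amphipathic arrangement described for the helical wheel projection can be illustrated numerically. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100 degrees of rotation per residue) and assumes that the core sequence for residues 134-146 is DELSWIIELLEKD, as inferred from the residue identities and numbers given in the text. It is not the modeling algorithm used for Figure 15A, and the names are hypothetical.

```python
# Helical wheel angles for the 13-residue acidic core (aa 134-146).
# Assumptions: ideal alpha-helix geometry (100 degrees per residue) and the
# sequence DELSWIIELLEKD, inferred from residue numbers named in the text.
CORE_SEQUENCE = "DELSWIIELLEKD"
FIRST_RESIDUE = 134
DEGREES_PER_RESIDUE = 100  # 360 degrees / 3.6 residues per turn


def wheel_angles(sequence, first_residue):
    """Map each residue label (e.g. 'D134') to its angle on the wheel."""
    return {f"{aa}{first_residue + i}": (i * DEGREES_PER_RESIDUE) % 360
            for i, aa in enumerate(sequence)}


angles = wheel_angles(CORE_SEQUENCE, FIRST_RESIDUE)
hydrophilic = ["D134", "S137", "E141", "E144", "K145"]
hydrophobic = ["L136", "I139", "I140", "L143"]

for group_name, group in (("hydrophilic", hydrophilic), ("hydrophobic", hydrophobic)):
    print(group_name, {name: angles[name] for name in group})
```

With these assumptions the five hydrophilic residues fall between 280 and 20 degrees (wrapping through 0) and the four hydrophobic residues between 140 and 240 degrees, that is, on opposite faces of the wheel.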

Binding of the ESX transactivating domain to TBP.

Transactivators like VP16 are known to enhance transcription by recruiting one or more proteins into a preinitiation complex with RNA pol II, this complex containing general transcription factors (GTFs) including the TATA-binding protein (TBP) and assorted TBP-associated factors (TAFs) (Chang and Gralla (1993) Mol. Cell. Biol., 13: 7469-7475; Ptashne and Gann (1997) Nature, 386: 569-577; Pugh BF. (1997) Transcription Factors in Eukaryotes. Papavassiliou AG (ed), Springer-Verlag: Heidelberg, Germany, pp 37-50; Uesugi et al. (1997) Science, 277, 1310-1313). As well, at least one Ets factor (ERM) is known to recruit a TAF via its α-helical acidic activation domain (Defossez et al. (1997) Nucleic Acids Res., 25: 4455-4463). Therefore, to assess the ability of the α-helical acidic activation domain of ESX to bind and recruit such factors, we constructed a GST-exon 4 fusion protein for in vitro pull-down assays. Unmodified GST and a GST-exon 4 fusion protein containing the helix-destabilizing L143P mutation were used as negative controls. Recombinant TBP or TFIIB was applied in binding buffer to glutathione-Sepharose beads pre-bound with equivalent amounts of either GST-exon 4, GST-exon 4 (L143P), or GST. Extensive washing was followed by elution with either 1 M NaCl for TFIIB or successive elutions with 0.8, 1.3 and 2.0 M NaCl for TBP. Equal portions of the eluates were loaded onto SDS acrylamide gels, electrophoresed, blotted onto membranes and probed with antibodies to detect TBP or TFIIB in the eluates. TFIIB did not bind to unmodified GST or to either of the GST-exon 4 fusion proteins. In contrast, TBP, which showed no binding to unmodified GST, bound quantitatively to the GST-exon 4 fusion protein. Interestingly, the mutation-bearing GST-exon 4(L143P) fusion protein retained some capacity to bind TBP although with substantially reduced affinity relative to the GST-exon 4 fusion protein (Figure 7). In this regard, the GST-exon 4(L143P) protein, with its proline substitution and otherwise unaltered exon 4 charge distribution, migrated 10-15 kDa slower on SDS-PAGE relative to the GST-exon 4 fusion protein, indicating the disruptive influence of proline on the helical secondary structure of exon 4. Therefore, these GST pull-down results support the likelihood that TBP recruitment by exon 4 in vivo accounts for much of the transactivating capacity observed with GAL4(DBD)-exon 4.

Squelching mediated by the ESX transactivation domain.

Squelching occurs when a potent transactivator reduces the expression of a co-transfected reporter plasmid in a transient transfection assay, with the resulting decline in reporter activity thought to be due to sequestration of GTFs and reduction in their effective concentration (Natesan et al. (1997) Nature, 390: 349-350). As ESX exon 4 exhibited a potent in vivo transactivating capacity in the GAL4(DBD) assay and bound TBP in the context of an in vitro GST-exon 4 pull-down assay, we tested the ability of the GAL4(DBD)-exon 4 expression construct to influence two different promoters in vivo, the SV40 early promoter and a synthetic promoter containing three tandem copies of the HER2 promoter's Ets response element positioned just upstream of a minimal thymidine kinase (tk) promoter (Chang et al. (1997) supra.). As shown in Figure 16A, GAL4(DBD)-exon 4 equally suppressed the activity of these reporters lacking any GAL4 DNA-binding sites, following co-transfection into SKBr3 (or COS-7) cells. The squelching of these different reporters by GAL4(DBD)-exon 4 amounted to a 4-5 fold suppression of reporter activity as compared to co-transfection with the control GAL4(DBD) construct. In parallel experiments, we observed a similar degree of reporter squelching by GAL4(DBD)-VP16 (Figure 16A). Interest in the squelching aspects of exon 4 arose following initial co-transfection studies using full-length ESX and a HER2 promoter-driven reporter (Chang et al. (1997) supra.). In subsequent experiments (Figure 16B), this Ets element-containing reporter was up-regulated ~4 fold by co-transfection of ESX into COS-7 cells but was down-regulated ~4 fold upon co-transfection into the breast cancer cell lines SKBr3 or ZR-75-1. As both SKBr3 and ZR-75-1 cells express relatively high levels of endogenous ESX transcripts and protein, we suspected that the exogenously introduced ESX served to titrate down TBP levels available for reporter transcription.
To examine the potential role of ESX exon 4 in mediating reporter suppression in the breast cancer cell lines, the exon 4 double mutation (E144A/D146A) previously shown to eliminate GAL4(DBD)-exon 4 transactivation was introduced into full-length ESX and compared to an exon 4 deleted construct. Following co-transfection into SKBr3 cells, reporter activity was 4 fold higher with both the exon 4 mutated (E144A/D146A) and the exon 4 deleted ESX constructs as compared to the wild type ESX expression construct (Figure 16B). Conversely, in COS-7 cells where ESX expression had produced 4 fold upregulation of the Ets-driven reporter, both the exon 4 mutated and deleted constructs produced neither activation nor suppression of reporter activity relative to transfection of the empty pcDNA1 vector (Figure 16B). These results support the explanation that squelching observed with ESX under certain experimental conditions is mediated by the same exon 4 elements that mediate ESX transactivation and binding to TBP.

Discussion.

A number of groups have identified ESX as a structurally unique epithelium-restricted member of the Ets family of transcription factors (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Chang et al. (1997) Oncogene, 14: 1617-1622; Choi et al. (1998) J. Biol. Chem., 273: 110-117; Oettgen et al. (1997a) Mol. Cell. Biol., 17: 4419-4433; Tymms et al. (1997) Oncogene, 15, 2449-2462). To study transcriptional expression and the functional domain organization of the predicted 42 kDa ESX protein, we first isolated and sequenced an 11 kb genomic fragment encompassing the human ESX gene locus. The 9 exons encoding the previously defined human ESX cDNA sequence span ~5 kb of this genomic locus, and all exon-intron boundaries are conserved between mouse and human cDNA sequences (Example 6). The C-terminal DNA-binding Ets domain is encoded by exons 8 and 9, and the site of insertion of the intron separating these exons is also precisely conserved with that of another Ets factor, ERF (Figure 12B). This conservation of exon-intron boundary between ESX and ERF is somewhat surprising in that these two Ets factors share only 43% amino acid identity within their Ets domains; in contrast, ERF and ETS-1, which share 65% sequence identity, exhibit different exonic structure in this domain. As genomic structures become available for other family members having more similar sequence homologies with ESX in the Ets domain, such as EHF (84% identity) and E74/ELF-1 (49% identity), this apparent discrepancy in family lineage based on conservation of Ets domain sequence vs. exonic structure will no doubt become better understood.

One surprising finding that emerged from an analysis of the most upstream ESX genomic sequence was the identification of a terminal exon from a new gene encoding a putative ubiquitin-conjugating enzyme variant, UEV-1. Located ~5 kb upstream of the ESX transcription initiation site, this 589 bp region mapped identically to the terminal 3' sequences of both the human contig, THC213038, and a recently identified human cDNA sequence encoding a cell cycle altering UEV-1 gene product (Sancho et al. (1998) supra.). Northern analysis using a genomic fragment overlapping this 589 bp region detected transcripts migrating at 3.5 kb and 1.9 kb, consistent with the prior observation of a multiply spliced UEV-1 transcript (Id.). The more ubiquitously expressed UEV-1 gene is separated from the epithelium-specific ESX gene by two intervening Alu elements, which may serve to delineate the 5' boundary of the functionally restricted ESX promoter. Indeed, our recent ESX promoter studies indicate that conserved elements responsive to both serum and epithelial growth factors (neuregulin and epidermal growth factor) reside within 350 bp of the ESX transcription initiation site (Example 6). This proximal promoter region maintains a >80% sequence identity to the corresponding region in the mouse proximal promoter, but the promoter sequence homology is rapidly lost (<50% identity) from -350 bp upstream to -1500 bp, suggesting a very compact ESX promoter (Example 6).

Northern blots of poly(A)-RNA from whole mouse embryos (7-17 days old) have demonstrated a single 2.2 kb ESX transcript (Example 6); however, alternate ESX transcripts of larger (3.8-4.1 kb) and smaller (<2 kb) size have been noted in some adult organs (rodent and human) and malignant cell lines (Chang et al. (1997) supra.; Pongubala et al. (1993) Science, 259: 1622-1625; Tymms et al. (1997) Oncogene, 15, 2449-2462). In this regard, a BLAST EST search of the 0.9 kb genomic sequence lying between the 3' termination site of the 2.2 kb ESX transcript and the 3' HindIII site identified two overlapping contigs (THC209689 and THC203540), one of which extends 300 bp beyond the HindIII site and terminates in two adjacent poly(A) signals (Figure 12B). Northern analysis using a genomic probe from the 3' HindIII region identified only the 4.1 kb ESX transcript, indicating that this transcript likely arises from alternate poly(A) site selection.

Of the several ESX structural domains initially postulated on the basis of sequence homologies (Andreoli et al. (1997) supra.; Chang et al. (1997) supra.; Oettgen et al. (1997) supra.), the A/T hook homologous region adjacent to the Ets domain is provocative as a possible supplementary DNA recognition element that could facilitate chromatin association by minor groove binding within A/T-rich DNA stretches. One report has described weak ESX binding in vitro to a consensus A/T response element (Andreoli et al. (1997) supra.). Moreover, the identification of an A/T hook motif from an HMGI architectural factor being chromosomally disrupted and chimerically fused to acidic transactivation domains in human lipomas underscores the potential biological relevance of this A/T hook homology in ESX (Ashar et al. (1995) Cell, 82, 57-65). To look for enhanced chromatin association by ESX, an antibody to the C-terminal portion of ESX was used to Western blot nuclear fractions from the ESX overexpressing cell line, SKBr3 (Chang et al. (1997) supra.). Nuclei were fractionated into a standard Dignam-type extract (NE) and a fully solubilized chromatin-nuclear matrix fraction (NP). The potential influence of the Ets DNA-binding domain in directing preferential nuclear association of ESX was assessed by comparing NP/NE partitioning of ESX with another Ets factor, ELF-1, given its sequence homology in the Ets domain. Additionally, other SKBr3 expressed transcription factors AP-2, Sp1, and NF-κB were also assessed as controls, since the latter two are known to be components of the nuclear matrix (Raziuddin et al. (1997) supra.; Van Wijnen et al. (1993) supra.). AP-2 was found most highly associated with the NE fraction. In contrast, ESX was found to have a ~5 fold greater affinity than ELF-1 for the NP fraction and showed an NP/NE ratio comparable to that of the known nuclear matrix components, Sp1 and NF-κB.
While not directly implicating the A/T hook region in ESX for this chromatin-nuclear matrix association, the much greater affinity of ESX over ELF-1 for this association suggests that it may have been mediated by a functional element in ESX outside of the Ets domain. Another candidate element in ESX under study in this regard is the putative bipartite nuclear localization sequence (NLS) that partially overlaps the A/T hook region in exon 7 (Robbins et al. (1991) Cell, 64: 615-623). Since all initial reports documented the capacity of ESX to transactivate an epithelial promoter harboring an appropriate Ets response element (Andreoli et al. (1997) Nucleic Acids Res., 25, 4287-4295; Chang et al. (1997) Oncogene, 14: 1617-1622; Chen et al. (1998) Gene, 207: 209-218; Choi et al. (1998) J. Biol. Chem., 273: 110-117; Tymms et al. (1997) Oncogene, 15, 2449-2462; Oettgen et al. (1997) Genomics, 45: 456-457), a series of exon-based C-terminal deletions of ESX fused 3' to GAL4(DBD) were constructed and assayed for their transactivating capacity on a GAL4(DBD) responsive reporter. Transactivation to the level established by a GAL4(DBD)-VP16 positive control was achieved with all deletion constructs that included exon 4, which is located in the N-terminal region of ESX and spans amino acids 129-159. All GAL4(DBD)-ESX fusion constructs missing exon 4 exhibited virtually no transactivating capacity. One recent study exploring the transactivation capacity of ESX had shown that large N-terminal deletions abolished this activity; however, without full knowledge of ESX genomic organization this study was limited in its ability to map the ESX transactivation domain (Choi et al. (1998) supra.). To establish that the ESX domain encoded by exon 4 could function autonomously as a transactivator, various contiguous exonic groupings were fused to GAL4(DBD) and examined for their transactivating capacity. In all cases transactivation was absolutely dependent upon exon 4, and maximal reporter activity was achievable with exon 4 alone or any combination of exons with exon 4, except those containing the A/T hook encoding exon 7 (aa 236-268). GAL4(DBD)-ESX fusion constructs containing both exon 4 and exon 7 upregulated the GAL4(DBD) responsive reporter but with only ~10% of the maximal transactivating capacity of all other exon 4 containing constructs, consistent with an earlier observation (Choi et al. (1998) supra.) and suggesting that exon 7 provides an attenuating influence on ESX transactivation. While the mechanism by which exon 7 attenuates the transactivating capacity of exon 4 in these transient transfection assays remains unclear, it is not affected by the presence of the Ets domain encoded by exons 8 and 9 but may relate to the chromatin-nuclear matrix sequestering property of ESX and/or the presence of an A/T hook and candidate bipartite nuclear localization sequence in exon 7, effectively reducing the distribution of soluble transactivator seen by the co-transfected reporter gene. With 22% of its 31 amino acids being either aspartic acid (D) or glutamic acid (E), the exon 4 domain of ESX may be classified as an acidic activation domain, although it lacks sequence homology with other well studied acidic transactivation domains such as those found in the yeast transcription factor, GAL4, and the Herpes Simplex transcription factor, VP16 (Chang and Gralla (1993) supra.; Ptashne and Gann (1997) supra.; Uesugi et al. (1997) supra.). To define those amino acids essential for ESX transactivation, an array of single and double alanine (A) substitutions within exon 4 were introduced into the GAL4(DBD)-ESX(129-159) fusion construct co-transfected into SKBr3 cells.
Results from these experiments were typical of similar mutagenesis mapping studies performed on VP16-like acidic transactivators inasmuch as they demonstrated the presence of both dispensable and indispensable acidic residues without revealing the underlying mechanism of transactivation by such acidic domains (Uesugi et al. (1997) supra.). Some mutations (E135A and E152A/D155A) had virtually no impact on transactivating capacity relative to the unmodified exon 4 construct, one (E141A) reduced activity by nearly 60%, and others (D134A, D134A/E135A, and E144A/D146A) completely crippled the activating function of this domain.

Two provocative motifs are contained within exon 4, and their roles in the transactivating mechanism used by this domain were also assessed by mutagenesis. Two overlapping casein kinase (CK) II elements (129-TSSSED-134) in the N-terminus of exon 4 are potential targets for phosphorylation that could affect ESX transactivation, as has been shown for the CK II element present in another Ets family member, PU.1 (Pongubala et al. (1993) supra.). To test the relevance of these overlapping CK II sites, the double mutation S131R/S132A was introduced into exon 4 and compared to the unmodified GAL4(DBD)-ESX(129-159) control construct. Upon transfection into SKBr3 cells the mutated construct activated the GAL4(DBD) reporter as well as the unmodified control construct, demonstrating that CK II phosphorylation plays little if any role in exon 4 transactivation. Another motif appearing in the C-terminal portion of exon 4 (150-FXXφφ-154) has not only been shown to be critical for VP16 transactivation and its interaction with TAFII31, but is also thought to function as a general recognition element for acidic activators (Uesugi et al. (1997) supra.). Alanine substitutions in this motif, including a double mutation that in the context of VP16 (479-FXXφφ-483) completely abrogated its transactivating capacity, had little effect on the transactivating capacity of the GAL4(DBD)-ESX(129-159) construct. To confirm that elements residing in either terminus of this domain are not required for exon 4 transactivation, a 13 residue acidic core domain (DELSWIIELLEKDG) expressed as a GAL4(DBD)-ESX(134-147) fusion construct was shown to retain nearly 50% of the full exon 4 transactivating capacity.

While structural themes mediating the function of acidic transactivators are not fully understood, the acidic transactivating domains of GCN4 and p65NF-κB are known to assume, respectively, a β sheet or α-helical structure that depends on pH or local hydrophobic environment (Schmitz et al. (1994) J. Biol. Chem., 269: 25613-25620; Van Hoy et al. (1993) Cell, 72: 587-594). The 13 amino acid (aa 134-147) core domain from exon 4 was predicted to form an amphipathic α-helical structure. In agreement with this prediction, a helix-destabilizing proline mutation introduced into the middle of exon 4 (L143P) completely abolished its transactivating capacity. To confirm the predicted secondary structure of this domain, CD analysis of a synthetic 25 residue ESX exon 4 peptide demonstrated α-helical structure that was completely dependent on its hydrophobic (50% methanol) environment.

Acidic transactivators are thought to recruit components of the basal transcription complex into a pol II preinitiation complex, as exemplified by the specific binding of VP16 to TBP, TFIIB and TAFII31 (Chang and Gralla (1993) supra.; Ptashne and Gann (1997) supra.; Pugh (1997) supra.; Uesugi et al. (1997) supra.). Since the activation potency of the acidic exon 4 domain from ESX appeared comparable to that of the activation domain from VP16, a GST pull-down assay was employed to assess the potential affinity of exon 4 for the VP16 binding factors, TBP and TFIIB. As negative binding controls in this assay for column bound protein, the leucine to proline exon 4 mutation, GST-exon 4(L143P), as well as GST alone were used. In particular, the L143P mutation was anticipated to be a particularly stringent control since it exhibited no transactivation capacity but leaves the charge distribution unaltered in the critical acidic core region of exon 4. By SDS-PAGE, the GST-exon 4(L143P) fusion protein consistently migrated ~10-15 kDa slower than the unmodified GST-exon 4 protein, indicative of the helix-distorting influence of the proline substitution. TFIIB did not bind to either the negative controls or to the unmodified GST-exon 4 fusion protein. In contrast, the column bound GST-exon 4 fusion protein quantitatively bound all of the added recombinant TBP. Interestingly, while the GST negative control did not bind TBP, the column bound GST-exon 4(L143P) fusion protein exhibited some capacity to bind TBP but with substantially reduced affinity as compared to the unmodified GST-exon 4 fusion protein. This was shown using successive elutions at increasing salt concentrations, with 1.3 M NaCl sufficient to remove almost all TBP from the proline mutated exon 4, while unmodified exon 4 still retained most of its bound TBP following this 1.3 M elution condition, requiring 2.0 M NaCl for significant elution.
Given the ability of a 13 residue acidic core domain from exon 4 to induce nearly 50% of the transactivating capacity observed with full-length exon 4, it is tempting to speculate that its ability to bind and recruit TBP into a pol II preinitiation complex may be sufficient to explain the potent transactivating capacity of this ESX domain.

A review of published studies exploring the gene activating potential of full-length ESX on transiently co-transfected Ets responsive promoters suggests that this promoter upregulation is best detected in cell lines expressing minimal levels of endogenous ESX (Andreoli et al., supra.; Chang et al., supra.; Choi et al., supra.; Tymms et al., supra.). Indeed, we have repeatedly shown that ESX can induce >4 fold upregulation of an Ets responsive reporter in transiently co-transfected COS-7 cells, which express little if any endogenous ESX (Chang et al., supra.). As illustrated here, however, a comparative assessment of Ets promoter activation in transiently transfected SKBr3 cells produced the opposite result, suppression of an Ets responsive promoter following excess ectopic production of ESX. For these SKBr3 cells, as well as for similar cell lines expressing high levels of endogenous ESX (e.g. MDA-453, ZR-75-1; data not shown), introduction of a reporter construct bearing a tandemly repeated Ets responsive element from the erbB2/HER2 promoter that was co-transfected with a vector control resulted in >5 fold higher reporter activity than that achieved when the same reporter was co-transfected with an ESX expressing vector. The complete lack of suppression in this reporter activity (relative to vector control) with co-transfection of an ESX expression construct whose transactivation domain had been inactivated by either a double mutation (D134A/E135A) or complete deletion of exon 4 sequences (aa 129-159) indicated that this promoter squelching was dependent on the ESX transactivation domain and potentially due to TBP binding and sequestration. To demonstrate that this full squelching effect required only the acidic activation domain of ESX and was not dependent on its promoter binding, a GAL4(DBD)-ESX(129-159) fusion protein was co-transfected into SKBr3 cells with two different GAL4(DBD) promoter-independent reporters.
The strong squelching effect observed with this exon 4-encoding fusion protein on both heterologous reporters was comparable to that produced by the GAL4(DBD)-VP16(413-490) positive control.

The results from these transient transfection studies are consistent with the explanation that squelching and transactivation are both mediated by ESX exon 4 and depend on high-affinity binding by this domain to a limiting component of the basic transcriptional machinery, TBP. When TBP is recruited by promoter-bound ESX, transactivation with enhanced formation of a pol II preinitiation complex occurs. When TBP is sequestered by excess ESX protein unbound to DNA, squelching of TBP-dependent gene expression can occur. Clearly, assessment of the ability of ESX to transactivate individual Ets responsive promoters must take into account the cell system being employed (including its endogenous level of ESX expression) as well as the format of the transiently transfected Ets responsive reporter construct.

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for all purposes.

Claims

WHAT IS CLAIMED IS:

1. A method of screening for a modulator of ESX activity, said method comprising the steps of: (i) providing a target selected from the group consisting of a nucleic acid encoding a polypeptide of ESX exon 7, a polypeptide comprising ESX exon 7; (ii) contacting said nucleic acid or said polypeptide with a test agent; and (iii) detecting binding of said test agent to said nucleic acid or said polypeptide wherein a test agent that binds to said nucleic acid or said polypeptide modulates ESX activity.

2. The method of claim 1, wherein said target is a nucleic acid encoding a polypeptide of ESX exon 7.

3. The method of claim 1, wherein said target is a polypeptide of ESX exon 7.

4. The method of claim 1, wherein said nucleic acid or said polypeptide is labeled with a detectable label.

5. The method of claim 4, wherein said detectable label is selected from the group consisting of radioisotopes, enzymes, fluorescent molecules, chemiluminescent molecules, bioluminescent molecules, and colloidal metals.

6. The method of claim 5, wherein said detectable label is a fluorescent label.

7. The method of claim 1, wherein said detecting is by an immunoassay utilizing an antibody that specifically binds to a polypeptide of ESX exon 7.

8. A method of activating transcription of a gene or a cDNA, said method comprising contacting said gene or cDNA with a construct comprising a DNA binding domain attached to a polypeptide comprising exon 4 of ESX.

9. The method of claim 8, wherein said polypeptide comprising exon 4 of ESX contains one or more mutations of amino acids selected from the group consisting of amino acid 152, amino acid 155, amino acid 145, amino acid 135, amino acid 131, amino acid 132, amino acid 150, and amino acid 154.

10. The method of claim 8, wherein said polypeptide comprising exon 4 of ESX contains a carboxyl terminal deletion of exon 4 of up to 27 amino acids.

11. The method of claim 8, wherein said polypeptide comprising exon 4 of ESX contains an amino terminal deletion of exon 4 of up to 5 amino acids.

12. The method of claim 8, wherein said polypeptide comprising exon 4 of ESX contains both an amino terminal and a carboxyl terminal deletion of exon 4 leaving the exon 4 amino acids 134 through 147.

13. The method of claim 8, wherein said gene or cDNA is under the control of an epithelial gene promoter having an ETS response element.

14. The method of claim 13, wherein said gene or cDNA under the control of an epithelial gene promoter having an ETS response element is a transiently transfected vector.

15. The method of claim 8, wherein said DNA binding domain attached to a polypeptide comprising exon 4 of ESX is a fusion protein.

16. The method of claim 8, wherein said DNA binding domain is a GAL4 DNA binding domain.

17. A construct comprising a DNA binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX.

18. The construct of claim 17 comprising a DNA binding domain attached to a polypeptide comprising exon 4 of ESX.

19. The construct of claim 18, wherein said polypeptide comprising exon 4 of ESX contains one or more mutations of amino acids selected from the group consisting of amino acid 152, amino acid 155, amino acid 145, amino acid 135, amino acid 131, amino acid 132, amino acid 150, and amino acid 154.

20. The construct of claim 18, wherein said polypeptide comprising exon 4 of ESX contains a carboxyl terminal deletion of exon 4 of up to 27 amino acids.

21. The construct of claim 18, wherein said polypeptide comprising exon 4 of ESX contains an amino terminal deletion of exon 4 of up to 5 amino acids.

22. The construct of claim 18, wherein said polypeptide comprising exon 4 of ESX contains both an amino terminal and a carboxyl terminal deletion of exon 4 leaving the exon 4 amino acids 134 through 147.

23. The construct of claim 18, wherein said DNA binding domain attached to a polypeptide comprising exon 4 of ESX is a fusion protein.

24. The construct of claim 18, wherein said DNA binding domain is a GAL4 DNA binding domain.

25. A nucleic acid encoding a DNA binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX.

26. The nucleic acid of claim 25, wherein said nucleic acid encodes a DNA binding domain attached to a polypeptide comprising exon 4 of ESX.

27. The nucleic acid of claim 25, wherein said polypeptide comprising exon 4 of ESX contains one or more mutations of amino acids selected from the group consisting of amino acid 152, amino acid 155, amino acid 145, amino acid 135, amino acid 131, amino acid 132, amino acid 150, and amino acid 154.

28. The nucleic acid of claim 25, wherein said polypeptide comprising exon 4 of ESX contains a carboxyl terminal deletion of exon 4 of up to 27 amino acids.

29. The nucleic acid of claim 25, wherein said polypeptide comprising exon 4 of ESX contains an amino terminal deletion of exon 4 of up to 5 amino acids.

30. The nucleic acid of claim 25, wherein said polypeptide comprising exon 4 of ESX contains both an amino terminal and a carboxyl terminal deletion of exon 4 leaving the exon 4 amino acids 134 through 147.

31. The nucleic acid of claim 25, wherein said DNA binding domain is a GAL DNA binding domain.

32. The nucleic acid of claim 25, wherein said nucleic acid is present in a vector.

33. An affinity matrix comprising a solid support attached to a polypeptide comprising exon 4 or exon 7 of ESX.

34. The affinity matrix of claim 33, wherein said solid support is selected from the group consisting of glass, plastic, metal, ceramic, and aerogel.

35. A kit comprising a container containing one or more of the constructs selected from the group consisting of: a polypeptide that is not a full-length ESX and that comprises ESX exon 4 or ESX exon 7, a DNA binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX, a nucleic acid encoding a polypeptide that is not a full-length ESX and that comprises ESX exon 4 or ESX exon 7, and a nucleic acid encoding a DNA binding domain attached to a polypeptide comprising exon 4 or exon 7 of ESX.

36. A method of detecting dysregulation of an ESX gene in an organism, said method comprising: (i) detecting the degree of acetylation of ESX in a biological sample of said organism; and (ii) comparing said degree of acetylation of ESX in said biological sample with the degree of acetylation in a control sample from a normal healthy tissue, wherein a difference in the degree of acetylation of ESX in said biological sample with the degree of acetylation in said control sample indicates dysregulation of an ESX gene.

37. The method of claim 36, wherein said difference is a statistically significant difference.

38. The method of claim 36, wherein said detecting utilizes an antibody that specifically binds to an acetylated ESX and not to an unacetylated ESX.

39. The method of claim 36, wherein said detecting utilizes an antibody that specifically binds to an unacetylated ESX and not to an acetylated ESX.

40. The method of claim 37, wherein said statistically significant difference is indicative of an epithelial cancer.

41. The method of claim 40, wherein said epithelial cancer is human breast cancer.

42. The method of claim 41, wherein said healthy tissue comprises normal human mammary epithelial cells.

43. The method of claim 36, wherein said statistically significant difference is indicative of an unfavorable prognosis.

44. The method of claim 43, wherein said method further comprises selecting an appropriate treatment regime.

45. A method of inhibiting growth or proliferation of neoplastic cells, said method comprising administering to said cells an effective amount of an agent that inhibits activity of exon 4 or exon 7.

46. The method of claim 45, wherein said neoplastic cells comprise a cancer in an organism.

47. The method of claim 45, wherein said method comprises transfecting cells of said mammal with a vector expressing an antisense ESX nucleic acid that specifically binds to a nucleic acid of exon 4 or exon 7.

48. The method of claim 45, wherein said agent is an exon 4 mutein or an exon 7 mutein.

49. A method of depressing transcription of a gene or a cDNA, said method comprising contacting said gene or cDNA with a recombinantly expressed polypeptide comprising an exon 4 of ESX wherein said polypeptide is not a full-length ESX and lacks a DNA binding domain.

50. A probe for detection or localization of proteins that bind to ESX, said probe comprising a detectable label attached to a polypeptide comprising ESX exon 4 wherein said polypeptide is not a full length ESX polypeptide.

51. The probe of claim 50, wherein said polypeptide is recombinantly expressed.

52. The probe of claim 50, wherein said detectable label is selected from the group consisting of radioisotopes, enzymes, fluorescent molecules, chemiluminescent molecules, quantum dots, bioluminescent molecules, and colloidal metals.

53. A probe for detection or localization of proteins or nucleic acids that bind to ESX, said probe comprising a detectable label attached to a polypeptide comprising ESX exon 7 wherein said polypeptide is not a full length ESX polypeptide.

54. The probe of claim 53, wherein said polypeptide is recombinantly expressed.

55. The probe of claim 53, wherein said detectable label is selected from the group consisting of radioisotopes, enzymes, fluorescent molecules, chemiluminescent molecules, quantum dots, bioluminescent molecules, and colloidal metals.