WO1994005776A1 - Myocyte-specific transcription enhancing factor 2

Info

Publication number: WO1994005776A1
Application number: PCT/US1993/008386
Authority: WO
Inventors: Bernardo Nadal-Ginard
Original assignee: The Children's Medical Center Corporation
Priority date: 1992-09-04
Filing date: 1993-09-07
Publication date: 1994-03-17
Also published as: AU4849593A

Abstract

The invention generally features members of the Myocyte-specific Enhancer Factor 2 (MEF2) protein family which has myocyte transcription enhancing activity, the MEF2 nucleic acids or proteins being used to increase muscle cell mass or activity in transgenic animals, or in victims of muscle cell atrophy.

Description

MYOCYTE-SPECIFIC TRANSCRIPTION ENHANCING FACTOR 2

Background of the Invention

Funding for the work described herein was provided by the federal government through the National Institutes of Health, which has certain rights in the invention.

The invention relates to the use of muscle cell transcriptional regulators. Upon growth factor removal, skeletal myoblasts form terminally differentiated myotubes with the concomitant induction of a battery of muscle specific genes (reviewed in Emerson et al. 1986. Molecular biology of muscle development. Alan R. Liss, Inc., New York). The control regions of many of these genes interact with a complex set of cell specific and ubiquitous factors that combinatorially produce muscle specific transcription (Walsh et al. 1987. J. Biol. Chem. 262:9429-9432; Muscat et al. 1988. Mol. Cell. Biol. 8:4120-4133; Gossett et al. 1989. Mol. Cell. Biol. 9:5022-5033; Buskin et al. 1989. Mol. Cell. Biol. 9:2627-2640; Yu et al. 1989. Mol. Cell. Biol. 9:1839-1849; Braun et al. 1989. Mol. Cell. Biol. 9:2513-2525; Mar et al. 1990. Mol. Cell. Biol. 10:4271-4283; Thompson et al. 1991. J. Biol. Chem. 266:22678-22688; reviewed in Rosenthal, 1989. Curr. Opinion Cell Biol. 1:1094-1101).

To date, the best characterized muscle specific regulatory factors are the myogenic basic-helix-loop-helix (bHLH) proteins of the MyoD family (reviewed in Rosenthal, supra; Emerson, 1990. Curr. Opinion Cell Biol. 2:1065-1075; Olson, 1990. Genes Dev. 4:1454-1461; Weintraub et al. 1991. Science 251:761-766). Muscle gene induction by these proteins depends on sequence specific DNA binding at the E-box present in many muscle enhancers and promoters. However, not all muscle genes contain E-boxes, and even when present, they are not uniformly required for efficient muscle specific expression (Bouvagnet et al. 1987. Mol. Cell. Biol. 7:4377-4389; Walsh et al. supra; Mar et al. supra; Miller, 1990. J. Cell Biol. 111:1149-1159; Peterson et al. 1990. Cell 62:493-502; Thompson et al. supra). In addition, many genes induced by MyoD in skeletal muscle are also expressed in cardiac and, in some cases, smooth muscle where myogenic bHLH proteins have not been found and unrelated lineage-determining genes may operate (Davis et al. 1987. Cell 51:987-1000; Hopwood et al. 1989. EMBO J. 8:3409-3417; Sassoon et al. 1989. Nature 341:303-307; Wright et al. 1989. Cell 56:607-617; R.E.B., unpublished observations).

Summary of the Invention

Applicants have identified and isolated a family of muscle-specific transcription factors, the Myocyte-specific Enhancer Factor 2 (MEF2) protein family, and cDNAs encoding them. These transcription factors are characterized by their ability to enhance the transcription of genes in muscle cells that play important roles in muscle cell proliferation and differentiation. The transcription factors of the invention are useful for increasing muscle mass in agricultural or domestic animals, or in humans that suffer from muscle cell atrophy.

Accordingly, the invention generally features a transgenic non-human mammal, hereinafter referred to as a transgenic mammal of the invention, that includes a first transgene encoding a member of the Myocyte-specific Enhancer Factor 2 (MEF2) protein family having myocyte transcription enhancing activity. The term transgene is used to cover mammals comprising a transgene introduced at an embryonic stage into the mammal or into an ancestor of the mammal. "A member of the Myocyte-specific Enhancer Factor 2 protein family", referred to herein as a MEF2 polypeptide, refers to a polypeptide that enhances the transcriptional activity of a set of structural genes that include a MEF2 consensus recognition site, 5'-CTAAAAATAA-3' (SEQ ID NO: 18) or 5'-CTA(AT)₄TAG-3' (SEQ ID NO: 19), as part of their 5' regulatory sequences. A MEF2 polypeptide will include a sequence substantially homologous to the MADS enhancer sequence (Fig. 1B) (SEQ ID NO: 2), and a sequence substantially homologous to the MEF2 region (Figs. 1A (SEQ ID NO: 1) and 1C). The MEF2 family can include any active form of MEF2, including forms whose activity is potentiated by other substances. Myocyte transcription activity means activity in the assay described below or an equivalent assay.

In preferred embodiments, the nucleotide sequence of the first transgene can include at least one of the following elements: a) a nucleotide sequence encoding at least eleven consecutive glutamine residues, or b) a nucleotide sequence encoding the amino acid sequence SEEEELEL (SEQ ID NO: 20). The mammal can be any agricultural or domestic mammal, or any mammal used for laboratory, research, or diagnostic purposes. The MEF2 protein encoded by the transgene can include at least a 54 amino acid portion of the amino acid sequence of Fig. 1A (SEQ ID NO: 1). Where the wild-type protein includes an inactivation domain, the MEF2 protein can be a mutant of the wild-type protein, such that the first transgene is deleted for sequences encoding the inactivation domain.

Specific MEF2 family members according to the invention are isoforms of the MEF2 sequence shown in Fig. 1, e.g., aMEF2 (SEQ ID NO: 7) or xMEF2 (SEQ ID NO: 3). The transgenic mammal of the invention can further include a second transgene introduced into the mammal, or an ancestor of the mammal, at an embryonic stage, the second transgene including a promoter positioned to effect expression of a structural gene, the promoter being characterized in that the expression is enhanced by the MEF2 protein family member. Alternatively, the transgenic mammal of the invention can further include a second transgene introduced into the mammal, or an ancestor of the mammal, at an embryonic stage, the second transgene enhancing the activity of the MEF2 protein family member. The enhancing activity can be any enhancing activity that increases MEF2 activity, i.e., by increasing the amount of MEF2 transcribed, e.g., by increasing the expression of MEF2; or by increasing the activity of an at least partially inactive form of the MEF2 protein family member. The activity of a partially inactive MEF2 protein can be increased, for example, by including a transgene that phosphorylates MEF2, or by including a transgene that codes for a protease that deletes inactivating sequences from the primary sequence of the MEF2 polypeptide, or by including a transgene that codes for an activator molecule, e.g., a hormone. Examples of proteins that can enhance MEF2 activity include, but are not limited to, a MyoD polypeptide, a myogenin polypeptide, or a homeobox protein. A "MyoD polypeptide", as used herein, can include any member of the myogenic basic-helix-loop-helix (bHLH) polypeptide family. A transgene of the invention, e.g., a first transgene, or a second transgene, can be expressed by a tissue-specific promoter, e.g., a muscle cell specific promoter.

In another aspect, the invention includes an essentially pure nucleic acid encoding a member of the Myocyte-specific Enhancer Factor 2 (MEF2) protein family which has myocyte transcription enhancing activity.

In preferred embodiments, a MEF2 nucleic acid can include at least one of the following elements: a) a nucleotide sequence encoding at least eleven consecutive glutamine residues, or b) a nucleotide sequence encoding the amino acid sequence SEEEELEL. The MEF2 nucleic acid can also encode a 54 amino acid portion of the amino acid sequence of Fig. 1A (SEQ ID NO: 1), e.g., a sequence including the conserved MADS domain, or a sequence including the MEF2 DNA binding domain. The MEF2 nucleic acid can be an isoform of the MEF2 sequence shown in Fig. 1, e.g., aMEF2 (SEQ ID NO: 7), or xMEF2 (SEQ ID NO: 3). The MEF2 nucleic acid can be part of a nucleic acid vector, wherein the vector can also, but does not of necessity, include a transcriptional regulatory sequence positioned and oriented to regulate expression of the nucleic acid encoding the MEF2 family member. A cell that contains such a vector is also included in the invention.

An additional preferred embodiment is a substantially pure MEF2 polypeptide encoded by any of the MEF2 nucleic acids defined above. The polypeptide can include at least a 54 amino acid portion of the amino acid sequence of Fig. 1A (SEQ ID NO: 1), e.g., a sequence including the conserved MADS domain, or a sequence including the MEF2 DNA binding domain. A MEF2 polypeptide can be included in a composition that additionally includes a pharmaceutically acceptable carrier.

In a third aspect, the invention includes a method of inducing the expression of muscle-specific genes of a mammal, e.g., a human, or a domestic animal. The method involves administering to the mammal a nucleic acid vector that encodes a member of the Myocyte-specific Enhancer Factor 2 (MEF2) protein family that has transcription enhancing activity.

A preferred nucleic acid vector used in the above method of inducing the expression of muscle specific genes includes at least one of the following elements: a) a nucleotide sequence encoding at least eleven consecutive glutamine residues, or b) a nucleotide sequence encoding the amino acid sequence SEEEELEL (SEQ ID NO: 20). The method of inducing the expression of muscle-specific genes can further include a second nucleic acid administered to the mammal, the second nucleic acid enhancing the activity of the MEF2 protein family member. The enhancing activity can be any enhancing activity that increases MEF2 activity, i.e., by increasing the amount of MEF2 transcribed, e.g., by increasing the expression of MEF2; or by increasing the activity of an at least partially inactive form of the MEF2 protein family member. The activity of a partially inactive MEF2 protein can be increased, for example, by administering a second nucleic acid that phosphorylates MEF2, or by including a second nucleic acid that codes for a protease that deletes inactivating sequences from the primary sequence of the MEF2 polypeptide, or by administering a second nucleic acid that codes for an activator molecule, e.g., a hormone. Examples of proteins that can enhance MEF2 activity include, but are not limited to, a MyoD polypeptide, a myogenin polypeptide, a retinoblastoma polypeptide, or a homeobox protein. The invention also includes a method of inducing the expression of muscle-specific genes in a mammal, the method including administering a polypeptide expressed from any of the MEF2 nucleic acid sequences described above.

A method of alleviating symptoms of muscular dystrophy in a mammal features administering a MEF2 nucleic acid, or a member of the MEF2 protein family, to a mammal, preferably to a human diagnosed with any of the disease forms of muscular dystrophy, in a vector that includes means for expressing the MEF2 family member-encoding nucleic acid. Where the method of alleviating symptoms of muscular dystrophy features administering a nucleic acid, the method can further include administering a second nucleic acid, e.g., a nucleic acid encoding a dystrophin protein, to the mammal, the level of transcriptional expression of the second nucleic acid being enhanced by a member of the Myocyte-specific Enhancer Factor 2 (MEF2) protein family.

The invention also includes a method of preventing or reducing muscle atrophy in a mammal, involving administering a vector that includes a MEF2 nucleic acid of the invention, or a MEF2 polypeptide, to the mammal.

The invention also includes a method of enhancing muscle mass in a mammal, involving administering the MEF2 nucleic acid of the invention, or a MEF2 polypeptide, to the mammal. The administration can be by direct intramuscular injection.

The invention also includes a method of identifying a molecule that enhances the activity of a member of the Myocyte-specific Enhancer factor 2 (MEF2) family. The method includes providing a candidate molecule; providing a MEF2 family member of the invention in a solution; providing a MEF2 consensus nucleic acid binding sequence; and determining whether the candidate molecule enhances binding of the MEF2 family member to the MEF2 consensus binding sequence.

The invention also includes a method of identifying a molecule that enhances the activity of a member of the Myocyte-specific Enhancer factor 2 (MEF2) family. The method involves providing a candidate molecule; providing a MEF2 nucleic acid of the invention, transformed into a cell, the cell comprising a structural gene which includes a regulatory region that includes a MEF2 consensus binding sequence and a promoter responsive to the consensus binding sequence; and determining whether introduction of the candidate molecule into the cell enhances expression of the structural gene.

The invention also includes a method of identifying a molecule that enhances the activity of a member of the Myocyte-specific Enhancer factor 2 (MEF2) family. The method involves providing a candidate molecule; providing a MEF2 nucleic acid of the invention, transformed into a cell, the cell including a structural gene which includes a regulatory region that includes a MEF2 consensus binding sequence and a promoter responsive to the consensus binding sequence; and determining whether introduction of the candidate molecule into the cell enhances expression of the structural gene.

"Essentially pure nucleic acid", as used herein, is nucleic acid that is not immediately contiguous with both of the flanking sequences with which it is immediately contiguous (i.e., one at the 5' end and one at the 3' end) in the naturally-occurring genome of the organism from which the nucleic acid of the invention is derived. The term therefore includes, for example, a recombinant DNA which is incorporated into a vector; into an autonomously replicating plasmid or virus; or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by the polymerase chain reaction or by restriction endonuclease treatment) independent of other nucleic acid sequences. It also includes a recombinant DNA which is part of a hybrid gene encoding additional polypeptide sequence.

"Homologous" refers to the sequence similarity between two polypeptide molecules or between two nucleic acid molecules. When a position in both of the two compared sequences is occupied by the same base or amino acid monomeric subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then the molecules are homologous at that position. The homology between two sequences is a function of the number of matching or homologous positions shared by the two sequences. For example, if 6 of 10 of the positions in two sequences are matched or homologous then the two sequences are 60% homologous. By way of example, the DNA sequences ATTGCC and TATGGC share 50% homology.
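The position-by-position definition above can be illustrated with a minimal Python sketch. The function name `percent_homology` is hypothetical and does not come from the specification; the sketch assumes the two sequences are already aligned and of equal length, and it counts only identical positions.

```python
def percent_homology(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions occupied by the same base or amino acid.

    Hypothetical helper for illustration: assumes both sequences are
    pre-aligned and of equal length, and counts identical positions only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned and of equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # The example pair given in the definition: 3 of 6 positions match.
    print(percent_homology("ATTGCC", "TATGGC"))
```

For the pair ATTGCC and TATGGC, positions 3, 4 and 6 match, which gives the 50% figure stated in the definition.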

"A substantially pure MEF2 polypeptide" is a preparation which is substantially free of the proteins with which it naturally occurs in a cell.

The transcription factors of the invention bind and induce the expression of a number of muscle specific enhancers and promoters with the consensus sequence (C/T)T(A/T)(A/T)AAATA(A/G) (SEQ ID NO: 21). These factors regulate muscle-tissue specific gene expression in skeletal, cardiac, and smooth muscle cells. In particular, applicants have isolated and characterized multiple isoforms of the MEF2 protein family. Four preferred genes encoding this family of transcription factors (aMEF2, xMEF2, dMEF2, and cMEF2) are described below. By alternative splicing, four different isoforms of dMEF2 and CM-MEF2 are produced.
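As an illustration of how the consensus (C/T)T(A/T)(A/T)AAATA(A/G) can be applied to a DNA sequence, the following Python sketch writes it as a regular expression and scans one strand. The names `MEF2_CONSENSUS` and `find_mef2_sites` are hypothetical and are not part of the specification. The sketch scans only the strand it is given and says nothing about binding strength.

```python
import re

# The consensus (C/T)T(A/T)(A/T)AAATA(A/G) as a regular expression.
# A lookahead is used so that overlapping candidate sites are all reported.
MEF2_CONSENSUS = re.compile(r"(?=([CT]T[AT][AT]AAATA[AG]))")


def find_mef2_sites(dna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 10-mer) for each consensus match.

    Hypothetical helper for illustration: scans only the strand supplied.
    """
    return [(m.start(), m.group(1)) for m in MEF2_CONSENSUS.finditer(dna.upper())]


if __name__ == "__main__":
    # Core of the MEF2 probe from the mouse MCK enhancer (linkers omitted).
    print(find_mef2_sites("CTCGCTCTAAAAATAACCCTGT"))
    # Core of the MEF2mt mutant listed in Table 1.
    print(find_mef2_sites("CGCTCTAAGGCTAACCCT"))
```

The wild-type MCK probe core contains one match, CTAAAAATAA, while the MEF2mt core contains none, which agrees with the binding pattern reported for these two oligonucleotides.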

MEF2 transcription factors according to the invention can be used to produce transgenic animals with increased muscle cell mass, to prevent or counteract muscle atrophy in humans or animals suffering a pathological muscular condition, or to develop pharmacological agents that regulate the expression of muscle-tissue genes. Other features and advantages of the invention will be apparent from the following description and from the claims.

Brief Description of the Drawings

Fig. 1A is a representation of the nucleotide sequence and corresponding amino acid sequence of MEF2. Fig. 1B compares amino acid sequences of a region of MEF2 with other proteins. Fig. 1C shows alternatively spliced isoforms (SEQ ID NOS: 1, 2, 6 and 7).

Fig. 2 is a representation of the nucleotide sequence and corresponding amino acid sequence of the xMEF2 isoform, a product of a related gene (SEQ ID NO: 3).

Figs. 3A-3G illustrate how ubiquitously expressed MEF2-related RNAs accumulate preferentially in skeletal muscle, heart, and brain.

Fig. 4 is an autoradiograph showing that xMEF2 RNAs are highly restricted to skeletal muscle, heart, and brain.

Figs. 5A through 5D are autoradiographs showing that endogenous myotube MEF2 and cloned MEF2 have identical DNA binding specificities.

Figs. 6A and 6B are electrophoretic demonstrations that skeletal, cardiac, and smooth muscle specific DNA binding activity is due to MEF2/aMEF2.

Figs. 7A and 7B are diagrammatic representations that cloned MEF2 reproduces site-dependent transcriptional activation present in skeletal, cardiac, and smooth muscle.

Fig. 8 is a diagrammatic representation that MyoD induces trans-activation in nonmuscle cells.

Fig. 9 is an illustration of the relation between the amount of injected DNA and CAT activity.

Fig. 10 is an illustration of the time course of expression of injected gene constructs.

Fig. 11 is an illustration of the regional expression pattern of injected gene constructs throughout the left ventricular wall.

Fig. 12 is an illustration of the expression of promiscuous (MSV) or muscle-specific (-667rβ-MHC) promoter constructs in the right ventricle and in skeletal muscle.

Figs. 13A and 13B are illustrations of the correlation of CAT to luciferase activity in co-injection experiments.

Fig. 14 is an illustration of the mapping of the 5' flanking region of the β-MHC gene in vivo.

Fig. 15 is a representation of the nucleotide (1-2161) and predicted amino acid (1-465) sequences of the dMEF2 cDNA. The double underlined region indicates the putative MADS domain. The region downstream of the MADS domain which is necessary for sequence specificity of the MEF2 related factors is underlined. The alternatively spliced (96nt) region at the 3' end of the cDNAs is overlined with a dashed line. (SEQ ID NO: 4)

Fig. 16 is a diagram of the various alternatively spliced dMEF2 gene products: white, untranslated sequence; checkered, MADS domain; spotted, MEF2 conserved region; diagonal stripes, dMEF2 alternative coding exons.

Fig. 17 is a sequence analysis comparing the predicted amino acid sequence of dMEF2 (SEQ ID NO: 5) and MEF2. A vertical line indicates an identical amino acid; a colon (:) indicates a highly conservative substitution; and a dot (.) indicates a conservative substitution.

Fig. 18 is a comparison of the MADS/MEF2 domain amino acid sequences from dMEF2 (present study) (SEQ ID NO: 5), MEF2 (Yu et al. 1992) (SEQ ID NO: 1), xMEF2 (Yu et al. 1992) (SEQ ID NO: 3), SRF (Norman et al. 1988) (SEQ ID NO: 11), MCM1 (Passmore et al. 1988) (SEQ ID NO: 12), AGL6 (Ma et al. 1991) (SEQ ID NO: 13), AG (Yanofsky et al. 1990) (SEQ ID NO: 17), TM6 (Pnueli et al. 1991) (SEQ ID NO: 14), DEF (Sommer et al. 1990) (SEQ ID NO: 15), and AP3 (Jack et al. 1992) (SEQ ID NO: 16). The MADS domain is the checkered sequence and the MEF2 specific extension of the binding site corresponds to the spotted sequence. The overall identity between these factors is indicated at the right of each sequence. The absolutely conserved amino acids are indicated in capitals in the consensus and conservative substitutions are indicated in lower case letters. For the MADS domain, the consensus is calculated for all of the factors. For the MEF2 specific domain the consensus is calculated just for the 3 MEF2 related factors. The two schematics show a cross section of the two amino-terminal regions which contain predicted amphipathic alpha helices (aa 20-33 and aa 60-69, respectively). The amino-terminal end of helix 1 begins at Thr-20 in the upper region of the diagram and rotates clockwise 100° per residue to Tyr-33. Helix 2 begins at Thr-60 in the upper region of the diagram and ends at Tyr-69 in the lower region. The hydrophobic residues, which are in bold print, are clustered on one side of each alpha-helix.
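The helical wheel projection described for Fig. 18 places each successive residue 100° further around a circle. A minimal Python sketch of that arithmetic follows; the function name `helical_wheel_angles` is hypothetical, and the sketch assumes the first residue sits at 0° with angles measured clockwise from it.

```python
def helical_wheel_angles(first_residue: int, last_residue: int,
                         step_deg: float = 100.0) -> dict[int, float]:
    """Map each residue number to its clockwise angle on a helical wheel.

    Hypothetical helper for illustration: the first residue is placed at
    0 degrees and each later residue is rotated by a further step_deg.
    """
    if last_residue < first_residue:
        raise ValueError("last_residue must not precede first_residue")
    return {
        pos: ((pos - first_residue) * step_deg) % 360.0
        for pos in range(first_residue, last_residue + 1)
    }


if __name__ == "__main__":
    # Helix 1 (aa 20-33) and helix 2 (aa 60-69) as delimited in the text.
    for start, end in ((20, 33), (60, 69)):
        for residue, angle in helical_wheel_angles(start, end).items():
            print(residue, angle)
```

Under these assumptions residue 33 of helix 1 falls at 220° and residue 69 of helix 2 at 180°, the latter being diametrically opposite the starting residue.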

Fig. 19 is a comparison by alignment of the amino acid sequences of aMEF2 (SEQ ID NO: 7), yMEF2 (SEQ ID NO: 8), CM-MEF2 (SEQ ID NO: 9), cMEF2 (SEQ ID NO: 10), and xMEF2 (SEQ ID NO: 3). Amino acids are expressed in one letter standard code.

Description of the Preferred Embodiment(s)

Methods

Library Screening

The initial MEF2 cDNA clone was obtained by screening a λgt11 expression library generated from primary human skeletal myocytes cultured from vastus lateralis with a probe containing four concatenated copies of the MEF2 site, sequences -1081 to -1059 of the mouse MCK enhancer (-1081/-1059) (Sternberg et al., Mol. Cell. Biol., 8:2896-2909, 1988) (5'-GATCCTCGCTCTAAAAATAACCCTGTC-3') (SEQ ID NO: 22) at a specific activity of 7.8 x 10⁸ cpm/μg. The screening procedures of Singh et al. (Cell, 52:415-423, 1988) were followed with several modifications. The buffer used to blot the filters contained 5% nonfat milk powder (Carnation) in 1 x binding buffer (1 x BB: 20 mM HEPES, pH 7.9, 50 mM KCl, 0.2 mM EDTA, 1 mM DTT). After washing the filters twice with 0.25% milk in 1 x BB, the filters were incubated in the same buffer containing 10 μg/ml poly(dI-dC)/poly(dI-dC), 10 μg/ml denatured salmon sperm DNA, and the ³²P-MEF2 probe (1.7 x 10⁶ cpm/ml) at room temperature for 1 hr. The filters were then kept at 4 °C overnight with gentle agitation, followed by washing four times with 0.25% milk in 1 x BB at 4 °C for a total of 25 minutes, and subjected to autoradiography. One positive clone was purified, and the DNA insert (2.97 kb) was subcloned and completely sequenced.

For DNA hybridization screening, human adult male cardiac ventricle λZAPII (Stratagene, La Jolla, CA) and dog cardiac ventricle λgt10 (Scott et al., J. Biol. Chem., 263:8958-8964, 1988) cDNA libraries were screened according to standard methods (Sambrook et al., Molecular Cloning, A Laboratory Manual, Second Edition, 1989). Filters were hybridized at 37 or 42 °C in 5X SSC, 50 mM Na phosphate, pH 6.5, 1.2X Denhardt's, 0.1% SDS, 100 μg/ml calf thymus DNA, 10% dextran sulfate, 25% or 50% formamide, and 2 x 10⁶ cpm/ml probe. The probe was the 387 bp NsiI/NdeI MEF2 cDNA fragment (nt 342-728) labeled with ³²P to a specific activity of 10⁸-10⁹ cpm/μg. Filters were then washed in 2X SSC/0.2% SDS at 25-37 °C and exposed to film. The positive clones were purified, and the cDNA inserts were subcloned and sequenced.

DNA Sequence Analysis

Cloned cDNAs were sequenced with automated instrumentation (Applied Biosystems, Foster City, CA) using a modified dideoxy chain termination method (reviewed in Connell et al., BioTechniques 5:342-348, 1987). Sequences were verified by multiple runs for both strands. Computer analysis of nucleic acid and protein sequences was performed using the University of Wisconsin Genetics Computer Group Sequence Analysis Software Package (Devereux et al., Nucl. Acids Res., 12:387-395, 1984) and the BLAST Network Service of the National Center for Biotechnology Information (Altschul et al., J. Mol. Biol., 215:403-410, 1990).

RNA Blot Analysis

Poly-A⁺ RNAs from cultured cells and mouse tissues were electrophoresed (5 μg per lane) and transferred to membranes according to Sambrook et al. (Molecular Cloning, A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, 1989). The human tissue mRNA blot (2 μg per lane) was obtained from Clontech (Palo Alto, CA). Blots were hybridized in 1 M NaCl, 50 mM Tris-HCl, pH 7.5, 1% SDS, calf thymus DNA 100 μg/ml, 10% dextran sulfate, and 50% formamide with 1 x 10⁶ cpm/ml of probe at 42 °C. The probes were those described in Figures 3 and 4. The blots were then washed at progressively increasing stringencies up to 0.1X SSC/0.1% SDS at 60 °C. Between probes, blots were completely stripped.

Synthetic Oligonucleotides Used in Electrophoretic Mobility Shift Assay (EMSA)

All probes and competitor DNAs were double-stranded (d.s.) synthetic oligonucleotides. For each DNA, the nucleotide sequence (one strand, linker sequence shown in parenthesis), and coordinates in the respective enhancer or promoter is as follows: MEF2, 5'-(GATC)CTCGCTCTAAAAATAACCCTGT(A)-3' (SEQ ID NO: 22) (mouse MCK enhancer -1081/-1060, Sternberg et al., Mol. Cell. Biol., 8:2896-2909, 1988); MEF2mt, MEF2mt4, and MEF2mt6 were MEF2 mutants with point mutations shown in Table 1 (also in Cserjesi et al., Mol. Cell. Biol. 11:4854-4862, 1991); A/Temb, 5'-(AGCTT)CGGACCCTGCTCATTTCTATATATA(G)-3' (SEQ ID NO: 23) (rat embryonic myosin heavy chain promoter -176/-151, Bouvagnet et al., Mol. Cell. Biol., 7:4377-4389, 1987); CArG, 5'-(AGCTT)GGGGACCAAATAAGGCAAGGT(G)-3' (SEQ ID NO: 24) (human cardiac α-actin promoter -114/-93, Miwa and Kedes, Mol. Cell. Biol. 7:2803-2813, 1987); OTF-2, 5'-(GATCC)TTCCCAATGATTTGCATGCTCTCAC-3' (SEQ ID NO: 25) (immunoglobulin κ light chain promoter -75/-51, Scheidereit et al., Cell, 51:783-793, 1987); MLC2-HF-1, 5'-(GATC)TCCCTGGGGTTAAAAATAACCCCATGAC-3' (SEQ ID NO: 26) (rat cardiac myosin light-chain-2 promoter -35/-62, Zhu et al., Mol. Cell. Biol. 11:2273-2281, 1991); MCK A/T, 5'-(GATC)GATCGATGCCTGGTTATAATTAACCCAGACAT-3' (SEQ ID NO: 27) (mouse MCK enhancer -1200/-1173, Sternberg et al., Mol. Cell. Biol., 8:2896-2909, 1988); cTNT A/T, 5'-(GATC)TCCGACGGGTTTAAAATAGCAAAACTCT-3' (SEQ ID NO: 28) (chicken cardiac troponin T gene -226/-119, Iannello et al., J. Biol. Chem. 266:3309-3316, 1991); αMHC A/T-1, 5'-(GATC)CCTTTCAGATTAAAAATAACTAAGGTAA-3' (SEQ ID NO: 29) and αMHC A/T-2, 5'-(GATC)GCCCAAGGACTAAAAAAAGGCCCTGGA-3' (SEQ ID NO: 30) (rat α myosin heavy chain gene -340/-313 and -

Probe/Competitor DNA    Sequence                                       MEF2 Binding    SEQUENCE ID

MEF2                    5'- C G C T C T A A A A A T A A C C C T -3'    +++

MEF2mt                      C G C T C T A A G G C T A A C C C T

MEF2mt4

MEF2mt6

A/Temb

CArG

OTF-2

MLC2-HF-1

MCK A/T

cTNT A/T

αMHC A/T-1

αMHC A/T-2

Consensus               C T A A A A A T A A

                        T t T G

Table 1. Nucleotide sequences of probes and competitor DNAs used in MEF2 binding assays. Only the core sequences of the d.s. oligonucleotides are shown. (+) and (-) represent positive and negative binding of the probes, respectively (see Figure 5). Nucleotides in bold print conform to the consensus sequence of the MEF2 site as reported by Cserjesi and Olson, Mol. Cell. Biol., 11:4854-4862, 1991.

Preparation of nuclear extracts and EMSA

The nuclear extracts from C2C12 myoblasts, myotubes, HeLa cells, and rat primary neonatal cardiocytes were prepared as described previously (Yu et al., Mol. Cell. Biol., 9:1839-1849, 1989; Thompson et al., J. Biol. Chem., 266:22678-22688, 1991). Nuclear extracts from NIH3T3 cells, 10T1/2 cells, and smooth muscle cells were prepared according to the procedures of Schreiber et al. (Nucl. Acids Res., 18:5496-5503, 1990). Smooth muscle cells were from a cell line derived from adult rat pulmonary arteries. The EMSA assays were carried out as described previously (Yu et al., 1989, supra) with a few modifications. When the nuclear extracts were examined for the binding activities, the incubation mixture contained 4-7 μg extract, 0.2 ng probe, 3-3.5 μg poly(dI-dC)/poly(dI-dC), and 100 ng single stranded (s.s.) synthetic oligonucleotide as nonspecific DNA competitors in the binding buffer. When the in vitro translated protein was used in the EMSA assays, the incubation mixture contained 1.5 μl translated reticulocyte lysate, 0.2 ng probe, 0.45 μg poly(dI-dC), and 100-150 ng s.s. oligonucleotide. The bound fraction and the free probe were separated in a 5% polyacrylamide gel (acrylamide:bis = 29:1) at 4 °C.

Generation of anti-MEF2 antibodies and supershift gel retardation assays

Synthetic peptides corresponding to the partial alternative exons in MEF2 (SEQ ID NO: 1) and aMEF2 (SEQ ID NO: 7) (Fig. 1A), TPHTEEKYKKINEEF(C) (SEQ ID NO: 31) and (C)DYFEHSPLSEDR (SEQ ID NO: 32), respectively, were used to raise antibodies against MEF2 and aMEF2 (Harlow and Lane, Antibodies, A Laboratory Manual, 1988). The specificities of antisera were demonstrated by the EMSA using the in vitro translated MEF2 and aMEF2, as well as by the specific immunoprecipitation of MEF2 and aMEF2 obtained from in vitro translation and in vivo expression. The anti-MEF2 antiserum recognized both MEF2 and aMEF2, whereas the anti-aMEF2 recognized aMEF2 only. For supershift EMSA, the procedures of Brennan and Olson (Genes & Dev. 4:582-595, 1990) were followed, using 1 μl of serum.

Construction of Plasmid DNAs

For in vitro and in vivo expression of cloned MEF2 isoforms, the cDNA inserts were subcloned into pGEM vectors (Promega Corp., Madison, WI) and pMT2 vector (Kaufman et al., Mol. Cell. Biol. 9:946-958, 1989), respectively. To generate the MHCemb-CAT reporter constructs, two copies of various oligonucleotides were inserted at -102 of the MHCemb promoter in plasmid PE102CAT (Fig. 7A) (Yu et al., Mol. Cell. Biol., 1989 supra). Two copies of oligonucleotides were also cloned at the HindIII site of pδTCKAT (Thompson et al., J. Biol. Chem., 266:22678-22688, 1991) located at -109 of the HSV thymidine kinase gene promoter to generate the TK-CAT reporter constructs.

Tissue culture and transient expression assays

The tissue culture and transient expression assays were performed as described previously (Yu et al., 1989 supra; Thompson et al., J. Biol. Chem., 266:22678-22688, 1991). Transfections were carried out using 10 μg of the individual CAT reporter plasmid, 5 μg of either pMT2-MEF2 or vector pMT2, and 3 μg of the internal control pSV-βgal. The preparation of cell extracts and the assays on the activities of CAT and β-galactosidase were reported previously (Yu et al., 1989 supra; Thompson et al., 1991 supra). When noted, 5 μg of pMSV-MyoD (Davis et al., Cell, 51:987-1000, 1987) was used. Pulmonary arterial smooth muscle cells were maintained in DME/20% FCS. For transient expression assays, these cells were allowed to grow to about 60% confluency, and transfected with various DNAs by calcium phosphate coprecipitation as described above. Cells were glycerol shocked 18 hrs later, and re-fed with DME/20% FCS. After 24 hours, the media was changed to low serum media (DME/5% heat inactivated horse serum), and cells were harvested 48 hours later.

Results

MEF2 and Related Isoforms Are Members of the MADS Gene Family

Using oligonucleotides containing four concatenated copies of the MCK MEF2 binding site sequence, a total of 1.5 x 10⁶ recombinants were screened from a λgt11 cDNA expression library generated from primary human skeletal myocytes cultured from vastus lateralis. A single positive clone was obtained, producing a protein which specifically bound the probe. The results are shown in Fig. 1.

In Fig. 1A the nucleotide (1-2968) and predicted amino acid (1-507) sequences of the MEF2 cDNA are shown in upper case letters (SEQ ID NO: 1). The aMEF2 cDNA differs from MEF2 in the alternatively spliced exon beginning at nt 673 (aa 87), which is 2 codons shorter and is indicated above the MEF2 sequence (SEQ ID NO: 7). The underlined region is highly conserved between these isoforms and the product of another gene, xMEF2 (Fig. 2) (SEQ ID NO: 3), including the MADS domain underlined in bold. The sequence of the clone containing the alternatively spliced 5' untranslated region is indicated in lower case letters (unnumbered) (SEQ ID NO: 6), with the dotted line overlying the excluded Alu repeat.

In Fig. 1B the MEF2 and xMEF2 MADS domain amino acid sequences (#3-57 of SEQ ID NO: 1, and #3-57 of SEQ ID NO: 3, respectively) are compared to those of the plant homeotic genes agamous (AG, Yanofsky et al., Nature, 346:35-39, 1990) (SEQ ID NO: 17) and deficiens (DEFA, Sommer et al., EMBO J., 9:605-613, 1990) (#1-55 of SEQ ID NO: 15), the human serum response factor (SRF, Norman et al., Cell, 55:989-1003, 1988) (#1-55 of SEQ ID NO: 11), and the yeast transcription factors MCM1 (Ammerer, Genes Dev. 4, 299-312, 1990) (#1-55 of SEQ ID NO: 12) and ARG80 (Dubois et al., Mol. Gen. Genet., 207:142-148, 1987) (SEQ ID NO: 2). The first position of each is numbered. Residues conserved in MEF2, xMEF2, and at least two of the other proteins are marked (*).

In Fig. 1C the various alternatively spliced isoforms of the MEF2 gene are diagrammed: white, untranslated sequence; checkered, MADS domain; black, constant regions; horizontal and vertical stripes, MEF2 and aMEF2 alternative coding exons. An additional isoform that introduces a premature stop codon is also depicted (diagonal stripes). The alternative sequence encoding the peptide SEEEELEL (SEQ ID NO: 20), absent in RSRFC4/RSRFC9 (Pollack and Treisman, Genes Dev. 5, 2327-2341, 1991), is indicated.

The 2.97 kb insert has a long open reading frame encoding a predicted polypeptide of 507 amino acids, provisionally named MEF2, with a calculated molecular weight of 54.8 kD and isoelectric point of 7.99 (Figure 1A). The methionine initiation codon is preceded by a translation stop three codons upstream. The 3' end of the cDNA has a tract of eleven adenosines, but there is no canonical polyadenylation signal. The sequence AACAAA appears beginning 29 nt upstream, but this has been shown to be a poorly functional mutation of the consensus (reviewed in Birnstiel et al., Cell, 41:349-359, 1985). Thus, this tract of adenosines may be internally encoded in a longer 3' untranslated sequence.
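A calculated molecular weight such as the one quoted for the predicted MEF2 polypeptide is conventionally obtained by summing average residue masses and adding one water for the free termini. The following Python sketch illustrates that arithmetic only; it is not the procedure used for the figures reported here, the function name molecular_weight is hypothetical, and the residue masses are standard average values rather than anything taken from this specification.

```python
# Average residue masses (Da) for the 20 standard amino acids.
# These are textbook values, assumed here for illustration.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def molecular_weight(protein):
    """Return the average molecular weight (Da) of an unmodified peptide."""
    seq = protein.upper()
    unknown = set(seq) - set(AVG_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVG_RESIDUE_MASS[res] for res in seq) + WATER


if __name__ == "__main__":
    # SEEEELEL is the short alternative peptide named in the text.
    print(f"{molecular_weight('SEEEELEL'):.1f} Da")
```

Applied to a full-length predicted sequence, the same sum gives a value in kD after division by 1000; post-translational modifications are ignored in this sketch.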

The N-terminal region of the encoded MEF2 protein (amino acids 3-57) (SEQ ID NO: 1) is closely homologous to the conserved DNA binding and dimerization domains of the recently identified MADS gene family, comprising a series of yeast and human transcription factors and plant homeotic loci (Fig. 1B; reviewed in Schwarz-Sommer et al., Science, 250:931-936, 1990; Coen et al., Nature, 353:31-37, 1991). A region rich in basic residues (amino acids 3-31) overlaps a relatively long predicted α-helix from amino acids 23-48. Beyond the MADS domain, there is a distinctive sequence of 27 consecutive glutamines and prolines (amino acids 420-446) (SEQ ID NO: 1) and another region rich in serine and threonine (amino acids 141-186, 43% S+T). Domains such as these are important for the transcription activation function of other factors (Courey et al., Cell, 55:887-898, 1988; Courey et al., Cell, 59:827-836, 1989; Mermod et al., Cell, 58:741-753, 1989). The MEF2 sequence contains numerous potential phosphorylation sites, i.e., nine for casein kinase II ([S,T]XX[D,E]) and eight for protein kinase C ([S,T]X[R,K]), that could be important for post-translational regulation (Sorger et al., Cell, 54:855-864, 1988; Yamamoto et al., Nature 334, 494-498, 1988; Manak et al., Genes Dev. 4, 955-967, 1990; Boyle et al., Cell 64, 573-584, 1991).
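The two consensus patterns quoted above, [S,T]XX[D,E] for casein kinase II and [S,T]X[R,K] for protein kinase C, can be located in any protein sequence by a simple pattern scan. The Python sketch below is one way to do this, offered as an illustration rather than as the method used to obtain the counts reported here. The function name find_sites and the toy peptide are hypothetical; the toy peptide is not the MEF2 sequence.

```python
import re

# Consensus motifs as written in the text, with X standing for any residue.
# Each pattern is wrapped in a lookahead so that overlapping sites are all found.
CK2_SITE = re.compile(r"(?=([ST]..[DE]))")  # casein kinase II: [S,T]XX[D,E]
PKC_SITE = re.compile(r"(?=([ST].[RK]))")   # protein kinase C: [S,T]X[R,K]


def find_sites(protein, pattern):
    """Return (1-based position, matched motif) for each potential site."""
    seq = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    toy_peptide = "MSAEESKRTPDE"  # hypothetical example sequence
    print("CK II:", find_sites(toy_peptide, CK2_SITE))
    print("PKC:  ", find_sites(toy_peptide, PKC_SITE))
```

A match only marks a potential site; whether a given serine or threonine is actually phosphorylated cannot be inferred from the motif alone.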

Using a MEF2 cDNA subfragment (nt 342-728) encompassing the MADS domain as a probe, we also screened 1.25 x 10⁶ recombinants from an adult human cardiac ventricle λZAPII cDNA library at a range of hybridization stringencies. Sequencing of the 16 clones isolated revealed several isoforms, in addition to the original MEF2 from the skeletal muscle library, that apparently arise from alternatively spliced transcripts of the same MEF2 gene (Fig. 1C). One partial cDNA isoform (lower case in Fig. 1A) has an alternatively processed 5' untranslated sequence that excludes the segment from nt 56-262 (SEQ ID NO: 6). This deleted domain is an Alu repetitive element (Jelinek et al., Ann. Rev. Biochem. 51:813-844, 1982). This isoform also has an additional 80 nt of untranslated sequence at its 5' end. A second alternative splicing event results in the substitution of translated sequences: amino acids 87-132 (nt 673-810) in the original MEF2 isoform are replaced by a different peptide, shorter by two codons, in the alternative isoform named aMEF2. These alternative peptide sequences share limited homology, with 15 identical residues and 12 conservative substitutions out of 44 positions. Another cDNA clone was identified that differs entirely from MEF2 downstream from nt 672, i.e., at precisely the point of MEF2/aMEF2 divergence (see Figure 1C). In the divergent sequence of this clone, however, the translational reading frame terminates after just 12 nt and, as it begins with a possible 5' splice site (AG-GTAACA), it may be a retained intron (data not shown). While this cDNA could arise as an artifact from reverse transcription of incompletely spliced nuclear RNA, retained introns do occur in regulated alternative splicing in some systems (Breitbart et al., Ann. Rev. Biochem., 56:467-495, 1987).

MEF2 and aMEF2 are apparently isoforms of the same gene that also encodes the human SRF-related clones RSRFC4 and RSRFC9, respectively (Pollack and Treisman, Genes Dev. 5, 2327-2341, 1991). RSRFC4 and RSRFC9 correspond to the isolate without the 5' untranslated Alu sequence. However, nt 1279-1302 in Figure 1, encoding the amino acids SEEEELEL (SEQ ID NO: 20) (residues 289-296 in MEF2), are absent from RSRFC4/RSRFC9, presumably as a result of alternative RNA splicing. Also absent are two of the eleven glutamine codons (CAG) at nt 1672-1704, possibly also due to alternative splicing, most likely at adjacent splice acceptor sites (CAGCAGCAG). The RSRFC4/RSRFC9 sequence lacks a single A nucleotide among the three at nt 1892-1894, possibly a sequencing discrepancy, that produces a shifted reading frame with a different C-terminus eleven amino acids shorter than MEF2. Furthermore, the RSRFC4/RSRFC9 sequence does not possess the transcription enhancing activities of the MEF2 factors. Other minor differences in RSRFC4/RSRFC9, either allelic or sequencing discrepancies, include the absence of two GT repeats at nt 2084-2093, and two G→T transversions at nt 1767 and nt 2655, none of which affects the protein sequence.
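The nucleotide and amino acid coordinates given above can be cross-checked with simple codon arithmetic. The Python sketch below assumes only what the text states, namely that codon 87 of MEF2 begins at nt 673, and derives the first nucleotide of codon 1 from that anchor; the function names are hypothetical and the derived start position is an inference, not a figure quoted in this specification.

```python
def orf_start_from_anchor(nt, codon):
    """First nucleotide of codon 1, given that `codon` begins at position `nt`."""
    return nt - 3 * (codon - 1)


def codon_of(nt, orf_start):
    """1-based codon number containing nucleotide `nt`."""
    if nt < orf_start:
        raise ValueError("position lies upstream of the open reading frame")
    return (nt - orf_start) // 3 + 1


def codon_span(nt_first, nt_last, orf_start):
    """Codon numbers of the first and last nucleotides of an interval."""
    return codon_of(nt_first, orf_start), codon_of(nt_last, orf_start)


# Anchor stated in the text: the alternative exon begins at nt 673 (aa 87).
MEF2_ORF_START = orf_start_from_anchor(673, 87)

if __name__ == "__main__":
    print("alternative exon:", codon_span(673, 810, MEF2_ORF_START))
    print("SEEEELEL:        ", codon_span(1279, 1302, MEF2_ORF_START))
    print("CAG tract:       ", codon_span(1672, 1704, MEF2_ORF_START))
```

Under this assumption the intervals nt 673-810 and nt 1279-1302 map to residues 87-132 and 289-296, matching the residue numbers given in the text, and the eleven CAG codons at nt 1672-1704 fall at the start of the glutamine/proline-rich region.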

We also isolated clones corresponding to the MEF2 alternative isoforms from a canine heart cDNA library. These include the form with the apparent retained intron, lending credence to the hypothesis that this represents a bona fide splicing event. It is striking that the dog and human MEF2 nucleotide sequences, which are 93% conserved in translated regions, are also better than 90% conserved over the entire 5' (excluding the Alu repeat) and most of the 3' untranslated sequences. There are no long open reading frames in the untranslated regions in either species to suggest that they might actually be translated in unidentified alternatively processed isoforms. However, these highly conserved sequences may be important in regulating mRNA turnover or translation.

The cDNAs shown in Figure 1 all derive from the same MEF2 gene, as is clear from the absolute sequence identity outside the alternative regions and from the genomic structure. We also isolated the product of a second related gene, xMEF2, by low-stringency screening of the human cardiac library (Figure 2).

The nucleotide (1-1500) and predicted amino acid (1-365) sequences of the xMEF2 cDNA are shown in Fig. 2. The underlined region is highly conserved between xMEF2 and MEF2/aMEF2 (Fig. 1A), including the MADS domain underlined in bold. The remainder of the sequence is entirely divergent. The canonical polyadenylation signal is overlined. (Note that nt 1 is actually from the linker used in cloning.)

This 1.5 kb cDNA, xMEF2, has a 365 amino acid open reading frame following the methionine codon at nt 250. The predicted protein has a calculated molecular weight of 38.6 kD and an isoelectric point of 10.24. Residues 3-57 constitute a MADS domain identical to MEF2 at 50 of 55 positions (Figure 1B). The xMEF2 and MEF2 peptide sequences remain similar immediately downstream of this domain over another 29 residues, with just four conservative substitutions. The corresponding nucleotide sequences are 76% homologous over these regions. Beyond residue 86, MEF2 and xMEF2 have no substantial similarity. This point of divergence aligns precisely with the beginning of the MEF2/aMEF2 alternative peptides (see Figure 1A), consistent with it being an exon boundary. The remainder of xMEF2 is peculiarly proline-rich (22%) overall; however, it lacks a long glutamine/proline domain like that found in MEF2. There are three potential casein kinase II and seven potential protein kinase C phosphorylation sites. It should be noted that the methionine at position 1 in xMEF2 is actually the first methionine codon within an uninterrupted long open reading frame that extends to the 5' end of this cDNA, i.e., it is unknown whether a stop codon or, alternatively, the true initiation codon, might lie further upstream. Nevertheless, the xMEF2 peptide as depicted in Figure 2 aligns exactly with MEF2 and DEFA, both of which also have N-terminal MADS domains. In addition, the sequence around codon 1 in xMEF2 has a 6 of 7 match to the initiator consensus sequence, suggesting that this is a functional translation start site (Kozak, Cell 44, 283-292, 1986). The 3' end of xMEF2, in contrast to MEF2, terminates with a canonical polyadenylation signal and poly-A tail. xMEF2 is an alternatively spliced isoform of the gene that also encodes the SRF-related clone RSRFR2 (Pollack and Treisman, Genes Dev. 5, 2327-2341, 1991): at nt 33/34, it lacks 178 nt of 5' untranslated sequences containing the sole upstream in-frame stop codon present in RSRFR2. The protein coding sequences are identical.

MEF2-Related Transcripts Accumulate Preferentially in Muscle and Brain Tissues

To determine the tissue distribution of the cloned sequences, we probed blots of poly-A⁺ RNAs from a series of cell lines and human tissues with an MEF2 cDNA fragment (nt 342-728) containing the MADS sequences (Figure 3A and 3B) .

Fig. 3 shows northern blots in which poly-A⁺ RNAs from a variety of muscle and non-muscle cell lines (Fig. 3, panels A, C, E; Mb, myoblasts; Mt, myotubes; 28S and 18S ribosomal RNA positions shown) and adult human tissues (Fig. 3, panels B, D, F; RNA size markers indicated in kilobases, kb) were sequentially hybridized, stripped, and rehybridized at high stringency to a series of radiolabeled probes derived from the MEF2 cDNA, including: MADS Domain (Fig. 3, panels A, B; nt 342-728), 3'UT Sequence (panels C, D; nt 2158-2968), and Exon-Specific (panels E, F; nt 673-810). Another blot of adult mouse tissue poly-A⁺ RNAs was also probed with the Exon-Specific probe (panel G). The corresponding aMEF2 exon-specific probe gave identical results (not shown). Each blot has equivalent amounts of RNA per lane (see Materials and Methods).

MEF2 transcripts were found in all cells and tissues examined, but were more abundant in myotubes, skeletal muscle, heart, and brain. In all samples, the predominant species is ≈6.5 kb, with a minor band at ≈3.5 kb. The abundance of the longer transcript is increased relative to the shorter one in differentiated myotubes, as compared with myoblasts and non-muscle cells. Smaller bands were also detected in non-muscle cells. Because of the possibility that the conserved MADS sequence was cross-hybridizing with transcripts from related genes (see Figure 4), we probed the same blots with a second fragment (nt 2158-2968) (SEQ ID NO: 1) comprising only the MEF2 3' untranslated sequence (Figure 3C and D). This probe showed the same distribution of 6.5 and 3.5 kb transcripts (but not the smaller bands), confirming that these species are, in fact, products of this MEF2 gene. The hybridization of this human untranslated probe to rodent RNAs at high stringency again reflects the unusual interspecies conservation of these sequences, as noted above for the dog clones.

In order to investigate the possible tissue-restricted splicing of these transcripts, we generated exon-specific probes corresponding to the two alternative coding exons for MEF2 and aMEF2 (see Figure 1A) and hybridized them individually to the same mRNA blots, and to another blot with mouse tissue poly-A⁺ RNAs (Figure 3E, 3F, and 3G). Both exon-specific probes show that, while transcripts containing these exons are expressed ubiquitously at low levels, they are noticeably more abundant in myotubes, skeletal muscle, heart, and brain. This enrichment is even more pronounced than is seen using either MADS or 3' untranslated probes, indicating that there is tissue-specific regulation of MEF2 splicing, or perhaps mRNA stability. That both exons give the same result (data not shown) indicates that they are regulated in parallel, and that other transcripts of this gene detected only by the common region probes must lack these exons.

Similar northern blot analysis for xMEF2, using a probe from this cDNA (nt 1-502) (SEQ ID NO: 3) that contains its MADS sequence, demonstrated that expression of the xMEF2 gene is clearly tissue-specific (Figure 4). The same cell (panel A) and human tissue (panel C) RNA blots shown in Fig. 3, and another of rat heart poly-A⁺ RNA (panel B), were probed at high stringency with a radiolabeled fragment of the xMEF2 cDNA (nt 1-502). Again, transcripts are abundant in myotubes, skeletal muscle, heart, and brain. The major species in myotubes form a doublet at approximately 7.5 and 6.5 kb, with a less abundant transcript at about 3.5 kb. In the tissues, only the 7.5 kb and 3.5 kb bands are seen. These xMEF2 transcripts are present at a lower level in myoblasts (which generally include a small subpopulation of differentiated myocytes in culture) and are barely detectable in non-muscle, non-neural cells and tissues. Smaller species in HeLa and CV-1 are distinct from those seen with the corresponding MEF2 probe. It is noted that none of the cDNAs isolated, either for MEF2 or xMEF2, is as long as the transcripts for these genes in RNA blots.

Cloned and Endogenous MEF2 Have Identical DNA Binding Specificities

The presence of MEF2-related transcripts in multiple tissues contrasts sharply with the muscle specific MEF2 activity described previously (Gossett et al., Mol. Cell. Bio., 9:5022-5033, 1989; Cserjesi and Olson, Mol. Cell. Bio. 11, 4854-4862, 1991). We undertook experiments to compare the DNA binding specificity of cloned proteins with that of the endogenous muscle activity.

Electrophoretic mobility shift assay (EMSA) confirmed specific binding of the MCK MEF2 site in C2C12 myotube nuclear extract (Figure 5A; probe and competitor oligonucleotide sequences are shown in Table 1), as demonstrated by others (Gossett et al., Mol. Cell. Bio. 9:5022-5033, 1989).

In Fig. 5A, C2C12 myotube nuclear extract was assayed for binding to the radiolabeled MEF2, CArG, and MEF2 mutant probes (specified at bottom) in the absence (-) or presence (+) of a 100- or 250-fold molar excess of unlabeled competing oligonucleotide (specified at top), with sequences shown in Table 1. Bound probe (B) was separated from free probe (F) by EMSA and detected by autoradiography. Lanes 1 and 12 show probe without extract. In Fig. 5B, in vitro translated MEF2 protein from the cloned MEF2 cDNA was similarly assayed for DNA binding. Controls showing probe alone (P), bound in myotube nuclear extract (C2), and not bound in unprogrammed rabbit reticulocyte lysate (RL) are included for comparison (lanes 1-3). In Fig. 5C, in vitro translated proteins from the three corresponding cDNAs (indicated at top) were each assayed for binding to a series of known or potential MEF2 sites from muscle gene regulatory regions shown in Table 1. MCK MEF2 is the MEF2 site, and RRL is unprogrammed rabbit reticulocyte lysate. The EMSA autoradiograms are cropped to show only the bound probes (arrowheads). In Fig. 5D the DNA binding domain of MEF2 was identified using EMSA in which full length in vitro translated MEF2 and a series of C-terminal deletions (d1-d4) were tested for binding to the MEF2 probe. Truncated cDNA templates are diagrammed at bottom: boxes represent coding and lines untranslated (UT) sequences; restriction enzyme cleavage sites are marked for HindIII (H), ScaI (S), NdeI (N), and NheI (Nhe), producing the N-terminal peptide lengths indicated. The autoradiogram shows free probe (F) separated from that bound by MEF2 (B), d1 (B1), d2 (B2), and d3 (B3), while d4, cleaved immediately downstream from the MADS sequences, fails to bind. Unbound probe (P) and unprogrammed lysate (RL) controls are included.

The MEF2 site probe was bound (B) by an activity in this extract (lane 2). This interaction was competed by excess unlabeled probe (lane 3) but not by the mutated MEF2 site (lanes 4 and 5), confirming that the interaction is specific. The A/Temb site, a cis element in the embryonic myosin heavy chain (MHCemb) promoter important for its muscle specific activity (Bouvagnet et al., Mol. Cell. Bio., 7:4377-4389, 1987; Y.-T.Y. and B.N.-G., in preparation), was a less effective competitor (lanes 6 and 7). Unrelated A/T-rich sequences, including CArG, which is a target for another MADS protein, SRF (Boxer et al., Mol. Cell. Bio. 9:515-522, 1989), and the OTF-2 site (Scheidereit et al., Cell, 51:783-793, 1987), did not compete at all for MEF2 binding (lanes 8-11); nor did the MEF2 oligonucleotide compete for CArG binding in complementary experiments (lanes 13-15), consistent with previous reports (Gossett et al., Mol. Cell. Bio., 9:5022-5033, 1989). Further, the extract bound MEF2 mutant site mt4, but not mt6, distinguishing between ubiquitous and muscle specific binding (lanes 16-18), as shown previously (Cserjesi and Olson, Mol. Cell. Bio. 11, 4854-4862, 1991). These data confirm that the MEF2 site is specifically bound by a myotube nuclear factor distinct from known ubiquitous binding activities.

Cloned MEF2 exhibited the same DNA binding specificity as the endogenous myotube activity in similar EMSAs using cDNA-encoded in vitro translated MEF2 (Fig. 5B). The mobility of the complex formed by this protein with the MEF2 probe was identical to that in the myotube extract (compare lanes 4 and 2). Competition for this binding by the same series of oligonucleotides used in Fig. 5A showed that the relative affinity of the cloned MEF2 for these sites exactly recapitulates that of the endogenous activity (lanes 5-10). In vitro translated MEF2 bound mt4, but not mt6, again identical to the endogenous muscle specific binding activity (lanes 11 onward). The same binding specificity was also reproduced by in vitro translated aMEF2, the alternative isoform (data not shown). Thus, these cloned factors have a DNA binding specificity indistinguishable from that of endogenous muscle MEF2.

MEF2 and aMEF2, but Not xMEF2, Bind Multiple Cardiac and Skeletal Muscle Gene Promoter Elements in vitro

The promoters or enhancers of many muscle-specific genes contain essential A/T rich elements that conform fully or partially to the MEF2 site consensus (Cserjesi et al., Mol Cell Bio, 11:4854-4862, 1991). We used oligonucleotide probes corresponding to these sequences (see Table 1) in EMSAs to determine the relative affinities of the in vitro translated MEF2-related isoforms (Fig. 5C). Both MEF2 and aMEF2 bound all of the known or potential MEF2 sites tested, including, in decreasing order of affinity: the cardiac myosin light chain 2 promoter HF-1 element; the original MCK enhancer MEF2 site; a second A/T rich element in the MCK enhancer; and A/T rich sequences from the promoters for cardiac troponin T, cardiac α-myosin heavy chain (two distinct sites), and MHCemb. Remarkably, the myosin light chain 2 HF-1 and α-myosin heavy chain A/T-1 sites have identical core sequences (TTAAAAATAA) (SEQ ID NO: 33); however, the former was bound avidly while the latter was bound poorly, implicating the flanking sequences in site specification.

In every instance, aMEF2 bound several fold more effectively than MEF2; thus, the alternative peptides, which lie well outside the shared MADS domain, must modulate the binding properties of these proteins. xMEF2, with a nearly identical MADS domain, bound none of these sequences in vitro, due either to the few amino acid substitutions in the N-terminal region or to the completely divergent C-terminus. Its capacity to activate transcription via these MEF2 sites, however, indicates that it may well bind in vivo (see below).

The MADS Homology Region Alone Is Not Sufficient for DNA Binding

Earlier studies with SRF and MCM1 have shown that the DNA binding function of each factor resides in a domain that includes the MADS homology (Norman, C., et al., Cell, 55:989-1003, 1988; Christ, C., et al., Genes Dev., 5:751-763, 1991). To ascertain whether the same might be true for MEF2, we constructed a set of progressive C-terminal deletions of cloned MEF2 and assayed DNA binding by gel shift (Fig. 5D). Truncated in vitro translation products containing 322, 201, and 104 N-terminal residues retained the capacity to bind DNA. Further deletion to 58 amino acids, at the boundary of the conserved MADS domain, eliminated binding. Therefore, as in other proteins in this family, the DNA binding function of MEF2 includes the MADS homology, but as many as 46 additional residues C-terminal to it are also required. Indeed, as noted above, differences in this region are responsible for the different DNA binding affinities of MEF2 and aMEF2 (see Fig. 5C).

Skeletal as Well as Cardiac and Smooth Muscle Specific DNA Binding Activity Is Due to MEF2/aMEF2

The DNA binding specificity of cloned MEF2 and aMEF2, which faithfully reproduces that of endogenous muscle MEF2 activity, stands in contrast to the ubiquitous distribution of MEF2-related transcripts. In order to investigate the tissue specificity of the MEF2 protein isoforms, we compared nuclear extracts from a variety of cell types in EMSAs with the MCK MEF2 probe (Fig. 6A).

In Fig. 6A nuclear extracts from C2C12 (C2) and Sol8 myoblasts (mb) and myotubes (mt), rat primary cardiocytes (Card), rat pulmonary artery smooth muscle cells, C3H10T1/2 fibroblasts (10T1/2), HeLa cells, and NIH3T3 cells untransfected (3T3) or transiently transfected with MyoD (3T3+MyoD) were used in EMSAs in which free MEF2 probe (F) was separated from specifically bound probe (B), or from the nonmuscle complex (H) which migrated more slowly (the lower band in HeLa is a nonspecific artifact). In Fig. 6B antisera raised against cloned MEF2 isoforms demonstrated that these proteins are responsible for the muscle specific MEF2 binding activity shown by EMSA. Immune sera included Anti-MEF2, specific for MEF2 and aMEF2, and Anti-aMEF2, specific for aMEF2. Controls included the corresponding preimmune sera (Pre-MEF2, Pre-aMEF2) or unrelated antisera (Rabbit S, Anti-100kd). Extracts specified in Fig. 6A were also used here, in addition to those of COS cells and rat liver tissue.

The differentiation of skeletal myoblasts to myotubes in both C2C12 (lanes 2 and 3) and Sol8 (lanes 5 and 6) cells was accompanied by a marked increase in binding (B). Similarly, NIH3T3 fibroblasts normally devoid of this activity developed MEF2 binding upon transient transfection with MyoD (lanes 8 and 9). These data are consistent with previous work documenting the induction of this activity during myogenesis and in response to myogenin (Gossett et al., Mol Cell Bio, 9:5022-5033, 1989; Cserjesi et al., Mol Cell Bio, 11:4854-4862, 1991). It is striking that smooth muscle cells and primary cardiocytes, which lack known myogenic bHLH factors, also contained specific MEF2 binding activity (lanes 4 and 7). In contrast, cells outside these muscle lineages showed only a slower-migrating complex (H) distinct from the muscle specific complex (C3H10T1/2 fibroblasts and HeLa, lanes 10 and 11; see also below).

Tissue specific expression of MEF2 isoforms was further demonstrated using antisera to define the proteins in MEF2 DNA binding complexes (Fig. 6B). Anti-MEF2 recognizes both MEF2 and aMEF2, while anti-aMEF2 is specific for aMEF2 (see Methods). Both antibodies produced a "supershift" of bound probe, confirming the presence of these factors in C2C12 myotube (lanes 2-8), cardiocyte (lanes 16-18), and smooth muscle cell (not shown) extracts, while preimmune and unrelated controls had no effect. In contrast, the slower-migrating H complex lacks these MEF2 proteins and was not supershifted in HeLa (lanes 9-13), COS (lanes 14 and 15), and liver (lanes 19-21) extracts, nor in C3H10T1/2 or CACO (colon carcinoma) cells (data not shown), confirming that ubiquitous binding of the probe is not due to the cloned factors. A fraction of the H complex from liver extract seems to be supershifted; whether a small amount of MEF2 is expressed in liver tissue or possibly arises from vascular smooth muscle in the organ remains to be determined.

Thus, MEF2 DNA binding activity is found in skeletal, cardiac, and smooth muscle lineages. Note that vascular smooth muscle could account for MEF2-related transcripts in non-muscle tissues, but not in cultured cells. The presence of MEF2 RNAs in cells and tissues outside these lineages indicates that post-transcriptional mechanisms are required to produce absolute tissue specificity of MEF2 DNA binding. Some of this regulation may come from preferential splicing of the MEF2- and aMEF2-specific alternative exons (see Fig. 3), but translational or post-translational mechanisms are likely to operate as well. The antibody supershifts demonstrate unambiguously that tissue specific MEF2 DNA binding activity is directly attributable to the cloned MEF2 gene products. It is particularly interesting here that anti-aMEF2, which is specific for only one (aMEF2) of the alternative isoforms, supershifted virtually all of the bound probe in these assays. Either these complexes comprise aMEF2 alone, or they are MEF2:aMEF2 heterodimers that are shifted intact by this antibody.

The Cloned Factors Are MEF2 Site-Dependent Transcriptional Activators

A considerable body of data shows that the MEF2 site is critical for tissue specific transcription conferred by muscle gene promoters and enhancers (Gossett et al., Mol Cell Bio, 9:5022-5033, 1989; Zhu et al., Mol Cell Bio, 11:2273-2281, 1991; Wentworth et al., PNAS, 88:1242-1246, 1991). Our results to this point correlate tissue specific binding at the MEF2 site with the cloned gene products. To determine whether these proteins are functional transcription factors, we examined their capacity to trans-activate promoters containing MEF2 sites.

MEF2 cDNAs, subcloned into the pMT2 eukaryotic expression vector, were cotransfected with various reporter plasmids in nonmuscle cells. As diagrammed in Fig. 7A, the reporter constructs comprise the bacterial chloramphenicol acetyl transferase (CAT) gene linked to the basal MHCemb promoter (pE102-CAT; Bouvagnet et al., Mol Cell Bio, 7:4377-4389, 1987; Yu et al., Mol Cell Bio, 9:1839-1849, 1989), or the HSV thymidine kinase promoter (p8TK-CAT; McKnight et al., Science, 217:316-324, 1982). Each promoter was tested with or without two copies of intact or mutated MEF2 binding sites (M). Parallel experiments in HeLa (Fig. 7A), CV-1, NIH3T3, and C3H10T1/2 (not shown) cells all gave similar results. The MEF2 expression vector pMT2-MEF2 produced marked transcriptional activation of reporters containing the MCK MEF2 binding site (p8TKCAT-MEF2x2, pE102CAT-MEF2x2) or the related A/Temb site from the MHCemb promoter (pE102CAT-ATembx2). Control experiments with reporter constructs containing MEF2 site mutants (p8TKCAT-MEF2mtx2, pE102CAT-MEF2mtx2) or no MEF2 binding sites (p8TKCAT, pE102CAT) showed that trans-activation by MEF2 depends absolutely on the presence of intact binding sites. The enhanced responsiveness of the MHCemb promoter over the thymidine kinase promoter (26 fold versus 5-6 fold) suggests that MEF2 may interact synergistically with other transcription factors that bind the MHCemb promoter. The pE175CAT reporter containing the native MHCemb promoter, including the single endogenous A/Temb site (-162 to -150), was also activated by MEF2, albeit at a lower level.
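Fold activation figures of the kind quoted above are ratios of reporter activity with and without the expression plasmid. The Python sketch below shows one conventional way to compute such a ratio, assuming that each CAT reading is first divided by the β-galactosidase reading from the cotransfected internal control; that normalization step is an assumption made for illustration and is not spelled out in this section. The function names and all numerical readings are hypothetical and are not data from these experiments.

```python
def normalized_cat(cat_activity, bgal_activity):
    """CAT activity corrected for transfection efficiency (assumed convention)."""
    if bgal_activity <= 0:
        raise ValueError("internal control activity must be positive")
    return cat_activity / bgal_activity


def fold_activation(with_factor, vector_only):
    """Ratio of normalized reporter activity: expression plasmid versus vector.

    Each argument is a (CAT activity, beta-galactosidase activity) pair.
    """
    return normalized_cat(*with_factor) / normalized_cat(*vector_only)


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units, chosen only to show the arithmetic.
    reporter_plus_pmt2_mef2 = (39.0, 1.5)
    reporter_plus_pmt2 = (1.2, 1.2)
    ratio = fold_activation(reporter_plus_pmt2_mef2, reporter_plus_pmt2)
    print(f"{ratio:.1f}-fold")
```

A reporter lacking intact binding sites would be expected to give a ratio near 1 under this scheme, which is how absolute dependence on the binding site would appear numerically.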

Fig. 7A: The various chloramphenicol acetyl-transferase (CAT) reporter genes, with and without duplicated wild type or mutated MEF2 binding sites (M), are diagrammed here and described in detail in the text. The coordinates of the MHCemb (pE102-CAT) and thymidine kinase (p8TK-CAT) promoters are indicated. The pE175CAT reporter, not diagrammed, is described in the text. HeLa cells were cotransfected individually with these constructs and either the MEF2 cDNA expression plasmid (pMT2-MEF2) or vector control (pMT2), and the results are displayed graphically. Fig. 7B: The same cotransfection experiments were conducted in C2C12 myoblasts and myotubes, rat primary cardiocytes, and rat pulmonary smooth muscle cells.

The lower activation of pE175CAT, with its single A/Temb site, is consistent with previous results showing that duplicated MEF2 binding sites are more effective than a single site (Gossett et al., Mol Cell Bio 9:5022-5033, 1989). Together, these results document that the cloned MEF2 proteins by themselves are sufficient to produce both specific DNA binding and trans-activation in nonmuscle cells. Therefore, the cloned sequences encode the endogenous factors responsible for MEF2 activity in vivo.

When we performed analogous experiments with aMEF2 and xMEF2 expression constructs, transcription activation by aMEF2 was consistently as good as or better than that conferred by MEF2, correlating with the relative binding affinities of the two isoforms (see above, Fig. 5C). xMEF2, which gave no detectable DNA binding in vitro, also conferred lower but reproducible trans-activation in these cotransfection experiments. We infer either that xMEF2 binds DNA in vivo as a heteromeric complex with other unidentified MEF2-related isoforms or unrelated factors, or, less likely, that it potentiates other transcription factors without contacting the DNA itself. Alternatively, the discrepancy between xMEF2 in vitro binding and in vivo trans-activation may be due to the difference between the single copy MEF2 site in the binding probe and the duplicated copies in the reporter genes.

Skeletal, Cardiac, and Smooth Muscle Cells Contain Saturating Levels of Endogenous MEF2 Trans-Activating Factors

The presence of trans-activating MEF2 activity in skeletal myotubes is well established (Gossett et al., Mol Cell Bio 9:5022-5033, 1989). Here we have found that specific MEF2 DNA binding activity is present not only in skeletal muscle, but also in cardiac and smooth muscle cells, raising the question as to whether all three muscle lineages express endogenous MEF2 transcriptional activity. To investigate this, we performed a series of cotransfection experiments in all three muscle cell types (Fig. 7B). Again, the pE102CAT reporter without binding sites was inactive. Undifferentiated C2C12 myoblasts behaved much as nonmuscle cells (above, Fig. 7A) in that transcription of pE102CAT-MEF2x2 was increased significantly when cotransfected with pMT2-MEF2. In fused myotubes, however, this reporter construct was already fully active without cotransfection, and there was no appreciable further stimulation when pMT2-MEF2 was added. As shown, the results in primary cardiocytes and pulmonary arterial smooth muscle cells were the same as those in skeletal myotubes, i.e., these cell types, in contrast to nonmuscle cells, already contain saturating levels of MEF2 activity, presumably from endogenous MEF2 itself and/or from its related isoforms.

There is, therefore, an exact correlation between the tissue specific MEF2 DNA binding activity demonstrated in skeletal, cardiac, and smooth muscle (see Fig. 6), and functional trans-activation in these same cell types. Particularly striking is the presence of MEF2 activity in all three muscle lineages. Despite certain phenotypic similarities between smooth and striated muscle tissues, common mechanisms for specific gene regulation in these tissue types have not been previously described.

MEF2 Is Induced by MyoD but, Alone, Is Not Myogenic

It is clear from these data that the appearance of MEF2 activity is correlated with muscle differentiation. Given the well-known capacity of MyoD to produce myogenic conversion, it was of interest to investigate the potential interrelationship between MEF2 and MyoD in this process. MyoD, as well as myogenin, induces MEF2 DNA binding activity in transfected fibroblasts (see Fig. 6A; Cserjesi et al., Mol. Cell. Bio. 11:4854-4862, 1991). We tested whether this coincided with the development of MEF2 trans-activation and, further, whether MEF2 alone is sufficient to generate the muscle phenotype.

MEF2 trans-activation was induced by MyoD in transiently transfected NIH3T3 cells (Fig. 8). In Fig. 8, NIH3T3 fibroblasts were transiently cotransfected with a MyoD cDNA expression plasmid and the pE102CAT reporter, with or without MEF2 binding sites (see Fig. 7), and assayed for CAT activity following incubation in either low (5% heat-inactivated equine) or high (10% fetal bovine) serum conditions.

Indeed, pE102CAT-MEF2x2 was transcribed at a high level in these cells. This activity was independent of serum concentration in these cultures, indicating that the fully differentiated muscle phenotype associated with serum withdrawal is not required for MEF2 activity in the presence of exogenous MyoD. However, transfected MyoD alone was not sufficient to produce MEF2 activity in HeLa cells (data not shown), which are resistant to myogenic conversion (Weintraub, et al., PNAS, 86:5434-5438, 1989). These results seem to contrast with similar experiments, using myogenin instead of MyoD, where MEF2 DNA binding activity in transfected C3H10T1/2 cells required serum withdrawal, and where this same activity was induced in transfected CV-1 cells, which are also resistant to myogenic conversion (Cserjesi, et al., Mol. Cell. Bio. 11:4854-4862, 1991). Cell type differences may be responsible for these apparent discrepancies.

Finding that MEF2 activity was induced by MyoD, we sought to determine whether ectopic expression of MEF2 alone might induce the muscle program in otherwise nonmyogenic cells. In both transient and stable transfection of C3H10T1/2 fibroblasts, however, MEF2 failed to induce the muscle phenotype as characterized by myotube formation and striated myosin heavy chain expression (data not shown).

These results define a hierarchy for myogenesis in which MEF2 lies downstream of the muscle specific bHLH factors. MEF2 is induced by MyoD but is not, by itself, myogenic. It is clear, therefore, that MEF2 is not the sole proximate effector of myogenic conversion by MyoD. Other muscle specific factors must be induced in parallel. Furthermore, the presence of MEF2 activity in cardiac and smooth muscle, in which MyoD and its cognates have not been detected, must be taken as evidence for the existence of alternate pathways for MEF2 induction.

Isolation and Characterization of Other MEF2 Family Members

Genomic Southern blotting with a probe from the MEF2 DNA binding domain indicated the existence of several genes containing homology to the probe. These observations led us to postulate that a family of transcription factors containing this conserved domain may be present in muscle in an analogous manner to the MyoD family, and that this protein family may be important for muscle gene regulation based on the functional presence of the MEF2 binding site in many muscle specific genes.

We have screened a skeletal muscle cDNA library at low stringency with a conserved DNA binding domain probe from the first MEF2 related gene isolated in our laboratory, with the purpose of identifying additional members of the putative MEF2 family of transcription factors. We report the isolation and characterization of cDNA's encoding a new MEF2 related factor which is homologous to the initial MEF2 gene but is derived from a separate gene. The products of this gene, termed dMEF2, activate transcription. The methods used to isolate dMEF2 can similarly be used to isolate other MEF2 family members.

dMEF2 has a similar binding specificity to the previously isolated MEF2 related factors. Immunofluorescence studies indicate that dMEF2 is developmentally up-regulated in the myoblast to myotube transition and is also present in a subset of neuronal cell nuclei. There is strict tissue specific transcriptional regulation of this gene, in comparison to the more ubiquitous expression of the other MEF2 related factors.

cDNA library screening was performed as described above. Screening of the λgt10 library was performed with random primed ³²P labeled cDNA (380 bp NsiI-DdeI fragment) from MEF2 that had a specific activity of 1x10⁹ cpm/μg of DNA.

Plasmids and transfections

For in vitro transcription, translation and sequencing, the cloned dMEF2 cDNA's were subcloned into pGEM vectors (Promega Corp., Madison, WI). For in vivo expression, the cDNA's were subcloned into the pMT2 vector. The MHC emb CAT reporter construct consisted of 2 copies of the MCK MEF2 sites inserted in a concatemerised orientation at the -102 position of the MHC emb promoter in plasmid pE102 CAT, as described above. The oligonucleotide binding sites were also cloned into the HindIII site of pδTKCAT (Thompson et al. 1991. J. Bio. Chem. 266:22678-22688) and at -109 of the HSV TK promoter. Transient transfection assays were carried out as previously described. Briefly, HeLa cells were grown to ~60% confluence, and transfected with the various DNA expression constructs by calcium-phosphate coprecipitation. The cells were glycerol shocked 18 h later. After 24 hrs, the media was switched to low serum media (DME/5% heat inactivated horse serum), and cells were harvested 48 hrs later. Each plate of cells (~5x10⁶ cells) was transfected with the following DNA's: 5 μg of the appropriate CAT reporter construct, 5 μg of the pMT2-dMEF2 construct or the pMT2 vector alone, and 3 μg of the pSV β-gal, which served as an internal control for the transfection efficiency. For the COS cell transfections, 20 μg of the expression construct was used. Cell extracts were prepared and CAT activity was determined by previously published procedures.

In vitro Transcription and Translation

For in vitro translation, MEF2-pGEM7 zF(+) constructs were linearized with either BamHI or PstI for the full length and truncated translation products, respectively. The resulting RNA was translated in vitro using a rabbit reticulocyte lysate according to the manufacturer's suggested conditions (Promega). The in vitro translation products were analyzed by the incorporation of [³⁵S] methionine, and a 3 μl aliquot was electrophoresed on a 12% SDS-polyacrylamide gel. After the proteins were resolved, the gel was exposed to Enlightning (DuPont) for 30 mins, dried, and autoradiographed.

Cloning of dMEF2

A human adult skeletal muscle cDNA library, constructed in the phage lambda gt10, was screened by low stringency hybridization with a DNA probe which contains the MEF2 DNA-binding domain. Of the 1.5x10⁶ phage screened, 67 positives were isolated, and three of these, which contained overlapping cDNAs with substantial homology to the DNA binding domain of MEF2, were chosen for further analysis. The open reading frame encoded by these cDNA's is highly conserved in the DNA binding domain (~74% identity at the nucleotide level, 99% at the amino acid level) when compared to the other MEF2 factors, but diverges outside of this conserved domain.

The complete sequence of the longest cDNA insert (1.9 kb), designated as dMEF2, has one single continuous open reading frame, as shown in Fig. 15. The sequence contains an in frame methionine with upstream stop codons which fits the consensus as a strong initiation site. The dMEF2 cDNAs encode a 465 amino acid polypeptide (isoelectric point ~8.69), with a predicted Mr of 50.3 kd. Amino acid alignment of the predicted amino acid sequences of dMEF2 and MEF2 reveals an overall identity of 66% (Fig. 17), although the conservation at the N-terminus is much greater (83 of 84 residues). In the region where MEF2 and dMEF2 are strongly conserved there exists a striking homology to a number of protein factors that belong to the MADS protein family. dMEF2 contains an 84 amino acid amino (N)-terminus which is highly conserved with the other MEF2 related factors isolated thus far (Fig. 18). The amino-terminal part of this structural motif (aa 3-60) contains the MADS box homology in common with the other MADS factors (Fig. 18). The carboxy (C) terminal end (aa 60-86) of this domain diverges from the other MADS factors but is highly conserved in the MEF2 family (Fig. 18), conferring a binding specificity which is sequence specific but distinct from the other MADS box proteins.
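The identity figures quoted above (66% overall, 83 of 84 residues at the N-terminus) are ratios of matching positions in a pairwise alignment. A minimal sketch of that arithmetic follows, assuming two pre-aligned sequences of equal length with "-" marking gaps; the function name and the gap-handling convention (gapped columns are skipped) are assumptions for illustration, and the inputs are placeholders rather than the sequences of Fig. 17.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both residues are identical.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    Other conventions (e.g. counting gaps as mismatches) give lower values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

With 83 matches over 84 compared positions, the function returns roughly 98.8%, which is the arithmetic behind the near-complete conservation described for the N-terminal domain.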

After residue 86, dMEF2 and MEF2 diverge considerably (Fig. 17). This diversity after residue 86 corresponds with the divergence between MEF2 and aMEF2 and the existence of an exon boundary at this point. In addition, dMEF2 lacks the glutamine/proline rich region which exists in the C-terminus of MEF2, a region which is a known motif in some transcription factors. Two of the dMEF2 cDNA's are identical except that a 96 nt segment (nucleotides 1737-1833) is absent from one of them; this represents a bona fide splicing variant (Fig. 16).

A prediction of the amino acid secondary structure of the dMEF2 molecule reveals that the binding domain contains a short alpha-helical region (amino acids (aa) 1-6) followed by a turn and an extended alpha helix (aa 20-48). The N terminal part of this helix (aa 20-33) is highly hydrophilic and has a high surface probability, indicating that it may be involved in dimerization and/or binding to DNA. This region is predicted to be an amphipathic alpha helix in which the hydrophobic residues are clustered on one side of the helix, a molecular arrangement which stabilizes a coiled-coil structure (Fig. 18). In several other proteins it has been shown that coiled motifs of this nature are important for dimer formation and transcriptional activation (Johnson et al. 1989. Annu. Rev. Biochem. 58:799-839; Rasmussen et al. 1991. Proc. Natl. Acad. Sci. USA 88:561-564). The C-terminal part of the second helix (aa 34-48) is very hydrophobic, indicating that it is probably oriented to the interior facing region of the alpha helix. There is an additional alpha helix within the MEF2 specific region of the binding site from aa 60-69 which is also predicted to be an amphipathic alpha helix (Fig. 18). It is possible that this region is responsible for the binding specificity which distinguishes the MEF2 related factors from the other MADS proteins. There are potential glycosylation sites at aa 49 and 283.

DNA Binding Site Specificity

In order to investigate if the dMEF2 protein binds to the MEF2 DNA binding site, protein-DNA interactions were assessed using electrophoretic mobility shift assays. The binding of in vitro translated dMEF2 to a double-stranded (ds) oligodeoxyribonucleotide, comprised of the previously characterized MCK enhancer MEF2 site (Cserjesi et al. 1991. Mol. Cell. Bio. 11:4854-4862), was tested.
The probe used in the electrophoretic mobility shift assay was a 27 bp double stranded, single core recognition motif for the MEF2 site labelled by phosphorylation using T4 polynucleotide kinase and gamma-³²P-labeled MEF2 site ds oligonucleotides and the resulting protein-DNA complex was resolved by gel electrophoresis followed by autoradiography. The specificity of the protein-DNA complex observed between dMEF2 and the labelled MCK MEF2 site was determined by using various unlabelled synthetic oligonucleotides as competitors. The results of these experiments are consistent with the known specificity of the consensus binding site [CTA(AT)₄TAG], in that mutant 4, which has a single base change at one of the variant positions in the consensus, does bind and effectively compete the specific complex. Conversely, mutants 1 and 6, which have mutations in the invariant region of the binding site, do not effectively compete, indicating that they are not bound by dMEF2 with appreciable affinity. As expected, the CArG box binding site, which is a high affinity binding site for the MADS protein SRF, does not compete for binding.
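The consensus [CTA(AT)₄TAG] can be read as a simple sequence pattern: invariant CTA and TAG flanks around four positions that may each be A or T. The sketch below scans a DNA string for that pattern, assuming this reading of "(AT)₄"; the function name and test sequences are hypothetical. It is an exact pattern match only and says nothing about binding affinity, which the text determines experimentally by competition.

```python
import re

# Lookahead so that overlapping candidate sites are also reported.
# Assumption: "(AT)4" means four consecutive positions, each A or T.
_MEF2_PATTERN = re.compile(r"(?=(CTA[AT]{4}TAG))")


def find_mef2_sites(sequence: str):
    """Return (0-based start, matched site) for each consensus match on the given strand."""
    return [(m.start(), m.group(1)) for m in _MEF2_PATTERN.finditer(sequence.upper())]
```

A change at one of the four A/T positions to the other allowed base still matches, mirroring the behaviour described for mutant 4, whereas a change in the CTA or TAG flank does not, mirroring mutants 1 and 6.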

We tested if the presence or absence of the peptide encoded by the 96 nucleotide alternate region in the cDNA's would influence the DNA binding affinity. However, there was no observable difference between the DNA binding of in vitro translated proteins either with or without this region. In addition, a truncated version of the protein (amino terminal 178 aa, truncated at the PstI site) retained its DNA binding capacity. The data indicate that both the long and truncated forms bind DNA with similar affinity. When the long and truncated forms were co-translated we were not able to observe an intermediary complex which would indicate homodimerization. Taken together, these observations demonstrate a similar in vitro DNA binding specificity that is shared by dMEF2 and the other MEF2 related factors so far isolated.

Transcriptional activation by dMEF2

To determine if dMEF2 could function as an activator of transcription, the dMEF2 cDNA's were subcloned into a eucaryotic expression vector (pMT2). The dMEF2 containing expression constructs were co-transfected with various reporter constructs containing a heterologous promoter site and two concatenated copies of the MEF2 high affinity binding site. All transfections were carried out in HeLa cells. The reporter constructs used are comprised of the bacterial chloramphenicol acetyl transferase (CAT) gene fused to either: 1) the basal MHC emb promoter (pE102 CAT); 2) the HSV thymidine kinase promoter (TK-CAT); or 3) the SV40 major late promoter (A10-CAT). In control transfections, reporter constructs without the MEF2 binding sites were not transactivated by the expression constructs, indicating that transactivation of the reporter constructs was dependent on the presence of the intact MEF2 binding sites. An interesting result from these experiments is that the most potent transactivation of the reporter constructs was observed with the muscle specific promoter when compared to the two non-muscle specific promoter elements tested. Thus, the cellular context of the promoter element may be important for transactivation by dMEF2.

Deletion of the Carboxy-Terminal Third of dMEF2

Using the transcriptional activation assay described above, we assayed the effect of the presence or absence of the carboxy-terminal third of MEF2. We found that the presence of a C-terminal portion constituting about one-third of the molecule substantially reduces or inactivates transcriptional enhancement.

Method of Screening for Molecules that Enhance the Activity of a MEF2 Protein

To test for molecules that enhance the activity of the MEF2 proteins we are using in vitro and in vivo assays. The in vitro assay is a modification of the DNA binding assay (retardation gels) described above. The different isoforms of MEF2 produced by expression in bacteria, animal cells, or by in vitro translation are diluted in a progressive fashion until the amount of protein present in the assay is insufficient, on its own, to generate a retardation of the DNA probe added to the assay. This DNA probe contains the MEF2 DNA consensus binding site, as described above. The different molecules, including other proteins, cell extracts and different types of bacterial or fungal broths, are then added to the assay and tested for the appearance of a MEF2 retardation complex. This assay has proven successful in identifying a homeobox-containing protein (mHOX) as an enhancer of MEF2 activity.

An in vivo assay follows the same principle. Limiting amounts of a mammalian expression plasmid driving MEF2 cDNA sequences corresponding to the different isoforms are transfected into a variety of host cells that do not endogenously have MEF2 activity. In practice we have used HeLa cells and fibroblast cell lines. The plasmid is used at a concentration that in itself is insufficient to activate a reporter construct that drives a marker enzyme, such as CAT (chloramphenicol acetyl transferase), β-galactosidase, luciferase or any other marker, whose expression is dependent on an intact MEF2 DNA binding site. This plasmid is cotransfected together with the test expression plasmids. The enhancement in the expression of the reporter plasmid is an indication of the enhancing effect of mHOX. The same assay will be used to monitor the effect of cell extracts, broths, etc. on the cells that contain the MEF2 expression plasmid together with the MEF2 reporter constructs.

Use

A MEF2 transcription factor can be used to produce transgenic animals with increased muscle cell mass, to prevent or counteract muscle atrophy in humans or animals suffering a pathological muscular condition, or to develop pharmacological agents that regulate the expression of muscle-specific genes.

Biological Activity Assay for MEF2 Transcription Enhancement

Transgenic Animals

The transgenic animals being prepared are those that gain muscular function by overexpressing the MEF2 isoforms. The transgenic animals are prepared by pronuclei injection using standard protocols as described by Hogan, B., Constantini, F., and Lacy, E. (1986) Manipulating the Mouse Embryo: A Lab. Manual (CSHL, CSH, NY). These protocols, with the necessary modifications, will be used to produce transgenic animals of commercially and/or scientifically useful species.

The transgenic animals are being made using complete coding sequences. As the regions important for function are identified, modified molecules will be used that produce an enhanced level of activity. The expression of the MEF2 sequences can be targeted to different tissues and stages of development through the use of tissue- and developmental-specific promoters. The embryonic heavy chain promoter can target these sequences to the early developmental stages up to the perinatal age, and the β myosin heavy chain promoter can target the expression of the gene to the slow muscle fiber and the cardiac tissue. These promoters have been isolated and characterized (Strehler, E.E., et al. 1986. J. Mol. Biol. 190:291-317; Bouvagnet, P.F., et al., 1987. Mol. Cell. Biol. 7:4377-4389).

The certainty that the transgene will be expressed in the transgenic mammal is illustrated by the ability of a MEF2 construct to be expressed in a whole animal. This has been demonstrated by direct injection of the DNA constructs into skeletal and cardiac muscle of intact dogs using a modification of the direct DNA injection described by Wolff (1991 Biotechniques 11:474-485). Expression is also illustrated by the direct intramuscular injection experiments described above. Using this methodology we have shown that it is possible to produce high level expression in cardiac and skeletal muscle of the injected MEF2. This expression lasts for at least 30 days after injection.

Therapy

Administration of a Therapeutic Composition by Intramuscular Injection

The regulated expression of MEF2 genes in vivo was investigated by injecting the gene into the heart of a large mammal in situ. In so doing, a methodology suitable for expressing MEF2 genes in large mammals was developed. The method involves injection of plasmid DNA into canine myocardium.

Methods

MSV-CAT was created by fusing the coding sequence of the chloramphenicol acetyl transferase (CAT) gene (Gorman et al. 1982. Mol. Cell. Biol. 2:1044-1051) to the long terminal repeat of the mouse sarcoma virus (MSV). RSV-Luciferase was described previously (DeWet et al. 1987. Mol. Cell. Biol. 7:725-737). The series of deletions of the 5' flanking region of the β-MHC included the -3300rβ-MHC-CAT, -667rβ-MHC-CAT, -354rβ-MHC-CAT and -215rβ-MHC constructs, which are genomic fragments of the rat β-MHC gene from -3300 base pairs (b.p.), -667 b.p., -354 b.p., and -215 b.p. to +38 b.p. relative to the transcriptional start site, cloned in front of the CAT gene (Thompson et al. 1991. J. Biol. Chem. 266(33):22678-22688). -607rα-MHC-CAT contains position -607 to +32 of the rat α-myosin heavy chain promoter sequence linked to the CAT gene (Widom et al. 1991. Mol. Cell. Biol. 11:677-687).

Adult mongrel dogs of either sex weighing between 20 and 26 kg were used for these experiments. Dogs were premedicated with xylazine (10 mg/kg i.m.) and general anesthesia was induced with thiamylal (10-20 mg/kg i.v.) and maintained with halothane (0.5-1.5 vol.%). Observing sterile technique, the pericardium was opened. Circular plasmid DNA resuspended in 20% sucrose and 1x phosphate buffered saline was injected through a 30 ga needle inserted perpendicular to the epicardium. CAT-assays were performed as previously described (Seed et al. 1988. Gene 67:271-277). Luciferase-assays were also performed as described elsewhere (Brasier et al. 1989. BioTechniques 7(10):1116-1122). All data are reported as mean ± standard error of the mean (SEM).

Results

Reporter constructs utilizing the chloramphenicol acetyl transferase (CAT) gene under the control of muscle-specific (β-myosin heavy chain gene (β-MHC)) or promiscuous (MSV) promoters were injected into the canine myocardium. Up to 30 separate injection sites were used per left ventricle with no mortality and only transient tachyarrhythmias. There was a linear dose-response relationship between the level of gene expression and the quantity of plasmid DNA injected between 10 μg and 200 μg. There was no regional variation in expression of injected reporter genes throughout the left ventricular wall. Using both the MSV and a muscle-specific β-MHC promoter, reporter gene expression was 1 to 2 orders of magnitude greater in the heart than in the skeletal muscle. Expression in the left ventricle was 3-fold higher than in the right ventricle. CAT-activity was detected at 3, 7, 14 and 21 days post-injection (p.i.) with maximal expression at 7 days p.i. Statistical analysis of co-injection experiments revealed that co-injection of a second gene construct (RSV-luciferase) is useful to control for transfection efficiency in vivo. Detection of regulatory sequences by injection of reporter constructs containing serial deletions of the β-MHC gene 5' flanking region revealed a pattern of expression that is in general agreement with results obtained in cell culture studies.

Fig. 9 shows a dose-response relation between the amount of injected DNA and CAT-activity. Scatter plot of CAT-activity (in counts per minute/100) per injection site versus total amount of DNA (MSV-CAT) per injection site. Means (±SEM) are shown as solid squares, n=4 for each dose. Linear regression function is shown (P<0.001).
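A linear dose-response relation of the kind summarized for Fig. 9 is conventionally described by an ordinary least-squares line through (dose, activity) pairs. The following is a minimal sketch of that fit, assuming plain lists of numbers as input; the function name is hypothetical, the source does not state how its regression was computed, and no significance test (such as the reported P value) is attempted here.

```python
def linear_fit(doses, activities):
    """Ordinary least-squares fit of activity = slope * dose + intercept."""
    n = len(doses)
    if n != len(activities) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(doses) / n
    mean_y = sum(activities) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    if sxx == 0:
        raise ValueError("all doses are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, activities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

In use, the doses would be the micrograms of plasmid per injection site and the activities the corresponding CAT counts; any numbers passed in for illustration are placeholders, not data from the figure.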

Fig. 10 shows a time course of expression of injected gene constructs. CAT-activity (in counts per minute/1000) versus days post injection for promiscuous (MSV, solid bars) and muscle specific (-667rβ-MHC, hatched bars) promoters driving the CAT reporter gene. Mean ± SEM, n=5 for each time point (*P<0.01 compared with day 7).

Fig. 11 shows a regional expression pattern of injected gene constructs throughout the left ventricular wall. 24 injections of -667rβ-MHC-CAT were performed with 4 columns around the left ventricle, each comprising 6 injection sites ranging from base to apex (see cartoon). Means ± SEM of each column are shown on the left hand panel (n=6). Means ± SEM of each row are shown on the right hand panel (n=4).

Fig. 12 shows an expression of promiscuous (MSV) or muscle-specific (-667rβ-MHC) promoter constructs in the right ventricle and in skeletal muscle. Values (mean ± SEM) are depicted as percent of expression of the same construct in the left ventricle (= 100%, solid bars). Open bar is right ventricle (n=10 for MSV, n=8 for -667rβ-MHC). Hatched bar is skeletal muscle (n=10 for MSV, n=9 for -667rβ-MHC).

Fig. 13 shows the correlation of CAT- to Luciferase-activity in co-injection experiments. Scatter plot of CAT-activity (counts per minute) versus luciferase-activity (light units). 100 μg of a tissue-specific (-667rβ-MHC-CAT, Fig. 5a) or promiscuous (MSV, Fig. 5B) reporter gene construct were co-injected with 20 μg of a control gene construct (RSV-Luciferase). The regression functions are as indicated.

Fig. 14 shows the mapping of the 5' flanking region of the β-MHC gene in vivo. A series of deletions of the upstream region of the rat β-MHC gene ranging from -3300 to -215 relative to the transcription start site were cloned in front of the CAT gene and injected into the canine myocardium. For comparison, -607rα-MHC-CAT and -256 ApoAI-CAT were also injected. 100 μg of reporter gene construct were co-injected with 20 μg of a control gene construct (RSV-Luciferase). CAT-activity was corrected for luciferase-activity and is expressed in percent of MSV-CAT. Open bars are β-MHC-CAT constructs (n=6-10). Hatched bar is the α-MHC-CAT construct (n=10). Solid bar is the ApoAI-CAT construct (n=10). See text for statistical analysis.
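The two numerical steps named in the figure descriptions, correcting CAT-activity for the co-injected luciferase control and reporting mean ± SEM, can be sketched as follows. The source does not give the correction formula, so the simple CAT/luciferase ratio scaled to an MSV-CAT reference ratio is an assumption, as are the function names and any example values.

```python
from math import sqrt


def percent_of_reference(cat_counts, luciferase_units, reference_ratio):
    """Per-site CAT/luciferase ratio expressed as percent of a reference ratio.

    Assumption: 'corrected for luciferase-activity' means a simple ratio, and
    reference_ratio is the mean CAT/luciferase ratio measured for MSV-CAT.
    """
    if len(cat_counts) != len(luciferase_units):
        raise ValueError("CAT and luciferase readings must be paired")
    return [100.0 * (c / l) / reference_ratio
            for c, l in zip(cat_counts, luciferase_units)]


def mean_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two values")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, sqrt(variance / n)
```

Normalizing each injection site before averaging keeps site-to-site differences in DNA uptake from inflating the spread between constructs, which is the stated purpose of the luciferase control.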

Other Modes of Administration of a Therapeutic Composition

The MEF2 polypeptides of the invention can be administered to a mammal, particularly a human, by any appropriate method: e.g., orally, parenterally, transdermally, or transmucosally. Administration can be in a sustained release formulation using a biodegradable biocompatible polymer, or by on-site delivery using micelles, gels or liposomes. Therapeutic doses can be, but are not necessarily, in the range of 0.001 - 100.0 mg/kg body weight, or a range that is clinically determined as appropriate by those skilled in the art. With the availability of the cloned gene, a substantially pure MEF2 polypeptide can be produced in quantity using standard techniques known to one skilled in the art (see, e.g., Scopes, R. Protein Purification: Principles and Practice. 1982 Springer Verlag, NY).

The nucleic acids of the invention can be administered to a mammal, preferably a human, or a domesticated animal, by techniques of gene therapy. An appropriate recombinant vector, e.g., an attenuated virus, is administered to a patient in a pharmaceutically-acceptable buffer (e.g., physiological saline). The therapeutic preparation is administered in accordance with the condition to be treated. For example, retroviral vectors can be used as a gene transfer delivery system for a MEF2 polypeptide. Numerous vectors useful for this purpose have been described (Miller, 1990 Human Gene Therapy 1:5-14; Friedman, 1989 Science 244:1275-1281; Eglitis et al. 1988 Biotechniques 6:608-614; Tolstoshev et al. 1990 Current Opinion in Biotechnology 1:55-61; Sharp, 1991 The Lancet 337:1277-1278; Cornetta et al., 1987 Nucleic Acid Research and Molecular Biology 36:311-322; Anderson 1984 Science 226:401-409; Moen, 1991 Blood Cells 12:407-416; and Miller et al. 1989 Biotechniques 7:980-990). Retroviral vectors are particularly well developed and have been used in a clinical setting (Rosenberg et al. 1990 N. Engl. J. Med. 323:370).

In many cases where it is necessary for the MEF2 polypeptide to enter the nucleus, it may be necessary to employ an attenuated viral vector that naturally replicates and is expressed in the nucleus. Alternatively, the nucleic acid vector can include a nuclear localization region, e.g., two consensus regions consisting of basic amino acids separated by approximately 10 "spacer" amino acids. This region is likely to be responsible for directing the transport of this protein from the cytoplasm, where it is produced, to the cellular nucleus (Dingwall, C. and Laskey, R., 1991. Trends in Biochemical Sciences. 16:478-481).
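The nuclear localization arrangement described above, two stretches of basic residues separated by roughly ten spacer residues, can be expressed as a rough sequence pattern. The sketch below is illustrative only: the run lengths (two basic residues, then at least three) and the spacer tolerance of 8 to 12 residues are assumptions chosen for the example, not values given in the text, and a match is a candidate, not evidence of nuclear import.

```python
import re

# Assumed pattern: 2 basic residues (K/R), 8-12 spacer residues, 3 basic residues.
_BIPARTITE = re.compile(r"[KR]{2}[A-Z]{8,12}[KR]{3}")


def find_bipartite_candidates(protein: str):
    """Return (0-based start, matched segment) for non-overlapping candidate matches."""
    return [(m.start(), m.group(0)) for m in _BIPARTITE.finditer(protein.upper())]
```

Because the spacer quantifier is greedy, a reported segment extends to the last basic triplet reachable within the allowed spacer range; tightening or loosening the assumed bounds changes which candidates are reported.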

The retroviral constructs, packaging cell lines and delivery systems which may be useful for this purpose include, but are not limited to, one, or a combination of, the following: Moloney murine leukemia viral vector types; self inactivating vectors; double copy vectors; selection marker vectors; and suicide mechanism vectors.

Non viral methods for the therapeutic delivery of nucleic acid encoding a MEF2 polypeptide

Nucleic acid encoding MEF2, or a fragment thereof, under the regulation of a muscle-cell specific promoter, and including the appropriate sequences required for autonomous replication or for insertion into genomic DNA of the patient, may be administered to the patient using the following gene transfer techniques: microinjection (Wolff et al., Science 247:1465 (1990)); calcium phosphate transfer (Graham and Van der Eb, Virology 52:456 (1973); Wigler et al., Cell 14:725 (1978); Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413 (1987)); lipofection (Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413 (1987); Ono et al., Neuroscience Lett 117:259 (1990); Brigham et al., Am. J. Med. Sci. 298:278 (1989); Straubinger and Papahadjopoulos, Meth. Enz. 101:512 (1983)); asialoorosomucoid-polylysine conjugation (Wu and Wu, J. Biol. Chem. 263:14621 (1988); Wu et al., J. Biol. Chem. 264:16985 (1989)); and electroporation (Neumann et al., EMBO J. 2:841 (1980)). These publications are hereby incorporated by reference.

Muscle Cell Specific Expression

In any of the modes of administration of MEF2 nucleic acids discussed above, e.g., administration by transgenics or gene therapy, the specific expression of MEF2 can be localized to muscle tissue by including the promoters of any of the following genes in the regulatory sequences of the construct to be administered: the MyoD family of genes; myogenin; creatine kinase; the myosin heavy chain gene family; the myosin light chain family; troponins; and tropomyosins.

Regulation of, and by, the MEF2 family proteins

The MEF2 genes can be induced by a family of transcription factors called the myogenic determination genes. We have tested two of these for their ability to induce MEF2. Both MyoD1 and myogenin are able to induce the MEF2 genes. On the other hand, MEF2 is able to induce the expression of myogenin.
These results indicate that there is a feedback loop by which the two families of myogenic regulators (the MyoD family and the MEF2 family) regulate each other. MEF2 upregulates many known muscle specific genes described to date. These include, but are not limited to, creatine kinase, the myosin heavy chain gene family, the myosin light chain family, troponins, tropomyosins, and various ion channels. A MEF2 protein or nucleic acid of the invention can be administered to a mammal to upregulate, or mask a symptomatic defect in, any of these genes, or any other as yet uncharacterized genes that include a MEF2 consensus DNA binding sequence in their 5' regulatory sequences.

Other Embodiments

Other embodiments are within the following claims. For example, the invention includes any protein that is substantially homologous to a member of the human MEF2 protein family, and possesses the transcriptional enhancer activity of the MEF2 family. Also included are: allelic variations; natural mutants; induced mutants; proteins encoded by DNA that hybridizes under high or low stringency conditions (e.g., washing at 2xSSC at 40 °C with a probe length of at least 40 nucleotides) to a naturally occurring MEF2 family nucleic acid (for other definitions of high and low stringency see Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1989, 6.3.1 - 6.3.6, hereby incorporated by reference); and polypeptides or proteins specifically bound by antisera to a member of the MEF2 protein family, especially by antisera to the active site or binding domain of a member of the MEF2 protein family. The term also includes chimeric polypeptides that include biologically active fragments of the MEF2 protein family.

The invention also includes any biologically active fragment or analog of a member of the MEF2 protein family. By "biologically active" is meant possessing in vivo or in vitro transcriptional activity which is characteristic of the MEF2 -amino acid polypeptide shown in Fig. 2. Since a member of the MEF2 protein family exhibits a range of physiological properties and since such properties may be attributable to different portions of the MEF2 molecule, a useful MEF2 fragment or MEF2 analog is one that exhibits a biological activity in any biological assay for MEF2 activity, as described above. Most preferably a MEF2 protein fragment or analog possesses 10%, preferably 40%, or at least 90% of the activity of a member of the MEF2 protein family, in any in vivo or in vitro MEF2 activity assay. Preferred analogs include MEF2 (or biologically active fragments thereof) whose sequences differ from the wild-type sequence only by conservative amino acid substitutions, for example, substitution of one amino acid for another with similar characteristics (e.g., valine for glycine, arginine for lysine, etc.) or by one or more non-conservative amino acid substitutions, deletions, or insertions which do not abolish the polypeptide's biological activity.

Other useful modifications include those which increase peptide stability. Such analogs may contain, for example, one or more non-peptide bonds (which replace the peptide bonds) or D-amino acids in the peptide sequence.

Analogs can differ from a naturally occurring member of the MEF2 protein family in amino acid sequence or in ways that do not involve sequence, or in both. Analogs of the invention will generally exhibit at least 70%, more preferably 80%, more preferably 90%, and most preferably 95% or even 99%, homology with a segment of 20 amino acid residues, preferably more than 40 amino acid residues, or more preferably the entire sequence of a naturally occurring MEF2 polypeptide sequence.
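As a rough illustration of the percentage thresholds above, the following sketch treats "homology" as simple percent identity between two equal-length, ungapped, already aligned sequences. That simplification is an assumption made here for illustration; the specification does not prescribe a particular alignment or scoring method. The function names are likewise hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_segment_identity(reference, analog, segment_length=20):
    """Highest percent identity found over any aligned window of the given length."""
    if len(reference) != len(analog):
        raise ValueError("sequences must be pre-aligned and of equal length")
    if segment_length > len(reference):
        raise ValueError("segment is longer than the sequences")
    best = 0.0
    for start in range(len(reference) - segment_length + 1):
        window_ref = reference[start:start + segment_length]
        window_ana = analog[start:start + segment_length]
        best = max(best, percent_identity(window_ref, window_ana))
    return best


def meets_threshold(reference, analog, threshold=70.0, segment_length=20):
    """True if some segment of the given length reaches the identity threshold."""
    return best_segment_identity(reference, analog, segment_length) >= threshold


if __name__ == "__main__":
    # Invented sequences, for illustration only.
    ref = "A" * 20 + "C" * 5
    ana = "A" * 20 + "G" * 5
    print(best_segment_identity(ref, ana, 20))
    print(meets_threshold(ref, ana, threshold=95.0))
```

A gapped alignment would be needed to compare sequences that differ by insertions or deletions; this sketch deliberately leaves that out.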

Alterations in primary sequence include genetic variants, both natural and induced. Also included are analogs that include residues other than naturally occurring L-amino acids, e.g., D-amino acids or non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids. Alternatively, increased stability may be conferred by cyclizing the peptide molecule, or by exposing the polypeptide to phosphorylation-altering enzymes, e.g., kinases or phosphatases. Other useful modifications also include in vivo or in vitro chemical derivatization of polypeptides, e.g., acetylation, methylation, phosphorylation, carboxylation, or glycosylation; glycosylation can be modified, e.g., by modifying the glycosylation patterns of a polypeptide during its synthesis and processing or in further processing steps, e.g., by exposing the polypeptide to glycosylation affecting enzymes derived from cells that normally provide such processing, e.g., mammalian glycosylation enzymes; phosphorylation can be modified by exposing the polypeptide to phosphorylation-altering enzymes, e.g., kinases or phosphatases.

In addition to substantially full-length MEF2 polypeptides, the invention also includes biologically active fragments of the MEF2 polypeptides. As used herein, the term "fragment", as applied to a polypeptide, will ordinarily be at least about 20 residues, more typically at least about 40 residues, or preferably at least about 60 residues in length. Fragments of a MEF2 polypeptide can be generated by methods known to those skilled in the art. The ability of a candidate fragment to exhibit a biological activity of a member of the MEF2 protein family can be assessed by methods known to those skilled in the art as described herein. Also included are MEF2 polypeptides containing residues that are not required for biological activity of the peptide, or that result from alternative mRNA splicing or alternative protein processing events.

What is claimed is:

Probe/Competitor DNA   Sequence                              MEF2 Binding   SEQUENCE ID
MEF2mt                 C G C T C T A A G G C T A A C C C T
MEF2mt4                C G C T C T A T A A A T A A C C C T   +++
MEF2mt6                C G C T C T A A A C A T A A C C C T
VTemb                  A T T T C T A T A T A T A C T T T C   +
λrG                    G G G G A C C A A A T A A G G C A A
OTF-2                  C C A A T G A T T T G C A T G C T C
LC2 HF-1               G G G G T T A A A A A T A A C C C C   +++
CK A/T                 C T G G T T A T A A T T A A C C C A   ++
TNT A/T                C G G G T T T A A A A T A G C A A A   ++
MHC A/T-1              C A G A T T A A A A A T A A C T A A   +
MHC A/T-2              A G G A C T A A A A A A A G G C C C   +
Consensus              C T A A A A A T A A T t T G

Table 1. Nucleotide sequences of probes and competitor DNAs used in MEF2 binding assays. Only the core sequences of the d.s. oligonucleotides are shown. (+) and (-) represent positive and negative binding of the probes, respectively (see Figure 5). Nucleotides in bold print conform to the consensus sequence of the MEF2 site as reported by Cserjesi and Olson, Mol Cell Bio, 11:4854-4862, 1991.
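The comparison of probe sequences against a consensus site can be sketched as a sliding-window mismatch count. The ten-base core `CTAAAAATAA` used below is an assumption: it is simply the first ten positions of the consensus row as printed in Table 1, not a validated definition of the MEF2 site. The function name `min_mismatches` is hypothetical, and a mismatch count is not a measure of binding strength.

```python
# Assumed core, taken from the first ten positions of the Table 1 consensus row.
ASSUMED_CORE = "CTAAAAATAA"


def min_mismatches(sequence, core=ASSUMED_CORE):
    """Smallest number of mismatches between the core and any window of the sequence.

    Spaces are ignored so that the spaced layout used in Table 1 can be pasted in.
    Only the strand as written is scanned (no reverse complement).
    """
    sequence = sequence.replace(" ", "").upper()
    if len(sequence) < len(core):
        raise ValueError("sequence is shorter than the core")
    best = len(core)
    for start in range(len(sequence) - len(core) + 1):
        window = sequence[start:start + len(core)]
        mismatches = sum(x != y for x, y in zip(window, core))
        best = min(best, mismatches)
    return best


# Three of the probe sequences from Table 1, copied as printed.
PROBES = {
    "MEF2mt": "C G C T C T A A G G C T A A C C C T",
    "MEF2mt4": "C G C T C T A T A A A T A A C C C T",
    "MEF2mt6": "C G C T C T A A A C A T A A C C C T",
}

if __name__ == "__main__":
    for name, probe in PROBES.items():
        print(name, min_mismatches(probe))
```

The binding results reported in Table 1 come from the binding assays themselves; a string comparison of this kind is only a convenience for inspecting how closely a probe resembles the assumed core.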

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Bernardo Nadal-Ginard

(ii) TITLE OF INVENTION: MYOCYTE-SPECIFIC TRANSCRIPTION ENHANCING FACTOR 2

(iii) NUMBER OF SEQUENCES: 45

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Fish & Richardson

(B) STREET: 225 Franklin Street

(C) CITY: Boston

(D) STATE: Massachusetts

(E) COUNTRY: U.S.A.

(F) ZIP: 02110-2804

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb

(B) COMPUTER: IBM PS/2 Model 50Z or 55SX

(C) OPERATING SYSTEM: MS-DOS (Version 5.0)

(D) SOFTWARE: WordPerfect (Version 5.1)

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 07/939,898

(B) FILING DATE: 04 SEP 1992

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: John W. Freeman

(B) REGISTRATION NUMBER: 29,066

(C) REFERENCE/DOCKET NUMBER: 00108/088001

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (617) 542-5070

(B) TELEFAX: (617) 542-8906

(C) TELEX: 200154

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2968

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GAATTTTCTG CAAGGATCAT ATCTAAGTGC ACTTTTTGCT GATACTTCAT TTCTAGACAT 60

TGAGTCTCAC TCTACCCCCC AGGCTGAAGT GCAGTGGTGT GATCTCGGTT CACTGCAACC 120

TCCGCCTCCA GGTTCAAGTG ATTCTCGTAC CTCAGCCTCC CGAGTAGCTG GGATTACAGG 180

CGCCTGCCAC CATGCCTGGC TGATATTTAT ATTTTTAGTA GAGATGGAGT TTCACCATGT 240

TGGCCAGGCT GGTCTCGAAC TCTGGACCTC AGATCTTGTA GAAAATTTCA GCTGTAGCCC 300

TTGGACTAGA AGCTGAAATA ACAGAAGCTG TGTACGATGC ATTAGGGTAT TGAAGAAAAT 360

TAACTTTTGA ATTAAATATT TGGAATATAA GGAAATAAGG AAAGTTGACT GAAA 414

ATG GGG CGG AAG AAA ATA CAA ATC ACA CGC ATA ATG GAT GAA AGG AAC 462 Met Gly Arg Lys Lys Ile Gln Ile Thr Arg Ile Met Asp Glu Arg Asn 1 5 10 15

CGA CAG GTC ACT TTT ACA AAG AGA AAG TTT GGA TTA ATG AAG AAA GCC 510 Arg Gln Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lys Lys Ala 20 25 30

TAT GAA CTT AGT GTG CTC TGT GAC TGT GAA ATA GCA CTC ATC ATT TTC 558 Tyr Glu Leu Ser Val Leu Cys Asp Cys Glu Ile Ala Leu Ile Ile Phe 35 40 45

AAC AGC TCT AAC AAA CTG TTT CAA TAT GCT AGC ACT GAT ATG GAC AAA 606 Asn Ser Ser Asn Lys Leu Phe Gln Tyr Ala Ser Thr Asp Met Asp Lys 50 55 60

GTT CTT CTC AAG TAT ACA GAA TAT AAT GAA CCT CAT GAA AGC AGA ACC 654 Val Leu Leu Lys Tyr Thr Glu Tyr Asn Glu Pro His Glu Ser Arg Thr 65 70 75 80

AAC TCG GAT ATT GTT GAG GCT CTG AAC AAG AAG GAA CAC AGA GGG TGC 702 Asn Ser Asp Ile Val Glu Ala Leu Asn Lys Lys Glu His Arg Gly Cys 85 90 95

GAC AGC CCA GAC CCT GAT ACT TCA TAT GTG CTA ACT CCA CAT ACA GAA 750 Asp Ser Pro Asp Pro Asp Thr Ser Tyr Val Leu Thr Pro His Thr Glu 100 105 110

GAA AAA TAT AAA AAA ATT AAT GAG GAA TTT GAT AAT ATG ATG CGG AAT 798 Glu Lys Tyr Lys Lys Ile Asn Glu Glu Phe Asp Asn Met Met Arg Asn 115 120 125

CAT AAA ATC GCA CCT GGT CTG CCA CCT CAG AAC TTT TCA ATG TCT GTC 846 His Lys Ile Ala Pro Gly Leu Pro Pro Gln Asn Phe Ser Met Ser Val 130 135 140

ACA GTT CCA GTG ACC AGC CCC AAT GCT TTG TCC TAC ACT AAC CCA GGG 894 Thr Val Pro Val Thr Ser Pro Asn Ala Leu Ser Tyr Thr Asn Pro Gly 145 150 155 160

AGT TCA CTG GTG TCC CCA TCT TTG GCA GCC AGC TCA ACG TTA ACA GAT 942 Ser Ser Leu Val Ser Pro Ser Leu Ala Ala Ser Ser Thr Leu Thr Asp

165 170 175

TCA AGC ATG CTC TCT CCA CCT CAA ACC ACA TTA CAT AGA AAT GTG TCT 990 Ser Ser Met Leu Ser Pro Pro Gin Thr Thr Leu His Arg Asn Val Ser 180 185 190

CCT GGA GCT CCT CAG AGA CCA CCA AGT ACT GGC AAT GCA GGT GGG ATG 1038 Pro Gly Ala Pro Gin Arg Pro Pro Ser Thr Gly Asn Ala Gly Gly Met 195 200 205

TTG AGC ACT ACA GAC CTC ACA GTG CCA AAT GGA GCT GGA AGC AGT CCA 1086 Leu Ser Thr Thr Asp Leu Thr Val Pro Aεn Gly Ala Gly Ser Ser Pro 210 215 220

GTG GGG AAT GGA TTT GTA AAC TCA AGA GCT TCT CCA AAT TTG ATT GGA 1134 Val Gly Asn Gly Phe Val Asn Ser Arg Ala Ser Pro Aεn Leu lie Gly 225 230 235 240

GCT ACT GGT GCA AAT AGC TTA GGC AAA GTC ATG CCT ACA AAG TCT CCC 1182 Ala Thr Gly Ala Asn Ser Leu Gly Lys Val Met Pro Thr Lys Ser Pro

245 250 255

CCT CCA CCA GGT GGT GGT AAT CTT GGA ATG AAC AGT AGG AAA CCA GAT 1230 Pro Pro Pro Gly Gly Gly Asn Leu Gly Met Asn Ser Arg Lyε Pro Aεp 260 265 270

CTT CGA GTT GTC ATC CCC CCT TCA AGC AAG GGC ATG ATG CCT CCA CTA 1278 Leu Arg Val Val lie Pro Pro Ser Ser Lys Gly Met Met Pro Pro Leu 275 280 285

TCG GAG GAA GAG GAA TTG GAG TTG AAC ACC CAA AGG ATC AGT AGT TCT 1326 Ser Glu Glu Glu Glu Leu Glu Leu Aεn Thr Gin Arg lie Ser Ser Ser 290 295 300

CAA GCC ACT CAA CCT CTT GCT ACC CCA GTC GTG TCT GTG ACA ACC CCA 1374 Gin Ala Thr Gin Pro Leu Ala Thr Pro Val Val Ser Val Thr Thr Pro 305 310 315 320

AGC TTG CCT CCG CAA GGA CTT GTG TAC TCA GCA ATG CCG ACT GCC TAC 1422 Ser Leu Pro Pro Gln Gly Leu Val Tyr Ser Ala Met Pro Thr Ala Tyr

325 330 335

AAC ACT GAT TAT TCA CTG ACC AGC GCT GAC CTG TCA GCC CTT CAA GGC 1470 Aεn Thr Aεp Tyr Ser Leu Thr Ser Ala Aεp Leu Ser Ala Leu Gin Gly 340 345 350

TTC AAC TCG CCA GGA ATG CTG TCG CTG GGA CAG GTG TCG GCC TGG CAG 1518 Phe Aεn Ser Pro Gly Met Leu Ser Leu Gly Gin Val Ser Ala Trp Gin 355 360 365

CAG CAC CAC CTA GGA CAA GCA GCC CTC AGC TCT CTT GTT GCT GGA GGG 1566 Gin His His Leu Gly Gin Ala Ala Leu Ser Ser Leu Val Ala Gly Gly 370 375 380

CAG TTA TCT CAG GGT TCC AAT TTA TCC ATT AAT ACC AAC CAA AAC ATC 1614 Gin Leu Ser Gin Gly Ser Asn Leu Ser lie Aεn Thr Asn Gin Asn lie 385 390 395 400

AGC ATC AAG TCC GAA CCG ATT TCA CCT CCT CGG GAT CGT ATG ACC CCA 1662 Ser lie Lys Ser Glu Pro lie Ser Pro Pro Arg Asp Arg Met Thr Pro

405 410 415

TCG GGC TTC CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CCG CCG 1710 Ser Gly Phe Gin Gin Gin Gin Gin Gin Gin Gin Gin Gin Gin Pro Pro 420 425 430

CCA CCA CCG CAG CCC CAG CCA CAA CCC CCG CAG CCC CAG CCC CGA CAG 1758 Pro Pro Pro Gin Pro Gin Pro Gin Pro Pro Gin Pro Gin Pro Arg Gin 435 440 445

GAA ATG GGG CGC TCC CCT GTG GAC AGT CTG AGC AGC TCT AGT AGC TCC 1806 Glu Met Gly Arg Ser Pro Val Asp Ser Leu Ser Ser Ser Ser Ser Ser 450 455 460

TAT GAT GGC AGT GAT CGG GAG GAT CCA CGG GGC GAC TTC CAT TCT CCA 1854 Tyr Asp Gly Ser Aεp Arg Glu Aεp Pro Arg Gly Aεp Phe His Ser Pro 465 470 475 480

ATT GTG CTT GGC CGA CCC CCA AAC ACT GAG GAC AGA GAA AGC CCT TCT 1902 lie Val Leu Gly Arg Pro Pro Asn Thr Glu Aεp Arg Glu Ser Pro Ser

485 490 495

GTA AAG CGA ATG AGG ATG GAC GCG TGG GTG ACC TAA 1938

Val Lys Arg Met Arg Met Asp Ala Trp Val Thr 500 505

GGCTTCCAAG CTGATGTTTG TACTTTTGTG TTACTGCAGT GACCTGCCCT ACATATCTAA 1998

ATCGGTAAAT AAGGACATGA GTTAAATATA TTTATATGTA CATACATATA TATATCCCTT 2058

TACATATATA TGTATGTGGG TGTGAGTGTG TGTGTATGTG TGGGTGTGTG TTACATACAC 2118

AGAATCAGGC ACTTACCTGC AAACTCCTTG TAGGTCTGCA GATGTGTGTC CCATGGCAGA 2178

CAAAGCACCC TGTAGGCACA GACAAGTCTG GCACTTCCTT GGACTACTTG TTTCGTAAAG 2238

ATAACCAGTT TTTGCAGAGA AACGTGTACC CATATATAAT TCTCCCACAC TAGCTTGCAG 2298

AAACCTAGAG GGCCCCCTAC TTGTTTTATT TAACTGTGCA GTGACTGTAG TTACTTAAGA 2358

GAAAATGCTT TGTAGAACAG AGCAGTAGAA AAGCAGGAAC CAAGAAAGCA ATACTGTACA 2418

TAAAATGTCA TTTATATTTT CCAACCTGGC ATGGGTGTCT GTTGCAAAGG GGTGCATGGG 2478

AAAGGGCTGT TGATATTAAA AACAAACAAA ACAAAAAAGC CCCACACATA ACTGTTTTGC 2538

ACGTGCAAAA ATGTATTGGG TCAAGAAGTG ATCTTTAGCT AATAAAGAAA GAGAATAGAA 2598

AACACGCATG AGATATTCAG AAAATACTAG CCTAGAAATA TAGAGCATTA ACAAAGGAAA 2658

ATTAATATAT TAAGTTATAA TTGGAATATG TCAGAAGTTT CTTTTTACAT TCATATCTTA 2718

AAAATTAAAG AAACTGATTT TAGCTCATGT ATATTTTATA TGAAAGAAAA CACCCTTATG 2778

AATTGATGAC TATATATAAA ATTATATTCA CTACTTTTGA ACACATTCTG CTATGAATTA 2838

TTTATATAAG CCAAAGCTAT ATGTTGTAAC TTTTTTTTAG AGAATAGCTT TATCTTGGTT 2898

TAACTCTTTA GTTTTATTTT AAGAGGGGAA AACAAAAATA TCTTGCAAGC AGAACCTTGA 2958

AAAAAAAAAA 2968
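The codon-by-codon layout of SEQ ID NO: 1 pairs each nucleotide triplet with its amino acid. A minimal sketch of that pairing, using the standard genetic code, is given here as a way to cross-check the listing against its translation. The helper names `translate` and `to_three_letter` are hypothetical; the sketch assumes the input starts in frame at the first codon and contains only the bases A, C, G, and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter names, matching the notation of the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces used in the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def to_three_letter(protein):
    """Render a one-letter protein string in three-letter code."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


if __name__ == "__main__":
    # First sixteen codons of the coding region of SEQ ID NO: 1.
    first_row = "ATG GGG CGG AAG AAA ATA CAA ATC ACA CGC ATA ATG GAT GAA AGG AAC"
    print(to_three_letter(translate(first_row)))
```

Running the same check on other rows of the listing is one way to spot positions where a codon and its printed amino acid disagree.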

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 55

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Arg Arg Lys Gln Pro Ile Arg Tyr Ile Glu Asn Lys Thr Arg Arg His 5 10 15

Val Thr Phe Ser Lys Arg Arg His Gly Ile Met Lys Lys Ala Tyr Glu 20 25 30

Leu Ser Val Leu Thr Gly Ala Asn Ile Leu Leu Leu Ile Leu Ala Asn 35 40 45

Ser Gly Leu Val Tyr Thr Phe 50

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1500

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CGGAGCCGGA GATGCAGCTC AAGGGGAAGA AAGCGCCGTG AAGAACCTGG TGGACAGCAG 60

CGTCTACTTC CGCAGCGTGG AGGGTCTGCT CAAACAGGCC ATCAGCATCC GGGACCATAT 120

GAATGCCAGT GCCCAGGGCC ACAGCCCGGA GGAACCACCC CCGCCCTCCT CAGCCTGATC 180

CTGGAAGAGA CTCGGGGCCC CCCAGCCTCC GCCAACCCAG ACAAAGATCA TTCCACTCAG 240

CCTGGGACG 249

ATG GGG AGG AAA AAA ATC CAG ATC TCC CGC ATC CTG GAC CAA AGG AAT 297 Met Gly Arg Lyε Lyε lie Gin lie Ser Arg lie Leu Aεp Gin Arg Aεn 1 5 10 15

CGG CAG GTG ACG TTC ACC AAG CGG AAG TTC GGG CTG ATG AAG AAG GCC 345 Arg Gin Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lys Lys Ala 20 25 30

TAT GAG CTG AGC GTG CTC TGT GAC TGT GAG ATA GCC CTC ATC ATC TTC 393 Tyr Glu Leu Ser Val Leu Cyε Aεp Cyε Glu lie Ala Leu lie lie Phe 35 40 45

AAC AGC GCC AAC CGC CTC TTC CAG TAT GCC AGC ACG GAC ATG GAC CGT 441 Aεn Ser Ala Aεn Arg Leu Phe Gin Tyr Ala Ser Thr Aεp Met Aεp Arg 50 55 60

GTG CTG CTG AAG TAC ACA GAG TAC AGC GAG CCC CAC GAG AGC CGC ACC 489 Val Leu Leu Lys Tyr Thr Glu Tyr Ser Glu Pro His Glu Ser Arg Thr 65 70 75 80

AAC ACT GAC ATC CTC GAG ACG CTG AAG CGG AGG GGC ATT GGC CTC GAT 537 Asn Thr Aεp lie Leu Glu Thr Leu Lyε Arg Arg Gly lie Gly Leu Asp

85 90 95 GGG CCA GAG CTG GAG CCG GAT GAA GGG CCT GAG GAG CCA GGA GAG AAG 585

Gly Pro Glu Leu Glu Pro Asp Glu Gly Pro Glu Glu Pro Gly Glu Lys 100 105 110

TTT CGG AGG CTG GCA GGC GAA GGG GGT GAT CCG GCC TTG CCC CGA CCC 633

Phe Arg Arg Leu Ala Gly Glu Gly Gly Asp Pro Ala Leu Pro Arg Pro

115 120 125

CGG CTG TAT CCT GCA GCT CCT GCT ATG CCC AGC CCA GAT GTG GTA TAC 681

Arg Leu Tyr Pro Ala Ala Pro Ala Met Pro Ser Pro Asp Val Val Tyr 130 135 140

GGG GCC TTA CCG CCA CCA GGC TGT GAC CCC AGT GGG CTT GGG GAA GCA 729

Gly Ala Leu Pro Pro Pro Gly Cys Asp Pro Ser Gly Leu Gly Glu Ala 145 150 155 160

CTG CCC GCC CAG AGC CGC CCA TCT CCC TTC CGA CCA GCA GCC CCC AAA 777

Leu Pro Ala Gin Ser Arg Pro Ser Pro Phe Arg Pro Ala Ala Pro Lys

165 170 175

GCC GGG CCC CCA GGC CTG GTG CAC CCT CTC TTC TCA CCA AGC CAC CTC 825

Ala Gly Pro Pro Gly Leu Val His Pro Leu Phe Ser Pro Ser His Leu 180 185 190

ACC AGC AAG ACA CCA CCC CCA CTG TAC CTG CCG ACG GAA GGG CGG AGG 873

Thr Ser Lys Thr Pro Pro Pro Leu Tyr Leu Pro Thr Glu Gly Arg Arg

195 200 205

TCA GAC CTG CCT GGT GGC CTG GCT GGG CCC CGA GGG GGA CTA AAC ACC 921 Ser Asp Leu Pro Gly Gly Leu Ala Gly Pro Arg Gly Gly Leu Asn Thr 210 215 220

TCC AGA AGC CTC TAC AGT GGC CTG CAG AAC CCC TGC TCC ACT GCA ACT 969 Ser Arg Ser Leu Tyr Ser Gly Leu Gin Aεn Pro Cys Ser Thr Ala Thr 225 230 235 240

CCC GGA CCC CCA CTG GGG AGC TTC CCC TTC CTC CCC GGA GGC CCC CCA 1017 Pro Gly Pro Pro Leu Gly Ser Phe Pro Phe Leu Pro Gly Gly Pro Pro

245 250 255

GTG GGG GCC GAA GCC TGG GCG AGG AGG GTC CCC CAA CCC GCG GCG CCT 1065 Val Gly Ala Glu Ala Trp Ala Arg Arg Val Pro Gin Pro Ala Ala Pro 260 265 270

CCC CGC CGA CCC CCC CAG TCA GCA TCA AGT CTG AGC GCC TCT CTC CGG 1113 Pro Arg Arg Pro Pro Gin Ser Ala Ser Ser Leu Ser Ala Ser Leu Arg 275 280 285

CCC CCG GGG GCC CCG GCG ACT TTC CTA AGA CCT TCC CCT ATC CCT TGC 1161 Pro Pro Gly Ala Pro Ala Thr Phe Leu Arg Pro Ser Pro Ile Pro Cys 290 295 300

TCC TCG CCC GGT CCC TGG CAG AGC CTC TGC GGC CTG GGC CCG CCC TGC 1209 Ser Ser Pro Gly Pro Trp Gln Ser Leu Cys Gly Leu Gly Pro Pro Cys 305 310 315 320

GCC GGC TGC CCT TGG CCG ACG GCT GGC CCC GGT AGG AGA TCA CCC GGT 1257 Ala Gly Cys Pro Trp Pro Thr Ala Gly Pro Gly Arg Arg Ser Pro Gly

325 330 335

GGC ACC AGC CCA GAG CGC TCG CCA GGT ACG GCG AGG GCA CGT GGG GAC 1305 Gly Thr Ser Pro Glu Arg Ser Pro Gly Thr Ala Arg Ala Arg Gly Asp 340 345 350

CCC ACC TCC CTC CAG GCC TCT TCA GAG AAG ACC CAA CAG TGA 1347

Pro Thr Ser Leu Gin Ala Ser Ser Glu Lys Thr Gin Gin 355 360 365

CGCCCCCCTC CGCGGTGGGG GCTTGGAGGT GGGCGGCTGG ACTCAATCCA CCCTGGGGGG 1407

CTCCTTTCCT TCTTCCTATT TGTGTGTATA TCCACAAATA AAACGCGCGT GGCGTCCGTG 1467

GACCAGAAAA AAAAAAAAAA AAAAAAAAAA AAA 1500

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2161

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AAGATGCATG GTGAAATGGT GAGATCAGAA AGGGGCCCTG CATTTGATAA AAATGCAAAA 60

AACAAAATAA AAATAGCATG AAAGAAACTA GTATATACAA TGGATGTCAG TTGACCCAAT 120

AGATTGCTAA TGTATTAAAA ACAATTTAGG GTGTTGCAAT GTGATATGTT CTAACCCCAC 180

AGGTTATCCT TTTGACAGCT GACCTTAAAC TTATAAAATG TAAGCAGAGT AAAAGAAAAC 240

AGANAGAANA TAGTTACTCA AATGTGCAAC TGCACAAATA TACCCCCCTC CCGCTATTAA 300

GATAACAAAA CTTCTGCTAT TACCATAATA TTATATATAT TAGAAAGCTA TACACAAGCA 360

TGTTAATTTC ACAGATTTTT TTAAAAGATT CTTAATATTT TATATAATTA GAAATACACA 420

CATTTCAAAA ACAAACTTCT ACAAAGAGAA AACAGTTATC TTGGTTAGCA AAGCATGGAG 480 TTCTTCATGG CTTAGGGTAG TGCTTTCTAT ACACAAAGTC CTTTTTGGTT TTTTACAGGA 540

CTGTTTAAAA TATTAGCGAC GCTATCAAGG AAAAAATACA TAATTTCAGG GACGAGAGAA 600

AGAAAAGGAA GGAAAAAATA CATAATTTCA GGGACGAGAG AGAGAAGAAA AACGGGGACT 660

ATG GGG AGA AAA AAG ATT CAG ATT ACG AGG ATT ATG GAT GAA CGT AAC 708 Met Gly Arg Lyε Lyε lie Gin lie Thr Arg lie Met Aεp Glu Arg Aεn 1 5 10 15

AGA CAG GTG ACA TTT ACA AAG AGG AAA TTT GGG TTG ATG AAG AAG GCT 756 Arg Gin Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lys Lys Ala 20 25 30

TAT GAG CTG AGC GTG CTG TGT GAC TGT GAG ATT GCG CTG ATC ATC TTC 804 Tyr Glu Leu Ser Val Leu Cys Asp Cys Glu Ile Ala Leu Ile Ile Phe 35 40 45

AAC AGC ACC AAC AAG CTG TTC CAG TAT GCC AGC ACC GAC ATG GAC AAA 852 Aεn Ser Thr Aεn Lyε Leu Phe Gin Tyr Ala Ser Thr Aεp Met Aεp Lyε 50 55 60

GTG CTT CTC AAG TAC ACG GAG TAC AAC GAG CCG CAT GAG AGC CGG ACA 900 Val Leu Leu Lys Tyr Thr Glu Tyr Asn Glu Pro His Glu Ser Arg Thr 65 70 75 80

AAC TCA GAC ATC GTG GAG ACG TTG AGA AAG AAG GGC CTT AAT GGC TGT 948 Asn Ser Asp lie Val Glu Thr Leu Arg Lys Lys Gly Leu Asn Gly Cyε

85 90 95

GAC AGC CCA GAC CCC GAT GCG GAC GAT TCC GTA GGT CAC AGC CCT GAG 996 Aεp Ser Pro Aεp Pro Aεp Ala Asp Asp Ser Val Gly His Ser Pro Glu 100 105 110

TCT GAG GAC AAG TAC AGG AAA ATT AAC GAA GAT ATT GAT CTA ATG ATC 1044 Ser Glu Asp Lys Tyr Arg Lys lie Asn Glu Asp lie Asp Leu Met lie 115 120 125

AGC AGG CAA AGA TTG TGT GCT GTT CCA CCT CCC AAC TTC GAG ATG CCA 1092 Ser Arg Gin Arg Leu Cyε Ala Val Pro Pro Pro Asn Phe Glu Met Pro 130 135 140

GTC TCC ATC CCA GTG TCC AGC CAC AAC AGT TTG GTG TAC AGC AAC CCT 1140 Val Ser Ile Pro Val Ser Ser His Asn Ser Leu Val Tyr Ser Asn Pro 145 150 155 160

GTC AGC TCA CTG GGA AAC CCC AAC CTA TTG CCA CTG GCT CAC CCT TCT 1188 Val Ser Ser Leu Gly Aεn Pro Aεn Leu Leu Pro Leu Ala His Pro Ser

165 170 175

CTG CAG AGG AAT AGT ATG TCT CCT GGT GTA ACA CAT CGA CCT CCA AGT 1236 Leu Gln Arg Asn Ser Met Ser Pro Gly Val Thr His Arg Pro Pro Ser 180 185 190

GCA GGT AAC ACA GGT GGT CTG ATG GGT GGA GAC CTC ACG TCT GGT GCA 1284 Ala Gly Asn Thr Gly Gly Leu Met Gly Gly Asp Leu Thr Ser Gly Ala 195 200 205

GGC ACC AGT GCA GGG AAC GGG TAT GGC AAT CCC CGA AAC TCA CCA GGT 1332 Gly Thr Ser Ala Gly Asn Gly Tyr Gly Asn Pro Arg Asn Ser Pro Gly 210 215 220

CTG CTG GTC TCA CCT GGT AAC TTG AAC AAG AAT ATG CAA GCA AAA TCT 1380 Leu Leu Val Ser Pro Gly Asn Leu Asn Lys Asn Met Gin Ala Lys Ser 225 230 235 240

CCT CCC CCA ATG AAT TTA GGA ATG AAT AAC CGT AAA CCA GAT CTC CGA 1428 Pro Pro Pro Met Aεn Leu Gly Met Asn Asn Arg Lyε Pro Asp Leu Arg

245 250 255

GTT CTT ATT CCA CCA GGC AGC AAG AAT ACG ATG CCA TCA GTG AAT CAA 1476 Val Leu lie Pro Pro Gly Ser Lys Asn Thr Met Pro Ser Val Asn Gin 260 265 270

AGG ATA AAT AAC TCC CAG TCG GCT CAG TCA TTG GCT ACC CCA GTG GTT 1524 Arg lie Asn Asn Ser Gin Ser Ala Gin Ser Leu Ala Thr Pro Val Val 275 280 285

TCC GTA GCA ACT CCT ACT TTA CCA GGA CAA GGA ATG GGA GGA TAT CCA 1572 Ser Val Ala Thr Pro Thr Leu Pro Gly Gin Gly Met Gly Gly Tyr Pro 290 295 300

TCA GCC ATT TCA ACA ACA TAT GGT ACC GAG TAC TCT CTG AGT AGT GCA 1620 Ser Ala lie Ser Thr Thr Tyr Gly Thr Glu Tyr Ser Leu Ser Ser Ala 305 310 315 320

GAC CTG TCA TCT CTG TCT GGG TTT AAC ACC GCC AGC GCT CTT CAC CTT 1668 Asp Leu Ser Ser Leu Ser Gly Phe Asn Thr Ala Ser Ala Leu His Leu

325 330 335

GGT TCA GTA ACT GGC TGG CAA CAG CAA CAC CTA CAT AAC ATG CCA CCA 1716 Gly Ser Val Thr Gly Trp Gin Gin Gin Hiε Leu Hiε Asn Met Pro Pro 340 345 350

TCT GCC CTC AGT CAG TTG GGA GCT TGC ACT AGC ACT CAT TTA TCT CAG 1764 Ser Ala Leu Ser Gin Leu Gly Ala Cys Thr Ser Thr His Leu Ser Gin 355 360 365

AGT TCA AAT CTC TCC CTG CCT TCT ACT CAA AGC CTC AAC ATC AAG TCA 1812 Ser Ser Asn Leu Ser Leu Pro Ser Thr Gin Ser Leu Aεn lie Lyε Ser 370 375 380

GAA CCT GTT TCT CCT CCT AGA GAC CGT ACC ACC ACC CCT TCG AGA TAC 1860 Glu Pro Val Ser Pro Pro Arg Asp Arg Thr Thr Thr Pro Ser Arg Tyr 385 390 395 400 CCA CAA CAC ACG CGC CAC CAG GCG GGG AGA TCT CCT GTT GAC AGC TTG 1908 Pro Gin Hiε Thr Arg His Glu Ala Gly Arg Ser Pro Val Asp Ser Leu

405 410 415

AGC AGC TGT AGC AGT TCG TAC GAC GGG AGC GAC CGA GAG GAT CAC CGG 1956 Ser Ser Cys Ser Ser Ser Tyr Asp Gly Ser Asp Arg Glu Asp His Arg 420 425 430

AAC GAA TTC CAC TCC CCC ATT GGA CTC ACC AGA CCT TCG CCG GAC GAA 2004 Aεn Glu Phe Hiε Ser Pro lie Gly Leu Thr Arg Pro Ser Pro Aεp Glu 435 440 445

AGG GAA AGT CCC TCA GTC AAG CGC ATG CGA CTT TCT GAA GGA TGG GCA 2052 Arg Glu Ser Pro Ser Val Lys Arg Met Arg Leu Ser Glu Gly Trp Ala 450 455 460

ACA 2055

Thr

465

TGATCAGATT ATTACTTACT AGTTTTTTTT TTTCTCTTGC AGTGTGTGTG TGTTATACCT 2115

TAATGGGGAA GGGGGGTCGA TATGCATTAT ATGTGCCGTG TGTGGA 2161

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 465

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Gly Arg Lys Lys lie Gin lie Thr Arg lie Met Asp Glu Arg Asn

5 10 15

Arg Gin Val Thr Phe Thr Lyε Arg Lyε Phe Gly Leu Met Lyε Lys Ala 20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Cys Glu lie Ala Leu lie lie Phe 35 40 45

Asn Ser Thr Asn Lys Leu Phe Gin Tyr Ala Ser Thr Asp Met Aεp Lyε 50 55 60

Val Leu Leu Lyε Tyr Thr Glu Tyr Aεn Glu Pro His Glu Ser Arg Thr 65 70 75 80 Aεn Ser Asp lie Val Glu Thr Leu Arg Lys Lys Gly Leu Asn Gly Cys

85 90 95

Asp Ser Pro Asp Pro Asp Ala Asp Asp Ser Val Gly His Ser Pro Glu

100 105 110

Ser Glu Asp Lys Tyr Arg Lys lie Asn Glu Asp lie Asp Leu Met lie 115 120 125

Ser Arg Gin Arg Leu Cys Ala Val Pro Pro Pro Aεn Phe Glu Met Pro 130 135 140

Val Ser lie Pro Val Ser Ser Hiε Aεn Ser Leu Val Tyr Ser Aεn Pro

145 150 155

Val Ser Ser Leu Gly Asn Pro Asn Leu Leu Pro Leu Ala His Pro Ser 160 165 170 175

Leu Gin Arg Asn Ser Met Ser Pro Gly Val Thr His Arg Pro Pro Ser

180 185 190

Ala Gly Asn Thr Gly Gly Leu Met Gly Gly Asp Leu Thr Ser Gly Ala 195 200 205

Gly Thr Ser Ala Gly Asn Gly Tyr Gly Asn Pro Arg Asn Ser Pro Gly 210 215 220

Leu Leu Val Ser Pro Gly Asn Leu Asn Lys Asn Met Gin Ala Lys Ser 225 230 235

Pro Pro Pro Met Asn Leu Gly Met Asn Asn Arg Lys Pro Asp Leu Arg 240 245 250 255

Val Leu lie Pro Pro Gly Ser Lys Asn Thr Met Pro Ser Val Asn Gin

260 265 270

Arg lie Asn Asn Ser Gin Ser Ala Gin Ser Leu Ala Thr Pro Val Val 275 280 285

Ser Val Ala Thr Pro Thr Leu Pro Gly Gin Gly Met Gly Gly Tyr Pro 290 295 300

Ser Ala lie Ser Thr Thr Tyr Gly Thr Glu Tyr Ser Leu Ser Ser Ala 305 310 315

Asp Leu Ser Ser Leu Ser Gly Phe Asn Thr Ala Ser Ala Leu His Leu 320 325 330 335

Gly Ser Val Thr Gly Trp Gin Gin Gin His Leu His Asn Met Pro Pro

340 345 350

Ser Ala Leu Ser Gin Leu Gly Ala Cyε Thr Ser Thr Hiε Leu Ser Gin 355 360 365 Ser Ser Asn Leu Ser Leu Pro Ser Thr Gin Ser Leu Asn lie Lys Ser 370 375 380

Glu Pro Val Ser Pro Pro Arg Asp Arg Thr Thr Thr Pro Ser Arg Tyr 385 390 395

Pro Gin Hiε Thr Arg Hiε Glu Ala Gly Arg Ser Pro Val Aεp Ser Leu 400 405 410 415

Ser Ser Cyε Ser Ser Ser Tyr Asp Gly Ser Asp Arg Glu Asp His Arg

420 425 430

Asn Glu Phe His Ser Pro lie Gly Leu Thr Arg Pro Ser Pro Asp Glu

435 440 445

Arg Glu Ser Pro Ser Val Lys Arg Met Arg Leu Ser Glu Gly Trp Ala 450 455 460

Thr 465

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 371

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGGCGGGCCC GGGCTGCGGC GTGTGCGCGC CCGCCAGCTG CTCCGGAGAT ACGGAATTGC 60

ATTTTGTGAA AAAAGAACAA GAATTTTCTG CAAGGATCAT ATCTAAGTGC ACTTTTTGCT 120

GATACTTCAT TTCTAATCTT GTAGAAAATT TCAGCTGTAG CCCTTGGACT AGAAGCTGAA 180

ATAACAGAAG CTGTGTACGA TGCATTAGGG TATTGAAGAA AATTAACTTT TGAATTAAAT 240

ATTTGGAATA TAAGGAAATA AGGAAAGTTG ACTGAAAATG GGGCGGAAGA AAATACAAAT 300

CACACGCATA ATGGATGAAA GGAACCGACA GGTCACTTTT ACAAAGAGAA AGTTTGGATT 360

AATGAAGAAA G 371

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2950

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GAATTTTCTG CAAGGATCAT ATCTAAGTGC ACTTTTTGCT GATACTTCAT TTCTAGACAT 60

TGAGTCTCAC TCTACCCCCC AGGCTGAAGT GCAGTGGTGT GATCTCGGTT CACTGCAACC 120

TCCGCCTCCA GGTTCAAGTG ATTCTCGTAC CTCAGCCTCC CGAGTAGCTG GGATTACAGG 180

CGCCTGCCAC CATGCCTGGC TGATATTTAT ATTTTTAGTA GAGATGGAGT TTCACCATGT 240

TGGCCAGGCT GGTCTCGAAC TCTGGACCTC AGATCTTGTA GAAAATTTCA GCTGTAGCCC 300

TTGGACTAGA AGCTGAAATA ACAGAAGCTG TGTACGATGC ATTAGGGTAT TGAAGAAAAT 360

TAACTTTTGA ATTAAATATT TGGAATATAA GGAAATAAGG AAAGTTGACT GAAA 414

ATG GGG CGG AAG AAA ATA CAA ATC ACA CGC ATA ATG GAT GAA AGG AAC 462 Met Gly Arg Lys Lys lie Gin lie Thr Arg lie Met Asp Glu Arg Asn 1 5 10 15

CGA CAG GTC ACT TTT ACA AAG AGA AAG TTT GGA TTA ATG AAG AAA GCC 510 Arg Gin Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lys Lys Ala 20 25 30

AAC AGC TCT AAC AAA CTG TTT CAA TAT GCT AGC ACT GAT ATG GAC AAA 606 Asn Ser Ser Asn Lys Leu Phe Gin Tyr Ala Ser Thr Asp Met Asp Lys 50 55 60

GTT CTT CTC AAG TAT ACA GAA TAT AAT GAA CCT CAT GAA AGC AGA ACC 654 Val Leu Leu Lyε Tyr Thr Glu Tyr Aεn Glu Pro Hiε Glu Ser Arg Thr 65 70 75 80

AAC TCG ACT TTA AGA AAG AAA GGC CTT AAT GGT TGT GAG AGC CCT GAT 702 Asn Ser Thr Leu Arg Lys Lys Gly Leu Asn Gly Cys Glu Ser Pro Asp

85 90 95

GCT GAC GAT TAC TTT GAG CAC AGT CCA CTC TCG GAG GAC AGA TTC AGC 750 Ala Asp Asp Tyr Phe Glu His Ser Pro Leu Ser Glu Asp Arg Phe Ser 100 105 110

AAA CTA AAT GAA GAT AGT GAT TTT ATT TTC AAA CGA GGC CCT CCT GGT 798 Lys Leu Asn Glu Asp Ser Asp Phe lie Phe Lys Arg Gly Pro Pro Gly 115 120 125 CTG CCA CCT CAG AAC TTT TCA ATG TCT GTC ACA GTT CCA GTG ACC AGC 846 Leu Pro Pro Gin Asn Phe Ser Met Ser Val Thr Val Pro Val Thr Ser 130 135 140

CCC AAT GCT TTG TCC TAC ACT AAC CCA GGG AGT TCA CTG GTG TCC CCA 894 Pro Asn Ala Leu Ser Tyr Thr Asn Pro Gly Ser Ser Leu Val Ser Pro 145 150 155 160

TCT TTG GCA GCC AGC TCA ACG TTA ACA GAT TCA AGC ATG CTC TCT CCA 942 Ser Leu Ala Ala Ser Ser Thr Leu Thr Asp Ser Ser Met Leu Ser Pro

165 170 175

CCT CAA ACC ACA TTA CAT AGA AAT GTG TCT CCT GGA GCT CCT CAG AGA 990 Pro Gin Thr Thr Leu His Arg Asn Val Ser Pro Gly Ala Pro Gin Arg 180 185 190

CCA CCA AGT ACT GGC AAT GCA GGT GGG ATG TTG AGC ACT ACA GAC CTC 1038 Pro Pro Ser Thr Gly Asn Ala Gly Gly Met Leu Ser Thr Thr Asp Leu 195 200 205

ACA GTG CCA AAT GGA GCT GGA AGC AGT CCA GTG GGG AAT GGA TTT GTA 1086 Thr Val Pro Asn Gly Ala Gly Ser Ser Pro Val Gly Asn Gly Phe Val 210 215 220

AAC TCA AGA GCT TCT CCA AAT TTG ATT GGA GCT ACT GGT GCA AAT AGC 1134 Asn Ser Arg Ala Ser Pro Aεn Leu lie Gly Ala Thr Gly Ala Aεn Ser 225 230 235 240

TTA GGC AAA GTC ATG CCT ACA AAG TCT CCC CCT CCA CCA GGT GGT GGT 1182 Leu Gly Lyε Val Met Pro Thr Lys Ser Pro Pro Pro Pro Gly Gly Gly

245 250 255

AAT CTT GGA ATG AAC AGT AGG AAA CCA GAT CTT CGA GTT GTC ATC CCC 1230 Aεn Leu Gly Met Aεn Ser Arg Lys Pro Asp Leu Arg Val Val lie Pro 260 265 270

CCT TCA AGC AAG GGC ATG ATG CCT CCA CTA TCG GAG GAA GAG GAA TTG 1278 Pro Ser Ser Lyε Gly Met Met Pro Pro Leu Ser Glu Glu Glu Glu Leu 275 280 285

GAG TTG AAC ACC CAA AGG ATC AGT AGT TCT CAA GCC ACT CAA CCT CTT 1326 Glu Leu Aεn Thr Gin Arg lie Ser Ser Ser Gin Ala Thr Gin Pro Leu 290 295 300

GCT ACC CCA GTC GTG TCT GTG ACA ACC CCA AGC TTG CCT CCG CAA GGA 1374 Ala Thr Pro Val Val Ser Val Thr Thr Pro Ser Leu Pro Pro Gin Gly 305 310 315 320

CTT GTG TAC TCA GCA ATG CCG ACT GCC TAC AAC ACT GAT TAT TCA CTG 1422 Leu Val Tyr Ser Ala Met Pro Thr Ala Tyr Aεn Thr Aεp Tyr Ser Leu

325 330 335 ACC AGC GCT GAC CTG TCA GCC CTT CAA GGC TTC AAC TCG CCA GGA ATG 1470 Thr Ser Ala Aεp Leu Ser Ala Leu Gin Gly Phe Aεn Ser Pro Gly Met 340 345 350

CTG TCG CTG GGA CAG GTG TCG GCC TGG CAG CAG CAC CAC CTA GGA CAA 1518 Leu Ser Leu Gly Gin Val Ser Ala Trp Gin Gin His His Leu Gly Gin 355 360 365

GCA GCC CTC AGC TCT CTT GTT GCT GGA GGG CAG TTA TCT CAG GGT TCC 1566 Ala Ala Leu Ser Ser Leu Val Ala Gly Gly Gin Leu Ser Gin Gly Ser 370 375 380

AAT TTA TCC ATT AAT ACC AAC CAA AAC ATC AGC ATC AAG TCC GAA CCG 1614 Asn Leu Ser lie Aεn Thr Aεn Gin Aεn lie Ser lie Lyε Ser Glu Pro 385 390 395 400

ATT TCA CCT CCT CGG GAT CGT ATG ACC CCA TCG GGC TTC CAG CAG CAG 1662 lie Ser Pro Pro Arg Aεp Arg Met Thr Pro Ser Gly Phe Gin Gin Gin

405 410 415

CAG CAG CAG CAG CAG CAG CAG CAG CCG CCG CCA CCA CCG CAG CCC CAG 1710 Gin Gin Gin Gin Gin Gin Gin Gin Pro Pro Pro Pro Pro Gin Pro Gin 420 425 430

CCA CAA CCC CCG CAG CCC CAG CCC CGA CAG GAA ATG GGG CGC TCC CCT 1758 Pro Gin Pro Pro Gin Pro Gin Pro Arg Gin Glu Met Gly Arg Ser Pro 435 440 445

GTG GAC AGT CTG AGC AGC TCT AGT AGC TCC TAT GAT GGC AGT GAT CGG 1806 Val Aεp Ser Leu Ser Ser Ser Ser Ser Ser Tyr Asp Gly Ser Aεp Arg 450 455 460

GAG GAT CCA CGG GGC GAC TTC CAT TCT CCA ATT GTG CTT GGC CGA CCC 1854 Glu Aεp Pro Arg Gly Aεp Phe His Ser Pro lie Val Leu Gly Arg Pro 465 470 475 480

CCA AAC ACT GAG GAC AGA GAA AGC CCT TCT GTA AAG CGA ATG AGG ATG 1902 Pro Asn Thr Glu Asp Arg Glu Ser Pro Ser Val Lys Arg Met Arg Met

485 490 495

GAC GCG TGG GTG ACC TAA 1920 Asp Ala Trp Val Thr 500

GGCTTCCAAG CTGATGTTTG TACTTTTGTG TTACTGCAGT GACCTGCCCT ACATATCTAA 1980

ATCGGTAAAT AAGGACATGA GTTAAATATA TTTATATGTA CATACATATA TATATCCCTT 2040

TACATATATA TGTATGTGGG TGTGAGTGTG TGTGTATGTG TGGGTGTGTG TTACATACAC 2100

AGAATCAGGC ACTTACCTGC AAACTCCTTG TAGGTCTGCA GATGTGTGTC CCATGGCAGA 2160

CAAAGCACCC TGTAGGCACA GACAAGTCTG GCACTTCCTT GGACTACTTG TTTCGTAAAG 2220

ATAACCAGTT TTTGCAGAGA AACGTGTACC CATATATAAT TCTCCCACAC TAGCTTGCAG 2280

AAACCTAGAG GGCCCCCTAC TTGTTTTATT TAACTGTGCA GTGACTGTAG TTACTTAAGA 2340

GAAAATGCTT TGTAGAACAG AGCAGTAGAA AAGCAGGAAC CAAGAAAGCA ATACTGTACA 2400

TAAAATGTCA TTTATATTTT CCAACCTGGC ATGGGTGTCT GTTGCAAAGG GGTGCATGGG 2460

AAAGGGCTGT TGATATTAAA AACAAACAAA ACAAAAAAGC CCCACACATA ACTGTTTTGC 2520

ACGTGCAAAA ATGTATTGGG TCAAGAAGTG ATCTTTAGCT AATAAAGAAA GAGAATAGAA 2580

AACACGCATG AGATATTCAG AAAATACTAG CCTAGAAATA TAGAGCATTA ACAAAGGAAA 2640

ATTAATATAT TAAGTTATAA TTGGAATATG TCAGAAGTTT CTTTTTACAT TCATATCTTA 2700

AAAATTAAAG AAACTGATTT TAGCTCATGT ATATTTTATA TGAAAGAAAA CACCCTTATG 2760

AATTGATGAC TATATATAAA ATTATATTCA CTACTTTTGA ACACATTCTG CTATGAATTA 2820

TTTATATAAG CCAAAGCTAT ATGTTGTAAC TTTTTTTTAG AGAATAGCTT TATCTTGGTT 2880

TAACTCTTTA GTTTTATTTT AAGAGGGGAA AACAAAAATA TCTTGCAAGC AGAACCTTGA 2940

AAAAAAAAAA 2950

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 507

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Gly Arg Lys Lys lie Gin lie Thr Arg lie Met Asp Glu Arg Asn

5 10 15

Arg Gin Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lyε Lyε Ala 20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Cys Glu lie Ala Leu lie lie Phe 35 40 45

Asn Ser Ser Asn Lys Leu Phe Gin Tyr Ala Ser Thr Asp Met Asp Lys 50 55 60

Ala Thr Gly Ala Asn Ser Leu Gly Lys Val Met Pro Thr Lys Ser Pro

245 250 255

Pro Pro Pro Gly Gly Gly Asn Leu Gly Met Asn Ser Arg Lys Pro Asp 260 265 270

Leu Arg Val Val Ile Pro Pro Ser Ser Lys Gly Met Met Pro Pro Ile 275 280 285

Ser Glu Glu Glu Glu Leu Glu Leu Asn Thr Gln Arg Ile Ser Ser Ser 290 295 300

Gln Ala Thr Gln Pro Leu Ala Thr Pro Val Val Ser Val Thr Thr Pro 305 310 315 320

Ser Leu Pro Pro Gln Gly Leu Val Tyr Ser Ala Met Pro Thr Ala Tyr

325 330 335

Asn Thr Asp Tyr Ser Leu Thr Ser Ala Asp Leu Ser Ala Leu Gln Gly 340 345 350

Phe Asn Ser Pro Gly Met Leu Ser Leu Gly Gln Val Ser Ala Trp Gln 355 360 365

Gln His His Leu Gly Gln Ala Ala Leu Ser Ser Leu Val Ala Gly Gly 370 375 380

Gln Leu Ser Gln Gly Ser Asn Leu Ser Ile Asn Thr Asn Gln Asn Ile 385 390 395 400

Ser Ile Lys Ser Glu Pro Ile Ser Pro Pro Arg Asp Arg Met Thr Pro

405 410 415

Ser Gly Phe Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Pro Pro 420 425 430

Pro Pro Pro Gln Pro Gln Pro Gln Pro Pro Gln Pro Gln Pro Arg Gln 435 440 445

Glu Met Gly Arg Ser Pro Val Asp Ser Leu Ser Ser Ser Ser Ser Ser 450 455 460

Tyr Asp Gly Ser Asp Arg Glu Asp Pro Arg Gly Asp Phe His Ser Pro 465 470 475 480

Ile Val Leu Gly Arg Pro Pro Asn Thr Glu Asp Arg Glu Ser Pro Ser

485 490 495

Val Lys Arg Met Arg Met Asp Ala Trp Val Thr 500 505
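
Residues 420-430 of SEQ ID NO: 8 form a run of eleven consecutive glutamine residues. The following Python is a minimal illustrative sketch, not part of the disclosure, of how such a run might be counted in a three-letter-code listing. The helper name `longest_run` is hypothetical, and the fragment is residues 417-432 as listed above with glutamine written as `Gln`.

```python
def longest_run(residues, target="Gln"):
    """Return the length of the longest run of `target` in a residue list."""
    best = current = 0
    for residue in residues:
        # Extend the current run on a match, otherwise reset it.
        current = current + 1 if residue == target else 0
        best = max(best, current)
    return best


# Residues 417-432 of SEQ ID NO: 8 (three-letter code, OCR spellings normalized).
fragment = (
    "Ser Gly Phe Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Pro Pro"
).split()

polyq_length = longest_run(fragment)
```

Here `polyq_length` is the number of consecutive glutamines found in the fragment.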

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 9: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 473

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Gly Arg Lys Lys Ile Gln Ile Thr Arg Ile Met Asp Glu Arg Asn

5 10 15

Arg Gln Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lys Lys Ala 20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Cys Glu Ile Ala Leu Ile Ile Phe 35 40 45

Asn Ser Thr Asn Lys Leu Phe Gln Tyr Ala Ser Thr Asp Met Asp Lys 50 55 60

Val Leu Leu Lys Tyr Thr Glu Tyr Asn Glu Pro His Glu Ser Arg Thr 65 70 75 80

Asn Ser Asp Ile Val Glu Thr Leu Arg Lys Lys Gly Leu Asn Gly Cys

85 90 95

Asp Ser Pro Asp Pro Asp Ala Asp Asp Ser Val Gly His Ser Pro Glu 100 105 110

Ser Glu Asp Lys Tyr Arg Lys Ile Asn Glu Asp Ile Asp Leu Met Ile 115 120 125

Ser Arg Gln Arg Leu Cys Ala Val Pro Pro Pro Asn Phe Glu Met Pro 130 135 140

Val Ser Ile Pro Val Ser Ser His Asn Ser Leu Val Tyr Ser Asn Pro 145 150 155 160

Val Ser Ser Leu Gly Asn Pro Asn Leu Leu Pro Leu Ala His Pro Ser

165 170 175

Leu Gln Arg Asn Ser Met Ser Pro Gly Val Thr His Arg Pro Pro Ser 180 185 190

Ala Gly Asn Thr Gly Gly Leu Met Gly Gly Asp Leu Thr Ser Gly Ala 195 200 205

Gly Thr Ser Ala Gly Asn Gly Tyr Gly Asn Pro Arg Asn Ser Pro Gly 210 215 220

Leu Leu Val Ser Pro Asn Leu Asn Lys Asn Met Gln Ala Lys Ser Pro 225 230 235 240

Pro Pro Met Asn Leu Gly Met Asn Asn Arg Lys Pro Asp Leu Arg Val

245 250 255

Leu Ile Pro Pro Gly Ser Lys Asn Thr Met Pro Ser Val Ser Glu Asp 260 265 270

Val Asp Leu Leu Leu Asn Gln Arg Ile Asn Asn Ser Gln Ser Ala Gln 275 280 285

Ser Leu Ala Thr Pro Val Val Ser Val Ala Thr Pro Thr Leu Pro Gly 290 295 300

Gln Gly Met Gly Gly Tyr Pro Ser Ala Ile Ser Thr Thr Tyr Gly Thr 305 310 315 320

Glu Tyr Ser Leu Ser Ser Ala Asp Leu Ser Ser Leu Ser Gly Phe Asn

325 330 335

Thr Ala Ser Ala Leu His Leu Gly Ser Val Thr Gly Trp Gln Gln Gln 340 345 350

His Leu His Asn Met Pro Pro Ser Ala Leu Ser Gln Leu Gly Ala Cys 355 360 365

Thr Ser Thr His Leu Ser Gln Ser Ser Asn Leu Ser Leu Pro Ser Thr 370 375 380

Gln Ser Leu Asn Ile Leu Lys Ser Glu Pro Val Ser Pro Pro Arg Asp 385 390 395 400

Arg Thr Thr Thr Pro Ser Arg Tyr Pro Gln His Thr Arg His Glu Ala

405 410 415

Gly Arg Ser Pro Val Asp Ser Leu Ser Ser Cys Ser Ser Ser Tyr Asp 420 425 430

Gly Ser Asp Arg Glu Asp His Arg Asn Glu Phe His Ser Pro Ile Gly 435 440 445

Leu Thr Arg Pro Ser Pro Asp Glu Arg Glu Ser Pro Ser Val Lys Arg 450 455 460

Met Arg Leu Ser Glu Gly Trp Ala Thr 465 470 473

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 10: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 521

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Gly Arg Lys Lys Ile Gln Ile Gln Arg Ile Thr Asp Glu Arg Asn

5 10 15

Arg Gln Val Thr Phe Thr Lys Arg Lys Phe Gly Leu Met Lys Lys Ala 20 25 30

Tyr Glu Leu Ser Val Leu Cys Asp Cys Glu Ile Ala Leu Ile Ile Phe 35 40 45

Asn His Ser Asn Lys Leu Phe Gln Tyr Ala Ser Thr Asp Met Asp Lys 50 55 60

Val Leu Leu Lys Tyr Thr Glu Tyr Asn Glu Pro His Glu Ser Arg Thr 65 70 75 80

Asn Ala Asp Ile Ile Glu Thr Leu Arg Lys Lys Gly Phe Asn Gly Cys

85 90 95

Asp Ser Pro Glu Pro Asp Gly Glu Asp Ser Leu Glu Gln Ser Pro Leu 100 105 110

Leu Glu Asp Lys Tyr Arg Arg Ala Ser Glu Glu Leu Asp Gly Leu Phe 115 120 125

Arg Arg Tyr Gly Ser Thr Val Pro Ala Pro Asn Phe Ala Met Pro Val 130 135 140

Thr Val Pro Val Ser Asn Gln Ser Ser Leu Gln Gln Phe Ser Asn Pro 145 150 155 160

Ser Gly Ser Leu Val Thr Pro Ser Leu Val Thr Ser Ser Leu Thr Asp

165 170 175

Pro Arg Leu Leu Ser Pro Gln Gln Pro Ala Leu Gln Arg Asn Ser Val 180 185 190

Ser Pro Gly Leu Pro Gln Arg Pro Ala Ser Ala Gly Ala Met Leu Gly 195 200 205

Gly Asp Leu Asn Ser Ala Asn Gly Ala Cys Pro Ser Pro Val Gly Asn 210 215 220

Gly Tyr Val Ser Ala Arg Ala Ser Pro Gly Leu Leu Pro Val Ala Asn 225 230 235 240

Gly Asn Ser Leu Asn Lys Val Ile Pro Ala Lys Ser Pro Pro Pro Pro

245 250 255

Thr His Ser Thr Gln Leu Gly Ala Pro Ser Arg Lys Pro Asp Leu Arg 260 265 270

Val Ile Thr Ser Gln Ala Gly Lys Gly Leu Met His His Leu Thr Glu 275 280 285

Asp His Leu Asp Leu Asn Asn Ala Gln Arg Leu Gly Val Ser Gln Ser 290 295 300

Thr His Ser Leu Thr Thr Pro Val Val Ser Val Ala Thr Pro Ser Leu 305 310 315 320

Leu Ser Gln Gly Leu Pro Phe Ser Ser Met Pro Thr Ala Tyr Asn Thr

325 330 335

Asp Tyr Gln Leu Thr Ser Ala Glu Leu Ser Ser Leu Pro Ala Phe Ser 340 345 350

Ser Pro Gly Gly Leu Ser Leu Gly Asn Val Thr Ala Trp Gln Gln Pro 355 360 365

Gln Gln Pro Gln Gln Pro Gln Gln Pro Gln Pro Pro Gln Gln Gln Pro 370 375 380

Pro Gln Pro Gln Gln Pro Gln Pro Gln Gln Pro Gln Gln Pro Gln Gln 385 390 395 400

Pro Pro Gln Gln Gln Ser His Leu Val Pro Val Ser Leu Ser Asn Leu

405 410 415

Ile Pro Gly Ser Pro Leu Pro His Val Gly Ala Ala Leu Thr Val Thr 420 425 430

Thr His Pro His Ile Ser Ile Lys Ser Glu Pro Val Ser Pro Ser Arg 435 440 445

Glu Arg Ser Pro Ala Pro Pro Pro Pro Ala Val Phe Pro Ala Ala Arg 450 455 460

Pro Glu Pro Gly Asp Gly Leu Ser Ser Pro Ala Gly Gly Ser Tyr Glu 465 470 475 480

Thr Gly Asp Arg Asp Asp Gly Arg Gly Asp Phe Gly Pro Thr Leu Gly

485 490 495

Leu Leu Arg Pro Ala Pro Glu Pro Ala Glu Gly Ser Ala Val Lys Arg 500 505 510

Met Arg Leu Asp Thr Trp Thr Leu Lys 515 520

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 11: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Arg Val Lys Ile Lys Met Glu Phe Ile Asp Asn Lys Ile Arg Arg Tyr

5 10 15

Thr Thr Phe Ser Lys Arg Lys Thr Gly Ile Met Lys Lys Ala Tyr Glu 20 25 30

Leu Ser Thr Leu Thr Gly Thr Gln Val Leu Leu Leu Val Ala Ser Glu 35 40 45

Thr Gly His Val Tyr Thr Phe Ala Thr Arg Lys Leu Gln Pro Met Ile 50 55 60

Thr Ser Glu Thr Gly Lys Ala Leu Ile Gln Thr Cys Leu Trp Ser Pro 65 70 75 80

Asp Ser Pro Pro 84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 12 (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Arg Arg Lys Ile Glu Ile Lys Phe Ile Glu Asn Lys Thr Arg Arg His

5 10 15

Val Thr Phe Ser Lys Arg Lys His Gly Ile Met Lys Lys Ala Glu Pro 20 25 30

Leu Ser Val Leu Thr Gly Thr Gln Val Leu Leu Leu Val Val Ser Glu 35 40 45

Thr Gly Leu Val Tyr Thr Phe Ser Thr Pro Lys Phe Glu Pro Ile Val 50 55 60

Thr Gln Gln Glu Gly Arg Asn Leu Ile Gln Ala Cys Leu Asn Ala Pro 65 70 75 80

Asp Asp Glu Glu 84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 13: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Arg Gly Arg Val Glu Met Lys Arg Ile Glu Asn Lys Ile Asn Arg Gln

5 10 15

Val Thr Phe Ser Lys Arg Arg Asn Gly Leu Leu Lys Lys Ala Tyr Glu 20 25 30

Leu Ser Val Leu Cys Asp Ala Glu Val Ala Leu Ile Ile Phe Ser Ser 35 40 45

Arg Gly Lys Leu Tyr Glu Phe Gly Ser Val Gly Ile Glu Ser Thr Ile 50 55 60

Glu Arg Tyr Asn Arg Cys Tyr Asn Cys Ser Leu Ser Asn Asn Lys Pro 65 70 75 80

Glu Glu Thr Thr 84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 14: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Pro Gly Lys Ile Glu Ile Lys Lys Ile Glu Asn Ser Thr Asn Arg Gln

5 10 15

Val Thr Phe Cys Lys Arg Arg Asn Gly Ile Phe Lys Lys Arg Lys Glu 20 25 30

Leu Thr Val Leu Cys Asp Ala Lys Ile Ser Leu Ile Met Ile Ser Ser 35 40 45

Thr Arg Lys Tyr His Glu Tyr Thr Ser Pro Asn Thr Thr Thr Lys Lys 50 55 60

Met Ile Asp Gln Tyr Gln Ser Ala Leu Gly Val Asp Ile Trp Ser Ile 65 70 75 80

His Tyr Glu Lys 84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 15: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Arg Gly Lys Ile Gln Ile Lys Arg Ile Glu Asn Gln Thr Asn Arg Gln

5 10 15

Val Thr Tyr Ser Lys Arg Arg Asn Gly Leu Phe Lys Lys Ala His Glu 20 25 30

Leu Ser Val Leu Cys Asp Ala Lys Val Ser Ile Ile Met Ile Ser Ser 35 40 45

Thr Gln Lys Leu His Glu Tyr Ile Ser Pro Thr Thr Ala Thr Lys Gln 50 55 60

Leu Phe Asp Gln Tyr Gln Lys Ala Val Gly Val Asp Leu Trp Ser Ser 65 70 75 80

His Tyr Glu Lys

84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 16: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Arg Gly Lys Ile Gln Ile Lys Arg Ile Glu Asn Gln Thr Asn Arg Gln

5 10 15

Val Thr Tyr Ser Lys Arg Arg Asn Gly Leu Phe Lys Lys Ala His Glu 20 25 30

Leu Thr Val Leu Cys Asp Ala Arg Val Ser Ile Ile Met Phe Ser Ser 35 40 45

Ser Asn Lys Leu His Glu Tyr Ile Ser Pro Asn Thr Thr Thr Lys Glu 50 55 60

Ile Val Asp Leu Tyr Gln Thr Ile Ser Asp Val Asp Val Trp Ala Thr 65 70 75 80

Gln Tyr Glu Arg 84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 84

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Arg Gly Lys Ile Glu Ile Lys Arg Ile Glu Asn Thr Thr Asn Arg Gln

5 10 15

Val Thr Phe Cys Lys Arg Arg Asn Gly Ile Leu Lys Lys Ala Tyr Glu 20 25 30

Leu Ser Val Leu Cys Asp Ala Glu Val Leu Ala Ile Val Phe Ser Ser 35 40 45

Arg Gly Arg Leu Tyr Glu Tyr Ser Asn Asn Ser Val Lys Gly Thr Ile 50 55 60

Glu Arg Thr Lys Lys Ala Ile Ser Asp Asn Ser Asn Thr Gly Ser Val 65 70 75 80

Ala Glu Ile Asn 84

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 18 (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CTAAAAATAA 10

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 19 (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

CTAATATATA TTAG 14

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 20 (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Ser Glu Glu Glu Glu Glu Leu Glu Leu

5 9

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21

YTWWAAATAR 10
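
The degenerate symbols in SEQ ID NO: 21 follow the standard IUPAC nucleotide codes (Y = C or T, W = A or T, R = A or G). As an illustrative sketch only, and not part of the disclosed methods, the following Python expands such a consensus into a regular expression and tests candidate 10-mers against it. The function names `consensus_to_regex` and `matches_consensus` are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def matches_consensus(site, consensus="YTWWAAATAR"):
    """Return True if `site` matches the consensus over its full length."""
    return consensus_to_regex(consensus).fullmatch(site) is not None


# SEQ ID NO: 18 and SEQ ID NO: 33 as given in this listing.
seq18_matches = matches_consensus("CTAAAAATAA")
seq33_matches = matches_consensus("TTAAAAATAA")
```

Both listed 10-mers satisfy the consensus at every position, whereas a site carrying G at the fifth position does not.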

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 22: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

GATCCTCGCT CTAAAAATAA CCCTGTM 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 23: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AGCTTCGGAC CCTGCTCATT TCTATATATA G 31

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 24 (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

AGCTTGGGGA CCAAATAAGG CAAGGTG 27

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 25: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

GATCCTTCCC AATGATTTGC ATGCTCTCAC 30

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 26: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GATCTCCCTG GGGTTAAAAA TAACCCCATG AC 32

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 27: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GATCGATCGA TGCCTGGTTA TAATTAACCC AGACAT 36

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 28: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GATCTCCGAC GGGTTTAAAA TAGCAAAACT CT 32

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 29: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GATCCCTTTC AGATTAAAAA TAACTAAGGT AA 32

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 30: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

GATCGCCCAA GGACTAAAAA AAGGCCCTGG A 31

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 31: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Thr Pro His Thr Glu Glu Lys Tyr Lys Lys Ile Asn Glu Glu Phe Cys 1 5 10 15

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Cys Asp Tyr Phe Glu His Ser Pro Leu Ser Glu Asp Arg

5 10

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 33: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

TTAAAAATAA 10

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 34: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CGCTCTAAAA ATAACCCT 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 35: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

CGCTCTAAGG CTAACCCT 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 36: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

CGCTCTATAA ATAACCCT 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 37: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CGCTCTAAAC ATAACCCT 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 38: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

ATTTCTATAT ATACTTTC 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 39: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GGGGACCAAA TAAGGCAA 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 40: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

CCAATGATTT GCATGCTC 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 41: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GGGGTTAAAA ATAACCCC 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 42: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

CTGGTTATAA TTAACCCA 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 43: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CGGGTTTAAA ATAGCAAA 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 44: (i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

CAGATTAAAA ATAACTAA 18

(2) INFORMATION FOR SEQUENCE IDENTIFICATION NUMBER: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

AGGACTAAAA AAAGGCCC 18

Claims


1. An essentially pure nucleic acid encoding a member of the Myocyte-specific Enhancer Factor 2 (MEF2) protein family which has myocyte transcription enhancing activity.

2. The nucleic acid of claim 1, wherein said nucleic acid comprises at least one of the following elements: a) a nucleotide sequence encoding at least eleven consecutive glutamine residues, or b) a nucleotide sequence encoding the amino acid sequence SEEEELEL (SEQ ID NO: 1).

3. The nucleic acid of claim 1, wherein said nucleic acid encodes at least a 54 amino acid portion of the amino acid sequence of Fig. 1A.

4. The nucleic acid of claim 1, wherein said MEF2 family member is a mutant of a wild-type protein, said wild-type protein comprising an inactivation domain, said nucleic acid being deleted for sequences encoding said inactivation domain.

5. The nucleic acid of claim 1, wherein said MEF2 is an isoform of the MEF2 sequence shown in Fig. 1.

6. The nucleic acid of claim 1, wherein said MEF2 is selected from the group consisting of aMEF2, XMEF2, dMEF2, and CMEF2.

7. The nucleic acid of any of claims 1-6 for use in therapy.

8. The nucleic acid of any of claims 1-6 for use in the treatment or prevention of muscular dystrophy or muscle atrophy in a mammal.

9. The nucleic acid of any of claims 1-6 for use in enhancing muscle mass in a mammal.

10. The nucleic acid of any of claims 1-9 in combination with a pharmaceutically acceptable carrier.

11. A nucleic acid vector comprising the nucleic acid of any of claims 1-6.

12. The vector of claim 11, wherein said vector comprises a transcription regulatory sequence positioned and oriented to regulate expression of said nucleic acid encoding said MEF2 family member.

13. A cell comprising the vector of claim 12.

14. A substantially pure MEF2 polypeptide encoded by the nucleic acid of any of claims 1-6.

15. A substantially pure MEF2 polypeptide encoded by the nucleic acid of any of claims 1-6 for use in therapy.

16. A substantially pure MEF2 polypeptide encoded by the nucleic acid of any of claims 1-6 for use in the treatment or prevention of muscular dystrophy or muscle atrophy in a mammal.

17. A substantially pure MEF2 polypeptide encoded by the nucleic acid of any of claims 1-6 for use in enhancing muscle mass in a mammal.

18. The polypeptide of any of claims 14-17 in combination with a pharmaceutically acceptable carrier.

19. A transgenic non-human mammal comprising a first transgene, said first transgene comprising the nucleic acid of any of claims 1-6.

20. The transgenic mammal of claim 19, further comprising a second transgene, said second transgene comprising a promoter and regulatory DNA positioned to effect expression of a structural gene, said promoter and regulatory DNA being characterized in that said expression is enhanced by said MEF2 protein family member.

21. The transgenic mammal of claim 19, further comprising a second transgene, said second transgene enhancing the activity of said MEF2 protein family member.

22. The transgenic mammal of claim 8, wherein said enhancing is selected from the group consisting of a) increasing the expression of said MEF2 protein family member, and b) increasing the activity of an at least partially inactive form of said MEF2 protein family member.

23. The transgenic mammal of claim 21, wherein said second transgene encodes a MyoD polypeptide, a myogenin polypeptide, a retinoblastoma polypeptide, or a homeobox protein.

24. The transgenic mammal of claim 19, wherein said first transgene is expressed by a tissue-specific promoter.

25. The transgenic mammal of any of claims 19-24, wherein said first transgene is introduced into said mammal, or an ancestor of said mammal, at an embryonic stage.

26. The transgenic mammal of any of claims 19-24, wherein said transgene is introduced into a somatic cell or into a somatic tissue of said mammal.

27. A method of identifying a molecule that enhances the activity of a member of the Myocyte-specific Enhancer factor 2 (MEF2) family, comprising providing a candidate molecule; combining the candidate molecule with a polypeptide comprising an activity reducing domain of the carboxy terminal one-third of an active MEF2 family member according to claim 22, said polypeptide being characterized in that it lacks MEF2 transcription enhancing activity, and said domain being characterized in that deletion of said domain from said MEF2 family member enhances MEF2 activity of said family member; and determining whether said candidate molecule binds to said polypeptide.

28. A method of identifying a molecule that enhances the activity of a member of the Myocyte-specific Enhancer factor 2 (MEF2) family, comprising providing a candidate molecule; providing a MEF2 family member according to claim 22 in a solution; providing a MEF2 consensus nucleic acid binding sequence; and determining whether said candidate molecule enhances binding of said MEF2 family member to said MEF2 consensus binding sequence.

29. A method of identifying a molecule that enhances the activity of a member of the Myocyte-specific Enhancer factor 2 (MEF2) family, comprising providing a candidate molecule; providing nucleic acid according to claim 1, transformed into a cell, said cell comprising a structural gene which comprises a regulatory region that includes a MEF2 consensus binding sequence and a promoter responsive to said consensus binding sequence; and determining whether introduction of said candidate molecule into said cell enhances expression of said structural gene.