AU2247200A - Programmed cell death genes and proteins - Google Patents

Publication number
AU2247200A
Authority
AU
Australia
Prior art keywords
leu
ser
lys
ich
seq
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU22472/00A
Inventor
Masayuki Miura
Junying Yuan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
General Hospital Corp
Original Assignee
General Hospital Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by General Hospital Corp
Priority to AU22472/00A
Publication of AU2247200A
Legal status: Abandoned

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Description

P00011 Regulation 3.2 Revised 2/98
AUSTRALIA
Patents Act, 1990
ORIGINAL
COMPLETE
SPECIFICATION
STANDARD
PATENT
TO BE COMPLETED BY THE APPLICANT
NAME OF APPLICANT: THE GENERAL HOSPITAL CORPORATION
ACTUAL INVENTORS: JUNYING YUAN and MASAYUKI MIURA
ADDRESS FOR SERVICE: Peter Maxwell Associates, Level 6 Pitt Street, SYDNEY NSW 2000
INVENTION TITLE: PROGRAMMED CELL DEATH GENES AND PROTEINS
DETAILS OF ASSOCIATED APPLICATION(S): Divisional of Australian Patent Application No. 47,472/96 filed on 4 January 1996
The following statement is a full description of this invention including the best method of performing it known to me:
This invention is in the field of molecular biology as related to the control of programmed cell death.
Programmed Cell Death
Cell death occurs as a normal aspect of animal development as well as in tissue homeostasis and aging (Glucksmann, A. Biol. Rev. Cambridge Philos. Soc. 26:59-86 (1950); Ellis et al., Dev. 112:591-603 (1991); Vaux et al., Cell 76:777-779 (1994)). Naturally occurring cell death acts to regulate cell number, to facilitate morphogenesis, to remove harmful or otherwise abnormal cells and to eliminate cells that have already performed their function. Such regulated cell death is achieved through an endogenous cellular mechanism of suicide, termed programmed cell death or apoptosis (Wyllie, A. in Cell Death in Biology and Pathology, Bowen and Lockshin, eds., Chapman and Hall (1981), pp. 9-34). Programmed cell death or apoptosis occurs when a cell activates this internally encoded suicide program as a result of either internal or external signals. The morphological characteristics of apoptosis include plasma membrane blebbing, condensation of nucleoplasm and cytoplasm and degradation of chromosomal DNA at inter-nucleosomal intervals (Wyllie, A. in Cell Death in Biology and Pathology, Bowen and Lockshin, eds., Chapman and Hall (1981), pp. 9-34). In many cases, gene expression appears to be required for programmed cell death, since death can be prevented by inhibitors of RNA or protein synthesis (Cohen et al., J. Immunol. 32:38-42 (1984); Stanisic et al., Invest. Urol. 16:19-22 (1978); Martin et al., J. Cell Biol. 106:829-844 (1988)).
The genetic control of programmed cell death has been well-elucidated by the work on programmed cell death in the nematode C. elegans.
Programmed cell death is characteristic and widespread during C. elegans development. Of the 1090 somatic cells formed during the development of the hermaphrodite, 131 undergo programmed cell death. When observed with Nomarski microscopy, the morphological changes of these dying cells follow a characteristic sequence. (Sulston et al., Dev. Biol. 82:110-156 (1977); Sulston et al., Dev. Biol. 100:64-119 (1983)).
Fourteen genes have been identified that function in different steps of the genetic pathway of programmed cell death in C. elegans (Hedgecock et al., Science 220:1277-1280 (1983); Ellis et al., Cell 44:817-829 (1986); Ellis et al., Dev. 112:591-603 (1991); Ellis et al., Genetics 112:591-603 (1991b); Hengartner et al., Nature 356:494-499 (1992); Ellis et al., Dev. 112:591-603 (1991)). Two of these genes, ced-3 and ced-4, play essential roles in either the initiation or execution of the cell death program. Recessive mutations in these genes prevent almost all of the cell deaths that normally occur during C. elegans development.
The ced-4 gene encodes a novel protein that is expressed primarily during embryogenesis, the period during which most programmed cell deaths occur (Yuan et al., Dev. 116:309-320 (1992)). The 549 amino acid sequence of ced-4, deduced from cDNA and genomic clones, contains two regions that are similar to the calcium-binding domain known as the EF-hand (Kretsinger, 1987); however, it is not yet clear whether calcium plays a role in regulating ced-4 or programmed cell death in C. elegans.
A gain-of-function mutation in ced-9 prevents normal programmed cell death, while mutations that inactivate ced-9 are lethal. This suggests that ced-9 acts by suppressing programmed cell death genes in cells that normally do not undergo programmed cell death (Hengartner, et al., Nature 356:494-499 (1992)). The ced-9 gene encodes a protein product that shares sequence similarity with the mammalian proto-oncogene and cell death suppressor bcl-2 (Hengartner, et al., Cell 76:665-676 (1994)). The lethality of ced-9 loss-of-function mutations can be suppressed by mutations in ced-3 and ced-4, indicating that ced-9 acts by suppressing the activity of ced-3 and ced-4.
Genetic mosaic analyses indicate that ced-3 and ced-4 likely act in a cell-autonomous fashion within dying cells, suggesting that they might be cytotoxic proteins and/or control certain cytotoxic proteins in the process of programmed cell death (Yuan, et al., Dev. Bio. 138:33-41 (1990)).
nedd2
Cell death also occurs in mammals. nedd2 is a mouse gene which is preferentially expressed during early embryonic brain development (Kumar et al., Biochem. Biophys. Res. Commun. 185:1155-1161 (1992)). Since many neurons die during early embryonic brain development, it is possible that nedd-2 is a cell death gene. Nedd-2 mRNA is down-regulated in the adult brain (Kumar et al., Biochem. Biophys. Res. Comm. 185:1155-1161 (1992); Yuan, et al., Cell 75:641-752 (1993)).
bcl-2
bcl-2 is an oncogene known to inhibit programmed cell death and to be overexpressed in many follicular and B cell lymphomas. Overexpression of bcl-2 as a result of chromosomal translocation occurs in 85% of follicular and 20% of diffuse B cell lymphomas (Fukuhara et al., Cancer Res. 39:3119 (1979); Levine et al., Blood 66:1414 (1985); Yunis et al., N. Engl. J. Med. 316:79-84 (1987)).
Overexpression of bcl-2 prevents or delays the onset of apoptotic cell death in a variety of vertebrate cell types as well as in C. elegans (Vaux et al., Science 258:1955-1957 (1992); Nunez et al., J. Immun. 144:3602-3610 (1990); Sentman et al., Cell 67:879-888 (1992); Strasser et al., Cell 67:889-899 (1991)). Expression of bcl-2 in the nematode C. elegans has been shown to partially prevent programmed cell death. Thus, bcl-2 is functionally similar to the C. elegans ced-9 gene (Vaux et al., Science 258:1955-1957 (1992); Hengartner et al., Nature 356:494-499 (1992)).
Interleukin-1β Converting Enzyme
Interleukin-1β converting enzyme (ICE) is a substrate-specific cysteine protease that cleaves the inactive 31 kD prointerleukin-1β at Asp116-Ala117, releasing a carboxy-terminal 153 amino-acid peptide to produce the mature 17.5 kD interleukin-1β (IL-1β) (Kostura et al., Proc. Natl. Acad. Sci., USA 86:5227-5231 (1989); Black et al., FEBS Lett. 247:386-390 (1989); Cerretti et al., Science 256:97-100 (1992); Thornberry et al., Nature 356:768-774 (1992)).
IL-1β is a cytokine involved in mediating a wide range of biological responses including inflammation, septic shock, wound healing, hematopoiesis and growth of certain leukemias (Dinarello, Blood 77:1627-1652 (1991); diGiovine et al., Today 11:13 (1990)). A specific inhibitor of ICE, the crmA gene product of cowpox virus, prevents the proteolytic activation of IL-1β (Ray et al., Cell 69:597-604 (1992)) and inhibits host inflammatory response (Ray et al., Cell 69:597-604 (1992)). Cowpox virus carrying a deleted crmA gene is unable to suppress the inflammatory response of chick embryos, resulting in a reduction in the number of virus-infected cells and less damage to the host (Palumbo et al., Virology 171:262-273 (1989)). This observation indicates the importance of ICE in bringing about the inflammatory response.
Tumor Necrosis Factor
Tumor necrosis factor-α (TNF-α) is a pleiotropic tumoricidal cytokine (Tracey, K.J. et al., Ann. Rev. Cell. Biol. 9:317-343 (1993)). One of the striking functions of TNF-α is to induce apoptosis of transformed cells. In the case of non-transformed cells, TNF-α can also induce apoptosis in the presence of metabolic inhibitors (Tracey, et al., Ann. Rev. Cell. Biol. 9:317-343 (1993)). Apoptosis induced by TNF-α is also suppressed by bcl-2.
One of the most extensively studied functions of TNF-α is its cytotoxicity on a wide variety of tumor cell lines in vitro (Laster, S. M. et al., J. Immunol. 141:2629-2634 (1988)). However, the mechanism of cell death induced by TNF has been largely unknown. HeLa cells express predominantly TNF receptor which is thought to be responsible for cell death signaling (Englemann, H. et al., J. Biol. Chem. 265:14497-14504 (1990); Thoma, B. et al., J. Exp. Med. 172:1019-1023 (1990)).
HeLa cells are readily killed by TNF-α in the presence of the metabolic inhibitor cycloheximide (CHX). The cell death induced by TNF-α/CHX shows DNA fragmentation and cytolysis, which are typical features of apoptosis (White, E. et al., Mol. Cell. Biol. 12:2570-2580 (1992)). Expression of adenovirus E1B 19K protein, which is functionally similar to bcl-2, inhibits apoptosis induced by TNF in HeLa cells (White, E. et al., Mol. Cell. Biol. 12:2570-2580 (1992)).
Summary of the Invention
In the present invention, the ced-3 gene (SEQ ID NO: 1) has been cloned and sequenced and the amino acid sequence (SEQ ID NO: 2) of the protein encoded by this gene is disclosed. Structural analysis of the ced-3 gene revealed that it is similar to the enzyme interleukin-1β converting enzyme and that overexpression of the murine interleukin-1β converting enzyme ("mICE") causes programmed cell death in vertebrate cells. Based upon these results, a novel method for controlling programmed cell death in vertebrates by regulating the activity of ICE is claimed.
The amino acid sequence of ced-3 was also found to be similar to another murine protein, nedd-2, which is detected during early embryonic brain development, a period when many cells die. The results suggest that ced-3, ICE and nedd-2 are members of a gene family which function to cause programmed cell death.
A new cell death gene, mIch-2 (murine ICE-ced-3 homolog), has been discovered which appears to be in the same family as ced-3, ICE, and nedd-2. mIch-2 is distinguished from other previously identified cell death genes in that it is preferentially expressed in the thymus and placenta of vertebrates. Thus, the invention is also directed to a newly discovered gene, mIch-2, which is preferentially expressed in thymus and placenta and which encodes a protein causing programmed cell death. Thus, the present invention is directed to both the mIch-2 gene sequence (SEQ ID NO: 41) and the protein (SEQ ID NO: 42) encoded by mIch-2. Also encompassed are vectors expressing mIch-2 and host cells transformed with such vectors. The invention also encompasses methods of regulating cell death by expressing mIch-2.
A comparison of the nucleotide sequences of ced-3, mICE, human ICE, nedd-2 and mIch-2 indicates that they are members of a gene family that promotes programmed cell death. The identification of this family facilitated the isolation of the newly discovered cell death gene Ice-ced-3 homolog-1 (Ich-1).
Ich-1 is homologous to the other cell death genes described above and particularly to nedd2. Based upon its structure and the presence of DNA encoding the QACRG sequence characteristic of the active center of cell death genes, Ich-1 was identified as a new member of the ced-3/ICE family. Thus, the present invention is directed to both the Ich-1 gene sequence and the protein encoded by Ich-1. Also encompassed are vectors expressing Ich-1 and host cells transformed with such vectors. Alternative splicing results in two distinct Ich-1 mRNA species. Thus, the invention also encompasses these species, proteins produced from them, vectors containing and expressing the genes, and the uses described herein.
The inventors have also identified a further member of the ICE/ced-3 family, Ich-3 (SEQ ID NO: 54). Ich-3 has at least two alternative splicing products. A full length cDNA of one of them from a mouse thymus cDNA library has been identified. It encodes a protein of 418 amino acids that is 38% identical with mICE, 42% identical with mIch-2, 25% identical with Ich-1, and 24% identical with C. elegans ced-3.
The invention is thus directed to genomic or cDNA nucleic acids having genetic sequences which encode ced-3, mIch-2, Ich-1, and Ich-3. The invention also provides for vectors and expression vectors containing such genetic sequences, the host cells transformed with such vectors and expression vectors, the recombinant nucleic acid or proteins made in such host/vector systems and the functional derivatives of these recombinant nucleic acids or proteins. The use of the isolated genes or proteins for the purpose of regulating, and especially promoting, cell death is also part of the invention.
The invention is also directed to methods for controlling the programmed death of vertebrate cells by regulating the activity of interleukin-1β converting enzyme. Such regulation may take the form of inhibiting the enzymatic activity, e.g. through the use of specific antiproteases such as crmA, in order to prevent cell death. In this way, it may be possible to develop cell lines which remain viable in culture for an extended period of time or indefinitely. Certain cells can only be maintained in culture if they are grown in the presence of growth factors. By blocking cell death, it may be possible to make such cells growth factor independent. Alternatively, ICE activity may be increased in order to promote cell death. Such increased activity may be used in cancer cells to antagonize the effect of oncogenes such as bcl-2.
The present invention is also based on the discovery that TNF-α activates endogenous ICE activity in HeLa cells and that TNF-α-induced apoptosis is suppressed by crmA, which can specifically inhibit ICE-mediated cell death. Thus, certain embodiments of the invention are based on the activation of the ICE/ced-3-mediated cell death pathway by TNF-α.
Brief Description of the Figures
Figure 1 and 1A: Genetic and Physical Maps of the ced-3 Region on Chromosome IV
Figure 1 shows the genetic map of C. elegans in the region near ced-3 with the cosmid clones representing this region depicted below the map. nP33, nP34, nP35, nP36, and nP37 are restriction fragment length polymorphisms (RFLP) between Bristol and Bergerac wild type C. elegans strains. C43C9, W07H6 and C48D1 are three cosmid clones tested for rescue of the ced phenotype of ced-3(n717) animals. The ability of each cosmid clone to rescue ced-3 mutants and the fraction of independently obtained transgenic lines that were rescued are indicated on the right of the figure (rescue; no rescue; see text for data). The results indicate that ced-3 is contained in the cosmid C48D1.
Figure 1A is a restriction map of C48D1 subclones. C48D1 was digested with BamHI and self-ligated to generate subclone C48D1-28. C48D1-43, pJ40 and pJ107 were generated by partially digesting C48D1-28 with BglII.
and pJ7.4 were generated by ExoIII deletion of pJ107. These subclones were assayed for rescue of the ced phenotype of ced-3(n717) animals (rescue; no rescue; weak rescue). The numbers in parentheses indicate the fraction of independently obtained transgenic lines that were rescued. The smallest fragment that fully rescued the ced-3 mutant phenotype was the 7.5 kb pJ7.5 subclone.
Figure 2A-2F, 2G-2K, 2L and 2M-2P: Genomic Organization, Nucleotide Sequence, and Deduced Amino Acid Sequence of ced-3
Figure 2A-2F shows the genomic sequence of the ced-3 region (SEQ ID NO: 1) as obtained from plasmid pJ107. The deduced amino acid sequence of ced-3 (SEQ ID NO: 2) is based on the DNA sequence of ced-3 cDNA pJ87 and upon other experiments described in the text and in Experimental Procedures.
The 5' end of pJ87 contains 25 bp of poly-A/T sequence (not shown), which is probably a cloning artifact since it is not present in the genomic sequence. The likely start site of translation is marked with an arrowhead. The SL1 splice acceptor site of the ced-3 transcript is boxed. The positions of 12 ced-3 mutations are indicated. Repetitive elements are indicated as arrows above the relevant sequences. Numbers on the left indicate nucleotide positions, beginning with the start of pJ107. Numbers below the amino acid sequence indicate codon positions. Five types of imperfect repeats were found: repeat 1, also found in fem-1 (Spence et al., Cell 60:981-990 (1990)) and hlh-1 (Krause et al., Cell 63:907-919 (1990)); repeat 2, novel; repeat 3, also found in lin-12 and fem-1; repeat 4, also found in lin-12; and repeat 5, novel. Numbers on the sides of the figure indicate nucleotide positions, beginning with the start of pJ107. Numbers under the amino acid sequence indicate codon positions.
Figure 2G-2K contains comparisons of the repetitive elements in ced-3 with the repetitive elements in the genes ced-3, fem-1, hlh-1, lin-12, glp-1, and the cosmids B0303 and ZK643 (see text for references). In the case of inverted repeats, each arm of a repeat ("for" or "rev" for "forward" or "reverse", respectively) was compared to both its partner and to individual arms of the other repeats. 2A(i): Repeat 1 (SEQ ID NO: 3-11); 2A(ii): Repeat 2 (SEQ ID NO: 12-14); 2A(iii): Repeat 3 (SEQ ID NO: 15-27); 2A(iv): Repeat 4 (SEQ ID NO: 28-30); and 2A(v): Repeat 5 (SEQ ID NO: 30-33). The different ced-3 sequences which appear in the comparisons are different repeats of the same repetitive element. The numbers "1b" etc. are different repeats of the same class of repetitive element.
Figure 2L shows the locations of the introns (lines) and exons (open boxes) of the ced-3 gene as well as the positions of 12 ced-3 mutations analyzed. The serine-rich region, the trans-spliced leader (SL1), the possible start of translation (ATG) and polyadenylation (AAA) site are also indicated.
Figure 2M-2P shows the cDNA sequence (SEQ ID NO: 34) and deduced amino acid sequence (SEQ ID NO: 35) of ced-3 as obtained from plasmid pJ87.
Figure 3 and 3A-3B: Comparison of the Structure of the ced-3 Protein and hICE Protein
Figure 3 shows a comparison of structural features of ced-3 with those of the hICE gene. The predicted proteins corresponding to the hICE proenzyme and ced-3 are represented. The active site in hICE and the predicted active site in ced-3 are indicated by the black rectangles. The four known cleavage sites in hICE flanking the processed hICE subunits (p24, which was detected in low quantities when hICE was purified (Thornberry et al. (1992)), p20, and p10) and two conserved presumptive cleavage sites in ced-3 are indicated with solid lines and linked with dotted lines. Five other potential cleavage sites in ced-3 are indicated with dashed lines. The positions of the aspartate residues at potential cleavage sites are indicated below each diagram.
Figure 3A-3B contains a comparison of the amino acid sequences of ced-3 from C. elegans (SEQ ID NO: 35), C. briggsae (SEQ ID NO: 36) and C. vulgaris (SEQ ID NO: 37) with hICE (SEQ ID NO: 39), mICE (SEQ ID NO: 38), and mouse nedd-2 (SEQ ID NO: 40). Amino acids are numbered to the right of each protein. Dashes indicate gaps in the sequence made to allow optimal alignment. Residues that are conserved among more than half of the proteins are boxed. Missense ced-3 mutations are indicated above the comparison blocks showing the residue in the mutant ced-3 and the allele name. Asterisks indicate potential aspartate self-cleavage sites in ced-3. Circles indicate known aspartate self-cleavage sites in hICE. Residues indicated in boldface correspond to the highly conserved pentapeptide containing the active cysteine.
Figure 4: Construction of Expression Cassettes of mICE-lacZ and ced-3-lacZ Fusion Genes
Figure 4 shows several expression cassettes used in studying the cellular effects of ICE and ced-3 expression. The cassettes are as follows: pβactM10Z contains intact mICE fused to the E. coli lacZ gene (mICE-lacZ). pβactM 1Z contains the P20 and P10 subunits of mICE fused to the E. coli lacZ gene (P20/P10-lacZ). pβactM19Z contains the P20 subunit of mICE fused to the E. coli lacZ gene (P20-lacZ). pβactM12Z contains the P10 subunit of mICE fused to the E. coli lacZ gene (P10-lacZ). pβactced38Z contains the C. elegans ced-3 gene fused to the lacZ gene (ced-3-lacZ). pJ485 and pβactced37Z contain a Gly to Ser mutation at the active domain pentapeptide "QACRG" in mICE and ced-3 respectively. pβactM17Z contains a Cys to Gly mutation at the active domain pentapeptide "QACRG" in mICE. pactβgal' is a control plasmid (Maekawa et al., Oncogene 6:627-632 (1991)). All plasmids use the β-actin promoter.
Figure 5: Genetic Pathways of Programmed Cell Death in the Nematode C. elegans and in Vertebrates
In vertebrates, bcl-2 blocks the activity of ICE, thereby preventing programmed cell death. Enzymatically active ICE causes vertebrate cell death. In C. elegans, ced-9 blocks the action of ced-3/ced-4. Active ced-3 together with active ced-4 causes cell death.
Figure 6A-6C: mIch-2 cDNA Sequence and Deduced Amino Acid Sequence
Figure 6A-6C shows the nucleotide sequence of the mIch-2 cDNA (SEQ ID NO: 41) and the amino acid sequence (SEQ ID NO: 42) deduced therefrom.
Figure 7 and 7A: mIch-2 Amino Acid Sequence
Figures 7 and 7A contain a comparison of the amino acid sequences of mICE, hICE, mIch-2 and ced-3.
Figure 8: Potential QACRG Coding Region in the Mouse nedd2 cDNA
The reading frame proposed by Kumar et al. (Biochem. Biophys. Res. Comm. 185:1155-1161 (1992)) is b. In reading frame a, a potential QACRG coding region (SEQ ID NO: 44-49) is underlined.
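The idea behind Figure 8, that a QACRG coding region can be hidden in a reading frame other than the one originally proposed, can be illustrated with a minimal sketch that translates a DNA string in the three forward frames and reports where the pentapeptide appears. The function names, the toy sequence, and the use of numeric frame offsets 0, 1 and 2 are illustrative assumptions; the sketch does not use the nedd2 cDNA, and the correspondence between these offsets and frames a and b of the figure is not defined here.

```python
# Standard genetic code, indexed by codon with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, offset):
    """Translate dna in the forward frame that starts at offset (0, 1 or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    # Codons with unexpected characters are rendered as "X".
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_motif(dna, motif="QACRG"):
    """Return (frame offset, residue index) for every motif hit in the forward frames."""
    hits = []
    for offset in range(3):
        protein = translate(dna, offset)
        start = protein.find(motif)
        while start != -1:
            hits.append((offset, start))
            start = protein.find(motif, start + 1)
    return hits


# Hypothetical toy sequence (not the nedd2 cDNA): QACRG encoded one base out of frame.
toy = "A" + "CAGGCTTGTCGTGGT" + "AA"
print(find_motif(toy))
```

A hit that appears only at a non-zero offset is the kind of observation the figure describes: the motif is invisible in one frame and readable in another.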
Figure 9 and 9A: Comparison of the Amino Acid Sequences of ced-3, hICE and Ich-1
Figure 9 contains a comparison of the amino acid sequences of ced-3 (SEQ ID NO: 35) and Ich-1 (SEQ ID NO: 43). There is a 52% similarity between the sequences and a 28% identity.
Figure 9A contains a comparison of the amino acid sequences of hICE (SEQ ID NO: 39) and Ich-1 (SEQ ID NO: 43). There is a 52% similarity between the sequences and a 27% identity.
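Percent identity and percent similarity figures such as those quoted for Figures 9 and 9A are computed from a pairwise alignment. The sketch below shows one common way to do this, counting over aligned columns in which neither sequence has a gap. The residue groups used for "similarity" and the choice of denominator are assumptions made for illustration; the specification does not state which scoring scheme produced its numbers, so this code is not expected to reproduce them.

```python
# Hypothetical conservative-substitution groups; the source does not say which
# similarity scheme produced its figures.
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment (not taken from the figures).
ident, simil = identity_and_similarity("QACRG-KL", "QACKGSRI")
print(round(ident, 1), round(simil, 1))
```

Because similarity counts identical residues plus conservative substitutions, it is always at least as large as identity, which matches the pattern of the values reported above.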
Figure 10A-10B: The cDNA Sequence of Ich-1L (SEQ ID NO: 50) and the Deduced Amino Acid Sequence (SEQ ID NO: 51) of Ich-1L Protein Product
The putative active domain is underlined.
Figure 10C-10E: The cDNA Sequence of Ich-1S (SEQ ID NO: 52) and the Deduced Amino Acid Sequence (SEQ ID NO: 53) of Ich-1S Protein Product
The intron sequence is underlined.
Figure 11A: The Alternative Splicing of Ich-1 mRNA
The exons are shown in bars. The intron is shown in a line. Nucleotides at the exon-intron borders are indicated.
Figure 11B: The Schematic Diagram of the Two Alternatively Spliced Ich-1 Transcripts and Ich-1L and Ich-1S Proteins
The peptide marked with X may not be translated in vivo. The open reading frames and proteins are shown in bars. Untranslated regions and introns are shown in lines. The positions of Ich-1S and Ich-1L stop codons are indicated on the transcript diagram as "stop". Amino acid sequences in Ich-1S that are different from Ich-1L are shaded.
Figure 12A-12B: A Comparison of the Ich-1 Protein Sequence with the Mouse nedd-2 Protein, the hICE Protein, the mICE Protein, and C. elegans ced-3 Protein
Amino acids are numbered to the right of each sequence. Any residues in nedd-2 (SEQ ID NO: 40), hICE (SEQ ID NO: 39), mICE (SEQ ID NO: 38), and ced-3 (SEQ ID NO: 35) that are identical with Ich-1 (SEQ ID NOs: 51, 53) are highlighted. The two point mutations made by site-directed mutagenesis are marked on the top of the sequence.
Figure 12C: A Schematic Comparison of Structural Features of Ich-1L and hICE
The active site of hICE and predicted active site of Ich-1L are indicated. The four known cleavage sites of hICE and potential cleavage sites of Ich-1L are marked.
Figure 13: Stable Expression of Ich-1S Prevents Cell Death of Rat-1 Cells Induced by Serum Removal
Stable transfectants of Rat-1 cells expressing bcl-2, crmA or Ich-1S were prepared as described in Experimental Procedures. Independent Ich-1S-positive and Ich-1S-negative clones were used. At time 0, exponentially growing cells were washed with serum-free DMEM and dead cells were counted over time by trypan blue staining.
Figure 14-14A: The cDNA Sequence (SEQ ID NO: 54) and Putative Ich-3 Protein Sequence (SEQ ID NO: )
The putative first Met is marked with a dot.
Figure 15-15A: Comparison of Amino Acid Sequences of Ich-3 (SEQ ID NO: 56) with hICE (SEQ ID NO: 38), mIch-2 (SEQ ID NO: 42), Ich-1 (SEQ ID NO: 43) and ced-3 (SEQ ID NO: 35)
Figure 16: Suppression of TNF-Induced Cytotoxicity by Overexpression of CrmA
The results are from three separate experiments with each condition done in duplicate.
Definitions
In the description that follows, a number of terms used in recombinant DNA (rDNA) technology or in the research area of programmed cell death are extensively utilized. In order to provide a clear and consistent understanding of the specification and claims, including the scope to be given such terms, the following definitions are provided.
Gene. A DNA sequence containing a template for an RNA polymerase. The RNA transcribed from a gene may or may not code for a protein. RNA that codes for a protein is termed messenger RNA (mRNA). It is understood, however, that a gene also encompasses non-transcribed regulatory sequences including, but not limited to, such sequences as enhancers, promoters, and poly-A addition sequences.
A "complementary DNA" or "cDNA" gene includes recombinant genes synthesized by reverse transcription of mRNA and from which intervening sequences (introns) have been removed. Of course, cDNA may also include any complementary part of any gene sequence. The complement could be synthesized, for example, and may not exclude DNA sequences not found in the naturally occurring mRNA.
Cloning vector. A plasmid or phage DNA or other DNA sequence which is able to replicate autonomously in a host cell, and which is characterized by one or a small number of endonuclease recognition sites at which such DNA sequences may be cut in a determinable fashion without loss of an essential biological function of the vehicle, and into which DNA may be spliced in order to bring about its replication and cloning. The cloning vector may further contain a marker suitable for use in the identification of cells transformed with the cloning vehicle. Markers, for example, are tetracycline resistance or ampicillin resistance. The term "cloning vehicle" is sometimes used for "cloning vector."
Expression vector. A vector similar to a cloning vector but which is capable of expressing a gene which has been cloned into it, after transformation into a host. The cloned gene is usually placed under the control of (i.e., operably linked to) certain control sequences such as promoter sequences.
Control sequences will vary depending on whether the vector is designed to express the operably linked gene in a prokaryotic or eukaryotic host and may additionally contain transcriptional elements such as enhancer elements, termination sequences, tissue-specificity elements, and/or translational initiation and termination sites.
Programmed cell death. The process in which cell death is genetically programmed. Programmed cell death allows organisms to eliminate cells that have served a developmental purpose but which are no longer beneficial.
Functional Derivative. A "functional derivative" of mIch-2, Ich-1 (Ich-1L and Ich-1S), or Ich-3 is a protein, or DNA encoding a protein, which possesses a biological activity that is substantially similar to the biological activity of the non-recombinant molecule. A functional derivative may or may not contain post-translational modifications such as covalently linked carbohydrate, depending on the necessity of such modifications for the performance of a specific function. The term "functional derivative" is intended to include the "fragments," "variants," "analogues," or "chemical derivatives" of a molecule.
The derivative retains at least one of the naturally-occurring functions of the parent gene or protein. The function can be any of the regulatory gene functions or any of the function(s) of the finally processed protein. The degree of activity of the function need not be quantitatively identical as long as the qualitative function is substantially similar.
Fragment. A "fragment" is meant to refer to any subgenetic sequence of the molecule, such as the peptide core, or a variant of the peptide core.
Detailed Description of the Preferred Embodiments
Description
The present invention relates, inter alia, to isolated DNA encoding ced-3 of C. elegans, mIch-2, Ich-1, and Ich-3. The invention also encompasses nucleic acids having the cDNA sequence of ced-3, mIch-2, Ich-1, and Ich-3. The invention also encompasses related sequences in other species that can be isolated without undue experimentation. It will be appreciated that trivial variations in the claimed sequences and fragments derived from the full-length genomic and cDNA genes are encompassed by the invention as well. The invention also encompasses protein sequences encoded by ced-3, mIch-2, Ich-1, and Ich-3. It should also be understood that Ich-1 encompasses both Ich-1S and Ich-1L.
ced-3
The genomic sequence of the claimed gene encoding ced-3 is shown in Figure 2A-2F (SEQ ID NO: 1). The gene is 7,656 base pairs in length and contains seven introns ranging in size from 54 base pairs to 1,195 base pairs. The four largest introns as well as sequences 5' to the START codon contain repetitive elements, some of which have been previously characterized in the non-coding regions of other C. elegans genes such as fem-1 (Spence et al., Cell 60:981-990 (1990)) and hlh-1 (Krause et al., Cell 63:907-919 (1990)). A comparison of the repetitive elements in ced-3 with previously characterized repetitive elements is shown in Figures 2G-2K (SEQ ID NOs: 3-33). The START codon of ced-3 is the methionine at position 2232 of the genomic sequence shown in Figure 2B.
The cDNA sequence of ced-3 is shown in Figure 2M-2P (SEQ ID NO: 34). The cDNA is 2,482 base pairs in length with an open reading frame encoding 503 amino acids (SEQ ID NO: 35) and 953 base pairs of 3' untranslated sequence. The last 380 base pairs of the 3' sequence are not essential for the expression of ced-3.
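The lengths quoted for the ced-3 cDNA can be cross-checked with simple arithmetic: an open reading frame of 503 amino acids occupies 3 × 503 bases, plus a stop codon, and whatever is left after subtracting the 3' untranslated sequence lies upstream of the start codon. The sketch below does this bookkeeping. The function name and the `stop_in_utr3` switch are illustrative assumptions; in particular, the specification does not say whether the stop codon is counted within the 953 base pairs of 3' untranslated sequence, so the upstream remainder should be read as an estimate under that stated assumption.

```python
def cdna_layout(total_bp, orf_codons, utr3_bp, stop_in_utr3=False):
    """Split a cDNA length into 5' remainder, coding region and 3' UTR (all in bp)."""
    coding_bp = 3 * orf_codons
    if not stop_in_utr3:
        coding_bp += 3  # assumption: the stop codon is counted with the coding region
    remainder_5prime = total_bp - coding_bp - utr3_bp
    if remainder_5prime < 0:
        raise ValueError("lengths are inconsistent")
    return {
        "five_prime_bp": remainder_5prime,
        "coding_bp": coding_bp,
        "utr3_bp": utr3_bp,
    }


# Values quoted in the text for the ced-3 cDNA.
layout = cdna_layout(total_bp=2482, orf_codons=503, utr3_bp=953)
print(layout)
```

Under either treatment of the stop codon the three parts sum to the quoted total with a short upstream remainder, so the stated lengths are mutually consistent.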
In addition to encompassing the genomic and cDNA sequences of ced-3 from C. elegans, the present invention also encompasses related sequences in other nematode species which can be isolated without undue experimentation.
For example, the inventors have shown that ced-3 genes from C. briggsae and C. vulgaris may be isolated using the ced-3 cDNA from C. elegans as a probe (see Example 1).
The invention also encompasses protein products from the ced-3 gene, gene variants, derivatives, and related sequences. As deduced from the DNA sequence, ced-3 is 503 amino acids in length and contains a serine-rich middle region of about 100 amino acids. The amino acid sequence comprising the claimed ced-3 is shown in Figure 2A-2F (SEQ ID NO: 2) and Figure 2M-2P.
A comparison of ced-3 of C. elegans with the inferred ced-3 sequences from the related nematode species C. briggsae (SEQ ID NO: 36) and C. vulgaris (SEQ ID NO: 37) indicates that the non-serine-rich region is highly conserved and the serine-rich region is more variable. The non-serine-rich portion of ced-3 is also homologous with hICE (SEQ ID NO: 39). The C-terminal portions of both ced-3 and hICE are similar to the mouse nedd-2 (SEQ ID NO: ). The results suggest that ced-3 acts as a cysteine protease in controlling the onset of programmed cell death in C. elegans and that members of the ced-3/ICE/nedd-2 gene family function in programmed cell death in a wide variety of species.
mIch-2
The cDNA sequence (SEQ ID NO: 41) and deduced amino acid sequence (SEQ ID NO: 42) of mIch-2 are shown in Figure 6A-6C. As expected, mIch-2 shows homology to both hICE and mICE as well as to C. elegans ced-3 (see Figure 7 and 7A). In contrast to other cell-death genes that have been identified, mIch-2 is preferentially expressed in the thymus and placenta. Example 3 describes how the gene was obtained by screening a mouse thymus cDNA library with a DNA probe derived from hICE under conditions of low stringency. Given the amino acid sequence and cDNA sequence shown in Figure 6A-6C, preferred methods of obtaining the mIch-2 gene (either genomic or cDNA) are described below.
Ich-1
nedd2, hICE, mIch-2 and ced-3 are all members of the same gene family. This suggested that new genes might be isolated based upon their homology to identified family members.
Ich-1 is 1456 base pairs in length and contains an open reading frame of 435 amino acids (Figure 10A (SEQ ID NOs: 50, )). The C-terminal 130 amino acids of Ich-1 are over 87% identical to mouse nedd2. However, Ich-1 contains a much longer open reading frame and has the pentapeptide QACRG which is the active center of the proteins of the ced-3/ICE family. The results indicate that the cDNA isolated by Kumar et al. may not have been synthesized from a fully processed mRNA and that the 5' 1147 base pairs which Kumar et al. reported for nedd2 cDNA may actually represent the sequence of an intron. The sequence reported by Kumar et al. contains a region which could potentially code for QACRG but these amino acids are encoded in a different reading frame (SEQ ID NOs: 44-49) than that indicated by Kumar et al. (Figure ). This suggests that Kumar et al. made an error in cloning.
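The reading-frame argument above can be illustrated by translating a sequence in each of its three forward frames and asking which frame encodes the QACRG pentapeptide. The following is a minimal sketch using the standard genetic code; the short test sequence is a made-up example, not the nedd2 or Ich-1 sequence, and the function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, frame: int) -> str:
    """Translate dna in forward frame 0, 1 or 2; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frames_with_motif(dna: str, motif: str = "QACRG") -> list:
    """Return the forward frames (0, 1, 2) whose translation contains motif."""
    return [frame for frame in range(3) if motif in translate(dna, frame)]


if __name__ == "__main__":
    # Hypothetical sequence: one extra leading base shifts QACRG into frame 1.
    example = "A" + "CAGGCTTGTCGTGGT"
    for frame in range(3):
        print(frame, translate(example, frame))
    print("QACRG found in frame(s):", frames_with_motif(example))
```

A single inserted or deleted base upstream is enough to move the motif out of the annotated frame, which is the kind of discrepancy the text attributes to the earlier nedd2 sequence.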
The coding regions of nedd2 and Ich-1 are highly homologous. The amino acid sequence of the deduced Ich-1L protein shares 28% identity with ced-3 and 27% identity with hICE (Figures 9, 9A).
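Percent identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the aligned columns in which neither sequence has a gap; the convention actually used for the reported values is not stated in the text, and the example strings are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # columns containing a gap are skipped under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy alignment: five matches over six ungapped columns.
    print(round(percent_identity("QACRG-K", "QACKGAK"), 1))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, so reported identities are only comparable when the denominator is the same.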
Ich-1 mRNA is alternatively spliced into two different forms. One mRNA species encodes a protein product of 435 amino acids, designated Ich-1L (SEQ ID NO: 51), which contains amino acid sequence homologous to both the P20 and P10 subunits of hICE as well as the entire ced-3 (28% identity). The other mRNA encodes a 312 amino-acid truncated version of Ich-1, designated Ich-1S (SEQ ID NO: 53), that terminates 21 amino acid residues after the QACRG active domain of Ich-1. Expression of Ich-1L and Ich-1S has opposite effects on cell death. Overexpression of Ich-1L induces Rat-1 fibroblast cells to die in culture, while overexpression of Ich-1S suppresses Rat-1 cell death induced by serum deprivation. Results herein suggest that Ich-1 may play an important role in both positive and negative regulation of programmed cell death in vertebrate animals.
Ich-3
Ich-3 (SEQ ID NO: 54) was identified based on its sequence homology with hICE and other isolated ICE homologs. Since the Ich-3 clone isolated by PCR only contains the coding region for the C-terminal half of Ich-3, a mouse thymus cDNA library was screened using the Ich-3 insert. Among 2 million clones screened, 9 positive clones were isolated. The sequence herein is from one clone that contains the complete coding region for the Ich-3 gene.
Methods of Making
ced-3
There are many standard procedures for cloning genes which are well-known in the art and which can be used to obtain the ced-3 gene (see e.g., Sambrook et al., Molecular Cloning, a Laboratory Manual, 2nd edition, vol. 1-3, Cold Spring Harbor Laboratory Press, 1989). In Example 1, a detailed description is provided of two preferred procedures. The first preferred procedure does not require the availability of ced-3 gene sequence information and is based upon a method described by Ruvkun et al. (Molecular Genetics of Caenorhabditis Elegans Heterochromic Gene lin-14 121: 501-516 (1988)). In brief, Bristol and Bergerac strains of nematode are crossed and restriction fragment length polymorphism mapping is performed on the DNA of the resulting inbred strain. Restriction fragments closely linked to ced-3 are identified and then used as probes to screen cosmid libraries for cosmids carrying all or part of the ced-3 gene. Positive cosmids are injected into a nematode strain in which ced-3 has been mutated. Cosmids carrying active ced-3 genes are identified by their ability to rescue the ced-3 mutant phenotype.
A second method for cloning ced-3 genes relies upon the sequence information which has been disclosed herein. Specifically, DNA probes are constructed based upon the sequence of the ced-3 gene of C. elegans. These probes are labelled and used to screen DNA libraries from nematodes or other species. Procedures for carrying out such cloning and screening are described more fully below in connection with the cloning and expression of mIch-2, Ich-1, and Ich-3, and are well-known in the art (see, Sambrook et al., Molecular Cloning, a Laboratory Manual, 2nd edition (1988)). When hybridizations are carried out under conditions of high stringency, genes are identified which contain sequences corresponding exactly to that of the probe. In this way, the exact same sequence as described by the inventors herein may be obtained. Alternatively, hybridizations may be carried out under conditions of low stringency in order to identify genes in other species which are homologous to ced-3 but which contain structural variations (see Example 1 for a description of how such hybridizations may be used to obtain the ced-3 genes from C. briggsae and C. vulgaris).
The results in Example 2 demonstrate that the products of cell-death genes may be tolerated by cells provided they are expressed at low levels.
Therefore, ced-3 may be obtained by incorporating the ced-3 cDNA described above into any of a number of expression vectors well-known in the art and transferring these vectors into appropriate hosts (see Sambrook et al., Molecular Cloning, a Laboratory Manual, vol. 3 (1988)). As described below in connection with the expression of mIch-2, Ich-1, and Ich-3, expression systems may be utilized in which cells are grown under conditions in which a recombinant gene is not expressed and, after cells reach a desired density, expression may be induced. In this way, the tendency of cells which express ced-3 to die may be circumvented.
mIch-2, Ich-1, and Ich-3
DNA encoding mIch-2, Ich-1, and Ich-3 may be obtained from either genomic DNA or from cDNA. Genomic DNA may include naturally occurring introns. Moreover, such genomic DNA may be obtained in association with the 5' promoter region of the sequences and/or with the 3' transcriptional termination region. Further, such genomic DNA may be obtained in association with the genetic sequences which encode the 5' non-translated region of the mIch-2, Ich-1, and Ich-3 mRNA and/or with the genetic sequences which encode the 3' non-translated region. To the extent that a host cell can recognize the transcriptional and/or translational regulatory signals associated with the expression of the mRNA and protein, then the 5' and/or 3' nontranscribed regions of the native gene, and/or, the 5' and/or 3' non-translated regions of the mRNA, may be retained and employed for transcriptional and translational regulation.
Genomic DNA can be extracted and purified from any cell containing mouse chromosomes by means well known in the art (for example, see Guide to Molecular Cloning Techniques, S.L. Berger et al., eds., Academic Press (1987)). Alternatively, mRNA can be isolated from any cell which expresses the genes, and used to produce cDNA by means well known in the art. The preferred sources for mIch-2 are thymus or placental cells. The mRNA coding for any of the proteins may be enriched by techniques commonly used to enrich mRNA preparations for specific sequences, such as sucrose gradient centrifugation, or both.
For cloning into a vector, DNA prepared as described above (either human genomic DNA or preferably cDNA) is randomly sheared or enzymatically cleaved, and ligated into appropriate vectors to form a recombinant gene library. A DNA sequence encoding the protein or its functional derivatives may be inserted into a DNA vector in accordance with conventional techniques.
Techniques for such manipulations are disclosed by Sambrook, et al., supra, and are well known in the art.
In a preferred method, oligonucleotide probes specific for the gene are designed from the cDNA sequences shown in Figures 6A-6C, 10A-10B, 10C-10E, and 14-14A. The oligonucleotide may be synthesized by means well known in the art (see, for example, Synthesis and Application of DNA and RNA, S.A. Narang, ed., Academic Press, San Diego, CA (1987)) and employed as a probe to identify and isolate the cloned gene by techniques known in the art.
Techniques of nucleic acid hybridization and clone identification are disclosed by Maniatis, et al. (In: Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY (1982)), and by Hames, et al. (In: Nucleic Acid Hybridization, A Practical Approach, IRL Press, Washington, DC (1985)). Those members of the above-described gene library which are found to be capable of such hybridization are then analyzed to determine the extent and nature of the coding sequences which they contain.
To facilitate the detection of the desired coding sequence, the above-described DNA probe is labeled with a detectable group. This group can be any material having a detectable physical or chemical property. Such materials are well-known in the field of nucleic acid hybridization and any label useful in such methods can be applied to the present invention. Particularly useful are radioactive labels, such as ³²P, ³H, ¹⁴C, ³⁵S, ¹²⁵I, or the like. Any radioactive label may be employed which provides for an adequate signal and has a sufficient half-life. The oligonucleotide may be radioactively labeled, for example, by "nick-translation" by well-known means, as described in, for example, Rigby, et al., J. Mol. Biol. 113:237 (1977) or by T4 DNA polymerase replacement synthesis as described in, for example, Deen, et al., Anal. Biochem. 135:456 (1983).
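Whether a label "has a sufficient half-life" can be judged with the exponential decay law, under which the fraction of activity remaining after time t is 0.5 raised to the power t divided by the half-life. The sketch below is illustrative only; the half-life values are approximate textbook figures that do not come from this specification, and the function name is hypothetical.

```python
# Approximate half-lives in days (textbook values, NOT taken from the source text).
HALF_LIFE_DAYS = {
    "32P": 14.3,
    "35S": 87.4,
    "125I": 59.4,
    "3H": 12.3 * 365.25,
    "14C": 5730 * 365.25,
}


def fraction_remaining(isotope: str, elapsed_days: float) -> float:
    """Fraction of the initial radioactivity left after elapsed_days."""
    return 0.5 ** (elapsed_days / HALF_LIFE_DAYS[isotope])


if __name__ == "__main__":
    # How much signal would remain on a labeled probe stored for four weeks?
    for isotope in HALF_LIFE_DAYS:
        print(f"{isotope:>4}: {100 * fraction_remaining(isotope, 28):6.2f}% left after 28 days")
```

On these assumed values a short-lived label loses most of its activity within weeks, whereas long-lived labels are essentially unchanged over the same period, which is the trade-off the sentence about signal and half-life alludes to.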
Alternatively, oligonucleotide probes may be labeled with a non-radioactive marker such as biotin, an enzyme or a fluorescent group. See, for example, Leary, et al., Proc. Natl. Acad. Sci. USA 80:4045 (1983); Renz, et al., Nucl. Acids Res. 12:3435 (1984); and Renz, EMBO J. 6:817 (1983).
For Ich-1, the isolation shown in the Examples was as follows. Two primers were used in the polymerase chain reaction to amplify nedd2 cDNA from embryonic day 15 mouse brain cDNA (Sambrook et al., Molecular Cloning, a Laboratory Manual, vol. 3 (1988)). One primer had the sequence (SEQ ID NO: 57): ATGCTAACTGTCCAAGTCTA and the other primer had the sequence (SEQ ID NO: 58): TCCAACAGCAGGAATAGCA. The cDNA thus amplified was cloned using standard methodology. The cloned mouse nedd2 cDNA was used as a probe to screen a human fetal brain cDNA library purchased from Stratagene. Such methods of screening and isolating clones are well known in the art (Maniatis, et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY (1982)); Hames, et al., Nucleic Acid Hybridization, A Practical Approach, IRL Press, Washington, DC (1985)). A human nedd-2 cDNA clone was isolated that encodes a protein much longer than the mouse nedd-2 and contains amino acid sequences homologous to the entire hICE and ced-3. The isolated clone was given the name Ice-ced 3 homolog or Ich-1.
The Ich-1 cDNA may be obtained using the nucleic acid sequence information given in Figures 10A or 10B. DNA probes constructed from this sequence can be labeled and used to screen human gene libraries as described herein. Also as discussed herein, Ich-1 may be cloned into expression vectors and expressed in systems in which host cells are grown under conditions in which recombinant genes are not expressed and, after cells reach a desired density, expression is induced. In this way, a tendency of cells which express Ich-1 to die may be circumvented.
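As an aside on the two nedd2 primers of SEQ ID NOs: 57 and 58 given above, basic primer properties can be tabulated in a few lines. The sketch below computes length, GC content and a rough melting temperature using the Wallace rule of thumb (2 °C per A or T plus 4 °C per G or C); that rule is an assumption introduced here for illustration and is not the method described in the specification.

```python
# Primer sequences as given in the text (SEQ ID NO: 57 and SEQ ID NO: 58).
PRIMER_SEQ_ID_57 = "ATGCTAACTGTCCAAGTCTA"
PRIMER_SEQ_ID_58 = "TCCAACAGCAGGAATAGCA"


def primer_stats(primer: str) -> dict:
    """Return length, GC percentage and Wallace-rule Tm estimate for a primer."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return {
        "length": len(seq),
        "gc_percent": 100.0 * gc / len(seq),
        # Wallace rule of thumb, only a rough guide for short oligonucleotides.
        "wallace_tm_c": 2 * at + 4 * gc,
    }


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO: 57", PRIMER_SEQ_ID_57), ("SEQ ID NO: 58", PRIMER_SEQ_ID_58)):
        print(name, primer_stats(seq))
```

A pair whose estimated melting temperatures are close to one another is generally easier to use at a single annealing temperature; more accurate nearest-neighbor estimates would be needed for real primer design.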
One method of making Ich-3 is as follows. mRNA was isolated from embryonic day 14 mouse embryos using Invitrogen's microfast track mRNA isolation kit. The isolated mRNA was reverse transcribed to generate template for PCR amplification. The degenerate PCR primers were: clCEB (SEQ ID NO: 59) {TG(ATCG)CC(ATCG)GGGAA(ATCG)AGGTAGAA} and clCEAs (SEQ ID NO: 60) {ATCAT(ATC)ATCCAGGC(ATCG)TGCAG(AG)GG}.
The PCR cycles were set up as follows: 1. 94°C, 3 min; 2. 94°C, 1 min; 3. 48°C, 2 min; 4. 72°C, 3 min; 5. return to 4 cycles; 6. 94°C, 1 min; 7. 55°C, 2 min; 8. 72°C, 3 min; 9. return to 34 cycles; 10. 72°C, 10 min; 11. end. Such PCR generated a band of about 400 bp, the predicted size of ICE homologs. The PCR products were cloned into T-tailed blunt-ended pBSKII plasmid vector (Stratagene). Plasmids that contain an insert were analyzed by DNA sequencing.
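The degenerate primers listed above use parenthesized groups such as (ATCG) to mark positions where a mixture of bases is synthesized. A short sketch can count how many distinct oligonucleotides each primer represents and enumerate them. The parsing convention (one parenthesized group per mixed position) is inferred from the notation in the text, and the function names are hypothetical.

```python
import re
from itertools import product
from math import prod

# Degenerate primers as written in the text (SEQ ID NO: 59 and SEQ ID NO: 60).
PRIMER_SEQ_ID_59 = "TG(ATCG)CC(ATCG)GGGAA(ATCG)AGGTAGAA"
PRIMER_SEQ_ID_60 = "ATCAT(ATC)ATCCAGGC(ATCG)TGCAG(AG)GG"


def parse_degenerate(primer: str) -> list:
    """Split a primer into per-position base choices, e.g. ['T', 'G', 'ATCG', ...]."""
    pattern = re.compile(r"\(([ACGT]+)\)|([ACGT])")
    return [m.group(1) or m.group(2) for m in pattern.finditer(primer.upper())]


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by the degenerate primer."""
    return prod(len(choices) for choices in parse_degenerate(primer))


def expand(primer: str) -> list:
    """Enumerate every concrete sequence in the degenerate primer pool."""
    return ["".join(combo) for combo in product(*parse_degenerate(primer))]


if __name__ == "__main__":
    for name, primer in (("SEQ ID NO: 59", PRIMER_SEQ_ID_59), ("SEQ ID NO: 60", PRIMER_SEQ_ID_60)):
        positions = parse_degenerate(primer)
        print(name, "length:", len(positions), "degeneracy:", degeneracy(primer))
```

Higher degeneracy widens the range of related templates a primer can anneal to, at the cost of a lower effective concentration of any one sequence, which is consistent with the low initial annealing temperature in the cycling program.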
The Ich-3 cDNA may also be obtained using the nucleic acid sequence information given in Figure 14-14A. DNA probes constructed from this sequence can be labeled and used to screen human gene libraries as described herein. Also as discussed herein, Ich-3 may be cloned into expression vectors and expressed in systems in which host cells are grown under conditions in which recombinant genes are not expressed and, after cells reach a desired density, expression is induced.
The methods discussed herein are capable of identifying genetic sequences which encode mIch-2, Ich-1, and Ich-3. In order to further characterize such genetic sequences, and in order to produce the recombinant protein, it is desirable to express the proteins which these sequences encode.
To express any of the genes herein or their derivatives, transcriptional and translational signals recognizable by an appropriate host are necessary. The cloned coding sequences, obtained through the methods described herein, may be operably linked to sequences controlling transcriptional expression in an expression vector and introduced into a host cell, either prokaryote or eukaryote, to produce recombinant protein or a functional derivative thereof.
Depending upon which strand of the sequence is operably linked to the sequences controlling transcriptional expression, it is also possible to express antisense RNA or a functional derivative thereof.
Expression of the protein in different hosts may result in different post-translational modifications which may alter the properties of the protein.
Preferably, the present invention encompasses the expression of mIch-2, Ich-1, and Ich-3 or a functional derivative thereof, in eukaryotic cells, and especially mammalian, insect and yeast cells. Especially preferred eukaryotic hosts are mammalian cells either in vivo, or in tissue culture. Mammalian cells provide post-translational modifications which should be similar or identical to those found in the native protein. Preferred mammalian host cells include rat-1 fibroblasts, mouse bone marrow derived mast cells, mouse mast cells immortalized with Kirsten sarcoma virus, or normal mouse mast cells that have been co-cultured with mouse fibroblasts (Razin et al., J. of Immun. 132:1479 (1984); Levi-Schaffer et al., Proc. Natl. Acad. Sci. (USA) 83:6485 (1986) and Reynolds et al., J. Biol. Chem. 263:12783-12791 (1988)).
A nucleic acid molecule, such as DNA, is said to be "capable of expressing" a polypeptide if it contains expression control sequences which contain transcriptional regulatory information and such sequences are "operably linked" to the nucleotide sequence which encodes the polypeptide. An operable linkage is a linkage in which a coding sequence is connected to a regulatory sequence (or sequences) in such a way as to place expression of the coding sequence under the influence or control of the regulatory sequence. Two DNA sequences (e.g., the coding sequence of the protein and a promoter) are said to be operably linked if induction of promoter function results in the transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation; (2) interfere with the ability of regulatory sequences to direct the expression of the coding sequence, antisense RNA, or protein; or (3) interfere with the ability of the coding sequence template to be transcribed by the promoter region sequence. Thus, a promoter region would be operably linked to a DNA sequence if the promoter were capable of effecting transcription of that DNA sequence.
The precise nature of the regulatory regions needed for gene expression may vary between species or cell types, but shall in general include, as necessary, 5' non-transcribing and 5' non-translating (non-coding) sequences involved with initiation of transcription and translation respectively, such as the TATA box, capping sequence, CAAT sequence, and the like. Especially, such 5' non-transcribing control sequences will include a region which contains a promoter for transcriptional control of the operably linked gene.
Expression of proteins of the invention in eukaryotic hosts requires the use of regulatory regions functional in such hosts, and preferably eukaryotic regulatory systems. A wide variety of transcriptional and translational regulatory sequences can be employed, depending upon the nature of the eukaryotic host. The transcriptional and translational regulatory signals can also be derived from the genomic sequences of viruses which infect eukaryotic cells, such as adenovirus, bovine papilloma virus, simian virus, herpesvirus, or the like. Preferably, these regulatory signals are associated with a particular gene which is capable of a high level of expression in the host cell.
In eukaryotes, where transcription is not linked to translation, control regions may or may not provide an initiator methionine (AUG) codon, depending on whether the cloned sequence contains such a methionine. Such regions will, in general, include a promoter region sufficient to direct the initiation of RNA synthesis in the host cell. Promoters from heterologous mammalian genes which encode mRNA capable of translation are preferred, and especially, strong promoters such as the promoter for actin, collagen, myosin, etc., can be employed provided they also function as promoters in the host cell. Preferred eukaryotic promoters include the promoter of the mouse metallothionein I gene (Hamer, et al., J. Mol. Appl. Gen. 1:273-288 (1982)); the TK promoter of herpesvirus (McKnight, Cell 31:355-365 (1982)); the SV40 early promoter (Benoist, et al., Nature (London) 290:304-310 (1981)); in yeast, the yeast gal4 gene promoter (Johnston, S.A., et al., Proc. Natl. Acad. Sci. (USA) 79:6971-6975 (1982); Silver, et al., Proc. Natl. Acad. Sci. (USA) 81:5951-5955 (1984)) or a glycolytic gene promoter may be used.
It is known that translation of eukaryotic mRNA is initiated at the codon which encodes the first methionine. For this reason, it is preferable to ensure that the linkage between a eukaryotic promoter and a DNA sequence which encodes the proteins of the invention or functional derivatives thereof, does not contain any intervening codons which are capable of encoding a methionine.
The presence of such codons results either in the formation of a fusion protein or a frame-shift mutation.
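The two outcomes just described, a fusion protein or a frame-shift, depend on whether an intervening ATG lies in the same reading frame as the coding sequence. The sketch below scans a linker region for ATG triplets and classifies each one. It is a simplification under stated assumptions: the linker is taken to be the DNA between the promoter and the first codon of the coding sequence, the coding sequence is assumed to begin immediately after the linker, and stop codons within the linker are ignored. The function name and example linkers are hypothetical.

```python
def classify_intervening_atgs(linker: str) -> list:
    """Find every ATG in the linker and report its frame relative to the
    downstream coding sequence, which is assumed to start right after the linker.

    Returns a list of (position, "in-frame" | "out-of-frame") tuples.
    """
    linker = linker.upper()
    results = []
    pos = linker.find("ATG")
    while pos != -1:
        # Distance from this ATG to the first base of the coding sequence.
        distance = len(linker) - pos
        frame = "in-frame" if distance % 3 == 0 else "out-of-frame"
        results.append((pos, frame))
        pos = linker.find("ATG", pos + 1)
    return results


if __name__ == "__main__":
    for linker in ("GGCC", "GGATGCC", "GGATGCCA"):
        hits = classify_intervening_atgs(linker)
        print(linker, "->", hits if hits else "no intervening ATG")
```

An in-frame ATG would be expected to add extra N-terminal residues (a fusion), while an out-of-frame ATG would initiate translation in the wrong frame, which is why the text prefers linkers with no ATG at all.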
If desired, a fusion product of the proteins may be constructed. For example, the sequence coding for the proteins may be linked to a signal sequence which will allow secretion of the protein from, or the compartmentalization of the protein in, a particular host. Such signal sequences may be designed with or without specific protease sites such that the signal peptide sequence is amenable to subsequent removal. Alternatively, the native signal sequence for this protein may be used.
Transcriptional initiation regulatory signals can be selected which allow for repression or activation, so that expression of operably linked genes can be modulated. Of interest are regulatory signals which are temperature-sensitive so that by varying the temperature, expression can be repressed or initiated, or are subject to chemical regulation, e.g., by a metabolite.
If desired, the non-transcribed and/or non-translated regions 3' to the sequence coding for the proteins can be obtained by the above-described cloning methods. The 3'-non-transcribed region may be retained for transcriptional termination regulatory sequence elements; the 3'-non-translated region may be retained for translational termination regulatory sequence elements, or for those elements which direct polyadenylation in eukaryotic cells. Where native expression control signals do not function satisfactorily in a host cell, functional sequences may be substituted.
The vectors of the invention may further comprise other operably linked regulatory elements such as enhancer sequences, or DNA elements which confer tissue or cell-type specific expression on an operably linked gene.
To transform a mammalian cell with the DNA constructs of the invention many vector systems are available, depending upon whether it is desired to insert the DNA construct into the host cell chromosomal DNA, or to allow it to exist in extrachromosomal form. If the protein encoding sequence and an operably linked promoter are introduced into a recipient eukaryotic cell as a non-replicating DNA (or RNA) molecule, the expression of the protein may occur through the transient expression of the introduced sequence.
In a preferred embodiment, genetically stable transformants may be constructed with vector systems, or transformation systems, whereby mIch-2, Ich-1, or Ich-3 DNA is integrated into the host chromosome. Such integration may occur de novo within the cell or, in a most preferred embodiment, through the aid of a cotransformed vector which functionally inserts itself into the host chromosome, for example, retroviral vectors, transposons or other DNA elements which promote integration of DNA sequences in chromosomes.
Cells which have stably integrated the introduced DNA into their chromosomes are selected by also introducing one or more markers which allow for selection of host cells which contain the expression vector in the chromosome, for example the marker may provide biocide resistance, e.g., resistance to antibiotics, or heavy metals, such as copper, or the like. The selectable marker gene can either be directly linked to the DNA gene sequences to be expressed, or introduced into the same cell by co-transfection. In another embodiment, the introduced sequence is incorporated into a plasmid or viral vector capable of autonomous replication in the recipient host. Any of a wide variety of vectors may be employed for this purpose. Factors of importance in selecting a particular plasmid or viral vector include: the ease with which recipient cells that contain the vector may be recognized and selected from those recipient cells which do not contain the vector; the number of copies of the vector which are desired in a particular host; and whether it is desirable to be able to "shuttle" the vector between host cells of different species.
Preferred eukaryotic plasmids include those derived from the bovine papilloma virus, vaccinia virus, SV40, and, in yeast, plasmids containing the 2-micron circle, etc., or their derivatives. Such plasmids are well known in the art (Botstein, et al., Miami Wntr. Symp. 19:265-274 (1982); Broach, J.R., In: The Molecular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, p. 445-470 (1981); Broach, Cell 28:203-204 (1982); Bollon, et al., J. Clin. Hematol. Oncol. 10:39-48 (1980); Maniatis, In: Cell Biology: A Comprehensive Treatise, Vol. 3, Gene Expression, Academic Press, NY, pp. 563-608 (1980)), and are commercially available.
Once the vector or DNA sequence containing the construct(s) is prepared for expression, the DNA construct(s) is introduced into an appropriate host cell by any of a variety of suitable means, including transfection. After the introduction of the vector, recipient cells are grown in a medium which selects for the growth of vector-containing cells. Expression of the cloned gene sequence(s) results in the production of the protein, or in the production of a fragment of this protein. This expression can take place in a continuous manner in the transformed cells, or in a controlled manner, for example, expression which follows induction of differentiation of the transformed cells (for example, by administration of bromodeoxyuracil to neuroblastoma cells or the like). The latter is preferred for the expression of the proteins of the invention. By growing cells under conditions in which the proteins are not expressed, cell death may be avoided. When a high cell density is reached, expression of the proteins may be induced and the recombinant protein harvested immediately before death occurs.
The expressed protein is isolated and purified in accordance with conventional procedures, such as extraction, precipitation, gel filtration chromatography, affinity chromatography, electrophoresis, or the like.
The mIch-2, Ich-1, and Ich-3 sequences, obtained through the methods above, will provide sequences which not only encode proteins but which also provide for transcription of mIch-2, Ich-1, and Ich-3 antisense RNA; the antisense DNA sequence will be that sequence found on the opposite strand of the strand transcribing the mRNA. The antisense DNA strand may also be operably linked to a promoter in an expression vector such that transformation with this vector results in a host capable of expression of the antisense RNA in the transformed cell. Antisense DNA and RNA may be used to interact with endogenous mIch-2, Ich-1, or Ich-3 DNA or RNA in a manner which inhibits or represses transcription or translation of the genes in a highly specific manner. Use of antisense nucleic acid to block gene expression is discussed in Lichtenstein, Nature 333:801-802 (1988).
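Because the antisense sequence is the one found on the opposite strand, it can be derived from a sense-strand sequence by taking the reverse complement. The following is a minimal sketch; the input is an arbitrary short example rather than a sequence from the figures, and the function names are hypothetical.

```python
# Watson-Crick complements for DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the opposite-strand sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA complementary to the mRNA corresponding to sense_dna (T replaced by U)."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCTTGT"  # arbitrary example, sense (mRNA-like) strand
    print("sense DNA     :", sense)
    print("antisense DNA :", reverse_complement(sense))
    print("antisense RNA :", antisense_rna(sense))
```

Applying `reverse_complement` twice returns the original sequence, a quick sanity check when designing antisense oligonucleotides against a chosen region.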
Methods of Using
ced-3
The ced-3 gene (as well as ced-3 homologs and other members of the ced-3 gene family) may be used for a number of distinct purposes. First, portions of the gene may be used as a probe for identifying genes homologous to ced-3 in other strains of nematode (see Example 1) as well as in other species (see Examples 2 and ). Such probes may also be used to determine whether the ced-3 gene or homologs of ced-3 are being expressed in cells.
The cell death genes will be used in the development of therapeutic methods for diseases and conditions characterized by cell death. Among diseases and conditions which could potentially be treated are neural and muscular degenerative diseases, myocardial infarction, stroke, virally induced cell death and aging. The discovery that ced-3 is related to ICE suggests that cell death genes may play an important role in inflammation (IL-1β is known to be involved in the inflammatory response). Thus therapeutics based upon ced-3 and related cell death genes may also be developed.
mIch-2, Ich-1, and Ich-3
mIch-2, Ich-1, and Ich-3 will have the same uses as those described in connection with ced-3 (above) and ICE (see below). The gene sequences may be used to construct antisense DNA and RNA oligonucleotides, which, in turn, may be used to prevent programmed cell death in thymus or placental cells.
Techniques for inhibiting the expression of genes using antisense DNA or RNA are well-known in the art (Lichtenstein, Nature 333:801-802 (1988)).
Portions of the claimed DNA sequence may also be used as probes for determining the level of expression. Similarly the protein may be used to generate antibodies that can be used in assaying cellular expression.
Portions of the mIch-2, Ich-1, and Ich-3 genes described above may be used for determining the level of expression of the proteins (mIch-2 in thymus or placental cells as well as in other tissues and organs). Such methods may be useful in determining if these cells have undergone a neoplastic transformation.
Probes based upon the gene sequences may be used to isolate similar genes involved in cell death. A portion of the gene may be used in homologous recombination experiments to repair defective genes in cells or, alternatively, to develop strains of mice that are deficient in the gene. Antisense constructs may be transfected into cells, according to the native cellular expression patterns of each gene (placental or thymus cells for mIch-2, for example) in order to develop cells which may be maintained in culture for an extended period of time or indefinitely. Alternatively, antisense constructs may be used in cell culture or in vivo to block cell death.
The protein may be used for the purpose of generating polyclonal or monoclonal antibodies using standard techniques well known in the art (Klein, Immunology: The Science of Cell-Noncell Discrimination, John Wiley & Sons, N.Y. (1982); Kennett et al., Monoclonal Antibodies, Hybridoma: A New Dimension in Biological Analyses, Plenum Press, N.Y. (1980); Campbell, A., "Monoclonal Antibody Technology," In: Laboratory Techniques in Biochemistry and Molecular Biology 13, Burdon et al. eds., Elsevier, Amsterdam (1984); Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y. (1988)). Such antibodies may be used in assays for determining the expression of the genes. Purified protein would serve as the standard in such assays.
Based upon the sequences of Figure 6A-6C, probes may be used to determine whether the mIch-2 gene or homologs of mIch-2 are being expressed in cells. Such probes may be utilized in assays for correlating mIch-2 expression with cellular conditions, e.g. neoplastic transformation, as well as for the purpose of isolating other genes which are homologous to mIch-2.
mIch-2 will be used in the development of therapeutic methods for diseases and conditions characterized by cell death. The diseases and conditions which could potentially be treated include neural and muscular degenerative diseases, myocardial infarction, stroke, virally induced cell death and aging.
Antisense nucleic acids based upon the sequences shown in Figure 6A-6C may be used to inhibit mIch-2 expression. Such inhibition will be useful in blocking cell death in cultured cells.
mIch-2 may be used to generate polyclonal or monoclonal antibodies using methods well known in the art (see above). The antibodies may be used in assays for determining the expression of mIch-2. Purified mIch-2 would serve as the standard in such assays.
Based upon the sequences of Figures 10A-1OB and 10C-10E, probes may be used to determine whether the Ich-1 gene or homologs of Ich-1 are being expressed in cells. Such probes may be utilized in assays for correlating 15 Ich-1 expression with cellular conditions, e.g. neoplastic transformation, as well as for the purpose of isolating other genes which are homologous to Ich-1.
Ich-1 will be used in the development of therapeutic methods for diseases and conditions characterized by cell death. The diseases and conditions which could potentially be treated include neural and muscular degenerative diseases, myocardial infarction, stroke, virally induced cell death and aging.
Antisense nucleic acids based upon the sequences shown in Figures 10A and 10B may be used to inhibit Ich-1 expression. Such inhibition will be useful in blocking cell death in cultured cells.
Ich-1 protein may be used to generate polyclonal or monoclonal antibodies using methods well known in the art (see above). The antibodies may be used in assays for determining the expression of Ich-1. Purified Ich-1 would serve as the standard in such assays.
Based upon the sequence of Figure 14, probes may be used to determine whether the Ich-3 gene or homologs of Ich-3 are being expressed in cells. Such probes may be utilized in assays for correlating Ich-3 expression with cellular conditions, e.g. neoplastic transformation, as well as for the purpose of isolating other genes which are homologous to Ich-3.
Ich-3 will be used in the development of therapeutic methods for diseases and conditions characterized by cell death. The diseases and conditions which could potentially be treated include neural and muscular degenerative diseases, myocardial infarction, stroke, virally induced cell death and aging.
Antisense nucleic acids based upon the sequence shown in Figure 14 may be used to inhibit Ich-3 expression. Such inhibition will be useful in blocking cell death in cultured cells.
Ich-3 protein may be used to generate polyclonal or monoclonal antibodies using methods well known in the art (see above). The antibodies may be used in assays for determining the expression of Ich-3. Purified Ich-3 would serve as the standard in such assays.
Method for Preventing Programmed Cell Death in Vertebrate Cells by Inhibiting the Enzymatic Activity of ICE
The present invention is directed to preventing the programmed death of vertebrate cells by inhibiting the action of ICE. The detailed structural analysis performed on the ced-3 gene from C. elegans revealed a homology to human and murine ICE which is especially strong at the QACRG active domain of hICE (see Figure 3A-3B).
In order to determine if ICE functions as a cell death gene in vertebrates, the mICE gene was cloned, inserted into an expression vector and then transfected into rat cells. A close correlation was found between ICE expression and cell death (see Example 2).
Further support for the function of ICE as a cell death gene was obtained from inhibition studies. In order to determine whether cell death can be prevented by inhibiting the enzymatic action of ICE, cell lines were established which produced a high level of crmA. When these cells were transfected with mICE, it was found that a large percentage of the cells expressing mICE maintained a healthy morphology and did not undergo programmed cell death (Example 2 herein).
Evidence that ICE has a physiological role as a vertebrate cell death gene was also obtained by examining cells engineered to over-express bcl-2, an oncogene known to inhibit programmed cell death and to be overexpressed in many follicular and B cell lymphomas. It was found that cells expressing bcl-2 did not undergo cell death despite the high levels of ICE expression (Example 2 herein). These results suggest that bcl-2 may promote malignancy by inhibiting the action of ICE.
Any method of specifically regulating the action of ICE in order to control programmed cell death in vertebrates is encompassed by the present invention. This would include using not only inhibitors specific to ICE, e.g. crmA, or the inhibitors described by Thornberry et al. (Nature 356:768-774 (1992)), but also any method which specifically prevented the expression of the ICE gene. Thus, antisense RNA or DNA comprised of nucleotide sequences complementary to ICE and capable of inhibiting the transcription or translation of ICE are within the scope of the invention (see Lichtenstein, Nature 333:801-802 (1988)).
The ability to prevent vertebrate programmed cell death is of use in developing cells which can be maintained for an indefinite period of time in culture. For example, cells over-expressing crmA may be used as hosts for expressing recombinant proteins. The ability to prevent programmed cell death may allow cells to live independent of normally required growth factors. It has been found that microinjecting crmA mRNA or a crmA-expressing nucleic acid construct into cells allows chicken sympathetic neurons to live in vitro after the removal of neural growth factor.
Method for Promoting Programmed Cell Death in Vertebrate Cells by Increasing or Inducing the Activity of ICE
The expression of ICE can be increased in order to cause programmed cell death. For example, homologous recombination can be used to replace a defective region of an ICE gene with its normal counterpart. The levels of regulation amenable to manipulation to either increase or decrease the expression of ICE include DNA, RNA, and protein. Genomic DNA, for example, can be mutated by introducing selected DNA sequences into the genome by homologous recombination. Any desired mutation can be introduced in vitro and, through gene replacement, either coding or regulatory sequences of the gene can be manipulated.
Extrachromosomal DNA with the appropriate gene sequence can also be introduced into cells to compete with the endogenous product. At the level of RNA, antisense RNA molecules can be introduced, as well as RNA having more or less affinity for the translational apparatus or greater or lesser tendency to be transcribed. At the level of protein, protein counterparts can be designed having a higher or lower activity.
In addition to direct regulation, and particularly an increase in gene expression of ICE, the possibility also exists for indirect regulation by regulating those cellular components that either induce or suppress the expression of ICE. The inventors have found, for example, that TNF-α induces a program of cell death via the activation of hICE genetic sequences. Thus, a further level of regulation is that of modulating the expression of TNF-α and its functional counterparts. Thus, any cellular component regulating programmed cell death by means of the ICE/ced-3 pathway can itself be regulated rather than directly regulating the ICE gene. Regulation can occur by any of the means discussed above, for example. The genes in the bcl-2 family, p53 and the genes that are regulated by p53 (such as p21), the proteins in the ras pathway (ras, raf, 14-3-3), Fas and the proteins in the cytotoxic T cell granules (such as granzyme B) may all directly and indirectly influence the activity of the ICE family. Accordingly, the regulation can also occur by means of any of these genes and others in such pathways. In this way, it may be possible to prevent the uncontrolled growth of certain malignant cells.
Methods of increasing ICE activity may be used to kill undesired organisms such as parasites. crmA is a viral protein that is important for cowpox infection. This suggests that the prevention of cell death may be important for successful infection and that the promotion of ICE expression may provide a means for blocking infection. Activation of ICE family genes may be used to eliminate cancerous cells or any other unwanted cells. Prevention of cell death by inactivating the ICE family of genes could prevent neuronal degenerative diseases, such as Alzheimer's disease and amyotrophic lateral sclerosis, and cell death associated with stroke, ischemic heart injury, and aging.
Having now generally described this invention, the same will be further described by reference to certain specific examples which are provided herein for purposes of illustration only and are not intended to be limiting unless otherwise specified. All references cited throughout the specification are incorporated by reference in their entirety.
Example 1
Experimental Procedures
General methods and strains
The techniques used for culturing C. elegans have been described by Brenner (Genetics 77:71-94 (1974)). All strains were grown at 20°C. The wild-type parent strains were C. elegans variety Bristol strain N2, Bergerac strain EM1002 (Emmons et al., Cell 32:55-65 (1983)), C. briggsae and C. vulgaris. The genetic markers used are listed below. These markers have been previously described (Brenner, Genetics 77:71-94 (1974); Hodgkin et al., Genetics, in The Nematode Caenorhabditis elegans (Wood et al. eds.) pp. 491-584, Cold Spring Harbor, New York (1988)). Genetic nomenclature follows the standard system (Horvitz et al., Mol. Gen. Genet. 175:129-133 (1979)).
LG I: ced-1(e1735); unc-54(r323)
LG IV: unc-31(e928), unc-30(e191), ced-3(n717, n718, n1040, n1129, n1163, n1164, n1165, n1286, n1949, n2426, n2430, n2433), unc-26(e205), dpy-4(e1166)
LG V: egl-1(n986); unc-76(e911)
LG X: dpy-3(e27)
Isolation of addit3.
RFLP mapping
Two cosmid libraries were used extensively in this work: a Sau3A I partial digest genomic library of 7000 clones in the vector pHC79 and a Sau3A I partial digest genomic library of 6000 clones in the vector pJB8 (Coulson et al., Proc. Natl. Acad. Sci. U.S.A. 83:7821-7825 (1986)).
Bristol (N2) and Bergerac (EM1002) DNA was digested with various restriction enzymes and probed with different cosmids to look for RFLPs.
nP33 is a HindIII RFLP detected by the "right" end of Jc8. The "right" end of Jc8 was made by digesting Jc8 with EcoRI and self-ligating. nP34 is a HindIII RFLP detected by the "left" end of Jc8. The "left" end of Jc8 was made by digesting Jc8 with SalI and self-ligating. nP36 and nP37 are both HindIII RFLPs detected by T10H5 and B0564, respectively.
Germ line transformation
The procedure used for microinjection basically follows that of A. Fire (Fire, EMBO J. 5:2673-2680 (1986)). Cosmid DNA was twice CsCl gradient purified. Miniprep DNA was used when deleted cosmids were injected and was prepared from 1.5 ml of overnight bacterial culture in superbroth.
Superbroth was prepared by combining 12 g Bacto tryptone, 24 g yeast extract, 8 ml 50% glycerol and 900 ml H2O. The mixture was autoclaved and then 100 ml of 0.17 M KH2PO4 and 0.72 M K2HPO4 were added. The bacterial culture was extracted by the alkaline lysis method as described in Maniatis et al. (Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY (1983)). DNA was treated with RNase A (37°C, 30 min) and then with protease K (55°C, 30 min). The preparation was phenol- and then chloroform-extracted, precipitated twice (first in 0.3 M Na acetate and second in 0.1 M K acetate, pH [illegible]) and resuspended in 5 µl injection buffer as described by A. Fire (Fire, EMBO J. 5:2673-2680 (1986)). The DNA concentration for injection was in the range of 100 µg to 1 mg per ml.
All transformation experiments used the ced-1(e1735); unc-31(e928) ced-3(n717) strain. unc-31 was used as a marker for co-transformation (Kim et al., Genes Dev. 4:357-371 (1990)). ced-1 was present to facilitate scoring of the ced-3 phenotype. The mutations in ced-1 block the engulfment process of cell death, which makes the corpses of the dead cells persist much longer than in the wild-type (Hedgecock et al., Science 220:1277-1280 (1983)).
The ced-3 phenotype was scored as the number of dead cells present in the head of young L1 animals. The cosmid C10D8 or the plasmid subclones of C10D8 were mixed with C14G10 (unc-31(+)-containing) at a ratio of 2:1 or 3:1 to increase the chances that an Unc-31(+) transformant would contain the cosmid or plasmid being tested. Usually, 20-30 animals were injected in one experiment. Non-Unc F1 progeny of injected animals were isolated three to four days later. About 1/2 to 1/3 of the non-Unc progeny transmitted the non-Unc phenotype to F2 and established a line of transformants. The young L1 progeny of such non-Unc transformants were checked for the number of dead cells present in the head using Nomarski optics.
Determination of ced-3 transcript initiation site
Two primers, Pex1 (SEQ ID NO: 61) (5'GTTGCACTGCTTTCACGATCTCCCGTCTCT3') and Pex2 (SEQ ID NO: 62) (5'TCATCGACTTTTAGATGACTAGAGAACATC3'), were used for primer extension. The primers for RT-PCR were SL1 (SEQ ID NO: 63) (5'GTTTAATTACCCAAGTTTGAG3') and log-5 (SEQ ID NO: 64) (5'CCGGTGACATTGGACACTC3'). The products were reamplified using the primers SL1 and oligo10 (SEQ ID NO: 65) (5'ACTATTCAACACTTG3'). A product of the expected length was cloned into the PCR1000 vector (Invitrogen) and sequenced.
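As a quick sanity check on the oligonucleotides listed above, the following minimal Python sketch reports each primer's length, GC fraction and reverse complement. The sequences are copied from the text; the helper names (`reverse_complement`, `gc_fraction`, `summarize`) are illustrative assumptions and not part of the specification.

```python
# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "Pex1": "GTTGCACTGCTTTCACGATCTCCCGTCTCT",
    "Pex2": "TCATCGACTTTTAGATGACTAGAGAACATC",
    "SL1": "GTTTAATTACCCAAGTTTGAG",
    "log-5": "CCGGTGACATTGGACACTC",
    "oligo10": "ACTATTCAACACTTG",
}

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Map primer name -> (length, GC fraction, reverse complement)."""
    return {
        name: (len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc, rc) in summarize(PRIMERS).items():
        print(f"{name}: {length} nt, GC {gc}, reverse complement 5'{rc}3'")
```

The reverse complement is the strand a primer anneals to, which is useful when locating a primer site on a genomic sequence printed in the opposite orientation.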
Determination and analysis of DNA sequence
For DNA sequencing, serial deletions were made according to a procedure developed by Henikoff (Henikoff, Gene 28:351-359 (1984)).
DNA sequences were determined using Sequenase and protocols obtained from US Biochemicals with minor modifications.
The ced-3 amino acid sequence was compared with amino acid sequences in the GenBank, PIR and SWISS-PROT databases at the National Center for Biotechnology Information (NCBI) using the blast network service.
Cloning of ced-3 genes from other nematode species
C. briggsae and C. vulgaris ced-3 genes were isolated from corresponding phage genomic libraries using the ced-3 cDNA subclone pJ118 insert as a probe under low stringency conditions (5xSSPE, 20% formamide, 0.02% Ficoll, 0.02% BSA, 0.02% polyvinylpyrrolidone, and 1% SDS) at [temperature illegible] overnight, and washed in 1xSSPE and 0.5% SDS twice at room temperature and twice at 42°C for 20 min each time.
Results
ced-3 is not essential for viability
All previously described ced-3 alleles were isolated in screens designed to detect viable mutants in which programmed cell death did not occur (Ellis et al., Cell 44:817-829 (1986)). Such screens might systematically have missed classes of ced-3 mutations that result in inviability. Since animals with the genotype of ced-3/deficiency are viable (Ellis et al., Cell 44:817-829 (1986)), a noncomplementation-screening scheme was designed that would allow the isolation of recessive lethal alleles of ced-3. Four new ced-3 alleles (n1163, n1164, n1165, and n1286) were obtained which were viable as homozygotes.
These new alleles were isolated at a frequency of about 1 in 2500 mutagenized haploid genomes, approximately the frequency expected for the generation of null mutations in an average C. elegans gene (Brenner, Genetics 77:71-94 (1974); Meneely et al., Genetics 92:99-105 (1990); Greenwald et al., Genetics 96:147-160 (1980)).
These results suggest that animals lacking ced-3 gene activity are viable.
In support of this hypothesis, molecular analysis has revealed that three ced-3 mutations are nonsense mutations that prematurely terminate ced-3 translation and one alters a highly conserved splice acceptor site (see below). These mutations would be expected to eliminate ced-3 activity completely. Based upon these considerations, it was concluded that ced-3 gene activity is not essential for viability.
ced-3 is contained within a 7.5 kb genomic fragment
The ced-3 gene was cloned using the approach of Ruvkun et al. ("Molecular Genetics of the Caenorhabditis elegans Heterochronic Gene lin-14," Genetics 121:501-516 (1988)). Briefly (for further details, see Experimental Procedures), the C. elegans Bristol strain N2 contains 30 dispersed copies of the transposable element Tc1, whereas the Bergerac strain contains more than 400 copies (Emmons et al., Cell 32:55-65 (1983); Finney, Ph.D. Thesis, "The Genetics and Molecular Biology of unc-86, a Caenorhabditis elegans Cell Lineage Gene," Cambridge, MA (1987)). By crossing Bristol and Bergerac strains, a series of recombinant inbred strains were generated in which chromosomal material was mostly derived from the Bristol strain with varying amounts of Bergerac-specific chromosome IV-derived material in the region of the ced-3 gene. By probing DNA from these strains with plasmid pCe2001, which contains Tc1 (Emmons et al., Cell 32:55-65 (1983)), a 5.1 kb EcoRI Tc1-containing restriction fragment specific to the Bristol strain (restriction fragment length polymorphism nP35) and closely linked to ced-3 was identified.
Cosmids that contained this 5.1 kb restriction fragment were identified and it was found that these cosmids overlapped an existing cosmid contig that had been defined as part of the C. elegans genome project (Coulson et al., Proc. Natl. Acad. Sci. 83:7821-7825 (1986)). Four other Bristol-Bergerac restriction fragment length polymorphisms were defined by cosmids in this contig (nP33, nP34, nP36, nP37). By mapping these restriction fragment length polymorphisms with respect to the genes unc-30, ced-3 and unc-26, the physical contig was oriented with respect to the genetic map and the region containing the ced-3 gene was narrowed to an interval spanned by three cosmids (Fig. 1).
On Southern blot, three of three Berg unc-26 recombinants showed the Bristol nP33 pattern while two of two ced-3 Berg recombinants showed the Bergerac pattern (data not shown). Thus, nP33 maps very close to or to the right of unc-26. For nP34, two of two ced-3 Berg recombinants and two of three Berg unc-26 recombinants showed the Bergerac pattern; one of the three Berg unc-26 recombinants showed the Bristol pattern (data not shown).
The genetic distance between ced-3 and unc-26 is about 0.2 mu. Thus, nP34 maps between ced-3 and unc-26, about 0.1 mu to the right of ced-3. Similar experiments mapped nP35, the 5.1 kb Bristol-specific Tc1 element, to about 0.1 mu to the right of ced-3.
In order to map nP36 and nP37, Bristol unc-30 ced-3/+ males were crossed with Bergerac hermaphrodites. From the progeny of heterozygotes of genotype unc-30 ced-3 (Bristol)/+ (Bergerac), Unc-30/non-ced-3 and non-Unc-30/ced-3 animals were picked and DNA was prepared from these strains.
nP36 maps very close to or to the right of unc-30 since two of two unc-30 Berg recombinants showed the Bristol pattern and two of two Berg ced-3 recombinants showed the Bergerac pattern (data not shown). Similarly, nP37 maps very close to or to the right of unc-30 since four of four Berg ced-3 recombinants showed the Bergerac pattern and six of six unc-30 Berg recombinants showed the Bristol pattern (data not shown). These experiments narrowed the region containing the ced-3 gene to an interval spanned by the three cosmids (Fig. 1A).
Cosmids that were candidates for containing the ced-3 gene were microinjected (Fire, EMBO J. 5:2673-2680 (1986)) into ced-3 mutant animals to test for rescue of the mutant phenotype. Specifically, cosmid C14G10, which contains the wild-type unc-31 gene, and a candidate cosmid were coinjected into ced-1(e1735); unc-31(e928) ced-3(n717) hermaphrodites.
Non-Unc progeny were isolated and observed to see if the non-Unc phenotype was transmitted to the next generation, thus establishing a line of transformed animals. Young L1 progeny of such transformant lines were examined for the presence of cell deaths using Nomarski optics to see whether the ced-3 phenotype was complemented (see Experimental Procedures). Cosmid C14G10 alone does not confer ced-3 activity when injected into a ced-3 mutant.
unc-31 was used as a marker for co-transformation (Kim et al., Genes Devel. 4:357-371 (1990)). ced-1 was present to facilitate scoring of the ced-3 phenotype. Mutations in ced-1 block the engulfment process of programmed cell death, causing the corpses of dead cells to persist much longer than in the wild-type (Hedgecock et al., Science 220:1277-1280 (1983)). Thus, the presence of a corpse indicates a cell that has undergone programmed cell death. The ced-3 phenotype was scored as the number of corpses present in the head of young L1 animals.
As indicated in Fig. 1, of the three cosmids injected (C43C9, W07H6 and C48D1), only C48D1 rescued the ced-3 mutant phenotype. Both non-Unc transformed lines obtained, nIs1 and nEx2, were rescued. Specifically, L1 ced-1 animals contain an average of 23 cell corpses in the head, and L1 ced-1; ced-3 animals contain an average of 0.3 cell corpses in the head (Ellis et al., Cell 44:817-829 (1986)). By contrast, ced-1; unc-31 ced-3; nIs1 and ced-1; unc-31 ced-3; nEx2 animals contained an average of 16.4 and 14.5 cell corpses in the head, respectively. From these results, it was concluded that C48D1 contains the ced-3 gene.
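The corpse counts above can be put on a common scale. The sketch below expresses each transformed line's count as a fraction of the interval between the ced-1; ced-3 baseline (0.3) and the ced-1 reference (23). This "rescue fraction" is an illustrative metric assumed here for comparison; it is not a quantity defined in the specification.

```python
# Average cell corpses in the head of young L1 animals, as reported in the text.
CED1_REFERENCE = 23.0     # ced-1 animals (cell death intact)
CED3_BASELINE = 0.3       # ced-1; ced-3 animals (cell death blocked)
TRANSFORMED_LINES = {"nIs1": 16.4, "nEx2": 14.5}


def rescue_fraction(observed: float,
                    baseline: float = CED3_BASELINE,
                    reference: float = CED1_REFERENCE) -> float:
    """Fraction of the baseline-to-reference gap recovered (assumed metric)."""
    return (observed - baseline) / (reference - baseline)


if __name__ == "__main__":
    for line, corpses in TRANSFORMED_LINES.items():
        print(f"{line}: {corpses} corpses, rescue fraction "
              f"{rescue_fraction(corpses):.2f}")
```

On this scale both lines recover well over half of the reference corpse count, which is consistent with the conclusion that C48D1 carries ced-3 activity.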
To locate ced-3 more precisely within the cosmid C48D1, this cosmid was subcloned and the subclones tested for their ability to rescue the ced-3 mutant phenotype (Fig. 1A). From these experiments, ced-3 was localized to a DNA fragment of 7.5 kb.
A 2.8 kb ced-3 transcript is expressed primarily during embryogenesis and independently of ced-4 function
The 7.6 kb pJ107 subclone of C48D1 (Fig. 1A) was used as a probe in a northern blot of polyA+ RNA derived from the wild-type C. elegans strain N2. This probe hybridized to a 2.8 kb transcript. Although this transcript is present in 11 different EMS-induced ced-3 mutant strains, subsequent analysis has shown that all 11 tested ced-3 mutant alleles contain mutations in the genomic DNA that encodes this mRNA (see below), thus establishing this RNA as a ced-3 transcript.
The developmental expression pattern of ced-3 was determined by hybridizing a northern blot of RNA from animals at different stages of development with the ced-3 cDNA subclone pJ118 (see below). The ced-3 transcript was found to be most abundant during embryogenesis, when most programmed cell deaths occur, but was also detected during the L1 through L4 larval stages. It is present in relatively high levels in young adults.
Since ced-3 and ced-4 are both required for programmed cell death in C. elegans, and since both are highly expressed during embryonic development (Yuan et al., Dev. 116:309-320 (1992)), the possibility existed that one of the genes might regulate the mRNA level of the other. Previous studies have revealed that ced-3 does not regulate ced-4 mRNA levels (Yuan et al., Dev. 116:309-320 (1992)). To determine if ced-4 regulates ced-3 mRNA levels, a northern blot of RNA prepared from ced-4 mutant embryos was probed with the ced-3 cDNA subclone pJ118. It was found that the amount and size of the ced-3 transcript were normal in the ced-4 mutants n1162, n1416, n1894 and n1920. Thus, ced-4 does not appear to affect the steady-state levels of ced-3 mRNA.
ced-3 cDNA and genomic sequences
To isolate ced-3 cDNA clones, ced-3 genomic DNA pJ40 (Fig. 1A) was used as a probe to screen a cDNA library of the C. elegans wild-type strain N2 (Kim et al., Genes Dev. 4:357-371 (1990)). The 2.5 kb cDNA clone pJ87 was isolated in this way. On northern blots, pJ87 hybridized to a 2.8 kb transcript and on Southern blots, it hybridized only to bands to which pJ40 hybridizes (data not shown). Thus, pJ87 represents an mRNA transcribed entirely from pJ40, which can rescue the ced-3 mutant phenotype when microinjected into ced-3 mutant animals. To confirm that pJ87 contains the ced-3 cDNA, a frameshift mutation was made in the SalI site of pJ40 corresponding to the SalI site in the pJ87 cDNA. Constructs containing the frameshift mutation failed to rescue the ced-3 phenotype when microinjected into ced-3 mutant animals (6 transformant lines; data not shown), suggesting that ced-3 activity had been eliminated by mutagenizing a region of genomic DNA that corresponds to the pJ87 cDNA.
The DNA sequence of pJ87 is shown in Figure 2C. pJ87 contains an insert of 2482 bp with an open reading frame of 503 amino acids. It has 953 bp of 3' untranslated sequence, not all of which is essential for ced-3 expression; genomic constructs that do not contain 380 bp of the 3'-most region (pJ107 and its derivatives, see Fig. 1A) were capable of rescuing the ced-3 mutant phenotype. The cDNA ends with a poly-A sequence, suggesting that the complete 3' end of the ced-3 transcript is present.
To confirm the DNA sequence obtained from the ced-3 cDNA and to study the structure of the ced-3 gene, the genomic sequence of the ced-3 gene from the plasmid pJ107 was determined. The insert in pJ107 is 7656 bp in length (Fig. 2A-2F).
To determine the location and nature of the 5' end of the ced-3 transcript, a combination of primer extension and amplification using the polymerase chain reaction (PCR) was used. Two primers, Pex1 and Pex2, were used for primer extension. The Pex1 reaction yielded two major bands, whereas the Pex2 reaction gave one band. The Pex2 band corresponds in size to the smaller band from the Pex1 reaction, and agrees in length with a possible transcript that is trans-spliced to a C. elegans splice leader (Bektesh et al., Genes Dev. 2:1277-1283 (1988)) at a consensus splice acceptor at position 2166 of the genomic sequence. The nature of the larger Pex1 band is unclear.
To confirm these observations, wild-type total RNA was reverse-transcribed and then amplified using the primers SL1 and log-5 followed by reamplification using the primers SL1 and oligo10. A product of the expected length was cloned into the PCR1000 vector (Invitrogen) and sequenced. The sequence obtained confirmed the presence of a ced-3 message trans-spliced to SL1 at position 2166 of the genomic sequence. These experiments suggest that a ced-3 transcript is trans-spliced to the C. elegans splice leader SL1 (Bektesh et al., Genes Dev. 2:1277-1283 (1988)) at a consensus splice acceptor at position 2166 of the genomic sequence. Based upon these observations, it is concluded that the start codon of ced-3 is the methionine encoded at position 2232 of the genomic sequence and that ced-3 is 503 amino acids in length.
The predicted ced-3 protein is hydrophilic (256/503 residues are charged or polar) and does not contain any obvious potential trans-membrane domains.
One region of ced-3 is rich in serines: from amino acid 107 to amino acid 205, 32 of 99 amino acids are serine residues.
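The serine-richness figure quoted above is simple to reproduce. The sketch below checks that residues 107 to 205 inclusive span 99 positions and that 32 serines correspond to roughly 32% of that window; it also gives a small helper for measuring residue content in any 1-based inclusive window. The helper name and the toy sequence in the usage line are assumptions for illustration; the ced-3 sequence itself is not reproduced here.

```python
def residue_fraction(seq: str, residue: str, start: int, end: int) -> float:
    """Fraction of `residue` within seq[start..end], 1-based and inclusive."""
    window = seq[start - 1:end]
    return window.count(residue) / len(window)


# Figures quoted in the text for the serine-rich region of ced-3.
REGION_START, REGION_END = 107, 205
REGION_LENGTH = REGION_END - REGION_START + 1   # inclusive span
REPORTED_SERINES = 32
REPORTED_FRACTION = REPORTED_SERINES / REGION_LENGTH

if __name__ == "__main__":
    print(f"window length: {REGION_LENGTH} residues")
    print(f"serine fraction: {REPORTED_FRACTION:.1%}")
    # Toy usage on a made-up peptide (not ced-3): residues 2-5 of "MSSAS".
    print(residue_fraction("MSSAS", "S", 2, 5))
```

For comparison, a residue occurring at a uniform 1-in-20 frequency would make up about 5% of a window, so a serine content near 32% is a strong compositional bias.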
The sequences of 12 EMS-induced ced-3 mutations (Table 1) were determined. Eight are missense mutations, three are nonsense mutations, and one alters a conserved G at the splice acceptor site of intron 6. Interestingly, nine of these 12 mutations alter residues within the last 100 amino acids of the protein, and none occurs within the serine-rich region.
Table 1. Sites of mutations in the ced-3 gene

Allele          Mutation   Nucleotide   Codon   Consequence
n717            G to A     6297                 Altered splicing
n718            G to A     2487         65      G to R
n1040           C to T     2310         27      L to F
n1129, n1164    C to T     6434         449     A to V
n1163           C to T     7020         486     S to F
n1165           C to T     5940         403     Nonsense
n1286           G to A     6371         428     Nonsense
n1949           C to T     6222         412     Nonsense
n2426           G to A     6535         483     E to K
n2430           C to T     6485         466     A to V
n2433           G to A     5757         360     G to S

Nucleotide and codon positions correspond to the numbering in Fig. 2A-2F.
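The class counts stated in the text (eight missense, three nonsense, one splice-site mutation) can be checked against Table 1 with a short tally. The sketch below transcribes the table rows into Python; it assumes that the garbled row listing two alleles refers to n1129 and n1164 sharing the same C to T change, and the `classify` helper is a hypothetical name introduced only for this illustration.

```python
from collections import Counter

# (allele, nucleotide position, codon or None, consequence), from Table 1.
# Assumption: n1129 and n1164 share the same mutation at nucleotide 6434.
TABLE_1 = [
    ("n717", 6297, None, "Altered splicing"),
    ("n718", 2487, 65, "G to R"),
    ("n1040", 2310, 27, "L to F"),
    ("n1129", 6434, 449, "A to V"),
    ("n1164", 6434, 449, "A to V"),
    ("n1163", 7020, 486, "S to F"),
    ("n1165", 5940, 403, "Nonsense"),
    ("n1286", 6371, 428, "Nonsense"),
    ("n1949", 6222, 412, "Nonsense"),
    ("n2426", 6535, 483, "E to K"),
    ("n2430", 6485, 466, "A to V"),
    ("n2433", 5757, 360, "G to S"),
]


def classify(consequence: str) -> str:
    """Bucket a Table 1 consequence into missense, nonsense or splice."""
    if consequence == "Nonsense":
        return "nonsense"
    if consequence == "Altered splicing":
        return "splice"
    return "missense"  # any "X to Y" amino acid substitution


COUNTS = Counter(classify(row[3]) for row in TABLE_1)

if __name__ == "__main__":
    print(len(TABLE_1), "alleles:", dict(COUNTS))
```

Keeping the table as structured rows also makes it easy to sort alleles by codon position when comparing them with conserved regions of the protein.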
To identify functionally important regions of ced-3, the genomic sequences of the ced-3 genes from the related nematode species C. briggsae and C. vulgaris were cloned and sequenced. Sequence comparison of the three ced-3 genes showed that the relatively non-serine-rich regions of the proteins are more conserved than are serine-rich regions (Fig. 3A-3B). All 12 EMS-induced ced-3 mutations altered residues that are conserved among the three species.
These results suggest that the non-serine-rich region is important for ced-3 function and that the serine rich region is either unimportant or that residues within it are functionally redundant.
ced-3 protein is similar to the mammalian ICE and nedd-2 proteins
A search of the GenBank, PIR and SWISS-PROT databases revealed that the non-serine-rich regions of ced-3 are similar to hICE and mICE (Fig. 3A).
The most highly conserved region among the proteins shown in Figure 3A consists of amino acids 246-360 of ced-3 and amino acids 166-287 of hICE: 49 residues are identical (43% identity). The active site cysteine of hICE is located at cysteine 285 (Thornberry et al., Nature 356:768-774 (1992)). The five-amino-acid peptide (QACRG) around this active cysteine is the longest conserved peptide among mICE, hICE and ced-3.
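The 43% figure follows from 49 identical residues over the 115-residue ced-3 segment (amino acids 246-360). A minimal sketch of a percent-identity calculation over two pre-aligned, equal-length sequences is given below. Using the ced-3 segment length as the denominator is an assumption inferred from the quoted numbers, and the function name and toy peptides are hypothetical; the actual alignment is the one shown in Figure 3A.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues ('-' is a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Figures quoted in the text for the most conserved ced-3/hICE region.
IDENTICAL_RESIDUES = 49
CED3_SEGMENT_LENGTH = 360 - 246 + 1          # residues 246-360, inclusive
REPORTED_IDENTITY = round(100 * IDENTICAL_RESIDUES / CED3_SEGMENT_LENGTH)

if __name__ == "__main__":
    print(f"segment length: {CED3_SEGMENT_LENGTH}")
    print(f"identity: {REPORTED_IDENTITY}%")
    # Toy usage: the conserved pentapeptide against a one-residue variant.
    print(percent_identity("QACRG", "QACKG"))
```

Different tools normalize identity by alignment length, by the shorter sequence, or by ungapped columns, so the denominator should always be stated alongside the percentage.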
hICE is composed of two subunits (p20 and p10) that appear to be proteolytically cleaved from a single proenzyme to the mature enzyme (Thornberry et al., Nature 356:768-774 (1992)). Two cleavage sites in the proenzyme, Asp-Ser at positions 103 and 297 of hICE, are conserved in ced-3 (position 131 and 371, respectively).
The C-terminal portion of ced-3 and the p10 subunit of hICE are similar to the protein product of the murine nedd-2 gene. ced-3, nedd-2 and hICE are 27% identical (Fig. 3A-3B). nedd-2 does not contain the QACRG peptide at the active site of hICE and mICE (Fig. 3A). Seven of eight point mutations that were analyzed (n718, n1040, n1129, n1164, n2430, n2426, n2433) result in alterations of amino acids that are conserved or semi-conserved among the three nematode ced-3 proteins, hICE and nedd-2. In particular, the mutation n2433 introduces a Gly to Ser change near the putative active cysteine (Fig. 2A-2F, Table 1).
Discussion
The genes ced-3 and ced-4 are the only genes known to be required for programmed cell death to occur in C. elegans (Ellis et al., Cell 44:817-829 (1986)). Genetic and molecular studies have revealed that the ced-3 gene shares a number of features with ced-4 (see Yuan et al., Dev. 116:309-320 (1992)).
Like ced-4, ced-3 is not required for viability. It appears to contain the sequence for a single mRNA which is expressed mostly in the embryo, the stage during which most programmed cell death occurs. Furthermore, just as ced-3 gene function is not required for ced-4 gene expression (Yuan et al., Dev. 116:309-320 (1992)), ced-4 gene function is not required for ced-3 gene expression. Thus, these two genes do not appear to control the onset of programmed cell death by acting sequentially in a regulatory transcriptional cascade. Unlike ced-4 (Yuan et al., Dev. Biol. 138:33-41 (1992)), ced-3 is expressed at a substantial level in young adults. This observation suggests that ced-3 expression is not limited to cells undergoing programmed cell death.
The ced-4 amino acid sequence is novel. Two regions show similarity to the EF-hand motif, which binds calcium (Yuan et al., Dev. 116:309-320 (1992)). For this reason it has been suggested that ced-4 protein and hence, programmed cell death in C. elegans, might be regulated by calcium. ced-3 contains a region of 99 amino acids that contains 32 serines. Since serines are common phosphorylation sites (Edelman et al., Ann. Rev. Biochem. 56:567-613 (1987)), ced-3 and hence, programmed cell death in C. elegans, may be regulated by phosphorylation. Phosphorylation has previously been suggested to function in cell death (McConkey et al., J. Immunol. 145:1227-1230 (1990)).
McConkey et al. have shown that several agents that elevate cytosolic cAMP level induce thymocyte death. This suggests that protein kinase A may mediate cell death by phosphorylating certain proteins. Although the precise sequence of the serine-rich region varies among the three Caenorhabditis species studied, the relatively high number of serines is conserved in C. elegans, C. briggsae and C. vulgaris. None of the mutations in ced-3 affect the serine-rich region.
These observations are consistent with the hypothesis that the presence of serines is more important than the precise amino acid sequence within this region.
Much more striking than the presence of the serine-rich region in ced-3 is the similarity between the non-serine-rich regions of ced-3 and hICE and mICE.
The carboxy half of ced-3 is the region that is the most similar to ICE.
A stretch of 115 residues (amino acids 246-360 of ced-3) is 43% identical between ced-3 and hICE. This region in ICE contains a conserved pentapeptide QACRG (positions 361-365 of ced-3), which surrounds the active cysteine.
Specific modification of this cysteine in hICE results in complete loss of activity (Thornberry et al., Nature 356:768-774 (1992)). The ced-3 mutation n2433 alters the conserved glycine in this pentapeptide and eliminates ced-3 function. This suggests that this glycine is important for ced-3 activity and is an integral part of the active site of ICE. Interestingly, while the mutations n718 (position 67 of ced-3) and n1040 (position 27 of ced-3) eliminate ced-3 function in vivo, they alter conserved residues that lie outside of the mature subunit of hICE (Thornberry et al., Nature 356:768-774 (1992)). These residues may have a non-catalytic role in both ced-3 and ICE function, e.g., they may maintain a proper conformation for proteolytic activation. The hICE precursor (p45) is proteolytically cleaved at 4 sites (Asp103, Asp119, Asp297 and Asp316) to generate p24, p20, and p10 (Thornberry et al., Nature 356:768-774 (1992)). At least two of the cleavage sites are conserved in ced-3. This indicates that the ced-3 protein is processed.
The similarity between ced-3 and ICE suggests that ced-3 functions as a cysteine protease, controlling programmed cell death by proteolytically activating or inactivating a substrate protein. A substrate for ced-3 could be the product of the ced-4 gene, which contains 6 Asp residues. These could be the target of ced-3 (Asp25, Asp151, Asp185, Asp192, Asp459 and Asp541).
Alternatively, ced-3 could directly cause cell death by proteolytically cleaving certain proteins or subcellular structures that are crucial for cell viability.
ced-3 and ICE are part of a novel protein family. Thornberry et al. suggested that the sequence GDSPG at position 287 of hICE resembles a GX(S/C)XG motif found in serine and cysteine protease active sites (Nature 356:768-774 (1992)). However, in the three nematode ced-3 proteins examined, only the first glycine is conserved, and in mICE, the S/C is not present. This suggests that the ced-3/ICE family shares little sequence similarity with known protease families.
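The GX(S/C)XG comparison can be made concrete with a short pattern search. The following is a minimal sketch in Python and is not part of the original disclosure. The function name `find_gxsxg` and the flanking residues in the example peptide are illustrative assumptions; only the pentapeptides GDSPG and QACRG are taken from the text.

```python
import re

# GX(S/C)XG: Gly, any residue, Ser or Cys, any residue, Gly.
# The lookahead wrapper lets overlapping occurrences be reported too.
_MOTIF = re.compile(r"(?=(G.[SC].G))")


def find_gxsxg(protein):
    """Return (1-based start, pentapeptide) for every GX(S/C)XG occurrence."""
    return [(m.start() + 1, m.group(1)) for m in _MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical flanking residues around the hICE pentapeptide GDSPG.
    print(find_gxsxg("AAGDSPGAA"))
    # The active-site pentapeptide QACRG does not fit the pattern.
    print(find_gxsxg("QACRG"))
```

Applied to full-length sequences, a search of this kind would show at a glance which family members retain the motif and which, like mICE, lack the S/C position.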
The similarity between ced-3 and ICE suggests not only that ced-3 functions as a cysteine protease, but also that ICE functions in programmed cell death in vertebrates. Thus, it has been observed that after murine peritoneal macrophages are stimulated with lipopolysaccharide (LPS) and induced to undergo programmed cell death by exposure to extracellular ATP, mature active IL-1β is released into the culture supernatant. In contrast, when cells are injured by scraping, IL-1β is released exclusively as the inactive proform (Hogquist et al., Proc. Natl. Acad. Sci. USA 88:8485-8489 (1991)). These results suggest that ICE is activated upon induction of programmed cell death. ICE transcript has been detected in cells that do not make IL-1β (Cerretti et al., Science 256:97-100 (1992)), suggesting that other ICE substrates exist. This suggests that ICE could mediate programmed cell death by cleaving a substrate other than IL-1β.
The carboxy-terminal portions of both ced-3 and the p10 subunit of hICE are similar to the protein encoded by nedd-2. Since nedd-2 lacks the QACRG active domain, it might function to regulate ICE or ICE-like subunits. Interestingly, four ced-3 mutations alter residues conserved between nedd-2 and ced-3. Further, nedd-2 gene expression is high during embryonic brain development, when much programmed cell death occurs. These observations suggest that nedd-2 functions in programmed cell death.
The C. elegans gene ced-9 protects cells from undergoing programmed cell death by directly or indirectly antagonizing the activities of ced-3 and ced-4 (Hengartner et al., Nature 356:494-499 (1992)). bcl-2 also affects the onset of apoptotic cell death. Thus, if hICE or another ced-3/ICE family member is involved in vertebrate programmed cell death, bcl-2 might act by modulating its activity. The fact that bcl-2 is a dominant oncogene suggests that hICE and other ced-3/ICE family members might be recessive oncogenes. The elimination of such cell death genes would prevent normal cell death and promote malignancy, just as overexpression of bcl-2 does.
Example 2
The mouse ICE homolog (mICE) was cloned from a mouse thymus cDNA library (Stratagene) by low stringency hybridization using hICE as a probe.
The clone is identical to the clone isolated by Nett et al. (J. Immun. 149:3245-3259 (1992)) except that base pair 166 is an A and encodes Asn rather than Asp. This may be a DNA polymorphism because the clone was derived from a B6/CBAF1J (C57Black x CBA) strain cDNA library (Stratagene), while the Nett et al. clone was derived from a WEHI3 cDNA library (Stratagene).
Subsequent experiments have shown that this variation is not in a region essential for ICE function (see below).
A transient expression system was developed to determine if overexpression of mICE kills cells. mICE cDNA was fused with the E. coli lacZ gene and placed under the control of the chicken β-actin promoter (Fig. 4). To test the function of the subunits, P20 and P10, which are processed from a precursor peptide, two additional fusion genes were made (P20/P10-lacZ and P10-lacZ).
The constructs, shown in Fig. 4, were transfected into Rat-1 cells by calcium phosphate precipitation. 24 hours after transfection, cells were fixed and X-gal was added. Healthy living rat cells are flat and well-attached to plates, while dying cells are round and often float into the medium. After 3 hours of color development, most blue cells transfected with intact mICE-lacZ or P20/P10-lacZ were round. However, most blue cells transfected with P10-lacZ or the control lacZ construct were normal flat cells (Table 2). Similar results were obtained with NG108-15 neuronal cells (not shown).
Table 2. Overexpression of mICE causes Rat-1 cells to undergo programmed cell death

The constructs shown in Fig. 4 were transiently transfected into Rat-1 cells, Rat-1 cells expressing bcl-2 (Rat-1/bcl-2) or Rat-1 cells expressing crmA (Rat-1/crmA). 24 hrs after transfection, cells were fixed and stained with X-gal for 3 hrs. The data shown are the percentage of round blue cells among total number of blue cells. The data were collected from at least three different experiments.

Construct       Rat-1           Rat-1/bcl-2     Rat-1/crmA
pβactβgal'      1.44 ± 0.18     2.22 ± 0.53     2.89 ± 0.79
pβactM10Z       80.81 ± 2.33    9.91 ± 2.08     18.83 ± 2.86
pβactM11Z       93.33 ± 2.68    13.83 ± 4.23    24.48 ± 2.78
pβactM19Z       2.18 ± 0.54
pβactM12Z       2.44 ± 0.98     3.33 ± 1.45     2.55 ± 0.32
pβactM17Z       2.70 ± 1.07
pJ485           1.32 ± 0.78
pβactced38Z     46.73 ± 4.65    35.28 ± 1.36    34.40 ± 2.38
pβactced37Z     3.67 ± 1.39

Methods:
a: Construction of crmA expression vector (pJ415): pJ415 was constructed by first inserting the 400 bp BglII/BamHI crmA fragment into the BamHI site of the pBabe/puro vector and then inserting the remaining 1 kb BamHI crmA fragment into the 3' BamHI site in the sense direction.
b: Construction of the bcl-2 expression vector (pJ436): pJ436 was constructed by inserting an EcoRI/SalI bcl-2 fragment into the EcoRI/SalI sites of the pBabe/puro vector.
c: Establishing Rat-1 cell lines that overexpress crmA and bcl-2: pJ415 and pJ436 were electroporated into ΨCRE retroviral packaging cells (Danos et al., Proc. Natl. Acad. Sci. (USA) 85:6460-6464 (1988)) using a BioRad electroporating apparatus. Supernatant either from overnight transiently transfected ΨCRE cells or from stable lines of ΨCRE cells expressing either crmA or bcl-2 was used to infect Rat-1 cells overnight in the presence of 8 µg/ml of polybrene. Resistant cells were selected using 30 µg/ml puromycin for about 10 days. Resistant colonies were cloned and checked for expression using both Northern and Western blots. Bcl-2 antibodies were from S.J. Korsmeyer and from DAKO.
crmA antiserum was made by immunizing rabbits with an E. coli-expressed crmA fusion protein (pJ434). pJ434 was made by inserting an EcoRI/SalI fragment of crmA cDNA into the EcoRI/SalI sites of pET21a (Novagen), and the fusion protein was expressed in the E. coli BL21 (DE3) strain. Multiple lines that express either bcl-2 or crmA were checked for suppression of mICE-induced cell death and all showed similar results.
When cells were stained with rhodamine-coupled anti-β-galactosidase antibody and Hoechst dye, it was found that β-galactosidase-positive round cells had condensed, fragmented nuclei. Such nuclei are indicative of programmed cell death. When observed with an electron microscope, the X-gal reaction product was electron dense, allowing cells expressing mICE-lacZ to be distinguished from other cells (Snyder et al., Cell 68:33-51 (1992)). The cells expressing the chimeric gene showed condensed chromatin and membrane blebbing. These are characteristics of cells undergoing programmed cell death (Wyllie, in Cell Death in Biology and Pathology, 9-34 (1981); Oberhammer et al., Proc. Natl. Acad. Sci. U.S.A. 89:5408-5412 (1992); Jacobson et al., Nature 361:365-369 (1993)). Thus, the results indicate that overexpression of mICE induces programmed cell death and that induction depends on both P20 and P10 subunits.
When color development in Rat-1 cells transfected with mICE-lacZ or P20/P10-lacZ is allowed to proceed for 24 hours, a greater number of flat cells become blue. This result indicates that cells tolerate lower levels of ICE activity.
If mICE is a vertebrate homolog of ced-3, then ced-3 might also be expected to cause cell death in vertebrates. This hypothesis was tested by making a ced-3-lacZ fusion construct and examining its ability to cause cell death using the assay as described above. As expected, the expression of ced-3 caused the death of Rat-1 cells (Table 2).
If mICE functions in a similar way to ced-3, mutations eliminating ced-3 activity in C. elegans should also eliminate its activity in vertebrates. This hypothesis was tested by mutating the Gly residue in the pentapeptide active domain of hICE, QACRG, to Ser. It was found that this mutation eliminated the ability of both mICE and ced-3 to cause the death of Rat-1 cells (Table 2).
crmA specifically inhibits ICE activity (Ray et al., above). To demonstrate that cell death associated with overexpression of mICE is due to the enzymatic activity of mICE, Rat-1 cells were infected with a pBabe retroviral construct (Morgenstern et al., Nucl. Acids Res. 18:3587-3596 (1990)) expressing crmA, and cell lines that produce high levels of crmA were identified.
When the mICE-lacZ construct was transfected into these cell lines, it was found that a large percentage of blue cells had a healthy, flat morphology (Table 2). In addition, a point mutation that changes the Cys residue in the active site pentapeptide, QACRG, to a Gly eliminates the ability of mICE to cause cell death (construct pβactM17Z, Figure 4, Table 2). This result indicates that the proteolytic activity of mICE is essential to its ability to kill cells.
bcl-2 can also prevent or inhibit cell death (Vaux et al., Nuñez et al., Strasser et al., Sentman et al., above). Rat-1 cells were infected with the pBabe retroviral construct expressing bcl-2. Transfection of the mICE-lacZ fusion construct into the cell lines overexpressing bcl-2 showed that a high percentage of blue cells were now healthy (Table 2). Thus, cell death induced by overexpression of mICE can be suppressed by bcl-2. This result indicates that cell death induced by overexpression of mICE is probably caused by activation of a normal programmed cell death mechanism. Thus, taken together, the results all suggest that vertebrate animals have a genetic pathway of programmed cell death similar to that of C. elegans (Fig.).
Example 3
As described above, the genes in the ICE/ced-3 family would be expected to function during the initiation of programmed cell death. In order to identify additional members of this gene family, cDNA encoding hICE was used to screen a mouse thymus cDNA library (Stratagene) under conditions of low stringency. Using this procedure, a new gene was identified and named "mIch-2" (see Figures 6A-6C for the cDNA sequence and deduced amino acid sequence of mIch-2).
Figures 7 and 7A show that the protein encoded by mIch-2 contains significant homology to hICE, mICE, and ced-3. The sequence homology indicates that mIch-2, like mICE, is a vertebrate cell death gene.
Northern blot analyses showed that the expression of mIch-2, unlike mICE, which is broadly expressed during embryonic development, is restricted to the thymus and placenta, areas in which cell death frequently occurs. In addition, it was found that the expression of mIch-2 in the thymus can be induced by dexamethasone, an agent which causes thymus regression. It is concluded that mIch-2 is a thymus/placenta specific vertebrate cell death gene.
Example 4
Extensive cell death occurs in the developing nervous system (Oppenheim, R. Ann. Rev. Neurosci. 145:453-501 (1991)). Many neurons die during the period of synapse formation. During this critical period, the survival of neurons depends on the availability of neural trophic factors. The survival of isolated primary neurons in vitro depends critically on the presence of such trophic factors (Davies, A. Development 100:185-208 (1987)).
Removal of such factors induces neuronal cell death, usually within 48 hrs.
The death of the sympathetic neurons and sensory neurons whose survival depends on one or more members of the nerve growth factor family (nerve growth factor, brain-derived neurotrophic factor, and neurotrophin-3) can be prevented by microinjection of a bcl-2 expression vector (Garcia, et al., Science 258:302-304 (1993); Allsopp et al., 1993). To examine if the genes in the ICE/ced-3 family are involved in neuronal cell death, the ability of crmA (which inhibits ICE) to inhibit the death of chicken dorsal root ganglionic neurons induced by NGF removal was examined. It was found that microinjection of an expression vector containing crmA inhibits the death of DRG neurons as effectively as that of a bcl-2 expression vector (Gagliardini, et al., Science 263:826-828 (1994)). This result demonstrates that the genes in the ICE/ced-3 family play a role in regulating neuronal cell death during development.
Example
Results
Cloning of Ich-1
The protein product of the C. elegans cell death gene, ced-3, is homologous to the product of the mouse gene, nedd-2. The nedd-2 cDNA in the data bank has an open reading frame of 171 amino acids and has long 3' and 5' untranslated regions. This 171-amino acid nedd-2 protein does not contain the active domain (SEQ ID NO: 66), QACRG, of ICE and ced-3 proteins and is homologous only to the P10 subunit of mammalian ICE and the C-terminal part of ced-3. While analyzing nedd-2 cDNA, the inventors discovered that it contains a sequence that can potentially encode a QACRG pentapeptide, but that the sequence is in another reading frame. The inventors considered the possibility that the nedd-2 cDNA isolated by Kumar et al. contains cloning artifacts and that another nedd-2 transcript encodes a protein homologous to both the P20 and P10 subunits of ICE.
A mouse nedd-2 probe was made by polymerase chain reaction (PCR).
Using this probe, three cDNA libraries were screened: a mouse embryonic day 11.5 cDNA library from CLONTECH (one million clones screened), a human fetal brain cDNA library from James Gusella's laboratory (10 million clones screened), and a human fetal brain cDNA library from Stratagene (one million clones screened). The longest positive cDNA clones were obtained from the Stratagene cDNA library. From the Stratagene library, two cDNA species (pBSH37 and pBSH30) were identified that encode two closely related proteins homologous to the mouse nedd-2 (Figures 10-12).
The insert of pBSH37 (2.5 kb) encodes a protein of 435 amino acids that contains amino acid sequence similarities to both the P20 and P10 subunits of hICE and the entire ced-3. The insert of pBSH30 (2.2 kb) has an open reading frame of 312 amino acids and contains an additional 61 bp one basepair after the sequence encoding QACRG. This causes an early termination of protein translation. The Northern blot analysis showed that expression of this human gene is different from expression of nedd-2 (Kumar et al.); thus, the sequences were renamed Ich-1L (pBSH37) (Figure 10A-10B) and Ich-1s (pBSH30) (Figure 10C-10E).
A comparison of cDNA sequences revealed that Ich-1s cDNA differs from Ich-1L at the 5' end, around the beginning of the initiation of translation, and by the presence of an additional intron in the middle of Ich-1s cDNA (Figure 11B). The first difference is at the beginning of the coding region. The putative first methionine of Ich-1s is 15 amino acids downstream from the first methionine of Ich-1L; the first 35 bp of Ich-1s are different from Ich-1L and include a stop codon (Figures 10A-10B and 10C-10E). PCR analysis using primers specific to the first 35 bp of Ich-1s and the Ich-1s-specific intron (see below), and human placenta cDNA as template, amplified a DNA fragment of predicted size. This suggests that the 35 bp Ich-1s-specific sequence is not a cloning artifact and is present in the endogenous Ich-1s mRNA.
The second difference is distal to the active domain QACRG. Ich-1s begins to differ from Ich-1L one basepair after the coding region of the active site QACRG. The difference is caused by a 61 bp insertion, which results in a termination codon 21 amino acids downstream from the insertion. The last two identical basepairs of Ich-1s and Ich-1L are AG, the general eukaryotic splicing donor consensus sequence (Mount, 1982).
To eliminate the possibility that the 61 bp insertion in pBSH30 is a result of incomplete RNA processing, both forms of murine Ich-1 were cloned from adult mouse brain mRNA by PCR using primers flanking the insertion site as described (Experimental Procedures). The resulting 233 and 172 bp fragments (Figure 11B) were cloned separately and sequenced. Three murine Ich-1L clones and two murine Ich-1s clones were sequenced. Sequencing confirmed that the murine Ich-1s contains the same 61 bp insertion as in human Ich-1s at the same position (Figure 11A).
Chicken Ich-1 was also cloned from an embryonic chicken cDNA library (Clontech) using a chicken Ich-1 probe obtained by PCR (see Experimental Procedures). Two clones were isolated. One encodes Ich-1L, and the other encodes Ich-1s, which contains a 62 bp insertion at the same position. The DNA sequence of the 62 bp insertion is 72% identical to that of human and murine Ich-1s and also causes premature termination of protein translation. The extra basepair in the intron of chicken Ich-1s causes the amino acid sequence of the last 41 amino acids of chicken Ich-1s to differ from human and murine Ich-1s; however, truncation of the protein may be the important point.
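The effect of the 61 bp and 62 bp insertions on the reading frame follows from simple modular arithmetic. The sketch below is illustrative only; the function names are assumptions, and the two lengths are the only values taken from the text.

```python
def frame_offset(insertion_bp):
    """Number of bases by which an insertion shifts the downstream reading frame."""
    return insertion_bp % 3


def shifts_frame(insertion_bp):
    """True when the insertion length is not a multiple of three."""
    return frame_offset(insertion_bp) != 0


if __name__ == "__main__":
    # 61 bp (human and mouse) and 62 bp (chicken), as stated in the text.
    for species, length in (("human/mouse", 61), ("chicken", 62)):
        print(species, length, frame_offset(length), shifts_frame(length))
```

Both lengths leave a non-zero remainder, so each insertion shifts the frame, but by different amounts (one base versus two). That is consistent with the statement that the downstream amino acids of chicken Ich-1s differ from those of human and murine Ich-1s while all three proteins are truncated.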
To examine the origin of the 61 bp insertion in murine and human Ich-1s, mouse genomic Ich-1 DNA was cloned. Analysis showed that the 61 bp is from an intron whose sequence is identical in human and mouse Ich-1.
The difference between Ich-1s and Ich-1L is caused by alternative splicing from the two different 5' splicing donor sequences. The first two basepairs of the 61 bp intron and the two basepairs after the 61 bp intron are GT (Figure 11A).
This sequence is the 100% conserved general eukaryotic splicing donor consensus sequence (Mount, 1982). The DNA sequence at the 3' splicing acceptor site is AG. This sequence is the 100% conserved eukaryotic splicing acceptor sequence (Figure 11A). Thus, the DNA sequences at the splicing junction are completely consistent with alternative splicing of Ich-1s.
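The consistency check described above (GT at each of the two candidate 5' donor positions, AG at the 3' acceptor) can be expressed as a few string comparisons. This is a hedged sketch: the function name, the coordinate convention and the toy sequence are all assumptions made for illustration, and the real genomic sequence of Figure 11A is not reproduced here.

```python
def splice_sites_consistent(genomic, first_donor, segment_len, intron_end):
    """Check GT at both candidate 5' donor positions and AG at the 3' acceptor.

    genomic      -- DNA string laid out as exon, intron, exon
    first_donor  -- 0-based index of the first base of the upstream donor
    segment_len  -- length of the optionally retained segment (61 in the text)
    intron_end   -- 0-based index one past the last intron base
    """
    g = genomic.upper()
    second_donor = first_donor + segment_len
    return (g[first_donor:first_donor + 2] == "GT"
            and g[second_donor:second_donor + 2] == "GT"
            and g[intron_end - 2:intron_end] == "AG")


if __name__ == "__main__":
    # Toy sequence (hypothetical): exon "CAG", a 5-base retained segment
    # "GTAAC", the rest of the intron "GTTTTCAG", then exon "GCC".
    toy = "CAG" + "GTAAC" + "GTTTTCAG" + "GCC"
    print(splice_sites_consistent(toy, first_donor=3, segment_len=5, intron_end=16))
```

With the real sequence, `segment_len` would be 61 and the coordinates would come from the genomic clone.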
As the result of an insertion of an intron between coding regions, the open reading frame of Ich-1s is divided into two: the first encodes a 312 amino acid peptide homologous to the P20 subunit of hICE. The second encodes a 235 amino acid peptide homologous to a part of the P20 subunit and the P10 subunit of hICE. The second is nearly identical to mouse nedd-2 (Figure). The data suggest that only the first open reading frame is translated in cells. A schematic diagram of Ich-1L and Ich-1s is shown in Figure 11B.
Ich-1L contains similarities to both ICE (27% identity and 52% similarity) and ced-3 (28% identity and 52% similarity) (Figures 12A-12B and 12C). Thus, the homology between Ich-1 and ced-3 and between Ich-1 and ICE is about equal.
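Identity percentages of the kind quoted above are computed column by column over an alignment. The text does not state which convention was used, so the following Python sketch makes an explicit assumption: columns containing a gap in either sequence are skipped, and the percentage is taken over the remaining columns. The function name and the short example peptides are illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        identical += (a == b)
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # QACRG is from the text; the Gly-to-Ser variant mirrors the mutation
    # described for the active-site pentapeptide.
    print(percent_identity("QACRG", "QACRS"))
```

A similarity score would additionally require a table of conservative substitutions, which the text does not specify and which is therefore left out of this sketch.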
Ich-1 is expressed in many tissues and THP-1 cells which express interleukin-1β converting enzyme
To characterize the function of Ich-1, the expression pattern of Ich-1 was examined. Northern blot analysis of human fetal heart, brain, lung, liver and kidney tissue was done using the insert of pBSH37 as a probe. The probe hybridizes to both Ich-1s and Ich-1L transcripts. The analysis showed that the 4 kb Ich-1 mRNA is expressed at the same low levels in all tissues examined. When the same Northern blot (completely stripped of the previous probe) was analyzed using the Ich-1s 61 bp intron as a probe (which hybridizes to Ich-1s transcript only), it showed that Ich-1s is expressed in a larger amount in the embryonic heart and brain than in the lung, liver, and kidney. This result demonstrates that in the embryonic lung, liver and kidney, Ich-1L is expressed to a greater extent than Ich-1s is. In Northern blot analysis of adult RNA with the pBSH37 probe, Ich-1 is detected in all the tissues examined. The level is higher in placenta, lung, kidney and pancreas than in heart, brain, liver and skeletal muscle.
To study the expression of Ich-1L and Ich-1s during mouse embryonic development, a quantitative RT-PCR analysis was developed using specific primers that differentiate between Ich-1L and Ich-1s. Primers were synthesized that flank the 61 bp intron sequence of Ich-1s. The two primers are located in separate exons separated by a 2.8 kb intron in genomic DNA. Thus, the possibility of genomic DNA contamination was eliminated. Ich-1L and Ich-1s were amplified simultaneously to produce DNA fragments of 172 bp and 233 bp, respectively. The cDNA templates were reverse-transcribed from mRNA isolated from thymus, adult heart, adult kidney, embryonic 15d brain, and adult brain. Negative (no DNA template) and positive (Ich-1s and Ich-1L) controls were used. Actin primers were used on one set of each sample. Analyses showed that only expression of Ich-1L can be detected in thymus, while the expression of both Ich-1L and Ich-1s can be detected in heart, kidney, and both embryonic and adult brain. The expression of Ich-1s was found to be highest in embryonic brain by this PCR analysis. The results are consistent with the Northern blot analysis described above. The results were reproducible among multiple mRNA preparations.
To examine whether Ich-1 and hICE are expressed in the same cells, a Northern blot of THP-1 and U937 cells was analyzed with the Ich-1 probe, pBSH37. hICE expression has been detected in these cells (Thornberry, N. A., et al., Nature 356:768-774 (1992); Cerretti, D. et al., Science 256:97-100 (1992)). The inventors found that Ich-1 can be detected in THP-1 and U937 cells. Thus, both Ich-1 and hICE are expressed in THP-1 and U937 cells.
Ich-1s, Ich-1L and hICE expression were compared in human cell lines.
Using a similar quantitative RT-PCR analysis, the expression of Ich-1L, Ich-1s and hICE was compared in HeLa, Jurkat, THP-1 and U937 cells. Ich-1L and Ich-1s were amplified simultaneously to produce DNA fragments of 234 bp and 295 bp, respectively. hICE was amplified as a fragment of 191 bp. cDNA templates were reverse-transcribed from mRNA isolated from HeLa, Jurkat, THP-1, and U937 cells. Negative (no DNA template) and positive (hICE cDNA) controls were used. pBSH37 and pBSH30 were used as positive controls for Ich-1L and Ich-1s expression. Chicken actin cDNA was used as a positive control for actin. Expression of Ich-1s was detected in HeLa and Jurkat cells but not in THP-1 and U937 cells. Both hICE and Ich-1 transcripts are present at relatively high levels in HeLa cells. The level of Ich-1 transcript is higher than that of hICE transcript in Jurkat cells. Both hICE and Ich-1 expression is detected in THP-1 and U937 cells.
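Because each primer pair flanks the 61 bp Ich-1s-specific sequence, the two products from one reaction should differ by exactly 61 bp. The sketch below checks this for the two fragment pairs reported in the text (172/233 bp for the mouse primers and 234/295 bp for the human primers). The function and constant names are illustrative assumptions.

```python
INSERTION_BP = 61  # length of the Ich-1s-specific sequence given in the text


def expected_amplicons(ich1l_product_bp, insertion_bp=INSERTION_BP):
    """Return (Ich-1L product, Ich-1s product) sizes in bp for one primer pair."""
    return ich1l_product_bp, ich1l_product_bp + insertion_bp


if __name__ == "__main__":
    print(expected_amplicons(172))  # mouse primer pair
    print(expected_amplicons(234))  # human primer pair
```

Resolving a 61 bp difference on a gel is what allows both transcripts to be amplified simultaneously in one tube and still be scored separately.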
Using a quantitative RT-PCR analysis, the inventors examined the expression of hICE and Ich-1 in normal living T-cell hybridoma DO11.10 cells (Haskins, et al., J. Exp. Med. 157:1149-1169 (1983)) and dying DO11.10 cells (serum-deprived). The expression of both hICE and Ich-1 can be detected in DO11.10 cells. Interestingly, the expression levels of both Ich-1L and hICE appear to increase in dying DO11.10 cells.
Overexpression of Ich-1L induces Rat-1 fibroblast death
To examine the function of Ich-1L, the same transient expression system used for ICE (Miura, et al., Cell 75:653-660 (1993)) was used to determine if overexpression of Ich-1 induces programmed cell death. The human Ich-1L cDNA was fused with the E. coli lacZ gene and the fused gene was placed under the control of the chicken β-actin promoter (pβactH37Z). This fusion gene was transfected into Rat-1 cells by lipofectamine-mediated gene transfer and the expression of the gene was examined using the X-gal reaction. Results showed that most of the blue (X-gal-positive) Rat-1 cells transfected with pβactH37Z were round. These results are similar to those obtained with cells transfected with the mICE-lacZ fusion sequence shown in Table 2. In contrast, most blue cells transfected with vector alone were flat and healthy. This result suggests that the expression of Ich-1L induces Rat-1 cells to die.
To examine whether the cell death induced by Ich-1 has any cell type specificity and to compare its effect with that of mICE, mICE-lacZ and Ich-1-lacZ fusion constructs were transfected into HeLa cells, NG108-15 cells, Rat-1 cells, and COS cells. These cells thus expressed mICE-lacZ, Ich-1L-lacZ (pβactH37Z), Ich-1s-lacZ (pβactH30Z1) or the lacZ control (pβactβgal'). The cell killing effect was assayed as for Table 2. The results are shown in Table 3.
TABLE 3
EFFECTS OF ICH-1 OVEREXPRESSION ON CELLS IN CULTURE

Expression
cassettes      COS               HeLa              NG108-15          Rat-1             Rat-1/bcl-2       Rat-1/crmA
pβactβgal'     [...](983)        2.9±0.2(1020)     4.2±0.2(1535)     2.9±0.2(1470)     3.4±0.2(1446)     3.7±0.1(1459)
pβactM10Z      11.0±0.2(1080)    93.9±0.3(1003)    80.2±0.5(1545)    94.2±1.1(978)     28.8±0.5(691)     45.8±1.6(233)
pβactH37Z      8.3±0.9(1053)     91.4±0.2(1076)    68.7±1.5(1605)    92.1±0.3(1079)    21.5±3.2(1335)    80.7±0.9(1010)
pβactH37ZCS    ND                5.6±0.1(1039)     5.9±0.9(707)      4.1±0.2(1477)     ND                ND
pβactH37ZAT    ND                8.2±0.7(435)      5.2±0.2(640)      5.4±0.3(1356)     ND                ND
pβactH30Z1     1.3±0.2(676)      0.0±0.0(40)       0.0±0.0(61)       1.8±0.4(785)      ND                ND
The mICE-lacZ (pβactM10Z), Ich-1L-lacZ (pβactH37Z), Ich-1L(S303C)-lacZ (pβactH37ZCS), Ich-1L(T352A)-lacZ (pβactH37ZAT), Ich-1s-lacZ (pβactH30Z1) and control vector alone (pβactβgal') were transiently transfected into Rat-1 cells, Rat-1 cells expressing human bcl-2, Rat-1 cells expressing the cowpox virus crmA gene, HeLa cells, NG108-15 cells and COS cells. Cells were fixed lightly 24 hr after transfection and stained with X-Gal for 3 hr. The data (mean±SEM) shown are the percentage of round blue cells among total number of blue cells counted. The numbers in the parentheses are the number of blue cells counted. The data were collected from at least three independent experiments. ND = not determined.
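Each table entry has the form mean±SEM(n). One way such an entry could be derived from per-experiment counts is sketched below. The text does not describe the exact calculation, so this is an assumption: the percentage of round blue cells is computed for each experiment, and the mean and standard error are taken across experiments. The function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def round_cell_summary(experiments):
    """Summarize (round_blue, total_blue) count pairs, one pair per experiment.

    Returns (mean percentage, SEM across experiments, total blue cells counted).
    """
    if len(experiments) < 2:
        raise ValueError("at least two experiments are needed for an SEM")
    percentages = [100.0 * r / t for r, t in experiments]
    sem = stdev(percentages) / sqrt(len(percentages))
    return mean(percentages), sem, sum(t for _, t in experiments)


if __name__ == "__main__":
    # Hypothetical counts from three independent experiments.
    m, sem, n = round_cell_summary([(90, 100), (92, 100), (94, 100)])
    print(f"{m:.1f}±{sem:.1f}({n})")
```

The printed string follows the same mean±SEM(n) layout as the table, which makes it easy to compare recomputed values with the reported ones.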
Compared to controls, the cytotoxic effects of Ich-1 and mICE exhibit certain cell type specificities. Expression of Ich-1 or mICE kills Rat-1 cells and HeLa cells effectively (more than 90% dead). NG108 cells are more resistant to Ich-1 and mICE expression than Rat-1 cells and HeLa cells (68-80% dead). Expression of Ich-1 or mICE cannot kill COS cells.
To confirm that the cell death caused by Ich-1 expression is apoptosis, the inventors examined the nuclear morphology of the cell death induced by Ich-1 expression. Rat-1 cells were transiently transfected with the control pβactβgal' vector and 24 hours later, fixed and stained by anti-β-galactosidase antibody or by Hoechst dye 33258 using a protocol from Miura et al. (1993). The nuclear morphology in β-galactosidase-expressing cells is normal and non-condensed.
Rat-1 cells were also transiently transfected with pβactH37Z expressing Ich-1L.
The nuclei of round cells expressing the Ich-1-lacZ chimeric gene were condensed and fragmented. This is one of the characteristics of cells undergoing apoptosis. Thus, the results suggest that overexpression of Ich-1, like that of mICE, causes Rat-1 cells to undergo programmed cell death.
To determine the structure and function of Ich-1L protein, two mutant Ich-1L fusion proteins were made: the first replaces Cys 303 in the active site of Ich-1L with Ser (designated S303C); the second replaces Ala 352 in the putative P10 subunit with Thr (designated T352A) (Fig. 12A-12B). The Ala 352 in P10 is an amino acid residue of ced-3 that is conserved in Ich-1L but not in ICE. The mutant Ich-1-lacZ fusion constructs were transfected into Rat-1 cells and expression was examined by the X-gal reaction.
The analysis revealed that the S303C (pβactH37ZCS) and T352A (pβactH37ZAT) mutations eliminated the activity of Ich-1L completely (Table 3). These results suggest that the ability of Ich-1L to cause cell death depends upon its enzymatic activity and that only some characteristics of ced-3 are conserved in Ich-1L.
The cell death induced by overexpression of ICE can be inhibited by bcl-2 and crmA (Miura, et al., Cell 75:653-660 (1993)). To examine if the cell death induced by expression of Ich-1L could also be inhibited by bcl-2 and crmA, the Ich-1L-lacZ fusion construct (pβactH37Z) was transfected into Rat-1 cells that overexpress either bcl-2 or crmA. Cell death was assayed as described for Table 3. The results showed that the cell death induced by overexpression of Ich-1L could be inhibited effectively by bcl-2 but only marginally by crmA.
Expression of Ich-1s protects Rat-1 fibroblasts from death induced by serum removal
Since Ich-1s contains two open reading frames, it was important to determine which reading frame is functionally translated (Figure 11A). First, Ich-1s was translated in the presence of 35S-methionine using in vitro transcribed RNA in a reticulocyte lysate as described in Experimental Procedures. The translated products were run on an SDS-polyacrylamide gel with molecular weight standards. Ich-1s antisense RNA was used as a negative control. Results showed that only the first reading frame was translated.
Second, the E. coli lacZ gene was fused to the ends of the first (pβactH30Z1) and second (pβactH30Z2) open reading frames. The constructs were separately transfected into Rat-1 cells and the cells were assayed for color using the X-gal reaction. Results showed that when the lacZ gene was fused to the end of the first open reading frame, blue cells could be detected. Blue cells were not detected when the lacZ gene was fused to the second open reading frame. Thus, it is likely that only the first open reading frame is used in vivo.
To characterize the function of Ich-1s, the ability of pβactH30Z1 to cause cell death was examined. pβactH30Z1 was transfected into Rat-1 cells, COS cells, HeLa cells and NG108-15 cells, and the X-gal reaction was developed as before. The analysis showed that the expression of pβactH30Z1 did not cause cell death (Table 3).
To examine if Ich-1s has any protective effect against cell death, a stable Rat-1 cell line that expresses Ich-1s was established. The Ich-1s cDNA was cloned into the pBabepuro retroviral expression vector (Morgenstern et al., Nucl. Acids Res. 18:3587-3596 (1990)) and transfected into Rat-1 cells. The stable transfectants were selected in puromycin and individual clones were assayed for expression of Ich-1s by Northern blot analysis. The clones that expressed Ich-1s were used for analysis and the clones that did not express Ich-1s were used as negative controls together with untransfected Rat-1 cells. Nomarski micrographs were taken on days 0, 2, 3, and 4 of control Rat-1 cells, Ich-1s non-expressing Rat-1 cells, crmA expressing Rat-1 cells, and bcl-2 expressing Rat-1 cells in serum-free medium. Trypan blue assay was also performed.
When plated at non-confluent density and washed carefully, 90% of Rat-1 cells die in serum-free medium. However, under these conditions, Rat-1 cells expressing bcl-2 or crmA are resistant to death (Fig. 13). When the stable Rat-1 cell lines that express human Ich-1s were tested under serum-free conditions, they were found to be more resistant to serum deprivation than parental Rat-1 cells and negative control transfectants not expressing Ich-1s (Fig. 13). These experiments suggest that Ich-1s has the ability to prevent cell death.
Ich-1s may prevent cell death by inhibiting Ich-1L. The inventors thus examined whether Rat-1 cells express Ich-1. Using mouse Ich-1 cDNA as a probe, an mRNA species predictive of the Ich-1 transcript was detected in Rat-1 cells under low stringency conditions.
Discussion
The isolation and characterization of Ich-1, a mammalian gene belonging to the ICE/ced-3 family of cell death genes, has been described. Two distinct Ich-1 mRNA species have been identified (Ich-1L and Ich-1s). These two cDNAs differ in the 5' region around the translation initiation site and in the middle region. The difference in the middle is the result of alternative use of two different 5' splicing donor sites.
The Ich-1 gene is expressed at low levels in embryonic and adult tissues. Ich-1s is expressed at higher levels than Ich-1L in embryonic heart and brain. The converse is true in embryonic lung, liver and kidney. Expression of Ich-1s can be detected in all tissues examined except thymus. The expression of both hICE and Ich-1 can be detected in THP.1 cells, HeLa cells, Jurkat cells, U937 cells, and DO11.10 cells. The expression of both hICE and Ich-1L appears to increase in dying cells under serum-deprived conditions. Overexpression of Ich-1L in rat fibroblast cells caused programmed cell death that was prevented by bcl-2. This suggests that Ich-1 is a programmed cell death gene. Overexpression of Ich-1s, however, did not cause cell death. Stable expression of Ich-1s prevented Rat-1 cell death induced by serum deprivation. The collective results show that Ich-1 encodes protein products that regulate cell death positively and negatively.
The mouse nedd-2 gene was originally isolated by Kumar et al. (Biochem. Biophys. Res. Comm. 185:1155-1161 (1992)). The nedd-2 gene was identified as having a transcript of 3.7 kb that is abundantly expressed in embryonic day 10 mouse brain and almost undetectable in adult brain. The nedd-2 cDNA isolated contained an open reading frame of 171 amino acids and long 5' and 3' untranslated regions with stop codons in all reading frames. The 171-amino-acid open reading frame is homologous to the P10 subunit of hICE and the C-terminal part of ced-3 (Yuan, et al., Cell 75:641-652 (1993)). The amino acid sequence of the C-terminal part of Ich-1L is 87% identical to the amino acid sequence of the mouse nedd-2 protein from residues 42 to 172 (the first 41 amino acids are different because of the presence of the 61 bp intron). In mouse, nedd-2 is a unique gene (Wang, unpublished data). Thus, human Ich-1 and mouse nedd-2 must be the same gene.
In the Northern blot analysis described herein, Ich-1 expression in human fetal brain is not high compared to other tissues tested (heart, lung, liver and kidney) and does not appear to be significantly down-regulated in adult brain. Part of the difference could be explained by the different developmental stages tested: mouse E10 versus human 20-26 week old fetuses. However, Ich-1 expression can be detected in human and mouse adult tissues.
In the studies herein, amplification of the 5' untranslated region of the mouse nedd-2 cDNA that Kumar et al. reported was not achieved. It is possible that the 5' untranslated region in the Kumar et al. clone was a product of incompletely processed nedd-2 mRNA. Both Ich-1 mRNAs are about 4 kb; since the cDNA clones described herein are 2.5 kb and 2.2 kb for Ich-1L and Ich-1s, respectively, these cDNAs are incomplete. However, since they are fully functional in the assay reported herein, the complete coding regions should be encoded in these two cDNAs.
Ich-1 is a new member of the ICE/ced-3 family of cell death genes.
Thus, unlike C. elegans, mammals must have multiple members of ICE/ced-3.
Ich-1 is even slightly more homologous to ced-3 than mICE. The cell death induced by overexpression of Ich-1 was poorly inhibited by crmA. This result is similar to that with ced-3 (Miura, et al., Cell 75:653-660 (1993)).
The nucleotides corresponding to the two amino acid residues of ced-3 that are conserved in Ich-1, but not in ICE, were mutagenized. Results showed that T352A completely eliminated the ability of Ich-1 to cause cell death, despite the fact that the corresponding amino acid in ICE is a Ser. These data also suggest that Ich-1 is mechanistically more similar to ced-3 than ICE, and that Ich-1 and ICE may have evolved independently from ced-3.
The overexpression of ICE and Ich-1 can kill Rat-1 cells and HeLa cells effectively but NG108 cells only moderately. It is possible that NG108 cells express a higher level of ICE and Ich-1 inhibitors. COS cells are completely resistant to the cell killing activity of ICE and Ich-1. COS cells may lack either the activator or the substrates of ICE and Ich-1. This result also suggests that the cytotoxic effects of ICE and Ich-1 have certain specificity and are unlikely to be caused by random cleavage activities of proteases.
Ich-1 encodes protein products that prevent or cause cell death, depending on how the mRNA is processed. Similar regulation has been observed with bcl-x, a bcl-2 related gene (Boise et al., 1993). The bcl-x transcripts can also be processed in two different ways: the larger mRNA, bcl-xL, encodes a bcl-2 related protein product that can inhibit cell death. Alternative splicing of the bcl-x transcript generates another, smaller transcript, bcl-xS. This encodes an internally truncated version of bcl-x that inhibits the ability of bcl-2 to enhance the survival of growth factor-deprived cells. Control of RNA splicing may be an important regulatory point in programmed cell death.
Ich-1s could act to prevent cell death by inactivating the activator of cell death or by directly inactivating Ich-1L. In the transient transfection assay, the expression of the Ich-1L-lacZ fusion gene and the ICE-lacZ fusion gene kill the stable Ich-1s-expressing cells as efficiently as the control Rat-1 cells (Wang, unpublished data). Thus, unlike crmA or bcl-2, the inhibition of cell death by Ich-1s may be highly dosage-dependent. This could explain why the expression of Ich-1s provided only partial protection of serum-deprived Rat-1 cells. Possibly, only cells expressing high levels of Ich-1s are protected.
crmA has the ability to suppress cell death induced by overexpression of Ich-1L. The amino acid sequence of crmA is homologous to the members of the serpin superfamily (Pickup et al., 1986), which usually inhibit serine proteases by acting as pseudosubstrates. The nature of the interaction between ICE and crmA protein is likely to be similar to the interaction between other serpins and serine proteases. The inhibition of ICE family members by crmA may depend upon both the affinity and the relative concentration of ICEs and crmA. The fact that crmA can suppress a certain percentage of cell deaths induced by overexpression of Ich-1L suggests that crmA and Ich-1 can bind to each other. It is possible that when the Ich-1 concentration is lower, crmA may be able to suppress cell death induced by Ich-1 to a greater extent. Microinjection of a crmA expression construct can effectively suppress the death of dorsal root ganglia neurons induced by nerve growth factor deprivation (Gagliardini, et al., Science 263:826-828 (1994)). One or more ICE/ced-3 family members may be responsible for neuronal cell death. When a crmA expression construct is microinjected into neurons, the transient concentration of crmA may be very high. Thus, it is possible that crmA may be able to suppress multiple members of the ICE/ced-3 family under such conditions despite the fact that their affinity for crmA is not very high.
Since the expression of Ich-1 and ICE can be detected in the same cells, the results described herein suggest that multiple members of the ICE/ced-3 family may contribute to cell death induced by a single signal. There are three possible ways that ICE and Ich-1 may act to cause cell death. First, Ich-1 may activate ICE, directly or indirectly, to cause cell death. Second, ICE may activate Ich-1, directly or indirectly, to cause cell death. Third, ICE and Ich-1 may act in parallel to cause cell death. In the first scenario, the inhibitor of ICE should inhibit cell death induced by Ich-1. In the second scenario, the inhibitor of Ich-1 should inhibit the cell death induced by ICE. To test these hypotheses, specific inhibitors for each member of the ICE/ced-3 family are necessary. For the reasons discussed above, it seems likely that crmA can inhibit other members of the ICE/ced-3 family as well. These models can be tested directly with "knockout" mutant mice in which a specific member of the ICE/ced-3 family is mutated.
Experimental Procedures
Cloning and construction of plasmids
The mouse nedd-2 cDNA was isolated using embryonic mouse brain cDNA and primer pairs specific for the 5' and 3' untranslated regions and the coding region. Primers nedd2/1 (SEQ ID NO: 67) (5'-CAACCCTGTAACTCTTGATT-3') and nedd2/2 (SEQ ID NO: 68) (5'-ACCTCTTTGGAGCTACCAGAA-3') were used for amplifying the 5' untranslated region. Primers nedd2/3 (SEQ ID NO: 69) (5'-CCAGATCTATGCTAACTGTCCAAGTCTA-3') and nedd2/4 (SEQ ID NO: 70) (5'-AAGAGCTCCTCCAACAGCAGGAATAGCA-3') were used for amplifying the nedd-2 coding region. Primers nedd2/5 (SEQ ID NO: 71) (5'-AGAAGCACTTGTCTCTGCTC-3') and nedd2/6 (SEQ ID NO: 72) (5'-TTGGCACCTGATGGCAATAC-3') were used for amplifying the 3' untranslated region. A 0.5 kb PCR product of the nedd-2 coding region was cloned into the pBluescript plasmid vector (Stratagene) to be used as a probe.
Two human fetal brain cDNA libraries and one mouse embryonic 11.5d cDNA library were screened with the murine nedd-2 cDNA probe at low stringency. The filters were hybridized in 5x SSPE, 30% formamide, 1x Denhardt's solution, 1% SDS at 42°C overnight and washed in 1x SSPE and 0.5% SDS, twice at room temperature and twice at 45°C (20 min). The human Ich-1s (pBSH30) was isolated from the positive clones using a BamHI-SalI fragment of the murine nedd-2 cDNA, a 70 bp fragment which contains 52 bp of the 61 bp intron, as a probe under the same hybridization and washing conditions described above. The phage clones (pBSH37 for Ich-1L, pBSH30 for Ich-1s) were converted to plasmids by an in vivo excision protocol (Stratagene). To construct expression constructs, PCR was performed using synthetic primers. H1 (SEQ ID NO: 73) (5'-GATATCCGCACAAGGAGCTGA-3') and H2 (SEQ ID NO: 74) (5'-CTATAGGTGGGAGGGTGTCC-3') were used for Ich-1L construction. H3 (SEQ ID NO: 75) (5'-GATATCCAGAGGGAGGGAACGAT-3'), corresponding to sequences in the 5' region of the Ich-1s cDNA, and H4 (SEQ ID NO: 76) (5'-GATATCAGAGCAAGAGAGGCGGT-3'), corresponding to sequences in the 3' region of the first open reading frame (ORF) of Ich-1s, were used to construct the first ORF of Ich-1s. H3 and H5 (SEQ ID NO: 77) (5'-GATATCGTGGGAGGGTGTCCT-3'), corresponding to sequences in the 3' region of the second ORF of Ich-1s, were used to construct the second ORF of Ich-1s. pBSH37 and pBSH30 were used as templates where appropriate. The three PCR products were inserted into the EcoRV site of pBluescript II, and the inserts of these subclones, ppSIIh37, ppSIIh30.1, and pPSIIh30.2, were isolated by digestion with SmaI and KpnI and cloned into the SmaI-KpnI sites of BSLacZ (Miura, et al., Cell 75:653-660 (1993)). NotI linkers were added to the KpnI site by digesting with KpnI, blunt-ending with T4 polymerase and ligating in the presence of excess NotI linker.
These constructs, BSh37Z, BSh30Z1, and BSh30Z2, were digested with NotI and individually cloned into ppactstneoB (which uses the chicken β-actin promoter) (Miyawaki, et al., Neuron 5:11-18 (1990)). The final plasmids were designated ppactH37Z, ppactH30Z1 and ppactH30Z2, respectively. The pBabeH30 plasmid, used for establishing stable Rat-1 cell lines carrying Ich-1s, was constructed by inserting the full length Ich-1s cDNA into the SalI site of the pBabe/puro vector (Morgenstern, J. et al., Nucl. Acids Res. 18:3587-3596 (1990)).
To mutagenize Cys 303 to a Ser residue in the active domain of Ich-1L and Ala 352 to a Thr residue in the P10 subunit of Ich-1L, primers containing the mutant sites were synthesized as follows:
HM1 5'-ATCCAGGCCTCTAGAGGAGAT-3' (SEQ ID NO: 78)
HM2 5'-ATCTCCTCTAGAGGCCTGGAT-3' (SEQ ID NO: 79)
HM3 5'-TGCGGCTATACGTGCCTCAAA-3' (SEQ ID NO: 80)
HM4 5'-TTTGAGGCACGTATAGCCGCA-3' (SEQ ID NO: 81)
(HM1 corresponds with HM2 and HM3 corresponds with HM4.) PCRs were performed in two steps. To make the Cys 303 to Ser mutation, in the first round of PCR, the fragments from the N-terminus to the mutation site of Ich-1L, and from the mutation site to the C-terminus of Ich-1L, were synthesized using two primer pairs, T3 and HM1, and HM2 and T7, with pBSH37 as a template. In the second round of PCR, the two PCR fragments generated in the first reaction were used as templates and T7 and T3 were used as primers. Two such rounds of PCR generated a full length Ich-1L mutant. The other mutation was generated in a similar way, using T3 and HM3, and HM4 and T7, as primers for the first PCR to make the Ala 352 to Thr mutation. The PCR products were inserted into the EcoRV site of pBluescript II and sequenced. The mutant cDNA inserts were cloned into expression vectors as described above. The mutated clones were designated ppactH37ZCS and ppactH37ZAT.
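The pairing of the mutagenic primers listed above can be checked computationally. The following is a minimal Python sketch, not part of the original procedure; the names reverse_complement and is_complementary_pair are illustrative assumptions. It confirms that HM1/HM2 and HM3/HM4 are reverse complements of one another, which is consistent with the two first-round PCR fragments sharing sequence at the mutation site so that they can serve together as template in the second round.

```python
# Sketch (illustrative names): check that the mutagenic primer pairs
# HM1/HM2 and HM3/HM4 are reverse complements of each other.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "HM1": "ATCCAGGCCTCTAGAGGAGAT",
    "HM2": "ATCTCCTCTAGAGGCCTGGAT",
    "HM3": "TGCGGCTATACGTGCCTCAAA",
    "HM4": "TTTGAGGCACGTATAGCCGCA",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(name_a: str, name_b: str) -> bool:
    """True if primer b is exactly the reverse complement of primer a."""
    return reverse_complement(PRIMERS[name_a]) == PRIMERS[name_b]


if __name__ == "__main__":
    for a, b in (("HM1", "HM2"), ("HM3", "HM4")):
        print(a, b, is_complementary_pair(a, b))
```

Because each pair anneals to opposite strands at the same position, either primer of a pair carries the same base change into its fragment.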
Cell culture and functional studies
COS cells, Rat-1 cells, HeLa cells, and NG108-15 cells were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum (FCS). The day before transfection, cells were seeded at a density of about 2.5 x 10^5 in each well of the 6-well dishes. For each well, 0.7-1 μg of the lacZ chimeric construct and 10 μg of lipofectamine reagent were used according to a protocol from GIBCO BRL (Gaithersburg, MD). The cells were incubated for 3 hr in serum-free medium containing DNA and lipofectamine. Then an equal volume of growth medium containing 20% serum was added without removing the transfection mixture and incubation was continued for 24 hr. The expression of the chimeric gene in cells in culture was detected as previously described (Miura, et al., Cell 75:653-660 (1993)).
To establish Rat-1 cell lines overexpressing Ich-1s, pBabeH30 was transfected into Rat-1 cells using lipofectamine-mediated gene transfer. Resistant cells were selected using 3 μg/ml puromycin for about 10 days. Cells were assayed for expression of Ich-1s by Northern blot analysis. To examine whether Ich-1s can render Rat-1 cells resistant to apoptosis under conditions of serum deprivation, Rat-1 cells overexpressing Ich-1s, untransfected control Rat-1 cells, transfected negative control Rat-1 cells, and Rat-1 cells overexpressing bcl-2 or crmA were seeded in 24-well dishes at 5 x 10^4 cells in 500 μl of DMEM containing 10% FCS for 24 hr, washed once with serum-free DMEM, and transferred into 500 μl of serum-free DMEM. The cells were harvested at daily intervals and stained with 0.4% trypan blue for 5 min at room temperature.
The numbers of dead and living cells were counted using a haemocytometer.
Jurkat cells, THP.1 cells, and U937 cells were cultured in RPMI 1640 medium (GIBCO) with 10% fetal calf serum.
RNA analysis
Multiple Tissue Northern (MTN) blot membranes of human fetal and adult tissues (CLONTECH) were probed using human Ich-1L cDNA or the intron of Ich-1s cDNA (for fetal tissue) in 5x SSPE, 10x Denhardt's solution, 50% formamide, 2% SDS, and 100 μg/ml salmon sperm DNA at 42°C overnight. The blots were washed twice in 2x SSPE and 0.05% SDS at room temperature, and twice in 0.1x SSPE and 0.1% SDS for 20 min at [illegible].
The Multiple Tissue Northern (MTN) blot membrane of human fetal tissue was first probed with a 1.3 kb fragment from the insert of pBSH37, which hybridizes to both Ich-1L and Ich-1s. The blot was exposed for two days and developed. Then the blot was stripped by boiling the filter in H2O twice for 20 min. After stripping, the filter was re-exposed for three days to ensure that the stripping was complete. Then the filter was re-hybridized with the 70 bp BamHI-SalI fragment derived from the mouse nedd-2 gene. This fragment contains 52 bp of the 61 bp intron (which is identical to the human Ich-1s intron; the remaining 18 bp are from an exon and 5 out of the 18 bp are different from the human Ich-1 sequence). The Northern blot of THP.1 cells and U937 cells was carried out under the same conditions. To detect Ich-1 expression in Rat-1 cells, hybridization was carried out in 25% formamide and under otherwise identical conditions.
Cloning of murine Ich-1
A murine Ich-1 cDNA was cloned in two steps by PCR. A 5' murine Ich-1 cDNA fragment was amplified using a primer derived from the pBSH37 end sequence (SEQ ID NO: 82) (5'-ATTCCGCACAAGGAGCTGATGGCC-3') and a primer from the mouse nedd-2 sequence after the active site (SEQ ID NO: 83) (5'-GCTGGTCGACACCTCTATC-3'), using a mouse embryo cDNA library as template (a gift from D. Nathan). The resulting 945 bp fragment was cloned into the EcoRV site of plasmid pBSKII (Stratagene) and sequenced. A 3' murine Ich-1 cDNA fragment was amplified using a human Ich-1 primer further downstream (SEQ ID NO: 84) (5'-CAAGCTTTTGATGCCTTCTGTGA-3') and a nedd-2 primer downstream from the coding region (SEQ ID NO: 85) (5'-CTCCAACAGCAGGAATAGCA-3'). The resulting fragment was also cloned in pBSKII and sequenced. The two fragments were then joined together using a unique SalI site at nucleotide 930 from the beginning of the coding region.
Cloning of chicken Ich-1
A chicken Ich-1 cDNA fragment was obtained using murine Ich-1 degenerate primers. The 5' primer was from murine Ich-1 nucleotide 241 to 268 bp (SEQ ID NO: 86): 5'-GC(GATC)TT(TC)GA(TC)GC(GATC)TT(TC)TG(TCG)GA(AG)GC-3'. The 3' primer was from murine Ich-1 nucleotide 883 to 908 bp (SEQ ID NO: 87): 5'-CA(GATC)GC(CT)TG(TAG)AT(AG)AA(AG)AACAT(CT)TT(GATC)GG-3'.
The resulting DNA fragment was used as a probe to screen a chicken embryonic library (Clontech) by high stringency hybridization.
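The parenthesized positions in the degenerate primers above each stand for a mixture of bases, so every primer is in practice a pool of distinct oligonucleotides. As a rough illustration, the Python sketch below (the function names degeneracy and primer_length are hypothetical, and the sequences are copied as they appear in this text) counts how many distinct sequences each pool contains by multiplying the number of alternatives at each degenerate position.

```python
# Sketch (illustrative names): count the distinct sequences represented by a
# degenerate primer written with parenthesized alternatives, e.g. "GC(GATC)TT".
import re
from math import prod

DEGENERATE_SITE = re.compile(r"\(([ACGT]+)\)")

# Degenerate primers as written in the text (5' to 3').
PRIMER_5 = "GC(GATC)TT(TC)GA(TC)GC(GATC)TT(TC)TG(TCG)GA(AG)GC"
PRIMER_3 = "CA(GATC)GC(CT)TG(TAG)AT(AG)AA(AG)AACAT(CT)TT(GATC)GG"


def degeneracy(primer: str) -> int:
    """Product of the number of alternative bases at each degenerate site."""
    return prod(len(set(site)) for site in DEGENERATE_SITE.findall(primer))


def primer_length(primer: str) -> int:
    """Length in nucleotides, counting each degenerate site as one position."""
    return len(DEGENERATE_SITE.sub("N", primer))


if __name__ == "__main__":
    for name, primer in (("5' primer", PRIMER_5), ("3' primer", PRIMER_3)):
        print(name, primer_length(primer), "nt, degeneracy", degeneracy(primer))
```

A higher degeneracy lowers the effective concentration of any single matching oligonucleotide, which is one reason degenerate pools are usually kept as small as the codon ambiguity allows.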
Quantitative PCR analysis
mRNA was isolated using the MicroFast mRNA isolation kit from Invitrogen. 1 to 2 μg of mRNA was used for reverse transcription by random priming (Invitrogen) using MMLV (Moloney Murine Leukemia Virus) reverse transcriptase (Invitrogen). The primers used to amplify murine Ich-1 were: 5' primer (SEQ ID NO: 88) ATGCTAACTGTCCAAGTCTA and 3' primer (SEQ ID NO: 89) GTCTCATCTTCATCAACTCC. The primers used to amplify human Ich-1 were: 5' primer (SEQ ID NO: 90) GTTACCTGCACACCGAGTCACG and 3' primer (SEQ ID NO: 91) GCGTGGTTCTTTCCATCTTGTTGGTCA. The primers used to amplify hICE were: 5' primer (SEQ ID NO: 92) ACCTTAATATGCAAGACTCTCAAGGAG and 3' primer (SEQ ID NO: 93) GCGGCTTGACTTGTCCATTATTGGATA. Mouse β-actin primers were used as controls to amplify a 350 bp actin fragment from mouse and human tissue: 5' β-actin primer (SEQ ID NO: 94) GACCTGACAGACTACCTCAT and 3' β-actin primer (SEQ ID NO: 95) AGACAGCACTGTGTTGGCAT. The following conditions were used for the PCR reactions: 1x reaction buffer (Promega), 1.5 mM MgCl2, 200 μM dNTP, 2 μM each primer, and 1 unit of Taq DNA polymerase (Promega) in a total volume of 50 μl. The DNA was denatured for 4 min at 94°C prior to 26 PCR cycles (94°C, 1 min/55°C, 1 min/72°C, 2 min). In some of the experiments, Vent polymerase (Biolab) was used.
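As a simple planning aid, the thermal cycling program described above (a 4 min denaturation at 94°C followed by 26 cycles of 94°C for 1 min, 55°C for 1 min and 72°C for 2 min) can be written down as data and its programmed hold time summed. The Python sketch below is an illustration only; the Step class and total_minutes function are assumed names, and instrument ramp times between temperatures are deliberately ignored.

```python
# Sketch (illustrative names): encode the PCR program from the text and sum
# the programmed hold times. Ramp times between steps are not included.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    minutes: float


INITIAL_DENATURATION = Step("initial denaturation", 94, 4)
CYCLE = (
    Step("denaturation", 94, 1),
    Step("annealing", 55, 1),
    Step("extension", 72, 2),
)
N_CYCLES = 26


def total_minutes(initial: Step, cycle: tuple, n_cycles: int) -> float:
    """Total programmed hold time in minutes for the whole run."""
    return initial.minutes + n_cycles * sum(step.minutes for step in cycle)


if __name__ == "__main__":
    minutes = total_minutes(INITIAL_DENATURATION, CYCLE, N_CYCLES)
    print(f"Programmed hold time: {minutes:.0f} min")
```

With the values given in the text, the hold times add up to 4 + 26 x 4 = 108 min.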
In vitro transcription and translation of Ich-1s
To determine which open reading frame of Ich-1s was expressed, pBluescript plasmid containing Ich-1s (pBSH30) was linearized at the 3' multiple cloning site with XhoI, purified, and transcribed with T3 RNA polymerase for 2 hr at 37°C using a protocol from Stratagene. The plasmid was also linearized at the 5' multiple cloning site with NotI, purified, and transcribed with T7 polymerase as an antisense control. The resulting runoff transcripts were extracted with phenol-chloroform and ethanol precipitated. In vitro translation was performed with rabbit reticulocyte lysate (Promega) in the presence of 35S-methionine for 1 hr at 30°C. 5 μl of lysate was mixed with an equal volume of 2x SDS gel loading buffer and subjected to SDS-polyacrylamide gel electrophoresis. The gel was dried and exposed to X-ray film.
Example 6
Experimental Procedures
Cells and tissue culture
HeLa cells were grown in Dulbecco's modified Eagle's medium (DMEM) with 10% fetal bovine serum (FBS). HeLa cells were transfected with the pHD1.2 crmA expression vector (Gagliardini, V. et al., Science 263:826-828 (1994)) by calcium phosphate precipitation, and two days after transfection, 600 μg/ml of G418 (Gibco) was added for selection. Resistant colonies were cloned by limiting dilution. The dose response of HeLa and HeLa/crmA cells to TNF-α treatment was tested as follows. Cells were seeded in DMEM plus 10% fetal calf serum in a 24-well plate at a density of 4 x 10^4 cells per well. After overnight incubation, the cells were washed twice with serum-free DMEM. Drugs were then added to a total volume of 0.2 ml of serum-free DMEM and the cells were incubated for 24 hours. Cells were then trypsinized and dead cells scored on a hemocytometer by trypan blue exclusion (Sigma, St. Louis, MO). At least 200 cells were scored per well. Each concentration was tested in duplicate each time.
Western blotting
Cells were lysed in SDS sample buffer and cell lysates were subjected to 15% SDS-PAGE. After electroblotting the proteins to an Immobilon nylon membrane (Millipore), the membrane was blocked with 4% nonfat milk in [illegible] mM Tris-HCl pH 7.5, 150 mM NaCl, 0.2% Tween (TBST). The membrane was incubated with anti-crmA antibody (Gagliardini et al., Science 263:826-828 (1994)) (5 μg/ml) for 1 hour at room temperature and then washed five times with TBST. The membrane was incubated with HRP-conjugated goat anti-rabbit IgG (1/1000 dilution, Amersham) for 30 minutes and washed five times with TBST. crmA protein was detected with an ECL detection kit (Amersham).
DNA transfection
One day before transfection, cells were seeded at a density of about 2 x 10[exponent illegible] per well in 6-well dishes. For each well, 1 μg of plasmid DNA and 10 μg of lipofectamine reagent were added according to a protocol from Gibco BRL. Cells were incubated for 3 hours in serum-free medium containing DNA and lipofectamine, and then the medium was changed to DMEM containing 10% FBS and incubation was continued for 24 hours. The expression of the chimeric gene was detected as previously described (Miura et al., Cell 75:653-660 (1993)).
Detection of IL-1β production from HeLa cells
HeLa cells were grown overnight in medium containing 10% fetal calf serum and then the medium was changed to serum-free DMEM with or without drugs. After 24 hours, cells were scraped off and precipitated. Conditioned medium was collected, dialyzed against distilled water at 4°C overnight, lyophilized, and the residue dissolved in distilled water. Cell precipitates were extracted with extraction buffer (20 mM HEPES-NaOH pH 7.4, 10 mM KCl, 1.5 mM MgCl2, 0.5 mM EDTA, 10 μg/ml PMSF, [illegible] μg/ml E64, 2 μg/ml pepstatin, 1 μg/ml leupeptin, 0.5 μg/ml aprotinin, 1% NP-40). Insoluble materials were removed by centrifugation. Proteins were separated by 15% SDS-PAGE and IL-1β was detected by immunoblotting using anti-human IL-1β antibody (1/300 dilution, Calbiochem).
Results
Establishment of crmA-expressing HeLa cells
To test the hypothesis that activation of ICE is responsible for TNF-α induced apoptosis, the inventors first established HeLa cell lines constitutively expressing cowpox virus crmA protein. This protein is a viral serpin that can specifically inhibit ICE activity (Ray, C.A. et al., Cell 69:597-604 (1992)).
HeLa cell clones expressing crmA were analyzed by Western blot analysis with affinity-purified anti-crmA. HeLa cells were transfected with the crmA expression vector and selected for G418 resistance as described in "Experimental Procedures" above. Six different G418-resistant HeLa cell clones were analyzed. As a positive control, cell lysate of a Rat-1 cell clone expressing crmA (Miura, M. et al., Cell 75:653-660 (1993)) was applied to the gel. Several HeLa cell clones stably expressing crmA were established.
Overexpression of ICE/ced-3 family genes in HeLa/crmA cells
The cell lines were tested for resistance to cell death induced by ICE overexpression. The overexpression of crmA in HeLa cells suppressed ICE-induced cell death. HeLa cells constitutively expressing crmA were transfected with a chimeric expression vector expressing both the lacZ and the mICE gene. Transfection is described above in "Experimental Procedures".
One of the HeLa/crmA clones expressing high levels of crmA could efficiently suppress ICE-induced cell death. A clone expressing crmA at low levels also could efficiently suppress the ICE-induced cell death under the same conditions (viability 69.8, Table 4).
Table 4. Prevention of apoptosis by CrmA
                         HeLa           HeLa/CrmA
Expression cassettes     % round blue cells
  pactβGal'              3.2 ± 0.8      1.1 ± 0.4
  ppactM10Z              85.5 ± 3.0     46.9
  ppactH37Z              91.4 ± 4.9     87.5
Drug treatments          % dead cells
  control                1.5 ± 0.3      5.5 ± 1.8
  CHX                    2.9 ± 0.7      3.8
  TNF                    2.7 ± 1.2      2.8
  CHX + TNF              68.2 ± 1.9     9.7 ± 1.2
Cells were transfected and stained as described herein. Plasmid pactβGal' is a control lacZ gene expression vector and plasmid ppactM10Z is a mouse Ice/lacZ chimeric gene expression vector (Miura, M. et al., Cell 75:653-660 (1993)). Plasmid ppactH37Z is an Ich-1/lacZ chimeric gene expression vector (Wang, L. et al., Cell 78:739-750 (1994)). The data (mean ± SEM) shown are the percentage of round blue cells among the total number of blue cells counted. To see the effects of CHX (20 μg/ml) and TNF-α (5 ng/ml), cells were treated with drugs for 24 h and cell viabilities were measured by trypan blue dye exclusion. The data (mean ± SEM) shown are the percentage of dead cells. The data were collected from at least three independent experiments.
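The percentages and mean ± SEM values reported in Table 4 follow from simple arithmetic on cell counts. The Python sketch below shows one conventional way such figures could be computed; it is an assumption about the calculation, not a record of how the inventors processed their data, and the function names and the example counts are hypothetical.

```python
# Sketch (illustrative names, hypothetical counts): percentage of dead cells
# from trypan blue counts, summarized as mean and standard error of the mean.
from math import sqrt
from statistics import mean, stdev


def percent_dead(dead: int, live: int) -> float:
    """Percentage of trypan blue-positive (dead) cells among all cells counted."""
    total = dead + live
    if total == 0:
        raise ValueError("no cells counted")
    return 100.0 * dead / total


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two independent experiments are required")
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical (dead, live) counts from three independent experiments.
    counts = [(140, 65), (133, 70), (150, 62)]
    percentages = [percent_dead(d, l) for d, l in counts]
    m, sem = mean_sem(percentages)
    print(f"{m:.1f} ± {sem:.1f} % dead cells")
```

The same two functions apply to the round blue cell data if the counts of round blue cells and other blue cells are substituted for dead and live cells.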
CrmA is a potent and highly specific serpin for ICE. However, cell death induced by overexpression of ced-3 is poorly suppressed by crmA (Miura, M. et al., Cell 75:653-660 (1993)). Ich-1 (nedd-2) has been described above as a third member of the ICE/ced-3 gene family (see also, Kumar, S. et al., Genes Dev. 8:1613-1626 (1994)). As with ICE, overexpression of Ich-1 induces Rat-1 and HeLa cell death efficiently. However, Ich-1 induced cell death is only weakly suppressed by overexpression of crmA in HeLa cells (Table 4) and Rat-1 cells. Thus, crmA does not appear to be a general inhibitor of ICE/ced-3 family proteases.
Suppression of TNF-α-induced apoptosis by crmA
The effect of crmA expression on TNF-induced apoptosis was tested. TNF-induced cytotoxicity was suppressed by overexpression of crmA. HeLa cells or HeLa/crmA cells were treated with cycloheximide (CHX) (20 μg/ml, Sigma), TNF-α (5 ng/ml, Sigma), or a combination of both drugs. Cells were photographed 24 hours after drug treatment. Control HeLa and HeLa/crmA cells were tested for the ability to resist increasing amounts of TNF-α in the presence of 10 μg/ml CHX (Figure 18). In the presence of CHX, TNF-α efficiently induced HeLa cell death (White, E. et al., Mol. Cell. Biol. 12:2570-2580 (1992)). Under the same conditions, HeLa cells expressing crmA at high levels are resistant to the TNF-α cell death stimulus (Table 4). A clone of HeLa/crmA cells that expresses lower levels of crmA was also resistant under the same conditions ([illegible]% dead cells). The dose response of crmA-expressing HeLa cells to increasing amounts of TNF-α in the presence of 10 μg/ml of CHX was tested. HeLa/crmA cells are resistant to 0.01 pg/ml to 100 ng/ml of TNF-α (Figure 16). After a 24 hour incubation in the presence of 100 ng/ml TNF-α, 83% of the control HeLa cells died compared to 23% of HeLa/crmA cells.
Activation of endogenous ICE after TNF stimulation
The inventors have detected the expression of both ICE and Ich-1 in HeLa cells (Wang, L. et al., Cell 78:739-750 (1994)). Since Ich-1-induced cell death is only weakly suppressed by crmA and crmA appears to be very effective in preventing cell death induced by TNF and CHX treatment, the ICE-mediated cell death pathway may be activated by TNF stimulation and may play a role in HeLa cell death. If this is the case, TNF stimulation should activate endogenous ICE in HeLa cells.
Pro-IL-1β is the only known endogenous substrate for ICE. Active ICE is an oligomeric enzyme with p20 and p10 subunits (Thornberry, N.A. et al., Nature 356:768-774 (1992); Cerretti, D.P. et al., Science 256:97-100 (1992)). These subunits are derived from a p45 precursor form of ICE (Thornberry, N.A. et al., Nature 356:768-774 (1992); Cerretti, D.P. et al., Science 256:97-100 (1992)). If ICE is activated after TNF stimulation, the endogenous 33 kd pro-IL-1β should be processed and mature 17.5 kd IL-1β secreted into the medium.
To detect mature IL-1β, conditioned medium was collected from HeLa cells with or without TNF stimulation. The processing of pro-IL-1β was analyzed by Western blot. Processing of IL-1β was observed only after induction of apoptosis by TNF-α/CHX in HeLa cells. The following samples were compared: purified mature human IL-1β, cell lysates (10 μg protein/lane), conditioned medium (5 μg/lane), serum-free controls, LPS ([illegible] μg/ml, Sigma) treatment, cycloheximide (20 μg/ml) and TNF (5 ng/ml). The procedure for detecting IL-1β is described above under "Experimental Procedures". Cell viabilities were measured by trypan blue dye exclusion (97.4 ± 1.5% for serum-free control, 97.4 ± 0.2% for LPS treatment, 56.3 ± 2.2% for CHX/TNF-α treatment). The data showed that mature IL-1β was only observed after induction of apoptosis by TNF. These results strongly suggest that TNF stimulation induces apoptosis by activation of an ICE-dependent cell death pathway.
Discussion The inventors have demonstrated herein that overexpression of ICE induces Rat-1 cells to undergo apoptosis (Miura, M. et al., Cell 75:653-660 (1993)) and expression of crmA can prevent chicken DRG neurons from cell death induced by trophic factor deprivation (Gagliardini, V. et al., Science 263:826-828 (1994)). These results show that ICE has the ability to induce cell death and that inhibition of ICE activity can prevent programmed cell death.
However, the results did not show that ICE was, indeed, activated during programmed cell death. Using pro-IL-1P processing as an indicator, the inventors have demonstrated that ICE is activated when HeLa cells are induced to die with TNF-a and CHX.
ICE has a unique substrate specificity which requires an Asp in the P1 position (Sleath et al., J. Biol. Chem. 265:14526-14528 (1990)). Only two eukaryotic proteases are reported to cleave after Asp. The other is granzyme B, a serine protease in the cytotoxic granules of killer T lymphocytes (Odake et al., Biochemistry 30:2217-2227 (1991)). In THP.1 cells and HeLa cells, expression of both ICE and Ich-1 was detected (Wang et al., Cell 78:739-750 (1994)). However, affinity labeling of THP.1 cell lysates with a competitive, irreversible ICE inhibitor, biotinylated tetrapeptide (acyloxy)methyl ketone, resulted in labeling only of ICE (Thornberry et al., Biochemistry 33:3934-3940 (1994)). This suggests that ICE is the only enzyme to cleave pro-IL-1β in the human monocytic cell line THP.1. The present studies show that crmA cannot prevent cell death induced by Ich-1 in HeLa cells.
How CHX potentiates TNF cytotoxicity in non-transformed cells is unclear. Most cell lines, including HeLa cells, NIH3T3 cells, and TA1 cells, are not killed by TNF alone, but are killed by the combined action of TNF and CHX (Reid et al., J. Biol. Chem. 264:4583-4589 (1991); Reid et al., J. Biol. Chem. 266:16580-16586 (1991)). TNF-α is a pleiotropic cytokine which may induce more than one cellular response in a single cell. The presence of CHX may inhibit the synthesis of certain signaling molecules and thus potentiate the killing activity of TNF. Alternatively, CHX may simply inhibit the synthesis of one or more general cell survival factors and thus allow cells to become more sensitive to TNF cytotoxicity.
HeLa cells express predominantly the p55 TNF receptor, which is thought to be responsible for cell death signalling (Engelmann et al., J. Biol. Chem. 265:14497-14504 (1990); Thoma et al., J. Exp. Med. 172:1019-1023 (1990)).
TNF p55 receptor triggers the activation of phospholipase A2, protein kinase C, sphingomyelinase, phosphatidylcholine-specific phospholipase C, and NF-κB (Wiegmann et al., J. Biol. Chem. 267:17997-18001 (1992); Schutze et al., Cell 71:765-776 (1992)). In TNF p55 receptor knockout mice, TNF-mediated induction of NF-κB is prevented in thymocytes (Pfeffer et al., Cell 73:457-467 (1993)). TNF p55 receptor knockout mice were resistant to lethal doses of either lipopolysaccharides or S. aureus enterotoxin B. This suggests that the TNF receptor mediates hepatocyte necrosis (Pfeffer et al., Cell 73:457-467 (1993)).
After the stimulation of HeLa cells by TNF-α/CHX, not all the pro-IL-1β was converted into mature IL-1β. ICE activity is likely to be tightly controlled within the cells. A small amount of active ICE may be sufficient for induction of apoptosis. Even in the mature IL-1β-producing monocytic cell line THP.1, most of the ICE is in the inactive p45 form. In non-LPS-stimulated THP.1 cells, only 0.02% of ICE is in the active form. In stimulated cells, the maximum amount of active ICE is less than 2% of total ICE (Ayala et al., J. Immunol. 153:2592-2599 (1994)). THP.1 cells may have some protective mechanism to prevent the activated ICE from inducing apoptosis. LPS induces the synthesis of a large amount of pro-IL-1β, which may, in fact, confer protection on THP.1 cells, because substrates usually are good competitive inhibitors of enzymes.
In addition to the secretion of mature IL-1β, there was a significant drop in the pro-IL-1β level in the cell lysates prepared from TNF-α/CHX-treated cells. This could be the result of secretion of mature IL-1β, inhibition of biosynthesis of pro-IL-1β, or an increase in proteolytic activity within the dying cells. If, however, the inventors' hypothesis is correct (that pro-IL-1β does act as a competitive inhibitor of ICE), a reduction in the level of pro-IL-1β would in fact represent a further reduction in cellular defense against apoptosis and could be one of the reasons that CHX can increase the cytotoxicity of TNF-α.
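The competitive-inhibition hypothesis can be made concrete with a small numerical sketch. It assumes textbook Michaelis-Menten kinetics for two substrates competing for one enzyme, in which a competing substrate raises the apparent Km of the other by the factor (1 + [competitor]/Km_competitor). The "death substrate", every concentration, and every kinetic constant below are hypothetical placeholders chosen for illustration; none is a measurement from the present studies.

```python
def rate_with_competitor(substrate: float, km_substrate: float, vmax: float,
                         competitor: float, km_competitor: float) -> float:
    """Rate of turnover of one substrate while a second substrate competes.

    Textbook competing-substrate form:
        v = Vmax * [S] / (Km_S * (1 + [C] / Km_C) + [S])
    """
    if km_substrate <= 0 or km_competitor <= 0:
        raise ValueError("Km values must be positive")
    if substrate < 0 or competitor < 0:
        raise ValueError("concentrations must be non-negative")
    apparent_km = km_substrate * (1.0 + competitor / km_competitor)
    return vmax * substrate / (apparent_km + substrate)


if __name__ == "__main__":
    # Hypothetical values in arbitrary units: a fixed amount of a putative
    # death substrate, with falling levels of the competing pro-IL-1beta.
    for pro_il1b in (100.0, 10.0, 1.0, 0.0):
        v = rate_with_competitor(substrate=1.0, km_substrate=1.0, vmax=1.0,
                                 competitor=pro_il1b, km_competitor=1.0)
        print(f"pro-IL-1beta = {pro_il1b:6.1f}  relative rate = {v:.3f}")
```

Under these assumptions the rate of turnover of the other substrate rises as the competitor level falls, which is the qualitative behaviour the hypothesis relies on.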
TNF is a central component in the mammalian host inflammatory response (Tracey, K.G. et al., Ann. Rev. Cell. Biol. 9:317-343 (1993)). In models of septic shock, injection of endotoxin (LPS) rapidly induces TNF, IL-1 and IL-6 (Fong, Y. et al., J. Exp. Med. 170:1627-1633 (1989)). Under these conditions, the secretion of IL-1β appears to be dependent upon TNF, because passive immunization with TNF monoclonal antibodies during endotoxemia in vivo attenuates the appearance of IL-1β (Fong, Y. et al., J. Exp. Med. 170:1627-1633 (1989)). The results herein show that TNF plays a role in activating ICE, the key enzyme in processing IL-1β.
Expression of mitochondrial manganese superoxide dismutase has been shown to promote the survival of tumor cells exposed to TNF (Hirose, K. et al., Mol. Cell. Biol. 13:3301-3310 (1993)). This suggests that the generation of free radicals plays a role in cell death induced by TNF. There are several reports that TNF cytotoxicity is related to the generation of free radicals and lipid peroxides (Hennet, T. et al., Biochem. J. 289:587-592 (1993); Schulze-Osthoff, K. et al., J. Biol. Chem. 267:5317-5323 (1992)). If that is the case, ICE may be activated directly or indirectly by free radicals.
The death of HeLa cells induced by TNF is also suppressed by bcl-2 overexpression (the viability is 83.7 ± 2.3% under the experimental conditions described for Table ). Bcl-2 has been suggested to have the ability to either inhibit the production of free radicals (Kane, D.J. et al., Science 262:1274-1277 (1993)) or prevent free radicals from damaging cells (Hockenberry, D.M. et al., Cell 75:241-251 (1993)). Thus, in HeLa cells, bcl-2 and crmA may suppress cell death induced by TNF through a single biochemical pathway of programmed cell death.
Fas/Apo (ref)-antigen is a member of the TNF receptor family (Itoh, N. et al., Cell 66:233-243 (1991)). Apoptosis can be induced by stimulation of Fas-antigen with anti-Fas/anti-Apo antibody (Yonehara, S. et al., J. Exp. Med. 169:1747-1756 (1989)) or with Fas-ligand (Suda, T. et al., Cell 75:1169-1178 (1993)), a type II transmembrane protein homologous to TNF. Cell death signalling after the stimulation of Fas-antigen is largely unknown. However, Fas-mediated cell death is suppressed by overexpression of E1B (Hashimoto, S. et al., Intern. Immun. 3:343-351 (1991)) or bcl-2 (Itoh, N. et al., J. Immun. 151:621-627 (1993)). The data presented herein suggest that stimulation of Fas-antigen may also activate the ICE/ced-3 cell death pathway.
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT:
NAME: The General Hospital Corporation
STREET: Fruit Street
CITY: Boston
STATE: Massachusetts
COUNTRY: U.S.A.
POSTAL CODE (ZIP): 02114
(ii) TITLE OF INVENTION: Programmed Cell Death Genes and Proteins
(iii) NUMBER OF SEQUENCES:
(iv) COMPUTER READABLE FORM: MEDIUM TYPE: Floppy disk COMPUTER: IBM PC compatible OPERATING SYSTEM: PC-DOS/MS-DOS SOFTWARE: PatentIn Release Version #1.30 (EPO)
(v) CURRENT APPLICATION DATA: APPLICATION NUMBER: PCT/US96/00177
(vi) PRIOR APPLICATION DATA: APPLICATION NUMBER: US 08/368,704 FILING DATE: 04-JAN-1995
INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS: LENGTH: 7653 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(ix) FEATURE: NAME/KEY: CDS LOCATION: join(2232..2366, 2430..2576, 2855..3109, 4305..4634, 5547..5759, 5817..5942, 6298..6537, 7012..7075)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AGATCTGAAA TAAGGTGATA AATTAATAAA TTAAGTGTAT TTCTGAGGAA ATTTGACTGT 60
TTTAGCACAA TTAATCTTGT TTCAGAAAAA AAGTCCAGTT TTCTAGATTT TTCCGTCTTA 120
TTGTCGAATT AATATCCCTA TTATCACTTT TTCATGCTCA TCCTCGAGCG GCACGTCCTC 180
AAAGAATTGT GAGAGCAAAC GCGCTCCCAT TGACCTCCAC ACTCAGCCGC CAAAACAAAC 240
GTTCGAACAT
TTTTCTTTGT
GGCTCGCCGA
TTTAACCTTG
TAGTTTACTA
GCTCATAGAT
CAAAAACAAT
ACCACTCCAT
CTCATTTGGT
AATCCAAATC
GATCAGGAGC
GTCGTCCTTG
CACCTGTCTC
TTGCTGCTGC
TTTTTTTTCC
TATATACAAT
TCGTGTGCTA
ATTTATCGTA
TAAATCGGCT
AACATATTTG
TGTAGCGCTT
TGATAACCCG
GCTACGAGAT
TCCGTAATAA
ATGGGTCTCG
AATTACTAAA
TTCCAGGCTG
TCGTGTGTTG
TCTTTTTGTT
TTTATTGTTG
GTTTTTGCAT
ATAAAACTAC
TTTCGATACT
CCTAAGATTT
CACCTCTTTG
ATGCTCTTTT
GCATTATATT
TTTCAGGGTA
GTATCCTCAA
CGTCTCAATT
TACAATCCAC
GCGAAATTTG
CCATAAGAAT
ACATCTTATT
AAATTATCAT
CGACATTATC
ACGGCAAAAT
GTGTCGATTT
TAAATCGTCA
ATTTTGCGCG
TTTCACAAGA
GCACGCAAAA
ATTTTCGTGA
ACAAACAGAA
TGCTCCTTTT
GAACGTGTTG
CCAGAAAGAT
TGTTTCGTTT
TTTTAAACCT
CAAATCCAAA
CCACATGTTT
GCGGTGTTCT
CGATTTTATA
TGTGCATGGA
AACGCCCGGT
CTTGTCCCGG
ATCGTTTAGA
TTCTTTTCT
CAATAAACCG
ATCTTCTCAA
TTTATAATAT
AATAGCACCG
GTATTAAGGA
ATCTCGTAGC
ACGGGCTCAA
CAACGCTACA
CCAAATATGA
TTTTGGCATT
AGTTTGATAG
ATTTTTCTGT
ACAAAAACAC
CCGTTATCTT
CTAAGCAATT
TCTGAGATTC
AAAAAAACCA
TTACCTTTAC
AATAAATTTA
GACCTCTCCG
TCGAAACCCA
GCTCTTTGTC
GGCAAATGAC
TCATTTTGTA
TTTTGTTTTC
AATGTGAACT
CATCGGCAGT
GCCAAJAAACT
TGTTTATGAT
TTCCGCTAAA
AAAACTACTA
ATCACAAAAT
GAAAACTACA
TTTTTGAAAA
GTAGTCATTT
CTGTAATACG
CCACTTTAAA
ACTTTTAAAT
TAAAATTTTT
AACAAACATT
GCAGTCATCT
ATTACATCAA
TCGAAGTCGA
CTGTTTATGT
CTCACCGCTC
CGAGGGCAAT
GCACCTTCTT
CTTAGGAAAG
GCAATTTCAA
GGGGTTGGAA
CCACATTTCA
GGTACACTCT
GTCCAGATGG
CTTACGAGCC
TTCTCCAAAT
TTCTTCGCAG
ATTCCGATTT
AAAATGGTAA
TCTGAGAATG
GTAATTCTTT
TAATTTTTTT
AAAGGATTAC
CATTCTCTGA
GGCGCACAGG
TCTCCTTGCA
AAAATCAGTT
TTAAAAATCA
TTTGTCGTTT
TTGAAGAAAA
TTTTATAATA
GAAAAACGAT
CGTGTTCATG
TAATGTGAAA
CCTTAGCCCC
CAGTGTGTAT
TGCTTTAAAC
TCTTAGATGA
TCATTTTCCT
TCCGTGATGC
GTGACTCATA
CATCATAAAC
TGTTACGCAA
CACTTTCTCT
TTGAGTATTA
AAGCTCCTTT
CGTACTGCGC
AAATGACTAC
TTTCGAATTT
TGTAGTTCTA
ATTTTGTGTT
ATTTATTCCA
TTTTTAATTC
TTCTAATATT
GTTTTCAAAT
300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 TAAAAATAAC GATTTCTCAT TGAAAATTGT GTTTTATGTT TGCGAAAATA AAAGAGAACT GATTCAAAAC AATTTTAACA AAAAAAAACC CCAAAATTCG CCAGAAATCA AGATAAAAAA 1980 TTCAAGAGGG TCAAAATTTT CCGATTTTAC TGACTTTCAC CTTTTTTTTC GTAGTTCAGT 2040 GCAGTTGTTG GAGTTTTTGA CGAAAACTAG GAAAAAAATC GATAAAAATT ACTCAAATCG 2100 AGCTGAATTT TGAGGACAAT GTTTAAAAAA AAACACTATT TTTCCAATAA TTTCACTCAT 2160 TTTCAGACTA AATCGAAAAT CAAATCGTAC TCTGACTACG GGTCAGTAGA GAGGTCAACC 2220 ATCAGCCGAA G ATG ATG CGT CAA GAT AGA AGG AGC TTG CTA GAG AGG AAC 2270 Met Met Arg Gin Asp Arg Arg Ser Leu Leu Glu Arg Asn 1 5 ATT ATG ATG TTC TCT AGT CAT CTA AAA GTC GAT GAA ATT CTC GAA GTT 2318 Ile Met Met Phe Ser Ser His Leu Lys Val Asp Giu Ile Leu Glu Val 20 CTC ATC GCA AAA CAA GTG TTG AAT AGT GAT AAT GGA GAT ATG ATT AAT 2366 Leu Ile Ala Lys Gin Vai Leu Asn Ser Asp Asn Gly Asp Met Ile Asn 35 40 GTGAGTTTTT AATCGAATAA TAATTTTAAA AAAAAATTGA TAATATAAAG AATATTTTTG 2426 CAG TCA TGT GGA ACG GTT CGC GAG AAG AGA CGG GAG ATC GTG AAA GCA 2474 Ser Cys Gly Thr Vai Arg Giu Lys Arg Arg Giu Ile Val Lys Ala 55 GTG CAA CGA CGG GGA GAT GTG GCG TTC GAC GCG TTT TAT GAT GCT CTT 2522 Val Gin Arg Arg Gly Asp Val Ala Phe Asp Ala Phe Tyr Asp Ala Leu 70 CGC TCT ACG GGA CAC GAA GGA CTT GCT GAA GTT CTT GAA CCT CTC GCC 2570 Arg Ser Thr Gly His Giu Gly Leu Ala Giu Val Leu Giu Pro Leu Ala 85 AGA TCG TAGGTTTTTA AAGTTCGGCG CAAAAGCAAG GGTCTCACGG AAAAAAGAGG 2626 Arg Ser CGGATCGTAA TTTTGCAACC CACCGGCACG GTTTTTTCCT CCGAAAATCG GAAATTATGC 2686 ACTTTCCCAA ATATTTGAAG TGAAATATAT TTTATTTACT GAAAGCTCGA GTGATTATTT 2746 ATTTTTTAAC ACTAATTTTC GTGGCGCAAA AGGCCATTTT GTAGATTTGC CGAAAATACT 2806 TGTCACACAC ACACACACAC ATCTCCTTCA AATATCCCTT TTTCCAGT GTT GAC TCG 2863 Val Asp Ser AAT GCT GTC GAA TTC GAG TGT CCA ATG TCA CCG GCA AGC CAT CGT CGG 2911 Asn Ala Val Giu Phe Glu Cys Pro Met Ser Pro Ala Ser His Arg Arg 100 105 110 AGC CGC GCA TTG AGC CCC GCC GGC TAC ACT TCA CCG ACC CGA 
GTT CAC 2959 Ser Arg Ala Leu Ser Pro Ala Gly Tyr Thr Ser Pro Thr Arg Val His 115 120 125 CGT GAC AGC GTC TCT TCA GTG TCA TCA TTC ACT TCT TAT CAG GAT ATC Arg Asp Ser Val Ser Ser Val Ser Ser Phe Thr Ser Tyr Gin Asp Ile 130 135 140 145 TAC TCA AGA GCA AGA TCT CGT TCT CGA TCG CGT GCA CTT CAT TCA TCG Tyr Ser Arg Ala Arg Ser Arg Ser Arg Ser Arg Ala Leu His Ser Ser 150 155 160 GAT CGA CAC AAT TAT TCA TCT CCT CCA GTC AAC GCA TTT CCC AGC CAA Asp Arg His Asn Tyr Ser Ser Pro Pro Val Asn Ala Phe Pro Ser Gin 165 170 175 CCT TCT ATGTTGATGC GAACACTAAA TTCTGAGAAT GCGCATTACT CAACATATTT Pro Ser 3007 3055 3103 3159
GACGCGCAAA
GATTTACGGG
TTGTCATTTT
AGACACCAGC
TGAAAAGAAT
CAAGTTTCGC
ATCATTAATG
TGAAATTGCT
TTTTTATATA
TTTCGACATI
AGAAATACCA
GATGTAAAAA
AAATGAATAG
TAACCCCATT
AAATTTCTTC
TGAAAGTCAG
GAAAAAATGC
AAAGGGAAAT
GCGAAACGTC
TATCTCGTAG
CTCGATTTTC
TGTGTTTTCT
GCTACAGTAC
TTTAAACATT
AGATTTTTTG
TGAATTTCTT
CAACGAAAAA
TTTTCCCCTA
TTTTACATCG
CACATCTTTC
AATGTCGAAT
GGGGAAAATG
ATTTTTTCGT
AAGATATTAC
CAAAAATATG
ATTAAAATAC
CAAAATTTCT
AAAAAAGAGG
CGAAAAATAC
GAAACGAATA
TTTGATATTT
TCTTTTAAAG
TTGAAAAAAA
ATTTTTTTCA
GTAGAAATTT
ATCATGTGGT
TTCATGTTGT
AACGACAGCT
TGCGTCTCTC
AATGTAAAAA
TATTAAAATA
TGAGCAACTT
CTTTATTGAT
TGCGAAACAC
ATTTTTGCAT
AGAGGATATA
AALATTTGGGT
AGTAACCCTT
TATGCTCGAA
TTGATCAATT
AGTTACAGTA
ATCATCTAAC
TTCAAGATAT
TGGGCTTTTC
TTGTTCATAT
GCAGAAAAAT
CACTTCACAT
GTCTTCAGCA
ATGCATGCGT
CATTTTTTGT
AAAAAGTAGA
AATTATAGAT
CTGAAAAAAA
TTTTCTACAT
ATTGAATGAA
ATCAAAATCG
TAAATGACTA
TTGTGACAAC
AATAAATTAT
GTTTTCGCTT
ATGTGCCAAA
GCTTATTAAC
GTTCTAGTAT
GAATGACGA.A
AGTAAAAAAG
GCTGAAGACG
TGTGAAATGG
TTTTTTACAC
ATTTTTCAAC
GAATATTAGA
GTTAATAAGC
TCAAAAATTC
CACATGAATG
ACATTGCGAA
ATCCTAAAAC
TTGTAGTGTC
GAATTTTAAT
TTCCGTAAAC
CAAGATATTT
ACGCTTTTTT
ACATATAATT
GCTCTACTTT
AAATAGCAAT
CGCATGCATT
AGAGACGCGG
GATCTCGGTC
TTTTCTGCAC
ATCACATGAT
GCGAAAACCA
ATATCTTGAA
TGCGAAAATT
TAGAAAATTA
ATTAAAATGT
CAACACATTT
3219 3279 3339 3399 3459 3519 3579 3639 3699 3759 3819 3879 3939 3999 4059 4119 4179 4239 4299 CAGCA TCC GCC AAC TCT TCA TTC ACC GGA TGC TCT TCT CTC GGA TAC Ser Ala Asn Ser Ser Phe Thr Gly Cys Ser Ser Leu Gly Tyr 4346 AGT TCA Ser Ser 195 TAC ATA Tyr Ile AGT CGT AAT CGC Ser Arg Asn Arg TCA TTC AGC Ser Phe Ser 200 GAT ATG AAC Asp Met Asn AAA GCT TCT Lys Ala Ser 205 TTT GTC GAT Phe Val Asp GGA CCA ACT CAA Gly Pro Thr Gin TTC CAT GAA Phe His Giu GCA CCA ACC Ala Pro Thr 220 ATG TAC AGA Met Tyr Arg CGT GTT TTC GAC Arg Val Phe Asp 230 AAA ACC Lys Thr AAC TTC TCG Asn Phe Ser AGT CCT Ser Pro 240 0O
*S CGT GGA ATC Arg Gly Met ACA CGG AAI Thr Arg Asr 26C AGA TGC ATG Arg Cys Met 275
GTACGGCGAA
CCCGGTCTCG
AATTTTGAAC
CATTATGTGT
TTCAGACAAT
AATTATCCAA
AAATGTATTT
TCACAATTCA
ATCATGAAGG
AACAAATCGA
TTGAAATTAC
GTGTCGAGAC
TCCTAATCAC
TCGAATTTCG
TGC CTC Cys' Leu 245 ATC ATA AAT Ile Ile Asn CAC TTT GAG His Phe Giu
CAC
Gir 255 GGT ACC AAG Gly Thr Lys GCC GAC Ala Asp 265 GGC TAT ACG GTT ATT TGC Gly Tyr Thr Val Ile Cys
ATTATATTAC
ACACGACAAT
TTCCGCGAAA
TTTTTCTTAG
TTTCCGCATA
AAATGCACAA
TTAAAAACTT
AATTCAAAAG
ATTTAGAAAA
GAAAAAGAGA
AGTACTCCTT
CAGGTACCGT
CAAAAAGTAA
ATTTTTTTTT
CCAAACGCGA
TTGTGTTAAA
ATGATTTACC
TTTTTCTATA
CAAAACTTGA
TTTAAAATTT
TAAAAACCAC
TTATTCATCC
GTTTTATAAC
ATGAAAAATC
AAAGGCGCAC
AGTTTTTGTC
AATTGAAATC
TGGTTTTTTG
AAI
TGC
TAG
ATA
TAG
GTG
TCC
GATI
ATI
GAT
ACC
GC~
TTC
GTC
GAC AAT CTT ACC AA I Asp Asn Leu Thr Asr 270 AAG GAC AAT CTG ACC Lys Asp Asn Leu Thi 285 TTGCCAT TTTGCGCC!GA :AAAAATG TATAATTTTG TTTCGAA ATTTTCGTTT TTTGATG TAAAAAACCG CACGAAA TCAATTTTCT AAAATTG GCAAACGGTG :GGAAAAG CAATAAAAAT TTGTTTA TTTTTGCAAA TTTTCTA GATTTTTCAA TTTAAAA ATATCCACAG CCATTTG CATTGGACCA AAAATTG CACCATTGGA :GAAAAGC CAAAAAATTC CCAAAAA CCAAAAAAAT ATG CCA Met Pro TTG TTC Leu Phe GGA AGG Gly Arg
AAATGTGGCG
CAAAAAACAA
TTTCCGGCTA
TTTGTAAATT
GAATTTTCAA
TTTCAATATG
CAAAACAACG
ATTTGAAAAA
AATTTTTTTT
CTTCGAGAGT
AAAATTTGTC
CAATAAACCT
AAAAAAAAAG
CAATTTTCTG
4394 4442 4490 4538 4586 4634 4694 4754 4814 4874 4934 4994 5054 5114 5174 5234 5294 5354 5414 5474 CAAAATACCA AAAAGAAACC CGAAAAAATT TCCCAGCCTT GTTCCTAATG TAAACTGATA TTTAATTTCC AG GGA ATG CTC CTG ACA ATT CGA GAC TTT GCC AAA CAC Gly Met Leu Leu Thr Ile Arg Asp Phe Ala Lys His 290 295 300 GAA TCA CAC GGA GAT TCT GCG ATA CTC GTG ATT CTA TCA CAC GGA GAA Giu Ser His Gly Asp Ser Ala Ile Leu Val Ile Leu Ser His Gly Glu 305 310 315 GAG AAT GTG ATT ATT GGA GTT GAT GAT ATA CCG ATT AGT ACA CAC GAG Giu Asn Val Ilie Ile Gly Val Asp Asp Ile Pro Ile Ser Thr His Giu 320 325 330 ATA TAT GAT CTT CTC AAC GCG GCA AAT GCT CCC CGT CTG GCG AAT AAG Ile Tyr Asp Leu Leu Asn Ala Ala Asn Ala Pro Arg Leu Ala Asn Lys 335 340 345 CCG AAA ATC GTT TTT GTG CAG GCT TGT CGA GGC G GTTCGTTTT TTATTTTAAT Pro Lys Ile Val Phe Val Gin Ala Cys Arg Gly Glu 350 355 360 TTTAATATAA ATATTTTAAA TAAATTCATT TTCAG AA CGT CGT GAC AAT GGA TTC Arg Arg Asp Asn Gly Phe 365 CCA GTC TTG GAT TCT GTC GAt GGA GTT CCT GCA TTT CTT CGT CGT GGA Pro Val Leu Asp Ser Vai Asp Gly Val Pro Ala Phe Leu Arg Arg Gly 370 375 380 TGG GAC AAT CGA GAC GGG CCA TTG TTC AAT TTT CTT GGA TGT GTG CGG Trp Asp Asn Arg Asp Gly Pro Leu Phe Asn Phe Leu Gly Cys Val Arg 385 390 395 CCG CAA GTT CAG GTTGCAATTT AATTTCTTGA ATGAGAATAT TCCTTCAAAA Pro Gin Val Gin 400 AATCTAAAAT AGATTTTTAT TCCAGAAAGT CCCGATCGAA AAATTGCGAT ATAATTACGA AATTTGTGAT AAAATGACAA ACCAATCAGC ATCGTCGATC TCCGCCCACT TCATCGGATT GGTTTGAAAG TGGGCGGAGT GAATTGCTGA. 
TTGGTCGCAG TTTTCAGTTT AGAGGGAATT TAAAAATCGC CTTTTCGAA.A ATTAAAAATT GATTTTTTCA ATTTTTTCGA AAAATATTCC GATTATTTTA TATTCTTTGG AGCGAAAGCC CCGTCCTGTA AACATTTTTA AATGATAATT AATAAATTTT TGCAG CAA GTG TGG AGA AAG AAG CCG AGC CAA GCT GAC ATT 5534 5582 5630 5678 5726 5779 5834 5882 5930 5982 6042 6102 6162 6222 6282 6333 Gin Val Trp Arg 405 Lys Lys Pro Ser Gin Ala Asp Ile 410 415 CTG ATT CGA TAC GCA ACG ACA GCT CAA TAT GTT TCG TGG AGA AAC ACT Leu Ilie Arg Tyr Ala Thr Thr Ala Gin Tyr Val Ser Trp Arg Asn Ser 420 425 430 6381 GCT CGT OGA Ala Arg Gly CAC GCA AAG His Ala Lys 450 AAG GTC GCT Lys Val Ala 465 CAG ATG CCA TCA TGG TTC ATT CAA GCC GTC TGT GAA GTG TTC TCG ACA Ser Trp Phe Ile Gin Ala Val Cys Giu Val Phe Ser Thr 435 440 445 GAT ATG GAT OTT OTT GAG CTG CTG ACT GAA GTC AAT AAG Asp Met Asp Vai Vai Giu Leu Leu Thr Giu Val Asn Lys 455 460 TGT GGA TTT CAG ACA TCA CAG GGA TCG A.AT ATT TTG AAA Cys Gly Phe Gin Thr Ser Gin Gly Ser Asn Ile Leu Lys 470 475 GAG GTACTTGAAA CAAACAATGC ATGTCTAACT TTTAAGGACA 6429 6477 6525 6577 Gin Met Pro Glu 0 9 0000 0 *00* 0000 00 0 0* 00 0* 0* 0 0000 0 0000 0 0090
CAGAAAAATA
TTTTAGCTAA
AATAGTCACT
AAAACGAAAA
GCGAAAATTA
ATTCAAAGTT
CCGCCTSTTT
TAAATTAAAT
GGCAGAGGCT CCTTTTGCAA AATGATTGAT TTTGAATATT ATTTATCGGG TTTCCAGTAA TTTGTAGTTT TTCAACGAAA CATCAACCAT CAAGCATTTA GTCCACGAGT ATTACACGGT TTCTGTGCGG CTTGAAAACA TTCAG ATG ACA TCC CC Met Thr Ser Arg 485
GCCTGCCGCG
TTATGCTAAT
CGTCAACCTA GAATTTTAGT TTTTTTGCGT TAAATTTTGA AAAATGTTTA TTAGCCATTG GATTTTACTG TTTATCGATT TTTA.AATGTA AAAAAAAATA AGCCAAAATT GTTAACTCAT TTAAAAATTA TGGCGCGCGG CAAGTTTGCA AAACGACGCT AGGGATCGGT TTAGATTTTT CCCCAAAATT CTG CTC AAA AAG TTC TAC TTT TG Leu Leu Lys Lys Phe Tyr Phe Trp 6637 6697 6757 6817 6877 6937 6997 7048 *CCG GAA GCA CGA AAC TCT GCC GTC TAAAATTCAC TCGTGATTCA TTGCCCAATT Pro Giu Ala Arg Asn Ser Aia Val 500 7102
GATAATTGTC
TATATTGTTA
CACATTTCCA
AATTTTAATA
ATGCTTCTAT
CCATATTCAT
AAAATCTAAT
CGGCCTTTTA
ATTTCCCGCC
TGTATCTTCT
TCCTATACTC
TTTCTCTACG
ACTCGTTTTG
CAACAAAATA
TTTTGCCGGG
ATTTGAATTA
AAGTTCGGAA
CATCATCTCA
CCCCCAGTTC
ATTTCACTTT
ATAATCTAAA
AATTTGATTA
GTTTCATAGA
AATCAATTTC
ACGAATAGCA
CATTTGGCAA
AATTGCATTC
TCTTTCGCCC
ATCATTCTAT
ATTATGACOT
GTTGTTGTGC
TCATCACCCC
GATTAATTTT
TTCCCATCTC
TTATGTATAA
TTTTTTCGCC
AATTAGTTTA
CATTTCTCTT
TTGTGTCTCG
CCAGTATATA
AACCCCACCA
AACCTATTTT
TCCCGTGCCG
ATTTGTAGGT
GTGATATCCC
AAACCATGTG
CCCATTTTCA
AACGCATAAT
TGTATGTACT
ACCTACCGTA
TTCGCCACAA
GAATGCCTCC
CCCCCCCATC
GATTCTGGTC
7162 7222 7282 7342 7402 7462 7522 7582 7642 AGCAAAGATC T INFORMATION FOR SEQ ID NO:2: SEQUENCE CHARACTERISTICS: LENGTH: 503 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: Met Met Arg Gin Asp Arg Arg Ser Leu Leu Giu Arg Asn Ile Met Met 7653
*5*S 0* *5 Phe Lys Thr Gly His Ser Arg His Ile 145 Ser Gin Tyr Ser Ser Gin Val Val Arg Asp Val Glu Gly Asn Ala Ser Arg 115 Arg Asp 130 Tyr Ser Asp Arg Pro Ser Ser Ser 195 His Leu Giu Ala Leu Val1 100 Ala Se r Arg His Se r 180 Se r Lys Se r Arg Asp 70 Giu Phe Se r Se r Arg 150 Tyr Asn Asn Vai Asp Giu Asp Asn Gly 40 Arg Giu Ile 55 Ala Phe Tyr Val Leu Giu Giu Cys Pro 105 Pro Ala Gly 120 Ser Val Ser 135 Ser Arg Ser Ser Ser Pro Ser Ser Phe 185 Arg Ser Phe 200 10 Ile Leu Asp Met Val Lys Asp Pro 90 Met Tyr Ser Arg Pro 170 Thr Se r Aia Leu Se r Thr Phe Ser 155 Vali Gly Lys Giu Ile Ala Leu Ala Pro Ser Thr 140 Arg Asn Cys Ala Val1 Asn Vali Arg Arg Ala Pro 125 Se r Ala Ala Se r Se r 205 Ile Cys Arg Thr Val1 His Arg Gin His Pro 175 Le u Pro Gin Tyr Ile Phe His Giu Giu Asp Met Asn Phe Val Asp Ala Pro Thr 210 215 220 0* 0* 0S
Thr Asn Asp 265 Ile Asp Leu Ile Arg 345 Glu Pro Asn Lys Tyr 425 Val Leu Gln Leu
INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS: LENGTH: 132 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GTATTAAGGA ATCACAAAAT TCTGAGAATG CGTACTGCGC AACATATTTG ACGGCAAAAT 60
ATCTCGTAGC GAAAACTACA GTAATTCTTT AAATGACTAC TGTAGCGCTT GTGTCGATTT 120
ACGGGCTCAA TT 132
INFORMATION FOR SEQ ID NO:4:
SEQUENCE CHARACTERISTICS: LENGTH: 125 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
AAAATTCAGA GAATGCGTAT TACAGTCATA TTTGGCGCGC AAAATATCTC GTAGCTAGAA 60
CTACAGTAAT CCTTTAAATG ACTACTGTAG CGTTGTGACG ATTTACGGGT TATCAAAATT 120
CGAAA 125
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 116 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
AAATTCTGAG AATGCGCATT ACTCAACATA TTTGACGCGC AAATATCTCG TAGCGAAAAT 60
ACAGTAACCC TTTAAATGAC TATTGTAGTG TCGATTTACG GGCTCGATTT TCGAAA 116
INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS: LENGTH: 73 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TATCTTGAAG CGAAAACTAC TGTAACTCTT TAAAAGAGTA CTGTAGCGCT GGTGTCTGTT 60
TACGGAAATA ATT 73
INFORMATION FOR SEQ ID NO:7:
SEQUENCE CHARACTERISTICS: LENGTH: 132 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GTATTACGGC AAGAAATAAT TATGAGAATG CCTATTGCGC ACCATAGTTG ACGCGCAAAA 60
TATCTCGTAG CGAAAACTAC AGTAACTCTT TGAATGACTA CTGTAGCGCT TGTTTCGATT 120
TACGGGCTCG TT 132
INFORMATION FOR SEQ ID NO:8:
SEQUENCE CHARACTERISTICS: LENGTH: 111 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GTATAACGGT AACACACAAT TCTGAGAATG CGTATTGCAC AACACATTTG ACGCGCAAAA 60
TATCTCGTAG CGAAAACTAC AGTGATTCGC TGAATGAATA CGGTAGGGTC G 111
INFORMATION FOR SEQ ID NO:9:
SEQUENCE CHARACTERISTICS: LENGTH: 122 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CTATTACGGG AGTACAAAAT TCTGAGAATG CGTACTGCGC AACATATTTG ACGCGCAAAA 60
TATTTCGTAT CGAAAACTAC AGTAATTCGT TTATTGGCTA CTGTGCGTGT TGATTTACGG 120
GC 122
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 124 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
GGGAGCACAA AATTCTGACT ATGAGAATGC GTATAAGCAC AAAATATTTC GTAGCGAAAA 60
CTACAGTAAT TTGTCAAGGG ACTACTGTAG CTAGCGCTTG TGTCGATTTA CGGAGCTCGA 120
TTTT 124
INFORMATION FOR SEQ ID NO:11:
SEQUENCE CHARACTERISTICS: LENGTH: 74 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
TATAGGAAAA ATTGAATGAT CAATTGCGCA AAATATTGAC AAACTACGTA AGTAGTAGTG 60
TTTTACGGTT GAAA 74
INFORMATION FOR SEQ ID NO:12:
SEQUENCE CHARACTERISTICS: LENGTH: 269 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TCATTCAAGA TATGCTTATT AACACATATA ATTATCATTA ATGTGAATTT CTTGTAGAAA 60
TTTTGGGCTT TTCGTTCTAG TATGCTCTAC TTTTGAAATT GCTCAACGAA AAAATCATGT 120
GGTTTGTTCA TATGAATGAC GAAAAATAGC AATTTTTTAT ATATTTTCCC CTATTCATGT 180
TGTGCAGAAA AATAGTAAAA AGCGCATGCA TTTTTCGACA TTTTTTACAT CGAACGACAG 240
CTCACTTCAC ATGCTGAAGA CGAGAGACG 269
INFORMATION FOR SEQ ID NO:13:
SEQUENCE CHARACTERISTICS: LENGTH: 280 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
TCATTCAAGA TATGCTTATT AACATCTATA ATTATCAATA AAGGTAATAT CTTGAAGAAA 60
TTTTGGTTTT CGCTCTAATA TTCTCTACTT TTTAAGTTGC TCAACGAAAA AATAATGGGG 120
TTAATCATGT GATGTTGAAA AATACAAAAA ATGTATTTTA ATACATTTTC CCCCTATTCA 180
TTTGTGCAGA AAAGTGAAAA AAACGCATGCA TTTTTTACAT TATTCGACAT TTTTTTACA 240
TCGACCGAGAT CCCATTTCAC ATGCTGAAGA CGAGAGACG 280
INFORMATION FOR SEQ ID NO:14:
SEQUENCE CHARACTERISTICS: LENGTH: 226 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
TCATTCAAGA TATGCTTATT AACATATAAT TATCATAAGA ATTCTTGAGA AATTTTGGTT 60
TTCGTCTATA TCTCTACTTT TAATTGCTCA ACGAAAAAAT CATGTGATGG AAAAATAAAT 120
TTTTATAATT TTCCCCTATT CATTTGTGCA GAAAATGTAA AAAACGCATG CATTTTTCGA 180
CATTTTTTAC ATCGACGAAC CATTCACATG CTGAAGACGA GAGACG 226
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 108 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
CAGCTTCGAG AGTTTGAAAT TACAGTACTC CTTAAAGGCG CACACCCCAT TTGCATTGGA 60
CCAAAAATTT GTCGTGTCGA GACCAGGTAC CGTAGTTTTT GTCGCAAA 108
INFORMATION FOR SEQ ID NO:16:
SEQUENCE CHARACTERISTICS: LENGTH: 31 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
ACAAATTGTC GTGTCGAGAC CGGGCGCCAC A 31
INFORMATION FOR SEQ ID NO:17:
SEQUENCE CHARACTERISTICS: LENGTH: 46 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CAGCAACAAA TGTTTGAAAT TACAGTAATC TTTAAAGGCG CACACC 46
INFORMATION FOR SEQ ID NO:18:
SEQUENCE CHARACTERISTICS: LENGTH: 49 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
AACAAAACTT TGTCGTGTCG AGACCGGGTA CCGTATTTTT AATTGCAAA 49
INFORMATION FOR SEQ ID NO:19:
SEQUENCE CHARACTERISTICS: LENGTH: 43 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
CTGCAACGAA AGTCTGAAAT TACAGTACCC TTAAAGGCGC ATA 43
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 95 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
GTTAGAAACT AGAGTACCTC TTAAAGGCGC ACATCCTTTC CCACCTATCG AAAATTTGTC 60
GTGTCGAGAC CGGGTAGCTA ATTTTATGCC AAAAA 95
INFORMATION FOR SEQ ID NO:21:
SEQUENCE CHARACTERISTICS: LENGTH: 114 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CAGCAACAAA AGTTTGAAAT TACAGTGCTC TTTAAAGGCA CACACCTTTT TACATTTAAC 60
AAAAAAGTGT CGCTTCGAGA CCGGGTACCG TGTTTTTGGC GCAAAAATCG CTAT 114
INFORMATION FOR SEQ ID NO:22:
SEQUENCE CHARACTERISTICS: LENGTH: 114 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CAGAAGCGAA AATTTGAAAT TACAGTACTC TTTAAACGCT CAACCCCGTT TCTATTCAAT 60
AGAAAGTTGT CGTTTCGAGA CCGGACACCG TATTTTTGGC GCAAAATATA CCTG 114
INFORMATION FOR SEQ ID NO:23:
SEQUENCE CHARACTERISTICS: LENGTH: 107 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GATTGTTGAA AATTACAGTA ATCTTTAAAG GCGCACACAC GTTTGTATTT TACAGAAAAT 60
TCTCGTTTCG AGACCGAACA CAGTATTTTT GGCGGAGAAA TTCTAAA 107
INFORMATION FOR SEQ ID NO:24:
SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
TTTGTCGTGT CGAGACCTGG 20
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 103 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
TTTTTAAACT ACAGTACTCT TTAGGAGCGC ACATTTTTTC GCATTTAACA AATTTTTGTC 60
GTGGCGAGAC CTGATACCGT ATTTTTAGGT CAAGATTACT AGG 103
INFORMATION FOR SEQ ID NO:26:
SEQUENCE CHARACTERISTICS: LENGTH: 42 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
ACCGTTTGAA ACTACAGTAC TCTTTAAAGG CGCGTTTGTC GT 42
INFORMATION FOR SEQ ID NO:27:
SEQUENCE CHARACTERISTICS: LENGTH: 42 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
CGCTAAATAA GTTTAGCCAA TTTAATTCGC GAGACCCTTT AA 42
INFORMATION FOR SEQ ID NO:28:
SEQUENCE CHARACTERISTICS: LENGTH: 77 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
AACCAATCAG CATCGTCGAT CTCCGCCCAC TTCATCGGAT TGGTTTGAAA GTGGGCGGAG 60
TGAATTGCTG ATTGGTC 77
INFORMATION FOR SEQ ID NO:29:
SEQUENCE CHARACTERISTICS: LENGTH: 77 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
AACCAATTAG CGACTTCGGA ATTTCCATAC TTAATCTGAT TGGTTGAAGA ATGGGCAGAG 60
CGAATTGCTG ATTGGCC 77
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS: LENGTH: 56 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
AACCAATAGC CTCGTCACTT ATCGATTGGT TAATGGGCGA GGAATTGCTG ATTGGC 56
INFORMATION FOR SEQ ID NO:31:
SEQUENCE CHARACTERISTICS: LENGTH: 59 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
TTTTAAGGAC ACAGAAAAAT AGGCAGAGGC TCCTTTTGCA AGCCTGCCGC GCGTCAACC 59
INFORMATION FOR SEQ ID NO:32:
SEQUENCE CHARACTERISTICS: LENGTH: 60 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
TTTCAAGCCG CACAGAAAAA GAGGCGGAGC GTCGTTTTGC AAACTTGCCG CGCGCCAACC 60
INFORMATION FOR SEQ ID NO:33:
SEQUENCE CHARACTERISTICS: LENGTH: 48 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
TTTAAGCACA GAAAAAAGGC GAGTCTTTTG CAACTGCCGC GCGCAACC 48
INFORMATION FOR SEQ ID NO:34:
SEQUENCE CHARACTERISTICS: LENGTH: 2482 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both
(ix) FEATURE: NAME/KEY: CDS LOCATION: 16..1527
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
AACCATCAGC CGAAG ATG ATG CGT CAA GAT AGA AGG AGC TTG CTA GAG AGG
Met Met Arg Gln Asp Arg Arg Ser Leu Leu Glu Arg
AAC
Asn
GTT
Val1
AAT
Asn
GTG
Val1
CC
Arg
AGA
Arg
GCA
Ala
CCG
Pro 125
TCT
Ser
GCA
Ala
CCA
Ala
ATG
Met
ATC
Ile
TGT
Cys
CGA
Arg
ACG
Thr
GTT
Val1
CAT
His
CGA
Arg
CAG
Gin
CAT
His
CCC
Pro 175 TTC TCT AGT CAT CTA AAA GTC GAT GAA ATT CTC GAA 435 180 185 TCT TCT CTC GGA TAC ACT TCA AGT CCT AAT CGC TCA TTC AGC AAA GCT -109- *b 0* S S
Ser Ser 190 TCT GGA Ser Gly 205 GAT GCA Asp Ala AAC TTC Asn Phe TTT GAG Phe Glu CTT ACC Leu Thr 270 AAT CTG Asn Leu 285 CAC GAA His Glu GAA GAG Glu Glu GAG ATA Giu Ile AAG CCG Lys Pro 350 AAT GGA Asn Gly 365 CGT CGT Arg Arg TGT GTG Cys Val GCT GAC Gly
ACT
Thr
ACC
Thr
AGT
Ser 240
ATG
Met
TTG
Leu
GGA
Gly
CAC
His
GTG
Val1 320
GAT
Asp
ATC
Ile
CCA
Pro
TGG
Trp,
CCG
Pro 400
CTG
Asn
GAA
Glu
GAC
Asp 230
CTC
Leu
ACC
Thr
TAT
Tyr
ACA
Thr
CTC
Leu 310
GAT
Asp
AAT
Asn
TGT
Cys
GAC
Asp
CCA
Pro 390
TGG
Trp
ACA
Phe
ATG
Met
ACC
Thr
PAT
Asn
GAC
Asp 265
ATT
Ile
GAC
Asp
CTA
Leu
ATT
Ile
CGT
Arg 345
GA
Giu
CCT
Pro
PAT
Asn
AAG
Lys
TAT
Se r
PAC
Asn
ATG
Met
PAT
Asn 250
PAG
Lys
TGC
Cys
TTT
Phe
TCA
Ser
AGT
Ser 330
CTG
Leu
CGT
Arg
GCA
Ala
TTT
Phe
CCG
Pro 410
GTT
675 723 771 819 867 915 963 1011 1059 1107 1155 1203 1251 1299 -110- Ala Asp Ile 415 AGA AAC AGT Arg Asn Ser Leu Ile Arg Tyr Thr Thr Ala Gin Tyr Val Ser Trp GCT CGT GGA Ala Arg Gly TTC ATT CA Phe Ile Gin GCC GTC TGT GAA GTG Ala Vai Cys Giu Val 440 430 TTC TCG Phe Ser 445 1347 1395 ACA CAC GCA Thr His Ala ATG GAT GTT Met Asp Vai GTT GAG CTG CTG ACT GAA Val Glu Leu Leu Thr Giu 455 460 GTC AAT AAG AAG Val Asn Lys Lys GCT TGT GGA TTT Ala Cys Gly Phe
ACA
Thr f 0* 0 *0 0 0! ~0* ATT TTG AA Ile Leu Lys TAC TTT TGG Tyr Phe Trp 495
TTGCCCAATT
AAACCATGTG
CCCATTTTCA
AACGCATAAT
TGTATGTACT
ACCTACCGTA
TTCGCCACAA
GAACTCCCGG
CCCCCCCATC
GATTCTGGTC
TTGAAAATTT
AACACAAATT
TCTTCCAAAA
TCCGCGCACC
ATATTTATAG
TACATAGATA
CAG ATG CC Gin Met Pi 480 CCG GAA GC Pro Giu A]
GATAATTGTC
TATATTGTTA
CACATTTCCA
AATTTTAATA
ATGCTTCTAT
CCATATTCAT
AAAATCTAAT
CCTTTTAAAG
ATTTCCCGCC
AGCAAAGATC
TCGAAATTCC
TACTTGAAAC
TACTCTTGTA
CAATAAGTTT
ATTATTATAG
AAAGAAAACA
~A GAG ATG ACA TCC CC o0 Glu Met Thr Ser Arg 485 ~A CGA AAC TCT GCC GTC a Arg Asn Ser Ala Val 500 TGTATCTTCT CCCCCAGTTC TCCTATACTC ATTTCACTTT TTTCTCTACG ATAATCTAAA ACTCGTTTTC AATTTGATTA
CAACAAAATA
TTTTGCCGGG
ATTTGAATTA
TTCGGAACAT
CATCATCTCA
TTTCTCATAG
AGTAAATAAT
CCCATTTTCC
CGTTTATTAT
TATATATGTT
TTG3CTTTTCT
AGGTAAAAAA
GTTTCATAGA
AATCAATTTC
ACGAATAGCA
TTGGCCAATT
AATTGCATTC
CCCATTCCAT
ATTGGAAAAT
AAAATTTCAA
ATTTCCCTCG
GAGATTTATA
CGCCGTATGT
AGGAATTC
TCA CAG GGA TCG AAT Ser Gin Gly Ser Asn 475 CTG CTC AAA AAG TTC Leu Leu Lys Lys Phe 490 TAAAATTCAC TCGTGATTCA
TCTTTCGCCC
ATCATTCTAT
ATTATGACGT
GTTGTTGTGC
TCATCACCCC
GATTAATTTT
TTCCCATCTC
ATGTATAAAA
TTTTTTCGCC
TCGAGCTTTC
GGATTTTTCG
TTTTTTCAAA
TTTTTCTTCG
TAGCTTGTTA
TTGTGTGTGT
AATTAGTTTA
CATTTCTCTT
TTGTGTCTCG,
CCAGTATATA
AACCCCACCA
AACCTATTTT
TCCCGTGCCG
TTTTGTAGGT
GTGATATCCC
TAATAGGAAT
AGTTTTCAGC
TTTCCCGCTA
ATTCCTCCTC
TTATAATTAT
GTGATTGGTA
1443 1491 1544 1604 1664 1724 1784 1844 1904 1964 2024 2084 2144 2204 2264 2324 2384 2444 2482
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 503 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein S S
(xi) SEQUENCE Met Arg Gln Asp Ser Ser His Leu Gin Val Leu Asn 35 Val Arg Glu Lys Asp Val Ala Phe Glu Gly Leu Ala Asn Ala Val Glu 100 Ser Arg Ala Leu 115 Arg Asp Ser Val 130 Tyr Ser Arg Ala Asp Arg His Asn 165 Pro Ser Ser Ala 180 Ser Ser Ser Arg 195 Tyr Ile Phe His 210 Ser Arg Val Phe Lys Ser Arg Asp 70 Glu Phe Ser Ser Arg 150 Tyr Asn Asn Glu Asp 230 le Leu sp Met al Lys sp Ala ro Leu et Ser yr Thr er Phe rg Ser 155 ro Val hr Gly er Lys sn Phe et Tyr 235 DESCRIPTION: SEQ ID Arg Arg Ser Leu Leu Glu An, Asn Val1 Asn Val1 Arg Arg Ala Pro 125 Ser Ala Al a Ser Ser 205 Asp Asn -112- Pro Arg Gly Met Cys Leu Ile Ile Asn Asn Giu His Phe Giu Gin Met 245 250 255 Pro Thr Arg Asn Giy Thr Lys Ala Asp Lys Asp Asn Leu Thr Asn Leu 260 265 270 Phe Arg Cys Met Gly Tyr Thr Val Ile Cys Lys Asp Asn Leu Thr Gly 275 280 285 Arg Gly Met Leu Leu Thr Ile Arg Asp Phe Ala Lys His Giu Ser His 290 295 300 Gly Asp Ser Ala Ile Leu Val Ile Leu Ser His Gly Giu Giu Asn Vai 305 310 315 320 le Ile Gly Val Asp Asp Ile Pro Ile Ser Thr His Giu Ile Tyr Asp 325 330 335 Leu Leu Asn Ala Ala Asn Ala Pro Arg Leu Ala Asn. Lys Pro Lys Ile ***340 345 350 S. 
Val Phe Val Gln Ala Cys Arg Gly Glu Arg Arg Asp Asn Gly Phe Pro 355 360 365
Val Leu Asp Ser Val Asp Gly Val Pro Ala Phe Leu Arg Arg Gly Trp 370 375 380
Asp Asn Arg Asp Gly Pro Leu Phe Asn Phe Leu Gly Cys Val Arg Pro 385 390 395 400
Gln Val Gln Gln Val Trp Arg Lys Lys Pro Ser Gln Ala Asp Ile Leu 405 410 415
Ile Arg Tyr Ala Thr Thr Ala Gln Tyr Val Ser Trp Arg Asn Ser Ala 420 425 430
Arg Gly Ser Trp Phe Ile Gln Ala Val Cys Glu Val Phe Ser Thr His 435 440 445
Ala Lys Asp Met Asp Val Val Glu Leu Leu Thr Glu Val Asn Lys Lys 450 455 460
Val Ala Cys Gly Phe Gln Thr Ser Gln Gly Ser Asn Ile Leu Lys Gln 465 470 475 480
Met Pro Glu Met Thr Ser Arg Leu Leu Lys Lys Phe Tyr Phe Trp Pro 485 490 495
Glu Ala Arg Asn Ser Ala Val 500
INFORMATION FOR SEQ ID NO:36:
SEQUENCE CHARACTERISTICS:
LENGTH: 503 amino acids
TYPE: amino acid
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Met Met Arg Gln Asp Arg Trp Ser Leu Leu Glu Arg Asn Ile Leu Glu
Ser Gln Glu 50 Asp Asn Pro Leu Ile 130 Arg Arg Ser Ser Phe 210 Val Leu Lys Leu Asp Ala Leu Glu 100 Pro Ser Arg Asn Asn 180 Asn Glu Asp Leu Ala Asp Lys 55 Ala Val Met Tyr Ser 135 Ser Ser Phe Phe Met 215 Thr Asn
Leu Ile Leu Gly Asp Met Ile Val Lys Tyr Asp Ala Met Pro Leu Pro Ser Ser 105 Ser Pro Thr Thr Ser Thr Ser Ser Arg 155 Ala Thr Ser 170 Gly Cys Ala 185 Lys Thr Ser Tyr Val Asp Tyr Arg Asn 235 Giu His Phe 250 Asp Ile Ala Leu Se r His Arg Tyr 140 Pro Phe Se r Ala Ala 220 Phe Glu 255 Arg Asn Gly Thr Lys Ala Asp Lys Asp Asn Leu Thr Asn Ile Phe Arg 260 265 270 -114- Cys Met Gly Tyr Thr Val Ile Cys Lys Asp Asn Leu Thr Gly Arg Glu 275 280 285 Met Leu Ser Thr Ile Arg Ser Phe Gly Arg Asn Asp Met His Gly Asp 290 295 300 Ser Ala Ile Leu Val Ile Leu Ser His Gly Giu Giu Asn Val Ile Ile 305 310 315 320 Gly Val Asp Asp Val Ser Val Asn Val His Giu Ile Tyr Asp Leu Leu 325 330 335 Asn Ala Ala Asn Ala Pro Arg Leu Ala Asn Lys Pro Lys Leu Val Phe 340 345 350 :Vai Gin Ala Cys Arg Gly Giu Arg Arg Asp Asn Gly Phe Pro Val Leu 360 365 Asp Ser Val Asp Gly Val Pro Ser Leu Ile Arg Arg Gly Trp Asp Asn ***370 375 380 *Arg Asp Gly Pro Leu Phe Asn Phe Leu Gly Cys Val Arg Pro Gin Val 385 390 395 400 Gin Gin Vai Trp Arg Lys Lys Pro Ser Gin Ala Asp Met Leu Ile Ala 405 410 415 Tyr Ala Thr Thr Ala Gin Tyr Val Ser Trp Arg Asn Ser Ala Arg Gly 420 425 430 Ser Trp Phe Ile Gin Ala Val Cys Glu Val Phe Ser Leu His Ala Lys a..435 440 445 ***Asp Met Asp Val Val Giu Leu Leu Thr Glu Val Asn Lys Lys Val Ala *450 455 460 *Cys Gly Phe Gin Thr Ser Gin Gly Ser Asn Ile Leu Lys Gin Met Pro 465 470 475 480 Giu Leu Thr Ser Arg Leu Leu Lys Lys Phe Tyr Phe Trp Pro Giu Asp 485 490 49S Arg Gly Arg Asn Ser Ala Val 500 INFORMATION FOR SEQ ID NO:37: SEQUENCE CHARACTERISTICS: LENGTH: 497 amino acids TYPE: amino acid TOPOLOGY: both (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: -115- Met 1 Phe Lys Thr Gly His Gly Pro Ser Arg 145 Asn Asn Thr Glu Asp 225 Leu Thr Tyr Ile Arg Ser Asp Lys Arg Val1 Ala Pro Ser 135 Se r Asn Ser Ala Val1 215 Arg His Asn Asp Asn 295 -116- Val Ile Leu Ser His Gly Glu Glu Asn Val Ile Ile Gly Val Asp Asp 305 310 315 320 Val Ser Val Asn Val His Glu Ile Tyr Asp Leu Leu Asn Ala Ala Asn 325 330 335 Ala Pro Arg Leu Ala Asn 
Lys Pro Lys Leu Val Phe Val Gln Ala Cys 340 345 350
Arg Gly Glu Arg Arg Asp Val Gly Phe Pro Val Leu Asp Ser Val Asp 355 360 365
Gly Val Pro Ala Leu Ile Arg Arg Gly Trp Asp Lys Gly Asp Gly Pro 370 375 380
Asn Phe Leu Gly Cys Val Arg Pro Gln Ala Gln Gln Val Trp Arg Lys 385 390 395 400
Lys Pro Ser Gln Ala Asp Ile Leu Ile Ala Tyr Ala Thr Thr Ala Gln 405 410 415
Tyr Val Ser Trp Arg Asn Ser Ala Arg Gly Ser Trp Phe Ile Gln Ala 420 425 430
Val Cys Glu Val Phe Ser Leu His Ala Lys Asp Met Asp Val Val Glu 435 440 445
Leu Leu Thr Glu Val Asn Lys Lys Val Ala Cys Gly Phe Gln Thr Ser 450 455 460
Gln Gly Ala Asn Ile Leu Lys Gln Met Pro Glu Leu Thr Ser Arg Leu 465 470 475 480
Leu Lys Lys Phe Tyr Phe Trp Pro Glu Asp Arg Asn Arg Ser Ser Ala 485 490 495
Val
INFORMATION FOR SEQ ID NO:38:
SEQUENCE CHARACTERISTICS:
LENGTH: 402 amino acids
TYPE: amino acid
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Met Ala Asp Lys Ile Leu Arg Ala Lys Arg Lys Gln Phe Ile Asn Ser 1 5 10
Val Ser Ile Gly Thr Ile Asn Gly Leu Leu Asp Glu Leu Leu Glu Lys 25
Arg Val Leu Asn Gln Glu Glu Met Asp Lys Ile Lys Leu Ala Asn Ile 40
Thr Ala Met Asp Lys Ala Arg Asp Leu Cys Asp His Val Ser Lys Lys 55
Gly Pro Gln Ala Ser Gln Ile Phe Ile Thr Tyr Ile Cys Asn Glu Asp 70 75
Cys Tyr Leu Ala Gly Ile Leu Glu Leu Gln Ser Ala Pro Ser Ala Glu 90
Thr Phe Val Ala Thr Glu Asp Ser Lys Gly Gly His Pro Ser Ser Ser 100 105 110
Glu Thr Lys Glu Glu Gln Asn Lys Glu Asp Gly Thr Phe Pro Gly Leu 115 120 125
Thr Gly Thr Leu Lys Glu Cys Pro Leu Glu Lys Ala Gln Lys Leu Trp 130 135 140
Lys Glu Asn Pro Ser Glu Ile Tyr Pro Ile Met Asn Thr Thr Thr Arg 145 150 155 160
Thr Arg Leu Ala Leu Ile Ile Cys Asn Thr Glu Phe Gln His Leu Ser 165 170 175
Pro Arg Val Gly Ala Gln Val Asp Leu Arg Glu Met Lys Leu Leu Leu 180 185 190
Glu Asp Leu Gly Tyr Thr Val Lys Val Lys Glu Asn Leu Thr Ala Leu 195 200 205
Glu Met Val Lys Glu Val Lys Glu Phe Ala Ala Cys Pro Glu His Lys 210 215 220
Thr Ser Asp Ser Thr Phe Leu Val Phe Met Ser His Gly Ile Gln Glu 225 230 235 240
Gly Ile Cys Gly Thr Thr Tyr Ser Asn Glu Val Ser Asp Ile Leu Lys 245 250 255
Val Asp Thr Ile Phe Gln Met Met Asn Thr Leu Lys Cys Pro Ser Leu 260 265 270
Lys Asp Lys Pro Lys Val Ile Ile Ile Gln Ala Cys Arg Gly Glu Lys 275 280 285
Gln Gly Val Val Leu Leu Lys Asp Ser Val Arg Asp Ser Glu Glu Asp 290 295 300
Phe Leu Thr Asp Ala Ile Phe Glu Asp Asp Gly Ile Lys Lys Ala His 305 310 315 320
Ile Glu Lys Asp Phe Ile Ala Phe Cys Ser Ser Thr Pro Asp Asn Val 325 330 335
Ser Trp Arg His Pro Val Arg Gly Ser Leu Phe Ile Glu Ser Leu Ile 340 345 350
Lys His Met Lys Glu Tyr Ala Trp Ser Cys Asp Leu Glu Asp Ile Phe 355 360 365
Arg Lys Val Arg Phe Ser Phe Glu Gln Pro Glu Phe Arg Leu Gln Met 370 375 380
Pro Thr Ala Asp Arg Val Thr Leu Ile Lys Arg Phe Tyr Leu Phe Pro 385 390 395 400
Gly His
INFORMATION FOR SEQ ID NO:39:
SEQUENCE CHARACTERISTICS:
LENGTH: 404 amino acids
TYPE: amino acid
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: Met Ala Asp Lys Val Leu Lys Giu Lys Arg Lys Leu Phe Ile Arg Ser Met Arg Thr Gly Se r Asn Al a Ser Trp 145 -119- Arg Thr Arg Leu Ala Leu Ile Ile Cys Asn Phe Glu Phe Asp Ser Ile 165 170 175 Pro Arg Arg Thr Gly Ala Glu Val Asp Ile Thr Gly Met Thr Met Leu 180 185 190 Leu Gin Asn Leu Gly Tyr Ser Val Asp Val Lys Lys Asn Leu Thr Ala 195 200 205 Ser Asp Met Thr Thr Glu Leu Glu Ala Phe Ala His Arg Pro Glu His 210 215 220 225 Lys Thr Ser Asp Ser Thr Phe Leu Val Phe Met Ser His Gly Ile Arg 230 235 240 SGlu Gly Ile Cys Gly Lys Lys His Ser Glu Gin Val Pro Asp Ile Leu 245 250 255 Gin Leu Asn Ala Ile Phe Asn Met Leu Asn Thr Lys Asn Cys Pro Ser S* 260 265 270 Leu Lys Asp Lys Pro Lys Val Ile Ile Ile Gin Ala Cys Arg Gly Asp 275 280 285 Ser Pro Gly Val Val Trp Phe Lys Asp Ser Val Gly Val Ser Gly Asn 290 295 300 305 Leu Ser Leu Pro Thr Thr Glu Glu Phe Glu Asp Asp Ala Ile Lys Lys 310 315 320 Ala His Ile Glu Lys Asp Phe Ile Ala Phe Cys Ser Ser Thr Pro Asp 325 330 335 Asn Val Ser Trp Arg His Pro Thr Met Gly Ser Val Phe Ile Gly Arg 340 345 350 Leu Ile Glu His Met Gin Glu Tyr Ala Cys Ser Cys Asp Val Glu Glu 355 360 365 Ile Phe Arg Lys Val Arg Phe Ser Phe Glu Gin Pro Asp Gly Arg Ala 370 375 380 385 Gin Met Pro Thr Thr Glu Arg Val Thr Leu Thr Arg Cys Phe Tyr Leu 390 395 400 Phe Pro Gly His INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 171 amino acids TYPE: amino acid TOPOLOGY: both -120- (xi) SEQUENCE DESCRIPTION: SEQ ID Met Leu Thr Val Gin Val Tyr Arg Thr Ser Gin Lys Cys Ser Ser Ser 1 5 10 Lys His Val Val Glu Val Ile Leu Asp Pro Leu Gly Thr Ser Phe Cys 25 Ser Leu Leu Pro Pro Pro Leu Leu Leu Tyr Glu Thr Asp Arg Gly Val 40 Asp Gin Gin Asp Gly Lys Asn His Thr Gin Ser Pro Gly Cys Glu Glu 55 Ser Asp Ala Gly Lys Glu Glu Leu Met Lys Met Arg Ile Pro Thr Arg 65 70 75 Ser Asp Met Ile Cys Gly Tyr Ala Cys Leu Lys Gly Asn Ala Ala Met 85 90 *Arg Asn Thr Lys Arg Gly Ser Trp Tyr Ile Glu Ala Leu Thr Gin Val 100 105 110 Phe Ser Glu Arg Ala Cys Asp Met His 
Val Ala Asp Met Leu Val Lys 115 120 125 Val Asn Ala Leu Ile Lys Glu Arg Glu Gly Tyr Ala Pro Gly Thr Glu 130 135 140 Phe His Arg Cys Lys Glu Met Ser Glu Tyr Cys Ser Thr Leu Cys Gin *145 150 155 160 Gin Leu Tyr Leu Phe Pro Gly Tyr Pro Pro Thr 165 170 INFORMATION FOR SEQ ID NO:41: SEQUENCE CHARACTERISTICS: LENGTH: 1350 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both (ix) FEATURE: NAME/KEY: CDS LOCATION: 35..1156 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: TCTTCACAGT GCGAAAGAAC TGAGGCTTTT TCTC ATG GCT GAA AAC AAA CAC 52 Met Ala Glu Asn Lys His 1 CCT GAC AAA CCA CTT AAG GTG TTG GAA CAG CTG GGC AAA GAA GTC CTT 100 -121- Pro Asp Lys Pro Leu Lys Val Leu Glu Gin Leu Gly Lys Glu Val Leu ACG GAG TAC Thr Glu Tyr CTA GAA AAA TTA Leu Giu Lys Leu CAA AGC AAT GTA CTG AAA TTA AAG Gin Ser Asn Val Leu Lys Leu Lys GAG GAA Giu Giu CAT AAA CAA AAA TTT AAC AAT GCT GAA CGC AGT GAC AAG CGT Asp Lys Gin Lys Phe Asn Asn Ala Giu Arg Ser Asp Lys Arg
TG
Trp GTT TTT GTA GAT GCC ATG AAA AAG AAA Val Phe Val Asp Ala Met Lys Lys Lys 60 AGC AAA GTA Ser Lys Val GGT GAA Gly Giu CAT GGT His Gly of.
ATG CTT CTC CAG ACA TTC TTC AGT GTG GAC CCA GGC AGC CAC Met Leu Leu Gin Thr Phe Phe Ser Val Asp Pro Gly Ser His GAA GCT AAT CTG GAA ATG GAG GAA Giu Ala Asn Leu Giu Met Glu Glu GAA GAA TCA TTG Giu Giu Ser Leu AAC ACT CTC Asn Thr Leu 100 GAA AAG ACA Glu Lys Thr AAG CTT TGT Lys Leu Cys 105 TCC CCT GAA GAG Ser Pro Glu Glu ACA AGO CTT TGC Thr Arg Leu Cys 388 CAA GAA Gin Giu 120 ATT TAC CCA ATA Ile Tyr Pro Ilie GAG GCC AAT GGC Glu Ala Asn Gly ACA CGA AAG GCT Thr Arg Lys Ala CTT ATC ATA TGC AAT ACA GAG TTC AAA CAT CTC TCA CTG AGG TAT Leu Ile Ilie Cys Asn Thr Giu Phe Lys His Leu Ser Leu Arg Tyr GCT AAA TTT GAC ATC ATT GGT ATG AAA Ala Lys Phe Asp Ilie Ilie Gly Met Lys CTT CTT GAA GAC Leu Leu Giu Asp TTA GGC Leu Gly 165 532 TAC GAT GTG Tyr Asp Vai GAG ATG AAA Giu Met Lys 185 GTG AAA GAG GAG CTT ACA GCA GAG GGC ATG GAG TCA Val Lys Glu Giu Leu Thr Aia Giu Giy Met Giu Ser 175 180 GAC TTT GCT GCA Asp Phe Ala Ala TCA GAA CAC CAG Ser Glu His Gin TCA GAC AGC Ser Asp Ser ACA TTC Thr Phe 200 CTG GTG CTA ATG Leu Val Leu Met CAT GGC ACA CTG His Giy Thr Leu GGC ATT TGT GGA Giy Ile Cys Giy ATG CAC AGT GAA Met His Ser Glu ACT CCA GAT GTG Thr Pro Asp Val CAG TAT GAT ACC Gin Tyr Asp Thr TAT CAG ATA TTC AAC AAT TGC CAC TGT CCA GGT CTA CGA GAC AAA CCC -122- Ile
ATC
Ile
AGA
Arg 265
AAT
Asn
ATT
Ile
ACA
Thr
CAT
His
TCA
Ser 345
GCA
His Cys TGC AGA Cys Arg 255 CCC CAG Pro Gin 270 GCT GTC Ala Val ACA ACC Thr Thr TTC ATC Phe Ile CAT CTC His Leu 335 AGT ATT Ser Ile 350 TAT TTC Gly
GGG
Gly
TGC
Cys
CTG
Leu
CAT
His 305
AGA
Arg
GAT
Asp
TCC
Se r
CTC
GGA
Gly 260
GTA
Vali
GTG
Vali
TCC
Se r
TCC
Se r
CTG
Leu 340
CCC
Pro
GGC
ATG
Met
CTA
Leu
AAG
Lys
CGA
Arg 310
TTC
Phe
GTG
Val1
ATT
Ile
TGAGAACAP
Leu Arg Asp Lys Pro 245 1012 1060 1108 LA 1163 -Asp Arg Mla Thr Leu Thr Arg Tyr Phe Tyr Leu Phe Pro Gly Asn 360 365 370 **GCAACAAGCA ACTGAATCTC ATTTCTTCAG CTTGAAGAAG TGATCTTGGC CAAGGATCAC ATTCTATTCC TGAAATTCCA GAACTAGTGA AATTAAGGAA AGAATACTTA TGAATTCAAG ACCAGCCTAA GCAACACAGT GGGATTCTGT TCGATAGACA AGCAAACAAG CAAAAATAAA INFORMATION FOR SEQ ID NO:42: SEQUENCE CHARACTERISTICS: LENGTH: 373 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: Met Ala Giu Asn Lys His Pro Asp Lys Pro Leu Lys Val Leu Giu Gin 1223 1283 1343 1350 -123- 0 *0 0@ S. S
5S S. S~ S S 1 Leu Asn Giu His Pro Glu Leu Gly Leu 145 Leu Ala His Leu Leu 225 Gly dly Cys Leu Gly Val Arg Ser Gly Ser Cys Arg 130 Ser Leu Glu Gin His 210 Gin Leu Asn Arg Ser 290 Lys Giu Leu Lys Ser Asp Lys Val Ser His Leu Asn 100 Arg Glu 115 Thr Arg Leu Arg Glu Asp Gly Met 180 Thr Ser 195 Gly Ile Tyr Asp Arg Asp Ser Gly 260 Gly Val 275 His Val 5 Val Leu Leu Lys Lys Arg Gly Giu 70 His Gly 85 Thr Leu Lys Thr Lys Ala Tyr Gly 150 Leu Gly 165 Glu Ser Asp Ser Cys Gly Thr Ile 230 Lys Pro 245 Giu Met Asp Leu Giu Lys Thr Glu Trp 55 Met Glu Lys Gin Leu 135 Ala Tyr Giu Thr Thr 215 Tyr Lys Trp Pro Asp 295 Giu Glu 40 Val1 Leu Ala Leu Giu 120 Ile Lys Asp Met Phe 200 Met Gin Val1 Ile Arg 280 Phe Tyr 25 Asp Phe Leu Asn Cys 105 Ile Ile Phe Val1 Lys 185 Leu His Ile Ile Arg 265 Asn Ile 10 Leu Lys Val Gin Leu 90 Se r Tyr Cys Asp Val1 170 Asp Val1 Se r Phe Ile 250 Giu Met Ala Glu Gln Asp Thr 75 Glu Pro Pro Asn Ile 155 Val1 Phe Leu Glu Asn 235 Val1 Se r Giu Phe Lys Lys Ala Phe Met Glu Ile Thr 140 Ile Lys Ala Met Lys 220 Asn Gin Se r Ala Tyr 300 Leu Phe Met Phe Glu Glu Lys 125 Glu Giy Glu Ala Se r 205 Thr Cys Ala Lys Asp 285 Al a Val Gin Asn Asn Lys Lys Ser Val Giu Pro Phe Thr 110 Glu Ala Phe Lys Met Lys Glu Leu 175 Leu Ser 190 His Gly Pro Asp His Cys Cys Arg 255 Pro Gin 270 Ala Val Thr Thr Se r Ala Lys A\sp Glu Arg Asn His Gly 160 Thr Glu Thr Val1 Pro 240 Gly Leu Lys Pro -124- His His Leu Ser Tyr Arg Asp Lys Thr Gly Giy Ser Tyr Phe Ile Thr 305 310 315 320 Arg Leu Ilie Ser Cys Phe Arg Lys His Ala Cys Ser Cys His Leu Phe 325 330 335 Asp Ile Phe Leu Lys Val Gin Gin Ser Phe Giu Lys Ala Ser Ile His 340 345 350 Ser Gin Met Pro Thr Ile Asp Arg Ala Thr Leu Thr Arg Tyr Phe Tyr 355 360 365 Leu Phe Pro Gly Asn 370 INFORMATION FOR SEQ ID NO:43: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 441 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein 9. 9 .9.
*0 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: Ile Gly Arg Leu Ala Pro Glu Gly Leu Pro Vali Val Leu Lys Lys Thr Leu Pro 130 Lys Gly Leu Lys Gly Gly Gln 100 His Pro Leu Met Ala Ala Asp Arg Gly Arg Arg Ile Leu Gin Leu Giu Val1 Ala Met Se r Pro Thr Giu Arg Leu Cys Leu Asp Tyr 140 Leu Ser Thr Asp Thr Val Glu His Ser Leu Asp Asn Lys Asp Gly Pro -125- 0* 6*
Val1 Phe Val Ser Leu 225 Gin Asp Tyr Phe Phe 305 Gin Ala Met Thr Glu 385 Ala Arg Tyr Cys Gin Leu Gly 210 Gly Glu Ser Gly Asp 290 Ile Asp Gly Ile Lys 370 Arg Leu Cys Leu Leu Gin Leu Aia 180 Ser Asn 195 Giy Asp Tyr Asp Lys Leu Cys Ile 260 Val Asp 275 Asn Ala Gin Ala Giy Lys Lys Giu 340 Cys Gly 355 Arg Gly Ala Cys Ile Lys Lys Glu 420 Phe Pro 435 Thr Ser 185 Gly Thr Cys Gin Se r 265 Gin Leu Glu Se r Met 345 Lys Giu Ala Tyr Cys 425 Thr Phe Arg Giu Thr 220 Thr Ala Val1 Glu Lys 300 Arg Cys Pro Ala Ala 380 Leu Gly Leu Gin Leu 190 Giu Phe Gin Arg Gly 270 Phe Lys Val1 Giu Arg 350 Met Val1 Lys Glu Arg 430 His Leu Arg Leu Met 240 Thr Ile Leu Phe Gin 320 Asp Asp Asn Se r Asn 400 His Leu INFORMATION FOR SEQ ID NO:44: -126- SEQUENCE CHARACTERISTICS: LENGTH: 120 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both (ix) FEATURE: NAME/KEY: CDS LOCATION: join(l..6, 10..118) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: S. S i S.
TTC TGG TAG CTC CAA GAG GTT TTT CGA CTT TTT GAC AAT GCT AAC TGT
Phe Trp Leu Gln Glu Val Phe Arg Leu Phe Asp Asn Ala Asn Cys
CCA AGT CTA CAG AAC AAG CCA AAA ATG TTC TTC ATC CAA GCA TGT CGT
Pro Ser Leu Gln Asn Lys Pro Lys Met Phe Phe Ile Gln Ala Cys Arg 20 25
GGA GGT GCT ATT GGA TCC CTT GGG
Gly Gly Ala Ile Gly Ser Leu Gly
INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 39 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID
Phe Trp Leu Gln Glu Val Phe Arg Leu Phe Asp Asn Ala Asn Cys Pro Ser Leu Gln Asn Lys Pro Lys Met Phe Phe Ile Gln Ala Cys Arg Gly Gly Ala Ile Gly Ser Leu Gly
INFORMATION FOR SEQ ID NO:46:
SEQUENCE CHARACTERISTICS:
LENGTH: 120 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 2..119
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
T TCT GGT AGC TCC AAG AGG TTT TTC GAC TTT TTG ACA ATG CTA ACT 46
Ser Gly Ser Ser Lys Arg Phe Phe Asp Phe Leu Thr Met Leu Thr 1 5 10
GTC CAA GTC TAC AGA ACA AGC CAA AAA TGT TCT TCA TCC AAG CAT GTC 94
Val Gln Val Tyr Arg Thr Ser Gln Lys Cys Ser Ser Ser Lys His Val 25
GTG GAG GTG CTA TTG GAT CCC TTG GG 120
Val Glu Val Leu Leu Asp Pro Leu Gly
INFORMATION FOR SEQ ID NO:47:
SEQUENCE CHARACTERISTICS:
LENGTH: 40 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
Ser Gly Ser Ser Lys Arg Phe Phe Asp Phe Leu Thr Met Leu Thr Val 1 5 10
Gln Val Tyr Arg Thr Ser Gln Lys Cys Ser Ser Ser Lys His Val Val 20 25
Glu Val Leu Leu Asp Pro Leu Gly
INFORMATION FOR SEQ ID NO:48:
SEQUENCE CHARACTERISTICS:
LENGTH: 120 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 3..120
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
TT CTG GTA GCT CCA AGA GGT TTT TCG ACT TTT TGA CAA TGC TAA CTG
Leu Val Ala Pro Arg Gly Phe Ser Thr Phe Gln Cys Leu
TCC AAG TCT ACA GAA CAA GCC AAA AAT GTT CTT CAT CCA AGC ATG TCG
Ser Lys Ser Thr Glu Gln Ala Lys Asn Val Leu His Pro Ser Met Ser 25
TGG AGG TGC TAT TGG ATC CCT TGG G
Trp Arg Cys Tyr Trp Ile Pro Trp Ala
INFORMATION FOR SEQ ID NO:49:
SEQUENCE CHARACTERISTICS:
LENGTH: 38 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: Leu Val Ala Pro Arg Gly Phe Ser Thr Phe Gin Cys Leu Ser Lys Ser Thr Glu Gin Ala Lys Asn Val Leu His Pro Ser Met Ser Trp Arg Cys Tyr Trp Ile Pro Trp Ala INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 1456 base pairs TYPE: nucleic acid STRANDEDNESS: unknown TOPOLOGY: unknown (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 14..1316 (xi) SEQUENCE DESCRIPTION: SEQ ID GCACAAGGAG CTG ATG GCC GCT GAC AGG GGA CGC AGG ATA TTG GGA GTG Met Ala Ala Asp Arg Gly Arg Arg Ile Leu Gly Val 1 5 -129- TGT GGC ATG Cys Gly Met CAT CCT CAT CAT His Pro His His CPA ACT CTA AAA AAG AAC CGA GTG Glu Thr Leu Lys Lys Asn Arg Val GTG CTA GCC AAA CAG CTG TTG TTG AGC CAA TTG TTA GPA CAT CTT CTG Val Leu Ala Lys Gin Leu Leu Leu Ser Giu Leu Leu Giu His Leu Leu AAC GAC ATC ATC ACC TTG GAA ATG AGG Lys Asp Ile Ile Thr Leu Glu Met Arg CTC ATC CAG Leu Ile Gin GCC AAA Ala Lys CCT AAC Pro Lys GTG GGC ACT TTC Val Cly Ser Phe CAG PAT GTG CPA Gin Asn Val Giu CTC PAC TTG CTC Leu Asn Leu Leu ACC CCT CCC Arg Cly Pro AAG CPA GC Lys Gin Gly OCT TTT CAT GCC Aia Phe Asp Ala TCT CPA GCA CTC Cys Giu Ala Leu ACC GAG ACC Arg Ciu Thr TCT CCC CTT Ser Gly Leu CAC CTC GAG CAT His Leu Ciu Asp TTG CTC ACC ACC Leu Leu Thr Thr CAG CAT CTA CTC CCA CCC TTG ACC TGT GAC TAC CAC TTC ACT CTC CCT Gin His Val Leu Pro Pro Leu Ser Cys Asp Tyr Asp Leu Ser Leu Pro CCC CTC TCT GAG TCC TCT CCC CTT TAC Pro Val Cys Ciu Ser Cys Pro Leu Tyr 130 PAG CTC CCC CTC Lys Leu Arg Leu ACA CAT ACT CTC Thr Asp Thr Val CAC TCC CTA CAC His Ser Leu Asp AAA CAT GCT CCT CTC TGC Lys Asp Gly Pro Val Cys CTT CAC CTC AAC CCT TGC ACT CCT Leu Gin Val Lys Pro Cys Thr Pro 160 TTT TAT CPA ACA Phe Tyr Gin Thr CAC TTC CAC His Phe Gin 170 CTC CTC TTC Leu Val Leu CTG GCA TAT Leu Ala Tyr 175 AGC PAT CTC Ser Asn Val 190 ACC TTG CAG TCT Arg Leu Gin Ser CCT CGT CCC CTA Pro Arg Gly Leu CAC TTC ACT His Phe Thr CGA GAG Cly Glu 195 AAA CPA CTC CPA TTT CCC TCT GCA Lys Clu Leu Ciu Phe Arg Ser Cly 200 CCC CAT GTG CAC CAC 
ACT Gly Asp Val Asp His Ser 205 210 ACT CTA CTC ACC Thr Leu Val Thr TTC PAC CTT TTG Phe Lys Leu Leu TAT GAC GTC CAT Tyr Asp Val His CTA TCT CAC CAG ACT GCA CAC CPA ATC Leu Cys Asp Gin Thr Ala Gin Glu met CPA GAG Gin Clu 235 -130- AAA CTG CAG Lys Leu Gin TGC ATC GTG Cys Ile Val 255 TTT GCA CAG TTA Phe Ala Gin Leu GCA CAC CGA GTC Ala His Arg Val ACG GAC TCC Thr Asp Ser 250 GCA CTC CTC TCG Ala Leu Leu Ser GGT GTG GAG GGC GCC ATC TAT GGT Gly Val Giu Gly Ala Ile Tyr Gly 265 GTG GAT Val Asp 270 GGG AAA CTG CTC Gly Lys Leu Leu CTC CAA GAG GTT Leu Gin Giu Val CAG CTC TTT GAC Gin Leu Phe Asp AAC GCC AAC TGC CCA Asn Ala Asn Cys Pro CTA CAG AAC AAA Leu Gin Asn Lys AAA ATG TTC TTC Lys Met Phe Phe 913 CAG GCC TGC CGT GGA GAT GAG ACT GAT CGT GGG GTT GAC CAA Gin Ala Cys Arg Gly Asp Giu Thr Asp Arg Gly Val Asp Gin CAA GAT Gin Asp 315 GGA AAG AAC Gly Lys Asn AAA GAA AAG Lys Glu Lys 335 GCA GGA TCC CCT Ala Gly Ser Pro TGC GAG GAG AGT Cys Giu Glu Ser GAT GCC GGT Asp Ala Gly 330 GAC ATG ATA Asp Met Ile 1009 1057 TTG CCG AAG ATG AGA CTG CCC ACG CGC Leu Pro Lys Met Arg Leu Pro Thr Arg 340 TGC GGC Cys Gly 350 TAT GCC TGC CTC Tyr Ala Cys Leu GGG ACT GCC GCC ATG CGG AAC ACC AAA Gly Thr Aia Aia Met Arg Asn Thr Lys 360 1105 1153 GGT TCC TGG TAC ATC GAG GCT CTT GCT Gly Ser Trp Tyr Ile Glu Ala Leu Ala 370 GTG TTT TCT GAG Val Phe Ser Glu GCT TGT GAT ATG Ala Cys Asp Met GTG GCC GAC ATG Val Ala Asp Met GTT AAG GTG AAC Val Lys Val Asn GCA CTT Ala Leu 395 1201 ATC AAG GAT lie Lys Asp AAG GAA ATG Lys Glu Met 415 CGG GAA Arg Glu 400 GGT TAT GCT Gly Tyr Ala GGC ACA GAA TTC Gly Thr Glu Phe CAC CGG TGC His Arg Cys 410 CTC TAC CTG Leu Tyr Leu 1249 TCT GAA TAC TGC AGC ACT CTG TGC CGC Ser Glu Tyr Cys Ser Thr Leu Cys Arg 420 1297 TTC CCA Phe Pro 430 GGA CAC CCT CCC ACA Gly His Pro Pro Thr TGATGTCACC TCCCCATCAT CCACGCCA 1346 AGTGGAAGCC ACTGGACCAC AGGAGGTGTG ATAGAGCCTT TGATCTTCAG GATGCACGGT TTCTGTTCTG CCCCCTCAGG GATGTGGGAA TCTCCCAGAC TTGTTTCCTG 1406 1456 -13 1- INFORMATION FOR SEQ ID NO:51; SEQUENCE 
CHARACTERISTICS: LENGTH: 435 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein a.
a a a. a Met Pro Gln Ile Ser 65 Ala Leu Pro Glu Glu 145 Pro Leu Phe His (xi) SEQUENCE Ala Ala Asp Arg His His Gin Glu 20 Leu Leu Leu Ser 35 Thr Leu Glu Met Gln Asn Val Glu Phe Asp Ala Phe 85 Glu Asp Met Leu 100 Pro Leu Ser Cys 115 Ser Cys Pro Leu 130 His Ser Leu Asp Cys Thr Pro Glu 165 Gin Ser Arg Pro 180 Thr Gly Glu Lys 195 Ser Thr Leu Val 210 Thr Glu Arg Leu 70 Cys Leu Asp Tyr Asn 150 Phe Arg Glu Thr Leu Lys Leu Leu 40 Glu Leu 55 Leu Asn Glu Ala Thr Thr Tyr Asp 120 Lys Lys 135 Lys Asp Tyr Gin Gly Leu Leu Glu 200 Leu Phe 215 Asn His Gin Leu Arg 90 Ser Ser Arg Pro His 170 Leu Arg Leu Arg Leu Ala Pro Glu Gly Leu Leu Val1 155 Phe Val1 Se r Leu Val1 Leu Lys Lys Thr Leu Pro Ser 140 Cys Gin Leu Gly Gly 220n Val1 Glu Val1 Arg Lys Gin Phe 125 Thr Leu Leu Ser Gly 205 Tyr Leu Ala Lys Asp Gly Ser Gly Pro Gin Gly His Val 110 Pro Val Asp Thr Gln Val Ala Tyr 175 Asn Val 190 Asp Val Asp Val DESCRIPTION: SEQ ID NO:51: Gly Arg Arg Ile Leu Gly Val Cys Gly Met His Val Leu Cys Asp Gin Thr Ala Gin Glu Met Gin Giu Lys Leu Gin Asn 225 230 235 240 -132- Phe Ala Gin Leu Pro Ala His Arg Val Thr Asp Ser Cys Ile Val Ala 245 250 255 Leu Leu Ser His Gly Val Glu Gly Ala Ile Tyr Gly Val Asp Gly Lys 260 265 270 Leu Leu Gln Leu Gln Glu Val Phe Gln Leu Phe Asp Asn Ala Asn Cys 275 280 285 Pro Ser Leu Gin Asn Lys Pro Lys Met Phe Phe Ile Gin Ala Cys Arg 290 295 300 Gly Asp Glu Thr Asp Arg Gly Val Asp Gin Gin Asp Gly Lys Asn His 305 310 315 320 Ala Gly Ser Pro Gly Cys Glu Glu Ser Asp Ala Gly Lys Glu Lys Leu 325 330 335 Pro Lys Met Arg Leu Pro Thr Arg Ser Asp Met Ile Cys Gly Tyr Ala 340 345 350 Cys Leu Lys Gly Thr Ala Ala Met Arg Asn Thr Lys Arg Gly Ser Trp 355 360 365 Tyr Ile Glu Ala Leu Ala Gin Val Phe Ser Glu Arg Ala Cys Asp Met 370 375 380 His Val Ala Asp Met Leu Val Lys Val Asn Ala Leu Ile Lys Asp Arg 385 390 395 400 Glu Gly Tyr Ala Pro Gly Thr Glu Phe His Arg Cys Lys Glu Met Ser 405 410 415 Glu Tyr Cys Ser Thr Leu Cys Arg His Leu Tyr Leu Phe Pro Gly His 420 425 430 Pro Pro Thr INFORMATION FOR SEQ 
ID NO:52: SEQUENCE CHARACTERISTICS: LENGTH: 2176 base pairs TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both (ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 89..1019 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: -133- AGAGGGAGGG AACGATTTAA GGAGCGAATA CTACTGGTAA ACTAATGGAA GAAATCTGCT GCACCACTGG ATATTGGGAG TGTGTGGC ATG CAT CCT CAT CAT CAG GAA ACT Met His Pro His His Gin Giu Thr CTA AAA Leu Lys AAG AAC CGA GTG GTG CTA GCC AAA CAG CTG TTG TTG AGC GAA Lys Asn Arg Val Val Leu Ala Lys Gin Leu Leu Leu Ser-Glu
TTG
Leu TTA GAA CAT CTT Leu Giu His Leu GAG AAG GAC ATC ATC ACC TTG GAA ATG Giu Lys Asp Ile Ile Thr Leu Giu Met 9.
9* 9 99 GAG CTC ATC CAG GCC AAA GTG GGC AGT Giu Leu Ile Gin Ala Lys Vai Gly Ser AGC CAG AAT Ser Gin Asn CTC AAC TTG Leu Asn Leu GAA GCA CTG Glu Ala Leu CCT AAG AGG GGT Pro Lys Arg Gly CAA GCT TTT GAT Gin Ala Phe Asp GTG GAA CTC Val Glu Leu GCC TTC TGT Ala Phe Cys ATG TTG CTC Met Leu Leu AGG GAG ACC AAG Arg Giu Thr Lys GGC CAC CTG GAG Gly His Leu Giu ACC ACC Thr Thr 90 CTT TCT GGG CTT Leu Ser Giy Leu CAT GTA CTC CCA His Val Leu Pro TTG AGC TGT GAC Leu Ser Cys Asp 9. 99 9 9 9 9999 9 .99.
GAC TTG AGT CTC Asp Leu Ser Leu TTT CCG GTG TGT GAG TCC TGT CCC CTT Phe Pro Val Cys Glu Ser Cys Pro Leu AAG AAG CTC CGC Lys Lys Leu Arg TCG ACA GAT ACT Ser Thr Asp Thr GAA CAC TCC CTA Glu His Ser Leu GAC AAT Asp Asn 135 AAA GAT GGT Lys Asp Gly TAT CAA ACA Tyr Gin Thr 155 GTC TGC CTT CAG Val Cys Leu Gin AAG CCT TGC ACT Lys Pro Cys Thr CCT GAA TTT Pro Glu Phe 150 CGG CCT CGT Arg Pro Arg CAC TTC CAG CTG GCA TAT AGG TTG CAG His Phe Gin Leu Ala Tyr Arg Leu Gin 160 GGC CTA Gly Leu 170 GCA CTG GTG TTG AGC Ala Leu Val Leu Ser 175 AAT GTG CAC TTC ACT GGA GAG AAA GAA Asn Val His Phe Thr Gly Glu Lys Glu 180 CTG GAA TTT CGC Leu Glu Phe Arg 185 TCT GGA Ser Gly 190 GGG GAT GTG Gly Asp Val GAC CAC Asp His 195 AGT ACT CTA GTC Ser Thr Leu Val CTC TTC AAG CTT TTG GGC TAT GAC GTC CAT GTT Leu Phe Lys Leu Leu Gly Tyr Asp Val His Val CTA TGT GAC CAG ACT Leu Cys Asp Gin Thr 736 -134- 205 210 GCA CAG GAA ATG CAA GAG AAA CTG CAG PAT TTT Ala Gin Giu Met Gin Glu Lys Leu Gin Asn Phe 220 225 CAC CGA GTC ACG GAC TCC TGC ATC GTG GCA CTC His Arg Val Thr Asp Ser Cys Ilie Val Ala Leu 235 240 GAG GGC GCC ATC TAT GGT GTG GAT GGG AAA CTG Giu Gly Ala Ile Tyr Gly Val Asp Gly Lys Leu 250 255 GTT TTT CAG CTC TTT GAC AAC GCC AAC TGC CCA Val Phe Gin Leu Phe Asp Asn Ala Asn Cys Pro 265 270 275 CCA AAA ATG TTC TTC ATC CAG GCC TGC CGT OGA Pro Lys Met Phe Phe Ile Gin Ala Cys Arg Giy 285 290 CTT GGG CAC CTC CTT CTG TTC ACT GCT GCC ACC Leu Gly His Leu Leu Leu Phe Thr Ala Ala Thr 300 305 215 GCA CAG TTA CCT GCA Ala Gin Leu Pro Ala 230 CTC TCG CAT GGT GTG Leu Ser His Gly Val 245 CTC CAG CTC CAA GAG Leu Gin Leu Gin Glu 260 AGC CTA CAG AAC AAA Ser Leu Gin Asn Lys 280 GGT GCT ATT GGA TCC Gly Ala Ile Gly Ser 295 GCC TCT CTT GCT CTA Ala Ser Leu Ala Leu 310 784 p S p.
832 880 928 976 1024
TGAGACTGAT
CGAGGAGAGT
CATGATATGC
TTCCTGGTAC
GGCCGACATG
CACAGAATTC
CTACCTGTTC
GAAGCCACTG
GTTCTGCCCC
GCCTTTGAGT
CAGCCTTGGT
GAAGTTGTAA
CAGTTCCAGC
CTTGAGAGAC
GTGAGAGTTT
CGTGGGGTTG
GATGCCGGTA
GGCTATGCCT
ATCGAGGCTC
CTGGTTAAGG
CACCGGTGCA
CCAGGACACC
GACCACAGGA
CTCAGGGATG
GTGGGACTCC
TGGACCTATT
ACACAGTGTG
TTTTGTAGAT
CATCTCCTAT
GGAAGGTGTC
ACCAACAAGA
AAGAAAAGTT
GCCTCAAAGG
TTGCTCAAGT
TGAACGCACT
AGGAGATGTC
CTCCCACATG
GGTGTGATAG
TGGGAATCTT
AGGCCAGCTC
GCCAGGAATG
GTTATGGGGA
GGCACTTTAG
CTTTTATTTC
CAAATTTAAT
TGGAAAGAAC
GCCGAAGATG
GACTGCCGCC
G-TTTTCTGAG
TATCAAGGAT
TGAATACTGC
ATGTCACCTC
AGCCTTTGAT
CCAGACTTGT
CTTTTCTGTG
TTTCAGCTGC
GAGGGCATAT
TGATTGCTTT
ATTCATATCC
GTAGACATTA.
CACGCAGGAT
AGACTGCCCA
ATGCGGAACA
CGGGCTTGTG
CGGGAAGGTT
AGCACTCTGT
CCCATCATCC
CTTCAGGATG
TTCCTGTGCC
AAGCCCTTTG
AGTTGAAGAG
AAATTCCCCA
TATTACATTA
TCCGCCCTTT
TCTTTTGGCT
CCCCTGGGTG
CGCGCTCAGA
CCAAACGAGG
ATATGCACGT
ATGCTCCTGG
GCCGCCACCT
ACGCCAAGTG
CACGGTTTCT
CATCATCTCT
CCTGTAGAGC
CCTGACAAGT
TATTTGTGTT
GTTAAGATGT
TTGTCCTAGA
CTGAAGAAGC
1084 1144 1204 1264 1324 1384 1444 1504 1564 1624 1684 1744 1804 1864 1924 -135- AAACATGACT AGAGACGCAC CTTGCTGCAG TGTCCAGAAG CGGCCTGTGC GTTCCCTTCA GTACTGCAGC GCCACCCAGT GGAAGGACAC TCTTGGCTCG TTTGGGCTCA AGGCACCGCA GCCTGTCAGC CAACATTGCC TTGCATTTGT ACCTTATTGA TCTTTGCCCA TGGAAGTCTC AAAGATCTTT CGTTGGTTGT TTCTCTGAGC TTTGTTACTG AAATGAGCCT CGTGGGGAGC
ATC
1984 2044 2104 2164 2167
INFORMATION FOR SEQ ID NO:53:
SEQUENCE CHARACTERISTICS:
LENGTH: 310 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
S. S Met Ala Asp Se r Pro Gly Val1 Val Thr Val1 145 Tyr (xi) SEQUENCE His Pro His His Lys Gin Leu Leu Ile Ile Thr Leu 35 Phe Ser Gln Asn 50 Gin Ala Phe Asp His Leu Glu Asp Leu Pro Pro Leu 100 Cys Glu Ser Cys 115 Val Glu His Ser 130 Lys Pro Cys Thr Arg Leu Gin Ser 165 DESCRIPTION: SEQ ID NO:53: Gin Glu Thr Leu Lys Lys Asn Arg Vai Val Leu Leu Glu Val1 Ala 70 Met Ser Pro Leu Pro 150 Arg Se r Met Glu Phe Leu Cys Leu Asp 135 Glu Pro Glu Arg 40 Leu Cys Leu Asp Tyr 120 Asn Phe Arg Leu 25 Glu Leu Glu Th r Tyr 105 Lys Lys Tyr Gly Leu Ala Pro Glu Gly Leu Leu 125 Val1 Phe Val1 Leu Glu Lys Lys Val Gly Lys Arg Gly Thr Lys Gin Leu Gin His Pro Phe Pro 110 Ser Thr Asp Cys Leu Gln Gin Leu Ala 160 Leu Ser Asn 175 170 Val His Phe Thr Gly Glu Lys Glu Leu Glu Phe Arg Ser Gly Gly Asp -136- 180 185 190 Val Asp His Ser Thr Ile Val Thr Leu Phe Lys Leu Leu Gly Tyr Asp 195 200 205 Val His Val Leu Cys Asp Gin Thr Ala Gin Glu Met Gin Glu Lys Leu 210 215 220 Gin Asn Phe Ala Gin Leu Pro Ala His Arg Val Thr Asp Ser Cys Ile 225 230 235 240 Val Ala Leu Leu Ser His Gly Val Glu Gly Ala Ile Tyr Gly Val Asp 245 250 255 S Gly Lys Leu Leu Gin Leu Gin Glu Val Phe Gin Leu Phe Asp Asn Ala 260 265 270 S Asn Cys Pro Ser Leu Gin Asn Lys Pro Lys Met Phe Phe Ile Gin Ala 275 280 285 Cys Arg Gly Gly Ala Ile Gly Ser Leu Gly His Leu Leu Met Phe Thr 290 295 300 S Ala Ala Thr Ala Ser Leu 305 310 INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS: LENGTH: 1399 base pairs *oo* TYPE: nucleic acid STRANDEDNESS: both TOPOLOGY: both p.
(ii) MOLECULE TYPE: DNA (ix) FEATURE: NAME/KEY: CDS LOCATION: 3..1299 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: CT TTT TTT TTT TTT TTT TTT TAT GTC CTG GAG TCC TGC ACA GCC ATG 47 Phe Phe Phe Phe Phe Phe Tyr Val Leu Glu Ser Cys Thr Ala Met 1 5 10 GCG GCC AGG AGG ACA CAT GAA AGA GAT CCA ATC TAC AAG ATC AAA GGT Ala Ala Arg Arg Thr His Glu Arg Asp Pro Ile Tyr Lys Ile Lys Gly 25 TTG GCC AAG GAC ATG CTG GAT GGG GTT TTT GAT GAC CTG GTG GAG AAG 143 Leu Ala Lys Asp Met Leu Asp Gly Val Phe Asp Asp Leu Val Glu Lys 40 -137- AAT GTT TTA AAT GGA GAT GAG TTG CTC AAA ATA GGG GAA AGT GCG AGT Asn Val Leu Asn Gly Asp Glu Leu Leu Lys Ile Gly Glu Ser Ala Ser 55 TTC ATC CTG AAC AAG GCT GAG AAT CTG OTT GAG AAC TTC TTA GAG AAA Phe Ile ACA GAC Thr Asp Leu Asn Lys Ala Giu Asn Leu Val Giu Phe Leu Glu Lys ATG GCA GGA AAA ATA TTT GCT GGC CAC ATT GCC AAT TCC CAG Met Ala Gly Lys Ilie Phe Ala Gly His Ilie Ala Asn Ser Gln 85 90 CAA TTT TCT AAT GAT GAG GAT GAT GGA CCT CAG Gin Phe Ser Asn Asp Giu Asp Asp Gly Pro Gin 0. to:
%*too GAA CAG CTG Giu Gin Leu AAG ATA TGT Lys Ile Cys AGT TTA Ser Leu 100 CCT TCT TCT CCA Pro Ser Ser Pro GAA TCC AAG AGA Glu Ser Lys Arg AAA GTA GAG Lys Val Giu 125 TCA CAT CTA Ser His Leu GAT GAT GAA ATG GAG GTA AAT Asp Asp Giu Met Giu Val Asn 130 GGA TTG GCC CAT Gly Leu Ala His ATG CTG Met Leu 145 ACA OCT CCT CAT GGA CTC CAG AGC TCA Thr Ala Pro His Gly Leu Gin Ser Ser 150 GTC CAA GAT ACA Val Gin Asp Thr AAG CTT TGT CCA Lys Leu Cys Pro GAT CAG TTT TGT Asp Gin Phe Cys ATA AAG ACA GAA ile Lys Thr Glu AGG 527 Arg 175 GCA AAA GAG ATA TAT CCA GTG ATG GAG Ala Lys Giu Ile Tyr Pro Val Met Glu GAG GGA CGA ACA Giu Gly Arg Thr COT CTG Arg Leu 190 575 GCT CTC ATC Ala Leu Ile AAT OCT GAT Asn Ala Asp 210 TGC AAC AAA AAG Cys Asn Lys Lys GAC TAC CTT TTT Asp Tyr Leu Phe OAT AGA GAT Asp Arg Asp 205 GAA AAT CTT Glu Asn Leu 623 ACT GAC ATT TTG Thr Asp Ile Leu ATG CAA GAA CTA Met Gin Giu Leu GGA TAC Gly Tyr 225 TCT GTG GTG TTA Ser Val Val Leu GAA AAC CTT ACA Glu Asn Leu Thr
GCT
Ala 235 CAG GAA ATG GAG Gin Giu Met Giu GAG TTA ATG CAG Olu Leu Met Gin GCT GGC CGT CCA Ala Gly Arg Pro CAC CAG TCC TCA His Gin Ser Ser AGC ACA CCT GGT Ser Thr Pro Gly OTT TAT GTC CCA Val Tyr Val Pro TGO CAT CCT GGA AGG AAT Trp His Pro Giy Arg Asn 265 CTG TGG Leu Trp 270 815 -138- GGT GAA GCA Gly Glu Ala AAA CAA AAG CCA Lys Gin Lys Pro GAT GTT Asp Val 280 CTT CAT GAT GAC ACT ATC Leu His Asp Asp Thr Ile 285 TTC AAA ATT TTC AAC AAC TCT AAC TGT CGG AGT Phe Lys Ile 290 Phe Asn Asn Ser Cys Arg Ser AGA GGC AGA Arg Gly Arg CTG AGA Leu Arg 300 AAC AAA CCC Asn Lys Pro AAG ATT Lys Ile 305 CTC ATC ATG CAG GCC TGC Leu Ile Met Gin Ala Cys 310 TAT AAT GGA ACT ATT Tyr Asn Gly Thr Ile 315
TGG
Trp 320 GTA TCC ACA AAC Val Ser Thr Asn GGG ATA GCC ACT Gly Ile Ala Thr GAT ACA GAT GAG Asp Thr Asp Glu 0 0 0 0 0 0S *0 CGT GTG TTG AGC Arg Val Leu Ser AAA TGG AAT AAT Lys Trp Asn Asn ATA ACA AAG GCC Ile Thr Lys Ala CAT GTG His Val 350 1007 1055 1103 1151 GAG ACA GAT Glu Thr Asp TGG AAG GTA Trp Lys Val 370 ATT GCT TTC AAA Ile Ala Phe Lys TCT ACC CCA CAT Ser Thr Pro His AAT ATT TCT Asn Ile Ser 365 CTC ATT GAC Leu Ile Asp GGC AAG ACT GGT Gly Lys Thr Gly CTC TTC ATT TCC Leu Phe Ile Ser
AAA
Lys 380 TGC TTC Cys Phe 385 AAA AAG TAC TGT Lys Lys Tyr Cys TGT TAT CAT TTG Cys Tyr His Leu GAA ATT TTT CGA Glu Ile Phe Arg 1199 1247 GTT CAA CAC TCA Val Gln His Ser GAG GTC CCA GGT Glu Val Pro Gly CTG ACC CAG ATG Leu Thr Gln Met ACT ATT GAG AGA GTA TCC ATG ACA CGC Thr Ile Glu Arg Val Ser Met Thr Arg TAT TTC Tyr Phe 425 TAC CTT TTT CCC GGG AAT Tyr Leu Phe Pro Gly Asn 430 1298 1358 1399 TAGCACAGGC AACTCTCATG CAGTTCACAG TCAAGTATTG CTGTAGCTGA GAAGAAAAGA AAATTCCAAG ATCCCAGGAT TTTTAAATGT GTAAAACTTT T

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 432 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID
Phe P 1 Ala I Ala I Vai I Ile 65 Asp Gin Ile Asp Leu 145 Lys Lys Leu Ala Tyr 225 Glu Thr Glu he P ~rg P ,ys I .eu .eu 4et Leu Cys Glu 130 Thr Leu Glu Ile Asp 210 Ser Leu Pro Ala ,he ~rg ~sp ksn Asn Ala Ser Thr 115 Met Ala ys Ile Ile 195 Thr Val Met Gl Prc Phe P Thr H Met I Gly I Lys Gly Leu 100 Pro Glu Pro Pro Tyr 180 Cys Asp Val Gin Val 260 3 Lys he Phe 5 [is Glu .eu Asp Tyr Val Arg Asp Gly Val 40 1sp kla Lys Gln Ser Va1 His Arg 165 Pro Asn Ile Leu Phe 245 Tyi Glr Glu Glu 70 Ile Phe Ser Asn Gly 150 Asp Val Lys Leu Lys 23C Alz Va: I Ly Leu Asn Phe Ser Pro Ala 135 Leu Gir Met LyE Asr 215 Gi 1 Pre s Pr Leu Lys Leu Val Ala Gly Asn Asp 105 Ser Glu 120 Gly Lei Gin Se Phe Cy Glu Ly 18 Phe Asl 200 I Met Gi i Asn Le Arg Pr o Trp Hi 26 o Asp Va Ser Leu 75 s s 5 p n u 0 s 5 1 hr Ala Met Ala le Lys Gly Leu 'al Glu Lys Asn His 90 Glu Ser Ala Ser Lys 170 Glu Tyr Glu Thr Glu 250 Pro Leu [le %sp Lys His Glu 155 Ile Cly Leu Leu Ala 235 His Gly His Ala Asp Arg Glu 140 Val Lys Arg Phe Leu 220 Gin Gin Arg Asp Asn 5 Gly Lys 125 Ser Gin Thr Thr Asp 205 Clu Glu Ser Asn Asp 285 Ser C Pro 110 Val His Asp Glu Arg 190 Arg Asn Met Ser Leu 270 Thr ln .ln lu Leu Thr Arg 175 Leu Asp Leu Glu Asr 255 Tr Il Phe Thr Glu Lys Asp Met Leu 160 Ala Ala Asn Cly i Thr 240 Ser Gly Phe 275 280 Lys Ile 290 Phe Asn Asn Ser Cys Arg Ser Leu Asn Lys Pro Lys
Ile Leu Ile Met Gln Ala Cys Arg Gly Arg Tyr Asn Gly Thr Ile Trp
305                 310                 315                 320
Val Ser Thr Asn Lys Gly Ile Ala Thr Ala Asp Thr Asp Glu Glu Arg
                325                 330                 335
Val Leu Ser Cys Lys Trp Asn Asn Ser Ile Thr Lys Ala His Val Glu
            340                 345                 350
Thr Asp Phe Ile Ala Phe Lys Ser Ser Thr Pro His Asn Ile Ser Trp
        355                 360                 365
Lys Val Gly Lys Thr Gly Ser Leu Phe Ile Ser Lys Leu Ile Asp Cys
    370                 375                 380
Phe Lys Lys Tyr Cys Trp Cys Tyr His Leu Glu Glu Ile Phe Arg Lys
385                 390                 395                 400
Val Gln His Ser Phe Glu Val Pro Gly Glu Leu Thr Gln Met Pro Thr
                405                 410                 415
Ile Glu Arg Val Ser Met Thr Arg Tyr Phe Tyr Leu Phe Pro Gly Asn
            420                 425                 430

INFORMATION FOR SEQ ID NO:56:
SEQUENCE CHARACTERISTICS:
LENGTH: 418 amino acids
TYPE: amino acid
STRANDEDNESS: unknown
TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Met Ala Ala Arg Arg Thr His Glu Arg Asp Pro Ile Tyr Lys Ile Lys
1               5                   10                  15
Gly Leu Ala Lys Asp Met Leu Asp Gly Val Phe Asp Asp Leu Val Glu
            20                  25                  30
Lys Asn Val Leu Asn Gly Asp Glu Leu Leu Lys Ile Gly Glu Ser Ala
        35                  40                  45
Ser Phe Ile Leu Asn Lys Ala Glu Asn Leu Val Glu Asn Phe Leu Glu
    50                  55                  60
Lys Thr Asp Met Ala Gly Lys Ile Phe Ala Gly His Ile Ala Asn Ser
65                  70                  75                  80
Gln Glu Gln Leu Ser Leu Gln Phe Ser Asn Asp Glu Asp Asp Gly Pro
                85                  90                  95
Gln Lys Ile Cys Thr Pro Ser Ser Pro Ser Glu Ser Lys Arg Lys Val
            100                 105                 110
Glu Asp Asp Glu Met Glu Val Asn Ala Gly Leu Ala His Glu Ser His
        115                 120                 125
Leu Met Leu Thr Ala Pro His Gly Leu Gln Ser Ser Glu Val Gln Asp
    130                 135                 140
Thr Leu Lys Leu Cys Pro Arg Asp Gln Phe Cys Lys Ile Lys Thr Glu
145                 150                 155                 160
Arg Ala Lys Glu Ile Tyr Pro Val Met Glu Lys Glu Gly Arg Thr Arg
                165                 170                 175
Leu Ala Leu Ile Ile Cys Asn Lys Lys Phe Asp Tyr Leu Phe Asp Arg
            180                 185                 190
Asp Asn Ala Asp Thr Asp Ile Leu Asn Met Gln Glu Leu Leu Glu Asn
        195                 200                 205
Leu Gly Tyr Ser Val Val Leu Lys Glu Asn Leu Thr Ala Gln Glu Met
    210                 215                 220
Glu Thr Glu Leu Met Gln Phe Ala Gly Arg Pro Glu His Gln Ser Ser
225                 230                 235                 240
Asp Ser Thr Pro Gly Val Tyr Val Pro Trp His Pro Gly Arg Asn Leu
                245                 250                 255
Trp Gly Glu Ala Pro Lys Gln Lys Pro Asp Val Leu His Asp Asp Thr
            260                 265                 270
Ile Phe Lys Ile Phe Asn Asn Ser Asn Cys Arg Ser Leu Arg Asn Lys
        275                 280                 285
Pro Lys Ile Leu Ile Met Gln Ala Cys Arg Gly Arg Tyr Asn Gly Thr
    290                 295                 300
Ile Trp Val Ser Thr Asn Lys Gly Ile Ala Thr Ala Asp Thr Asp Glu
305                 310                 315                 320
Glu Arg Val Leu Ser Cys Lys Trp Asn Asn Ser Ile Thr Lys Ala His
                325                 330                 335
Val Glu Thr Asp Phe Ile Ala Phe Lys Ser Ser Thr Pro His Asn Ile
            340                 345                 350
Ser Trp Lys Val Gly Lys Thr Gly Ser Leu Phe Ile Ser Lys Leu Ile
        355                 360                 365
Asp Cys Phe Lys Lys Tyr Cys Trp Cys Tyr His Leu Glu Glu Ile Phe
    370                 375                 380
Arg Lys Val Gln His Ser Phe Glu Val Pro Gly Glu Leu Thr Gln Met
385                 390                 395                 400
Pro Thr Ile Glu Arg Val Ser Met Thr Arg Tyr Phe Tyr Leu Phe Pro
                405                 410                 415
Gly Asn

INFORMATION FOR SEQ ID NO:57:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
ATGCTAACTG TCCAAGTCTA

INFORMATION FOR SEQ ID NO:58:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
TCCAACAGCA GGAATAGCA 19

INFORMATION FOR SEQ ID NO:59:
SEQUENCE CHARACTERISTICS:
LENGTH: 29 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
TGATCGCCAT CGGGGAAATC GAGGTAGAA 29

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 29 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
ATCATATCAT CCAGGCATCG TGCAGAGGG 29

INFORMATION FOR SEQ ID NO:61:
SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
GTTGCACTGC TTTCACGATC TCCCGTCTCT

INFORMATION FOR SEQ ID NO:62:
SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
TCATCGACTT TTAGATGACT AGAGAACATC

INFORMATION FOR SEQ ID NO:63:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
GTTTAATTAC CCAAGTTTGA G 21

INFORMATION FOR SEQ ID NO:64:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
CCGGTGACAT TGGACACTC 19

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 15 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID
ACTATTCAAC ACTTG

INFORMATION FOR SEQ ID NO:66:
SEQUENCE CHARACTERISTICS:
LENGTH: 5 amino acids
TYPE: amino acid
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
Gln Ala Cys Arg Gly
1               5

INFORMATION FOR SEQ ID NO:67:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
CAACCCTGTA ACTCTTGATT

INFORMATION FOR SEQ ID NO:68:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
ACCTCTTTGG AGCTACCAGA A 21

INFORMATION FOR SEQ ID NO:69:
SEQUENCE CHARACTERISTICS:
LENGTH: 28 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
CCAGATCTAT GCTAACTGTC CAAGTCTA 28

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 28 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
AAGAGCTCCT CCAACAGCAG GAATAGCA 28

INFORMATION FOR SEQ ID NO:71:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
AGAAGCACTT GTCTCTGCTC

INFORMATION FOR SEQ ID NO:72:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
TTGGCACCTG ATGGCAATAC

INFORMATION FOR SEQ ID NO:73:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
GATATCCGCA CAAGGAGCTG A 21

INFORMATION FOR SEQ ID NO:74:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
CTATAGGTGG GAGGGTGTCC

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
GATATCCAGA GGGAGGGAAC GAT 23

INFORMATION FOR SEQ ID NO:76:
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
GATATCAGAG CAAGAGAGGC GGT 23

INFORMATION FOR SEQ ID NO:77:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
GATATCGTGG GAGGGTGTCC T 21

INFORMATION FOR SEQ ID NO:78:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
ATCCAGGCCT CTAGAGGAGA T 21

INFORMATION FOR SEQ ID NO:79:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
ATCTCCTCTA GAGGCCTGGA T 21

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
TGCGGCTATA CGTGCCTCAA A 21

INFORMATION FOR SEQ ID NO:81:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
TTTGAGGCAC GTATAGCCGC A 21

INFORMATION FOR SEQ ID NO:82:
SEQUENCE CHARACTERISTICS:
LENGTH: 24 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
ATTCCGCACA AGGAGCTGAT GGCC 24

INFORMATION FOR SEQ ID NO:83:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
GCTGGTCGAC ACCTCTATC 19

INFORMATION FOR SEQ ID NO:84:
SEQUENCE CHARACTERISTICS:
LENGTH: 23 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
CAAGCTTTTG ATGCCTTCTG TGA 23

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
CTCCAACAGC AGGAATAGCA

INFORMATION FOR SEQ ID NO:86:
SEQUENCE CHARACTERISTICS:
LENGTH: 35 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
GCGATCTTTC GATCGCGATC TTTCTGTCGG AAGGC

INFORMATION FOR SEQ ID NO:87:
SEQUENCE CHARACTERISTICS:
LENGTH: 38 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
CAGATCGCCT TGTAGATAGA AAGAACATCT TTGATCGG 38

INFORMATION FOR SEQ ID NO:88:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
ATGCTAACTG TCCAAGTCTA

INFORMATION FOR SEQ ID NO:89:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
GTCTCATCTT CATCAACTCC

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 22 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
GTTACCTGCA CACCGAGTCA CG 22

INFORMATION FOR SEQ ID NO:91:
SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
GCGTGGTTCT TCTTTCCATC TTGTTGGTCA

INFORMATION FOR SEQ ID NO:92:
SEQUENCE CHARACTERISTICS:
LENGTH: 27 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:
ACCTTAATAT GCAAGACTCT CAAGGAG 27

INFORMATION FOR SEQ ID NO:93:
SEQUENCE CHARACTERISTICS:
LENGTH: 27 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:
GCGGCTTGAC TTGTCCATTA TTGGATA 27
INFORMATION FOR SEQ ID NO:94:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
GACCTGACAG ACTACCTCAT

INFORMATION FOR SEQ ID
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: both
TOPOLOGY: both
(xi) SEQUENCE DESCRIPTION: SEQ ID
AGACAGCACT GTGTTGGCAT
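
The oligonucleotide entries in the listing above each state a LENGTH and a sequence, so they can be cross-checked mechanically. The following is a minimal illustrative sketch in Python and is not part of the specification. It assumes the sequences as transcribed above for SEQ ID NO:57, 58, 78 and 79 with the grouping spaces removed; the names `PRIMERS`, `reverse_complement` and `length_mismatches` are hypothetical helpers introduced only for this check. It compares each declared length with the actual length, and tests whether SEQ ID NO:79 reads as the reverse complement of SEQ ID NO:78, which is an observation drawn from the transcribed sequences rather than a statement made in the listing.

```python
# Illustrative consistency check; not part of the patent specification.
# Sequences are copied from the listing above with grouping spaces removed.
# Each value is (sequence, declared LENGTH in base pairs).
PRIMERS = {
    "SEQ ID NO:57": ("ATGCTAACTGTCCAAGTCTA", 20),
    "SEQ ID NO:58": ("TCCAACAGCAGGAATAGCA", 19),
    "SEQ ID NO:78": ("ATCCAGGCCTCTAGAGGAGAT", 21),
    "SEQ ID NO:79": ("ATCTCCTCTAGAGGCCTGGAT", 21),
}

# Watson-Crick base pairing for unambiguous DNA bases only.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_mismatches(primers: dict) -> list:
    """Return the names of entries whose sequence length differs from the declared LENGTH."""
    return [name for name, (seq, declared) in primers.items() if len(seq) != declared]


if __name__ == "__main__":
    print("length mismatches:", length_mismatches(PRIMERS))
    seq78 = PRIMERS["SEQ ID NO:78"][0]
    seq79 = PRIMERS["SEQ ID NO:79"][0]
    print("NO:79 is reverse complement of NO:78:", reverse_complement(seq78) == seq79)
```

The same two helpers can be pointed at any other entry whose identifier and sequence are both legible in the listing.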

Claims (20)

1. A method of preventing programmed cell death in vertebrates comprising the step of inhibiting the enzymatic activity of interleukin-1β converting enzyme.
2. The method of claim 1, wherein said enzymatic activity is inhibited by an interleukin-1β converting enzyme-specific antiprotease.
3. The method of claim 2, wherein said antiprotease is encoded by the crmA gene.
4. A method of promoting programmed death in vertebrate cells by increasing the enzymatic activity of interleukin-1β converting enzyme in said cells.
5. The method of claim 4, wherein said vertebrate cells are cancer cells.
6. The method of claim 5, wherein said cancer cells overexpress the oncogene bcl-2.
7. A substantially pure gene which is preferentially expressed in thymus and placental cells and which encodes a protein causing programmed cell death.
8. The gene of claim 7, wherein said gene encodes a protein having the amino acid sequence shown in Figures 6A, 6B and 6C (SEQ ID NO: 42).
9. The gene of claim 8, wherein said gene has the cDNA sequence shown in Figures 6A, 6B and 6C (SEQ ID NO: 41).
10. An expression vector having the gene of either claim 8 or claim 9.
11. A host cell transformed with the vector of claim
12. A substantially pure protein, wherein said protein is preferentially expressed in thymus or placental cells and which causes the death of said cells.
13. The protein of claim 12, wherein said protein has the amino acid sequence shown in Figures 6A, 6B and 6C (SEQ ID NO: 42).
14. A functional derivative of the protein of claim 13.
15. A method of promoting programmed cell death in thymus or placental cells comprising the step of increasing the activity of the protein of claim 12.
16. A substantially pure DNA molecule comprising a cDNA sequence selected from the group consisting of the cDNA sequence shown in Figures and 10B (SEQ ID NO:
17. An expression vector having the DNA of claim 16.
18. A host cell transformed with the vector of claim 17.
19. A substantially pure protein comprising an amino acid sequence selected from the group consisting of the amino acid sequence shown in Figures 10A and 10B (SEQ ID NO: 51).
20. A functional derivative of the protein of claim 19.
21. A substantially pure DNA molecule comprising the cDNA sequence shown in Figures 14 and 14A (SEQ ID NO: 54).
22. An expression vector having the DNA of claim 21.
23. A host cell transformed with the vector of claim 22.
24. A substantially pure protein comprising the amino acid sequence shown in Figures 14 and 14A (SEQ ID NO: 55).
25. A functional derivative of the protein of claim 24.
26. A method of regulating interleukin-1β converting enzyme by regulating the levels or activity of tumor necrosis factor.
27. A substantially pure DNA molecule comprising a nucleic acid sequence encoding a protein having an amino acid sequence shown in Figures 10A or 10B (SEQ ID NO: 51).
28. A substantially pure DNA molecule comprising a nucleic acid sequence encoding a protein having an amino acid sequence shown in Figures 14 and 14A (SEQ ID NO:

Dated this 15th day of March, 2000
THE GENERAL HOSPITAL CORPORATION
Patent Attorneys for the Applicant
PETER MAXWELL ASSOCIATES
AU22472/00A 1995-01-04 2000-03-22 Programmed cell death genes and proteins Abandoned AU2247200A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU22472/00A AU2247200A (en) 1995-01-04 2000-03-22 Programmed cell death genes and proteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US368704 1995-01-04
AU22472/00A AU2247200A (en) 1995-01-04 2000-03-22 Programmed cell death genes and proteins

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU47472/96A Division AU4747296A (en) 1995-01-04 1996-01-04 Programmed cell death genes and proteins

Publications (1)

Publication Number Publication Date
AU2247200A (en) 2000-06-01

Family

ID=3711656

Family Applications (1)

Application Number Title Priority Date Filing Date
AU22472/00A Abandoned AU2247200A (en) 1995-01-04 2000-03-22 Programmed cell death genes and proteins

Country Status (1)

Country Link
AU (1) AU2247200A (en)

Similar Documents

Publication Publication Date Title
US20030124105A1 (en) Programmed cell death genes and proteins
US6083735A (en) Programmed cell death genes and proteins
WO1996020721A9 (en) Programmed cell death genes and proteins
US8314067B1 (en) Relatedness of human interleukin-1β convertase gene to a C. elegans cell death gene, inhibitory portions of these genes and uses therefor
US5962301A (en) Relatedness of human interleukin-1β convertase gene to a C. elegans cell death gene, inhibitory portions of these genes and uses therefor
US7071302B1 (en) Cloning sequencing and characterization of two cell death genes and uses therefor
JP2004501637A (en) New protease
US7115260B2 (en) Interleukin-1β converting enzyme like apoptotic protease-6
WO1996036698A1 (en) Mch2, AN APOPTOTIC CYSTEINE PROTEASE, AND COMPOSITIONS FOR MAKING AND METHODS OF USING THE SAME
JPH10113188A (en) Interleukin-1-beta converting enzyme-like apoptosis protease 7
AU2002352119B2 (en) Human cDNAs and proteins and uses thereof
WO1997014797A9 (en) Cystatin m, a novel cysteine proteinase inhibitor
AU2247200A (en) Programmed cell death genes and proteins
WO1997014797A2 (en) Cystatin m, a novel cysteine proteinase inhibitor
Leaver et al. Conservation of the tandem arrangement of α1-microglobulin/bikunin mRNA: cloning of a cDNA from plaice (Pleuronectes platessa)
AU696039C (en) Programmed cell death genes and proteins
JP2003510086A (en) Novel human aminopeptidase 22196
WO1995009913A9 (en) Human monocyte/macrophage derived metalloproteinase inhibitor
WO1995009913A1 (en) Human monocyte/macrophage derived metalloproteinase inhibitor
WO1997031931A1 (en) A novel disintegrin metalloprotease and methods of use
US20020136714A1 (en) Relatedness of human interleukin-1beta convertase gene to a C. elegans cell death gene, inhibitory portions of these genes and uses therefor
US6890721B1 (en) Interleukin-1β converting enzyme like apoptotic protease-6
EP0808904A2 (en) Interleukin-1 beta converting enzyme like apoptotic protease-6
AU2002361168A1 (en) Cytotoxic cyplasin of the sea hare, Aplysia punctata, cDNA cloning and expression of bioreactive recombinants
JP2001502171A (en) Interleukin-1β converting enzyme-like cysteine protease

Legal Events

Date Code Title Description
MK5 Application lapsed section 142(2)(e) - patent request and compl. specification not accepted