US5821075A

US5821075A - Nucleotide sequences for novel protein tyrosine phosphatases

Info

Publication number: US5821075A
Application number: US08/596,291
Authority: US
Inventors: Leonel Jorge Gonez; Jan Saras; Lana Claesson-Welsh; Carl-Henrik Heldin
Original assignee: Ludwig Institute for Cancer Research New York
Current assignee: Ludwig Institute for Cancer Research New York
Priority date: 1993-09-01
Filing date: 1994-09-01
Publication date: 1998-10-13
Anticipated expiration: 2015-10-13
Also published as: WO1995006735A2; EP0789771A2; US6066472A; AU683299B2; AU7644394A; JPH09510861A; CA2170515C; CA2170515A1; WO1995006735A3; NZ273219A

Abstract

The invention relates to the cloning of two novel protein tyrosine phosphatases. Nucleic acid sequences encoding these phosphatases (PTPL1 and GLM-2) as well as anti-sense sequences are also provided. The recombinantly produced PTPL1 and GLM-2 proteins also are provided, as well as antibodies to these proteins. Methods relating to isolating the phosphatases, using the nucleic acid sequences, and using the phosphatases also are provided.

Description

This application is a national stage application under 35 U.S.C. §371 of PCT/US94/09943, filed Sep. 1, 1994, which is a continuation of U.S. application Ser. No. 08/115,573, filed Sep. 1, 1993, now abandoned.

FIELD OF THE INVENTION

This invention relates to the isolation and cloning of nucleic acids encoding two novel protein tyrosine phosphatases (PTPs). Specifically, the present invention relates to the isolation and cloning of two PTPs from human glioblastoma cDNA which have been designated PTPL1 and GLM-2. The present invention provides isolated PTP nucleic acid sequences; isolated PTP anti-sense sequences; vectors containing such nucleic acid sequences; cells, cell lines and animal hosts transformed by a recombinant vector so as to exhibit increased, decreased, or differently regulated expression of the PTPs; isolated probes for identifying sequences substantially similar or homologous to such sequences; substantially pure PTP proteins and variants or fragments thereof; antibodies or other agents which bind to these PTPs and variants or fragments thereof; methods of assaying for activity of these PTPs; methods of assessing the regulation of PTPL1 or GLM-2; and methods of identifying and/or testing drugs which may affect the expression or activity of these PTPs.

BRIEF DESCRIPTION OF THE BACKGROUND ART

Protein tyrosine phosphorylation plays an essential role in the regulation of cell growth, proliferation and differentiation (reviewed in Hunter, T. (1987) Cell 50:823-829). This dynamic process is modulated by the counterbalancing activities of protein tyrosine kinases (PTKs) and protein tyrosine phosphatases (PTPs). The recent elucidation of intracellular signaling pathways has revealed important roles for PTKs. Conserved domains like the Src homology 2 (SH2) (Suh, P.-G., et al., (1988) Proc. Natl. Acad. Sci. (USA) 85:5419-5423) and the Src homology 3 (SH3) (Mayer, B. J., et al., (1988) Nature 352:272-275) domains have been found to determine the interaction between activated PTKs and signal transducing molecules (reviewed in Pawson, T., and Schlessinger, J. (1993) Current Biol. 3:434-442; Koch, C. A., et al., (1991) Science 252:668-674). The overall effect of such protein interactions is the formation of signaling cascades in which phosphorylation and dephosphorylation of proteins on tyrosine residues are major events. The involvement of PTPs in such signaling cascades is beginning to emerge from studies on the regulation and mechanisms of action of several representatives of this broad family of proteins.

Similarly to PTKs, PTPs can be classified according to their secondary structure into two broad groups, i.e. cytoplasmic and transmembrane molecules (reviewed in Charbonneau, H., and Tonks, N. K. (1992) Annu. Rev. Cell Biol. 8:463-493; Pot, D. A., and Dixon, J. E. (1992) Biochim. Biophys. Acta 1136:35-43). Transmembrane PTPs have the structural organization of receptors and thus the potential to initiate cellular signaling in response to external stimuli. These molecules are characterized by the presence of a single transmembrane segment and two tandem PTP domains; only two examples of transmembrane PTPs that have single PTP domains are known, HPTPβ (Krueger, N. X., et al., (1990) EMBO J. 9:3241-3252) and DPTP10D (Tian, S.-S., et al., (1991) Cell 67:675-685).

Nonreceptor PTPs display a single catalytic domain and contain, in addition, non-catalytic amino acid sequences which appear to control intracellular localization of the molecules and which may be involved in the determination of substrate specificity (Mauro, L. J., and Dixon, J. E. (1994) TIBS 19:151-155) and have also been suggested to be regulators of PTP activity (Charbonneau, H., and Tonks, N. K. (1992) Annu. Rev. Cell Biol. 8:463-493). PTP1B (Tonks, N. K., et al., (1988) J. Biol. Chem. 263:6731-6737) is localized to the cytosolic face of the endoplasmic reticulum via its C-terminal 35 amino acids (Frangioni, J. V., et al., (1992) Cell 68:545-560). The proteolytic cleavage of PTP1B by the calcium dependent neutral protease calpain occurs upstream from this targeting sequence, and results in the relocation of the enzyme from the endoplasmic reticulum to the cytosol; such relocation is concomitant with a two-fold stimulation of PTP1B enzymatic activity (Frangioni, J. V., et al., (1993) EMBO J. 12:4843-4856). Similarly, the 11 kDa C-terminal domain of T-cell PTP (Cool, D. E., et al., (1989) Proc. Natl. Acad. Sci. (USA) 86:5257-5261) has also been shown to be responsible for enzyme localization and functional regulation (Cool, D. E., et al., (1990) Proc. Natl. Acad. Sci. (USA) 87:7280-7284; Cool, D. E., et al., (1992) Proc. Natl. Acad. Sci. (USA) 89:5422-5426).

PTPs containing SH2 domains have been described including PTP1C (Shen, S.-H., et al., (1991) Nature 352:736-739), also named HCP (Yi, T., et al., (1992) Mol. Cell. Biol. 12:836-846), SHP (Matthews, R. J., et al., (1992) Mol. Cell. Biol. 12:2396-2405) or SH-PTP1 (Plutzky, J., et al., (1992) Proc. Natl. Acad. Sci. (USA) 89:1123-1127), and the related phosphatase PTP2C (Ahmad, S., et al., (1993) Proc. Natl. Acad. Sci. (USA) 90:2197-2201), also termed SH-PTP2 (Freeman Jr., R. M., et al., (1992) Proc. Natl. Acad. Sci. (USA) 89:11239-11243), SH-PTP3 (Adachi, M., et al., (1992) FEBS Letters 314:335-339), PTP1D (Vogel, W., et al., (1993) Science 259:1611-1614) or Syp (Feng, G.-S., et al., (1993) Science 259:1607-1611). The Drosophila csw gene product (Perkins, L. A., et al., (1992) Cell 70:225-236) also belongs to this subfamily. PTP1C has been shown to associate via its SH2 domains with ligand-activated c-Kit and CSF-1 receptor PTKs (Yi, T., and Ihle, J. N. (1993) Mol. Cell. Biol. 13:3350-3358; Yeung, Y.-G., et al., (1992) J. Biol. Chem. 267:23447-23450), but only association with activated CSF-1 receptor is followed by tyrosine phosphorylation of PTP1C. Syp interacts with and is phosphorylated by the ligand-activated receptors for epidermal growth factor and platelet-derived growth factor (Feng, G.-S., et al., (1993) Science 259:1607-1611). Syp has also been reported to associate with tyrosine phosphorylated insulin receptor substrate 1 (Kuhne, M. R., et al., (1993) J. Biol. Chem. 268:11479-11481).

Two PTPs have been identified, PTPH1 (Yang, Q., and Tonks, N. K. (1991) Proc. Natl. Acad. Sci. (USA) 88:5949-5953) and PTPase MEG (Gu, M., et al., (1991) Proc. Natl. Acad. Sci. (USA) 88:5867-5871), which contain a region in their respective N-terminal segments with similarity to the cytoskeletal-associated proteins band 4.1 (Conboy, J., et al., (1986) Proc. Natl. Acad. Sci. (USA) 83:9512-9516), ezrin (Gould, K. L., et al., (1989) EMBO J. 8:4133-4142), talin (Rees, D. J. G., et al., (1990) Nature 347:685-689) and radixin (Funayama, N., et al., (1991) J. Cell Biol. 115:1039-1048). The function of proteins of the band 4.1 family appears to be the provision of anchors for cytoskeletal proteins at the inner surface of the plasma membrane (Conboy, J., et al., (1986) Proc. Natl. Acad. Sci. (USA) 83:9512-9516; Gould, K. L., et al., (1989) EMBO J. 8:4133-4142). It has been postulated that PTPH1 and PTPase MEG would, like members of this family, localize at the interface between the plasma membrane and the cytoskeleton and thereby be involved in the modulation of cytoskeletal function (Tonks, N. K., et al., (1991) Cold Spring Harbor Symposia on Quantitative Biology LVI:265-273).

The interest in studying PTKs and PTPs is particularly great in cancer research. For example, approximately one third of the known oncogenes include PTKs (Hunter, T. (1989) In Oncogenes and Molecular Origins of Cancer, R. Weinberg, Ed., Cold Spring Harbor Laboratory Press, New York). In addition, the extent of tyrosine phosphorylation closely correlates with the manifestation of the transformed phenotype in cells infected by temperature-sensitive mutants of Rous sarcoma virus (Sefton, B., et al., (1980) Cell 20:807-816). Similarly, Brown-Shimer and colleagues demonstrated that over-expression of PTP1B in 3T3 cells suppressed the transforming potential of oncogenic neu, as measured by focus formation, anchorage-independent growth and tumorigenicity (Brown-Shimer, S., et al., (1992) Cancer Res. 52:478-482). Because they are direct antagonists of PTK activity, the PTPs also may provide an avenue of treatment for cancers caused by excessive PTK activity. Therefore, the isolation, characterization and cloning of various PTPs is an important step in developing, for example, gene therapy to treat PTK oncogene cancers.

SUMMARY OF THE INVENTION

The present invention is based upon the molecular cloning of previously uncloned and previously undisclosed nucleic acids encoding two novel PTPs. The disclosed sequences encode PTPs which we have designated PTPL1 and GLM-2. (PTPL1 was previously designated GLM-1 in U.S. patent application Ser. No. 08/115,573 filed Sep. 1, 1993.) In particular, the present invention is based upon the molecular cloning of PTPL1 and GLM-2 PTP sequences from human glioblastoma cells. The invention provides isolated cDNA and RNA sequences corresponding to PTPL1 and GLM-2 transcripts and encoding the novel PTPs. In addition, the present invention provides vectors containing PTPL1 or GLM-2 cDNA sequences, vectors capable of expressing PTPL1 or GLM-2 sequences with endogenous or exogenous promoters, and hosts transformed with one or more of the above-mentioned vectors. Using the sequences disclosed herein as probes or primers in conjunction with such techniques as PCR cloning, targeted gene walking, and colony/plaque hybridization with genomic or cDNA libraries, the invention further provides for the isolation of allelic variants of the disclosed sequences, endogenous PTPL1 or GLM-2 regulatory sequences, and substantially similar or homologous PTPL1 or GLM-2 DNA and RNA sequences from other species including mouse, rat, rabbit and non-human primates.

The present invention also provides fragments and variants of isolated PTPL1 and GLM-2 sequences, fragments and variants of isolated PTPL1 or GLM-2 RNA, vectors containing variants or fragments of PTPL1 or GLM-2 sequences, vectors capable of expressing variants or fragments of PTPL1 or GLM-2 sequences with endogenous or exogenous regulatory sequences, and hosts transformed with one or more of the above-mentioned vectors. The invention further provides variants or fragments of substantially similar or homologous PTPL1 and GLM-2 DNA and RNA sequences from species including mouse, rat, rabbit and non-human primates.

The present invention provides isolated PTPL1 and GLM-2 anti-sense DNA, isolated PTPL1 and GLM-2 anti-sense RNA, vectors containing PTPL1 or GLM-2 anti-sense DNA, vectors capable of expressing PTPL1 or GLM-2 anti-sense DNA with endogenous or exogenous promoters, and hosts transformed with one or more of the above-mentioned vectors. The invention further provides the related PTPL1 or GLM-2 anti-sense DNA and anti-sense RNA sequences from other species including mouse, rat, rabbit and non-human primates.

The present invention also provides fragments and variants of isolated PTPL1 and GLM-2 anti-sense DNA, fragments and variants of isolated PTPL1 and GLM-2 anti-sense RNA, vectors containing fragments or variants of PTPL1 and GLM-2 anti-sense DNA, vectors capable of expressing fragments or variants of PTPL1 and GLM-2 anti-sense DNA with endogenous or exogenous promoters, and hosts transformed with one or more of the above-mentioned vectors. The invention further provides fragments or variants of the related PTPL1 and GLM-2 anti-sense DNA and PTPL1 and GLM-2 anti-sense RNA sequences from other species including mouse, rat, rabbit and non-human primates.

Based upon the sequences disclosed herein and techniques well known in the art, the invention also provides isolated probes useful for detecting the presence or level of expression of a sequence identical, substantially similar or homologous to the disclosed PTPL1 and GLM-2 sequences. The probes may consist of the PTPL1 and GLM-2 DNA, RNA or anti-sense sequences disclosed herein. The probe may be labeled with, for example, a radioactive isotope; immobilized as, for example, on a filter for Northern or Southern blotting; or may be tagged with any other sort of marker which enhances or facilitates the detection of binding. The probes may be oligonucleotides or synthetic oligonucleotide analogs.

The invention also provides substantially pure PTPL1 and GLM-2 proteins. The proteins may be obtained from natural sources using the methods disclosed herein or, in particular, the invention provides substantially pure PTPL1 and GLM-2 proteins produced by a host cell or transgenic animal transformed by one of the vectors disclosed herein.

The invention also provides substantially pure variants and fragments of PTPL1 and GLM-2 proteins. Using the substantially pure PTPL1 or GLM-2 protein or variants or fragments of the PTPL1 or GLM-2 protein which are disclosed herein, the present invention provides methods of obtaining and identifying agents capable of binding to either PTPL1 or GLM-2. Specifically, such agents include antibodies, peptides, carbohydrates and pharmaceutical agents. The agents may include natural ligands, co-factors, accessory proteins or associated peptides, modulators, regulators, or inhibitors. The entire PTPL1 or GLM-2 protein may be used to test or develop such agents or variants or fragments thereof may be employed. In particular, only certain domains of the PTPL1 or GLM-2 protein may be employed. The invention further provides detectably labeled, immobilized and toxin-conjugated forms of these agents.

The present invention also provides methods for assaying for PTPL1 or GLM-2 PTP activity. For example, using the PTPL1 and GLM-2 anti-sense probes disclosed herein, the presence and level of either PTPL1 or GLM-2 expression may be determined by hybridizing the probes to total or selected mRNA from the cell or tissue to be studied. Alternatively, using the antibodies or other binding agents disclosed herein, the presence and level of PTPL1 or GLM-2 protein may be assessed. Such methods may, for example, be employed to determine the tissue-specificity of PTPL1 or GLM-2 expression.

The present invention also provides methods for assessing the regulation of PTPL1 or GLM-2 function. Such methods include fusion of the regulatory regions of the PTPL1 or GLM-2 nucleic acid sequences to a marker locus, introduction of this fusion product into a host cell using a vector, and testing for inducers or inhibitors of PTPL1 or GLM-2 by measuring expression of the marker locus. In addition, by using labeled PTPL1 and GLM-2 anti-sense transcripts, the level of expression of PTPL1 or GLM-2 mRNA may be ascertained and the effect of various endogenous and exogenous compounds or treatments on PTPL1 or GLM-2 expression may be determined. Similarly, the effect of various endogenous and exogenous compounds and treatments on PTPL1 or GLM-2 expression may be assessed by measuring the level of either PTPL1 or GLM-2 protein with labeled antibodies as disclosed herein.

The present invention provides methods for efficiently testing the activity or potency of drugs intended to enhance or inhibit PTPL1 or GLM-2 expression or activity. In particular, the nucleic acid sequences and vectors disclosed herein enable the development of cell lines and transgenic organisms with increased, decreased, or differently regulated expression of PTPL1 or GLM-2. Such cell lines and animals are useful subjects for testing pharmaceutical compositions.

The present invention further provides methods of modulating the activity of PTPL1 and GLM-2 PTPs in cells. Specifically, agents and, in particular, antibodies which are capable of binding to either PTPL1 or GLM-2 PTP are provided to a cell expressing PTPL1 or GLM-2. The binding of such an agent to the PTP can be used either to activate or inhibit the activity of the protein. In addition, PTPL1 and GLM-2 anti-sense transcripts may be administered such that they enter the cell and inhibit translation of the PTPL1 or GLM-2 mRNA and/or the transcription of PTPL1 or GLM-2 nucleic acid sequences. Alternatively, PTPL1 or GLM-2 RNA may be administered such that it enters the cell, serves as a template for translation and thereby augments production of PTPL1 or GLM-2 protein. In another embodiment, a vector capable of expressing PTPL1 or GLM-2 mRNA transcripts or PTPL1 or GLM-2 anti-sense RNA transcripts is administered such that it enters the cell and the transcripts are expressed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Comparison of PTPL1 (SEQ ID NO:11) with proteins of the band 4.1 superfamily (ezrin, SEQ ID NO:12; band 4.1, SEQ ID NO:13). The alignment was done using the Clustal V alignment program (Fazioli, F., et al., (1993) Oncogene 8:1335-1345). Identical amino acid residues conserved in two or more sequences are boxed. A conserved tyrosine residue, which in ezrin has been shown to be phosphorylated by the epidermal growth factor receptor, is indicated by an asterisk.

FIG. 2. Comparison of amino acid sequences of GLGF-repeats. The alignment was done manually. Numbers of the GLGF-repeats are given starting from the N-terminus of the protein. Residues conserved in at least eight (42%) repeats are shown in bold letters. Five repeats are found in PTPL1 (SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20), three are found in the guanylate kinases, dlg-A gene product (SEQ ID NO:23, SEQ ID NO:24, and SEQ ID NO:25), PSD-95 (SEQ ID NO:26, SEQ ID NO:27, and SEQ ID NO:28) and the 220-kDa protein (SEQ ID NO:29, SEQ ID NO:30, and SEQ ID NO:31). One GLGF-repeat is found in the guanylate kinase p55 (SEQ ID NO:32), in the PTPs PTPH1 (SEQ ID NO:21) and PTPase MEG (SEQ ID NO:22), and in nitric oxide synthase (NOS, SEQ ID NO:33). One repeat is also found in an altered ros1 transcript from the glioma cell line U-118MG (SEQ ID NO:34).

FIG. 3. Schematic diagram illustrating the domain structure of PTPL1 and other GLGF-repeat containing proteins. Domains and motifs indicated in the figure are: L, leucine zipper motif; Band 4.1, band 4.1-like domain; G, GLGF-repeat; PTPase, catalytic PTPase domain; 3, SH3 domain; GK, guanylate kinase domain; Bind. Reg., co-enzyme binding region.

FIG. 4. PTP activity of PTPL1. Immunoprecipitates from COS-1 cells using an antiserum (αL1B) against PTPL1, unblocked (open circles) or blocked with peptide (open squares), were incubated for 2, 4, 6 or 12 minutes with myelin basic protein, ³²P-labeled on tyrosine residues. The amount of radioactivity released as inorganic phosphate is expressed as the percentage of the total input of radioactivity.

DETAILED DESCRIPTION OF THE INVENTION

DEFINITIONS

In the description that follows, a number of terms used in biochemistry, molecular biology, recombinant DNA (rDNA) technology and immunology are extensively utilized. In addition, certain new terms are introduced for greater ease of exposition and to more clearly and distinctly point out the subject matter of the invention. In order to provide a clear and consistent understanding of the specification and claims, including the scope to be given such terms, the following definitions are provided.

Gene. A gene is a nucleic acid sequence including a promoter region operably joined to a coding sequence which may serve as a template from which an RNA molecule may be transcribed by a nucleic acid polymerase. A gene contains a promoter sequence to which the polymerase binds, an initiation sequence which signals the point at which transcription should begin, and a termination sequence which signals the point at which transcription should end. The gene also may contain an operator site at which a repressor may bind to block the polymerase and to prevent transcription and/or may contain ribosome binding sites, capping signals, transcription enhancers and polyadenylation signals. The promoter, initiation, termination and, when present, operator sequences, ribosome binding sites, capping signals, transcription enhancers and polyadenylation signals are collectively referred to as regulatory sequences. Regulatory sequences 5' of the transcription initiation codon are collectively referred to as the promoter region. The sequences which are transcribed into RNA are the coding sequences. The RNA may or may not code for a protein. RNA that codes for a protein is processed into messenger RNA (mRNA). Other RNA molecules may serve functions or uses without ever being translated into protein. These include ribosomal RNA (rRNA), transfer RNA (tRNA), and the anti-sense RNAs of the present invention. In eukaryotes, coding sequences between the translation start codon (ATG) and the translation stop codon (TAA, TGA, or TAG) may be of two types: exons and introns. The exons are included in processed mRNA transcripts and are generally translated into a peptide or protein. Introns are excised from the RNA as it is processed into mature mRNA and are not translated into peptide or protein. As used herein, the word gene embraces both the gene including its introns, as may be obtained from genomic DNA, and the gene with the introns excised from the DNA, as may be obtained from cDNA.
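The relationship between the translation start codon (ATG) and the stop codons (TAA, TGA, TAG) named in this definition can be illustrated with a minimal sketch. It assumes an intron-free cDNA-style sequence written as the coding strand, and the function name `find_first_orf` is hypothetical, introduced only for illustration.

```python
# Stop codons as listed in the definition above.
STOP_CODONS = {"TAA", "TGA", "TAG"}

def find_first_orf(cdna):
    """Return the stretch from the first ATG through the first in-frame
    stop codon, or None if no such stretch exists.

    Assumes an intron-free, coding-strand DNA string (illustrative only).
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    # Step through the sequence one codon at a time, in frame with the ATG.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return None

print(find_first_orf("CCATGGCCTGATT"))  # ATGGCCTGA
```

A genomic sequence containing introns would not be handled by this sketch, since splicing is not modeled.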

Anti-sense DNA is defined as DNA that encodes anti-sense RNA and anti-sense RNA is RNA that is complementary to or capable of selectively hybridizing to some specified RNA transcript. Thus, anti-sense RNA for a particular gene would be capable of hybridizing with that gene's RNA transcript in a selective manner. Finally, an anti-sense gene is defined as a segment of anti-sense DNA operably joined to regulatory sequences such that the sequences encoding the anti-sense RNA may be expressed.
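As an illustration of the complementarity described here, the following minimal sketch derives a fully complementary anti-sense RNA from a coding-strand DNA sequence. The function name `antisense_rna` is hypothetical, and the sketch covers only exact complementarity, not the broader case of selective hybridization.

```python
# Map each DNA base of the sense (coding) strand to the RNA base that pairs with it.
_PAIRING = str.maketrans("ACGT", "UGCA")

def antisense_rna(sense_dna):
    """Return the RNA that is the reverse complement of the sense strand,
    written 5' to 3' (illustrative helper, not part of the disclosure)."""
    return sense_dna.upper().translate(_PAIRING)[::-1]

print(antisense_rna("ATGGCC"))  # GGCCAU
```

The sense strand ATGGCC corresponds to the transcript AUGGCC, and GGCCAU pairs with it base for base when the two strands are aligned antiparallel.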

cDNA. Complementary DNA or cDNA is DNA which has been produced by reverse transcription from mature mRNA. In eukaryotes, sequences in RNA corresponding to introns in a gene are excised during mRNA processing. cDNA sequences, therefore, lack the intron sequences present in the genomic DNA to which they correspond. In addition, cDNA sequences will lack the regulatory sequences which are not transcribed into RNA. To create a functional cDNA gene, therefore, the cDNA sequence must be operably joined to a promoter region such that transcription may occur.

Operably Joined. A coding sequence and a promoter region are said to be operably joined when they are covalently linked in such a way as to place the expression or transcription of the coding sequence under the influence or control of the promoter region. If it is desired that the coding sequences be translated into a functional protein, two DNA sequences are said to be operably joined if induction of promoter function results in the transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequences, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region would be operably joined to a coding sequence if the promoter region were capable of effecting transcription of that DNA sequence such that the resulting transcript might be translated into the desired protein or polypeptide.

If it is not desired that the coding sequence be eventually expressed as a protein or polypeptide, as in the case of anti-sense RNA expression, there is no need to ensure that the coding sequences and promoter region are joined without a frame-shift. Thus, a coding sequence which need not be eventually expressed as a protein or polypeptide is said to be operably joined to a promoter region if induction of promoter function results in the transcription of the RNA sequence of the coding sequences.
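Condition (1) above, the absence of a frame-shift, can be sketched for one narrow case: a linker inserted between an upstream segment that begins at its start codon and a downstream coding segment. The function name and the fusion scenario are assumptions made for illustration; the check says nothing about conditions (2) and (3).

```python
def preserves_reading_frame(upstream_coding, linker):
    """Return True if a downstream coding segment placed after the linker
    would be read in the same frame as the upstream segment.

    Assumes upstream_coding starts at its start codon, so the frame is
    preserved exactly when the combined length is a multiple of three.
    """
    return (len(upstream_coding) + len(linker)) % 3 == 0

print(preserves_reading_frame("ATGGCC", "GGATCC"))  # True
print(preserves_reading_frame("ATGGCC", "GGATC"))   # False
```

For anti-sense constructs, as the preceding paragraph notes, this check is unnecessary because the transcript is not translated.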

The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but shall in general include, as necessary, 5' non-transcribing and 5' non-translating sequences involved with initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. Especially, such 5' non-transcribing regulatory sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene. Such transcriptional control sequences may also include enhancer sequences or upstream activator sequences, as desired.

Vector. A vector may be any of a number of nucleic acid sequences into which a desired sequence may be inserted by restriction and ligation. Vectors are typically composed of DNA although RNA vectors are also available. Vectors include plasmids, phage, phasmids and cosmids. A cloning vector is one which is able to replicate in a host cell, and which is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host bacterium or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase. An expression vector is one into which a desired DNA sequence may be inserted by restriction and ligation such that it is operably joined to a promoter region and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for use in the identification of cells which have or have not been transformed or transfected with the vector. Markers include, for example, genes encoding proteins which increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes which encode enzymes whose activities are detectable by standard assays known in the art (e.g., β-galactosidase or alkaline phosphatase), and genes which visibly affect the phenotype of transformed or transfected cells, hosts, colonies or plaques.

Fragment. As used herein, the term "fragment" means both unique fragments and substantially characteristic fragments. As used herein, the term "fragment" is not to be construed according to standard dictionary definitions.

Substantially Characteristic Fragment. A "substantially characteristic fragment" of a molecule, such as a protein or nucleic acid sequence, is meant to refer to any portion of the molecule sufficiently rare or sufficiently characteristic of that molecule so as to identify it as derived from that molecule or to distinguish it from a class of unrelated molecules. A single amino acid or nucleotide, or a sequence of only two or three, cannot be a substantially characteristic fragment because all such short sequences occur frequently in nature.
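The statement that very short sequences occur frequently can be made concrete with a rough estimate. The sketch below assumes a random sequence with equal base frequencies and counts one strand only; both are simplifying assumptions that real genomes do not satisfy, and the function name is hypothetical.

```python
def expected_occurrences(k, target_length):
    """Expected number of exact matches of one specific k-nucleotide
    sequence in a random single-stranded sequence of target_length bases,
    assuming each of the four bases is equally likely at every position."""
    positions = max(target_length - k + 1, 0)  # possible start positions
    return positions / 4 ** k

# A trinucleotide is expected tens of millions of times in 3e9 bases,
# whereas the expectation falls steeply as k grows.
for k in (3, 10, 20):
    print(k, expected_occurrences(k, 3_000_000_000))
```

Under these assumptions the expectation drops by a factor of four for each added nucleotide, which is why only longer fragments can be characteristic of a single molecule.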

A substantially characteristic fragment of a nucleic acid sequence is one which would have utility as a probe in identifying the entire nucleic acid sequence from which it is derived from within a sample of total genomic or cDNA. Under stringent hybridization conditions, a substantially characteristic fragment will hybridize only to the sequence from which it was derived or to a small class of substantially similar related sequences such as allelic variants, heterospecific homologous loci, and variants with small insertions, deletions or substitutions of nucleotides or nucleotide analogues. A substantially characteristic fragment may, under lower stringency hybridization conditions, hybridize with non-allelic and non-homologous loci and be used as a probe to find such loci but will not do so at higher stringency.

A substantially characteristic fragment of a protein would have utility in generating antibodies which would distinguish the entire protein from which it is derived, an allelomorphic protein or a heterospecific homologous protein from a mixture of many unrelated proteins.

It is within the knowledge and ability of one ordinarily skilled in the art to recognize, produce and use substantially characteristic fragments of nucleic acid sequences and proteins as, for example, probes for screening DNA libraries or epitopes for generating antibodies.

Unique Fragment. As used herein, a unique fragment of a protein or nucleic acid sequence is a substantially characteristic fragment not currently known to occur elsewhere in nature (except in allelic or heterospecific homologous variants, i.e. it is present only in the PTPL1 or GLM-2 PTP or a PTPL1 or GLM-2 PTP "homologue"). A unique fragment will generally exceed 15 nucleotides or 5 amino acid residues. One of ordinary skill in the art can identify unique fragments by searching available computer databases of nucleic acid and protein sequences such as Genbank (Los Alamos National Laboratories, USA), SwissProt or the National Biomedical Research Foundation database. A unique fragment is particularly useful, for example, in generating monoclonal antibodies or in screening DNA or cDNA libraries.
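The database comparison described here can be sketched in miniature. The example below assumes a small in-memory list of sequences standing in for a real database such as Genbank, compares exact matches on one strand only, and uses a hypothetical function name; the default length of 16 reflects the statement that a unique fragment will generally exceed 15 nucleotides.

```python
def candidate_unique_fragments(target, other_sequences, k=16):
    """Return every k-nucleotide window of target that does not occur
    exactly in any of other_sequences (toy stand-in for a database search)."""
    known = {
        seq[i:i + k]
        for seq in other_sequences
        for i in range(len(seq) - k + 1)
    }
    return [
        target[i:i + k]
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in known
    ]

# Small k used here only so the example is easy to follow by eye.
print(candidate_unique_fragments("ATGCCG", ["TTATGC"], k=4))  # ['TGCC', 'GCCG']
```

A practical search would also consider the reverse complement and near matches, which this sketch deliberately omits.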

Stringent Hybridization Conditions. "Stringent hybridization conditions" is a term of art understood by those of ordinary skill in the art. For any given nucleic acid sequence, stringent hybridization conditions are those conditions of temperature and buffer solution which will permit hybridization of that nucleic acid sequence to its complementary sequence and not to substantially different sequences. The exact conditions which constitute "stringent" conditions depend upon the length of the nucleic acid sequence and the frequency of occurrence of subsets of that sequence within other non-identical sequences. By varying hybridization conditions from a level of stringency at which no hybridization occurs to a level at which hybridization is first observed, one of ordinary skill in the art can, without undue experimentation, determine conditions which will allow a given sequence to hybridize only with identical sequences. Suitable ranges of such stringency conditions are described in Krause, M. H. and S. A. Aaronson, Methods in Enzymology, 200:546-556 (1991). Stringent hybridization conditions, depending upon the length and commonality of a sequence, may include hybridization conditions of 30° C.-65° C. and from 5X to 0.1X SSPC. Less than stringent hybridization conditions are employed to isolate nucleic acid sequences which are substantially similar, allelic or homologous to any given sequence.

When using primers that are derived from nucleic acid encoding a PTPL1 or GLM-2 PTP, one skilled in the art will recognize that by employing high stringency conditions (e.g. annealing at 50°-60° C.), sequences which are greater than about 75% homologous to the primer will be amplified. By employing lower stringency conditions (e.g. annealing at 35°-37° C.), sequences which are greater than about 40-50% homologous to the primer will be amplified.

When using DNA probes derived from a PTPL1 or GLM-2 PTP for colony/plaque hybridization, one skilled in the art will recognize that by employing high stringency conditions (e.g. hybridization at 50°-65° C., 5X SSPC, 50% formamide, wash at 50°-65° C., 0.5X SSPC), sequences having regions which are greater than about 90% homologous to the probe can be obtained, and by employing lower stringency conditions (e.g. hybridization at 35°-37° C., 5X SSPC, 40-45% formamide, wash at 42° C. SSPC), sequences having regions which are greater than 35-45% homologous to the probe will be obtained.

Substantially similar. Two nucleic acid sequences are substantially similar if one of them or its anti-sense complement can bind to the other under strict hybridization conditions so as to distinguish that strand from all or substantially all other sequences in a cDNA or genomic library. Alternatively, one sequence is substantially similar to another if it or its anti-sense complement is useful as a probe in screening for the presence of its similar DNA or RNA sequence under strict hybridization conditions. Two proteins are substantially similar if they are encoded by substantially similar DNA or RNA sequences. In addition, even if they are not encoded by substantially similar nucleic acids, two proteins are substantially similar if they share sufficient primary, secondary and tertiary structure to perform the same biological role (structural or functional) with substantially the same efficacy or utility.

Variant. A "variant" of a protein or nucleic acid or fragment thereof is meant to include a molecule substantially similar in structure to the protein or nucleic acid, or to a fragment thereof. Variants of nucleic acid sequences include sequences with conservative nucleotide substitutions, small insertions or deletions, or additions. Variants of proteins include proteins with conservative amino acid substitutions, small insertions or deletions, or additions. Thus, nucleotide substitutions which do not affect the amino acid sequence of the subsequent translation product are particularly contemplated. Similarly, substitutions of structurally similar amino acids in proteins, such as leucine for isoleucine, or insertions, deletions, and terminal additions which do not destroy the functional utility of the protein are contemplated. Allelic variants of nucleic acid sequences and allelomorphic variants of protein or polypeptide sequences are particularly contemplated. As is well known in the art, an allelic variant is simply a naturally occurring variant of a polymorphic gene and that term is used herein as it is commonly used in the field of population genetics. The production of such variants is well known in the art and, therefore, such variants are intended to fall within the spirit and scope of the claims.

Homologous and homologues. As used herein, the term "homologues" is intended to embrace either and/or both homologous nucleic acid sequences and homologous protein sequences as the context may indicate. Homologues are a class of variants, as defined above, which share a sufficient degree of structural and functional similarity so as to indicate to one of ordinary skill in the art that they share a common evolutionary origin and that the structural and functional similarity is the result of evolutionary conservation. To be considered homologues of the PTPL1 or GLM-2 PTP, nucleic acid sequences and the proteins they encode must meet two criteria: (1) The polypeptides encoded by homologous nucleic acids are at least approximately 50-60% identical and preferably at least 70% identical for at least one stretch of at least 20 amino acids. As is well known in the art, both the identity and the approximate positions of the amino acid residues relative to each other must be conserved and not just the overall amino acid composition. Thus, one must be able to "line up" the conserved regions of the homologues and conclude that there is 50-60% identity; and (2) The polypeptides must retain a functional similarity to the PTPL1 or GLM-2 PTP in that it is a protein tyrosine phosphatase.
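Criterion (1) above can be pictured as a sliding-window identity check over two sequences that have already been lined up. The Python sketch below is illustrative only: the function names, the use of "-" as a gap symbol, and the toy sequences are assumptions, and the alignment step itself is not shown. Criterion (2), functional similarity, requires an activity assay and cannot be evaluated this way.

```python
def best_window_identity(aligned_a, aligned_b, window=20):
    """Highest fraction of identical positions over any `window`-long stretch
    of two pre-aligned, equal-length amino acid sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if len(aligned_a) < window:
        return 0.0
    # True where both sequences carry the same residue (gaps never count).
    matches = [a == b and a != "-" for a, b in zip(aligned_a, aligned_b)]
    best = 0
    for start in range(len(matches) - window + 1):
        best = max(best, sum(matches[start:start + window]))
    return best / window


def meets_identity_criterion(aligned_a, aligned_b, threshold=0.5, window=20):
    """True if at least one `window`-residue stretch reaches `threshold` identity."""
    return best_window_identity(aligned_a, aligned_b, window) >= threshold


# Toy aligned sequences, invented for illustration only.
seq_a = "AAAAAAAAAAAAAAAAAAAA"
seq_b = "AAAAAAAAAAGGGGGGGGGG"
print(best_window_identity(seq_a, seq_b))            # 0.5
print(meets_identity_criterion(seq_a, seq_b, 0.5))   # True
print(meets_identity_criterion(seq_a, seq_b, 0.7))   # False
```

The default threshold of 0.5 corresponds to the lower bound stated above; 0.7 corresponds to the preferred level.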

Substantially Pure. The term "substantially pure" when applied to the proteins, variants or fragments thereof of the present invention means that the proteins are essentially free of other substances to an extent practical and appropriate for their intended use. In particular, the proteins are sufficiently pure and are sufficiently free from other biological constituents of their host cells so as to be useful in, for example, protein sequencing, or producing pharmaceutical preparations. By techniques well known in the art, substantially pure proteins, variants or fragments thereof may be produced in light of the nucleic acids of the present invention.

Isolated. Isolated refers to a nucleic acid sequence which has been: (i) amplified in vitro by, for example, polymerase chain reaction (PCR); (ii) recombinantly produced by cloning; (iii) purified, as by cleavage and gel separation; or (iv) synthesized by, for example, chemical synthesis. An isolated nucleic acid sequence is one which is readily manipulable by recombinant DNA techniques well known in the art. Thus, a nucleic acid sequence contained in a vector in which 5' and 3' restriction sites are known or for which polymerase chain reaction (PCR) primer sequences have been disclosed is considered isolated but a nucleic acid sequence existing in its native state in its natural host is not. An isolated nucleic acid may be substantially purified, but need not be. For example, a nucleic acid sequence that is isolated within a cloning or expression vector is not pure in that it may comprise only a tiny percentage of the material in the cell in which it resides. Such a nucleic acid is isolated, however, as the term is used herein because it is readily manipulable by standard techniques known to those of ordinary skill in the art.

Immunogenetically Effective Amount. An "immunogenetically effective amount" is that amount of an antigen (e.g. a protein, variant or a fragment thereof) necessary to induce the production of antibodies which will bind to the epitopes of the antigen. The actual quantity comprising an "immunogenetically effective amount" will vary depending upon factors such as the nature of the antigen, the organism to be immunized, and the mode of immunization. The determination of such a quantity is well within the ability of one ordinarily skilled in the art without undue experimentation.

Antigen and Antibody. The term "antigen" as used in this invention is meant to denote a substance that can induce a detectable immune response to it when introduced to an animal. Such substances include proteins and fragments thereof.

The term "epitope" is meant to refer to that portion of an antigen which can be recognized and bound by an antibody. An antigen may have one, or more than one epitope. An "antigen" is capable of inducing an animal to produce antibody capable of binding to an epitope of that antigen. An "immunogen" is an antigen introduced into an animal specifically for the purpose of generating an immune response to the antigen. An antibody is said to be "capable of selectively binding" a molecule if it is capable of specifically reacting with the molecule to thereby bind the molecule to the antibody. The selective binding of an antigen and antibody is meant to indicate that the antigen will react, in a highly specific manner, with its corresponding antibody and not with the multitude of other antibodies which may be evoked by other antigens.

The term "antibody" (Ab) or "monoclonal antibody" (Mab) as used herein is meant to include intact molecules as well as fragments thereof (such as, for example, Fab and F(ab')₂ fragments) which are capable of binding an antigen. Fab and F(ab')₂ fragments lack the Fc fragment of intact antibody, clear more rapidly from the circulation, and may have less non-specific tissue binding than an intact antibody. Single chain antibodies, humanized antibodies, and fragments thereof, also are included.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention relates to the identification, isolation and cloning of two novel protein tyrosine phosphatases designated PTPL1 and GLM-2. Specifically, the present invention discloses the isolation and cloning of cDNA and the amino acid sequences of PTPL1 and GLM-2 from human glioblastoma and brain cell cDNA libraries. These phosphatases are, initially, discussed separately below. As they are related in function and utility as well as structurally with respect to their catalytic domains, they are subsequently discussed in the alternative.

In order to identify novel PTPs, a PCR-based approach was used. PCR was performed using cDNA from the human glioma cell line U-343 MGa 31L as a template and degenerate primers that were based on conserved regions of PTPs. One primer was derived from the catalytic site (HCSAG) of the PTP domain and two primers were derived from conserved regions in the N-terminal part of the domain. Several PCR-products were obtained, including some corresponding to the cytoplasmic PTPs PTPH1 (Yang, Q., and Tonks, N. K. (1991) Proc. Natl. Acad. Sci. (USA) 88:5949-5953), PTPase MEG (Gu, M., et al., (1991) Proc. Natl. Acad. Sci. (USA) 88:5867-5871), P19PTP (den Hertog, J., et al., (1992) Biochem. Biophys. Res. Commun. 184:1241-1249), and TC-PTP (Cool, D. E., et al., (1989) Proc. Natl. Acad. Sci. (USA) 86:5257-5261), as well as to the receptor-like PTPs HPTP-α, HPTP-γ, and HPTP-δ (Krueger, N. X., et al., (1990) EMBO J. 9:3241-3252). In addition to these known sequences, three PCR-products encoding novel PTP-like sequences were found.

One of these PCR-products is almost identical to a PCR-product derived from a human leukemic cell line (Honda, H., et al., (1993) Leukemia 7:742-746) and was chosen for further characterization and was used to screen an oligo-(dT)-primed U-343 MGa 31L cDNA library which resulted in the isolation of the clone λ6.15. Upon Northern blot analysis of mRNA from human foreskin fibroblasts AG1518, probed with the λ6.15 insert, a transcript of 9.5 kb could be seen. Therefore AG1518 cDNA libraries were constructed and screened with λ6.15 in order to obtain a full-length clone. Screening of these libraries with λ6.15, and thereafter with subsequently isolated clones, resulted in several overlapping clones which together covered 8040 bp including the whole coding sequence of a novel phosphatase, denoted PTPL1. The total length of the open reading frame was 7398 bp coding for 2466 amino acids with a predicted molecular mass of 275 kDa. The nucleotide and deduced amino acid sequence of PTPL1 are disclosed as SEQ ID NO.:1 and SEQ ID NO.:2, respectively. Although the sequence surrounding the putative initiator codon at positions 78-80 does not conform well to the Kozak consensus sequence (Kozak, M. (1987) Nucl. Acids Res. 15:8125-8148) there is a purine at position -3 which is an important requirement for an initiation site. The 77 bp 5' untranslated region is GC-rich and contains an inframe stop codon at positions 45-47. A 3' untranslated region of 565 bp begins after a TGA stop codon at positions 7476-7478, and does not contain a poly-A tail.
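The coordinates quoted above are mutually consistent, as the following arithmetic sketch shows. The function names are hypothetical; only the positions (start codon at 78-80, stop codon at 7476-7478, upstream stop codon at 45-47) are taken from the text.

```python
def orf_summary(start_codon_first, stop_codon_first):
    """Coding length in bp (stop codon excluded) and residue count for an ORF,
    given the 1-based first positions of its start and stop codons."""
    coding_bp = stop_codon_first - start_codon_first
    if coding_bp <= 0 or coding_bp % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_bp, coding_bp // 3


def same_frame(pos_a, pos_b):
    """True if two 1-based codon start positions share a reading frame."""
    return (pos_b - pos_a) % 3 == 0


coding_bp, residues = orf_summary(78, 7476)
print(coding_bp, residues)   # 7398 2466
print(same_frame(45, 78))    # True: the upstream stop codon is in frame
```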

In the deduced amino acid sequence of PTPL1 no transmembrane domain or signal sequence for secretion is found, indicating that PTPL1 is a cytoplasmic PTP. Starting from the N-terminus, the sequence of the first 470 amino acid residues shows no homology to known proteins. The region 470-505 contains a leucine zipper motif, with a methionine in the position where the fourth leucine usually is found (LX₆ LX₆ LX₆ MX₆ L); similar replacements of leucine residues with methionine residues are also found in the leucine zippers of the transcription factors CYS-3 (Fu, Y.-H., et al., (1989) Mol. Cell. Biol. 9:1120-1127) and dFRA (Perkins, K. K., et al., (1990) Genes Dev. 4:822-834). Furthermore, consistent with the notion that this is a functional leucine zipper, no helix breaking residues (glycine and proline) are present in this region. The leucine zipper motif is followed by a 300 amino acid region (570-885) with homology to the band 4.1 superfamily (see FIG. 1). The members of this superfamily are cytoskeleton-associated proteins with a homologous domain in the N-terminus (Tsukita, S., et al., (1992) Curr. Opin. Cell Biol. 4:834-839). Interestingly, two cytoplasmic PTPs, PTPH1 and PTPase MEG, contain a band 4.1-like domain. The band 4.1-like domain of PTPL1 is 20% to 24% similar to most known proteins of this superfamily, including ezrin (Gould, K. L., et al., (1989) EMBO J. 8:4133-4142), moesin (Lankes, W. T., and Furthmayr, H. (1991) Proc. Natl. Acad. Sci. (USA) 88:8297-8301), radixin (Funayama, N., et al., (1991) J. Cell Biol. 115:1039-1048), merlin (Trofatter, J. A., et al., (1993) Cell 72:791-800), band 4.1 protein (Conboy, J., et al., (1986) Proc. Natl. Acad. Sci. (USA) 83:9512-9516), PTPH1 (Yang, Q., and Tonks, N. K. (1991) Proc. Natl. Acad. Sci. (USA) 88:5949-5953) and PTPase MEG (Gu, M., et al., (1991) Proc. Natl. Acad. Sci. (USA) 88:5867-5871).
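The heptad spacing of the leucine zipper motif described above lends itself to a pattern search. The Python sketch below is illustrative only: the function name, the regular expression, and the toy peptide are assumptions, and the pattern accepts leucine or methionine at the fourth heptad position solely because that substitution is the one described in the text.

```python
import re

# L-x6-L-x6-L-x6-[L or M]-x6-L, written as a lookahead so that
# overlapping candidates are all reported.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}[LM].{6}L))")


def find_zipper_candidates(protein):
    """Return (1-based start, matched segment, free_of_helix_breakers) tuples.

    A segment is flagged as free of helix breakers when it contains
    neither glycine (G) nor proline (P).
    """
    hits = []
    for match in ZIPPER.finditer(protein.upper()):
        segment = match.group(1)
        has_breaker = any(residue in segment for residue in "GP")
        hits.append((match.start() + 1, segment, not has_breaker))
    return hits


# Toy peptide, invented for illustration only (not taken from SEQ ID NO.:2).
toy = "AAA" + "LAEKQRT" + "LEDKVAE" + "LKSENAR" + "MEQEVAR" + "L" + "KK"
for start, segment, clean in find_zipper_candidates(toy):
    print(start, segment, clean)
```

A match from this pattern is only a candidate; whether the region actually functions as a leucine zipper would have to be established experimentally.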

Between amino acid residues 1080 and 1940 there are five 80 amino acid repeats denoted GLGF-repeats. This repeat was first found in PSD-95 (Cho, K.-O., et al., (1992) Neuron 9:929-942), also called SAP (Kistner, U., et al., (1993) J. Biol. Chem. 268:4580-4583), a protein in post-synaptic densities, i.e. structures of the submembranous cytoskeleton in synaptic junctions. Rat PSD-95 is homologous to the discs-large tumor suppressor gene in Drosophila (Woods, D. F., and Bryant, P. J. (1991) Cell 66:451-464), dlg-A, which encodes a protein located in septate junctions. These two proteins each contain three GLGF-repeats, one SH-3 domain and a guanylate kinase domain. Through computer searches in protein data bases complemented by manual searches, 19 GLGF-repeats in 9 different proteins, all of them enzymes, were found (see FIG. 2 and FIG. 3). Besides dlg-A and PSD-95, there are two other members of the guanylate kinase family, a 220-kDa protein (Itoh, M., et al., (1993) J. Cell Biol. 121:491-502) which is a constitutive protein of the plasma membrane undercoat with three GLGF-repeats, and p55 (Ruff, P., et al., (1991) Proc. Natl. Acad. Sci. (USA) 88:6595-6599) which is a palmitoylated protein from erythrocyte membranes with one GLGF-repeat. A close look into the sequence of PTPH1 and PTPase MEG revealed that each of them has one GLGF-repeat between the band 4.1 homology domain and the PTP domain. One GLGF-repeat is also found in nitric oxide synthase from rat brain (Bredt, D. S., et al., (1991) Nature 351:714-718), and a glioma cell line, U-118MG, expresses an altered ros1 transcript (Sharma, S., et al., (1989) Oncogene Res. 5:91-100), containing a GLGF-repeat probably as a result of a gene fusion.

The PTP domain of PTPL1 is localized in the C-terminus (amino acid residues 2195-2449). It contains most of the conserved motifs of PTP domains and shows about 30% similarity to known PTPs.

Use of a 9.5 kb probe including SEQ ID NO.:1 for Northern blot analysis for tissue-specific expression showed high expression of PTPL1 in human kidney, placenta, ovaries, and testes; medium expression in human lung, pancreas, prostate and brain; low expression in human heart, skeletal muscle, spleen, liver, small intestine and colon; and virtually no detectable expression in human leukocytes. Furthermore, using a rat PCR product for PTPL1 as a probe, PTPL1 was found to be expressed in adult rats but not in rat embryos. This latter finding suggests that PTPL1 may have a role, like many PTPs, in the signal transduction process that leads to cellular growth or differentiation.

The rabbit antiserum αL1A (see Example 5), made against a synthetic peptide derived from amino acid residues 1802-1823 in the PTPL1 sequence, specifically precipitated a component of 250 kDa from [³⁵S]methionine and [³⁵S]cysteine labeled COS-1 cells transfected with the PTPL1 cDNA. This component could not be detected in untransfected cells, or in transfected cells using either pre-immune serum or antiserum pre-blocked with the immunogenic peptide. Identical results were obtained using the antiserum αL1B (see Example 5) made against residues 450-470 of PTPL1. A component of about 250 kDa could also be detected in immunoprecipitations using AG1518 cells, PC-3 cells, CCL-64 cells, A549 cells and PAE cells. This component was not seen upon precipitation with the preimmune serum, or when precipitation was made with αL1A antiserum preblocked with peptide. The slight variations in sizes observed between the different cell lines could be due to species differences. A smaller component of 78 kDa was also specifically precipitated by the αL1A antiserum. The relationship between this molecule and PTPL1 remains to be determined.

In order to demonstrate that PTPL1 has PTP activity, immunoprecipitates from COS-1 cells transfected with PTPL1 cDNA were incubated with myelin basic protein, ³² P-labeled on tyrosine residues, as a substrate. The amount of radioactivity released as inorganic phosphate was measured. Immunoprecipitates with αL1B (open circles) gave a time-dependent increase in dephosphorylation with over 30% dephosphorylation after 12 minutes compared to 2% dephosphorylation when the antiserum was pre-blocked with peptide (open squares) (see FIG. 4).

The present invention also provides an isolated nucleic acid sequence encoding a novel PTP designated GLM-2, variants and fragments thereof, and uses relating thereto. One sequence encoding a GLM-2 PTP and surrounding nucleotides is disclosed as SEQ ID NO.:3. This sequence includes the coding sequences for GLM-2 PTP as well as both 5' and 3' untranslated regions including regulatory sequences. The full disclosed sequence, designated SEQ ID NO.:3 is 3090 bp in length.

The nucleic acid sequence of SEQ ID NO.:3 includes 1310 base pairs of 5' untranslated region and 673 bp of 3' untranslated region which do not appear to encode a sequence for a poly-A (polyadenylation) tail. Transcription of SEQ ID NO.:3 begins at approximately position 1146. A translation start codon (ATG) is present at positions 1311 to 1313 of SEQ ID NO.:3. The nucleotides surrounding the start codon (AGCATGG) show substantial similarity to the Kozak consensus sequence (RCCATGG) (Kozak, M. (1987) Nucl. Acids Res. 15:8125-8148). A translation stop codon (TGA) is present at positions 2418 to 2420 of SEQ ID NO.:3. The open reading frame of 1107 bp encodes a protein of 369 amino acid residues with a predicted molecular mass of 41 kD. The deduced amino acid sequence of this protein is disclosed as SEQ ID NO.:4.
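The two numerical statements above, the match to the Kozak consensus and the length of the open reading frame, can be checked with the following sketch. The function name and the small IUPAC lookup table are assumptions introduced for illustration; the site AGCATGG, the consensus RCCATGG and the codon positions are taken from the text.

```python
# Minimal IUPAC lookup; R stands for a purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "N": "ACGT"}


def consensus_matches(site, consensus):
    """Count the positions at which `site` agrees with an IUPAC `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code]
               for base, code in zip(site.upper(), consensus.upper()))


# Six of the seven positions agree; only the second position (G versus C) differs.
print(consensus_matches("AGCATGG", "RCCATGG"))   # 6

# Start codon at 1311-1313 and stop codon at 2418-2420 of SEQ ID NO.:3.
coding_bp = 2418 - 1311
print(coding_bp, coding_bp // 3)                 # 1107 369
```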

The sequence disclosed in SEQ ID NO.:3 encodes a single domain PTP similar to the rat PTP STEP (53% identity; Lombroso, et al., 1991) and the human PTP LC-PTP (51% identity; Adachi, M., et al., (1992) FEBS Letters 314:335-339). None of the sequenced regions encodes a polypeptide sequence with any substantial similarity to known signal or transmembrane domains, further indicating that GLM-2 is a cytoplasmic PTP.

Use of a 3.6 kb probe including SEQ ID NO.:3 for Northern blot analysis for tissue-specific expression showed a strong association with human brain tissue and little or no expression in human heart, placenta, lung, liver, skeletal muscle, kidney or pancreas. This is similar to the pattern of tissue-specific expression shown by STEP.

Cloning and expression of PTPL1 and GLM-2.

In one series of embodiments of the present invention, an isolated DNA, cDNA or RNA sequence encoding a PTPL1 or GLM-2 PTP, or a variant or fragment thereof, is provided. The procedures described above, which were employed to isolate the first PTPL1 and GLM-2 sequences, no longer need be employed. Rather, using the sequences disclosed herein, a genomic DNA or cDNA library may be readily screened to isolate a clone containing at least a fragment of a PTPL1 or GLM-2 sequence and, if desired, a full sequence. Alternatively, one may synthesize PTPL1 and GLM-2 encoding nucleic acids using the sequences disclosed herein.

The present invention further provides vectors containing nucleic acid sequences encoding PTPL1 and GLM-2. Such vectors include, but are not limited to, plasmids, phage and cosmid vectors. In light of the present disclosure, one of ordinary skill in the art can readily place the nucleic acid sequences of the present invention into any of a great number of known suitable vectors using routine procedures.

The source nucleic acids for a DNA library may be genomic DNA or cDNA. Which of these is employed depends upon the nature of the sequences sought to be cloned and the intended use of those sequences.

Genomic DNA may be obtained by methods well known to those of ordinary skill in the art (for example, see Guide to Molecular Cloning Techniques, S. L. Berger et al., eds., Academic Press (1987)). Genomic DNA is preferred when it is desired to clone the entire gene including its endogenous regulatory sequences. Similarly, genomic DNA is used when it is only the regulatory sequences which are of interest.

Complementary DNA, or cDNA, may be produced by reverse transcription methods which are well known to those of ordinary skill in the art (for example, see Guide to Molecular Cloning Techniques, S. L. Berger et al., eds., Academic Press (1987)). Preferably, the mRNA preparation for reverse transcription should be enriched in the mRNA of the desired sequence. This may be accomplished by selecting cells in which the mRNA is produced at high levels or by inducing high levels of production. Alternatively, in vitro techniques may be used such as sucrose gradient centrifugation to isolate mRNA transcripts of a particular size. cDNA is preferred when the regulatory sequences of a gene are not needed or when the genome is very large in comparison with the expressed transcripts. In particular, cDNA is preferred when a eukaryotic gene containing introns is to be expressed in a prokaryotic host.

To create a DNA or cDNA library, suitable DNA or cDNA preparations are randomly sheared or enzymatically cleaved by restriction endonucleases to create fragments appropriate in size for the chosen library vector. The DNA or cDNA fragments may be inserted into the vector in accordance with conventional techniques, including blunt-ending or staggered-ending termini for ligation. Typically, this is accomplished by restriction enzyme digestion to provide appropriate termini, the filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and ligation with appropriate ligases. Techniques for such manipulations are well known in the art and may be found, for example, in Sambrook, et al., Molecular Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989). The library will consist of a great many clones, each containing a fragment of the total DNA or cDNA. A great variety of cloning vectors, restriction endonucleases and ligases are commercially available and their use in creating DNA libraries is well known to those of ordinary skill in the art. See, for example, Sambrook, et al., Molecular Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989).

DNA or cDNA libraries containing sequences coding for PTPL1 or GLM-2 nucleic acid sequences may be screened and a sequence coding for either PTPL1 or GLM-2 identified by any means which specifically selects for that sequence. Such means include (a) hybridization with an appropriate nucleic acid probe(s) containing a unique or substantially characteristic fragment of the desired DNA or cDNA; (b) hybridization-selected translational analysis in which native mRNA which hybridizes to the clone in question is translated in vitro and the translation products are further characterized; (c) if the cloned genetic sequences are themselves capable of expressing mRNA, immunoprecipitation of a translated PTPL1 or GLM-2 recombinant product produced by the host containing the clone; or, preferably, (d) use of a unique or substantially characteristic fragment of the desired sequence as a PCR primer to amplify those clones with which it hybridizes.

Preferably, the probe or primer is a substantially characteristic fragment of one of the disclosed sequences. More preferably, the probe is a unique fragment of one of the disclosed sequences. In choosing a fragment, unique and substantially characteristic fragments can be identified by comparing the sequence of a proposed probe to the known sequences found in sequence databases. Alternatively, the entire PTPL1 or GLM-2 sequence may be used as a probe. In a preferred embodiment, the probe is a ³² P random-labeled unique fragment of the PTPL1 or GLM-2 nucleic acid sequences disclosed herein. In a most preferred embodiment, the probe serves as a PCR primer containing a unique or substantially characteristic fragment of the PTPL1 or GLM-2 sequences disclosed herein.

The library to be screened may be DNA or cDNA. Preferably, a cDNA library is screened. In a preferred embodiment, a U-343 MGa 31L human glioblastoma (Nister, M., et al., (1988) Cancer Res. 48:3910-3918) or AG1518 human fibroblast (Human Genetic Mutant Cell Repository, Institute for Medical Research, Camden, N.J.) cDNA library is screened with a probe to a unique or substantially characteristic fragment of the PTPL1 sequence. Because PTPL1 is expressed in a wide variety of tissues, cDNA libraries from many tissues may be employed. In another preferred embodiment, a λgt10 human brain cDNA library (Clontech, Calif.) is screened with a probe to a unique or substantially characteristic fragment of the GLM-2 sequence. Because expression of GLM-2 appears to be high in brain tissues but low or absent in other tissues tested, a brain cDNA library is recommended for the cloning of GLM-2.

The selected fragments may be cloned into any of a great number of vectors known to those of ordinary skill in the art. In one preferred embodiment, the cloning vector is a plasmid such as pUC18 or Bluescript (Stratagene). The cloned sequences should be examined to determine whether or not they contain the entire PTPL1 or GLM-2 sequences or desired portions thereof. A series of overlapping clones of partial sequences may be selected and combined to produce a complete sequence by methods well known in the art.

In an alternative embodiment of cloning a PTPL1 or GLM-2 nucleotide sequence, a library is prepared using an expression vector. The library is then screened for clones which express the PTPL1 or GLM-2 protein, for example, by screening the library with antibodies to the protein or with labeled probes for the desired RNA sequences or by assaying for PTPL1 or GLM-2 PTP activity on a phosphorylated substrate such as para-nitrophenyl phosphate. The above discussed methods are, therefore, capable of identifying cloned genetic sequences which are capable of expressing PTPL1 or GLM-2 PTPs, or variants or fragments thereof.

To express a PTPL1 or GLM-2 PTP, variants or fragments thereof, or PTPL1 or GLM-2 anti-sense RNA, and variants or fragments thereof, transcriptional and translational signals recognizable by an appropriate host are necessary. The cloned PTPL1 or GLM-2 encoding sequences, obtained through the methods described above, and preferably in a double-stranded form, may be operably joined to regulatory sequences in an expression vector, and introduced into a host cell, either prokaryote or eukaryote, to produce recombinant PTPL1 or GLM-2 PTP, a variant or fragment thereof, PTPL1 or GLM-2 anti-sense RNA, or a variant or fragment thereof.

Depending upon the purpose for which expression is desired, the host may be eukaryotic or prokaryotic. For example, if the intention is to study the regulation of PTPL1 or GLM-2 PTP in a search for inducers or inhibitors of its activity, the host is preferably eukaryotic. In one preferred embodiment, the eukaryotic host cells are COS cells derived from monkey kidney. In a particularly preferred embodiment, the host cells are human fibroblasts. Many other eukaryotic host cells may be employed as is well known in the art. For example, it is known in the art that Xenopus oocytes comprise a cell system useful for the functional expression of eukaryotic messenger RNA or DNA. This system has, for example, been used to clone the sodium:glucose cotransporter in rabbits (Hediger, M. A., et al., Proc. Natl. Acad. Sci. (USA) 84:2634-2637 (1987)). Alternatively, if the intention is to produce large quantities of the PTPL1 or GLM-2 PTPs, a prokaryotic expression system is preferred. The choice of an appropriate expression system is within the ability and discretion of one of ordinary skill in the art.

Depending upon which strand of the PTPL1 or GLM-2 PTP encoding sequence is operably joined to the regulatory sequences, the expression vectors will produce either PTPL1 or GLM-2 PTPs, variants or fragments thereof, or will express PTPL1 and GLM-2 anti-sense RNA, variants or fragments thereof. Such PTPL1 and GLM-2 anti-sense RNA may be used to inhibit expression of the PTPL1 or GLM-2 PTP and/or the replication of those sequences.
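The relationship between the coding strand and the anti-sense transcript described above can be sketched as a reverse-complement operation. The function names and the six-base toy sequence are assumptions introduced for illustration; the sketch assumes an unambiguous DNA coding-strand sequence written 5' to 3'.

```python
# DNA base on the coding strand -> pairing RNA base on the anti-sense transcript.
_PAIRING = {"A": "U", "C": "G", "G": "C", "T": "A"}


def sense_mrna(coding_dna):
    """mRNA read from the coding strand: identical sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def antisense_rna(coding_dna):
    """Anti-sense RNA (5' to 3'): the reverse complement of the coding strand."""
    return "".join(_PAIRING[base] for base in reversed(coding_dna.upper()))


toy = "ATGGCC"                # invented example, not taken from SEQ ID NO.:1 or 3
print(sense_mrna(toy))        # AUGGCC
print(antisense_rna(toy))     # GGCCAU, which base-pairs with AUGGCC
```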

Expression of a protein in different hosts may result in different post-translational modifications which may alter the properties of the protein. This is particularly true when eukaryotic genes are expressed in prokaryotic hosts. In the present invention, however, this is of less concern as PTPL1 and GLM-2 are cytoplasmic PTPs and are unlikely to be post-translationally glycosylated.

Transcriptional initiation regulatory sequences can be selected which allow for repression or activation, so that expression of the operably joined sequences can be modulated. Such regulatory sequences include regulatory sequences which are temperature-sensitive so that by varying the temperature, expression can be repressed or initiated, or which are subject to chemical regulation by inhibitors or inducers. Also of interest are constructs wherein both PTPL1 or GLM-2 mRNA and PTPL1 or GLM-2 anti-sense RNA are provided in a transcribable form but with different promoters or other transcriptional regulatory elements such that induction of PTPL1 or GLM-2 mRNA expression is accompanied by repression of the expression of the corresponding anti-sense RNA, or alternatively, repression of PTPL1 or GLM-2 mRNA expression is accompanied by induction of expression of the corresponding anti-sense RNA. Translational sequences are not necessary when it is desired to express PTPL1 and GLM-2 anti-sense RNA sequences.

A non-transcribed and/or non-translated sequence 5' or 3' to the sequence coding for PTPL1 or GLM-2 PTP can be obtained by the above-described cloning methods using one of the probes disclosed herein to select a clone from a genomic DNA library. A 5' region may be used for the endogenous regulatory sequences of the PTPL1 or GLM-2 PTP. A 3'-non-transcribed region may be utilized for a transcriptional termination regulatory sequence or for a translational termination regulatory sequence. Where the native regulatory sequences do not function satisfactorily in the host cell, then exogenous sequences functional in the host cell may be utilized.

The vectors of the invention further comprise other operably joined regulatory elements such as DNA elements which confer tissue or cell-type specific expression of an operably joined coding sequence.

Oligonucleotide probes derived from the nucleotide sequence of PTPL1 or GLM-2 can be used to identify genomic or cDNA library clones possessing a related nucleic acid sequence such as an allelic variant or homologous sequence. A suitable oligonucleotide or set of oligonucleotides, which is capable of encoding a fragment of the PTPL1 or GLM-2 coding sequences, or a PTPL1 or GLM-2 anti-sense complement of such an oligonucleotide or set of oligonucleotides, may be synthesized by means well known in the art (see, for example, Synthesis and Application of DNA and RNA, S. A. Narang, ed., 1987, Academic Press, San Diego, Calif.) and employed as a probe to identify and isolate a cloned PTPL1 or GLM-2 sequence, variant or fragment thereof by techniques known in the art. As noted above, a unique or substantially characteristic fragment of a PTPL1 or GLM-2 sequence disclosed herein is preferred. Techniques of nucleic acid hybridization and clone identification are disclosed by Sambrook, et al., Molecular Cloning, A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989), and by Hames, B.D., et al., in Nucleic Acid Hybridization, A Practical Approach, IRL Press, Washington, D.C. (1985). To facilitate the detection of a desired PTPL1 or GLM-2 nucleic acid sequence, whether for cloning purposes or for the mere detection of the presence of PTPL1 or GLM-2 sequences, the above-described probes may be labeled with a detectable group. Such a detectable group may be any material having a detectable physical or chemical property. Such materials have been well-developed in the field of nucleic acid hybridization and in general most any label useful in such methods can be applied to the present invention. Particularly useful are radioactive labels. Any radioactive label may be employed which provides for an adequate signal and has a sufficient half-life. If single stranded, the oligonucleotide may be radioactively labeled using kinase reactions. 
Alternatively, oligonucleotides are also useful as nucleic acid hybridization probes when labeled with a non-radioactive marker such as biotin, an enzyme or a fluorescent group. See, for example, Leary, J. J., et al., Proc. Natl. Acad. Sci. (USA) 80:4045 (1983); Renz, M., et al., Nucl. Acids Res. 12:3435 (1984); and Renz, M., EMBO J. 6:817 (1983).

By using the sequences disclosed herein as probes or as primers, and techniques such as PCR cloning and colony/plaque hybridization, it is within the abilities of one skilled in the art to obtain human allelic variants and sequences substantially similar or homologous to PTPL1 or GLM-2 nucleic acid sequences from species including mouse, rat, rabbit and non-human primates. Thus, the present invention is further directed to mouse, rat, rabbit and primate PTPL1 and GLM-2.

In particular the protein sequences disclosed herein for PTPL1 and GLM-2 may be used to generate sets of degenerate probes or PCR primers useful in isolating similar and potentially evolutionarily similar sequences encoding proteins related to the PTPL1 or GLM-2 PTPs. Such degenerate probes may not be substantially similar to any fragments of the PTPL1 or GLM-2 nucleic acid sequences but, as derived from the protein sequences disclosed herein, are intended to fall within the spirit and scope of the claims.

Antibodies to PTPL1 and GLM-2.

In the following description, reference will be made to various methodologies well-known to those skilled in the art of immunology. Standard reference works setting forth the general principles of immunology include Catty, D. Antibodies, A Practical Approach, Vols. I and II, IRL Press, Washington, D.C. (1988); Klein, J. Immunology: The Science of Cell-Noncell Discrimination, John Wiley & Sons, New York (1982); Kennett, R., et al. in Monoclonal Antibodies, Hybridoma: A New Dimension in Biological Analyses, Plenum Press, New York (1980); Campbell, A., "Monoclonal Antibody Technology," in Laboratory Techniques in Biochemistry and Molecular Biology, Volume 13 (Burdon, R., et al., eds.), Elsevier, Amsterdam (1984); and Eisen, H. N., in Microbiology, 3rd Ed. (Davis, B. D., et al., eds.) Harper & Row, Philadelphia (1980).

The antibodies of the present invention are prepared by any of a variety of methods. In one embodiment, purified PTPL1 or GLM-2 PTP, a variant or a fragment thereof, is administered to an animal in order to induce the production of sera containing polyclonal antibodies that are capable of binding the PTP, variant or fragment thereof.

The preparation of antisera in animals is a well known technique (see, for example, Chard, Laboratory Techniques in Biology, "An Introduction to Radioimmunoassay and Related Techniques," North Holland Publishing Company (1978), pp. 385-396; and Antibodies, A Practical Handbook, Vols. I and II, D. Catty, ed., IRL Press, Washington, D.C. (1988)). The choice of animal is usually determined by a balance between the facilities available and the likely requirements in terms of volume of the resultant antiserum. A large species such as goat, donkey and horse may be preferred, because of the larger volumes of serum readily obtained. However, it is also possible to use smaller species such as rabbit or guinea pig which often yield higher titer antisera. Usually, a subcutaneous injection of the antigenic material (the protein or fragment thereof or a hapten-carrier protein conjugate) is used. The detection of appropriate antibodies may be carried out by testing the antisera with appropriately labeled tracer-containing molecules. Fractions that bind tracer-containing molecules are then isolated and further purified if necessary.

Cells expressing PTPL1 or GLM-2 PTP, a variant or a fragment thereof, or, a mixture of such proteins, variants or fragments, can be administered to an animal in order to induce the production of sera containing polyclonal antibodies, some of which will be capable of binding the PTPL1 or GLM-2 PTP. If desired, such PTPL1 or GLM-2 antibody may be purified from other polyclonal antibodies by standard protein purification techniques and especially by affinity chromatography with purified PTPL1 or GLM-2 protein or variants or fragments thereof.

A PTPL1 or GLM-2 protein fragment may also be chemically synthesized and purified by HPLC to render it substantially pure. Such a preparation is then introduced into an animal in order to produce polyclonal antisera of high specific activity. In a preferred embodiment, the protein may be coupled to a carrier protein such as bovine serum albumin or keyhole limpet hemocyanin (KLH), and used to immunize a rabbit utilizing techniques well-known and commonly used in the art. Additionally, the PTPL1 or GLM-2 protein can be admixed with an immunologically inert or active carrier. Carriers which promote or induce immune responses, such as Freund's complete adjuvant, can be utilized.

Monoclonal antibodies can be prepared using hybridoma technology (Kohler et al., Nature 256:495 (1975); Kohler, et al., Eur. J. Immunol. 6:511 (1976); Kohler, et al., Eur. J. Immunol. 6:292 (1976); Hammerling, et al., in Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., pp. 563-681 (1981)). In general, such procedures involve immunizing an animal with PTPL1 or GLM-2 PTP, or a variant or a fragment thereof. The splenocytes of such animals are extracted and fused with a suitable myeloma cell line. After fusion, the resulting hybridoma cells are selectively maintained in HAT medium, and then cloned by limiting dilution as described by Wands, J. R., et al., Gastroenterology 80:225-232 (1981), which reference is herein incorporated by reference. The hybridoma cells obtained through such a selection are then assayed to identify clones which secrete antibodies capable of binding the PTP and/or the PTP antigen. The proliferation of transfected cell lines is potentially more promising than classical myeloma technology, using methods available in the art.

Through application of the above-described methods, additional cell lines capable of producing antibodies which recognize epitopes of the PTPL1 and GLM-2 PTPs can be obtained.

These antibodies can be used clinically as markers (both quantitative and qualitative) of the PTPL1 and GLM-2 PTPs in brain, blastoma or other tissue. Additionally, the antibodies are useful in a method to assess PTP function in cancer or other patients.

The method whereby two antibodies to PTPL1 were produced is outlined in Example 5.

Substantially pure PTPL1 and GLM-2 proteins.

A variety of methodologies known in the art can be utilized to obtain a purified PTPL1 or GLM-2 PTP. In one method, the protein is purified from tissues or cells which naturally produce the protein. Alternatively, an expression vector may be introduced into cells to cause production of the protein. For example, human fibroblast or monkey kidney COS cells may be employed. In another embodiment, mRNA transcripts may be microinjected into cells, such as Xenopus oocytes or rabbit reticulocytes. In another embodiment, mRNA is used with an in vitro translation system. In a preferred embodiment, bacterial cells are used to make large quantities of the protein. In a particularly preferred embodiment, a fusion protein, such as a bacterial GST fusion (Pharmacia), may be employed, the fusion product purified by affinity chromatography, and the PTPL1 or GLM-2 protein released from the hybrid by cleaving the amino acid sequence joining them.

In light of the present disclosure, one skilled in the art can readily follow known methods for isolating proteins in order to obtain substantially pure PTPL1 or GLM-2 PTP, free of natural contaminants. These include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity chromatography.

Determinations of purity may be performed by physical characterizations (such as molecular mass in size fractionation), immunological techniques or enzymatic assays.

PTPL1 or GLM-2 PTP, variants or fragments thereof, purified in the above manner, or in a manner wherein equivalents of the above sequence of steps are utilized, are useful in the preparation of polyclonal and monoclonal antibodies, for pharmaceutical preparations to inhibit or enhance PTP activity and for in vitro dephosphorylations.

Variants of PTPL1 and GLM-2 nucleic acids and proteins.

Variants of PTPL1 or GLM-2 having an altered nucleic acid sequence can be prepared by mutagenesis of the DNA. This can be accomplished using one of the mutagenesis procedures known in the art.

Preparation of variants of PTPL1 or GLM-2 is preferably achieved by site-directed mutagenesis. Site-directed mutagenesis allows the production of variants of these PTPs through the use of a specific oligonucleotide which contains the desired mutated DNA sequence.

Site-directed mutagenesis typically employs a phage vector that exists in both a single-stranded and double-stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage, as disclosed by Messing, et al., Third Cleveland Symposium on Macromolecules and Recombinant DNA, A. Walton, ed., Elsevier, Amsterdam (1981), the disclosure of which is incorporated herein by reference. These phage are commercially available and their use is generally well known to those skilled in the art. Alternatively, plasmid vectors containing a single-stranded phage origin of replication (Vieira, et al., Meth. Enzymol. 153:3 (1987)) may be employed to obtain single-stranded DNA.

In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector that includes within its sequence the DNA sequence which is to be altered. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically, for example by the method of Crea, et al., Proc. Natl. Acad. Sci. (USA) 75:5765 (1978). The primer is then annealed with the single-stranded vector containing the sequence which is to be altered, and the created vector is incubated with a DNA-polymerizing enzyme such as E. coli polymerase I Klenow fragment in an appropriate reaction buffer. The polymerase will complete the synthesis of a mutation-bearing strand. Thus, the second strand will contain the desired mutation. This heteroduplex vector is then used to transform appropriate cells and clones are selected that contain recombinant vectors bearing the mutated sequence.

While the site for introducing a sequence variation is predetermined, the mutation per se need not be predetermined. For example, to optimize the performance of a mutation at a given site, random mutagenesis may be conducted at a target region and the newly generated sequences can be screened for the optimal combination of desired activity. One skilled in the art can evaluate the functionality of the variant by routine screening assays.

The present invention further comprises fusion products of the PTPL1 or GLM-2 PTPs. As is widely known, translation of eukaryotic mRNA is initiated at the codon which encodes the first methionine. The presence of such codons between a eukaryotic promoter and a PTPL1 or GLM-2 sequence results either in the formation of a fusion protein (if the ATG codon is in the same reading frame as the PTP encoding DNA sequence) or a frame-shift mutation (if the ATG codon is not in the same reading frame as the PTP encoding sequence). Fusion proteins may be constructed with enhanced immunospecificity for the detection of these PTPs. The sequence coding for the PTPL1 or GLM-2 PTP may also be joined to a signal sequence which will allow secretion of the protein from, or the compartmentalization of the protein in, a particular host. Such signal sequences may be designed with or without specific protease sites such that the signal peptide sequence is amenable to subsequent removal.
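The reading-frame condition described above can be illustrated with a minimal sketch. The function name and the leader sequences are hypothetical and are not taken from the PTPL1 or GLM-2 sequences; the sketch only checks whether an upstream ATG lies a multiple of three bases from the first base of the PTP coding sequence, and it ignores any stop codon that might intervene.

```python
def upstream_atg_frames(leader: str):
    """Classify every ATG in the leader (the sequence between the promoter
    and the first base of the PTP coding sequence).

    Returns a list of (index, outcome) tuples, where outcome is "fusion"
    if the ATG is in frame with the coding sequence and "frame-shift"
    otherwise.
    """
    leader = leader.upper()
    results = []
    pos = leader.find("ATG")
    while pos != -1:
        # Number of bases from the A of this ATG to the start of the coding sequence.
        distance = len(leader) - pos
        outcome = "fusion" if distance % 3 == 0 else "frame-shift"
        results.append((pos, outcome))
        pos = leader.find("ATG", pos + 1)
    return results


# Hypothetical leaders, for illustration only.
print(upstream_atg_frames("CCATGGCC"))  # ATG is 6 bases upstream: in frame
print(upstream_atg_frames("CCATGGC"))   # ATG is 5 bases upstream: out of frame
```

A construct intended to express the unfused PTP would, under these assumptions, be designed so that this function returns an empty list for its leader.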

The invention further provides detectably labeled, immobilized and toxin conjugated forms of PTPL1 and GLM-2 PTPs, and variants or fragments thereof. The production of such labeled, immobilized or toxin conjugated forms of a protein is well known to those of ordinary skill in the art. While radiolabeling represents one embodiment, the PTPs or variants or fragments thereof may also be labeled using fluorescent labels, enzyme labels, free radical labels, avidin-biotin labels, or bacteriophage labels, using techniques known to the art (Chard, Laboratory Techniques in Biology, "An Introduction to Radioimmunoassay and Related Techniques," North Holland Publishing Company (1978)).

Typical fluorescent labels include fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, and fluorescamine.

Typical chemiluminescent compounds include luminol, isoluminol, aromatic acridinium esters, imidazoles, and the oxalate esters.

Typical bioluminescent compounds include luciferin and luciferase. Typical enzymes include alkaline phosphatase, β-galactosidase, glucose-6-phosphate dehydrogenase, malate dehydrogenase, glucose oxidase, and peroxidase.

Transformed cells, cell lines and hosts.

To transform a mammalian cell with the nucleic acid sequences of the invention many vector systems are available depending upon whether it is desired to insert the recombinant DNA construct into the host cell's chromosomal DNA, or to allow it to exist in an extrachromosomal form. If the PTPL1 or GLM-2 PTP coding sequence, along with an operably joined regulatory sequence is introduced into a recipient eukaryotic cell as a non-replicating DNA (or RNA) molecule, the expression of PTPL1 or GLM-2 PTP may occur through the transient expression of the introduced sequence. Such a non-replicating DNA (or RNA) molecule may be a linear molecule or, more preferably, a closed covalent circular molecule which is incapable of autonomous replication.

In a preferred embodiment, genetically stable transformants may be constructed with vector systems, or transformation systems, whereby recombinant PTPL1 or GLM-2 PTP DNA is integrated into the host chromosome. Such integration may occur de novo within the cell or, in a most preferred embodiment, be assisted by transformation with a vector which functionally inserts itself into the host chromosome with, for example, retro vectors, transposons or other DNA elements which promote integration of DNA sequences in chromosomes. A vector is employed which is capable of integrating the desired sequences into a mammalian host cell chromosome. In a preferred embodiment, the transformed cells are human fibroblasts. In another preferred embodiment, the transformed cells are monkey kidney COS cells.

Cells which have stably integrated the introduced DNA into their chromosomes may be selected by also introducing one or more markers which allow for selection of host cells which contain the expression vector in the chromosome. For example, the marker may provide biocide resistance, e.g., resistance to antibiotics or to heavy metals such as copper, or the like. The selectable marker can either be directly linked to the DNA sequences to be expressed, or introduced into the same cell by co-transfection.

In another embodiment, the introduced sequence is incorporated into a vector capable of autonomous replication in the recipient host. Any of a wide variety of vectors may be employed for this purpose, as outlined below.

Factors of importance in selecting a particular plasmid or vector include: the ease with which recipient cells that contain the vector may be recognized and selected from those recipient cells which do not contain the vector; the number of copies of the vector which are desired in a particular host; and whether it is desirable to be able to "shuttle" the vector between host cells of different species.

Preferred eukaryotic plasmids include those derived from the bovine papilloma virus, SV40, and, in yeast, plasmids containing the 2-micron circle, etc., or their derivatives. Such plasmids are well known in the art (Botstein, D., et al., Miami Wntr. Symp. 19:265-274 (1982); Broach, J. R., in The Molecular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., pp. 445-470 (1981); Broach, J. R., Cell 28:203-204 (1982); Bollon, D. P., et al., J. Clin. Hematol. Oncol. 10:39-48 (1980); Maniatis, T., in Cell Biology: A Comprehensive Treatise, Vol. 3, Gene Expression, Academic Press, NY, pp. 563-608 (1980)), and are commercially available. For example, mammalian expression vector systems have been described which utilize the MSV-LTR promoter to drive expression of the cloned gene and with which it is possible to co-transfect with a helper virus to amplify plasmid copy number and to integrate the plasmid into the chromosomes of host cells (Perkins, A. S., et al., Mol. Cell Biol. 3:1123 (1983); Clontech, Palo Alto, Calif.).

Once the vector or DNA sequence is prepared for expression, it is introduced into an appropriate host cell by any of a variety of suitable means, including transfection. After the introduction of the vector, recipient cells may be grown in a selective medium, which selects for the growth of vector-containing cells. Expression of the cloned nucleic acid sequence(s) results in the production of PTPL1 or GLM-2 PTP, or the production of a variant or fragment of the PTP, or the expression of a PTPL1 or GLM-2 anti-sense RNA, or a variant or fragment thereof. This expression can take place in a transient manner, in a continuous manner, or in a controlled manner as, for example, expression which follows induction of differentiation of the transformed cells (for example, by administration of bromodeoxyuracil to neuroblastoma cells or the like).

In another embodiment of the invention the host is a human host. Thus, a vector may be employed which will introduce into a human with deficient PTPL1 or GLM-2 PTP activity, operable PTPL1 or GLM-2 sequences which can supplement the patient's endogenous production. In another embodiment, the patient suffers from a cancer caused by an oncogene which is a protein tyrosine kinase (PTK). A vector capable of expressing the PTPL1 or GLM-2 protein is introduced within the patient to counteract the PTK activity.

The recombinant PTPL1 or GLM-2 PTP cDNA coding sequences, obtained through the methods above, may be used to obtain PTPL1 or GLM-2 anti-sense RNA sequences. An expression vector may be constructed which contains a DNA sequence operably joined to regulatory sequences such that the DNA sequence expresses the PTPL1 or GLM-2 anti-sense RNA sequence. Transformation with this vector results in a host capable of expression of a PTPL1 or GLM-2 anti-sense RNA in the transformed cell. Preferably such expression occurs in a regulated manner wherein it may be induced and/or repressed as desired. Most preferably, when expressed, anti-sense PTPL1 or GLM-2 RNA interacts with an endogenous PTPL1 or GLM-2 DNA or RNA in a manner which inhibits or represses transcription and/or translation of the PTPL1 or GLM-2 PTP DNA sequences and/or mRNA transcripts in a highly specific manner. Use of anti-sense RNA probes to block gene expression is discussed in Lichtenstein, C., Nature 333:801-802 (1988).
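As a minimal sketch of the relationship between a cDNA coding sequence and its anti-sense RNA, the following computes the reverse complement of a sense-strand DNA fragment and writes it as RNA. The function name and the six-base input are hypothetical and serve only to illustrate base pairing; they are not fragments of the disclosed sequences.

```python
# Map each sense-strand DNA base to the RNA base that pairs with it.
_PAIRING = str.maketrans("ACGT", "UGCA")


def antisense_rna(sense_dna: str) -> str:
    """Return the anti-sense RNA, written 5'->3', for a sense-strand DNA
    sequence that is also written 5'->3'."""
    complement = sense_dna.upper().translate(_PAIRING)
    # Reverse so that the result reads 5'->3'.
    return complement[::-1]


# Hypothetical fragment: the sense mRNA would read AUGGCC.
print(antisense_rna("ATGGCC"))  # GGCCAU
```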

Assays for agonists and antagonists.

The cloning of PTPL1 and GLM-2 now makes possible the production and use of high-throughput assays for the identification and evaluation of new agonists (inducers/enhancers) and antagonists (repressors/inhibitors) of PTPL1 or GLM-2 PTPs for therapeutic strategies using single drugs or combinations of drugs. The assay may, for example, test for PTPL1 or GLM-2 PTP activity in transfected cells (e.g. fibroblasts) to identify drugs that interfere with, enhance, or otherwise alter the expression or regulation of these PTPs. In addition, probes developed for the disclosed PTPL1 and GLM-2 nucleic acid sequences or proteins (e.g. DNA or RNA probes or primers or antibodies to the proteins) may be used as qualitative and/or quantitative indicators for the PTPs in cell lysates, whole cells or whole tissue.

In a preferred embodiment, human fibroblast cells are transformed with the PTPL1 or GLM-2 PTP sequences and vectors disclosed herein. The cells may then be treated with a variety of compounds to identify those which enhance or inhibit PTPL1 or GLM-2 transcription, translation, or PTP activity. In addition, assays for PDGF (platelet derived growth factor) signalling, cell growth, chemotaxis, and actin reorganization are preferred to assess a compound's effect on PTPL1 or GLM-2 PTP transcription, translation or activity.

In another embodiment, the ability of a compound to enhance or inhibit PTPL1 or GLM-2 PTP activity is assayed in vitro. Using the substantially pure PTPL1 or GLM-2 PTPs disclosed herein, and a detectable phosphorylated substrate, the ability of various compounds to enhance or inhibit the phosphatase activity of PTPL1 or GLM-2 may be assayed. In a particularly preferred embodiment the phosphorylated substrate is para-nitrophenyl phosphate (which turns yellow upon dephosphorylation).
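One way such a colorimetric read-out might be reduced to a single number is sketched below. The function name, the blank-subtraction scheme and the absorbance values are assumptions made for illustration and are not part of the disclosure; the yellow product of the substrate is commonly read as absorbance near 405 nm, which is likewise an assumption here.

```python
def relative_activity(a_test: float, a_control: float, a_blank: float) -> float:
    """Phosphatase activity in the presence of a compound, expressed as a
    fraction of the activity of the untreated control.

    a_test:    absorbance of enzyme + substrate + compound
    a_control: absorbance of enzyme + substrate, no compound
    a_blank:   absorbance of substrate alone (no enzyme)
    """
    control_signal = a_control - a_blank
    if control_signal <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return (a_test - a_blank) / control_signal


# Hypothetical readings, not experimental data.
ratio = relative_activity(a_test=0.375, a_control=0.625, a_blank=0.125)
label = "inhibits" if ratio < 1 else "enhances" if ratio > 1 else "no effect"
print(ratio, label)
```

Under this scheme a ratio below 1 would suggest inhibition and a ratio above 1 enhancement; any threshold for calling a compound active would have to be set from replicate measurements.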

In another embodiment, the ability of a compound to enhance or inhibit PTPL1 or GLM-2 transcription is assayed. Using the PTPL1 or GLM-2 cDNA sequences disclosed herein, one of ordinary skill in the art can clone the 5' regulatory sequences of the PTPL1 or GLM-2 genes. These regulatory sequences may then be operably joined to a sequence encoding a marker. The marker may be an enzyme with an easily assayable activity or may cause the host cells to change phenotypically or in their sensitivity or resistance to certain molecules. A wide variety of markers are known to those of ordinary skill in the art and appropriate markers may be chosen depending upon the host used. Compounds which may alter the transcription of PTPL1 or GLM-2 PTP may be tested by exposing cells transformed with the PTPL1 or GLM-2 regulatory sequences operably joined to the marker and assaying for increased or decreased expression of the marker.

The following examples further describe the particular materials and methods used in developing and carrying out some of the embodiments of the present invention. These examples are merely illustrative of techniques employed to date and are not intended to limit the scope of the invention in any manner.

EXAMPLE 1 Original Cloning of PTPL1

All cells, unless stated otherwise, were cultured in Dulbecco's Modified Eagle's Medium (DMEM, Gibco) supplemented with 10% Fetal Calf Serum (FCS, Flow Laboratories), 100 units of penicillin, 50 μg/ml streptomycin and glutamine. The human glioma cell line used was U-343 MGa 31L (Nister, M., et al., (1988) Cancer Res. 48:3910-3918). The AG1518 human foreskin fibroblasts were from the Human Genetic Mutant Cell Repository, Institute for Medical Research, Camden, N.J.

RNA was prepared from U-343 MGa 31L cells or AG1518 human fibroblasts by guanidine thiocyanate (Merck, Darmstadt) extraction (Chirgwin et al., 1979). Briefly, cells were harvested, washed in phosphate buffered saline (PBS), and lysed in 4M guanidine thiocyanate containing 25 mM sodium citrate (pH 7.0) and 0.1M 2-mercaptoethanol. RNA was sedimented through 5.7M cesium chloride; the RNA pellet was then dissolved in 10 mM Tris hydrochloride (pH 7.5), 5 mM EDTA (TE buffer), extracted with phenol and chloroform, precipitated with ethanol, and the final pellet stored at -70° C. or resuspended in TE buffer for subsequent manipulations. Polyadenylated [poly(A)+] RNA was prepared by chromatography on oligo (dT)-cellulose as described in Maniatis et al., 1982.

Poly(A)+ RNA (5 μg) from U-343 MGa 31L cells was used to make a cDNA library by oligo (dT)-primed cDNA synthesis using an Amersham λgt10 cDNA cloning system. Similarly, a random and oligo (dT) primed cDNA library was prepared from AG1518 fibroblasts using 5 μg of poly(A)+ RNA, a RiboClone cDNA synthesis system (Promega Corporation, Madison, Wis., USA), a Lambda ZAPII synthesis kit (Stratagene), and Gigapack Gold II packaging extract (Stratagene). Degenerate primers were designed based on conserved amino acid regions of known PTP sequences and were synthesized using a Gene Assembler Plus (Pharmacia-LKB). Sense oligonucleotides corresponded to the sequences SEQ ID NO:5, FWRM I/V WEQ (5'-TTCTGG A/C GNATGATNTGGGAACA-3', SEQ ID NO:6, 23mer with 32-fold degeneracy) and SEQ ID NO:7, KC A/D Q/E YWP (5'-AA A/G TG C/T GANCAGTA C/T TGGCC-3', SEQ ID NO:8, 20mer with 32-fold degeneracy), and the anti-sense oligonucleotide was based on the sequence SEQ ID NO:9, HCSAG V/I G (5'-CCNACNCC A/C GC A/G CTGCAGTG-3', SEQ ID NO:10, 20mer with 64-fold degeneracy). Unpackaged template cDNA from the U-343 MGa 31L library (100 ng) was amplified using Taq polymerase (Perkin Elmer-Cetus) and 100 ng of either sense primer in combination with 100 ng of the anti-sense primer as described (Saiki et al., 1985). PCR was carried out for 25 cycles, each consisting of denaturation at 94° C. for 30 sec, annealing at 40° C. for 2 min followed by 55° C. for 1 min, and extension at 72° C. for 2 min. The PCR products were separated on a 2.0% low gelling temperature agarose gel (FMC BioProducts, Rockland, USA) and DNA fragments of approximately 368 base pairs (with FWRM sense primer SEQ ID NO:6) and approximately 300 bp (with KC A/D Q sense primer) were excised, eluted from the gel, subcloned into a T-tailed vector (TA Cloning Kit, Invitrogen Corporation, San Diego, Calif., USA), and sequenced.
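The stated lengths and degeneracies of the three primers can be checked with a short sketch. It assumes the notation used in the text, in which a space-delimited token such as A/C denotes a single position with two alternative bases and N denotes any of the four bases; the function name is hypothetical.

```python
from math import prod


def primer_stats(notation: str):
    """Return (length, degeneracy) for a primer written as in the text,
    e.g. 'TTCTGG A/C GNATGATNTGGGAACA'."""
    length = 0
    choices = []
    for token in notation.split():
        if "/" in token:
            # One position with several alternative bases.
            length += 1
            choices.append(len(token.split("/")))
        else:
            length += len(token)
            # Each N contributes a four-fold choice.
            choices.extend(4 for base in token if base == "N")
    return length, prod(choices)


primers = {
    "FWRM I/V WEQ sense": "TTCTGG A/C GNATGATNTGGGAACA",
    "KC A/D Q/E YWP sense": "AA A/G TG C/T GANCAGTA C/T TGGCC",
    "HCSAG V/I G anti-sense": "CCNACNCC A/C GC A/G CTGCAGTG",
}
for name, notation in primers.items():
    print(name, primer_stats(notation))
```

The three results, (23, 32), (20, 32) and (20, 64), agree with the primer lengths and degeneracies given above.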

Nucleotide sequences from several of the PCR cDNA clones analysed were representative of both cytoplasmic and receptor types of PTPs. Thirteen clones encoded cytoplasmic enzymes including MEG (Gu et al., 1991; 8 clones), PTPH1 (Yang and Tonks, 1991; 2 clones), P19PTP (den Hertog et al., 1992), and TC-PTP (Cool et al., 1989, one clone); 11 clones encoded receptor-type enzymes such as HPTP-α (Kruger et al., 1990, 7 clones), HPTP-γ (Kruger et al., 1990, 3 clones) and HPTP-δ (Kruger et al., 1990, 1 clone), and three clones defined novel PTP sequences. Two of these were named PTPL1 and GLM-2.

The U-343 MGa 31L cDNA library was screened with ³²P-random prime-labeled (Megaprime Kit, Amersham) approximately 368 bp inserts corresponding to PTPL1 as described elsewhere (Huynh et al., 1986); clone λ6.15 was isolated, excised from purified phage DNA by Eco RI (Biolabs) digestion and subcloned into pUC18 for sequencing. All other cDNA clones were isolated from the AG1518 human fibroblast cDNA library which was screened with ³²P-labeled λ6.15 insert and with subsequently isolated partial cDNA clones.

Double-stranded plasmid DNA was prepared by a single-tube mini preparation method (Del Sal et al., 1988) or using Magic mini or maxiprep kits (Promega) according to the manufacturer's specifications. Double-stranded DNA was denatured and used as template for sequencing by the dideoxynucleotide chain-termination procedure with T7 DNA polymerase (Pharmacia-LKB), and M13-universal and reverse primers or synthetic oligonucleotides derived from the cDNA sequences being determined. The complete 7395 bp open reading frame of PTPL1 was derived from six overlapping cDNA clones totalling 8040 bp and predicts a protein of 2465 amino acids with an approximate molecular mass of 275 kDa. The 8040 bp sequence is disclosed as SEQ ID NO.:1.
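The figures quoted for the PTPL1 open reading frame can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are hypothetical, and the average residue mass is derived from those numbers rather than being an independent measurement.

```python
ORF_BP = 7395          # open reading frame length stated in the text
CDNA_BP = 8040         # total length of the assembled cDNA sequence
PROTEIN_AA = 2465      # predicted protein length stated in the text
MASS_DA = 275_000      # approximate molecular mass stated in the text

# Three nucleotides per codon.
codons, leftover = divmod(ORF_BP, 3)

# Portion of the cDNA lying outside the open reading frame.
untranslated_bp = CDNA_BP - ORF_BP

# Average mass per residue implied by the stated mass and length.
mean_residue_da = MASS_DA / PROTEIN_AA

print(codons, leftover)            # 2465 0
print(untranslated_bp)             # 645
print(round(mean_residue_da, 1))   # 111.6
```

The implied average of roughly 112 Da per residue is in the range usually assumed for proteins, so the stated length and mass are mutually consistent.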

EXAMPLE 2 Original Cloning of GLM-2

The human glioma cell line U-343 MGa 31L (Nister, M., et al., (1988) Cancer Res. 48:3910-3918) was cultured in Dulbecco's Modified Eagles Medium (DMEM Gibco) supplemented with 10% Fetal Calf Serum (FCS, Flow Laboratories), 100 units of penicillin, 50 μg/ml streptomycin and 2 mM glutamine.

Total RNA was prepared from U-343 MGa 31L cells by guanidine thiocyanate (Merck, Darmstadt) extraction (Chirgwin, et al., 1979). Briefly, cells were harvested, washed in phosphate buffered saline (PBS), and lysed in 4M guanidine thiocyanate containing 25 mM sodium citrate (pH 7.0) and 0.1M 2-mercaptoethanol. RNA was sedimented through 5.7M cesium chloride; the RNA pellet was then dissolved in 10 mM Tris hydrochloride (pH 7.5), 5 mM EDTA (TE buffer), extracted with phenol and chloroform, precipitated with ethanol, and the final pellet stored at -70° C. or resuspended in TE buffer for subsequent manipulations. Polyadenylated [poly(A)+] RNA was prepared by chromatography on oligo (dT)-cellulose as described in Maniatis et al. (1982).

Poly(A)+ RNA (5 μg) isolated from U-343 MGa 31L cells was used to make a cDNA library by oligo (dT)-primed cDNA synthesis using an Amersham λgt10 cDNA cloning system. Degenerate primers were designed based on conserved amino acid regions of known PTP sequences, and synthesized using a Gene Assembler Plus (Pharmacia-LKB). Sense oligonucleotides corresponded to the sequences SEQ ID NO:5, FWRM I/V WEQ (5'-TTCTGG A/C GNATGATNTGGGAACA-3', SEQ ID NO:6, 23mer with 32-fold degeneracy=primer P1) and SEQ ID NO:7, KC A/D Q/E YWP (5'-AA A/G TG C/T GANCAGTA C/T TGGCC-3', SEQ ID NO:8, 20mer with 32-fold degeneracy=primer P2), and the anti-sense oligonucleotide was based on the sequence SEQ ID NO:9, HCSAG V/I G (5'-CCNACNCC A/C GC A/G CTGCAGTG-3', SEQ ID NO:10, 20mer with 64-fold degeneracy=primer P3). Unpackaged template cDNA from the U-343 MGa 31L library (100 ng) was amplified using Taq polymerase (Perkin Elmer-Cetus) and 100 ng of either sense primer in combination with 100 ng of the anti-sense primer as described (Saiki, et al., 1985). PCR was carried out for 25 cycles, each consisting of denaturation at 94° C. for 30 sec, annealing at 40° C. for 2 min followed by 55° C. for 1 min, and extension at 72° C. for 2 min. The PCR products were separated on a 2.0% low gelling temperature agarose gel (FMC Bioproducts, Rockland, USA) and DNA fragments of approximately 368 base pairs (with FWRM sense primer) and approximately 300 bp (with KC A/D Q sense primer) were excised, eluted from the gel, subcloned into a T-tailed vector (TA Cloning Kit, Invitrogen Corporation, San Diego, Calif., USA), and sequenced. Double-stranded plasmid DNA was prepared by a single-tube mini preparation method (Del Sal, et al., 1988) or by using Magic mini or maxiprep kits (Promega) according to the manufacturer's specifications.
Double-stranded DNA was denatured and used as template for sequencing by the dideoxynucleotide chain-termination procedure (Sanger, et al., 1977) with T7 DNA polymerase (Pharmacia-LKB), and M13-universal and reverse primers or, in the case of cDNA clones isolated from the brain cDNA library, using also synthetic oligonucleotides derived from the cDNA sequences being determined.

A human brain cDNA library constructed in λgt10 (Clontech, Calif.) was screened as described elsewhere (Huynh, et al., 1986) with ³²P-random prime-labeled (Megaprime Kit, Amersham) approximately 360 bp inserts corresponding to GLM-2. Clone HBM1 was isolated, excised from purified phage DNA by Eco RI (Biolabs) digestion and subcloned into the plasmid vectors pUC18 or Bluescript (Stratagene) for sequencing. The resulting sequence is disclosed as SEQ ID NO.: 3.

EXAMPLE 3 Tissue-Specific Expression of PTPL1

Total RNA (20 μg) or poly(A)+ RNA (2 μg) denatured in formaldehyde and formamide was separated by electrophoresis on a formaldehyde/1% agarose gel and transferred to nitrocellulose. The filters were hybridized for 16 hrs at 42° C. with ³²P-labeled probes in a solution containing 5× standard saline citrate (SSC; 1× SSC is 50 mM sodium citrate, pH 7.0, 150 mM sodium chloride), 50% formamide, 0.1% sodium dodecyl sulfate (SDS), 50 mM sodium phosphate and 0.1 mg/ml salmon sperm DNA. All probes were labeled by random priming (Feinberg and Vogelstein, 1983) and unincorporated ³²P was removed by Sephadex G-25 (Pharmacia-LKB) chromatography. Human tissue blots (Clontech, Calif.) were hybridized with PTPL1 specific probes according to manufacturer's specifications. Filters were washed twice for 30 min at 60° C. in 2× SSC/0.1% SDS, once for 30 min at 60° C. in 0.5× SSC/0.1% SDS, and exposed to X-ray film (Fuji, RX) with intensifying screen (Cronex Lightning Plus, Dupont) at -70° C.
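A hybridization solution of the stated composition could be assembled from concentrated stocks as sketched below. The final concentrations are those given above; the stock concentrations and the 10 ml batch size are assumptions chosen for illustration and would need to be replaced by the stocks actually on hand. Each volume follows from the dilution relation stock concentration × stock volume = final concentration × total volume.

```python
# name: (final concentration from the text, ASSUMED stock concentration),
# with both values of a pair expressed in the same unit.
COMPONENTS = {
    "SSC (x)": (5, 20),
    "formamide (%)": (50, 100),
    "SDS (%)": (0.1, 10),
    "sodium phosphate (mM)": (50, 1000),
    "salmon sperm DNA (mg/ml)": (0.1, 10),
}


def hybridization_mix(total_ml: float) -> dict:
    """Return the volume in ml of each stock, plus water, needed to make
    `total_ml` of hybridization solution."""
    volumes = {
        name: total_ml * final / stock
        for name, (final, stock) in COMPONENTS.items()
    }
    # Water makes up whatever volume the stocks do not fill.
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


for name, ml in hybridization_mix(10.0).items():
    print(f"{name}: {ml:.2f} ml")
```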

Northern blot analysis of RNAs from various human tissues showed that the 9.5 kb PTPL1 transcript is expressed at different levels: expression was high in kidney, placenta, ovaries and testes; medium in lung, pancreas, prostate and brain; low in heart, skeletal muscle, spleen, liver, small intestine and colon; and virtually undetectable in leukocytes.

EXAMPLE 4 Tissue-Specific Expression of GLM-2

To investigate the expression of GLM-2 mRNA in human tissues, Northern blot analysis was performed on a commercially available filter (Clontech, Calif.) containing mRNAs from human heart, brain, placenta, lung, liver, skeletal muscle, kidney and pancreas tissue. The filter was hybridized according to the manufacturer's specifications with ³²P-labeled GLM-2 PCR product as probe, washed twice for 30 min at 60° C. in 2× standard saline citrate (SSC; 1× SSC is 50 mM sodium citrate, pH 7.0, 150 mM sodium chloride) containing 0.1% sodium dodecyl sulfate (SDS), once for 30 min at 60° C. in 0.5× SSC/0.1% SDS, and exposed to X-ray film (Fuji, RX) with intensifying screen (Cronex Lightning Plus, Dupont) at -70° C.

EXAMPLE 5 Production of PTPL1-Specific Antisera

Rabbit antisera denoted αL1A and αL1B were prepared against peptides corresponding to amino acid residues 1802 to 1823 SEQ ID NO:1 (PAKSDGRLKPGDRLIKVNDTDV) and 450 to 470 SEQ ID NO:1 (DETLSQGQSQRPSRQYETPFE), respectively, of PTPL1. The peptides were synthesized in an Applied Biosystems 430A Peptide Synthesizer using t-butoxycarbonyl chemistry and purified by reverse phase high performance liquid chromatography. The peptides were coupled to keyhole limpet hemocyanin (Calbiochem-Behring) using glutaraldehyde, as described (Gullick, W. J., et al., (1985) EMBO J. 4:2869-2877), and then mixed with Freund's adjuvant and used to immunize a rabbit. The αL1A antiserum was purified by affinity chromatography on protein A-Sepharose CL4B (Pharmacia-LKB) as described by the manufacturer.
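
The position of an immunizing peptide within the PTPL1 open reading frame can be checked against the sequence listing by translating the relevant stretch of SEQ ID NO:1 and searching for the peptide. The sketch below does this for the αL1B peptide only. It is an illustration under stated assumptions: the standard genetic code is used, the fragment is nucleotides 1407 to 1502 as copied from the listing (residues 444 to 475), the CDS start (nucleotide 78) is taken from the CDS feature, and all function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

CDS_START = 78  # first nucleotide of the CDS feature (78..7475)

def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def residue_to_nt(residue):
    """Return the 1-based nucleotide span of the codon for a 1-based residue."""
    first = CDS_START + 3 * (residue - 1)
    return first, first + 2

# Nucleotides 1407..1502 of SEQ ID NO:1, copied from the sequence listing.
FRAGMENT_FIRST_RESIDUE = 444
FRAGMENT = (
    "AGTGGTCTCCCAGGGGTAGATGAAACCTTAAGTCAAGGCCAGTCACAG"
    "AGACCGAGCAGACAATATGAAACACCCTTTGAAGGCAACTTAATTAAT"
)
PEPTIDE = "DETLSQGQSQRPSRQYETPFE"  # peptide used to raise antiserum alpha-L1B

protein = translate(FRAGMENT)
offset = protein.find(PEPTIDE)
first = FRAGMENT_FIRST_RESIDUE + offset
last = first + len(PEPTIDE) - 1
print("residues", first, "to", last)
print("nucleotides", residue_to_nt(first)[0], "to", residue_to_nt(last)[1])
```

With the fragment as transcribed, the peptide is located at residues 450 to 470, encoded by nucleotides 1425 to 1487 of SEQ ID NO:1.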

EXAMPLE 6 Transfection of the PTPL1 cDNA Into COS-1 Cells

The full length PTPL1 cDNA was constructed using overlapping clones and cloned into the SV40-based expression vector pSV7d (Truett, M. A., et al., (1985) DNA 4:333-349), and transfected into COS-1 cells by the calcium phosphate precipitation method (Wigler, M., et al., (1979) Cell 16:777-785). Briefly, cells were seeded into 6-well cell culture plates at a density of 5×10⁵ cells/well, and transfected the following day with 10 μg of plasmid. After overnight incubation, cells were washed three times with a buffer containing 25 mM Tris-HCl, pH 7.4, 138 mM NaCl, 5 mM KCl, 0.7 mM CaCl₂, 0.5 mM MgCl₂ and 0.6 mM Na₂HPO₄, and then incubated with Dulbecco's modified Eagle's medium containing 10% fetal calf serum and antibiotics. Two days after transfection, the cells were used for metabolic labeling followed by immunoprecipitation and SDS-gel electrophoresis, or immunoprecipitation followed by dephosphorylation experiments.

EXAMPLE 7 Metabolic Labeling, Immunoprecipitation and Electrophoresis of PTPL1

Metabolic labeling of COS-1 cells, AG1518 cells, PC-3 cells, CCL-64 cells, A549 cells and PAE cells was performed for 4 h in methionine- and cysteine-free MCDB 104 medium (Gibco) with 150 μCi/ml of [³⁵S]methionine and [³⁵S]cysteine (in vivo labeling mix; Amersham). After labeling, the cells were solubilized in a buffer containing 20 mM Tris-HCl, pH 7.4, 150 mM NaCl, 10 mM EDTA, 0.5% Triton X-100, 0.5% deoxycholate, 1.5% Trasylol (Bayer) and 1 mM phenylmethylsulfonyl fluoride (PMSF; Sigma). After 15 min on ice, cell debris was removed by centrifugation. Samples (1 ml) were then incubated for 1.5 h at 4° C. with either αL1A antibodies or αL1A antibodies preblocked with 10 μg of peptide. Immune complexes were then mixed with 50 μl of a protein A-Sepharose (Pharmacia-LKB) slurry (50% packed beads in 150 mM NaCl, 20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and incubated for 45 min at 4° C. The beads were pelleted and washed four times with washing buffer (20 mM Tris-HCl, pH 7.4, 500 mM NaCl, 1% Triton X-100, 1% deoxycholate and 0.2% SDS), followed by one wash in distilled water. The immune complexes were eluted by boiling for 5 min in the SDS-sample buffer (100 mM Tris-HCl, pH 8.8, 0.01% bromophenol blue, 36% glycerol, 4% SDS) in the presence of 10 mM dithiothreitol (DTT), and analyzed by SDS-gel electrophoresis using 4-7% polyacrylamide gels (Blobel, G., and Dobberstein, B. (1975) J. Cell Biol. 67:835-851). The gel was fixed, incubated with Amplify (Amersham) for 20 min, dried and subjected to fluorography.

EXAMPLE 8 Dephosphorylation Assay for PTPL1

COS-1 cells were lysed in 20 mM Tris-HCl, pH 7.4, 150 mM NaCl, 10 mM EDTA, 0.5% Triton X-100, 0.5% deoxycholate, 1.5% Trasylol, 1 mM PMSF and 1 mM DTT, for 15 min. Lysates were cleared by centrifugation; 3 μl of the antiserum αL1B, with or without preblocking with 10 μg peptide, was added and samples were incubated for 2 h at 4° C. Protein A-Sepharose slurry (25 μl) was then added and incubation was prolonged another 30 min at 4° C. The beads were pelleted and washed four times with lysis buffer and one time with dephosphorylation assay buffer (25 mM imidazole-HCl, pH 7.2, 1 mg/ml bovine serum albumin and 1 mM DTT), and finally resuspended in dephosphorylation assay buffer containing 2 μM myelin basic protein ³²P-labeled on tyrosine residues by baculovirus-expressed intracellular part of the insulin receptor, kindly provided by A. J. Flint (Cold Spring Harbor Laboratory) and M. M. Cobb (University of Texas). After incubation for the indicated times at 30° C., the reactions were stopped with a charcoal mixture (Streuli, M., et al., (1988) J. Exp. Med. 168:1523-1530) and the radioactivity in the supernatants was determined by Cerenkov counting. For each sample, lysate corresponding to 5 cm² of confluent cells was used.

It should be understood that the preceding is merely a detailed description of certain preferred embodiments and examples of particular laboratory embodiments. It therefore should be apparent to those skilled in the art that various modifications and equivalents can be made without departing from the spirit or scope of the invention as defined in the appended claims.

__________________________________________________________________________
SEQUENCE LISTING                                                          
(1) GENERAL INFORMATION:                                                  
(iii) NUMBER OF SEQUENCES: 34                                             
(2) INFORMATION FOR SEQ ID NO:1:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 8040 base pairs                                               
(B) TYPE: nucleic acid                                                    
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: cDNA to mRNA                                          
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(vi) ORIGINAL SOURCE:                                                     
(A) ORGANISM: HOMO SAPIENS                                                
(ix) FEATURE:                                                             
(A) NAME/KEY: CDS                                                         
(B) LOCATION: 78..7475                                                    
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                   
CCCGCCCCGACGCCGCGTCCCTGCAGCCCTGCCCGGCGCTCCAGTAGCAGGACCCGGTCT60            
CGGGACCAGCCGGTAATATGCACGTGTCACTAGCTGAGGCCCTGGAGGTT110                     
MetHisValSerLeuAlaGluAlaLeuGluVal                                         
1           5              10
CGGGGTGGACCACTTCAGGAGGAAGAAATATGGGCTGTATTAAATCAA158                       
ArgGlyGlyProLeuGlnGluGluGluIleTrpAlaValLeuAsnGln                          
         15             20             25
AGTGCTGAAAGTCTCCAAGAATTATTCAGAAAAGTAAGCCTAGCTGAT206                       
SerAlaGluSerLeuGlnGluLeuPheArgLysValSerLeuAlaAsp                          
      30             35             40
CCTGCTGCCCTTGGCTTCATCATTTCTCCATGGTCTCTGCTGTTGCTG254                       
ProAlaAlaLeuGlyPheIleIleSerProTrpSerLeuLeuLeuLeu                          
   45             50             55
CCATCTGGTAGTGTGTCATTTACAGATGAAAATATTTCCAATCAGGAT302                       
ProSerGlySerValSerPheThrAspGluAsnIleSerAsnGlnAsp                          
60             65             70             75
CTTCGAGCATTCACTGCACCAGAGGTTCTTCAAAATCAGTCACTAACT350                       
LeuArgAlaPheThrAlaProGluValLeuGlnAsnGlnSerLeuThr                          
            80             85             90
TCTCTCTCAGATGTTGAAAAGATCCACATTTATTCTCTTGGAATGACA398                       
SerLeuSerAspValGluLysIleHisIleTyrSerLeuGlyMetThr                          
         95             100            105
CTGTATTGGGGGGCTGATTATGAAGTGCCTCAGAGCCAACCTATTAAG446                       
LeuTyrTrpGlyAlaAspTyrGluValProGlnSerGlnProIleLys                          
      110            115            120
CTTGGAGATCATCTCAACAGCATACTGCTTGGAATGTGTGAGGATGTT494                       
LeuGlyAspHisLeuAsnSerIleLeuLeuGlyMetCysGluAspVal                          
   125            130            135
ATTTACGCTCGAGTTTCTGTTCGGACTGTGCTGGATGCTTGCAGTGCC542                       
IleTyrAlaArgValSerValArgThrValLeuAspAlaCysSerAla                          
140            145            150            155
CACATTAGGAATAGCAATTGTGCACCCTCATTTTCCTACGTGAAACAC590                       
HisIleArgAsnSerAsnCysAlaProSerPheSerTyrValLysHis                          
            160            165            170
TTGGTAAAACTGGTTCTGGGAAATCTTTCTGGGACAGATCAGCTTTCC638                       
LeuValLysLeuValLeuGlyAsnLeuSerGlyThrAspGlnLeuSer                          
         175            180            185
TGTAACAGTGAACAAAAGCCTGATCGAAGCCAGGCTATTCGAGATCGA686                       
CysAsnSerGluGlnLysProAspArgSerGlnAlaIleArgAspArg                          
      190            195            200
TTGCGAGGAAAAGGATTACCAACAGGAAGAAGCTCTACTTCTGATGTA734                       
LeuArgGlyLysGlyLeuProThrGlyArgSerSerThrSerAspVal                          
   205            210            215
CTAGACATACAAAAGCCTCCACTCTCTCATCAGACCTTTCTTAACAAA782                       
LeuAspIleGlnLysProProLeuSerHisGlnThrPheLeuAsnLys                          
220            225            230            235
GGGCTTAGTAAATCTATGGGATTTCTGTCCATCAAAGATACACAAGAT830                       
GlyLeuSerLysSerMetGlyPheLeuSerIleLysAspThrGlnAsp                          
            240            245            250
GAGAATTATTTCAAGGACATTTTATCAGATAATTCTGGACGTGAAGAT878                       
GluAsnTyrPheLysAspIleLeuSerAspAsnSerGlyArgGluAsp                          
         255            260            265
TCTGAAAATACATTCTGCCCTTACCAGTTCAAAACTAGTGGCCCAGAA926                       
SerGluAsnThrPheCysProTyrGlnPheLysThrSerGlyProGlu                          
      270            275            280
AAAAAACCCATCCCTGGCATTGATGTGCTTTCTAAGAAGAAGATCTGG974                       
LysLysProIleProGlyIleAspValLeuSerLysLysLysIleTrp                          
   285            290            295
GCTTCATCCATGGACTTGCTTTGTACAGCTGACAGAGACTTCTCTTCA1022                      
AlaSerSerMetAspLeuLeuCysThrAlaAspArgAspPheSerSer                          
300            305            310            315
GGAGAGACTGCCACATATCGTCGTTGTCACCCTGAGGCAGTAACAGTG1070                      
GlyGluThrAlaThrTyrArgArgCysHisProGluAlaValThrVal                          
            320            325            330
CGGACTTCAACTACGCCTAGAAAAAAGGAGGCAAGATACTCAGATGGA1118                      
ArgThrSerThrThrProArgLysLysGluAlaArgTyrSerAspGly                          
         335            340            345
AGTATAGCCTTGGATATCTTTGGCCCTCAGAAAATGGATCCAATATAT1166                      
SerIleAlaLeuAspIlePheGlyProGlnLysMetAspProIleTyr                          
      350            355            360
CACACTCGAGAATTGCCCACCTCCTCAGCAATATCAAGTGCTTTGGAC1214                      
HisThrArgGluLeuProThrSerSerAlaIleSerSerAlaLeuAsp                          
   365            370            375
CGAATCCGAGAGAGACAAAAGAAACTTCAGGTTCTGAGGGAAGCCATG1262                      
ArgIleArgGluArgGlnLysLysLeuGlnValLeuArgGluAlaMet                          
380            385            390            395
AATGTAGAAGAACCAGTTCGAAGATACAAAACTTATCATGGTGATGTC1310                      
AsnValGluGluProValArgArgTyrLysThrTyrHisGlyAspVal                          
            400            405            410
TTTAGTACCTCCAGTGAAAGTCCATCTATTATTTCCTCTGAATCAGAT1358                      
PheSerThrSerSerGluSerProSerIleIleSerSerGluSerAsp                          
         415            420            425
TTCAGACAAGTGAGAAGAAGTGAAGCCTCAAAGAGGTTTGAATCCAGC1406                      
PheArgGlnValArgArgSerGluAlaSerLysArgPheGluSerSer                          
      430            435            440
AGTGGTCTCCCAGGGGTAGATGAAACCTTAAGTCAAGGCCAGTCACAG1454                      
SerGlyLeuProGlyValAspGluThrLeuSerGlnGlyGlnSerGln                          
   445            450            455
AGACCGAGCAGACAATATGAAACACCCTTTGAAGGCAACTTAATTAAT1502                      
ArgProSerArgGlnTyrGluThrProPheGluGlyAsnLeuIleAsn                          
460            465            470            475
CAAGAGATCATGCTAAAACGGCAAGAGGAAGAACTGATGCAGCTACAA1550                      
GlnGluIleMetLeuLysArgGlnGluGluGluLeuMetGlnLeuGln                          
            480            485            490
GCCAAAATGGCCCTTAGACAGTCTCGGTTGAGCCTATATCCAGGAGAC1598                      
AlaLysMetAlaLeuArgGlnSerArgLeuSerLeuTyrProGlyAsp                          
         495            500            505
ACAATCAAAGCGTCCATGCTTGACATCACCAGGGATCCGTTAAGAGAA1646                      
ThrIleLysAlaSerMetLeuAspIleThrArgAspProLeuArgGlu                          
      510            515            520
ATTGCCCTAGAAACAGCCATGACTCAAAGAAAACTGAGGAATTTCTTT1694                      
IleAlaLeuGluThrAlaMetThrGlnArgLysLeuArgAsnPhePhe                          
   525            530            535
GGCCCTGAGTTTGTGAAAATGACAATTGAACCATTTATATCTTTGGAT1742                      
GlyProGluPheValLysMetThrIleGluProPheIleSerLeuAsp                          
540            545            550            555
TTGCCACGGTCTATTCTTACTAAGAAAGGGAAGAATGAGGATAACCGA1790                      
LeuProArgSerIleLeuThrLysLysGlyLysAsnGluAspAsnArg                          
            560            565            570
AGGAAAGTAAACATAATGCTTCTGAACGGGCAAAGACTGGAACTGACC1838                      
ArgLysValAsnIleMetLeuLeuAsnGlyGlnArgLeuGluLeuThr                          
         575            580            585
TGTGATACCAAAACTATATGTAAAGATGTGTTTGATATGGTTGTGGCA1886                      
CysAspThrLysThrIleCysLysAspValPheAspMetValValAla                          
      590            595            600
CATATTGGCTTAGTAGAGCATCATTTGTTTGCTTTAGCTACCCTCAAA1934                      
HisIleGlyLeuValGluHisHisLeuPheAlaLeuAlaThrLeuLys                          
   605            610            615
GATAATGAATATTTCTTTGTTGATCCTGACTTAAAATTAACCAAAGTG1982                      
AspAsnGluTyrPhePheValAspProAspLeuLysLeuThrLysVal                          
620            625            630            635
GCCCCAGAGGGATGGAAAGAAGAACCAAAGAAAAAGACCAAAGCCACT2030                      
AlaProGluGlyTrpLysGluGluProLysLysLysThrLysAlaThr                          
            640            645            650
GTTAATTTTACTTTGTTTTTCAGAATTAAATTTTTTATGGATGATGTT2078                      
ValAsnPheThrLeuPhePheArgIleLysPhePheMetAspAspVal                          
         655            660            665
AGTCTAATACAACATACTCTGACGTGTCATCAGTATTACCTTCAGCTT2126                      
SerLeuIleGlnHisThrLeuThrCysHisGlnTyrTyrLeuGlnLeu                          
      670            675            680
CGAAAAGATATTTTGGAGGAAAGGATGCACTGTGATGATGAGACTTCC2174                      
ArgLysAspIleLeuGluGluArgMetHisCysAspAspGluThrSer                          
   685            690            695
TTATTGCTGGCATCCTTGGCTCTCCAGGCTGAGTATGGAGATTATCAA2222                      
LeuLeuLeuAlaSerLeuAlaLeuGlnAlaGluTyrGlyAspTyrGln                          
700            705            710            715
CCAGAGGTTCATGGTGTGTCTTACTTTAGAATGGAGCACTATTTGCCC2270                      
ProGluValHisGlyValSerTyrPheArgMetGluHisTyrLeuPro                          
            720            725            730
GCCAGAGTGATGGAGAAACTTGATTTATCCTATATCAAAGAAGAGTTA2318                      
AlaArgValMetGluLysLeuAspLeuSerTyrIleLysGluGluLeu                          
         735            740            745
CCCAAATTGCATAATACCTATGTGGGAGCTTCTGAAAAAGAGACAGAG2366                      
ProLysLeuHisAsnThrTyrValGlyAlaSerGluLysGluThrGlu                          
      750            755            760
TTAGAATTTTTAAAGGTCTGCCAAAGACTGACAGAATATGGAGTTCAT2414                      
LeuGluPheLeuLysValCysGlnArgLeuThrGluTyrGlyValHis                          
   765            770            775
TTTCACCGAGTGCACCCTGAGAAGAAGTCACAAACAGGAATATTGCTT2462                      
PheHisArgValHisProGluLysLysSerGlnThrGlyIleLeuLeu                          
780            785            790            795
GGAGTCTGTTCTAAAGGTGTCCTTGTGTTTGAAGTTCACAATGGAGTG2510                      
GlyValCysSerLysGlyValLeuValPheGluValHisAsnGlyVal                          
            800            805            810
CGCACATTGGTCCTTCGCTTTCCATGGAGGGAAACCAAGAAAATATCT2558                      
ArgThrLeuValLeuArgPheProTrpArgGluThrLysLysIleSer                          
         815            820            825
TTTTCTAAAAAGAAAATCACATTGCAAAATACATCAGATGGAATAAAA2606                      
PheSerLysLysLysIleThrLeuGlnAsnThrSerAspGlyIleLys                          
      830            835            840
CATGGCTTCCAGACAGACAACAGTAAGATATGCCAGTACCTGCTGCAC2654                      
HisGlyPheGlnThrAspAsnSerLysIleCysGlnTyrLeuLeuHis                          
   845            850            855
CTCTGCTCTTACCAGCATAAGTTCCAGCTACAGATGAGAGCAAGACAG2702                      
LeuCysSerTyrGlnHisLysPheGlnLeuGlnMetArgAlaArgGln                          
860            865            870            875
AGCAACCAAGATGCCCAAGATATTGAGAGAGCTTCGTTTAGGAGCCTG2750                      
SerAsnGlnAspAlaGlnAspIleGluArgAlaSerPheArgSerLeu                          
            880            885            890
AATCTCCAAGCAGAGTCTGTTAGAGGATTTAATATGGGACGAGCAATC2798                      
AsnLeuGlnAlaGluSerValArgGlyPheAsnMetGlyArgAlaIle                          
         895            900            905
AGCACTGGCAGTCTGGCCAGCAGCACCCTCAACAAACTTGCTGTTCGA2846                      
SerThrGlySerLeuAlaSerSerThrLeuAsnLysLeuAlaValArg                          
      910            915            920
CCTTTATCAGTTCAAGCTGAGATTCTGAAGAGGCTATCCTGCTCAGAG2894                      
ProLeuSerValGlnAlaGluIleLeuLysArgLeuSerCysSerGlu                          
   925            930            935
CTGTCGCTTTACCAGCCATTGCAAAACAGTTCAAAAGAGAAGAATGAC2942                      
LeuSerLeuTyrGlnProLeuGlnAsnSerSerLysGluLysAsnAsp                          
940            945            950            955
AAAGCTTCATGGGAGGAAAAGCCTAGAGAGATGAGTAAATCATACCAT2990                      
LysAlaSerTrpGluGluLysProArgGluMetSerLysSerTyrHis                          
            960            965            970
GATCTCAGTCAGGCCTCTCTCTATCCACATCGGAAAAATGTCATTGTT3038                      
AspLeuSerGlnAlaSerLeuTyrProHisArgLysAsnValIleVal                          
         975            980            985
AACATGGAACCCCCACCACAAACCGTTGCAGAGTTGGTGGGAAAACCT3086                      
AsnMetGluProProProGlnThrValAlaGluLeuValGlyLysPro                          
      990            995            1000
TCTCACCAGATGTCAAGATCTGATGCAGAATCTTTGGCAGGAGTGACA3134                      
SerHisGlnMetSerArgSerAspAlaGluSerLeuAlaGlyValThr                          
   1005           1010           1015
AAACTTAATAATTCAAAGTCTGTTGCGAGTTTAAATAGAAGTCCTGAA3182                      
LysLeuAsnAsnSerLysSerValAlaSerLeuAsnArgSerProGlu                          
1020           1025           1030           1035
AGGAGGAAACATGAATCAGACTCCTCATCCATTGAAGACCCTGGGCAA3230                      
ArgArgLysHisGluSerAspSerSerSerIleGluAspProGlyGln                          
            1040           1045           1050
GCATATGTTCTAGATGTGCTACACAAAAGATGGAGCATAGTATCTTCA3278                      
AlaTyrValLeuAspValLeuHisLysArgTrpSerIleValSerSer                          
         1055           1060           1065
CCAGAAAGGGAGATCACCTTAGTGAACCTGAAAAAAGATGCAAAGTAT3326                      
ProGluArgGluIleThrLeuValAsnLeuLysLysAspAlaLysTyr                          
      1070           1075           1080
GGCTTGGGATTTCAAATTATTGGTGGGGAGAAGATGGAGACTGACCTA3374                      
GlyLeuGlyPheGlnIleIleGlyGlyGluLysMetGluThrAspLeu                          
   1085           1090           1095
GGCATATTTATCAGCTCAGTTGCCCCTGGAGGACCAGCTGACTTCCAT3422                      
GlyIlePheIleSerSerValAlaProGlyGlyProAlaAspPheHis                          
1100           1105           1110           1115
GGATGCTTGAAGCCAGGAGACCGTTTGATATCTGTGAATAGTGTGAGT3470                      
GlyCysLeuLysProGlyAspArgLeuIleSerValAsnSerValSer                          
            1120           1125           1130
CTGGAGGGAGTCAGCCACCATGCTGCAATTGAAATTTTGCAAAATGCA3518                      
LeuGluGlyValSerHisHisAlaAlaIleGluIleLeuGlnAsnAla                          
         1135           1140           1145
CCTGAAGATGTGACACTTGTTATCTCTCAGCCAAAAGAAAAGATATCC3566                      
ProGluAspValThrLeuValIleSerGlnProLysGluLysIleSer                          
      1150           1155           1160
AAAGTGCCTTCTACTCCTGTGCATCTCACCAATGAGATGAAAAACTAC3614                      
LysValProSerThrProValHisLeuThrAsnGluMetLysAsnTyr                          
   1165           1170           1175
ATGAAGAAATCTTCCTACATGCAAGACAGTGCTATAGATTCTTCTTCC3662                      
MetLysLysSerSerTyrMetGlnAspSerAlaIleAspSerSerSer                          
1180           1185           1190           1195
AAGGATCACCACTGGTCACGTGGTACCCTGAGGCACATCTCGGAGAAC3710                      
LysAspHisHisTrpSerArgGlyThrLeuArgHisIleSerGluAsn                          
            1200           1205           1210
TCCTTTGGGCCGTCTGGGGGCCTGCGGGAAGGAAGCCTGAGTTCTCAA3758                      
SerPheGlyProSerGlyGlyLeuArgGluGlySerLeuSerSerGln                          
         1215           1220           1225
GATTCCAGGACTGAGAGTGCCAGCTTGTCTCAAAGCCAGGTCAATGGT3806                      
AspSerArgThrGluSerAlaSerLeuSerGlnSerGlnValAsnGly                          
      1230           1235           1240
TTCTTTGCCAGCCATTTAGGTGACCAAACCTGGCAGGAATCACAGCAT3854                      
PhePheAlaSerHisLeuGlyAspGlnThrTrpGlnGluSerGlnHis                          
   1245           1250           1255
GGCAGCCCTTCCCCATCTGTAATATCCAAAGCCACCGAGAAAGAGACT3902                      
GlySerProSerProSerValIleSerLysAlaThrGluLysGluThr                          
1260           1265           1270           1275
TTCACTGATAGTAACCAAAGCAAAACTAAAAAGCCAGGCATTTCTGAT3950                      
PheThrAspSerAsnGlnSerLysThrLysLysProGlyIleSerAsp                          
            1280           1285           1290
GTAACTGATTACTCAGACCGTGGAGATTCAGACATGGATGAAGCCACT3998                      
ValThrAspTyrSerAspArgGlyAspSerAspMetAspGluAlaThr                          
         1295           1300           1305
TACTCCAGCAGTCAGGATCATCAAACACCAAAACAGGAATCTTCCTCT4046                      
TyrSerSerSerGlnAspHisGlnThrProLysGlnGluSerSerSer                          
      1310           1315           1320
TCAGTGAATACATCCAACAAGATGAATTTTAAAACTTTTTCTTCATCA4094                      
SerValAsnThrSerAsnLysMetAsnPheLysThrPheSerSerSer                          
   1325           1330           1335
CCTCCTAAGCCTGGAGATATCTTTGAGGTTGAACTGGCTAAAAATGAT4142                      
ProProLysProGlyAspIlePheGluValGluLeuAlaLysAsnAsp                          
1340           1345           1350           1355
AACAGCTTGGGGATAAGTGTCACGGGAGGTGTGAATACGAGTGTCAGA4190                      
AsnSerLeuGlyIleSerValThrGlyGlyValAsnThrSerValArg                          
            1360           1365           1370
CATGGTGGCATTTATGTGAAAGATGTTATTCCCCAGGGAGCAGCAGAG4238                      
HisGlyGlyIleTyrValLysAspValIleProGlnGlyAlaAlaGlu                          
         1375           1380           1385
TCTGATGGTAGAATTCACAAAGGTGATCGCGTCCTAGCTGTCAATGGA4286                      
SerAspGlyArgIleHisLysGlyAspArgValLeuAlaValAsnGly                          
      1390           1395           1400
GTTAGTCTAGAAGGAGCCACCCATAAGCAAGCTGTGGAAACACTGAGA4334                      
ValSerLeuGluGlyAlaThrHisLysGlnAlaValGluThrLeuArg                          
   1405           1410           1415
AATACAGGACAGGTGGTTCATCTGTTATTAGAAAAGGGACAATCTCCA4382                      
AsnThrGlyGlnValValHisLeuLeuLeuGluLysGlyGlnSerPro                          
1420           1425           1430           1435
ACATCTAAAGAACATGTCCCGGTAACCCCACAGTGTACCCTTTCAGAT4430                      
ThrSerLysGluHisValProValThrProGlnCysThrLeuSerAsp                          
            1440           1445           1450
CAGAATGCCCAAGGTCAAGGCCCAGAAAAAGTGAAGAAAACAACTCAG4478                      
GlnAsnAlaGlnGlyGlnGlyProGluLysValLysLysThrThrGln                          
         1455           1460           1465
GTCAAAGACTACAGCTTTGTCACTGAAGAAAATACATTTGAGGTAAAA4526                      
ValLysAspTyrSerPheValThrGluGluAsnThrPheGluValLys                          
      1470           1475           1480
TTATTTAAAAATAGCTCAGGTCTAGGATTCAGTTTTTCTCGAGAAGAT4574                      
LeuPheLysAsnSerSerGlyLeuGlyPheSerPheSerArgGluAsp                          
   1485           1490           1495
AATCTTATACCGGAGCAAATTAATGCCAGCATAGTAAGGGTTAAAAAG4622                      
AsnLeuIleProGluGlnIleAsnAlaSerIleValArgValLysLys                          
1500           1505           1510           1515
CTCTTTGCTGGACAGCCAGCAGCAGAAAGTGGAAAAATTGATGTAGGA4670                      
LeuPheAlaGlyGlnProAlaAlaGluSerGlyLysIleAspValGly                          
            1520           1525           1530
GATGTTATCTTGAAAGTGAATGGAGCCTCTTTGAAAGGACTATCTCAG4718                      
AspValIleLeuLysValAsnGlyAlaSerLeuLysGlyLeuSerGln                          
         1535           1540           1545
CAGGAAGTCATATCTGCTCTCAGGGGAACTGCTCCAGAAGTATTCTTG4766                      
GlnGluValIleSerAlaLeuArgGlyThrAlaProGluValPheLeu                          
      1550           1555           1560
CTTCTCTGCAGACCTCCACCTGGTGTGCTACCGGAAATTGATACTGCG4814                      
LeuLeuCysArgProProProGlyValLeuProGluIleAspThrAla                          
   1565           1570           1575
CTTTTGACCCCACTTCAGTCTCCAGCACAAGTACTTCCAAACAGCAGT4862                      
LeuLeuThrProLeuGlnSerProAlaGlnValLeuProAsnSerSer                          
1580           1585           1590           1595
AAAGACTCTTCTCAGCCATCATGTGTGGAGCAAAGCACCAGCTCAGAT4910                      
LysAspSerSerGlnProSerCysValGluGlnSerThrSerSerAsp                          
            1600           1605           1610
GAAAATGAAATGTCAGACAAAAGCAAAAAACAGTGCAAGTCCCCATCC4958                      
GluAsnGluMetSerAspLysSerLysLysGlnCysLysSerProSer                          
         1615           1620           1625
AGAAGAGACAGTTACAGTGACAGCAGTGGGAGTGGAGAAGATGACTTA5006                      
ArgArgAspSerTyrSerAspSerSerGlySerGlyGluAspAspLeu                          
      1630           1635           1640
GTCACAGCTCCAGCAAACATATCAAATTCGACCTGGAGTTCAGCTTTG5054                      
ValThrAlaProAlaAsnIleSerAsnSerThrTrpSerSerAlaLeu                          
   1645           1650           1655
CATCAGACTCTAAGCAACATGGTATCACAGGCACAGAGTCATCATGAA5102                      
HisGlnThrLeuSerAsnMetValSerGlnAlaGlnSerHisHisGlu                          
1660           1665           1670           1675
GCACCCAAGAGTCAAGAAGATACCATTTGTACCATGTTTTACTATCCT5150                      
AlaProLysSerGlnGluAspThrIleCysThrMetPheTyrTyrPro                          
            1680           1685           1690
CAGAAAATTCCCAATAAACCAGAGTTTGAGGACAGTAATCCTTCCCCT5198                      
GlnLysIleProAsnLysProGluPheGluAspSerAsnProSerPro                          
         1695           1700           1705
CTACCACCGGATATGGCTCCTGGGCAGAGTTATCAACCCCAATCAGAA5246                      
LeuProProAspMetAlaProGlyGlnSerTyrGlnProGlnSerGlu                          
      1710           1715           1720
TCTGCTTCCTCTAGTTCGATGGATAAGTATCATATACATCACATTTCT5294                      
SerAlaSerSerSerSerMetAspLysTyrHisIleHisHisIleSer                          
1725 1730 1735
GAACCAACTAGACAAGAAAACTGGACACCTTTGAAAAATGACTTGGAA5342                      
GluProThrArgGlnGluAsnTrpThrProLeuLysAsnAspLeuGlu                          
1740 1745 1750 1755
AATCACCTTGAAGACTTTGAACTGGAAGTAGAACTCCTCATTACCCTA5390                      
AsnHisLeuGluAspPheGluLeuGluValGluLeuLeuIleThrLeu                          
1760 1765 1770
ATTAAATCAGAAAAAGCAAGCCTGGGTTTTACAGTAACCAAAGGCAAT5438                      
IleLysSerGluLysAlaSerLeuGlyPheThrValThrLysGlyAsn                          
1775 1780 1785
CAGAGAATTGGTTGTTATGTTCATGATGTCATACAGGATCCAGCCAAA5486                      
GlnArgIleGlyCysTyrValHisAspValIleGlnAspProAlaLys                          
1790 1795 1800
AGTGATGGAAGGCTAAAACCTGGGGACCGGCTCATAAAGGTTAATGAT5534                      
SerAspGlyArgLeuLysProGlyAspArgLeuIleLysValAsnAsp                          
1805 1810 1815
ACAGATGTTACTAATATGACTCATACAGATGCAGTTAATCTGCTCCGG5582                      
ThrAspValThrAsnMetThrHisThrAspAlaValAsnLeuLeuArg                          
1820 1825 1830 1835
GCTGCATCCAAAACAGTCAGATTAGTTATTGGACGAGTTCCTAGAATT5630                      
AlaAlaSerLysThrValArgLeuValIleGlyArgValProArgIle                          
1840 1845 1850
ACCCAGAATACCAATGTTGCCTCATTTGCTACCGGACATAAACTAACG5678                      
ThrGlnAsnThrAsnValAlaSerPheAlaThrGlyHisLysLeuThr                          
1855 1860 1865
TGCAACAAAGAGGAGTTGGGTTTTTCCTTATGTGGAGGTCATGACAGC5726                      
CysAsnLysGluGluLeuGlyPheSerLeuCysGlyGlyHisAspSer                          
1870 1875 1880
CTTTATCAAGTGGTATATATTAGTGATATTAATCCAAGGTCCGTCGCA5774                      
LeuTyrGlnValValTyrIleSerAspIleAsnProArgSerValAla                          
1885 1890 1895
GCCATTGAGGGTAATCTCCAGCTATTAGATGTCATCCATTATGTGAAC5822                      
AlaIleGluGlyAsnLeuGlnLeuLeuAspValIleHisTyrValAsn                          
1900 1905 1910 1915
GGAGTCAGCACACAAGGAATGACCTTGGAGGAAGTTAACAGAGCATTA5870                      
GlyValSerThrGlnGlyMetThrLeuGluGluValAsnArgAlaLeu                          
1920 1925 1930
GACATGTCACTTCCTTCATTGGTATTGAAAGCAACAAGAAATGATCTT5918                      
AspMetSerLeuProSerLeuValLeuLysAlaThrArgAsnAspLeu                          
1935 1940 1945
CCAGTGGTTCCCAGCTCAAAGAGGTCTGCTGTTTCAGCTCCAAAGTCA5966                      
ProValValProSerSerLysArgSerAlaValSerAlaProLysSer                          
1950 1955 1960
ACCAAAGGCAATGGTTCCTACAGTGTGGGGTCTTGCAGCCAGCCTGCC6014                      
ThrLysGlyAsnGlySerTyrSerValGlySerCysSerGlnProAla                          
1965 1970 1975
CTCACTCCTAATGATTCATTCTCCACGGTTGCTGGGGAAGAAATAAAT6062                      
LeuThrProAsnAspSerPheSerThrValAlaGlyGluGluIleAsn                          
1980 1985 1990 1995
GAAATATCGTACCCCAAAGGAAAATGTTCTACTTATCAGATAAAGGGA6110                      
GluIleSerTyrProLysGlyLysCysSerThrTyrGlnIleLysGly                          
2000 2005 2010
TCACCAAACTTGACTCTGCCCAAAGAATCTTATATACAAGAAGATGAC6158                      
SerProAsnLeuThrLeuProLysGluSerTyrIleGlnGluAspAsp                          
2015 2020 2025
ATTTATGATGATTCCCAAGAAGCTGAAGTTATCCAGTCTCTGCTGGAT6206                      
IleTyrAspAspSerGlnGluAlaGluValIleGlnSerLeuLeuAsp                          
2030 2035 2040
GTTGTTGATGAGGAAGCCCAGAATCTTTTAAACGAAAATAATGCAGCA6254                      
ValValAspGluGluAlaGlnAsnLeuLeuAsnGluAsnAsnAlaAla                          
2045 2050 2055
GGAGACTCCTGTGGTCCAGGTACATTAAAGATGAATGGGAAGTTATCA6302                      
GlyAspSerCysGlyProGlyThrLeuLysMetAsnGlyLysLeuSer                          
2060 2065 2070 2075
GAAGAGAGAACAGAAGATACAGACTGCGATGGTTCACCTTTACCTGAG6350                      
GluGluArgThrGluAspThrAspCysAspGlySerProLeuProGlu                          
2080 2085 2090
TATTTTACTGAGGCCACCAAAATGAATGGCTGTGAAGAATATTGTGAA6398                      
TyrPheThrGluAlaThrLysMetAsnGlyCysGluGluTyrCysGlu                          
2095 2100 2105
GAAAAAGTAAAAAGTGAAAGCTTAATTCAGAAGCCACAAGAAAAGAAG6446                      
GluLysValLysSerGluSerLeuIleGlnLysProGlnGluLysLys                          
2110 2115 2120
ACTGATGATGATGAAATAACATGGGGAAATGATGAGTTGCCAATAGAG6494                      
ThrAspAspAspGluIleThrTrpGlyAsnAspGluLeuProIleGlu                          
2125 2130 2135
AGAACAAACCATGAAGATTCTGATAAAGATCATTCCTTTCTGACAAAC6542                      
ArgThrAsnHisGluAspSerAspLysAspHisSerPheLeuThrAsn                          
2140 2145 2150 2155
GATGAGCTCGCTGTACTCCCTGTCGTCAAAGTGCTTCCCTCTGGTAAA6590                      
AspGluLeuAlaValLeuProValValLysValLeuProSerGlyLys                          
2160 2165 2170
TACACGGGTGCCAACTTAAAATCAGTCATTCGAGTCCTGCGGGGTTTG6638                      
TyrThrGlyAlaAsnLeuLysSerValIleArgValLeuArgGlyLeu                          
2175 2180 2185
CTAGATCAAGGAATTCCTTCTAAGGAGCTGGAGAATCTTCAAGAATTA6686                      
LeuAspGlnGlyIleProSerLysGluLeuGluAsnLeuGlnGluLeu                          
2190 2195 2200
AAACCTTTGGATCAGTGTCTAATTGGGCAAACTAAGGAAAACAGAAGG6734                      
LysProLeuAspGlnCysLeuIleGlyGlnThrLysGluAsnArgArg                          
2205 2210 2215
AAGAACAGATATAAAAATATACTTCCCTATGATGCTACAAGAGTGCCT6782                      
LysAsnArgTyrLysAsnIleLeuProTyrAspAlaThrArgValPro                          
2220 2225 2230 2235
CTTGGAGATGAAGGTGGCTATATCAATGCCAGCTTCATTAAGATACCA6830                      
LeuGlyAspGluGlyGlyTyrIleAsnAlaSerPheIleLysIlePro                          
2240 2245 2250
GTTGGGAAAGAAGAGTTCGTTTACATTGCCTGCCAAGGACCACTGCCT6878                      
ValGlyLysGluGluPheValTyrIleAlaCysGlnGlyProLeuPro                          
2255 2260 2265
ACAACTGTTGGAGACTTCTGGCAGATGATTTGGGAGCAAAAATCCACA6926                      
ThrThrValGlyAspPheTrpGlnMetIleTrpGluGlnLysSerThr                          
2270 2275 2280
GTGATAGCCATGATGACTCAAGAAGTAGAAGGAGAAAAAATCAAATGC6974                      
ValIleAlaMetMetThrGlnGluValGluGlyGluLysIleLysCys                          
2285 2290 2295
CAGCGCTATTGGCCCAACATCCTAGGCAAAACAACAATGGTCAGCAAC7022                      
GlnArgTyrTrpProAsnIleLeuGlyLysThrThrMetValSerAsn                          
2300 2305 2310 2315
AGACTTCGACTGGCTCTTGTGAGAATGCAGCAGCTGAAGGGCTTTGTG7070                      
ArgLeuArgLeuAlaLeuValArgMetGlnGlnLeuLysGlyPheVal                          
2320 2325 2330
GTGAGGGCAATGACCCTTGAAGATATTCAGACCAGAGAGGTGCGCCAT7118                      
ValArgAlaMetThrLeuGluAspIleGlnThrArgGluValArgHis                          
2335 2340 2345
ATTTCTCATCTGAATTTCACTGCCTGGCCAGACCATGATACACCTTCT7166                      
IleSerHisLeuAsnPheThrAlaTrpProAspHisAspThrProSer                          
2350 2355 2360
CAACCAGATGATCTGCTTACTTTTATCTCCTACATGAGACACATCCAC7214                      
GlnProAspAspLeuLeuThrPheIleSerTyrMetArgHisIleHis                          
2365 2370 2375
AGATCAGGCCCAATCATTACGCACTGCAGTGCTGGCATTGGACGTTCA7262                      
ArgSerGlyProIleIleThrHisCysSerAlaGlyIleGlyArgSer                          
2380 2385 2390 2395
GGGACCCTGATTTGCATAGATGTGGTTCTGGGATTAATCAGTCAGGAT7310                      
GlyThrLeuIleCysIleAspValValLeuGlyLeuIleSerGlnAsp                          
2400 2405 2410
CTTGATTTTGACATCTCTGATTTGGTGCGCTGCATGAGACTACAAAGA7358                      
LeuAspPheAspIleSerAspLeuValArgCysMetArgLeuGlnArg                          
2415 2420 2425
CACGGAATGGTTCAGACAGAGGATCAATATATTTTCTGCTATCAAGTC7406                      
HisGlyMetValGlnThrGluAspGlnTyrIlePheCysTyrGlnVal                          
2430 2435 2440
ATCCTTTATGTCCTGACACGTCTTCAAGCAGAAGAAGAGCAAAAACAG7454                      
IleLeuTyrValLeuThrArgLeuGlnAlaGluGluGluGlnLysGln                          
2445 2450 2455
CAGCCTCAGCTTCTGAAGTGACATGAAAAGAGCCTCTGGATGCATTTC7502                      
GlnProGlnLeuLeuLys                                                        
2460 2465
CATTTCTCTCCTTAACCTCCAGCAGACTCCTGCTCTCTATCCAAATAAAGATCACAGAGC7562          
AGNAAGTTCATACAACATGCATGTTCTCCTCTATCTTAGAGGGGTATTCTTCTTGAAAAT7622          
AAAAAATATTGAAATGCTGTATTTTTACAGCTACTTTAACCTATGATAATTATTTACAAA7682          
ATTTTAACACTAACCAAACAATGCAGATCTTAGGGATGATTAAAGGCAGCATTGATGATA7742          
GCAAGACATTGTTACAAGGACATGGTGAGTCTATTTTTAATGCACCAATCTTGTTTATAG7802          
CAAAAATGTTTTCCAATATTTTAATAAAGTAGTTATTTTATAGGGCATACTTGAAACCAG7862          
TATTTAAGCTTTAAATGACAGTAATATTGGCATAGAAAAAAGTAGCAAATGTTTACTGTA7922          
TCAATTTCTAATGTTTACTATATAGAATTTCCTGTAATATATTTATATACTTTTTCATGA7982          
AAATGGAGTTATCAGTTATCTGTTTGTTACTGCATCATCTGTTTGTAATCATTATCTC8040            
(2) INFORMATION FOR SEQ ID NO:2:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 3090 base pairs                                               
(B) TYPE: nucleic acid                                                    
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: cDNA to mRNA                                          
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(vi) ORIGINAL SOURCE:                                                     
(A) ORGANISM: HOMO SAPIENS                                                
(ix) FEATURE:                                                             
(A) NAME/KEY: CDS                                                         
(B) LOCATION: 1311..2420                                                  
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                   
GAATTCCGGATTTACCTCAGTCTGTATCCCTTGAATAGCTCACAATAATCGACACATGCA60            
GCTGGGGACTGTGGGTGGGATACTTAGGTGTGGGACACCATATCTTCCAGCAGTAATAAA120           
GAAGTCAGGTGGGAATATGTAACATCTTGAGTGCTCATCCAGGTAGGTACTAAGGTATGA180           
TCAACTCTATGGAAGATCGATTAGGAAACTCCCTGAAAGAGAGTTCAGCCTGAAGAGAGA240           
ACCAAAGGCCAACATCTTGGAGCTGGCTACAGGACAGTAGGATGTAAGCTCGAGGGGAGG300           
AGAGGGTTAGGCGCAGTGGCTCACGCCTGTAGTCCCAACCATTTGGGAGGCTGAGGCAGG360           
CAGATCGCTTGAGCCCGGGGGTTCAAGACCAGCCTGGGCAACATGGCGAAACCCCATCTC420           
TACAAAAAAATACAAAAAAAATGTAGCTGCGTGTGGTGGCATGCACCTGTAGTCACAGCC480           
ACCACAGAGGTTGAGGTGGGAGGACTGCTTGAGCCTGGGAGGTGGAGGCTGCAGCGAACC540           
GAGATTGTGCCACTGCACTCCAGGATGGGCGACAGAGTGAGACCCGGACAGAGTGAGACC600           
CTGTCTCATTCATTCATTCATAAATAAGAAGAGGGGGAAAACGGGTGCCCAGATTGCTCT660           
CAGGCTCCTCCTCCCTTTCAGCTGGTACTTAACCACTCTTAACTTCAGCCTGCTCATGAA720           
TGAAATGGGAATGACAATTCCTAACTCAGGCAGTTTTTGCAAAGACCAGAGAAAATCATG780           
TATTAATACTAGTACCCAGCACCATTCCAAACATACAATACAAATGCCCCATAAATGACA840           
GCCAAGGTAACTGTTCTTTGCTTCCTCTCTTAGGAGACGTGTGAGGTTCTCTGTTGCTCC900           
TTTTGACTCCCAACTCCTGCTACAATGACTGATTTGACACTGATTACCTCACAGTACACA960           
CTGGGTGCTGGCCAACTGCAGCATGCTACGTATCCCACACCCCCTCCCTGAGTGGTGGGA1020          
CATTAATGGTGGGATGGTAGAATGTGCAGTCCGGTCTTGTACATTGAGTGTTAAACCTAC1080          
AATGTTTTGGATGATAGAAGGGACATTCCATCTTCTTACAAGCAGGGAAGTAACGGCAGA1140          
GCTGACTACTGGAAGGTGGTGCTGGTGGTGCAACAGGTTCTGGAGTTAAAACCAATGGAA1200          
AAGAAAGATTTCAGCTTTCCTTAAGACAAGACAAAGAGAAAAACCAGGAGATCCACCTAT1260          
CGCCCATCACATTACAGCCAGCACTGTCCGAGGCAAAGACAGTCCACAGCATGGTC1316              
MetVal                                                                    
CAACCTGAGCAGGCCCCAAAGGTACTGAATGTTGTCGTGGACCCTCAA1364                      
GlnProGluGlnAlaProLysValLeuAsnValValValAspProGln                          
5 10 15
GGCCGAGGTGCTCCTGAGATCAAAGCTACCACCGCTACCTCTGTTTGC1412                      
GlyArgGlyAlaProGluIleLysAlaThrThrAlaThrSerValCys                          
20 25 30
CCTTCTCCTTTCAAAATGAAGCCCATAGGACTTCAAGAGAGAAGAGGG1460                      
ProSerProPheLysMetLysProIleGlyLeuGlnGluArgArgGly                          
35 40 45 50
TCCAACGTATCTCTTACATTGGACATGAGTAGCTTGGGGAACATTGAA1508                      
SerAsnValSerLeuThrLeuAspMetSerSerLeuGlyAsnIleGlu                          
55 60 65
CCCTTTGTGTCTATACCAACACCACGGGAGAAGGTAGCAATGGAGTAT1556                      
ProPheValSerIleProThrProArgGluLysValAlaMetGluTyr                          
70 75 80
CTGCAGTCAGCCAGCCGAATTCTCGACAAGGTTCAGCTGAGGGACGTC1604                      
LeuGlnSerAlaSerArgIleLeuAspLysValGlnLeuArgAspVal                          
85 90 95
GTGGCAAGTTCACATTTACTCCAAAGTGAATTCATGGAAATACCAATG1652                      
ValAlaSerSerHisLeuLeuGlnSerGluPheMetGluIleProMet                          
100 105 110
AACTTTGTGGATCCCAAAGAAATTGATATTCCGCGTCATGGAACTAAA1700                      
AsnPheValAspProLysGluIleAspIleProArgHisGlyThrLys                          
115 120 125 130
AATCGCTATAAGACCATTTTACCAAATCCCCTCAGCAGAGTGTGTTTA1748                      
AsnArgTyrLysThrIleLeuProAsnProLeuSerArgValCysLeu                          
135 140 145
AGACCAAAAAATGTAACCGATTCATTGAGCACCTACATTAATGCTAAT1796                      
ArgProLysAsnValThrAspSerLeuSerThrTyrIleAsnAlaAsn                          
150 155 160
TATATTAGGGGCTACAGTGGCAAGGAGAAAGCCTTCATTGCCACGCAG1844                      
TyrIleArgGlyTyrSerGlyLysGluLysAlaPheIleAlaThrGln                          
165 170 175
GGCCCCATGATCAACACCGTGGATGATTTCTGGCAGATGGTTTGGCAG1892                      
GlyProMetIleAsnThrValAspAspPheTrpGlnMetValTrpGln                          
180 185 190
GAAGACAGCCCTGTGATTGTTATGATCACAAAACTCAAAGAAAAAAAT1940                      
GluAspSerProValIleValMetIleThrLysLeuLysGluLysAsn                          
195 200 205 210
GAGAAATGTGTGCTATACTGGCCGGAAAAGAGAGGGATATATGGAAAA1988                      
GluLysCysValLeuTyrTrpProGluLysArgGlyIleTyrGlyLys                          
215 220 225
GTTGAGGTTCTGGTTATCAGTGTAAATGAATGTGATAACTACACCATT2036                      
ValGluValLeuValIleSerValAsnGluCysAspAsnTyrThrIle                          
230 235 240
CGAAACCTTGTCTTAAAGCAAGGAAGCCACACCCAACATGTGAGCAAT2084                      
ArgAsnLeuValLeuLysGlnGlySerHisThrGlnHisValSerAsn                          
245 250 255
TACTGGTACACCTCATGGCCTGATCACAAGACTCCAGACAGTGCCCAG2132                      
TyrTrpTyrThrSerTrpProAspHisLysThrProAspSerAlaGln                          
260 265 270
CCCCTCCTACAGCTCATGCTGGATGTAGAAGAAGACAGACTTGCTTCC2180                      
ProLeuLeuGlnLeuMetLeuAspValGluGluAspArgLeuAlaSer                          
275 280 285 290
CAGGGGCCGAGGGCTGTGGTTGTCCACTGCAGTGCAGGAATAGGTAGA2228                      
GlnGlyProArgAlaValValValHisCysSerAlaGlyIleGlyArg                          
295 300 305
ACAGGGTGTTTTATTGCTACATCCATTGGCTGTCAACAGCTGAAAGAA2276                      
ThrGlyCysPheIleAlaThrSerIleGlyCysGlnGlnLeuLysGlu                          
310 315 320
GAAGGAGTTGTGGATGCACTAAGCATTGTCTGCCAGCTTCGTATGGAT2324                      
GluGlyValValAspAlaLeuSerIleValCysGlnLeuArgMetAsp                          
325 330 335
AGAGGTGGAATGGTGCAAACCAGTGAGCAGTATGAATTTGTGCACCAT2372                      
ArgGlyGlyMetValGlnThrSerGluGlnTyrGluPheValHisHis                          
340 345 350
GCTCTGTGCCTGTATGAGAGCAGACTTTCAGCAGAGACT                                   
AlaLeuCysLeuTyrGluSerArgLeuSerAlaGluThr                                   
355 360 365
GTCCAGTGAGTCATTG2427                                                      
ValGln                                                                    
370                                                                       
AAGACTTGTCAGACCATCAATCTCTTGGGGTGATTAACAAATTACCCACCCAAGGCTTCA2487          
TGAAGGAGCTTCCTGCAATGGAAGGAAGGAGAAGCTCTGAAGCCCATGTATGGCATGGAT2547          
TGTGGAAGACTGGGCAACATATTTAAGATTTCCAGCTCCTTGTGTATATGAATGCATTTG2607          
TAAGCATCCCCCAAATTATTCTGAAGGTTTTTTGATGATGGAGGTATGATAGGTTTATCA2667          
CACAGCCTAAGGCAGATTTTGTTTTGTCTGTACTGACTCTATCTGCCACACAGAATGTAT2727          
GTATGTAATATTCAGTAATAAATGTCATCAGGTGATGACTGGATGAGCTGCTGAAGACAT2787          
TCGTATTATGTGTTAGATGCTTTAATGTTTGCAAAATCTGTCTTGTGAATGGACTGTCAG2847          
CTGTTAAACTGTTCCTGTTTTGAAGTGCTATTACCTTTCTCAGTTACCAGAATCTTGCTG2907          
CTAAAGTTGCAAGTGATTGATAATGGATTTTTAACAGAGAAGTCTTTGTTTTTGAAAAAC2967          
AAAAATCAAAAACAGTAACTATTTTATATGGAAATGTGTCTTGATAATATTACCTATTAA3027          
ATGTGTATTTATAGTCCCTCCTATCAAACAATTACAGAGCACAATGATTGTCATCCGGAA3087          
TTC3090                                                                   
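The CDS feature of SEQ ID NO:2 (location 1311..2420) and the codon-by-codon translation printed under each coding row can be spot-checked with a short script. The following is a minimal sketch, not part of the original listing: the names `CODON_TABLE`, `translate`, and `cds_length` are hypothetical helpers introduced here for illustration, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, three-letter residue names, in T/C/A/G codon order.
# "Ter" marks a stop codon. (Assumption: the standard code applies.)
BASES = "TCAG"
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Ter Ter Cys Cys Ter Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def cds_length(location: str) -> int:
    """Length in nucleotides of an inclusive 'start..end' feature location."""
    start, end = (int(x) for x in location.split(".."))
    return end - start + 1


# First full coding row of SEQ ID NO:2 (nucleotides 1317..1364).
row = "CAACCTGAGCAGGCCCCAAAGGTACTGAATGTTGTCGTGGACCCTCAA"
print(translate(row))
print(cds_length("1311..2420"), cds_length("1311..2420") % 3)
```

The translated row can be compared against the residue line printed beneath it in the listing, and a CDS length that is a multiple of three is consistent with an intact reading frame.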
(2) INFORMATION FOR SEQ ID NO:3:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 2465 amino acids                                              
(B) TYPE: amino acid                                                      
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: protein                                               
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:                                   
MetHisValSerLeuAlaGluAlaLeuGluValArgGlyGlyProLeu                          
1 5 10 15
GlnGluGluGluIleTrpAlaValLeuAsnGlnSerAlaGluSerLeu                          
20 25 30
GlnGluLeuPheArgLysValSerLeuAlaAspProAlaAlaLeuGly                          
35 40 45
PheIleIleSerProTrpSerLeuLeuLeuLeuProSerGlySerVal                          
50 55 60
SerPheThrAspGluAsnIleSerAsnGlnAspLeuArgAlaPheThr                          
65 70 75 80
AlaProGluValLeuGlnAsnGlnSerLeuThrSerLeuSerAspVal                          
85 90 95
GluLysIleHisIleTyrSerLeuGlyMetThrLeuTyrTrpGlyAla                          
100 105 110
AspTyrGluValProGlnSerGlnProIleLysLeuGlyAspHisLeu                          
115 120 125
AsnSerIleLeuLeuGlyMetCysGluAspValIleTyrAlaArgVal                          
130 135 140
SerValArgThrValLeuAspAlaCysSerAlaHisIleArgAsnSer                          
145 150 155 160
AsnCysAlaProSerPheSerTyrValLysHisLeuValLysLeuVal                          
165 170 175
LeuGlyAsnLeuSerGlyThrAspGlnLeuSerCysAsnSerGluGln                          
180 185 190
LysProAspArgSerGlnAlaIleArgAspArgLeuArgGlyLysGly                          
195 200 205
LeuProThrGlyArgSerSerThrSerAspValLeuAspIleGlnLys                          
210 215 220
ProProLeuSerHisGlnThrPheLeuAsnLysGlyLeuSerLysSer                          
225 230 235 240
MetGlyPheLeuSerIleLysAspThrGlnAspGluAsnTyrPheLys                          
245 250 255
AspIleLeuSerAspAsnSerGlyArgGluAspSerGluAsnThrPhe                          
260 265 270
CysProTyrGlnPheLysThrSerGlyProGluLysLysProIlePro                          
275 280 285
GlyIleAspValLeuSerLysLysLysIleTrpAlaSerSerMetAsp                          
290 295 300
LeuLeuCysThrAlaAspArgAspPheSerSerGlyGluThrAlaThr                          
305 310 315 320
TyrArgArgCysHisProGluAlaValThrValArgThrSerThrThr                          
325 330 335
ProArgLysLysGluAlaArgTyrSerAspGlySerIleAlaLeuAsp                          
340 345 350
IlePheGlyProGlnLysMetAspProIleTyrHisThrArgGluLeu                          
355 360 365
ProThrSerSerAlaIleSerSerAlaLeuAspArgIleArgGluArg                          
370 375 380
GlnLysLysLeuGlnValLeuArgGluAlaMetAsnValGluGluPro                          
385 390 395 400
ValArgArgTyrLysThrTyrHisGlyAspValPheSerThrSerSer                          
405 410 415
GluSerProSerIleIleSerSerGluSerAspPheArgGlnValArg                          
420 425 430
ArgSerGluAlaSerLysArgPheGluSerSerSerGlyLeuProGly                          
435 440 445
ValAspGluThrLeuSerGlnGlyGlnSerGlnArgProSerArgGln                          
450 455 460
TyrGluThrProPheGluGlyAsnLeuIleAsnGlnGluIleMetLeu                          
465 470 475 480
LysArgGlnGluGluGluLeuMetGlnLeuGlnAlaLysMetAlaLeu                          
485 490 495
ArgGlnSerArgLeuSerLeuTyrProGlyAspThrIleLysAlaSer                          
500 505 510
MetLeuAspIleThrArgAspProLeuArgGluIleAlaLeuGluThr                          
515 520 525
AlaMetThrGlnArgLysLeuArgAsnPhePheGlyProGluPheVal                          
530 535 540
LysMetThrIleGluProPheIleSerLeuAspLeuProArgSerIle                          
545 550 555 560
LeuThrLysLysGlyLysAsnGluAspAsnArgArgLysValAsnIle                          
565 570 575
MetLeuLeuAsnGlyGlnArgLeuGluLeuThrCysAspThrLysThr                          
580 585 590
IleCysLysAspValPheAspMetValValAlaHisIleGlyLeuVal                          
595 600 605
GluHisHisLeuPheAlaLeuAlaThrLeuLysAspAsnGluTyrPhe                          
610 615 620
PheValAspProAspLeuLysLeuThrLysValAlaProGluGlyTrp                          
625 630 635 640
LysGluGluProLysLysLysThrLysAlaThrValAsnPheThrLeu                          
645 650 655
PhePheArgIleLysPhePheMetAspAspValSerLeuIleGlnHis                          
660 665 670
ThrLeuThrCysHisGlnTyrTyrLeuGlnLeuArgLysAspIleLeu                          
675 680 685
GluGluArgMetHisCysAspAspGluThrSerLeuLeuLeuAlaSer                          
690 695 700
LeuAlaLeuGlnAlaGluTyrGlyAspTyrGlnProGluValHisGly                          
705 710 715 720
ValSerTyrPheArgMetGluHisTyrLeuProAlaArgValMetGlu                          
725 730 735
LysLeuAspLeuSerTyrIleLysGluGluLeuProLysLeuHisAsn                          
740 745 750
ThrTyrValGlyAlaSerGluLysGluThrGluLeuGluPheLeuLys                          
755 760 765
ValCysGlnArgLeuThrGluTyrGlyValHisPheHisArgValHis                          
770 775 780
ProGluLysLysSerGlnThrGlyIleLeuLeuGlyValCysSerLys                          
785 790 795 800
GlyValLeuValPheGluValHisAsnGlyValArgThrLeuValLeu                          
805 810 815
ArgPheProTrpArgGluThrLysLysIleSerPheSerLysLysLys                          
820 825 830
IleThrLeuGlnAsnThrSerAspGlyIleLysHisGlyPheGlnThr                          
835 840 845
AspAsnSerLysIleCysGlnTyrLeuLeuHisLeuCysSerTyrGln                          
850 855 860
HisLysPheGlnLeuGlnMetArgAlaArgGlnSerAsnGlnAspAla                          
865 870 875 880
GlnAspIleGluArgAlaSerPheArgSerLeuAsnLeuGlnAlaGlu                          
885 890 895
SerValArgGlyPheAsnMetGlyArgAlaIleSerThrGlySerLeu                          
900 905 910
AlaSerSerThrLeuAsnLysLeuAlaValArgProLeuSerValGln                          
915 920 925
AlaGluIleLeuLysArgLeuSerCysSerGluLeuSerLeuTyrGln                          
930 935 940
ProLeuGlnAsnSerSerLysGluLysAsnAspLysAlaSerTrpGlu                          
945 950 955 960
GluLysProArgGluMetSerLysSerTyrHisAspLeuSerGlnAla                          
965 970 975
SerLeuTyrProHisArgLysAsnValIleValAsnMetGluProPro                          
980 985 990
ProGlnThrValAlaGluLeuValGlyLysProSerHisGlnMetSer                          
995 1000 1005
ArgSerAspAlaGluSerLeuAlaGlyValThrLysLeuAsnAsnSer                          
1010 1015 1020
LysSerValAlaSerLeuAsnArgSerProGluArgArgLysHisGlu                          
1025 1030 1035 1040
SerAspSerSerSerIleGluAspProGlyGlnAlaTyrValLeuAsp                          
1045 1050 1055
ValLeuHisLysArgTrpSerIleValSerSerProGluArgGluIle                          
1060 1065 1070
ThrLeuValAsnLeuLysLysAspAlaLysTyrGlyLeuGlyPheGln                          
1075 1080 1085
IleIleGlyGlyGluLysMetGluThrAspLeuGlyIlePheIleSer                          
1090 1095 1100
SerValAlaProGlyGlyProAlaAspPheHisGlyCysLeuLysPro                          
1105 1110 1115 1120
GlyAspArgLeuIleSerValAsnSerValSerLeuGluGlyValSer                          
1125 1130 1135
HisHisAlaAlaIleGluIleLeuGlnAsnAlaProGluAspValThr                          
1140 1145 1150
LeuValIleSerGlnProLysGluLysIleSerLysValProSerThr                          
1155 1160 1165
ProValHisLeuThrAsnGluMetLysAsnTyrMetLysLysSerSer                          
1170 1175 1180
TyrMetGlnAspSerAlaIleAspSerSerSerLysAspHisHisTrp                          
1185 1190 1195 1200
SerArgGlyThrLeuArgHisIleSerGluAsnSerPheGlyProSer                          
1205 1210 1215
GlyGlyLeuArgGluGlySerLeuSerSerGlnAspSerArgThrGlu                          
1220 1225 1230
SerAlaSerLeuSerGlnSerGlnValAsnGlyPhePheAlaSerHis                          
1235 1240 1245
LeuGlyAspGlnThrTrpGlnGluSerGlnHisGlySerProSerPro                          
1250 1255 1260
SerValIleSerLysAlaThrGluLysGluThrPheThrAspSerAsn                          
1265 1270 1275 1280
GlnSerLysThrLysLysProGlyIleSerAspValThrAspTyrSer                          
1285 1290 1295
AspArgGlyAspSerAspMetAspGluAlaThrTyrSerSerSerGln                          
1300 1305 1310
AspHisGlnThrProLysGlnGluSerSerSerSerValAsnThrSer                          
1315 1320 1325
AsnLysMetAsnPheLysThrPheSerSerSerProProLysProGly                          
1330 1335 1340
AspIlePheGluValGluLeuAlaLysAsnAspAsnSerLeuGlyIle                          
1345 1350 1355 1360
SerValThrGlyGlyValAsnThrSerValArgHisGlyGlyIleTyr                          
1365 1370 1375
ValLysAspValIleProGlnGlyAlaAlaGluSerAspGlyArgIle                          
1380 1385 1390
HisLysGlyAspArgValLeuAlaValAsnGlyValSerLeuGluGly                          
1395 1400 1405
AlaThrHisLysGlnAlaValGluThrLeuArgAsnThrGlyGlnVal                          
1410 1415 1420
ValHisLeuLeuLeuGluLysGlyGlnSerProThrSerLysGluHis                          
1425 1430 1435 1440
ValProValThrProGlnCysThrLeuSerAspGlnAsnAlaGlnGly                          
1445 1450 1455
GlnGlyProGluLysValLysLysThrThrGlnValLysAspTyrSer                          
1460 1465 1470
PheValThrGluGluAsnThrPheGluValLysLeuPheLysAsnSer                          
1475 1480 1485
SerGlyLeuGlyPheSerPheSerArgGluAspAsnLeuIleProGlu                          
1490 1495 1500
GlnIleAsnAlaSerIleValArgValLysLysLeuPheAlaGlyGln                          
1505 1510 1515 1520
ProAlaAlaGluSerGlyLysIleAspValGlyAspValIleLeuLys                          
1525 1530 1535
ValAsnGlyAlaSerLeuLysGlyLeuSerGlnGlnGluValIleSer                          
1540 1545 1550
AlaLeuArgGlyThrAlaProGluValPheLeuLeuLeuCysArgPro                          
1555 1560 1565
ProProGlyValLeuProGluIleAspThrAlaLeuLeuThrProLeu                          
1570 1575 1580
GlnSerProAlaGlnValLeuProAsnSerSerLysAspSerSerGln                          
1585 1590 1595 1600
ProSerCysValGluGlnSerThrSerSerAspGluAsnGluMetSer                          
1605 1610 1615
AspLysSerLysLysGlnCysLysSerProSerArgArgAspSerTyr                          
1620 1625 1630
SerAspSerSerGlySerGlyGluAspAspLeuValThrAlaProAla                          
1635 1640 1645
AsnIleSerAsnSerThrTrpSerSerAlaLeuHisGlnThrLeuSer                          
1650 1655 1660
AsnMetValSerGlnAlaGlnSerHisHisGluAlaProLysSerGln                          
1665 1670 1675 1680
GluAspThrIleCysThrMetPheTyrTyrProGlnLysIleProAsn                          
1685 1690 1695
LysProGluPheGluAspSerAsnProSerProLeuProProAspMet                          
1700 1705 1710
AlaProGlyGlnSerTyrGlnProGlnSerGluSerAlaSerSerSer                          
1715 1720 1725
SerMetAspLysTyrHisIleHisHisIleSerGluProThrArgGln                          
1730 1735 1740
GluAsnTrpThrProLeuLysAsnAspLeuGluAsnHisLeuGluAsp                          
1745 1750 1755 1760
PheGluLeuGluValGluLeuLeuIleThrLeuIleLysSerGluLys                          
1765 1770 1775
AlaSerLeuGlyPheThrValThrLysGlyAsnGlnArgIleGlyCys                          
1780 1785 1790
TyrValHisAspValIleGlnAspProAlaLysSerAspGlyArgLeu                          
1795 1800 1805
LysProGlyAspArgLeuIleLysValAsnAspThrAspValThrAsn                          
1810 1815 1820
MetThrHisThrAspAlaValAsnLeuLeuArgAlaAlaSerLysThr                          
1825 1830 1835 1840
ValArgLeuValIleGlyArgValProArgIleThrGlnAsnThrAsn                          
1845 1850 1855
ValAlaSerPheAlaThrGlyHisLysLeuThrCysAsnLysGluGlu                          
1860 1865 1870
LeuGlyPheSerLeuCysGlyGlyHisAspSerLeuTyrGlnValVal                          
1875 1880 1885
TyrIleSerAspIleAsnProArgSerValAlaAlaIleGluGlyAsn                          
1890 1895 1900
LeuGlnLeuLeuAspValIleHisTyrValAsnGlyValSerThrGln                          
1905 1910 1915 1920
GlyMetThrLeuGluGluValAsnArgAlaLeuAspMetSerLeuPro                          
1925 1930 1935
SerLeuValLeuLysAlaThrArgAsnAspLeuProValValProSer                          
1940 1945 1950
SerLysArgSerAlaValSerAlaProLysSerThrLysGlyAsnGly                          
1955 1960 1965
SerTyrSerValGlySerCysSerGlnProAlaLeuThrProAsnAsp                          
1970 1975 1980
SerPheSerThrValAlaGlyGluGluIleAsnGluIleSerTyrPro                          
1985 1990 1995 2000
LysGlyLysCysSerThrTyrGlnIleLysGlySerProAsnLeuThr                          
2005 2010 2015
LeuProLysGluSerTyrIleGlnGluAspAspIleTyrAspAspSer                          
2020 2025 2030
GlnGluAlaGluValIleGlnSerLeuLeuAspValValAspGluGlu                          
2035 2040 2045
AlaGlnAsnLeuLeuAsnGluAsnAsnAlaAlaGlyAspSerCysGly                          
2050 2055 2060
ProGlyThrLeuLysMetAsnGlyLysLeuSerGluGluArgThrGlu                          
2065 2070 2075 2080
AspThrAspCysAspGlySerProLeuProGluTyrPheThrGluAla                          
2085 2090 2095
ThrLysMetAsnGlyCysGluGluTyrCysGluGluLysValLysSer                          
2100 2105 2110
GluSerLeuIleGlnLysProGlnGluLysLysThrAspAspAspGlu                          
2115 2120 2125
IleThrTrpGlyAsnAspGluLeuProIleGluArgThrAsnHisGlu                          
2130 2135 2140
AspSerAspLysAspHisSerPheLeuThrAsnAspGluLeuAlaVal                          
2145 2150 2155 2160
LeuProValValLysValLeuProSerGlyLysTyrThrGlyAlaAsn                          
2165 2170 2175
LeuLysSerValIleArgValLeuArgGlyLeuLeuAspGlnGlyIle                          
2180 2185 2190
ProSerLysGluLeuGluAsnLeuGlnGluLeuLysProLeuAspGln                          
2195 2200 2205
CysLeuIleGlyGlnThrLysGluAsnArgArgLysAsnArgTyrLys                          
2210 2215 2220
AsnIleLeuProTyrAspAlaThrArgValProLeuGlyAspGluGly                          
2225 2230 2235 2240
GlyTyrIleAsnAlaSerPheIleLysIleProValGlyLysGluGlu                          
2245 2250 2255
PheValTyrIleAlaCysGlnGlyProLeuProThrThrValGlyAsp                          
2260 2265 2270
PheTrpGlnMetIleTrpGluGlnLysSerThrValIleAlaMetMet                          
2275 2280 2285
ThrGlnGluValGluGlyGluLysIleLysCysGlnArgTyrTrpPro                          
2290 2295 2300
AsnIleLeuGlyLysThrThrMetValSerAsnArgLeuArgLeuAla                          
2305 2310 2315 2320
LeuValArgMetGlnGlnLeuLysGlyPheValValArgAlaMetThr                          
2325 2330 2335
LeuGluAspIleGlnThrArgGluValArgHisIleSerHisLeuAsn                          
2340 2345 2350
PheThrAlaTrpProAspHisAspThrProSerGlnProAspAspLeu                          
2355 2360 2365
LeuThrPheIleSerTyrMetArgHisIleHisArgSerGlyProIle                          
2370 2375 2380
IleThrHisCysSerAlaGlyIleGlyArgSerGlyThrLeuIleCys                          
2385 2390 2395 2400
IleAspValValLeuGlyLeuIleSerGlnAspLeuAspPheAspIle                          
2405 2410 2415
SerAspLeuValArgCysMetArgLeuGlnArgHisGlyMetValGln                          
2420 2425 2430
ThrGluAspGlnTyrIlePheCysTyrGlnValIleLeuTyrValLeu                          
2435 2440 2445
ThrArgLeuGlnAlaGluGluGluGlnLysGlnGlnProGlnLeuLeu                          
2450 2455 2460
Lys                                                                       
2465                                                                      
(2) INFORMATION FOR SEQ ID NO:4:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 369 amino acids                                               
(B) TYPE: amino acid                                                      
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: protein                                               
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:                                   
MetValGlnProGluGlnAlaProLysValLeuAsnValValValAsp                          
1 5 10 15
ProGlnGlyArgGlyAlaProGluIleLysAlaThrThrAlaThrSer                          
20 25 30
ValCysProSerProPheLysMetLysProIleGlyLeuGlnGluArg                          
35 40 45
ArgGlySerAsnValSerLeuThrLeuAspMetSerSerLeuGlyAsn                          
50 55 60
IleGluProPheValSerIleProThrProArgGluLysValAlaMet                          
65 70 75 80
GluTyrLeuGlnSerAlaSerArgIleLeuAspLysValGlnLeuArg                          
85 90 95
AspValValAlaSerSerHisLeuLeuGlnSerGluPheMetGluIle                          
100 105 110
ProMetAsnPheValAspProLysGluIleAspIleProArgHisGly                          
115 120 125
ThrLysAsnArgTyrLysThrIleLeuProAsnProLeuSerArgVal                          
130 135 140
CysLeuArgProLysAsnValThrAspSerLeuSerThrTyrIleAsn                          
145 150 155 160
AlaAsnTyrIleArgGlyTyrSerGlyLysGluLysAlaPheIleAla                          
165 170 175
ThrGlnGlyProMetIleAsnThrValAspAspPheTrpGlnMetVal                          
180 185 190
TrpGlnGluAspSerProValIleValMetIleThrLysLeuLysGlu                          
195 200 205
LysAsnGluLysCysValLeuTyrTrpProGluLysArgGlyIleTyr                          
210 215 220
GlyLysValGluValLeuValIleSerValAsnGluCysAspAsnTyr                          
225 230 235 240
ThrIleArgAsnLeuValLeuLysGlnGlySerHisThrGlnHisVal                          
245 250 255
SerAsnTyrTrpTyrThrSerTrpProAspHisLysThrProAspSer                          
260 265 270
AlaGlnProLeuLeuGlnLeuMetLeuAspValGluGluAspArgLeu                          
275 280 285
AlaSerGlnGlyProArgAlaValValValHisCysSerAlaGlyIle                          
290 295 300
GlyArgThrGlyCysPheIleAlaThrSerIleGlyCysGlnGlnLeu                          
305 310 315 320
LysGluGluGlyValValAspAlaLeuSerIleValCysGlnLeuArg                          
325 330 335
MetAspArgGlyGlyMetValGlnThrSerGluGlnTyrGluPheVal                          
340 345 350
HisHisAlaLeuCysLeuTyrGluSerArgLeuSerAlaGluThrVal                          
355 360 365
Gln                                                                       
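The residues in this listing are written as unbroken runs of three-letter codes. As a reading aid only (not part of the original disclosure), the following minimal Python sketch converts such a run to one-letter codes. The function name `to_one_letter` and the mapping table are illustrative assumptions; the example input is the first row of SEQ ID NO:4 as printed above.

```python
# Standard three-letter to one-letter amino acid codes, plus the
# ambiguity symbols used in this listing (Xaa, Glx) and Asx.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B", "Glx": "Z", "Xaa": "X",
}


def to_one_letter(run: str) -> str:
    """Convert a run such as 'MetValGln' to 'MVQ' (hypothetical helper)."""
    run = run.strip()
    if len(run) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    codes = [run[i:i + 3] for i in range(0, len(run), 3)]
    # A KeyError here points to a code that is not in the table.
    return "".join(THREE_TO_ONE[code] for code in codes)


# First printed row of SEQ ID NO:4 (residues 1 to 16).
row = "MetValGlnProGluGlnAlaProLysValLeuAsnValValValAsp"
print(to_one_letter(row))
```

Concatenating the converted rows of one entry and comparing the total length with the stated LENGTH field is a simple consistency check on a transcription.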
(2) INFORMATION FOR SEQ ID NO:5:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 8 amino acids                                                 
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(ix) FEATURE:                                                             
(A) NAME/KEY: Region                                                      
(B) LOCATION: 5                                                           
(D) OTHER INFORMATION: /note= "Xaa = I or V"                              
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:                                   
PheTrpArgMetXaaTrpGluGln                                                  
1 5
(2) INFORMATION FOR SEQ ID NO:6:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 23 base pairs                                                 
(B) TYPE: nucleic acid                                                    
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: cDNA                                                  
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:                                   
TTCTGGMGNATGATNTGGGAACA 23
(2) INFORMATION FOR SEQ ID NO:7:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 7 amino acids                                                 
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(ix) FEATURE:                                                             
(A) NAME/KEY: Region                                                      
(B) LOCATION: 3                                                           
(D) OTHER INFORMATION: /note= "Xaa = A or D"                              
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:                                   
LysCysXaaGlxTyrTrpPro                                                     
1 5
(2) INFORMATION FOR SEQ ID NO:8:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 20 base pairs                                                 
(B) TYPE: nucleic acid                                                    
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: cDNA                                                  
(iii) HYPOTHETICAL: NO                                                    
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:                                   
AARTGYGANCAGTAYTGGCC 20
(2) INFORMATION FOR SEQ ID NO:9:                                          
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 7 amino acids                                                 
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(ix) FEATURE:                                                             
(A) NAME/KEY: Region                                                      
(B) LOCATION: 6                                                           
(D) OTHER INFORMATION: /note= "Xaa = V or I"                              
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:                                   
HisCysSerAlaGlyXaaGly                                                     
1 5
(2) INFORMATION FOR SEQ ID NO:10:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 20 base pairs                                                 
(B) TYPE: nucleic acid                                                    
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: cDNA                                                  
(iii) HYPOTHETICAL: NO                                                    
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:                                  
CCNACNCCMGCRCTGCAGTG 20
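SEQ ID NOS: 6, 8 and 10 use IUPAC ambiguity symbols (M, R, Y, N), so each entry stands for a pool of oligonucleotides. The Python sketch below is an illustration added for the reader, not part of the original listing. It counts how many distinct sequences each degenerate entry represents and forms a reverse complement. The names `degeneracy` and `reverse_complement` are hypothetical helpers. Reading SEQ ID NO:10 as a reverse-strand oligonucleotide is an assumption, made here because its reverse complement is consistent with codons for the peptide of SEQ ID NO:9.

```python
from math import prod

# IUPAC nucleotide symbols and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC symbol (ambiguity codes map to ambiguity codes).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M", "S": "S", "W": "W",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}

# Sequences copied from the listing above.
PRIMERS = {
    "SEQ ID NO:6": "TTCTGGMGNATGATNTGGGAACA",
    "SEQ ID NO:8": "AARTGYGANCAGTAYTGGCC",
    "SEQ ID NO:10": "CCNACNCCMGCRCTGCAGTG",
}


def degeneracy(seq: str) -> int:
    """Number of distinct unambiguous sequences encoded by seq."""
    return prod(len(IUPAC[base]) for base in seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement that preserves IUPAC ambiguity symbols."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt, degeneracy", degeneracy(seq))

print(reverse_complement(PRIMERS["SEQ ID NO:10"]))
```

The printed lengths can be compared with the LENGTH fields of the corresponding entries (23, 20 and 20 base pairs).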
(2) INFORMATION FOR SEQ ID NO:11:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 303 amino acids                                               
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:                                  
ArgLysValAsnIleMetLeuLeuAsnGlyGlnArgLeuGluLeuThr                          
1 5 10 15
CysAspThrLysThrIleCysLysAspValPheAspMetValValAla                          
20 25 30
HisIleGlyLeuValGluHisHisLeuPheAlaLeuAlaThrLeuLys                          
35 40 45
AspAsnGluTyrPhePheValAspProAspLeuLysLeuThrLysVal                          
50 55 60
AlaProGluGlyTrpLysGluGluProLysLysLysThrLysAlaThr                          
65 70 75 80
ValAsnPheThrLeuPhePheArgIleLysPhePheMetAspAspVal                          
85 90 95
SerLeuIleGlnHisThrLeuThrCysHisGlnTyrTyrLeuGlnLeu                          
100 105 110
ArgLysAspIleLeuGluGluArgMetHisCysAspAspGluThrSer                          
115 120 125
LeuLeuLeuAlaSerLeuAlaLeuGlnAlaGluTyrGlyAspTyrGln                          
130 135 140
ProGluValHisGlyValSerTyrPheArgMetGluHisTyrLeuPro                          
145 150 155 160
AlaArgValMetGluLysLeuAspLeuSerTyrIleLysGluGluLeu                          
165 170 175
ProLysLeuHisAsnThrTyrValGlyAlaSerGluLysGluThrGlu                          
180 185 190
LeuGluPheLeuLysValCysGlnArgLeuThrGluTyrGlyValHis                          
195 200 205
PheHisArgValHisProGluLysLysSerGlnThrGlyIleLeuLeu                          
210 215 220
GlyValCysSerLysGlyValLeuValPheGluValHisAsnGlyVal                          
225 230 235 240
ArgThrLeuValLeuArgPheProTrpArgGluThrLysLysIleSer                          
245 250 255
PheSerLysLysLysIleThrLeuGlnAsnThrSerAspGlyIleLys                          
260 265 270
HisGlyPheGlnThrAspAsnSerLysIleCysGlnTyrLeuLeuHis                          
275 280 285
LeuCysSerTyrGlnHisLysPheGlnLeuGlnMetArgAlaArg                             
290 295 300
(2) INFORMATION FOR SEQ ID NO:12:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 296 amino acids                                               
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:                                  
IleAsnValArgValThrThrMetAspAlaGluLeuGluPheAlaIle                          
1 5 10 15
GlnProAsnThrThrGlyLysGlnLeuPheAspGlnValValLysThr                          
20 25 30
IleGlyLeuArgGluValTrpTyrPheGlyLeuHisTyrValAspAsn                          
35 40 45
LysGlyPheProThrTrpLeuLysLeuAspLysLysValSerAlaGln                          
50 55 60
GluValArgLysGluAsnProLeuGlnPheLysPheArgAlaLysPhe                          
65 70 75 80
TyrProGluAspValAlaGluGluLeuIleGlnAspIleThrGlnLys                          
85 90 95
LeuPhePheLeuGlnValLysGluGlyIleLeuSerAspGluIleTyr                          
100 105 110
CysProProGluThrAlaValLeuLeuGlySerTyrAlaValGlnAla                          
115 120 125
LysPheGlyAspTyrAsnLysGluValHisLysSerGlyTyrLeuSer                          
130 135 140
SerGluArgLeuIleProGlnArgValMetAspGlnHisLysLeuThr                          
145 150 155 160
ArgAspGlnTrpGluAspArgIleGlnValTrpHisAlaGluHisArg                          
165 170 175
GlyMetLeuLysAspAsnAlaMetLeuGluTyrLeuLysIleAlaGln                          
180 185 190
AspLeuGluMetTyrGlyIleAsnTyrPheGluIleLysAsnLysLys                          
195 200 205
GlyThrAspLeuTrpLeuGlyValAspAlaLeuGlyLeuAsnIleTyr                          
210 215 220
GluLysAspAspLysLeuThrProLysIleGlyPheProTrpSerGlu                          
225 230 235 240
IleArgAsnIleSerPheAsnAspLysLysPheValIleLysProIle                          
245 250 255
AspLysLysAlaProAspPheValPheTyrAlaProArgLeuArgIle                          
260 265 270
AsnLysArgIleLeuGlnLeuCysMetGlyAsnHisGluLeuTyrMet                          
275 280 285
ArgArgArgLysProAspThrIle                                                  
290 295
(2) INFORMATION FOR SEQ ID NO:13:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 247 amino acids                                               
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:                                  
MetHisCysLysValSerLeuLeuAspAspThrValTyrGluCysVal                          
1 5 10 15
ValGluLysHisAlaLysGlyGlnAspLeuLeuLysArgValCysGlu                          
20 25 30
HisLeuAsnLeuLeuGluGluAspTyrPheGlyLeuAlaIleTrpAsp                          
35 40 45
AsnAlaAspIleThrArgTyrTyrLeuCysLeuGlnLeuArgGlnAsp                          
50 55 60
IleValAlaGlyArgLeuProCysSerPheAlaThrLeuAlaLeuLeu                          
65 70 75 80
GlySerTyrThrIleGlnSerGluLeuGlyAspTyrAspProGluLeu                          
85 90 95
HisGlyValAspTyrValSerAspPheLysLeuAlaProAsnGlnThr                          
100 105 110
LysGluLeuGluGluLysValMetGluLeuHisLysSerTyrArgSer                          
115 120 125
MetThrProAlaGlnAlaAspLeuGluPheLeuGluAsnAlaLysLys                          
130 135 140
LeuSerMetTyrGlyValAspLeuHisLysAlaLysAspLeuGluGly                          
145 150 155 160
ValAspIleIleLeuGlyValCysSerSerGlyLeuLeuValTyrLys                          
165 170 175
AspLysLeuArgIleAsnArgPheProTrpProLysValLeuLysIle                          
180 185 190
SerTyrLysArgSerSerPhePheIleLysIleArgProGlyGluGln                          
195 200 205
GluGlnTyrGluSerThrIleGlyPheLysLeuProSerTyrArgAla                          
210215220                                                                 
AlaLysLysLeuTrpLysValCysValGluHisHisThrPhePheArg                          
225230235240                                                              
LeuThrSerThrAspThrIle                                                     
245                                                                       
(2) INFORMATION FOR SEQ ID NO:14:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 288 amino acids                                               
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:                                  
ValValCysAsnIleLeuLeuLeuAspAsnThrValGlnAlaPheLys
1 5 10 15
ValAsnLysHisAspGlnGlyGlnValLeuLeuAspValValPheLys
20 25 30
HisLeuAspLeuThrGluGlnAspTyrPheGlyLeuGlnLeuAlaAsp
35 40 45
AspSerThrAspAsnProArgTrpLeuAspProAsnLysProIleArg
50 55 60
LysGlnLeuLysArgGlySerProTyrSerLeuAsnPheArgValLys
65 70 75 80
PhePheValSerAspProAsnLysLeuGlnGluGluTyrThrArgTyr
85 90 95
GlnTyrPheLeuGlnIleLysGlnAspIleLeuThrGlyArgLeuPro
100 105 110
CysProSerAsnThrAlaAlaLeuLeuAlaSerPheAlaValGlnSer
115 120 125
GluLeuGlyAspTyrAspGlnSerGluAsnLeuSerGlyTyrLeuSer
130 135 140
AspTyrSerPheIleProAsnGlnProGlnAspPheGluLysGluIle
145 150 155 160
AlaLysLeuHisGlnGlnHisIleGlyLeuSerProAlaGluAlaGlu
165 170 175
PheAsnTyrLeuAsnThrAlaArgThrLeuGluLeuTyrGlyValGlu
180 185 190
PheHisTyrAlaArgAspGlnSerAsnAsnGluIleMetIleGlyVal
195 200 205
MetSerGlyGlyIleLeuIleTyrLysAsnArgValArgMetAsnThr
210 215 220
PheProTrpLeuLysIleValLysIleSerPheLysCysLysGlnPhe
225 230 235 240
PheIleGlnLeuArgLysGluLeuHisGluSerArgGluThrLeuLeu
245 250 255
GlyPheAsnMetValAsnTyrArgAlaCysLysAsnLeuTrpLysAla
260 265 270
CysValGluHisHisThrPhePheArgLeuAspArgProLeuProPro
275 280 285
(2) INFORMATION FOR SEQ ID NO:15:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 288 amino acids                                               
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:                                  
ValIleCysSerIleHisPheLeuAspGlyValValGlnThrPheLys
1 5 10 15
ValThrLysGlnAspThrGlyGlnValLeuLeuAspMetValHisAsn
20 25 30
HisLeuGlyValThrGluLysGluTyrPheGlyLeuGlnHisAspAsp
35 40 45
AspSerValAspSerProArgTrpLeuGluAlaSerLysProIleArg
50 55 60
LysGlnLeuLysGlyGlyPheProCysThrLeuHisPheArgValArg
65 70 75 80
PhePheIleProAspProAsnThrLeuGlnGlnGluGlnThrArgHis
85 90 95
LeuTyrPheLeuGlnLeuLysMetAspIleCysGluGlyArgLeuThr
100 105 110
CysProLeuAsnSerAlaValValLeuAlaSerTyrAlaValGlnSer
115 120 125
HisPheGlyAspTyrAsnSerSerIleHisHisProGlyTyrLeuSer
130 135 140
AspSerHisPheIleProAspGlnAsnGluAspPheLeuThrLysVal
145 150 155 160
GluSerLeuHisGluGlnHisSerGlyLeuLysGlnSerGluAlaGlu
165 170 175
SerCysTyrIleAsnIleAlaArgThrLeuAspPheTyrGlyValGlu
180 185 190
LeuHisSerGlyArgAspLeuHisAsnLeuAspLeuMetIleGlyIle
195 200 205
AlaSerAlaGlyValAlaValTyrArgLysTyrIleCysThrSerPhe
210 215 220
TyrProTrpValAsnIleLeuLysIleSerPheLysArgLysLysPhe
225 230 235 240
PheIleHisGlnArgGlnLysGlnAlaGluSerArgGluHisIleVal
245 250 255
AlaPheAsnMetLeuAsnTyrArgSerCysLysAsnLeuTrpLysSer
260 265 270
CysValGluHisHisThrPhePheGlnAlaLysLysLeuLeuProGln
275 280 285
(2) INFORMATION FOR SEQ ID NO:16:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 77 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:                                  
AspAlaLysTyrGlyLeuGlyPheGlnIleIleGlyGlyGluLysMet
1 5 10 15
GlyArgLeuAspLeuGlyIlePheIleSerSerValAlaProGlyGly
20 25 30
ProAlaAspPheHisGlyCysLeuLysProGlyAspArgLeuIleSer
35 40 45
ValAsnSerValSerLeuGluGlyValSerHisHisAlaAlaIleGlu
50 55 60
IleLeuGlnAsnAlaProGluAspValThrLeuValIle
65 70 75
(2) INFORMATION FOR SEQ ID NO:17:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 77 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:                                  
LysAsnAspAsnSerLeuGlyIleSerValThrGlyGlyValAsnThr
1 5 10 15
SerValArgHisGlyGlyIleTyrValLysAlaValIleProGlnGly
20 25 30
AlaAlaGluSerAspGlyArgIleHisLysGlyAspArgValLeuAla
35 40 45
ValAsnGlyValSerLeuGluGlyAlaThrHisLysGlnAlaValGlu
50 55 60
ThrLeuArgAsnThrGlyGlnValValHisLeuLeuLeu
65 70 75
(2) INFORMATION FOR SEQ ID NO:18:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 80 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:                                  
LysAsnSerSerGlyLeuGlyPheSerPheSerArgGluAspAsnLeu
1 5 10 15
IleProGluGlnIleAsnAlaSerIleValArgValLysLysLeuPhe
20 25 30
AlaGlyGlnProAlaAlaGluSerGlyLysIleAspValGlyAspVal
35 40 45
IleLeuLysValAsnGlyAlaSerLeuLysGlyLeuSerGlnGlnGlu
50 55 60
ValIleSerAlaLeuArgGlyThrAlaProGluValPheLeuLeuLeu
65 70 75 80
(2) INFORMATION FOR SEQ ID NO:19:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 72 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:                                  
SerGluLysAlaSerLeuGlyPheThrValThrLysGlyAsnGlnArg
1 5 10 15
IleGlyCysTyrValHisAspValIleGlnAspProAlaLysSerAsp
20 25 30
GlyArgLeuLysProGlyAspArgLeuIleLysValAsnAspThrAsp
35 40 45
ValThrAsnMetThrHisThrAspAlaValAsnLeuLeuArgAlaAla
50 55 60
SerLysThrValArgLeuValIle
65 70
(2) INFORMATION FOR SEQ ID NO:20:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 75 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:                                  
CysAsnLysAsxGluLeuGlyPheSerLeuCysGlyGlyHisAspSer
1 5 10 15
LeuTyrGlnValValTyrIleSerAspIleAsnProArgSerValAla
20 25 30
AlaIleGluGlyAsnLeuGlnLeuLeuAspValIleHisTyrValAsn
35 40 45
GlyValSerThrGlnGlyMetThrLeuGluGluValAsnArgAlaLeu
50 55 60
AspMetSerLeuProSerLeuValLeuLysAla
65 70 75
(2) INFORMATION FOR SEQ ID NO:21:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 75 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:                                  
AspGluAspGlyLysProGlyPheAsnLeuLysGlyGlyValAspGln
1 5 10 15
LysAsnProLeuValValSerArgIleAsnProSerSerProAlaAsp
20 25 30
ThrCysIleProLysLeuAsnGluGlyAspGlnIleValLeuIleAsn
35 40 45
GlyArgAspIleSerGluHisThrHisAspGlnValValMetPheIle
50 55 60
LysAlaSerArgGluSerHisSerArgGluLeu
65 70 75
(2) INFORMATION FOR SEQ ID NO:22:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 75 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:                                  
AspGluAsnGlyArgPheGlyPheAsnValLysGlyGlyTyrAspGln
1 5 10 15
LysMetProValIleValSerArgValAlaProGlnThrProAlaAsp
20 25 30
LeuCysValProArgLeuAsnGluGlyAspGlnValValLeuIleAsn
35 40 45
GlyArgAspIleAlaGluHisThrHisAspGlnValValLeuPheIle
50 55 60
LysAlaSerCysGluArgHisSerGlyGluLeu
65 70 75
(2) INFORMATION FOR SEQ ID NO:23:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 79 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:                                  
ArgGlyAsnSerGlyLeuGlyPheSerIleAlaGlyGlyThrAspAsn
1 5 10 15
ProHisIleGlyThrAspThrSerIleTyrIleThrLysLeuIleSer
20 25 30
GlyGlyAlaAlaAlaAlaAspGlyArgLeuSerIleAsnAspIleIle
35 40 45
ValSerValAsnAspValSerValValAspValProHisAlaSerAla
50 55 60
ValAspAlaLeuLysLysAlaGlyAsnValValLysLeuHisVal
65 70 75
(2) INFORMATION FOR SEQ ID NO:24:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 83 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:                                  
LysGlyGlyLysGlyLeuGlyPheSerIleAlaGlyGlyIleGlyAsn
1 5 10 15
GlnHisIleProGlyAspAsnGlyIleTyrValThrLysLeuThrAsp
20 25 30
GlyGlyArgAlaGlnValAspGlyArgLeuSerIleGlyAspLysLeu
35 40 45
IleAlaValArgThrAsnGlySerGluLysAsnLeuGluAsnValThr
50 55 60
HisGluLeuAlaValAlaThrLeuLysSerIleThrAspLysValThr
65 70 75 80
LeuIleIle
(2) INFORMATION FOR SEQ ID NO:25:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 73 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:                                  
LysGlyProGlnGlyLeuGlyPheAsnIleValGlyGlyGluAspGly
1 5 10 15
GlnGlyIleTyrValSerPheIleLeuAlaGlyGlyProAlaAspLeu
20 25 30
GlySerGluLeuLysArgGlyAspGlnLeuLeuSerValAsnAsnVal
35 40 45
AsnLeuThrHisAlaThrHisGluGluAlaAlaGlnAlaLeuLysThr
50 55 60
SerGlyGlyValValThrLeuLeuAla
65 70
(2) INFORMATION FOR SEQ ID NO:26:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 79 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:                                  
ArgGlyAsnSerGlyLeuGlyPheSerIleAlaGlyGlyThrAspAsn
1 5 10 15
ProHisIleGlyAspAspProSerIlePheIleThrLysIleIlePro
20 25 30
GlyGlyAlaAlaAlaGlnAspGlyArgLeuArgValAsnAspSerIle
35 40 45
LeuPheValAsnGluValAspValArgGluValThrHisSerAlaAla
50 55 60
ValGluAlaLeuLysGluAlaGlySerIleValArgLeuTyrVal
65 70 75
(2) INFORMATION FOR SEQ ID NO:27:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 79 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:                                  
LysGlyProLysGlyLeuGlyPheSerIleAlaGlyGlyValGlyAsn
1 5 10 15
GlnHisIleProGlyAspAsnSerIleTyrValThrLysIleIleGlu
20 25 30
GlyGlyAlaAlaHisLysAspGlyArgLeuGlnIleGlyAspLysIle
35 40 45
LeuAlaValAsnSerValGlyLeuGluAspValMetHisGluAspAla
50 55 60
ValAlaAlaLeuLysAsnThrTyrAspValValTyrLeuLysVal
65 70 75
(2) INFORMATION FOR SEQ ID NO:28:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 73 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:                                  
ArgGlySerThrGlyLeuGlyPheAsnIleValGlyGlyGluAspGly
1 5 10 15
GluGlyIlePheIleSerPheIleLeuAlaGlyGlyProAlaAspLeu
20 25 30
SerGlyGluLeuArgLysGlyAspGlnIleLeuSerValAsnGlyVal
35 40 45
AspLeuArgAsnAlaSerHisGluGlnAlaAlaIleAlaLeuLysAsn
50 55 60
AlaGlyGlnThrValThrIleIleAla
65 70
(2) INFORMATION FOR SEQ ID NO:29:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 78 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:                                  
HisArgAlaProGlyPheGlyIleAlaIleSerGlyGlyArgAspAsn
1 5 10 15
ProHisPheGlnSerGlyGluThrSerIleValIleSerAspValLeu
20 25 30
LysGlyGlyProAlaAsxGlyGlnLeuGlnGluAsnAsnArgValAla
35 40 45
MetValAsnGlyValSerMetAspAsnValGluHisAlaPheAlaVal
50 55 60
GlnGlnLeuArgLysSerGlyLysAsnAlaLysIleThrIle
65 70 75
(2) INFORMATION FOR SEQ ID NO:30:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 68 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:                                  
ArgLysAsnGluGluTyrGlyLeuArgProAlaSerHisIlePheVal
1 5 10 15
LysGluIleSerGlnAspSerLeuAlaAlaArgAspGlyAspIleGln
20 25 30
GluGlyAspValValLeuLysIleAsnGlyThrValThrGluAsnMet
35 40 45
SerLeuThrAspAlaLysThrLeuIleGluArgSerLysGlyLysLeu
50 55 60
LysMetValVal
65
(2) INFORMATION FOR SEQ ID NO:31:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 71 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:                                  
ArgLysGlyAspSerValGlyLeuArgLeuAlaGlyGlyAsnAspVal
1 5 10 15
GlyIlePheValAlaGlyValLeuGluAspSerProAlaAlaLysGlu
20 25 30
GlyLeuGluGluGlyAspGlnIleLeuArgValAsnAsnValAspPhe
35 40 45
ThrAsnIleIleArgGluGluAlaValLeuPheLeuLeuAspLeuPro
50 55 60
LysGlyGluGluValThrIle
65 70
(2) INFORMATION FOR SEQ ID NO:32:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 72 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:                                  
Val Thr Glu Glu Pro Met Gly Ile Thr Leu Lys Leu Asn Glu Lys Gln
1               5                   10                  15
Ser Cys Thr Val Ala Arg Ile Leu His Gly Gly Met Ile His Arg Gln
            20                  25                  30
Gly Ser Leu His Val Gly Asp Glu Ile Leu Glu Ile Asn Gly Thr Asn
        35                  40                  45
Val Thr Asn His Ser Val Asp Gln Leu Gln Lys Ala Met Lys Glu Thr
    50                  55                  60
Lys Gly Met Ile Ser Leu Lys Val
65                  70
(2) INFORMATION FOR SEQ ID NO:33:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 74 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(iii) HYPOTHETICAL: NO                                                    
(iv) ANTI-SENSE: NO                                                       
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:                                  
Arg Lys Val Gly Gly Leu Gly Phe Leu Val Lys Glu Arg Val Ser Pro
1               5                   10                  15
Lys Lys Pro Val Ile Ile Ser Asp Leu Ile Arg Gly Gly Ala Ala Glu
            20                  25                  30
Gln Ser Gly Leu Ile Gln Ala Gly Asp Ile Ile Leu Ala Val Asn Asp
        35                  40                  45
Arg Pro Leu Val Asp Leu Ser Tyr Asp Ser Ala Leu Glu Val Leu Arg
    50                  55                  60
Gly Ile Ala Ser Glu Thr His Val Val Leu
65                  70
(2) INFORMATION FOR SEQ ID NO:34:                                         
(i) SEQUENCE CHARACTERISTICS:                                             
(A) LENGTH: 74 amino acids                                                
(B) TYPE: amino acid                                                      
(C) STRANDEDNESS: single                                                  
(D) TOPOLOGY: linear                                                      
(ii) MOLECULE TYPE: peptide                                               
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:                                  
Glu Asp His Glu Gly Leu Gly Ile Ser Ile Thr Gly Gly Leu Glu His
1               5                   10                  15
Gly Val Pro Ile Leu Ile Ser Gly Ile His Pro Gly Gln Pro Ala Asp
            20                  25                  30
Arg Cys Gly Gly Leu His Val Gly Asp Ala Ile Leu Ala Val Asn Gly
        35                  40                  45
Val Asn Leu Arg Asp Thr Leu His Leu Gly Ala Val Thr Ile Leu Ser
    50                  55                  60
Gln Gln Arg Gly Glu Ile Glu Phe Glu Val
65                  70
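As an illustration only, the three-letter residue notation used in the sequence listing can be read programmatically and checked against the declared LENGTH field. The sketch below is not part of the patent disclosure: the names `THREE_TO_ONE`, `parse_residues`, and `SEQ_ID_NO_34` are hypothetical, and the parser simply assumes that every residue is written as one capital letter followed by two lowercase letters, so that position markers and spacing are ignored.

```python
import re

# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_residues(listing_text):
    """Return the residues found in a listing, in order.

    Digits (position markers) and whitespace are skipped because the
    pattern only matches a capital letter followed by two lowercase ones.
    """
    tokens = re.findall(r"[A-Z][a-z]{2}", listing_text)
    unknown = sorted(set(tokens) - set(THREE_TO_ONE))
    if unknown:
        raise ValueError(f"unrecognized residue codes: {unknown}")
    return tokens


# SEQ ID NO:34 as printed in the listing (declared LENGTH: 74 amino acids).
SEQ_ID_NO_34 = """
GluAspHisGluGlyLeuGlyIleSerIleThrGlyGlyLeuGluHis
GlyValProIleLeuIleSerGlyIleHisProGlyGlnProAlaAsp
ArgCysGlyGlyLeuHisValGlyAspAlaIleLeuAlaValAsnGly
ValAsnLeuArgAspThrLeuHisLeuGlyAlaValThrIleLeuSer
GlnGlnArgGlyGluIleGluPheGluVal
"""

residues = parse_residues(SEQ_ID_NO_34)
one_letter = "".join(THREE_TO_ONE[r] for r in residues)

# The residue count should agree with the LENGTH field of the listing.
assert len(residues) == 74
print(len(residues), one_letter)
```

The same function accepts either run-together or space-separated residues, so it can be applied to SEQ ID NOS:31-33 by substituting the corresponding text and declared length.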

Claims

We claim:

1. An isolated nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of

(a) nucleotide sequences which hybridize under high stringency conditions to a nucleic acid molecule consisting of the nucleotide sequence of SEQ ID NO:1, and which encodes a naturally occurring PTPL1 protein tyrosine phosphatase, and

(b) nucleotide sequences that differ from the nucleotide sequences of (a) in codon sequence due to the degeneracy of the genetic code.

2. The isolated nucleic acid molecule of claim 1, wherein said isolated nucleic acid molecule encodes a PTPL1 comprising the amino acid sequence as set forth in SEQ ID NO:3.

3. The isolated nucleic acid molecule of claim 1 wherein said nucleotide sequence comprises SEQ ID NO:1.

4. A method of detecting compounds which increase or decrease expression or phosphatase activity of a PTPL1 protein tyrosine phosphatase encoded by the nucleic acid molecule of claim 1 comprising the steps of

(a) determining a control amount of PTPL1 protein tyrosine phosphatase expression or phosphatase activity in a cell which expresses said PTPL1 protein tyrosine phosphatase;

(b) contacting the cell which expresses said PTPL1 protein tyrosine phosphatase with a test compound;

(c) measuring the expression or phosphatase activity of said PTPL1 protein tyrosine phosphatase in said cell contacted with the test compound; and

(d) comparing the expression or phosphatase activity of said PTPL1 protein tyrosine phosphatase measured in (c) with the control amount of PTPL1 protein tyrosine phosphatase expression or phosphatase activity determined in (a) for an indication of the increase or decrease of expression or phosphatase activity of said PTPL1 protein tyrosine phosphatase.

5. The method of claim 4 wherein expression of said PTPL1 protein tyrosine phosphatase is measured.

6. The method of claim 4 wherein phosphatase activity of said PTPL1 protein tyrosine phosphatase is measured.

7. An isolated PTPL1 protein tyrosine phosphatase encoded by the nucleic acid molecule of claim 1 or claim 3.

8. An isolated nucleic acid molecule comprising a nucleic acid molecule complementary to the isolated nucleic acid molecule of claim 1.

9. The isolated nucleic acid molecule of any one of claims 1-3 and 8 wherein said nucleotide sequence is operably joined to a regulatory sequence.

10. A substantially pure protein comprising a PTPL1 protein tyrosine phosphatase wherein said PTPL1 comprises an amino acid sequence selected from the group consisting of the amino acid sequence as set forth in SEQ ID NO:3 and a naturally occurring allelic variant of the amino acid sequence as set forth in SEQ ID NO:3.

11. The substantially pure protein of claim 10 wherein said amino acid sequence comprises the amino acid sequence as set forth in SEQ ID NO:3.

12. A method for determining whether a compound increases or decreases the phosphatase activity of a PTPL1 protein tyrosine phosphatase encoded by the nucleic acid molecule of claim 1, comprising

(a) determining a control amount of phosphatase activity of the PTPL1 protein tyrosine phosphatase,

(b) contacting the PTPL1 protein tyrosine phosphatase with the compound,

(c) measuring the phosphatase activity of the PTPL1 protein tyrosine phosphatase, and

(d) comparing the phosphatase activity measured in (c) with the control amount of phosphatase activity determined in (a) as an indication whether the compound increases or decreases the phosphatase activity of the PTPL1 protein tyrosine phosphatase.
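By way of illustration only, and without limiting the claims, the comparing step (d) of claims 4 and 12 can be sketched as a simple ratio of the measured value to the control amount. Everything in the sketch is an assumption made for the example: the function name `classify_effect`, the 10% tolerance band, and the replicate readings are hypothetical and are not taken from the specification, which does not prescribe any particular threshold or statistical test.

```python
from statistics import mean


def classify_effect(control_readings, treated_readings, tolerance=0.10):
    """Compare measured activity (or expression) with the control amount.

    control_readings: replicate measurements without the test compound (step a).
    treated_readings: replicate measurements after contact with it (steps b-c).
    tolerance: hypothetical band around a ratio of 1.0 that is treated as
        no change; the claims do not specify any such value.
    """
    control = mean(control_readings)
    treated = mean(treated_readings)
    if control <= 0:
        raise ValueError("control amount must be positive to form a ratio")
    ratio = treated / control
    if ratio > 1.0 + tolerance:
        return "increase", ratio
    if ratio < 1.0 - tolerance:
        return "decrease", ratio
    return "no change", ratio


# Made-up readings in arbitrary units, used only to exercise the function.
control_units = [100.0, 104.0, 96.0]
treated_units = [58.0, 62.0, 60.0]

effect, ratio = classify_effect(control_units, treated_units)
print(effect, round(ratio, 2))
```

A real assay would normally replace the fixed tolerance with a statistical comparison of replicates; the fixed band is used here only to keep the example short.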