CA2294473A1 - Novel family of pheromone receptors - Google Patents

Novel family of pheromone receptors


Publication number
CA2294473A1
CA2294473A1 CA002294473A CA2294473A CA2294473A1 CA 2294473 A1 CA2294473 A1 CA 2294473A1 CA 002294473 A CA002294473 A CA 002294473A CA 2294473 A CA2294473 A CA 2294473A CA 2294473 A1 CA2294473 A1 CA 2294473A1
Authority
CA
Canada
Prior art keywords
leu
ser
ile
phe
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002294473A
Other languages
French (fr)
Inventor
Linda Buck
Catherine Dulac
Gilles Herrada
Hiroaki Matsunami
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harvard College
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Publication of CA2294473A1
Legal status: Abandoned


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809: Methods for determination or identification of nucleic acids involving differential detection
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P: SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P15/00: Drugs for genital or sexual disorders; Contraceptives
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705: Receptors; Cell surface antigens; Cell surface determinants
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00: Medicinal preparations containing peptides
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00: Medicinal preparations containing antigens or antibodies

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Chemical & Material Sciences (AREA)
  • Public Health (AREA)
  • Physics & Mathematics (AREA)
  • Endocrinology (AREA)
  • Reproductive Health (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Engineering & Computer Science (AREA)
  • Veterinary Medicine (AREA)
  • Cell Biology (AREA)
  • Toxicology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Peptides Or Proteins (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention describes a multigene family encoding a collection of novel mammalian pheromone receptors. Nucleic acids encoding the pheromone receptor polypeptides, including fragments and biologically functional variants thereof, are provided. Also included are polypeptides and fragments thereof encoded by such nucleic acids, and antibodies relating thereto. Methods and products for using such nucleic acids and polypeptides also are provided.

Description

Field of the Invention
This invention relates to nucleic acids and encoded polypeptides which are part of a multigene family encoding a collection of novel mammalian pheromone receptors. The invention further provides representative nucleic acids and encoded polypeptides in this multigene family. The representative polypeptides are expressed in the murine and rat vomeronasal organ (VNO). Agents which bind the nucleic acids or polypeptides also are provided. The invention further relates to methods of using such nucleic acids and polypeptides in the diagnosis and/or treatment of disease, including the use of these molecules in controlling fertility and behavior in vertebrates and invertebrates.
Background of the Invention
Pheromones are intraspecific chemical signals found throughout the animal kingdom. They regulate populations of animals by inducing innate behaviors and stereotyped changes in physiology (Karlson and Luscher, Nature, 1959, 183:55-56; Wilson, Sci. Am., 1963, 208:100-114; Sorensen, Chem. Sens., 1996, 21:245-256). Pheromones can serve as cues for overcrowding, impending danger, reproductive status, gender, or dominance. In rodents, a variety of pheromone effects have been reported. These include effects on estrus and the onset of puberty as well as the induction of mating and aggressive behaviors (Singer, A.G., J. Steroid. Biochem. Molec. Biol., 1991, 39:627-632; Halpern, M., Ann. Rev. Neurosci., 1987, 10:325-362; Wysocki, C.J., et al., In the Neurobiology of Taste and Smell, 1987, 125-150; Novotny et al., Chemical Signals in Vertebrates, 1990, Vol. 5, eds. D.W. Macdonald et al., Oxford University Press).
The detection of pheromones is mediated by the olfactory system. However, sensory neurons that detect pheromones are typically segregated from those that detect volatile odorants (Keverne, E.B., Trends Neurosci., 1983, 6:381-384; Halpern, M., Ann. Rev. Neurosci., 1987, 10:325-362; Wysocki, C.J., et al., In the Neurobiology of Taste and Smell, 1987, 125-150; Hildebrand, J.G., et al., Brain Res., 1997, 677:157-161). In mammals, sensory neurons in the nasal olfactory epithelium (OE) detect volatile odorants and some pheromones, while those in an accessory olfactory organ, called the vomeronasal organ (VNO), are thought to be specialized to detect pheromones. The VNO is a tubular structure at the base of the nasal septum which is connected to the nasal cavity by a small duct. Signals from the OE are relayed through the olfactory bulb (OB) to the olfactory cortex, and then to multiple brain regions, including those involved in conscious perception. In contrast, signals from the VNO are conveyed through the accessory olfactory bulb (AOB) to the amygdala and hypothalamus, areas associated with the endocrine and behavioral responses induced by pheromones.
Volatile odorants are detected in the OE by as many as 1000 different types of odorant receptors (ORs), which are differentially expressed by olfactory sensory neurons (Buck and Axel, Cell, 1991, 65:175-187; Levy, N.S., et al., J. Steroid Biochem. Mol. Biol., 1991, 39:633-637; Nef, P., et al., Proc. Natl. Acad. Sci., 1992, 89:8948-8952; Strotman, J., et al., Neuroreport, 1992, 3:1053-1056; Ngai, J., et al., Cell, 1993, 72:667-680; Ressler, K.J., et al., Cell, 1993, 73:597-609; Vassar, R., et al., Cell, 1993, 74:309-318). The ORs are thought to couple to the G protein α subunit, Gαolf, thereby initiating a cascade of transduction events which culminate in the generation of action potentials in the sensory axons (reviewed in Firestein, S., Curr. Opin. in Neurobiology, 1992, 2:444-448; Reed, R., Neuron, 1992, 8:205-209; Ronnett, G., et al., Trends Neurosci., 1992, 15:508-513). Current evidence suggests that each OR may recognize a particular molecular feature that can be shared by many odorants (Ressler, K., et al., Cell, 1994, 79:1245-1255; Vassar, R., et al., Cell, 1994, 79:981-991; Axel, R., Sci. Am., 1995, 273:154-159; Buck, L., Annu. Rev. Neurosci., 1996, 19:517-544). This is consistent with a combinatorial coding model in which the identities of different odorants are encoded by different combinations of receptors, but each receptor serves as one component of the codes for many odorants.
By contrast, very little is known about how pheromones are detected or encoded in the VNO. Although VNO neurons (VNs) resemble olfactory sensory neurons in the nose, only a rare VN expresses an OR gene. VNs also lack a number of other olfactory sensory transduction molecules, including the G protein α subunit, Gαolf (Reed, R., Neuron, 1992, 8:205-209), which is highly expressed in olfactory neurons (Dulac and Axel, Cell, 1995, 83:195-206; Berghard, A., et al., Proc. Natl. Acad. Sci. USA, 1996, 93:2365-2369; Wu, Y., et al., Biochem. Biophys. Res. Com., 1996, 220:900-904). Instead, VNs express high levels of two other G protein α subunits, Gαo and Gαi2 (Dulac and Axel, Cell, 1995, 83:195-206; Halpern, M., Brain Res., 1995, 677:157-161; Berghard, A., et al., Proc. Natl. Acad. Sci. USA, 1996, 93:2365-2369).
Gαo and Gαi2 are expressed in spatially segregated subsets of VNs that form longitudinal zones in the VNO neuroepithelium. Interestingly, Dulac and Axel have identified a family of 100 candidate pheromone receptors ("VNRs") which appear to be expressed exclusively in the Gαi2 subset (Dulac and Axel, Cell, 1995, 83:195-206).
This invention differs from the state of the art in providing a novel family of mammalian pheromone receptors. Accordingly, the objects of the invention relate to providing compositions containing these novel receptors and their binding partners and methods for using such compositions to modulate pheromone receptor activity.
The invention involves the discovery of a multigene family of mammalian pheromone receptors. In particular, the invention involves the cDNA cloning of multiple pheromone receptors from a murine VNO cDNA library and from a rat VNO cDNA library. Partial sequences of human homologs of these pheromone receptors also are provided.
In general, the invention provides isolated nucleic acid molecules encoding the novel pheromone receptors, unique fragments of the isolated nucleic acid molecules, expression vectors containing the foregoing, and host cells transfected with the foregoing. The invention also provides isolated pheromone receptor polypeptides and agents which bind such polypeptides, including antibodies. The foregoing can be used in the diagnosis or treatment of conditions, including the control of fertility, that are characterized by the expression of a pheromone receptor polypeptide. Methods for identifying pharmacological agents useful in the diagnosis or treatment of such conditions and methods for identifying additional members of this multigene family also are provided.
Applicants have discovered that the pheromone receptors disclosed herein are expressed in the vomeronasal organ (VNO), particularly in Gαo protein-expressing neurons. This is in contrast to the prior art VNO pheromone receptors, which are expressed in neurons that express a different G protein (Gαi2-expressing neurons). Thus, the novel pheromone receptors disclosed herein are distinct from, and expressly exclude, the prior art VNO pheromone receptors, which differ in primary structure as well as in cell localization. Although Applicants do not intend the invention to be limited to a particular theory or mechanism, the amino acid sequence homology and structural organization of the pheromone receptor polypeptides to other well-known G-protein coupled receptors suggests that the pheromone receptors disclosed herein also are G-protein coupled. Thus, it is anticipated that the binding to the pheromone receptor of its cognate ligand (pheromone) will be accompanied by G-protein signal transduction, an event which can be measured using conventional screening assays, such as assays that measure changes in the intracellular concentrations of calcium and/or cyclic nucleotides (see, e.g., PCT publication no. WO 94/18959, entitled "Calcium Receptor-Active Molecules", inventors E. Nemeth et al.).
According to one aspect of the invention, a family of pheromone receptor polypeptides is provided. Each polypeptide of the family shares amino acid sequence homology and structural organization with a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. Each polypeptide member of the receptor family contains, from amino terminus to carboxyl terminus, the following domains: (a) an amino-terminal extracellular domain containing from 30 to 600 amino acids; (b) a transmembrane region comprising: (i) seven non-contiguous transmembrane domains designated TM1, TM2, TM3, TM4, TM5, TM6 and TM7, (ii) three non-contiguous extracellular domains designated EC2, EC3 and EC4, and (iii) three non-contiguous intracellular domains designated IC1, IC2, and IC3, wherein the transmembrane domains, the extracellular domains and the intracellular domains are attached to one another from amino terminus to carboxyl terminus in the order TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7, and wherein the transmembrane region has at least about 35% homology and a length approximately equal to a transmembrane region of a polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and (c) a carboxyl-terminal intracellular domain containing from 5 to 200 amino acids.
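The domain layout described above can be sketched as a simple ordering check. This is only an illustration: the labels "NTD" and "CTD", the function name, and the toy domain lengths are assumptions introduced here, and the order used assumes the usual seven-transmembrane arrangement in which TM5 and IC3 sit between EC3 and TM6.

```python
# Hypothetical labels: "NTD" and "CTD" stand for the amino-terminal
# extracellular domain (a) and the carboxyl-terminal intracellular domain (c).
EXPECTED_ORDER = [
    "NTD",
    "TM1", "IC1", "TM2", "EC2", "TM3", "IC2", "TM4",
    "EC3", "TM5", "IC3", "TM6", "EC4", "TM7",
    "CTD",
]


def check_topology(domains):
    """Return True if `domains`, a list of (name, length) pairs listed from
    amino to carboxyl terminus, follows the expected order and the NTD and
    CTD lengths fall within the stated ranges (30-600 and 5-200)."""
    names = [name for name, _ in domains]
    if names != EXPECTED_ORDER:
        return False
    lengths = dict(domains)
    return 30 <= lengths["NTD"] <= 600 and 5 <= lengths["CTD"] <= 200


# Toy receptor with made-up domain lengths (not taken from any SEQ ID NO.).
example = [(name, 22) for name in EXPECTED_ORDER]
example[0] = ("NTD", 550)
example[-1] = ("CTD", 20)
print(check_topology(example))
```

The check covers only domain order and the two terminal length ranges; it says nothing about sequence homology.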
Each polypeptide member of the family is expressed in a Gαo protein-expressing vomeronasal organ neuron or is expressed in another olfactory organ neuron in an animal which does not possess a vomeronasal organ. One skilled in the art can readily identify olfactory organs in animals which do not possess a vomeronasal organ.
In general, the amino-terminal extracellular domains (NTDs) of the receptor family members share sequence homology to a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50 to a lesser extent than that observed for the transmembrane region. The length of the extracellular domain can vary among members of the family.
Accordingly, certain embodiments of the invention have extracellular domains that contain at least 50, 100, 200, 300, 400 or 500 amino acids. Preferably, the transmembrane region has greater than 40% homology with the corresponding region of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50, and more preferably, has even greater sequence homology (e.g., more than 50%, 60%, 70%, 80% or 90% homology). The length of the carboxyl-terminal intracellular domain can vary among members of the family. Accordingly, certain embodiments of the invention have carboxyl-terminal intracellular domains that contain between 5 and 50 amino acids. More preferably, carboxyl-terminal intracellular domains contain between 15 and 25 amino acids.
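To make the percentage thresholds concrete, the sketch below computes a simple percent identity between two pre-aligned sequences. This passage does not state how homology is to be calculated, so treating it as identity over aligned, non-gap positions is an assumption, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither sequence
    has a gap ('-'). Both inputs must already be aligned to equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned peptides (made up for illustration only).
print(percent_identity("LSIFV", "LSAFV"))  # 4 of 5 positions match
```

A similarity score based on a substitution matrix would generally give higher values than plain identity, so the choice of measure matters when comparing against a threshold such as 35% or 40%.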
According to another aspect of the invention, a method for identifying a nucleic acid encoding a pheromone receptor is provided. The method involves contacting a mixture of nucleic acid molecules (genomic library, cDNA library, genomic DNA, RNA, etc.) with at least one nucleic acid probe of a nucleic acid selected from the group consisting of: (a) a nucleic acid molecule selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55 that encodes a pheromone receptor polypeptide; (b) a unique fragment of (a); (c) a human homolog of (a) or (b); and (d) a set of degenerate primers of any of (a), (b) or (c); and identifying the sequences within the mixture that hybridize to the probe. Selected fragments of human homologs of a pheromone receptor are selected from the group consisting of SEQ ID NO. 51, 53, 54 and 55. In certain embodiments, the nucleic acid probe further includes a detectable label to facilitate identification of the sequence in the library which hybridizes to the probe. In certain embodiments, the probe is represented by a pair of degenerate polymerase chain reaction ("PCR") primers that amplify a unique fragment of a nucleic acid molecule selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55. The meaning of "unique fragment" in reference to a nucleic acid is provided below. By "degenerate PCR primers that amplify a unique fragment" is meant degenerate primers which result in the amplification of a unique fragment following a polymerase chain reaction.
According to this embodiment, the method for identifying a nucleic acid encoding a pheromone receptor polypeptide further involves subjecting a mixture of nucleic acids and the degenerate PCR primers to amplification conditions prior to identifying the sequences of the mixture that hybridize to the probe and that form part of the amplification reaction products. In some embodiments the pair of degenerate polymerase chain reaction primers is selected from a conserved sequence motif of a pheromone receptor polypeptide. A "conserved sequence motif" can be determined using the side-by-side comparison of the amino acid sequences of the different pheromone receptor polypeptides of the invention. Exemplary conserved sequence motifs include regions selected from the group consisting of amino acids 191-397, amino acids 565-825, amino acids 637-825, amino acids 637-804, and amino acids 619-784 of the polypeptide of, for example, SEQ ID NO. 2 (VR1). In preferred embodiments, the pair of degenerate polymerase chain reaction primers is selected from the group consisting of SEQ ID NOs. 60 and 61, SEQ ID NOs. 62 and 63, SEQ ID NOs. 64 and 63, SEQ ID NOs. 64 and 65, and SEQ ID NOs. 66 and 67.
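As an illustration of what a degenerate primer represents, the sketch below expands a primer written with IUPAC ambiguity codes into the concrete oligonucleotides it stands for. The primer string is hypothetical; the sequences of SEQ ID NOs. 60 to 67 are not reproduced in this passage and are not used here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand_primer(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical degenerate primer (illustration only).
primer = "TGGYTNAAR"
print(degeneracy(primer))
print(expand_primer("AYG"))
```

Degeneracy grows multiplicatively with each ambiguous position, which is one reason primers are usually designed from the most conserved stretches of a motif.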
According to yet another aspect of the invention, an isolated nucleic acid molecule is provided. The isolated nucleic acid molecule hybridizes under high or low stringency conditions to a molecule consisting of a nucleic acid sequence selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55. The invention further embraces nucleic acid molecules that differ from the foregoing isolated nucleic acid molecules in codon sequence due to the degeneracy of the genetic code. The invention also embraces complements of the foregoing nucleic acids.
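The two ideas in this paragraph, codon degeneracy and complementary strands, can be illustrated with a short sketch. The DNA strings below are made up for the example and are not taken from any SEQ ID NO.; the codon table is the standard genetic code, and the helper names are hypothetical.

```python
# Standard genetic code, built in TCAG order for first, second, third base.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a coding-strand DNA sequence, codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


# Two different toy coding sequences that encode the same peptide.
seq_a = "CTGAGCATTTTCGTG"
seq_b = "TTATCAATCTTTGTA"
print(translate(seq_a), translate(seq_b))
print(reverse_complement("ATGC"))
```

Both toy sequences translate to the same five residues although most of their codons differ at one or more positions, which is the sense in which molecules can "differ in codon sequence" yet encode the same polypeptide.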
The pheromone receptors of the invention are expressed in the vomeronasal organ or, in an animal which lacks such an organ, are expressed in another olfactory organ. More particularly, the receptors of the invention are expressed in a Gαo protein-expressing vomeronasal organ neuron. Although not intending to be bound to a particular mechanism, it is believed that the receptors of the invention are G-protein coupled receptors. This is supported by Applicants' discovery that the receptors of the invention are expressed in Gαo protein-expressing vomeronasal organ neurons.
The pheromone receptors of the invention bind to ligands (pheromones) which induce certain changes in receptor conformation. Methods for identifying ligands which bind to the pheromone receptors of the invention are provided below, e.g., by forming an affinity matrix containing immobilized receptor and using the matrix to isolate a cognate ligand from a complex mixture. The particular ligand bound by a particular receptor is dictated by the primary and secondary structure of the receptor. In certain embodiments, the immobilized pheromone receptor polypeptide is a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
According to another aspect of the invention, an isolated nucleic acid molecule that is a unique fragment of any of the foregoing isolated nucleic acid molecules is provided. In general, the isolated nucleic acid molecule consists of a unique fragment between 12 and 4000 nucleotides in length, and complements thereof, of any cDNA (SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55) encoding a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. Depending upon its intended use (e.g., probe, primer), the unique fragment can be between 12 and 2000, 1000, 500, 250, 100, 50 or 25 nucleotides in length. Preferably, the isolated nucleic acid molecule consists of between 12 and 35 contiguous nucleotides of the foregoing cDNAs encoding the pheromone receptor polypeptides, or complements of such nucleic acid molecules. More preferably, the unique fragment is at least 14, 15, 16, 17, 18, 20 or 22 contiguous nucleotides of the nucleic acid sequence of the foregoing cDNAs encoding the pheromone receptor polypeptides, or complements thereof. Particularly preferred isolated nucleic acid molecules are isolated fragments of the foregoing cDNAs which encode one or more of the following pheromone receptor polypeptide domains, alone or in combination (e.g., as fusion proteins): an amino-terminal extracellular domain, a transmembrane region, and a carboxy-terminal intracellular domain. In certain embodiments, the unique fragments are a pheromone receptor extracellular domain or a pheromone receptor intracellular domain coupled to at least one (e.g., 1, 2, 3, 4, 5, 6, or 7) transmembrane domain.
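A rough way to picture a fixed-length "unique fragment" is as a subsequence of a target cDNA that does not occur in a set of other sequences. The sketch below works under that assumption only; the patent defines "unique fragment" elsewhere in the specification, and the function name, the background set, and the toy sequences here are hypothetical.

```python
def unique_fragments(target, background, k):
    """Return every length-k subsequence of `target` that is absent from
    all sequences in `background` (forward strand only)."""
    seen = set()
    for seq in background:
        for i in range(len(seq) - k + 1):
            seen.add(seq[i:i + k])
    return [
        target[i:i + k]
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in seen
    ]


# Toy example with k=4; the text above contemplates fragments of 12 or
# more nucleotides, so a real search would use a larger k.
print(unique_fragments("ACGTAC", ["ACGT"], 4))
```

A practical search would also need to consider reverse complements and near matches that could still hybridize, which this sketch ignores.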
According to yet another aspect of the invention, an isolated nucleic acid molecule comprising a molecule having a sequence selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a pheromone receptor polypeptide is provided. This aspect of the invention further embraces nucleic acid molecules that differ from these nucleic acid molecules in codon sequence due to the degeneracy of the genetic code and diversity among pheromone receptors, and complements of the foregoing.
According to still other aspects of the invention, an expression vector comprising any of the foregoing isolated nucleic acid molecules operably linked to a promoter and host cells transformed or transfected with the same also are provided.
According to another aspect of the invention, an isolated polypeptide encoded by any of the above-described isolated nucleic acid molecules is provided. Preferably, the isolated polypeptide is a pheromone receptor polypeptide that has a pheromone receptor activity or an antigenic fragment thereof. As used herein, a pheromone receptor activity refers to the ability of the pheromone receptor to selectively bind to its cognate ligand (pheromone) and, optionally, upon binding, to induce signal transduction in a cell that expresses the pheromone receptor. In preferred embodiments, the isolated polypeptide comprises a pheromone receptor polypeptide having a sequence selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
According to yet other embodiments, the isolated polypeptide comprises a polypeptide encoded by a nucleic acid which hybridizes under high or low stringency conditions to the extracellular domain, transmembrane region and/or intracellular domain of a cDNA sequence selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55 that encodes a pheromone receptor polypeptide or fragment thereof. Thus, the invention embraces portions of a pheromone receptor polypeptide that may include, for example, an amino-terminal extracellular domain or a carboxy-terminal intracellular domain coupled to 1, 2, 3, 4, 5, 6, or 7 transmembrane domains.
Preferably, such polypeptides or fragments thereof are unique fragments and can function as, for example, antigens for making antibodies specific for pheromone receptor family members.
Accordingly, the polypeptides of the invention can be used to isolate additional members of the pheromone receptor family or, alternatively, can be used to induce in vivo an immune response to a pheromone receptor, i.e., can be incorporated into a vaccine preparation.
Such vaccine compositions are useful for controlling fertility or behavior in an animal by administering to the animal, an effective amount of the vaccine to elicit an immune response to the pheromone receptor. Thus, the invention embraces fragments or variants of the foregoing pheromone receptors which exhibit certain detectable activities, e.g., a ligand binding activity, an antigenicity activity. In certain embodiments, the isolated polypeptide is encoded by a cDNA
selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a pheromone receptor polypeptide or one or more of its domains.
According to another aspect of the invention, there are provided isolated binding polypeptides which selectively bind a unique amino acid sequence of a pheromone receptor polypeptide or fragment thereof. The isolated binding polypeptide in certain embodiments binds to a polypeptide comprising the extracellular domain and/or 1, 2, 3, 4, 5, 6, or 7 transmembrane domains of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
The isolated polypeptide preferably binds to a polypeptide consisting of the amino-terminal extracellular domain and/or one or more portions of the transmembrane region of a pheromone receptor polypeptide sequence selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
In preferred embodiments, isolated binding polypeptides include antibodies and fragments of antibodies (e.g., Fab, F(ab)2, Fd and antibody fragments which include a CDR3 region which binds selectively to the unique sequences of the polypeptides of the invention). In the preferred embodiments, the isolated binding peptides do not bind to pheromone receptors that are expressed in vomeronasal organ neurons other than Gao-protein-expressing neurons.
The invention provides, in yet other aspects, isolated nucleic acids or polypeptides of the invention that are: (a) immobilized to an insoluble support (an affinity matrix containing immobilized pheromone receptor polypeptide or a unique fragment thereof); (b) associated with, covalently coupled to, or encapsulated in a drug delivery device (e.g., a microsphere) to effect controlled release of the isolated nucleic acid or polypeptide in vivo or in vitro; (c) covalently coupled to another isolated nucleic acid or protein to form a chimeric molecule; and/or (d) labeled with a detectable agent (e.g., a radiolabel, a fluorescent label).
Thus, the invention provides chimeric molecules containing at least one first structural domain of one pheromone receptor polypeptide (e.g., an extracellular domain) coupled to a second structural domain (e.g., a transmembrane domain, such as TM1, TM2, etc.) of a different pheromone receptor polypeptide. The invention also provides a method for isolating a pheromone receptor by (1) contacting a composition containing a putative pheromone receptor of the above-described family with an affinity matrix containing immobilized binding polypeptide under conditions to permit the pheromone receptor to selectively bind to the immobilized binding polypeptide, and (2) isolating the polypeptides that bind to the affinity matrix.
According to still another aspect of the invention, pharmaceutical compositions containing any of the foregoing compounds of the invention in a pharmaceutically acceptable carrier and methods of producing same by placing the compositions in the carrier also are provided.
According to still another aspect of the invention, methods for modulating a pheromone receptor activity (e.g., a ligand binding activity, a signal transduction activity) in a cell (vertebrate or invertebrate) are provided. The cell can be located in vivo or in vitro and the methods can be used to down regulate (inhibit) or up regulate (stimulate) the pheromone receptor activity. For example, to inhibit a ligand binding activity, the cell is contacted with an inhibitor that can be an isolated binding polypeptide that binds to an extracellular portion of the receptor and, thereby, inhibits receptor binding to its cognate ligand. Such binding also can induce conformational changes in the receptor that alter the signal transduction activity of the receptor.
The inhibitor can be an isolated antibody (or functional equivalent thereof) which binds to an epitope located on an extracellular portion (such as EC2, EC3, EC4) of the pheromone receptor polypeptide, e.g., an amino-terminal extracellular domain or an "extracellular transmembrane region domain", i.e., an extracellular portion of the transmembrane region located between one or more transmembrane domains. Alternatively, the inhibitor can be an agent (e.g., an isolated competitive binding polypeptide) that inhibits receptor-ligand binding.
For example, the inhibitor can be an isolated fragment of a pheromone receptor (preferably, a soluble fragment), which fragment contains a ligand (pheromone) binding site. Other inhibitors can be identified in screening assays which test the ability of a putative inhibitor to inhibit pheromone receptor-mediated signal transduction or which test the ability of the putative inhibitor to inhibit binding of a pheromone receptor to its known cognate ligand. Similarly, such screening assays can be used to identify molecules which stimulate pheromone receptor-mediated signal transduction.
Exemplary molecules which stimulate transduction include the naturally-occurring ligands (e.g., isolated from a biological source (e.g., urine, vaginal fluid)), as well as synthetic ligands obtained from a non-biological source (e.g., a combinatorial library).
According to still another aspect of the invention, methods for inhibiting the binding of a pheromone having a binding domain to a pheromone receptor polypeptide having a ligand binding site that selectively binds to the binding domain are provided. The method involves contacting (in vivo or in vitro) the pheromone receptor polypeptide with an agent which binds to the ligand binding site under conditions to permit binding of the agent to the receptor. For example, the agent can be an isolated binding polypeptide that binds to the ligand binding site of the pheromone receptor. Thus, the agent can be an isolated antibody (or functionally equivalent fragment thereof) which selectively binds to the ligand binding site of the receptor.
Alternatively, the agent can be a pheromone receptor antagonist, e.g., a molecule that mimics the structure of the naturally-occurring ligand but that does not mimic the function (stimulating the receptor) of the naturally-occurring ligand. Agents which inhibit ligand binding can be identified in screening assays which test the ability of a putative binding inhibitor to inhibit binding of a pheromone receptor to its cognate ligand (e.g., pheromone). Such molecules can be isolated from a biological source or from a non-biological source.
According to another aspect of the invention, methods for modulating pheromone receptor-mediated signal transduction in a subject are provided. The methods involve administering to a subject in need of such treatment an agent that selectively binds to any of the above-described isolated nucleic acid molecules which encode a pheromone receptor or unique fragment thereof, or an expression product thereof, in an amount effective to modulate (down regulate or up regulate) pheromone receptor-mediated signal transduction in the subject.
Exemplary agents include antisense nucleic acid molecules and binding polypeptides.
Thus, according to yet another aspect of the invention, methods are provided for identifying lead compounds for a pharmacological agent useful in the diagnosis or treatment of a condition associated with pheromone receptor signal transduction activity or otherwise generally associated with binding of the receptor to its cognate ligand.
Preferably, cells expressing intact pheromone receptor polypeptides or portions thereof are used in the screening assays for identifying lead compounds which modulate pheromone receptor-mediated ligand binding or signal transduction activity. Cells expressing these polypeptides, isolated pheromone receptor polypeptides and fragments of these polypeptides which contain the ligand binding site can be used in the screening assays for identifying lead compounds which modulate binding of the receptor to a known ligand.
The screening methods involve forming a mixture of a pheromone receptor polypeptide (as noted above) or fragment thereof containing a ligand binding site; a molecule which is known to (1) interact with the foregoing receptor to effect pheromone receptor-mediated signal transduction or (2) bind to the ligand binding site of the receptor; and a candidate pharmacological agent. The mixture is incubated under conditions which, in the absence of the candidate pharmacological agent, permit a first amount of pheromone receptor-ligand binding or receptor-mediated signal transduction by the known ligand. A test amount of the selective binding of the ligand by the receptor or of the specific activation of signal transduction is determined. Detection of an increase in the foregoing activities in the presence of the candidate pharmacological agent indicates that the candidate pharmacological agent is a lead compound for a pharmacological agent which increases specific activation of pheromone receptor-mediated signal transduction or selective binding of the ligand by the ligand binding site of the receptor.
Detection of a decrease in the foregoing activities in the presence of the candidate pharmacological agent indicates that the candidate pharmacological agent is a lead compound for a pharmacological agent which decreases specific activation of pheromone receptor-mediated signal transduction or selective binding of the ligand by the ligand binding site of the receptor.
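The comparison at the heart of the screening method can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the disclosed assay: the function name `classify_candidate`, the `tolerance` parameter and the numeric readouts are assumptions introduced here, and a real assay would use replicate measurements and a statistical test rather than a fixed threshold.

```python
def classify_candidate(first_amount: float, test_amount: float,
                       tolerance: float = 0.0) -> str:
    """Compare the activity measured with the candidate agent (test_amount)
    against the activity measured without it (first_amount).

    `tolerance` is a hypothetical noise margin; changes smaller than it are
    treated as no detectable effect.
    """
    if test_amount > first_amount + tolerance:
        # Increased binding or signalling: lead for an agent that increases activity.
        return "increases"
    if test_amount < first_amount - tolerance:
        # Decreased binding or signalling: lead for an agent that decreases activity.
        return "decreases"
    return "no_effect"
```

For example, under these assumptions a readout that rises from 100 units to 140 units with a tolerance of 10 would be reported as "increases".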
Pheromone receptor polypeptides that are useful in the screening assays, preferably, are those selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. Extracellular domains or portions thereof and portions of the transmembrane region, alone or coupled to one another, of these pheromone receptor polypeptides (indicated in the Examples) can be tested for their ability to inhibit receptor-ligand binding.
These and other objects of the invention will be described in further detail in connection with the detailed description of the invention.
All patents, patent publications, references and other information identified in this document are incorporated in their entirety herein by reference.
Brief Description of the Drawings
Figure 1 depicts a comparison of the deduced protein sequences encoded by VR cDNA clones.
Figure 2 is a schematic comparison of ORs, VNRs, and VRs.
Figure 3 depicts a comparison of the deduced protein sequences encoded by the Go-VN cDNA clones.
Brief Description of the Sequences
SEQ ID NO. 1 is the nucleotide sequence of the mouse pheromone receptor VR1 cDNA (GenBank Accession No. AF011411).
SEQ ID NO. 2 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR1 cDNA (GenBank Accession No. AF011411).
SEQ ID NO. 3 is the nucleotide sequence of the mouse pheromone receptor VR2 cDNA (GenBank Accession No. AF011412).
SEQ ID NO. 4 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR2 cDNA (GenBank Accession No. AF011412).
SEQ ID NO. 5 is the nucleotide sequence of the mouse pheromone receptor VR3 cDNA (GenBank Accession No. AF011413).

SEQ ID NO. 6 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR3 cDNA (GenBank Accession No. AF011413).
SEQ ID NO. 7 is the nucleotide sequence of the mouse pheromone receptor VR4 cDNA (GenBank Accession No. AF011414).
SEQ ID NO. 8 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR4 cDNA (GenBank Accession No. AF011414).
SEQ ID NO. 9 is the nucleotide sequence of the mouse pheromone receptor VR5 cDNA (GenBank Accession No. AF011415).
SEQ ID NO. 10 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR5 cDNA (GenBank Accession No. AF011415).
SEQ ID NO. 11 is the nucleotide sequence of the mouse pheromone receptor VR6 cDNA (GenBank Accession No. AF011416).
SEQ ID NO. 12 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR6 cDNA (GenBank Accession No. AF011416).
SEQ ID NO. 13 is the nucleotide sequence of the mouse pheromone receptor VR7 cDNA (GenBank Accession No. AF011417).
SEQ ID NO. 14 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR7 cDNA (GenBank Accession No. AF011417).
SEQ ID NO. 15 is the nucleotide sequence of the mouse pheromone receptor VR8 cDNA (GenBank Accession No. AF011418).
SEQ ID NO. 16 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR8 cDNA (GenBank Accession No. AF011418).
SEQ ID NO. 17 is the nucleotide sequence of the mouse pheromone receptor VR9 cDNA (GenBank Accession No. AF011419).
SEQ ID NO. 18 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR9 cDNA (GenBank Accession No. AF011419).
SEQ ID NO. 19 is the nucleotide sequence of the mouse pheromone receptor VR10 cDNA (GenBank Accession No. AF011420).
SEQ ID NO. 20 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR10 cDNA (GenBank Accession No. AF011420).
SEQ ID NO. 21 is the nucleotide sequence of the mouse pheromone receptor VR11 cDNA (GenBank Accession No. AF011421).

SEQ ID NO. 22 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR11 cDNA (GenBank Accession No. AF011421).
SEQ ID NO. 23 is the nucleotide sequence of the mouse pheromone receptor VR12 cDNA (GenBank Accession No. AF011422).
SEQ ID NO. 24 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR12 cDNA (GenBank Accession No. AF011422).
SEQ ID NO. 25 is the nucleotide sequence of the mouse pheromone receptor VR13 cDNA (GenBank Accession No. AF011423).
SEQ ID NO. 26 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR13 cDNA (GenBank Accession No. AF011423).
SEQ ID NO. 27 is the nucleotide sequence of the mouse pheromone receptor VR14 cDNA (GenBank Accession No. AF011424).
SEQ ID NO. 28 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR14 cDNA (GenBank Accession No. AF011424).
SEQ ID NO. 29 is the nucleotide sequence of the mouse pheromone receptor VR15 cDNA (GenBank Accession No. AF011425).
SEQ ID NO. 30 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR15 cDNA (GenBank Accession No. AF011425).
SEQ ID NO. 31 is the nucleotide sequence of the mouse pheromone receptor VR16 cDNA (GenBank Accession No. AF011426).
SEQ ID NO. 32 is the predicted amino acid sequence of the polypeptide encoded by the mouse pheromone receptor VR16 cDNA (GenBank Accession No. AF011426).
SEQ ID NO. 33 is the nucleotide sequence of the rat pheromone receptor Go-VN1 cDNA (GenBank Accession No. AF016178).
SEQ ID NO. 34 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN1 cDNA (GenBank Accession No. AF016178).
SEQ ID NO. 35 is the nucleotide sequence of the rat pheromone receptor Go-VN2 cDNA (GenBank Accession No. AF016179).
SEQ ID NO. 36 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN2 cDNA (GenBank Accession No. AF016179).
SEQ ID NO. 37 is the nucleotide sequence of the rat pheromone receptor Go-VN3 cDNA (GenBank Accession No. AF016180).

SEQ ID NO. 38 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN3 cDNA (GenBank Accession No. AF016180).
SEQ ID NO. 39 is the nucleotide sequence of the rat pheromone receptor Go-VN4 cDNA (GenBank Accession No. AF016181).
SEQ ID NO. 40 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN4 cDNA (GenBank Accession No. AF016181).
SEQ ID NO. 41 is the nucleotide sequence of the rat pheromone receptor Go-VN5 cDNA (GenBank Accession No. AF016182).
SEQ ID NO. 42 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN5 cDNA (GenBank Accession No. AF016182).
SEQ ID NO. 43 is the nucleotide sequence of the rat pheromone receptor Go-VN6 cDNA (GenBank Accession No. AF016183).
SEQ ID NO. 44 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN6 cDNA (GenBank Accession No. AF016183).
SEQ ID NO. 45 is the nucleotide sequence of the rat pheromone receptor Go-VN7 cDNA (GenBank Accession No. AF016184).
SEQ ID NO. 46 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN7 cDNA (GenBank Accession No. AF016184).
SEQ ID NO. 47 is the nucleotide sequence of the rat pheromone receptor Go-VN13C cDNA (GenBank Accession No. AF016185).
SEQ ID NO. 48 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN13C cDNA (GenBank Accession No. AF016185).
SEQ ID NO. 49 is the nucleotide sequence of the rat pheromone receptor Go-VN13B cDNA (GenBank Accession No. AF016186).
SEQ ID NO. 50 is the predicted amino acid sequence of the polypeptide encoded by the rat pheromone receptor Go-VN13B cDNA (GenBank Accession No. AF016186).
SEQ ID NO. 51 is a partial nucleotide sequence of the human pheromone receptor hVR1.
SEQ ID NO. 52 is the predicted amino acid sequence of the polypeptide encoded by the partial sequence of the human pheromone receptor hVR1.
SEQ ID NO. 53 is a partial nucleotide sequence of the human pheromone receptor hVN01.

SEQ ID NO. 54 is a partial nucleotide sequence of the human pheromone receptor hVN02.
SEQ ID NO. 55 is a partial nucleotide sequence of the human pheromone receptor hVN03.
SEQ ID NO. 56 is the nucleotide sequence of primer AL1.
SEQ ID NO. 57 is the nucleotide sequence of primer AL3.
SEQ ID NO. 58 is a fifty amino acid sequence of Go-VN13B (SEQ ID NO. 50) that is absent from Go-VN13C (SEQ ID NO. 48).
SEQ ID NO. 59 is the amino acid sequence of a rat kidney extracellular calcium/polyvalent cation-sensing receptor.
SEQ ID NO. 60 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 61 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 62 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 63 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 64 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 65 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 66 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 67 is a degenerate oligonucleotide primer from a conserved VR domain.
SEQ ID NO. 68 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR1.
SEQ ID NO. 69 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR2.
SEQ ID NO. 70 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR3.
SEQ ID NO. 71 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR4.
SEQ ID NO. 72 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR5.
SEQ ID NO. 73 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR6.
SEQ ID NO. 74 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR7.

SEQ ID NO. 75 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR8.
SEQ ID NO. 76 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR9.
SEQ ID NO. 77 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR10.
SEQ ID NO. 78 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR11.
SEQ ID NO. 79 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR12.
SEQ ID NO. 80 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR13.
SEQ ID NO. 81 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR14.
SEQ ID NO. 82 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR15.
SEQ ID NO. 83 is the nucleotide sequence of the coding region of the mouse pheromone receptor VR16.
SEQ ID NO. 84 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN1.
SEQ ID NO. 85 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN2.
SEQ ID NO. 86 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN3.
SEQ ID NO. 87 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN4.
SEQ ID NO. 88 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN5.
SEQ ID NO. 89 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN6.
SEQ ID NO. 90 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN7.

SEQ ID NO. 91 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN13C.
SEQ ID NO. 92 is the nucleotide sequence of the coding region of the rat pheromone receptor GoVN13B.
Detailed Description of the Invention
The present invention in one aspect involves the cloning of cDNAs encoding several members of a multigene family of pheromone receptors. Complete cDNA sequences for selected murine and rat pheromone receptors are provided. Partial sequences of the human gene also are provided. The present invention also relates to the discovery that this family of pheromone receptors is expressed in Gαo protein-expressing vomeronasal organ neurons ("Gαo+ VNO") or in another olfactory organ neuron in an animal (preferably, a mammal and more preferably, a human) which lacks a vomeronasal organ. Throughout this description, the pheromone receptors of the invention alternatively are referred to as "pheromone receptors", "Gαo+ VNO pheromone receptors" or, simply, "Gαo+ VNO receptors".
Analysis of the sequence homology between members of the receptor family by comparison to nucleic acid and protein databases established that the pheromone receptor family has several domains. These include, from amino terminus to carboxyl terminus:
(a) an amino-terminal extracellular domain containing from 30 to 600 amino acids; (b) a transmembrane region comprising: (i) seven non-contiguous transmembrane domains designated TM1, TM2, TM3, TM4, TM5, TM6 and TM7, (ii) three non-contiguous extracellular domains designated EC2, EC3 and EC4, and (iii) three non-contiguous intracellular domains designated IC1, IC2, and IC3, wherein the transmembrane domains, the extracellular domains and the intracellular domains are attached to one another from amino terminus to carboxyl terminus in the order TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7, and wherein the transmembrane region has at least about 35% homology and a length approximately equal to a transmembrane region of a polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and (c) a carboxyl-terminal intracellular domain containing from 5 to 200 amino acids. Each polypeptide member of the family is expressed in a Gαo protein-expressing vomeronasal organ neuron or is expressed in another olfactory organ neuron in an animal which does not possess a vomeronasal organ. One skilled in the art can readily identify olfactory organs in animals which do not possess a vomeronasal organ. The homology can be calculated using various publicly available software tools developed by NCBI (Bethesda, Maryland) that can be obtained through the Internet (ftp://ncbi.nlm.nih.gov/pub/). Exemplary tools include the BLAST system.
Pairwise and ClustalW alignments (BLOSUM30 matrix setting) as well as Kyte-Doolittle hydropathic analysis can be obtained using the MacVector sequence analysis software (Oxford Molecular Group).
The structure of the Gαo+ VNO pheromone receptors suggests that these receptors are members of the large G protein-coupled receptor (GPCR) superfamily. Like other GPCRs, the Gαo+ VNO pheromone receptors exhibit seven hydrophobic stretches ("hydrophobic domains") and are similar in structure to other types of GPCRs, the calcium sensing receptor (CSR; SEQ ID NO. 59) and the metabotropic glutamate receptors (mGluRs). The CSR and mGluRs are unusual among the GPCRs in that they have extremely long N-terminal extracellular domains (e.g., 557-565 amino acids), a feature that is shared by the pheromone receptors of the invention. Despite this similarity, the receptors of the invention do not share substantial primary structure homology with the CSR and mGluRs. The receptors of the invention also are very different structurally from two other G-protein coupled receptors, the odorant receptors and Gαi2+ vomeronasal receptors, which share none of the characteristic sequence motifs of the receptors of the invention and, moreover, which have very small (~12-28 amino acids) N-terminal extracellular domains.
The receptors of the invention differ somewhat in amino acid sequence, with regions of relatively high sequence homology. Refer to Examples 1 and 2 for a discussion and illustration of the amino acid sequence homology for the murine and rat Gαo+ VNO receptors, respectively.
Other features of these members of the Gαo+ VNO receptor family also are discussed and illustrated in the Examples. For example, signal sequences have been identified for several of the Gαo+ VNO receptors disclosed in the Examples.
Homologs and alleles of the pheromone receptor nucleic acids of the invention can be identified by conventional techniques. Thus, an aspect of the invention is those nucleic acid sequences (SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55) which code for Gαo+ VNO pheromone receptors and which hybridize to a nucleic acid molecule consisting of the coding region of any one Gαo+ VNO pheromone receptor selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52, under high or low stringency conditions. The term "high or low stringency conditions" as used herein refers to parameters with which the art is familiar. Nucleic acid hybridization parameters may be found in references which compile such methods, e.g. Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989, or Current Protocols in Molecular Biology, F.M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York. More specifically, high stringency conditions, as used herein, refer, for example, to hybridization at 65°C in hybridization buffer (3.5 x SSC, 0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% bovine serum albumin, 2.5 mM NaH2PO4 (pH 7), 0.5% SDS, 2 mM EDTA). SSC is 0.15 M sodium chloride/0.15 M sodium citrate, pH 7; SDS is sodium dodecyl sulphate; and EDTA is ethylenediaminetetraacetic acid. Low stringency conditions would be the same, but with a lower temperature (e.g., 55°C). After hybridization, the membrane upon which the DNA is transferred is washed at 2 x SSC at room temperature and then at 0.2 x SSC/0.5% SDS at temperatures of up to 65°C. Additional conditions of varying stringency are provided in the Examples.
There are other conditions, reagents, and so forth which can be used, which result in a similar degree of stringency. The skilled artisan will be familiar with such conditions, and thus they are not given here. It will be understood, however, that the skilled artisan will be able to manipulate the conditions in a manner to permit the clear identification of homologs and alleles of the Gαo+ VNO pheromone receptor nucleic acids of the invention. The skilled artisan also is familiar with the methodology for screening cells and libraries for expression of such molecules which then are routinely isolated, followed by isolation of the pertinent nucleic acid molecule and sequencing.
In general, homologs and alleles typically will share at least 35% nucleotide identity and/or at least 50% amino acid identity to the cDNAs encoding a Gαo+ VNO pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52, in some instances will share at least 50% nucleotide identity and/or at least 65% amino acid identity and in still other instances will share at least 60% nucleotide identity and/or at least 75% amino acid identity. Watson-Crick complements of the foregoing nucleic acids also are embraced by the invention.
As discussed above in the Summary of the invention, certain domains within the pheromone receptors may share even greater sequence homology to a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
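As a rough illustration of how the identity thresholds above might be applied, the following Python sketch computes percent identity from two pre-aligned sequences and reports the most stringent threshold pair that is met. It rests on several assumptions that are not in the text: the sequences are already aligned with "-" as the gap character, identity is counted only over columns without gaps (other conventions exist), and the names `percent_identity`, `identity_tier` and `TIERS` are hypothetical. In practice the alignment itself would come from a tool such as BLAST or ClustalW.

```python
from typing import Optional

# (label, minimum nucleotide identity %, minimum amino acid identity %),
# listed from most to least stringent, following the thresholds in the text.
TIERS = [
    ("60/75", 60.0, 75.0),
    ("50/65", 50.0, 65.0),
    ("35/50", 35.0, 50.0),
]

def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

def identity_tier(nt_identity: float, aa_identity: float) -> Optional[str]:
    """Return the most stringent tier met by either measure ("and/or"), else None."""
    for label, nt_min, aa_min in TIERS:
        if nt_identity >= nt_min or aa_identity >= aa_min:
            return label
    return None
```

Because the text uses "and/or", the sketch treats a tier as met when either the nucleotide or the amino acid threshold is satisfied.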

In screening for Gαo+ VNO pheromone receptor polypeptides, a Southern blot may be performed using the foregoing conditions, together with a radioactive probe. After washing the membrane to which the DNA is finally transferred, the membrane can be placed against X-ray film to detect the radioactive signal.
The invention also includes degenerate nucleic acids which include alternative codons to those present in the native materials. For example, serine residues are encoded by the codons TCA, AGT, TCC, TCG, TCT and AGC. Each of the six codons is equivalent for the purposes of encoding a serine residue. Thus, it will be apparent to one of ordinary skill in the art that any of the serine-encoding nucleotide triplets may be employed to direct the protein synthesis apparatus, in vitro or in vivo, to incorporate a serine residue into an elongating Gαo+ VNO pheromone receptor polypeptide. Similarly, nucleotide sequence triplets which encode other amino acid residues include, but are not limited to: CCA, CCC, CCG and CCT (proline codons); CGA, CGC, CGG, CGT, AGA and AGG (arginine codons); ACA, ACC, ACG and ACT (threonine codons); AAC and AAT (asparagine codons); and ATA, ATC and ATT (isoleucine codons). Other amino acid residues may be encoded similarly by multiple nucleotide sequences. Thus, the invention embraces degenerate nucleic acids that differ from the biologically isolated nucleic acids in codon sequence due to the degeneracy of the genetic code.
In addition, areas of high similarity among pheromone receptors may differ in amino acid sequences such that they share many, but not all, amino acids. Their nucleotide sequences all differ accordingly.
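The degeneracy described above can be made concrete with a short Python sketch that enumerates every nucleotide sequence encoding a given peptide. The codon table below contains only the six amino acids whose codons are listed in the text; the one-letter keys and the function name `degenerate_sequences` are illustrative choices introduced here, and the example peptide is arbitrary rather than taken from any SEQ ID NO.

```python
from itertools import product

# Codons for the amino acids enumerated in the text (DNA alphabet).
CODONS = {
    "S": ["TCA", "AGT", "TCC", "TCG", "TCT", "AGC"],  # serine
    "P": ["CCA", "CCC", "CCG", "CCT"],                # proline
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],  # arginine
    "T": ["ACA", "ACC", "ACG", "ACT"],                # threonine
    "N": ["AAC", "AAT"],                              # asparagine
    "I": ["ATA", "ATC", "ATT"],                       # isoleucine
}

def degenerate_sequences(peptide: str) -> list:
    """Return every DNA sequence that encodes `peptide` using CODONS.

    Raises KeyError for residues outside the partial table above.
    """
    choices = [CODONS[residue] for residue in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]
```

The number of encoding sequences is the product of the codon counts, so a Ser-Asn dipeptide has 6 x 2 = 12 encoding sequences and the count grows quickly with peptide length.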
The invention also provides isolated unique fragments of the cDNAs encoding a Gαo+ VNO polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52, or complements of these sequences. A unique fragment is one that is a 'signature' for the larger nucleic acid. It, for example, is long enough to assure that its precise sequence is not found in molecules outside of the Gαo+ VNO pheromone receptor nucleic acids defined above. Unique fragments can be used as probes in Southern blot assays to identify such nucleic acids, or can be used as primers in amplification assays such as those employing PCR. As known to those skilled in the art, large probes such as 200 nucleotides or more are preferred for certain uses such as Southern blots, while smaller fragments will be preferred for uses such as PCR. Unique fragments also can be used to produce fusion proteins for generating antibodies or determining binding of the polypeptide fragments, as demonstrated in the Examples, or for generating immunoassay components. Likewise, unique fragments can be employed to produce nonfused fragments of the Gαo+ VNO pheromone receptor polypeptides, useful, for example, in the preparation of antibodies, in immunoassays, and as a competitive binding partner of the pheromones and/or other ligands which bind to the Gαo+ VNO pheromone receptor polypeptides, for example, in therapeutic applications. Unique fragments further can be used as antisense molecules to inhibit the expression of Gαo+ VNO pheromone receptor nucleic acids and polypeptides, particularly for the insecticide and other fertility control purposes as described in greater detail below.
As will be recognized by those skilled in the art, the size of the unique fragment will depend upon its conservancy in the genetic code. Thus, some regions of a cDNA selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a Gαo+ VNO polypeptide, and its complement will require longer segments to be unique while others will require only short segments, typically between 12 and 32 nucleotides (e.g. 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 and 32 bases long). Virtually any segment of the region of the cDNAs encoding the full length Gαo+ VNO polypeptide or their complements, that is 18 or more nucleotides in length will be unique. Those skilled in the art are well versed in methods for selecting such sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from non-Gαo+ VNO pheromone receptor nucleic acids. A comparison of the sequence of the fragment to those on known data bases typically is all that is necessary, although in vitro confirmatory hybridization and sequencing analysis may be performed.
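The database comparison mentioned above can be sketched, in a much simplified form, as an exact-match search. The following Python example is only an illustration under stated assumptions: `background` stands in for a collection of non-receptor sequences (a real search would query a sequence database with a tool such as BLAST and tolerate mismatches), and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def is_unique_fragment(fragment: str, background: list) -> bool:
    """True if neither the fragment nor its reverse complement occurs
    exactly in any background sequence."""
    frag = fragment.upper()
    rc = reverse_complement(frag)
    return not any(frag in s.upper() or rc in s.upper() for s in background)

def unique_fragments(target: str, background: list, length: int = 18) -> list:
    """All windows of `length` bases in `target` that pass is_unique_fragment.

    The default of 18 follows the text's remark that segments of 18 or more
    nucleotides are virtually always unique.
    """
    target = target.upper()
    windows = (target[i:i + length] for i in range(len(target) - length + 1))
    return [w for w in windows if is_unique_fragment(w, background)]
```

Checking the reverse complement as well reflects that the text treats a cDNA and its complement together when defining unique fragments.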
As mentioned above, the invention embraces antisense oligonucleotides that selectively bind to a nucleic acid molecule encoding a Gαo+ VNO pheromone receptor polypeptide, to decrease a pheromone receptor activity (e.g., a ligand binding activity, a signal transduction activity). This is desirable in virtually any condition wherein a reduction in pheromone binding or induction of a behavior that is triggered by pheromone binding is desirable, including to control fertility and behavior in vertebrates and invertebrates. The compositions of the invention are particularly useful in, for example, controlling fertility in livestock and controlling reproduction in rodents or insects by interrupting the normal behaviors of rodents or insects that result in reproduction. As used herein, the term "antisense oligonucleotide" or "antisense" describes an oligonucleotide that is an oligoribonucleotide, oligodeoxyribonucleotide, modified oligoribonucleotide, or modified oligodeoxyribonucleotide which hybridizes under physiological conditions to DNA comprising a particular gene or to an mRNA transcript of that gene and, thereby, inhibits the transcription of that gene and/or the translation of that mRNA. The antisense molecules are designed so as to interfere with transcription or translation of a target gene upon hybridization with the target gene or transcript. Those skilled in the art will recognize that the exact length of the antisense oligonucleotide and its degree of complementarity with its target will depend upon the specific target selected, including the sequence of the target and the particular bases which comprise that sequence. It is preferred that the antisense oligonucleotide be constructed and arranged so as to bind selectively with the target under physiological conditions, i.e., to hybridize substantially more to the target sequence than to any other sequence in the target cell under physiological conditions. Based upon the cDNA sequences of Examples 1 and 2 (SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55), or upon allelic or homologous genomic and/or cDNA sequences, one of skill in the art can easily choose and synthesize any of a number of appropriate antisense molecules for use in accordance with the present invention. In order to be sufficiently selective and potent for inhibition, such antisense oligonucleotides should comprise at least 10 and, more preferably, at least 15 consecutive bases which are complementary to the target, although in certain cases modified oligonucleotides as short as 7 bases in length have been used successfully as antisense oligonucleotides (Wagner et al., Nature Biotechnol. 14:840-844, 1996).
Most preferably, the antisense oligonucleotides comprise a complementary sequence of 20-30 bases. Although oligonucleotides may be chosen which are antisense to any region of the gene or mRNA transcripts, in preferred embodiments the antisense oligonucleotides correspond to N-terminal or 5' upstream sites such as translation initiation, transcription initiation or promoter sites. In addition, 3'-untranslated regions may be targeted. Targeting to mRNA splicing sites has also been used in the art but may be less preferred if alternative mRNA splicing occurs. In addition, the antisense is targeted, preferably, to sites in which mRNA secondary structure is not expected (see, e.g., Sainio et al., Cell Mol. Neurobiol. 14(5):439-457, 1994) and at which proteins are not expected to bind. Finally, although Examples 1 and 2 disclose cDNA sequences (SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55), one of ordinary skill in the art may easily derive the genomic DNA corresponding to these cDNAs. Thus, the present invention also provides for antisense oligonucleotides which are complementary to the genomic DNA corresponding to a cDNA sequence selected from the group consisting of SEQ ID NOs. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55. Similarly, antisense to allelic or homologous cDNAs and genomic DNAs are enabled without undue experimentation.
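The selection rule described above (an oligonucleotide complementary to a chosen window of the transcript, at least 10 bases long and preferably positioned at the translation initiation site) can be sketched as follows. This is a minimal illustration only: the function names, the `min_length` parameter, and the idea of centring the window on the first AUG are assumptions made for the example, and it does not attempt the secondary-structure or protein-binding checks mentioned in the text.

```python
# Minimal sketch, assuming a plain-text transcript; not a validated design tool.
# Maps each transcript base (RNA or DNA alphabet) to its DNA complement.
COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_oligo(mrna: str, start: int, length: int = 20, min_length: int = 10) -> str:
    """Return a DNA oligo (written 5'->3') complementary to mrna[start:start+length]."""
    if length < min_length:
        raise ValueError("oligo shorter than the chosen minimum length")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the transcript")
    # Complement every base, then reverse so the result reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


def oligo_around_start_codon(mrna: str, length: int = 20) -> str:
    """Target a window roughly centred on the first AUG (hypothetical helper)."""
    aug = mrna.upper().replace("T", "U").find("AUG")
    if aug < 0:
        raise ValueError("no AUG found in transcript")
    start = max(0, aug - (length - 3) // 2)
    return antisense_oligo(mrna, start, length)
```

The default length of 20 follows the "20-30 bases" preference stated above; a caller working with modified oligonucleotides could lower `min_length` deliberately.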
In one set of embodiments, the antisense oligonucleotides of the invention may be composed of "natural" deoxyribonucleotides, ribonucleotides, or any combination thereof. That is, the 5' end of one native nucleotide and the 3' end of another native nucleotide may be covalently linked, as in natural systems, via a phosphodiester internucleoside linkage. These oligonucleotides may be prepared by art recognized methods which may be carried out manually or by an automated synthesizer. They also may be produced recombinantly by vectors.
In preferred embodiments, however, the antisense oligonucleotides of the invention also may include "modified" oligonucleotides. That is, the oligonucleotides may be modified in a number of ways which do not prevent them from hybridizing to their target but which enhance their stability or targeting or which otherwise enhance their therapeutic effectiveness.
The term "modified oligonucleotide" as used herein describes an oligonucleotide in which (1) at least two of its nucleotides are covalently linked via a synthetic internucleoside linkage (i.e., a linkage other than a phosphodiester linkage between the 5' end of one nucleotide and the 3' end of another nucleotide) and/or (2) a chemical group not normally associated with nucleic acids has been covalently attached to the oligonucleotide. Preferred synthetic internucleoside linkages are phosphorothioates, alkylphosphonates, phosphorodithioates, phosphate esters, alkylphosphonothioates, phosphoramidates, carbamates, carbonates, phosphate triesters, acetamidates, carboxymethyl esters and peptides.
The term "modified oligonucleotide" also encompasses oligonucleotides with a covalently modified base and/or sugar. For example, modified oligonucleotides include oligonucleotides having backbone sugars which are covalently attached to low molecular weight organic groups other than a hydroxyl group at the 3' position and other than a phosphate group at the 5' position. Thus modified oligonucleotides may include a 2'-O-alkylated ribose group.
In addition, modified oligonucleotides may include sugars such as arabinose instead of ribose.
The present invention, thus, contemplates pharmaceutical preparations containing modified antisense molecules that are complementary to and hybridizable with, under physiological conditions, nucleic acids encoding pheromone receptor polypeptides, together with pharmaceutically acceptable carriers.

Antisense oligonucleotides may be administered as part of a pharmaceutical composition.
Such a pharmaceutical composition may include the antisense oligonucleotides in combination with any standard physiologically and/or pharmaceutically acceptable carriers which are known in the art. The compositions should be sterile and contain a therapeutically effective amount of the antisense oligonucleotides in a unit of weight or volume suitable for administration to a patient. The term "pharmaceutically acceptable" means a non-toxic material that does not interfere with the effectiveness of the biological activity of the active ingredients. The term "physiologically acceptable" refers to a non-toxic material that is compatible with a biological system such as a cell, cell culture, tissue, or organism. The characteristics of the carrier will depend on the route of administration. Physiologically and pharmaceutically acceptable carriers include diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials which are well known in the art.
As used herein, a "vector" may be any of a number of nucleic acids into which a desired sequence may be inserted by restriction and ligation for transport between different genetic environments or for expression in a host cell. Vectors are typically composed of DNA although RNA vectors are also available. Vectors include, but are not limited to, plasmids, phagemids and virus genomes. A cloning vector is one which is able to replicate in a host cell, and which is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host bacterium or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase. An expression vector is one into which a desired DNA sequence may be inserted by restriction and ligation such that it is operably joined to regulatory sequences and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for use in the identification of cells which have or have not been transformed or transfected with the vector. Markers include, for example, genes encoding proteins which increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes which encode enzymes whose activities are detectable by standard assays known in the art (e.g., β-galactosidase or alkaline phosphatase), and genes which visibly affect the phenotype of transformed or transfected cells, hosts, colonies or plaques (e.g., green fluorescent protein).

Preferred vectors are those capable of autonomous replication and expression of the structural gene products present in the DNA segments to which they are operably joined.
As used herein, a coding sequence and regulatory sequences are said to be "operably"
joined when they are covalently linked in such a way as to place the expression or transcription of the coding sequence under the influence or control of the regulatory sequences. If it is desired that the coding sequences be translated into a functional protein, two DNA
sequences are said to be operably joined if induction of a promoter in the 5' regulatory sequences results in the transcription of the coding sequence and if the nature of the linkage between the two DNA
sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequences, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region would be operably joined to a coding sequence if the promoter region were capable of effecting transcription of that DNA sequence such that the resulting transcript might be translated into the desired protein or polypeptide.
The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but shall in general include, as necessary, 5' non-transcribed and 5' non-translated sequences involved with the initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. Especially, such 5' non-transcribed regulatory sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene.
Regulatory sequences may also include enhancer sequences or upstream activator sequences as desired. The vectors of the invention may optionally include 5' leader or signal sequences. The choice and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art.
Expression vectors containing all the necessary elements for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al., Molecular Cloning:
A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989. Cells are genetically engineered by the introduction into the cells of heterologous DNA
(RNA) encoding pheromone receptor polypeptide or fragment or variant thereof. That heterologous DNA (RNA) is placed under operable control of transcriptional elements to permit the expression of the heterologous DNA in the host cell.
Preferred systems for mRNA expression in mammalian cells are those such as pRc/CMV
(available from Invitrogen, Carlsbad, CA) that contain a selectable marker such as a gene that confers G418 resistance (which facilitates the selection of stably transfected cell lines) and the human cytomegalovirus (CMV) enhancer-promoter sequences. Additionally, suitable for expression in primate or canine cell lines is the pCEP4 vector (Invitrogen), which contains an Epstein Barr virus (EBV) origin of replication, facilitating the maintenance of plasmid as a multicopy extrachromosomal element. Another expression vector is the pEF-BOS plasmid containing the promoter of polypeptide Elongation Factor 1α, which efficiently stimulates transcription in vitro. The plasmid is described by Mizushima and Nagata (Nuc. Acids Res. 18:5322, 1990), and its use in transfection experiments is disclosed by, for example, Demoulin (Mol. Cell. Biol. 16:4710-4716, 1996). Still another preferred expression vector is an adenovirus, described by Stratford-Perricaudet, which is defective for E1 and E3 proteins (J. Clin. Invest. 90:626-630, 1992). The use of the adenovirus as an Adeno.P1A recombinant is disclosed by Warnier et al., in intradermal injection in mice for immunization against P1A (Int. J. Cancer, 67:303-310, 1996).
The invention also embraces so-called expression kits, which allow the artisan to prepare a desired expression vector or vectors. Such expression kits include at least separate portions of each of the previously discussed coding sequences. Other components may be added, as desired, as long as the previously mentioned sequences, which are required, are included.
The invention also permits the construction of pheromone receptor gene "knock-outs"
in cells and in animals, providing materials for studying certain aspects of pheromone receptor binding, signal transduction activity, or function.
The invention also provides isolated polypeptides, which include a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52 and unique fragments of these pheromone receptor polypeptides. Such polypeptides are useful, for example, alone or as fusion proteins to generate antibodies.
A unique fragment of a pheromone receptor polypeptide, in general, has the features and characteristics of unique fragments as discussed above in connection with nucleic acids. As will be recognized by those skilled in the art, the size of the unique fragment will depend upon factors such as whether the fragment constitutes a portion of a conserved protein domain. Thus, some regions of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52 will require longer segments to be unique while others will require only short segments, typically between 5 and 12 amino acids (e.g. 5, 6, 7, 8, 9, 10, 11 and 12 amino acids long).
Unique fragments of a polypeptide preferably are those fragments which retain a distinct functional capability of the polypeptide. Functional capabilities which can be retained in a unique fragment of a polypeptide include interaction with antibodies, interaction with other polypeptides (G-proteins) or molecules (e.g., a ligand) or fragments thereof, selective binding of nucleic acids or proteins, and enzymatic activity. Those skilled in the art are well versed in methods for selecting unique amino acid sequences, typically on the basis of the ability of the unique fragment to selectively distinguish the sequence of interest from non-family members.
A comparison of the sequence of the fragment to those on known databases typically is all that is necessary.
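The database comparison described above can be illustrated with a small exact-match sketch: every window of a chosen length in the query polypeptide is checked against a set of other sequences, and only windows found in none of them are kept. The function name and the in-memory list standing in for a "known database" are assumptions made for the example; a real search would normally use an alignment tool that tolerates mismatches.

```python
# Minimal sketch, assuming exact matching against a small in-memory sequence list.
def unique_fragments(query: str, others: list[str], k: int) -> list[str]:
    """Return the k-residue windows of `query` that occur in none of `others`."""
    seen = set()
    for seq in others:
        # Collect every k-residue window present in the comparison sequences.
        seen.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    unique = []
    for i in range(len(query) - k + 1):
        frag = query[i:i + k]
        if frag not in seen and frag not in unique:
            unique.append(frag)
    return unique
```

Calling it with `k` between 5 and 12 mirrors the short-segment range given in the text; conserved regions would simply yield no unique windows at small `k`.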
The invention embraces variants of the pheromone receptor polypeptides described above. As used herein, a "variant" of a pheromone receptor polypeptide is a polypeptide which contains one or more modifications to the primary amino acid sequence of a pheromone receptor polypeptide. Modifications which create a pheromone receptor variant can be made to a pheromone receptor polypeptide 1) to reduce or eliminate an activity of a pheromone receptor polypeptide, such as a ligand binding activity or a signal transduction activity; 2) to enhance a property of a pheromone receptor polypeptide, such as protein stability in an expression system or the stability of protein-protein binding; or 3) to provide a novel activity or property to a pheromone receptor polypeptide, such as addition of an antigenic epitope or addition of a detectable moiety. Modifications to a pheromone receptor polypeptide are typically made to the nucleic acid which encodes the pheromone receptor polypeptide, and can include deletions, point mutations, truncations, amino acid substitutions and additions of amino acids or non-amino acid moieties. Alternatively, modifications can be made directly to the polypeptide, such as by cleavage, addition of a linker molecule, addition of a detectable moiety, such as biotin, addition of a fatty acid, and the like. Modifications also embrace fusion proteins comprising all or part of the pheromone receptor amino acid sequence.
In general, variants include pheromone receptor polypeptides which are modified specifically to alter a feature of the polypeptide unrelated to its physiological activity. For example, cysteine residues can be substituted or deleted to prevent unwanted disulfide linkages.
Similarly, certain amino acids can be changed to enhance expression of a pheromone receptor polypeptide by eliminating proteolysis by proteases in an expression system.

Mutations of a nucleic acid which encodes a pheromone receptor polypeptide preferably preserve the amino acid reading frame of the coding sequence, and preferably do not create regions in the nucleic acid which are likely to hybridize to form secondary structures, such as hairpins or loops, which can be deleterious to expression of the variant polypeptide.
Mutations can be made by selecting an amino acid substitution, or by random mutagenesis of a selected site in a nucleic acid which encodes the polypeptide. Variant polypeptides are then expressed and tested for one or more activities to determine which mutation provides a variant polypeptide with the desired properties. Further mutations can be made to variants (or to non-variant pheromone receptor polypeptides) which are silent as to the amino acid sequence of the polypeptide, but which provide preferred codons for translation in a particular host. The preferred codons for translation of a nucleic acid in, e.g., E. coli, are well known to those of ordinary skill in the art. Still other mutations can be made to the noncoding sequences of a pheromone receptor gene or cDNA clone to enhance expression of the polypeptide. The activity of variants of pheromone receptor polypeptides can be tested by cloning the gene encoding the variant pheromone receptor polypeptide into a bacterial or mammalian expression vector, introducing the vector into an appropriate host cell, expressing the variant pheromone receptor polypeptide, and testing for a functional capability of the pheromone receptor polypeptides as disclosed herein. For example, the variant pheromone receptor polypeptide can be tested for a ligand binding activity, wherein a ligand to which the receptor binds is contacted with the variant receptor and the amount of ligand binding to the variant receptor is determined using conventional procedures to measure the binding of one molecule to another. Preparation of other variant polypeptides may favor testing of other activities, as will be known to one of ordinary skill in the art.
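The notion of a mutation that is "silent as to the amino acid sequence" while swapping in a host-preferred codon can be sketched with the standard genetic code. The helper names below are hypothetical, and the `preferred` mapping is a caller-supplied placeholder rather than an authoritative codon-usage table for any host.

```python
# Minimal sketch using the standard genetic code; `preferred` is illustrative input only.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codons enumerated in TCAG order line up with the amino acid string above.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if both sequences have equal length and encode the same polypeptide."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred codon of its amino acid, where one is given."""
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3].upper()
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)
        if CODON_TABLE[choice] != aa:
            raise ValueError("preferred codon does not encode the same amino acid")
        out.append(choice)
    return "".join(out)
```

Checking `is_silent(original, recode(original, preferred))` after recoding gives a quick confirmation that the reading frame and encoded sequence were preserved.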
The skilled artisan will also realize that conservative amino acid substitutions may be made in pheromone receptor polypeptides to provide functionally equivalent variants of the foregoing polypeptides, i.e., the variants retain the functional capabilities of the pheromone receptor polypeptides. As used herein, a "conservative amino acid substitution" refers to an amino acid substitution which does not alter the relative charge or size characteristics of the protein in which the amino acid substitution is made. Variants can be prepared according to methods for altering polypeptide sequence known to one of ordinary skill in the art such as are found in references which compile such methods, e.g. Molecular Cloning: A
Laboratory Manual, J. Sambrook, et al., eds., Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989, or Current Protocols in Molecular Biology, F.M.
Ausubel, et al., eds., John Wiley & Sons, Inc., New York. To a certain extent, the various members of the pheromone receptor family that are illustrated in the Examples represent exemplary functionally equivalent variants of the pheromone receptor polypeptides. Other functionally equivalent variants include conservative amino acid substitutions of the amino acids of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52. Conservative substitutions of amino acids include substitutions made amongst amino acids within the following groups:
(a) M, I, L, V; (b) F, Y, W; (c) K, R, H; (d) A, G; (e) S, T; (f) Q, N; and (g) E, D.
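The seven groups listed above translate directly into a lookup that classifies a substitution as conservative or not. The following is a minimal sketch; the function names and the position-by-position comparison of two equal-length sequences are assumptions made for illustration.

```python
# Minimal sketch of the groups (a) through (g) given in the text.
GROUPS = ["MILV", "FYW", "KRH", "AG", "ST", "QN", "ED"]
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def is_conservative(a: str, b: str) -> bool:
    """True if a -> b swaps two different residues belonging to the same group."""
    a, b = a.upper(), b.upper()
    return a != b and a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b)


def classify(reference: str, variant: str) -> list[tuple[int, str, str, bool]]:
    """List (1-based position, reference residue, variant residue, conservative?)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]
```

Residues that appear in none of the listed groups (cysteine and proline, for example) are never reported as conservative by this sketch.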
Conservative amino-acid substitutions in the amino acid sequence of pheromone receptor polypeptides to produce functionally equivalent variants of pheromone receptor polypeptides typically are made by alteration of the nucleic acid encoding pheromone receptor polypeptides.
Such substitutions can be made by a variety of methods known to one of ordinary skill in the art.
For example, amino acid substitutions may be made by PCR-directed mutation, site-directed mutagenesis according to the method described in Proc. Nat. Acad. Sci. U.S.A. 82: 488-492, 1985, or by chemical synthesis of a gene encoding a pheromone receptor polypeptide. Where amino acid substitutions are made to a small unique fragment of a pheromone receptor polypeptide, such as a ligand binding site peptide, the substitutions can be made by directly synthesizing the peptide. The activity of functionally equivalent fragments of pheromone receptor polypeptides can be tested by cloning the gene encoding the altered pheromone receptor polypeptide into a bacterial or mammalian expression vector, introducing the vector into an appropriate host cell, expressing the altered pheromone receptor polypeptide, and testing for a functional capability of the pheromone receptor polypeptides as disclosed herein. Peptides which are chemically synthesized can be tested directly for function, e.g., for binding to a ligand to which the unaltered pheromone receptor is known to bind.
The invention as described herein has a number of uses, some of which are described elsewhere herein. First, the invention permits isolation of the pheromone receptor polypeptides of the Examples. A variety of methodologies well-known to the skilled practitioner can be utilized to obtain isolated pheromone receptor molecules. The polypeptide may be purified from cells which naturally produce the polypeptide by chromatographic means or immunological recognition. Alternatively, an expression vector may be introduced into cells to cause production of the polypeptide. In another method, mRNA transcripts may be microinjected or otherwise introduced into cells to cause production of the encoded polypeptide.
Translation of mRNA in cell-free extracts such as the reticulocyte lysate system also may be used to produce polypeptide.
Those skilled in the art also can readily follow known methods for isolating pheromone receptor polypeptides. These include, but are not limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography and immune-affinity chromatography.
The isolation of the pheromone receptor gene also makes it possible for the artisan to diagnose a disorder characterized by expression of pheromone receptor. These methods involve determining expression of the pheromone receptor gene, and/or pheromone receptor polypeptides derived therefrom. In the former situation, such determinations can be carried out via any standard nucleic acid determination assay, including the polymerase chain reaction as exemplified in the examples below, or assaying with labeled hybridization probes.
The invention also makes it possible to isolate the naturally occurring ligands (pheromones) and other ligands that have a ligand binding domain, namely, by the binding of such molecules to the pheromone receptor polypeptides (or fragments thereof containing a ligand binding site). Binding of the receptors to a ligand can be accomplished by introducing into a biological system in which the proteins bind (e.g., a cell) a molecule that includes a binding domain (putative ligand) in an amount sufficient to detect the binding.
The invention also provides agents such as binding polypeptides which bind to pheromone receptor polypeptides and/or to complexes of pheromone receptor polypeptides and their ligand binding partners. Such binding agents can be used, for example, in screening assays to detect the presence or absence of pheromone receptor polypeptides and complexes of pheromone receptor polypeptides and their ligand binding partners and in purification protocols to isolate pheromone receptor polypeptides and complexes of pheromone receptor polypeptides and their ligand binding partners. Such agents also can be used to inhibit the native activity of the pheromone receptor polypeptides or their ligand binding partners, for example, by binding to such polypeptides, or their binding partners or both.
The invention, therefore, embraces peptide binding agents which, for example, can be antibodies or fragments of antibodies having the ability to selectively bind to pheromone receptor polypeptides. Antibodies include polyclonal and monoclonal antibodies, prepared according to conventional methodology.

Significantly, as is well-known in the art, only a small portion of an antibody molecule, the paratope, is involved in the binding of the antibody to its epitope (see, in general, Clark, W.R.
(1986) The Experimental Foundations of Modern Immunology Wiley & Sons, Inc., New York;
Roitt, I. (1991) Essential Immunology, 7th Ed., Blackwell Scientific Publications, Oxford). The pFc' and Fc regions, for example, are effectors of the complement cascade but are not involved in antigen binding. An antibody from which the pFc' region has been enzymatically cleaved, or which has been produced without the pFc' region, designated an F(ab')2 fragment, retains both of the antigen binding sites of an intact antibody. Similarly, an antibody from which the Fc region has been enzymatically cleaved, or which has been produced without the Fc region, designated an Fab fragment, retains one of the antigen binding sites of an intact antibody molecule. Proceeding further, Fab fragments consist of a covalently bound antibody light chain and a portion of the antibody heavy chain denoted Fd. The Fd fragments are the major determinant of antibody specificity (a single Fd fragment may be associated with up to ten different light chains without altering antibody specificity) and Fd fragments retain epitope binding ability in isolation.
Within the antigen-binding portion of an antibody, as is well-known in the art, there are complementarity determining regions (CDRs), which directly interact with the epitope of the antigen, and framework regions (FRs), which maintain the tertiary structure of the paratope (see, in general, Clark, 1986; Roitt, 1991). In both the heavy chain Fd fragment and the light chain of IgG immunoglobulins, there are four framework regions (FR1 through FR4) separated respectively by three complementarity determining regions (CDR1 through CDR3).
The CDRs, and in particular the CDR3 regions, and more particularly the heavy chain CDR3, are largely responsible for antibody specificity.
It is now well-established in the art that the non-CDR regions of a mammalian antibody may be replaced with similar regions of nonspecific or heterospecific antibodies while retaining the epitopic specificity of the original antibody. This is most clearly manifested in the development and use of "humanized" antibodies in which non-human CDRs are covalently joined to human FR and/or Fc/pFc' regions to produce a functional antibody.
Thus, for example, PCT International Publication Number WO 92/04381 teaches the production and use of humanized murine RSV antibodies in which at least a portion of the murine FR regions have been replaced by FR regions of human origin. Such antibodies, including fragments of intact antibodies with antigen-binding ability, are often referred to as "chimeric"
antibodies.

Thus, as will be apparent to one of ordinary skill in the art, the present invention also provides for F(ab')2, Fab, Fv and Fd fragments; chimeric antibodies in which the Fc and/or FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; chimeric F(ab')2 fragment antibodies in which the FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; chimeric Fab fragment antibodies in which the FR and/or CDR1 and/or CDR2 and/or light chain CDR3 regions have been replaced by homologous human or non-human sequences; and chimeric Fd fragment antibodies in which the FR and/or CDR1 and/or CDR2 regions have been replaced by homologous human or non-human sequences. The present invention also includes so-called single chain antibodies.
Thus, the invention involves polypeptides of numerous sizes and types that bind specifically to pheromone receptor polypeptides, and/or complexes of both pheromone receptor polypeptides and their ligand binding partners. These polypeptides may be derived also from sources other than antibody technology. For example, such polypeptide binding agents can be provided by degenerate peptide libraries which can be readily prepared in solution, in immobilized form or as phage display libraries. Combinatorial libraries also can be synthesized of peptides containing one or more amino acids. Libraries further can be synthesized of peptoids and non-peptide synthetic moieties.
Phage display can be particularly effective in identifying binding peptides useful according to the invention. Briefly, one prepares a phage library (using e.g. M13, fd, or lambda phage), displaying inserts from 4 to about 80 amino acid residues using conventional procedures.
The inserts may represent, for example, a completely degenerate or biased array. One then can select phage-bearing inserts which bind to the pheromone receptor polypeptide.
This process can be repeated through several cycles of reselection of phage that bind to the pheromone receptor polypeptide. Repeated rounds lead to enrichment of phage bearing particular sequences. DNA sequence analysis can be conducted to identify the sequences of the expressed polypeptides. The minimal linear portion of the sequence that binds to the pheromone receptor polypeptide can be determined. One can repeat the procedure using a biased library containing inserts containing part or all of the minimal linear portion plus one or more additional degenerate residues upstream or downstream thereof. Yeast two-hybrid screening methods also may be used to identify polypeptides that bind to the pheromone receptor polypeptides.
Thus, the pheromone receptor polypeptides of the invention, or a fragment thereof, can be used to screen peptide libraries, including phage display libraries, to identify and select peptide binding partners of the pheromone receptor polypeptides of the invention. Such molecules can be used, as described, for screening assays, for purification protocols, for interfering directly with the functioning of pheromone receptor and for other purposes that will be apparent to those of ordinary skill in the art.
A pheromone receptor polypeptide, or a fragment which contains the ligand binding site, also can be used to isolate naturally-occurring ligands and other binding partners of the receptors of the invention. For example, an isolated pheromone receptor can be used to isolate ligands that bind to the receptor binding site by immobilizing a receptor (or fragment containing the ligand binding site) on a chromatographic medium, such as polystyrene beads, or a filter, and using the immobilized polypeptide to isolate molecules that bind to this affinity matrix in accordance with standard procedures for affinity chromatography.
It will also be recognized that the invention embraces the use of the pheromone receptor cDNA sequences in expression vectors, as well as to transfect host cells and cell lines, be these prokaryotic (e.g., E. coli), or eukaryotic (e.g., CHO cells, COS cells, yeast expression systems and recombinant baculovirus expression in insect cells). Especially useful are oocytes, mammalian cells such as mouse, hamster, pig, goat, primate, etc. They may be of a wide variety of tissue types, and include primary cells and cell lines. The expression vectors require that the pertinent sequence, i.e., those nucleic acids described supra, be operably linked to a promoter.
When administered, the therapeutic compositions of the present invention are administered in pharmaceutically acceptable preparations. Such preparations may routinely contain pharmaceutically acceptable concentrations of salt, buffering agents, preservatives, compatible carriers, supplementary immune potentiating agents such as adjuvants and cytokines and optionally other therapeutic agents.
The therapeutics of the invention can be administered by any conventional route, including injection or by gradual infusion over time. The administration may, for example, be oral, intravenous, intraperitoneal, intramuscular, intracavity, subcutaneous, or transdermal.
When antibodies are used therapeutically, a preferred route of administration is by pulmonary aerosol. Techniques for preparing aerosol delivery systems containing antibodies are well known to those of skill in the art. Generally, such systems should utilize components which will not significantly impair the biological properties of the antibodies, such as the paratope binding capacity (see, for example, Sciarra and Cutie, "Aerosols," in Remington's Pharmaceutical Sciences, 18th edition, 1990, pp 1694-1712; incorporated by reference). Those of skill in the art can readily determine the various parameters and conditions for producing antibody aerosols without resort to undue experimentation. When using antisense preparations of the invention, slow intravenous administration is preferred.
Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated Ringer's or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on Ringer's dextrose), and the like. Preservatives and other additives may also be present such as, for example, antimicrobials, anti-oxidants, chelating agents, and inert gases and the like.
The preparations of the invention are administered in effective amounts. An effective amount is that amount of a pharmaceutical preparation that alone, or together with further doses, produces the desired response in the condition being treated, e.g., modifying fertility or pheromone-mediated behaviors that are related to reproduction or aggression. For example, this can involve the use of the compounds of the invention as pesticides to slow or halt insect or rodent behaviors that result in reproduction. Alternatively, this can involve the use of the compounds of the invention as agents for controlling fertility in animals (e.g., livestock, domestic animals), by providing compounds which inhibit or stimulate the behaviors in such animals that result in reproduction or aggression. This can be monitored by routine methods, e.g., observing the behavior in the animal (vertebrate or invertebrate) recipient.
The invention also contemplates gene therapy, e.g., to prepare an animal model for studying the conditions and behaviors (e.g., fertility, aggression) that are pheromone receptor-mediated. The procedure for performing ex vivo gene therapy is outlined in U.S. Patent 5,399,346 and in exhibits submitted in the file history of that patent, all of which are publicly available documents. In general, it involves introduction in vitro of a functional copy of a gene into a cell(s) of a subject which contains a defective copy of the gene, and returning the genetically engineered cell(s) to the subject. The functional copy of the gene is under operable control of regulatory elements which permit expression of the gene in the genetically engineered cell(s). Numerous transfection and transduction techniques as well as appropriate expression vectors are well known to those of ordinary skill in the art, some of which are described in PCT application WO95/00654. In vivo gene therapy using vectors such as adenovirus, retroviruses, herpes virus, and targeted liposomes also is contemplated according to the invention.
The invention further provides efficient methods of identifying pharmacological agents or lead compounds for agents active at the level of a pheromone receptor or pheromone receptor fragment modulatable cellular function. In particular, such functions include ligand binding activity. Generally, the screening methods involve assaying for activation of pheromone receptors or assaying for compounds which interfere with a pheromone receptor activity such as pheromone receptor binding to its cognate ligand. Such methods are adaptable to automated, high throughput screening of compounds. The target therapeutic indications for pharmacological agents detected by the screening methods that block pheromone receptor activity are limited only in that the target cellular function be subject to modulation by alteration of the formation of a complex comprising a pheromone receptor polypeptide or fragment thereof and one or more natural pheromone receptor ligands. Target indications include cellular processes modulated by pheromone receptor signal transduction following receptor-ligand binding.
A wide variety of assays for pharmacological agents are provided, including labeled in vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays, cell-based assays such as two- or three-hybrid screens, expression assays, activation of G-proteins, etc. For example, three-hybrid screens are used to rapidly examine the effect of transfected nucleic acids on the intracellular binding of pheromone receptor or pheromone receptor fragments to specific extracellular targets (e.g., ligands in biological samples, such as urine, vaginal fluid, or in combinatorial libraries).
Pheromone receptor fragments used in the methods, when not produced by a transfected nucleic acid, are added to an assay mixture as an isolated polypeptide. The assay can be used to screen putative ligands for their ability to bind to the receptor. Pheromone receptor polypeptides preferably are produced recombinantly, although such polypeptides may be isolated from biological extracts. Recombinantly produced pheromone receptor polypeptides include chimeric proteins comprising a fusion of a pheromone receptor protein with another polypeptide. For example, a polypeptide fused to a pheromone receptor polypeptide or fragment may also provide means of readily detecting the fusion protein, e.g., by immunological recognition or by fluorescent labeling.

In addition to the pheromone receptor, a screening assay mixture includes a binding partner for the receptor, e.g., a naturally occurring ligand that is capable of binding to the pheromone receptor or, alternatively, an analog which mimics the pheromone receptor binding properties of the naturally occurring ligand for purposes of the assay. The screening assay mixture also comprises a candidate pharmacological agent (e.g., a putative receptor agonist or antagonist). Typically, a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a different response to the various concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration of agent or at a concentration of agent below the limits of assay detection.
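By way of illustration only, the parallel assay mixtures described above can be laid out as a dilution series that ends in a zero-agent negative control. The following is a minimal sketch; the function name, the constant fold dilution, and the micromolar units are assumptions made for the example and are not specified in this description.

```python
def dilution_series(top_conc_uM: float, fold: float, points: int) -> list[float]:
    """Return agent concentrations (µM) for parallel assay mixtures.

    The series starts at top_conc_uM, is diluted by a constant factor
    `fold` at each step, and ends with 0.0 as the negative control.
    All three parameters are hypothetical choices for illustration.
    """
    if top_conc_uM <= 0 or fold <= 1 or points < 1:
        raise ValueError("need top_conc_uM > 0, fold > 1, points >= 1")
    series = [top_conc_uM / fold**i for i in range(points)]
    return series + [0.0]  # zero-agent negative control


if __name__ == "__main__":
    for conc in dilution_series(100.0, 10.0, 4):
        print(f"{conc:g} µM")
```

A concentration below the limit of assay detection could serve as the control instead of zero, as the text permits.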
Candidate agents encompass numerous chemical classes, although typically they are organic compounds. Preferably, the candidate pharmacological agents are small organic compounds, i.e., those having a molecular weight of more than 50 yet less than about 2500, preferably less than about 1000 and, more preferably, less than about 500. Candidate agents comprise functional chemical groups necessary for structural interactions with polypeptides and/or nucleic acids, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups and more preferably at least three of the functional chemical groups. The candidate agents can comprise cyclic carbon or heterocyclic structure and/or aromatic or polyaromatic structures substituted with one or more of the above-identified functional groups. Candidate agents also can be biomolecules such as peptides, saccharides, fatty acids, sterols, isoprenoids, purines, pyrimidines, derivatives or structural analogs of the above, or combinations thereof and the like. Where the agent is a nucleic acid, the agent typically is a DNA or RNA molecule, although modified nucleic acids as defined herein are also contemplated.
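By way of illustration only, the molecular weight tiers and functional group preferences stated above can be applied to a compound list as follows. This is a minimal sketch: the `Candidate` record, the tier labels, and the treatment of the "about" limits as exact cut-offs are assumptions made for the example.

```python
from dataclasses import dataclass

# The four functional groups named in the text; the string labels are illustrative.
FUNCTIONAL_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str              # hypothetical compound identifier
    mol_weight: float      # molecular weight
    groups: frozenset      # functional groups present on the compound


def weight_tier(c: Candidate) -> str:
    """Classify a candidate by the molecular weight ranges given in the text.

    The text says "about" 2500, 1000 and 500; exact cut-offs are assumed here.
    """
    if not (50 < c.mol_weight < 2500):
        return "outside range"
    if c.mol_weight < 500:
        return "more preferred"
    if c.mol_weight < 1000:
        return "preferred"
    return "acceptable"


def functional_group_count(c: Candidate) -> int:
    """Count how many of the named functional groups the candidate carries."""
    return len(c.groups & FUNCTIONAL_GROUPS)


if __name__ == "__main__":
    example = Candidate("compound-1", 320.0, frozenset({"amine", "hydroxyl"}))
    print(weight_tier(example), functional_group_count(example))
```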
Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides, synthetic organic combinatorial libraries, phage display libraries of random peptides, and the like. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural and synthetically produced libraries and compounds can be readily modified through conventional chemical, physical, and biochemical means. Further, known pharmacological agents may be subjected to directed or random chemical modifications such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs of the agents.
A variety of other reagents also can be included in the mixture. These include reagents such as salts, buffers, neutral proteins (e.g., albumin), detergents, etc., which may be used to facilitate optimal protein-protein and/or protein-nucleic acid binding. Such a reagent may also reduce non-specific or background interactions of the reaction components. Other reagents that improve the efficiency of the assay such as protease inhibitors, nuclease inhibitors, antimicrobial agents, and the like may also be used.
The mixture of the foregoing assay materials is incubated under conditions whereby, but for the presence of the candidate pharmacological agent, the pheromone receptor polypeptide specifically binds the cellular binding target, a portion thereof or analog thereof. The order of addition of components, incubation temperature, time of incubation, and other parameters of the assay may be readily determined. Such experimentation merely involves optimization of the assay parameters, not the fundamental composition of the assay. Incubation temperatures typically are between 4°C and 40°C. Incubation times preferably are minimized to facilitate rapid, high throughput screening, and typically are between 0.1 and 10 hours.
After incubation, the presence or absence of specific binding between the pheromone receptor polypeptide and one or more binding targets is detected by any convenient method available to the user. For cell free binding type assays, a separation step is often used to separate bound from unbound components. The separation step may be accomplished in a variety of ways. Conveniently, at least one of the components is immobilized on a solid substrate, from which the unbound components may be easily separated. The solid substrate can be made of a wide variety of materials and in a wide variety of shapes, e.g., microtiter plate, microbead, dipstick, resin particle, etc. The substrate preferably is chosen to maximize signal to noise ratios, primarily to minimize background binding, as well as for ease of separation and cost. Separation may be effected, for example, by removing a bead or dipstick from a reservoir, emptying or diluting a reservoir such as a microtiter plate well, or rinsing a bead, particle, chromatographic column or filter with a wash solution or solvent. The separation step preferably includes multiple rinses or washes. For example, when the solid substrate is a microtiter plate, the wells may be washed several times with a washing solution, which typically includes those components of the incubation mixture that do not participate in specific bindings such as salts, buffer, detergent, non-specific protein, etc. Where the solid substrate is a magnetic bead, the beads may be washed one or more times with a washing solution and isolated using a magnet.
Detection may be effected in any convenient way for cell-based assays such as two- or three-hybrid screens. The transcript resulting from a reporter gene transcription assay of pheromone receptor polypeptide binding to a target molecule typically encodes a directly or indirectly detectable product, e.g., β-galactosidase activity, luciferase activity, and the like. A wide variety of cell based assays for G-protein coupled receptors could also be employed for detection of molecules that stimulate (agonists) pheromone receptors or block (antagonists) that stimulation by natural ligands or agonists. Pheromone receptor polypeptides or chimeric receptors composed only in part of a pheromone receptor could be employed in these assays. The chimeric receptors might, for example, contain part of another G-protein coupled receptor such that binding of a ligand to the pheromone receptor binding domain results in coupling to a particular G-protein where activation could be easily assayed. For cell free binding assays, one of the components usually comprises, or is coupled to, a detectable label. A wide variety of labels can be used, such as those that provide direct detection (e.g., radioactivity, luminescence, optical or electron density, etc.) or indirect detection (e.g., epitope tag such as the FLAG epitope, enzyme tag such as horseradish peroxidase, etc.). The label may be bound to a pheromone receptor binding partner (ligand), or incorporated into the structure of the binding partner.
A variety of methods may be used to detect the label, depending on the nature of the label and other assay components. For example, the label may be detected while bound to the solid substrate or subsequent to separation from the solid substrate. Labels may be directly detected through optical or electron density, radioactive emissions, nonradioactive energy transfers, etc., or indirectly detected with antibody conjugates, streptavidin-biotin conjugates, etc. Methods for detecting the labels are well known in the art.
The invention provides pheromone receptor-specific binding agents, methods of identifying and making such agents, and their use in diagnosis, therapy and pharmaceutical development, including the development of pesticides and other agents for controlling fertility and reproduction (or related behaviors) in animals. For example, pheromone receptor-specific pharmacological agents are useful in a variety of diagnostic and therapeutic applications, especially where disease or disease prognosis is associated with improper utilization of a pathway involving pheromone receptor. Novel pheromone receptor-specific binding agents include pheromone receptor-specific antibodies and other natural intracellular binding agents identified with assays such as two hybrid screens, and non-natural intracellular binding agents identified in screens of chemical libraries and the like.
In general, the specificity of pheromone receptor binding to a binding agent is shown by binding equilibrium constants. Targets which are capable of selectively binding a pheromone receptor polypeptide preferably have binding equilibrium constants of at least about 10^7 M^-1, more preferably at least about 10^8 M^-1, and most preferably at least about 10^9 M^-1. A wide variety of cell based and cell free assays may be used to demonstrate pheromone receptor-specific binding. Cell based assays include one, two and three hybrid screens, assays in which pheromone receptor-mediated transcription is inhibited or increased, activation of G-proteins, etc. Cell free assays include pheromone receptor-protein binding assays, immunoassays, etc. Other assays useful for screening agents which bind pheromone receptor polypeptides include fluorescence resonance energy transfer (FRET), and electrophoretic mobility shift analysis (EMSA).
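By way of illustration only, binding equilibrium (association) constants of 10^7, 10^8 and 10^9 M^-1 correspond to dissociation constants of 100 nM, 10 nM and 1 nM, since the dissociation constant is the reciprocal of the association constant. The sketch below performs that conversion and, assuming a simple one-site binding model (an assumption not stated in this description), estimates the fraction of receptor occupied at a given free ligand concentration. The function names are hypothetical.

```python
def dissociation_constant_nM(ka_per_molar: float) -> float:
    """Convert an association constant K_a (M^-1) to K_d in nanomolar (K_d = 1/K_a)."""
    if ka_per_molar <= 0:
        raise ValueError("K_a must be positive")
    return 1e9 / ka_per_molar


def fraction_bound(ka_per_molar: float, ligand_molar: float) -> float:
    """Fractional receptor occupancy for one-site binding (assumed model).

    theta = K_a * [L] / (1 + K_a * [L]), with [L] the free ligand concentration.
    """
    if ka_per_molar <= 0 or ligand_molar < 0:
        raise ValueError("K_a must be positive and [L] non-negative")
    x = ka_per_molar * ligand_molar
    return x / (1.0 + x)


if __name__ == "__main__":
    for ka in (1e7, 1e8, 1e9):
        print(f"K_a = {ka:.0e} M^-1  ->  K_d = {dissociation_constant_nM(ka):g} nM")
```

Under this model, half of the receptor is occupied when the free ligand concentration equals the dissociation constant.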
Various techniques may be employed for introducing nucleic acids of the invention into cells, depending on whether the nucleic acids are introduced in vitro or in vivo in a host. Such techniques include transfection of nucleic acid-CaPO4 precipitates, transfection of nucleic acids associated with DEAE, transfection with a retrovirus including the nucleic acid of interest, liposome mediated transfection, and the like. For certain uses, it is preferred to target the nucleic acid to particular cells. In such instances, a vehicle used for delivering a nucleic acid of the invention into a cell (e.g., a retrovirus, or other virus; a liposome) can have a targeting molecule attached thereto. For example, a molecule such as an antibody specific for a surface membrane protein on the target cell or a ligand for a receptor on the target cell can be bound to or incorporated within the nucleic acid delivery vehicle. For example, where liposomes are employed to deliver the nucleic acids of the invention, proteins which bind to a surface membrane protein associated with endocytosis may be incorporated into the liposome formulation for targeting and/or to facilitate uptake. Such proteins include capsid proteins or fragments thereof tropic for a particular cell type, antibodies for proteins which undergo internalization in cycling, proteins that target intracellular localization and enhance intracellular half life, and the like. Polymeric delivery systems also have been used successfully to deliver nucleic acids into cells, as is known by those skilled in the art. Such systems even permit oral delivery of nucleic acids.

Preparation and analysis of single cell cDNAs
Male mouse (C57BL/6J) VNOs were minced, incubated in Trypsin-EDTA (Gibco-BRL/LTI, Rockville, Maryland), and triturated to obtain dissociated cells. The cells were centrifuged (1000 RPM, 5 min) and resuspended in phosphate buffered saline + 0.1% bovine serum albumin. Individual cells that appeared to be neurons were transferred to separate tubes with a microcapillary pipet.
cDNAs were prepared from each cell and amplified according to Brady and Iscove (Methods in Enzymology, 1993, 225:611-621) with minor modifications. Briefly, cDNAs were prepared from the 3' ends of mRNAs by reverse transcription with an oligo (dT) primer, and a poly dA stretch was added to each cDNA with terminal transferase. The cDNAs were then amplified by PCR with one of two primers, AL1 (ATTGGATCCAGGCCGCTCTGGACAAAATATGAATTC(T)) (SEQ. ID. No. 56) (Dulac and Axel, Cell, 1995, 83:195-206) or (GGCACATGGACGAAATCTTGGTACTCTTCAGAATTC(T)) (SEQ. ID. No. 57), and Taq polymerase [Amplitaq LD ("ALD") or Amplitaq Stoffel Fragment ("ASF") (Perkin Elmer, Norwalk, CT)].
Aliquots of each cDNA sample were electrophoresed on agarose gels and blotted onto nylon membranes (Hybond N+, Amersham, Piscataway, NJ) (Ausubel, F., et al., Current Protocols in Molecular Biology, 1988, John Wiley & Sons NY, NY; Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989). The blots were hybridized at 55° or 70°C in Hyb Buffer (0.5 M sodium phosphate buffer (pH 7.3), 4% SDS, 1% bovine serum albumin (BSA)) with 32P-labeled probes prepared by random priming (Prime-It II, Stratagene, La Jolla, CA).
Construction and screening of single cell cDNA libraries
An aliquot of cDNA sample VN14 was digested with Eco RI and gel-isolated fragments of 0.1-1.5 kb were cloned into λZapII (Ausubel, F., et al., Current Protocols in Molecular Biology, 1988, John Wiley & Sons NY, NY; Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989). Two thousand library clones were plated at low density. Replica filter lifts were hybridized at 75°C (in Hyb Buffer containing 2 µg/ml poly (dT)24 and 1 µg/ml of random dA-dT 20-mers) to 32P-labeled probes (~2.5 x 10^8 CPM/µg; 5 x 10^6 CPM/ml) prepared by PCR of different single cell cDNA samples. Clones that hybridized to only a VN14 probe were isolated, and a probe prepared from the insert of each was hybridized to blots of selected single cell cDNAs. Clones that hybridized to only VN14 cDNAs were sequenced.
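By way of illustration only, the selection rule used in the differential screen (retain clones that hybridize to the VN14 probe and to no other single cell probe) can be expressed as a simple set comparison. In the sketch below, the clone identifiers and the hybridization table are hypothetical placeholders and are not data from these experiments.

```python
def sample_specific_clones(hybridization: dict[str, set[str]], target: str) -> list[str]:
    """Return clones whose only hybridizing probe is `target`.

    `hybridization` maps a clone identifier to the set of single cell
    cDNA probes that gave a signal on that clone's replica filter lifts.
    """
    return sorted(clone for clone, probes in hybridization.items() if probes == {target})


if __name__ == "__main__":
    # Hypothetical scoring table for illustration only.
    scored = {
        "clone-A": {"VN14"},
        "clone-B": {"VN14", "VN2"},
        "clone-C": {"VN2"},
        "clone-D": set(),
    }
    print(sample_specific_clones(scored, "VN14"))
```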
Isolation and analysis of VR cDNA clones
sc153, one VN14+VN2- clone from the VN14 library, was used as probe to screen a mouse VNO cDNA library ('λVNO') (Berghard, A., et al., J Neurosci, 1996, 16:909-918) and a mouse genomic DNA library (Stratagene, La Jolla, CA) (70°C, Hyb buffer). Hybridizing clones were found only in the genomic library. A fragment containing 2 kb upstream of sc153 was isolated from one genomic clone (15361) and used to screen λVNO (55°C, Hyb Buffer). The region (D10-TM7) of one clone (D10) that showed homology to TM7 of the CSR (SEQ ID NO. 59) was then used to screen λVNO (55°C, Hyb Buffer), yielding a variety of VR cDNA clones. Additional clones were obtained from λVNO using probes prepared from clones previously isolated, or from PCR products obtained by amplification of mouse genomic DNA or VNO cDNA with degenerate primers (Buck, L., et al., Cell, 1991, 65:175-187) matching conserved motifs in the VRs. Some PCR products were also cloned into pCR2.1 (Invitrogen, Carlsbad, CA) and sequenced.
Analysis of VR mRNAs by RT-PCR
Random-primed cDNAs prepared from male or female C57BL/6J mouse VNO RNAs (or VR cDNA clones) were used in PCR reactions with degenerate primers (Buck and Axel, Cell, 1991, 65:175-187) matching conserved VR motifs to amplify VR sequences corresponding to amino acids 33-772 in VR1 (SEQ ID NO. 2). Nested PCR was performed with a 1/1000 dilution of the first PCR reaction and primer pairs matching regions of putative exons 1 and 6 in specific VR cDNA clones. Blots prepared from size-fractionated, nested PCR products were hybridized (70°C, Hyb buffer containing 100 µg/ml herring sperm DNA (Sigma, St Louis, MO)) to probes prepared from the PCR products of the cDNA clones.
Northern and Southern blots and genomic library screens
Northern Blots: One µg of PolyA+ RNA prepared from mouse VNO and OE, or purchased from Clontech (other tissue RNAs), was size fractionated on formaldehyde gels, and blotted (see above) (Berghard and Buck, J Neurosci, 1996, 16:909-918). The blot was hybridized (70°C, Hyb Buffer) with a 32P-labeled probe prepared from the regions of cDNAs VR1, VR2, VR4, and VR15 corresponding to that encoding amino acids 33-772 in VR1 (SEQ ID NO. 1).
Southern Blots: 5 µg of genomic DNA prepared from C57BL/6J mouse liver was digested with Eco RI or Hind III, size fractionated, and blotted (Ressler et al, Cell, 1993, 73:597-609). The blots were hybridized (70°C, Hyb buffer containing sperm DNA (see above)) to probes prepared from 3' untranslated segments of different VR cDNA clones [VR2 (nt. 2607-2961 of SEQ ID NO. 3), VR3 (nt. 2505-2907 of SEQ ID NO. 5), and VR15 (nt. 3239-3689 of SEQ ID NO. 29)]. A VR4 probe was also used, which gave the same results as the highly related VR15 probe.
Genomic library screens to determine VR gene number: A mouse genomic library was screened separately at 70°C or 55°C (see above) with different 32P-labeled probes. Probe 1: a mix of segments of cDNAs VR1 (SEQ ID NO. 1), VR2 (SEQ ID NO. 3), VR4 (SEQ ID NO. 7), and VR15 (SEQ ID NO. 29) encoding the region corresponding to amino acids 619-772 of VR1 (SEQ ID NO. 2). Probes 2-6: Segments of VR genes obtained from mouse genomic DNA by PCR with degenerate primers matching conserved VR sequence motifs. The PCR segments corresponded to the following amino acid stretches in VR1 (SEQ ID NO. 2): amino acids 191-397, 565-825, 637-825, 637-804, and 619-784. For example, degenerate oligonucleotide primer pairs used included:
for amino acids 191-397:
5' primer = (GCT)TI(CT)A(CT)CA(AG)(AG)TIGCI(AC)IAA(AG)GA(CT)AC (SEQ ID NO. 60),
3' primer = G(CT)(AG)T(GT)IGCI(AG)(CT)I(AG)C(AG)T(AG)IACI(AG)C(AG)TT (SEQ ID NO. 61);
for amino acids 565-825:
5' primer = (AC)(AG)ITG(CT)CCI(GT)AIIA(CT)(AC)A(AG)TA(CT)GCIAA (SEQ ID NO. 62),
3' primer = GIC(GT)IA(CT)IA(AG)IATIA(CT)(AG)TAI(AC)(AT)(CT)TTIGGIAC (SEQ ID NO. 63);
for amino acids 637-825:
5' primer = ATI(AT)(GC)I(CT)TI(AG)TITT(CT)TG(CT)TT(CT)(CT)TITG (SEQ ID NO. 64),
3' primer = GIC(GT)IA(CT)IA(AG)IATIA(CT)(AG)TAI(AC)(AT)(CT)TTIGGIAC (SEQ ID NO. 63);
for amino acids 637-804:
5' primer = ATI(AT)(GC)I(CT)TI(AG)TITT(CT)TG(CT)TT(CT)(CT)TITG (SEQ ID NO. 64),
3' primer = (AG)IATI(GC)(AT)(AG)AAIA(CT)(CT)TCIACI(AG)CIACCAT (SEQ ID NO. 65);
and for amino acids 619-784:
5' primer = GA(CT)ACICCIATIGTIAA(AG)GCIAA(CT)AA (SEQ ID NO. 66),
3' primer = AAIGTIA(CT)CCAIACI(GC)(AT)(AG)CA(AG)AAIAC (SEQ ID NO. 67),
wherein all primers are written in the 5' to 3' direction and I = inosine.
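By way of illustration only, the degenerate primer notation used above (parenthesized alternatives at a position, I for inosine) can be parsed to obtain primer length and degeneracy, i.e., the number of distinct sequences in the primer pool. The sketch below is an assumption-laden example: the function names are hypothetical, inosine is treated as a single fixed base that does not add to degeneracy, and the input is the 5' primer of SEQ ID NO. 66 with its alternatives written in plain parentheses.

```python
import re
from math import prod

# One position is either a parenthesized set of alternatives or a single base (I = inosine).
_POSITION = re.compile(r"\(([ACGT]+)\)|([ACGTI])")


def parse_primer(notation: str) -> list[str]:
    """Split a degenerate primer into positions; each entry lists the bases allowed there."""
    compact = notation.replace(" ", "")
    positions, cursor = [], 0
    for match in _POSITION.finditer(compact):
        if match.start() != cursor:  # unrecognized characters between tokens
            break
        positions.append(match.group(1) or match.group(2))
        cursor = match.end()
    if cursor != len(compact):
        raise ValueError(f"unrecognized notation at offset {cursor}: {compact[cursor:]!r}")
    return positions


def degeneracy(notation: str) -> int:
    """Number of distinct sequences in the pool (inosine counts as one base)."""
    return prod(len(p) for p in parse_primer(notation))


if __name__ == "__main__":
    primer = "GA(CT)ACICCIATIGTIAA(AG)GCIAA(CT)AA"  # 5' primer, SEQ ID NO. 66
    positions = parse_primer(primer)
    print(len(positions), degeneracy(primer), positions.count("I"))
```

Using inosine at highly ambiguous positions keeps the pool small: this primer has three two-fold positions and five inosines.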
In situ hybridization
In situ hybridization was performed according to Schaeren-Wiemers and Gerfin-Moser (Histochemistry, 1993, 100:431-440) with sequential 16 micron sections of male or female VNOs. Digoxigenin-labeled cRNA probes were prepared from the same 3' untranslated regions of VR cDNAs as used for the genomic Southern blots. Sections were counter-stained with Hoechst 33258, which labels nuclei. The numbers of Gαo- or Gαi2-labeled cells (or cells labeled with VR probes) were determined by counting the number of nuclei in labeled regions. The total number of cells was considered to be the sum of Gαo+ and Gαi2+ cells in adjacent sections.
Chromosome mapping of VR genes
Southern blots of genomic DNA from C57BL/6J and Mus spretus (Jackson Labs) digested with different restriction enzymes were prepared and probed with specific VR cDNA probes as described above. Southern blots of Eco RI-digested, size fractionated genomic DNAs from 94 different backcross mice (M. spretus x (M. spretus x C57BL/6J)) were purchased from Jackson Labs. These blots were hybridized to probes prepared from 3' untranslated segments of the VR2 or VR4 (see above) cDNA at 70°C and washed (see above). Polymorphic bands were typed as either M. spretus or M. spretus/C57BL/6J. The data were sent to the Jackson Laboratory Backcross DNA Mapping Panel Resource for determination of the chromosomal locations of the polymorphic fragments. Additional information was obtained via Internet from Jackson Laboratory Mouse Genome Informatics.
Cloning of a gene differentially expressed in Gαo+ VNs
Different members of the OR and VNR families are expressed in different neurons in the OE and Gαi2+ zone of the VNO, respectively. It therefore appeared likely that the same would be true of sensory receptors expressed by Gαo+ VNs. The differential screening of cDNA libraries with cDNA probes prepared from a few neurons can be used to identify genes expressed in one neuron, but not another (Buck, L., et al, Annu. Rev. Neurosci., 1996, 19:517-544). Using PCR, this can be accomplished with single cells (Brady, G., et al., Methods in Enzymology, 1993, 225:611-621; Dulac, C., et al., Cell, 1995, 83:195-206).
To search for genes encoding receptors expressed by Gαo+ VNs, we looked for genes expressed in one Gαo+ VN, but not another, using the PCR-based differential screening approach. In initial experiments, we isolated a series of mouse VNs, prepared cDNAs from the 3' ends of mRNAs present in each, and amplified the single-cell cDNA fragments by PCR. Many of the amplified, single-cell cDNA samples hybridized to an OMP probe, confirming their derivation from VNs (Berghard et al, Proc. Natl. Acad. Sci. USA, 1996, 93:2365-2369). With one exception, Gαo and Gαi2 probes hybridized to different OMP+ samples, allowing us to identify samples that were derived from Gαo+ VNs.
We next prepared a library from one of the Gαo+ single-cell cDNA samples (VN14), and isolated clones that hybridized to a probe prepared from VN14, but not to a probe prepared from another Gαo+ sample (VN2). We identified 3 VN14+VN2- clones, which differed in size, but were otherwise identical in sequence. None contained an open reading frame, which was not surprising since, in the method used, the amplified cDNAs are only 400-800 bp long, and are derived from the 3' ends of mRNAs (Brady and Iscove, Methods in Enzymology, 1993, 225:611-621).
We next hybridized one of the VN14+VN2- clones (sc153) to the original panel of single-cell cDNAs. sc153 hybridized to VN14, but not to any of the other cDNA samples. Consistent with this result, sc153 hybridized to only a small percentage (~0.3%) of VNs in VNO tissue sections.
Using sc153 as probe, we were able to isolate a sc153+ clone from a mouse genomic library which contained ~2 kb of DNA 5' to the sc153 sequence. Using this 2 kb fragment as probe, we isolated a matching clone (D10) from the VNO cDNA library. Sequence analysis showed that sc153 and D10 were derived from the same gene, but that the D10 cDNA was truncated at the 3' end and did not contain the final 685 bp of sequence present in sc153. Like sc153, D10 hybridized to only a small percentage of VNs in VNO tissue sections.
The 5' end of the D10 cDNA contained a short open reading frame, which encoded a protein fragment with homology to transmembrane domain 7 (TM7) of the calcium sensing receptor (CSR), a G protein-coupled receptor (GPCR) (Brown et al, Nature, 1993, 366:575-580). When the TM7-related region of D10 (D10-TM7) was hybridized at reduced stringency (55°C) to the original panel of single-cell cDNAs, it labeled many of the Gαo+ samples, but none of the Gαi2+ ones (except the one that was also Gαo+, and was probably derived from two cells). Since D10 labeled only a small percentage of VNs in tissue sections under high stringency conditions, this suggested that many Gαo+ neurons express a gene related to D10, but not identical to it.
A novel multigene family encoding VNO receptors
Hybridization of D10-TM7 to the VNO cDNA library at reduced stringency yielded a number of related cDNA clones (e.g. VR1-VR3, SEQ ID NOs. 1-6). Additional related cDNAs were obtained by RT-PCR with degenerate primers (e.g. VR6-VR7, SEQ ID NOs. 11-14), or by screening the VNO cDNA library with a PCR product obtained from genomic DNA (e.g., VR4, VR5, SEQ ID NOs. 7-10).
These cDNAs encode a novel family of proteins, which are members of the G protein-coupled receptor (GPCR) superfamily (Figure 1). Like other GPCRs, these VNO receptors (VRs) have 7 hydrophobic stretches that may serve as membrane spanning domains. Only 287 of 850 residues are identical in all of the molecules shown in Figure 1, indicating that the family is diverse. The VRs are related to two other types of GPCR, the calcium sensing receptor (CSR) and the metabotropic glutamate receptors (mGluRs) (Tanabe, Y., et al., Neuron, 1992, 8:169-179; Brown, E., et al., Nature, 1993, 366:575-580). The most highly related molecule is the CSR; for example, VR1 is 31% identical to rat CSR (Riccardi et al., Proc. Natl. Acad. Sci. USA, 1995, 92:131-135), with the highest homology residing in the TM1-TM7 region (44%) (Figure 1). However, the VRs comprise a distinct family of receptors, which share novel sequence motifs, and are more related to one another than they are to other receptors. For example, two divergent VRs, VR1 (SEQ ID NO. 1, 2) and VR4 (SEQ ID NO. 7, 8), are 70% identical in TM1-TM7, and 48% identical overall.
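By way of illustration only, percent identity figures of the kind quoted above can be computed from a pairwise alignment by counting matching residues over the aligned positions. The sketch below is an assumption: this description does not state how gaps were treated, so positions with a gap in either sequence are simply skipped, and the two short sequences are toy strings rather than VR sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues between two pre-aligned sequences.

    Columns with a gap in either sequence are ignored (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only: 4 matches over 6 ungapped columns.
    print(f"{percent_identity('MKT-LLA', 'MRTALLV'):.1f}%")
```

By the same arithmetic, 287 residues identical out of 850 corresponds to about 34% of positions conserved across all family members shown.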

The VRs are unusual among GPCRs in having an extremely long N-terminal extracellular domain (Figures 1 and 2). This feature is shared by the CSR and mGluRs, and by an unrelated class of GPCRs that includes several receptors for glycoprotein hormones (Segaloff, D., et al., Oxf. Rev. Reprod. Biol., 1992, 14:141-168). Importantly, the VRs are very different from both ORs and VNRs, which are also GPCRs (Buck, L., et al., Cell, 1991, 51:127-133; Dulac, C., et al., Cell, 1995, 83:195-206). VRs share none of the characteristic sequence motifs of ORs or VNRs. In addition, the size of the N-terminal extracellular domain of VRs (557-565 amino acids) far exceeds that of ORs and VNRs (~12-28 amino acids) (Figure 2). The VRs are most variable in the N-terminal domain (25% identical residues compared to 57% in TM1-TM7). In the structurally-related mGluRs, the ligand binding site is thought to reside in the large N-terminal domain (O'Hara et al., Neuron, 1993, 11:41-52; Takahashi et al, J. Biol. Chem., 1993, 268:19341-19345). If this is also true of VRs, the accentuated diversity of the N-terminal domain may reflect an ability to recognize diverse pheromonal ligands.
Most of the VR cDNAs that we analyzed appeared to belong to one of three subfamilies of highly related molecules. For example, VR1 (SEQ ID NOs. 1, 2), VR2 (SEQ ID NOs. 3, 4), and VR3 (SEQ ID NOs. 5, 6) are very similar, as are VR4 (SEQ ID NOs. 7, 8) and VR5 (SEQ ID NOs. 9, 10), and VR6 (SEQ ID NOs. 11, 12) and VR7 (SEQ ID NOs. 13, 14) (Figure 1). Nonetheless, our results indicate that all of these cDNAs were derived from different genes. First, all cDNAs were sequenced on both strands to rule out sequencing errors. Second, the RNA used for library construction and PCR came from an inbred mouse strain (C57BL/6J), so they cannot be allelic variants. Third, the error rates of reverse transcriptase (or Taq polymerase) cannot account for the extent to which the cDNAs differ. For example, VR4 (SEQ ID NOs. 7, 8) and VR5 (SEQ ID NOs. 9, 10) cDNAs are 99% identical in nucleotide sequence, but the reverse transcriptase used to prepare them has an error rate of only 3.6 x 10^-5/bp (Ji, J., et al., Biochemistry, 1992, 31:954-958).
Variant forms of VR mRNA
Many of the VRs we characterized lacked a segment of the N-terminal domain present in other VRs. Invariably, the missing segment corresponded to a region of the human CSR
3o encoded by a single exon, or pair of exons (Pollak, M., et al., Cell, 1993, 73:1297-1303). We also found several different VR cDNAs that contained a stretch of noncoding sequence at a site corresponding to a CSR exon-intron boundary (e.g. VRI S). This suggested that the exon-intron structure of VR genes resembles that of the CSR gene, and that variant forms of VR mRNAs might be generated by differential RNA splicing.
Variant VR mRNAs could derive either from different genes, or from the same gene by alternative RNA splicing. Consistent with the latter possibility, two pairs of cDNAs that we sequenced, VR8 (SEQ ID NOs. 15, 16) and VR9 (SEQ ID NOs. 17, 18), and VR10 (SEQ ID NOs. 19, 20) and VR11 (SEQ ID NOs. 21, 22), were identical in nucleotide sequence, but were missing different segments. However, when we used RT-PCR to amplify VNO mRNA sequences encoding 5 different VRs, we obtained one major PCR product in each case, regardless of whether the RNA used was from male or female mice. In 4 cases, the size of the major product corresponded to a complete VR, even though one of the cDNAs (but not the PCR product) contained an intron (#5). In one case, in which the cDNA lacked one exon (#2), the major PCR product was even smaller, and was found to lack two exons. Although PCR products of a smaller size were also seen in these experiments, they were much less abundant.
These results suggest that different VR forms derive from different genes. Thus many VR genes may be expressed pseudogenes, which either lack one or more exons, or have mutations that prevent proper RNA splicing. We cannot exclude the possibility that some variant VRs are functional, however. For example, some truncated VRs that lack transmembrane domains could conceivably be secreted pheromone-binding proteins.
Differential expression of VR genes in VNO neurons
To investigate the tissue distribution of VR gene expression, we conducted Northern blot analyses in which size fractionated polyA+ RNAs from different mouse tissues were hybridized to a mix of radiolabeled VR cDNAs. The mixed probe hybridized to VNO RNAs of ~1.9-3.7 kb, with intense hybridization to RNAs of 2.8-3.5 kb. It did not hybridize to RNAs from a variety of other tissues, including olfactory epithelium and brain. This suggested that VR genes may be expressed exclusively in the VNO.
We found two partial cDNAs that were highly related to VR cDNAs in the NCBI dbEST database, one from spleen and the other from 2-cell stage mouse embryos. However, when we hybridized the most highly related VR cDNAs (VR6 and VR7) to spleen sections, only one questionably-labeled cell was seen out of ~1.4 x 10^6 cells with one VR probe, and none was seen with the other. The EST clones might be DNA contaminants, or be due to the widespread, but low level, misexpression of tissue specific genes (Sarkar, G., et al., Science, 1989, 244:331-334); nonetheless, we cannot exclude the possibility that VR genes are expressed at a low frequency in some other tissues.
To examine the patterns of expression of different VR genes in the VNO, we conducted in situ hybridization experiments. Labeled segments of the 3' untranslated regions of three VR cDNAs were hybridized separately, or in combination, to sequential sections through the VNO. Probes prepared from Gαo and Gαi2 cDNAs were hybridized to adjacent sections to delineate the Gαo+ and Gαi2+ zones of the VNO neuroepithelium.
The Gαo and Gαi2 probes gave patterns of hybridization similar to those we had previously seen (Berghard, A., et al., J. Neurosci., 1996, 16:909-918). The Gαo probe hybridized to a wavy stripe of VNO neurons in the basal (lower) region of the VNO neuroepithelium, whereas the Gαi2 probe hybridized to an adjacent stripe of neurons in the apical (upper) part of the neuroepithelium. The waviness of the two zones appears to be caused by the periodic presence of blood vessels near the base of the epithelium (Berghard, A., et al., J. Neurosci., 1996, 16:909-918). Approximately 57% of VNs were labeled by the Gαi2 probe and 43% were labeled by the Gαo probe. The single layer of supporting cells located just beneath the epithelial surface was not labeled by either probe.
Each of the VR probes hybridized to a small percentage (2.4-5.7%) of VNs that appeared to be restricted to the basal, Gαo+ zone of the VNO neuroepithelium. Labeled neurons were scattered throughout the anterior-posterior and dorsal-ventral extent of the Gαo+ zone. Small clusters of labeled cells were sometimes seen, particularly with the VR2 probe. The mixed probe labeled a larger percentage of VNs (10.6%) that was almost equal to the sum of the percentages labeled by its individual components (10.8%). Thus different Gαo+ neurons must express different VRs.
No differences were seen in the patterns of hybridization obtained using VNOs from male and female mice, and no hybridization was observed in the nasal olfactory epithelium using either the mix of VR probes or a full-length VR cDNA probe (not shown).
Subsequent analyses of the size of the VR gene family, and the number of VR genes recognized by the VR in situ hybridization probes, allowed us to estimate the number of VR genes expressed by individual neurons (see below).
The size of the VR multigene family
To investigate the size of the VR gene family, we hybridized several different mixed VR gene probes to a mouse genomic library, using high (70°C) or low (55°C) stringency conditions. A probe prepared from the membrane spanning regions (putative exon 6) of several different cDNA clones hybridized to 59 and 98 clones per haploid genome equivalent, at high and low stringency, respectively. To obtain probes that were potentially more diverse, we amplified internal segments of putative exon 3 or 6 from genomic DNA by PCR with degenerate primers.
At high stringency, these probes hybridized to 60-140 clones per haploid equivalent. These results indicate that there are as many as 140 VR genes in the mouse genome.
The VR probes that we used for in situ hybridization each labeled a small percentage of neurons. To determine how many VR genes each probe recognized, we hybridized probes prepared from the same VR cDNA segments to Southern blots of C57BL/6J mouse genomic DNA which had been digested with Eco RI or Hind III. Each probe hybridized to a small number of restriction fragments. Given the small size of the probes (350-450 bp), most of these fragments should represent at least one gene, provided that there are no introns in the region probed. Consistent with this assumption, the VR2 (SEQ ID NO. 3) probe hybridized to 7 different restriction fragments, as many as five of which could be accounted for by characterized VR cDNAs that were 91-98% identical to VR2 (SEQ ID NO. 3) in the region probed.
Given the number of genes recognized by each VR probe and the percentage of Gαo+ neurons that hybridized to each, we estimate that each VR gene may be expressed in only ~1.1-1.9% of Gαo+ VNs. Since there appear to be 60-140 VR genes in the mouse genome, this suggests that each Gαo+ VNO neuron may express only one, or at most a few, VR genes.
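The reasoning behind this estimate can be sketched numerically. If each gene is expressed in a given fraction of the basal-zone neurons, multiplying that fraction by the number of genes gives the average number of genes expressed per neuron. The sketch assumes that all genes in the family are expressed at similar frequencies and only within this neuronal population; the function name is hypothetical.

```python
# Average number of VR genes expressed per neuron, from the quoted ranges.
def genes_per_neuron(fraction_of_neurons_per_gene, gene_count):
    """Mean genes per neuron if every gene labels the same fraction of neurons."""
    return fraction_of_neurons_per_gene * gene_count


# Bounds taken from the text: 1.1-1.9% of neurons per gene, 60-140 genes.
low = genes_per_neuron(0.011, 60)
high = genes_per_neuron(0.019, 140)

print(f"genes per neuron: {low:.2f} to {high:.2f}")
```

The resulting range, well under three genes per neuron even at the upper bound, is what motivates the "one, or at most a few" wording.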
Linkage of chromosomal clusters of VR and OR genes
We previously found that there are clusters of OR genes at multiple chromosomal sites in the mouse genome (Sullivan, S., et al., Proc. Natl. Acad. Sci., 1996, 93:884-888). To investigate the chromosomal locations of VR genes, we used the Jackson Laboratory Backcross DNA Mapping Panel, which allows the mapping of mouse genes using interspecies mouse crosses.
Probes prepared from the 3' untranslated regions of VR2 (SEQ ID NO. 3) or VR4 cDNAs were first hybridized to Southern blots of genomic DNAs from two mouse species, C57BL/6J and Mus spretus, which had been digested with different restriction enzymes. Eco RI digests showed a number of restriction fragment length polymorphisms with both VR probes. The VR probes were then hybridized to Eco RI-digested DNAs from a large panel of different backcross mice ((C57BL/6J x M. spretus) x M. spretus).
The patterns of inheritance of the polymorphic fragments recognized by the two VR
probes allowed us to assign chromosomal locations to approximately 9 VR genes.
Using the VR4 (SEQ ID NO. 7) probe, we could follow the inheritance of 4 polymorphic restriction fragments. All of these cosegregated in the backcrosses, and mapped to the proximal end of chromosome 7 (near D7Bir5). Five restriction fragments were followed for the VR2 (SEQ ID
NO. 3) probe. Again, all of the restriction fragments cosegregated, allowing us to map the VR2 (SEQ ID NO. 3) fragments to the distal end of chromosome 4 (near D4Bir1).
Given the resolution of the genetic mapping, the cosegregating fragments can be no more than 3.8 cM from one another. These results indicate that VR genes are located near the ends of at least two different mouse chromosomes. They also indicate that highly related VR genes are clustered at the same chromosomal locus, as previously seen in our studies and others (Ben-Arie et al., Human Molecular Genetics, 1994, 3:229-235).
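A bound of this kind is commonly obtained from the number of animals in which no recombination was seen. The sketch below shows one standard way to compute such an upper limit; it is offered as an illustration and is not necessarily how the figure above was derived. The panel size of 94 animals and the 95% confidence level are assumptions, and the function name is hypothetical.

```python
# Upper confidence limit on map distance when zero recombinants are observed.
def max_recombination_fraction(n_animals, confidence=0.95):
    """Largest recombination fraction r still compatible with zero recombinants.

    With zero recombinants in n animals, (1 - r) ** n must be at least
    (1 - confidence), which gives r <= 1 - (1 - confidence) ** (1 / n).
    """
    if n_animals <= 0:
        raise ValueError("n_animals must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_animals)


ASSUMED_PANEL_SIZE = 94  # assumption: illustrative number of backcross animals

bound = max_recombination_fraction(ASSUMED_PANEL_SIZE)
# For small values, a recombination fraction of 0.01 is roughly 1 cM.
print(f"upper limit: about {100 * bound:.1f} cM")
```

Larger panels give tighter bounds, so the resolution of the mapping is set mainly by how many backcross animals were typed.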
The VR4 gene subfamily appears to be closely linked to one OR gene locus (olfR5) (Sullivan, S., et al., Proc. Natl. Acad. Sci., 1996, 93:884-888). Although the VRs and ORs were mapped in different mouse crosses, the synaptotagmin-3 gene (Syt3) was mapped in both crosses, allowing an estimate of their relative positions. The OR locus mapped 15.05 cM proximal to Syt3 while the VR4 gene cluster mapped 14.89 cM proximal to Syt3 (Jackson Laboratory Mouse Genome Informatics), suggesting a close linkage between VR and OR genes at the proximal end of chromosome 7. Our previous studies indicate that multiple OR gene loci arose via a series of duplications of very large chromosomal domains that maintained linkages between OR genes and members of other gene families. These results therefore suggest that VR genes and OR genes might have been linked in a primitive ancestor. They also suggest the possibility that additional clusters of VR genes might be linked to other OR gene loci.
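The linkage estimate rests on a simple subtraction of the two distances measured relative to Syt3. The sketch below makes that step explicit; because the two loci were mapped in different crosses, the result should be read as approximate. The variable names are hypothetical.

```python
# Relative position of the OR locus and the VR4 cluster, using Syt3 as a
# shared reference marker. Distances are in cM proximal to Syt3.
OR_LOCUS_CM = 15.05
VR4_CLUSTER_CM = 14.89

separation_cm = abs(OR_LOCUS_CM - VR4_CLUSTER_CM)
print(f"estimated separation: {separation_cm:.2f} cM")
```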
Preparation of cDNA Libraries from Isolated VNO Neurons
VNOs were dissected from adult (7- to 8-week-old) male Lewis rats (Sprague-Dawley). Single-cell cDNA synthesis and amplification were performed and checked according to Dulac and Axel (Cell, 1995, 83:195-206). Southern blot analysis of single-cell cDNA was used to detect expression of tubulin, OMP, Go, and Gi2a (Dulac and Axel, Cell, 1995, 83:195-206). Eighteen cDNAs showed strong hybridization with tubulin and OMP probes, indicating that they originated from mature neurons, and were selected for further study. Cells VN3 and VN13 exhibited high levels of Go expression, whereas VN10 showed presence of Gi2a, indicating the origin of these cells from two distinct regions of the VNO neuroepithelium. A VN13 single-cell cDNA library was prepared according to Dulac and Axel (Cell, 1995, 83:195-206).
Differential Screening of Single-Cell Library
Plaque-forming units (12 x 10^3) from the VN13 library were plated at low density, and duplicate filters (Hybond N+, Amersham) were hybridized with probes generated from VN10 and VN13 single-cell cDNAs, following the procedure described in Dulac and Axel, Cell, 1995, 83:195-206. Ten phage plaques were detected that showed a positive signal unique to the VN13 probe. These plaques were purified, and the corresponding phage inserts were amplified by PCR, run on 1.5% agarose gel, blotted onto nylon filter, and hybridized with the VN10, VN3, and VN13 single-cell cDNA probes.
Isolation and Analysis of Full-Length cDNA Clones
A 425 bp clone, Go-VN13A, present at the frequency of 0.1% in the VN13 single-cell cDNA library, was selected and in vivo excised to generate the pBlueScriptSK(-) phagemid. High stringency (65 °C) screening of a cDNA library prepared from female rat VNO (Dulac and Axel, Cell, 1995, 83:195-206) with the Go-VN13A cDNA probe led to the isolation of Go-VN13B (SEQ ID NO. 49), presenting 90% sequence homology with Go-VN13A. Phages (7.2 x 10^5) of the female rat VNO library were further screened with the Go-VN13B (SEQ ID NO. 49) cDNA probe under low stringency conditions: hybridization was carried out at 55 °C for 24 hr, and the filters were washed three times at 55 °C for 30 min in 0.5x SSC and 0.5% SDS. A total of 75 positive phages were identified and the corresponding inserts were amplified by PCR and analyzed by Southern blot using the Go-VN13B (SEQ ID NO. 49) probe at both high (65 °C) and low (55 °C) stringency. This led to the identification of 22 cDNA clones with insert sizes longer than 3 kb. Among those, six distinct subfamilies were defined by absence of cross-hybridization under stringent conditions of hybridization and washing. Full-length clones (Go-VN1 to Go-VN6, SEQ ID NOs. 33, 35, 37, 39, 41, 43), each representative of a subfamily, were selected for in vivo excision and sequenced. Go-VN13C (SEQ ID NO. 47) and Go-VN13B (SEQ ID NO. 49) are identical sequences differing by a 150 bp deletion in Go-VN13C (SEQ ID NO. 47). This sequence encodes NMDQCANCPEYQYANTEKNKCIQKGVIVLSYEDPLGMALALIAFCFSAFTV (SEQ ID NO. 58) in Go-VN13B (SEQ ID NO. 49) and is replaced by an M at position 552 in Go-VN13C (SEQ ID NO. 48).
DNA Sequencing and Sequence Analysis
DNA sequencing was performed using ABI Prism dye terminator cycle ready reaction (Perkin Elmer, Norwalk, CT) according to the manufacturer's protocol. Samples were run on an ABI
Prism 310 Genetic Analyzer (Perkin Elmer, Norwalk, CT). Sequence homologies were determined using the BLAST system (NIH network service). Pairwise and ClustalW
alignments (BLOSUM30 matrix setting) as well as Kyte-Doolittle hydropathic analysis were obtained with the MacVector sequence analysis software (Oxford Molecular Group).
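As an illustration of the Kyte-Doolittle hydropathic analysis mentioned above, the sketch below computes a sliding-window average of the published Kyte-Doolittle hydropathy values. It is a minimal stand-in, not the MacVector implementation. The window of 19 residues is an assumption (a width often used when looking for membrane-spanning segments), the function name is hypothetical, and the example peptide is the SEQ ID NO. 58 segment as transcribed in this document, which may carry transcription errors.

```python
# Sliding-window Kyte-Doolittle hydropathy profile.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# Segment deleted in Go-VN13C, as given in the text (SEQ ID NO. 58).
segment = "NMDQCANCPEYQYANTEKNKCIQKGVIVLSYEDPLGMALALIAFCFSAFTV"
profile = hydropathy_profile(segment, window=19)

print(f"first window mean: {profile[0]:.2f}")
print(f"last window mean:  {profile[-1]:.2f}")
```

On this segment the profile rises from a hydrophilic start to a hydrophobic end, in line with a peptide that runs from an extracellular region into a membrane-spanning helix.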
In Situ Hybridization Analysis
In situ hybridization was performed as described elsewhere (Schaeren-Wiemers, N., et al., Histochemistry, 1993, 100:431-440). VNOs were dissected from adult male (8- to 9-week-old), adult female (9- to 11-week-old), and young (1-week-old) rats. Tissues were embedded in Tissue-Tek OCT. Antisense and sense digoxigenin-labeled probes were generated from the full-length cDNAs encoding Go, Gi2a, Go-VN13B (SEQ ID NO. 49), and Go-VN1 to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41, 43), as well as from the 3' untranslated regions of the Go-VN1 to Go-VN6 clones.
Imaging Processing and Statistical Analysis
Digital photographs were captured with a Leitz DMRB microscope (Leica) coupled to a ProgRes3012 digital camera (Kontron Electronic) and further processed with the Photoshop (Adobe System) and Canvas (Deneba) software for Macintosh. The relative positions of cells exhibiting a positive signal by in situ hybridization were measured along the basal-apical axis using the NIH Image analysis software. The number of cells in hemiconcentric sections of 10% along this axis from the basal (value = 0) to the apical (value = 100) boundaries was determined. Average data for Go-VN1 and Go-VN3 to Go-VN6 were obtained from six to eight VNO sections, corresponding to four individuals analyzed in two independent experiments. For Go-VN2, 14 VNO sections, corresponding to ten individuals and four independent experiments, were analyzed for each sex.
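The counting step described above amounts to binning relative cell positions into ten equal intervals along the basal-apical axis. A minimal sketch of that tally follows. The example positions are made up for illustration, and the function name is hypothetical; the original measurements were made with NIH Image, not with this code.

```python
# Tally labeled cells in 10% bins along the basal (0) to apical (100) axis.
def bin_positions(positions, n_bins=10):
    """Count positions (0-100 scale) falling in each of n_bins equal intervals."""
    width = 100.0 / n_bins
    counts = [0] * n_bins
    for position in positions:
        if not 0 <= position <= 100:
            raise ValueError(f"position out of range: {position}")
        # A cell exactly on the apical boundary (100) goes into the last bin.
        index = min(int(position // width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical relative positions of labeled cells in one section.
example_positions = [0, 5, 9.9, 10, 55, 100]
counts = bin_positions(example_positions)
print(counts)
```

Averaging such per-section counts across sections and animals gives the kind of positional distribution used to compare receptor probes.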
Southern Blot Analysis of Rat Genomic DNA and Screening of Rat and Human Genomic Libraries
Genomic DNA, prepared from Lewis rat (Sprague-Dawley) liver, was digested with the restriction enzymes EcoRI and BamHI, size fractionated on 0.8% agarose gels, and blotted onto nylon membrane (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, 1989). Membranes were cross-linked under UV light, hybridized overnight at both high (68°C) and low (55°C) stringency in hybridization buffer, and washed as described above. 32P-labeled probes were generated by random priming, using the following DNA templates: EcoRI-EcoRV, NotI-NsiI, EcoRI-SalI, PstI-NdeI, XbaI-HincII, and EcoRI-NsiI fragments of Go-VN1 to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41, 43), respectively; a full-length (425 bp) insert of Go-VN13A; and a cDNA fragment including the seven transmembrane domains of Go-VN13B (SEQ ID NO. 49). Plaque-forming units (3 x 10^5) from rat and human genomic libraries (Stratagene, La Jolla, CA) were screened at low stringency (55 °C) using a mix of 32P-labeled probes prepared from fragments of Go-VN1 to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41, 43) encompassing transmembrane domains 2 to 7.
The VNO Neuroepithelium Expresses Two Independent Families of Pheromone Receptors
We hypothesized the existence of two distinct families of genes encoding pheromone receptors that are selectively colocalized with either the Go protein in the basal half of the vomeronasal neuroepithelium or with the Gi2a protein in the apical region. For simplicity of nomenclature, and with the understanding that the cosegregation of distinct G-protein subunits with independent families of pheromone receptors is consistent with, but does not demonstrate, a functional link, the family of genes encoding putative pheromone receptors that we have previously identified and that colocalize with Gi2a will be named Gi2a-VN, whereas the novel family of receptors coexpressed with Go and described in this study will be named Go-VN. In the absence of information concerning the nature of the Go-VN receptor molecules, we reiterated the cloning strategy that allowed us to identify a family of putative pheromone receptor genes expressed by Gi2a+ neurons (Dulac and Axel, Cell, 1995, 83:195-206). This strategy was based on the assumption that individual neurons within the VNO are likely to express only one pheromone receptor gene and that transcripts encoding a given receptor represent between 1% and 0.1% of a single-cell mRNA. Differential screening of cDNA libraries constructed from single VNO neurons takes advantage of the fact that different cells express different receptors and thus provides an experimental solution to the problem of detecting a specific transcript in a heterogeneous population of neurons. In this attempt, we expected that differential screening of a cDNA library prepared from an isolated Go+, Gi2a- VNO neuron would permit the isolation of a class of pheromone receptor genes distinct from the Gi2a-VN family of receptor genes.
A cDNA library prepared from a Go+ neuron (VN13) was differentially hybridized with 32P-labeled probes prepared from VN13 and from a second VNO neuron cDNA (VN10). A 425 bp cDNA (Go-VN13A) present at a frequency of 0.1% in the VN13 cDNA library showed selective hybridization with the VN13 cell probe. Two cDNAs of longer size, Go-VN13B (SEQ ID NO. 49) and Go-VN13C (SEQ ID NO. 47), were subsequently isolated from a cDNA library prepared from dissected adult VNOs and showed 90% sequence similarity with Go-VN13A.
Hybridization to VNO cross-sections with digoxigenin-labeled antisense RNA
probe showed that expression of these transcripts is restricted to a small subpopulation of VNO
neurons in a location consistent with the region of Go expression of the neuroepithelium.
The sequence of Go-VN13B (SEQ ID NO. 49) reveals a partial open reading frame that includes seven hydrophobic stretches of 20 amino acids in length. The Go-VN13B (SEQ ID NO. 49) sequence does not share any resemblance with the odorant receptor genes nor with the family of putative pheromone receptor genes previously identified (see below). In addition, hybridization of a Go-VN13B DNA probe to genomic DNA identified two discrete bands at high stringency and 13 or more at lower stringency, revealing the existence of a family of closely related genes in the rat genome.
Taken together, these data indicate that we have isolated a novel multigene family encoding seven transmembrane domain receptors and expressed by subsets of VNO
neurons from the basal half of the neuroepithelium.
Sequences of a New Family of VNO Receptors
Recombinant phages from a VNO cDNA library were screened at low stringency with the Go-VN13B (SEQ ID NO. 49) DNA probe. Six distinct gene subfamilies were isolated that showed no cross-hybridization under stringent conditions of hybridization and washing. cDNAs Go-VN1 to Go-VN6, each representative of a subfamily, were fully sequenced (SEQ ID NOs. 33, 35, 37, 39, 41 and 43).
In Go-VN1 to Go-VN5 cDNAs (SEQ ID Nos 33, 35, 37, 39 and 41), the first methionine of the open reading frame was tentatively chosen as a start for protein translation, revealing large open reading frames ranging from 548 to 866 amino acids. A frame shift in the Go-VN6 (SEQ
ID NO. 44) sequence (amino acid 532; indicated by slash bar in Fig. 3) indicated that this transcript is unable to generate a functional protein.
Deduced Amino Acid Sequences of cDNAs from the Go-VN Family of Pheromone Receptors
The deduced amino acid sequences of eight cDNAs belonging to the Go-VN family of putative pheromone receptors are shown in Figure 3. The predicted positions of the seven transmembrane domains are also indicated (I-VII). Amino acids common to at least five cDNAs are shaded.
Amino acids common to the rat mGluR1 and Ca2+-sensing receptors are indicated by a star.
Hydropathy analysis of the predicted Go-VN proteins with the Kyte-Doolittle algorithm identified a large hydrophilic N-terminal domain that ranges in size from 274 amino acids in Go-VN1 (SEQ ID NO. 34) to 595 in Go-VN4 (SEQ ID NO. 40). This is preceded in cDNAs Go-VN4 (SEQ ID NO. 40), Go-VN7 (SEQ ID NO. 46), and Go-VN13C (SEQ ID NO. 50) by an initial hydrophobic 21 amino acid segment characteristic of eukaryotic signal sequences. A cluster of seven hydrophobic regions representing potential membrane-spanning helices and typical of the G protein-coupled receptor superfamily is followed by a short hydrophilic sequence that indicates a potential intracytoplasmic C-terminal domain. A database search indicated the presence of sequence motifs common to Ca2+-sensing and metabotropic glutamate (mGluR) receptors (Houamed, K., et al., Science, 1991, 252:1318-1321; Masu, M., et al., Nature, 1991, 349:760-765; Brown, E., et al., Nature, 1993, 366:575-580; Pollak, M., et al., Cell, 1993, 75:1297-1303). Pairwise sequence alignments reveal 18% to 23% sequence identity between the rat Ca2+-sensing receptor and the most distant (Go-VN3, SEQ ID NOs. 37, 38) and the closest (Go-VN1, SEQ ID NOs. 33, 34) Go-VN sequences, respectively. Sequences of rat mGluR1 and Go-VN cDNAs appear more distantly related. Several localized regions showed a more pronounced degree of similarity, including a cysteine-rich sequence just preceding the first transmembrane domain (amino acid 206 to 260 in Go-VN1, SEQ ID NO. 34), the predicted transmembrane domains 2 to 7 with surrounding cytoplasmic and extracellular loops, and the relative position of 20 cysteines. The N-terminal and first transmembrane domains show little homology. In mGluR and Ca2+-sensing receptors, the second intracellular loop is involved in providing specificity for G-protein coupling (Gomeza, J., et al., J. Biol. Chem., 1996, 271:2199-2205), enabling different classes of mGluR receptors to activate phospholipase C or to inhibit adenylyl cyclase. In Go-VN, this domain is rich in basic residues, as expected for potential G-protein coupling, and shows closer resemblance to the class II and III mGluRs that were shown to couple to Go and Gi subunits. Overall, the six Go-VN sequences share between 42% and 75% sequence identity. Regions of Go-VN proteins downstream of transmembrane domain 2 are nearly identical in all VNO receptor sequences.
In contrast, N-terminal extracellular regions and first transmembrane domains are quite divergent.
Anomalies in Go-VN cDNA Sequences: Two unusual features were observed in the sequence of some Go-VN cDNAs. In Go-VN1 (SEQ ID NO. 33) and Go-VN3 (SEQ ID NO. 37) cDNAs, stretches of open reading frame can be found in the 5' extremity of the cDNAs that generate polypeptide sequences of 310 and 152 amino acids, respectively, which are interrupted by a frameshift in Go-VN1 and by an insertion of 500 nucleotides in Go-VN3. The prospective receptor protein sequences indicated for Go-VN1 (SEQ ID NO. 33) and Go-VN3 (SEQ ID NO. 37) (Fig. 3) start at the next available methionine and are therefore significantly shorter than those of other receptor cDNAs.
Go-VN7 (SEQ ID NO. 45) and Go-VN13C (SEQ ID NO. 47) cDNAs show a similar deletion of 150 bp located at the exact same position in the sequence. Strikingly, the 150 bp deletion does not alter the open reading frame but generates a gap that encompasses 34 amino acids upstream of the first transmembrane domain and most of the first transmembrane domain itself. Hydropathy analysis of Go-VN7 (SEQ ID NO. 46) and Go-VN13C (SEQ ID NO. 48) protein sequences detects only a seven to eight amino acid long hydrophobic stretch that might not be long enough to replace the deleted transmembrane domain 1 and allow the appropriate folding of the protein. Except for the 150 bp gap, sequences of Go-VN13B (SEQ ID NO. 50) and Go-VN13C (SEQ ID NO. 48) are identical. This raises the question as to whether both transcripts might originate from alternative splicing of the same gene. Alternatively, they might be transcribed from independent genes that evolved from recent duplication and deletion events.
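The observation that the deletion leaves the reading frame intact follows from its length being a multiple of three. A short sketch of that check is given below; the function name is hypothetical.

```python
# A deletion preserves the reading frame only if its length is a multiple of 3.
def deletion_effect(deletion_bp):
    """Report whether a deletion is in frame and how many whole codons it spans."""
    codons, remainder = divmod(deletion_bp, 3)
    return {"in_frame": remainder == 0, "codons_removed": codons}


print(deletion_effect(150))
```

A deletion of 150 bp therefore removes whole codons only, so the downstream transmembrane domains are still read in the original frame.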

Size of the Go-VN Family of Genes
We investigated the size of the Go-VN family of receptors by hybridizing 32P-labeled cDNA probes prepared from regions spanning the most divergent N-terminal half of the receptor protein to rat genomic DNA. Individual probes identify two to four discrete bands under stringent conditions of hybridization and washing. Under conditions of reduced stringency, each of the individual probes generates a unique pattern of 12 to 20 bands, providing a direct illustration of the existence of a very large family of related genes.
A direct estimate of the size of the Go-VN receptor gene family was obtained by low stringency screening of a rat genomic library. PCR amplification on genomic DNA had indicated that receptor genes are devoid of introns in the region encompassing transmembrane domains 2 to 7, enabling us to deduce directly the number of genes present in the rat genome. A mix of 32P-labeled DNA probes prepared from the six Go-VN cDNA fragments identified 110 positive clones per haploid genome, indicating that the family of Go-VN receptors may consist of approximately 100 genes.
Expression Pattern of Go-VN Receptors
The pattern of expression of the Go-VN receptor genes was examined by in situ hybridization with digoxigenin-labeled RNA antisense probes. No signal was observed after hybridizing the mix of Go-VN1 to Go-VN6 (SEQ ID NOs. 33, 35, 37, 39, 41 and 43) receptor probes to sections of muscle, testis, brain, or whole head. The adult olfactory epithelium was also consistently negative, although rare positive cells (one to three cells per section) were observed in the olfactory neuroepithelium of E19 rat embryos. In contrast, strong signals were observed when antisense receptor RNA probes were hybridized to VNO neuroepithelium. In adults, each one of the Go-VN probes detects small subsets of VNO sensory neurons. When hybridization and washing were performed at lower temperature, the number of faintly labeled neurons increased, revealing cross-hybridization to more distant receptor genes.
Under high stringency conditions, cDNA clones Go-VN1 to Go-VN6 label 1.9%, 3.6%, 6.1%, 0.4%, 3.5%, and 1.3% of the VNO sensory neurons, respectively. Under the same experimental conditions, the mix of all six Go-VN RNA probes labels 19% of the cells. This number is similar to the sum of labeled neurons detected with the six individual Go-VN probes (17%), indicating that probes representing the six receptor subfamilies recognize distinct populations of VNO sensory neurons.
Spatial Distribution of Go-VN Receptor Transcripts
Positive neurons identified with each of the Go-VN probes were randomly distributed along the anteroposterior and dorso-ventral axes of the VNO neuroepithelium. Most RNA probes recognize cells that are preferentially localized in the most basal two-thirds of the neuroepithelium corresponding to the zone of Go expression. However, careful examination of adjacent cross-sections of vomeronasal neuroepithelium labeled with each of the Go-VN probes reveals a well-organized spatial distribution of receptor expression. Different receptors appear preferentially localized in radial zones that define a series of hemiconcentric rings of distinct diameters. This pattern is observed along the entire length of the VNO and is conserved in all animals analyzed. The Go-VN3 (SEQ ID NO. 37) probe, for example, recognizes a subset of neurons that are confined to the most basal third of the VNO neuroepithelium. In contrast, the Go-VN1 (SEQ ID NO. 33), Go-VN4 (SEQ ID NO. 39), and Go-VN5 (SEQ ID NO. 41) RNA probes identify cells restricted to a hemiconcentric zone immediately apical to the area of Go-VN3 expression, whereas Go-VN2 identifies cells apposed to the apical layer of supporting cells. Go-VN6 in turn is found only in sparse cells immediately apposed to the basal membrane. This is best seen in a statistical representation of Go-VN receptor localization collected from multiple VNO sections and animals that shows a striking conservation of these patterns. Thus, transcription of Go-VN cDNAs appears restricted to one of three circumscribed areas of the VNO neuroepithelium in a manner quite reminiscent of the odorant receptor gene expression in four zones of the MOE (Ressler, K., et al., Cell, 1993, 73:597-609; Vassar, R., et al., Cell, 1993, 74:309-318). Although Go-VN3 (SEQ ID NO. 37) and Go-VN6 (SEQ ID NO. 43) transcripts show a clear segregation in the most basal region of the VNO neuroepithelium, the sequence anomalies found in both transcripts leave the functionality of this area of the neuroepithelium as an open question.
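The additivity argument made earlier in this section for the six Go-VN probes can be restated numerically. The sketch below sums the individual labeling percentages reported for Go-VN1 to Go-VN6 and compares the total with the percentage labeled by the mixed probe. The variable names are hypothetical, and no allowance is made for counting error.

```python
# Compare the sum of individual probe labeling with the mixed-probe labeling.
individual_percent = {
    "Go-VN1": 1.9,
    "Go-VN2": 3.6,
    "Go-VN3": 6.1,
    "Go-VN4": 0.4,
    "Go-VN5": 3.5,
    "Go-VN6": 1.3,
}
MIXED_PROBE_PERCENT = 19.0  # reported for the mix of all six probes

total = sum(individual_percent.values())
print(f"sum of individual probes: {total:.1f}%")
print(f"mixed probe:              {MIXED_PROBE_PERCENT:.1f}%")
```

If the probes labeled largely overlapping cells, the mixed probe would label far fewer cells than the sum of the individual values; a mix that labels about as many cells as the sum is what one expects from non-overlapping populations.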
Sexual Dimorphism in Receptor Distribution and Age-Related Changes
To identify potential sexual dimorphism in Go-VN receptor expression, we systematically hybridized each probe to sections originating from adult male and female rat VNOs. All receptors were equally distributed in males and females with the striking exception of Go-VN2 (SEQ ID NO. 35). In females, Go-VN2 appears expressed in a large and centrally located region comprising one-third of the neuroepithelium. In sharp contrast, the same probe recognizes in males a cohort of cells in the most apical side of the neuroepithelium, closely apposed to the VNO lumen, and most likely intermingled with Gi2a+ VNO sensory neurons. Such a difference in the Go-VN2 expression pattern in males and females might result from the expression of the same receptor gene in a different zone of the VNO epithelium or from a differential expression of two distinct but closely related genes of the Go-VN2 subfamily. In females, Go-VN2 generates a very intense hybridization signal in most positive neurons and a fainter staining in a second set of labeled cells. The population of faintly labeled cells was never detected in males, indicating the existence of a female-specific neuronal subpopulation expressing either a lower level of the Go-VN2 transcript or a female-specific receptor significantly different but still cross-hybridizing to the Go-VN2 probe.
We followed the emergence of receptor expression and of the VNO zonal organization during development and postnatal stages preceding puberty. Go-VN receptor expression is first detected in the VNO of E14 embryos. No significant difference is observed in the onset of expression of the Gi2a-VN and Go-VN classes of receptor genes. In agreement with the data of Berghard and Buck (1996) in mouse, segregation of Gi2a and Go expression in the apical and basal areas of VNO neuroepithelium, respectively, is not apparent in the embryo and in 1-week-old animals. Instead, Gi2a+ cells appear randomly distributed in large clusters over the whole thickness of the neuroepithelium, intermingled with Go+ cells. At 4 weeks after birth, however, Gi2a+ cells appear clearly localized in the apex of the epithelium. Similarly, in situ hybridization experiments with mixes of Go-VN and Gi2a-VN receptor probes on sections of the VNOs dissected from late embryos and 1-week-old animals show that the two cell populations are still intermingled at early postnatal stages. We observed that the zonal distribution of the two families of receptors slowly emerges during sexual maturation to reach the spatial distribution observed in adults. Preliminary data indicate that the sexually dimorphic expression pattern of Go-VN2 is undetectable at 6 weeks after birth. Thus, in contrast to the zones of olfactory receptor gene expression, which are already present in the olfactory epithelium at the earliest stages of receptor gene expression in the embryo (Sullivan, S., et al., Neuron, 1995, 15:779-789), the spatial organization of the VNO neuroepithelium as detected by G-protein and receptor gene expression emerges only in a late postnatal period and reaches its definitive pattern at sexual maturity.
Expression of Go-VN Receptors Is Restricted to Go+ VNO Neurons
The expression of some of the Go-VN receptors in neurons lining the VNO lumen, in an area mainly occupied by Gi2a+ cells, raises the obvious question as to whether the expression of this family of genes is strictly restricted to Go+ VNO neurons. Single-cell cDNA prepared from 23 individual VNO neurons was analyzed by Southern blots with probes representing the six divergent subfamilies of Go-VN receptors and was PCR amplified with degenerate primers based on conserved motifs between Go-VN receptor sequences. Both approaches confirmed that none of the 19 cell cDNAs prepared from Gi2a+ neurons contained any sequence of the Go-VN receptor family. In contrast, all four cDNAs generated from Go+ cells contained a sequence related to the Go-VN receptors. PCR products generated with the degenerate primers and obtained from the four Go+ cells were subcloned and sequenced. For each single-cell cDNA, the insert sequences from ten independent colonies were found to be identical. This set of data strongly suggests that Go-VN receptor genes are not expressed by Gi2a+ neurons and constitutes preliminary evidence for the expression of only one Go-VN receptor gene per neuron.
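As a rough illustration of why so clean a split is informative, the counts reported above (19 Gi2a+ cells, none positive for a Go-VN sequence; 4 Go+ cells, all positive) can be placed in a 2x2 table and the chance of that exact split computed from the hypergeometric distribution. This calculation is not part of the original analysis: the function name and the null model (positive status assigned to four of the 23 cells at random) are assumptions made only for this sketch.

```python
from math import comb

def exact_split_probability(n_first_class: int, n_second_class: int) -> float:
    """Chance that all positives land in the second class if positive status
    were assigned at random to n_second_class of the pooled cells.

    Hypothetical helper for illustration; not from the patent.
    """
    total = n_first_class + n_second_class
    # Exactly one of the comb(total, n_second_class) equally likely
    # assignments puts every positive cell in the second class.
    return 1 / comb(total, n_second_class)

# Counts taken from the text: 19 Gi2a+ cells (0 positive), 4 Go+ cells (4 positive).
p = exact_split_probability(19, 4)
print(f"probability of the observed split under random assignment: {p:.2e}")
```

Under this assumed null model the observed split is one of 8,855 equally likely assignments, which is in line with the description of the result as strong, though still based on a small sample of cells.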
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims. All references disclosed herein are incorporated by reference in their entirety.
A Sequence Listing is presented below and is followed by what is claimed.

SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: PRESIDENT AND FELLOWS OF HARVARD COLLEGE
(ii) TITLE OF THE INVENTION: NOVEL PHEROMONE RECEPTORS
(iii) NUMBER OF SEQUENCES: 92
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Wolf, Greenfield & Sacks, P.C.
(B) STREET: 600 Atlantic Avenue
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: U.S.A.
(F) ZIP: 02210-2211
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/051,284
(B) FILING DATE: 30-JUN-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Plumer, Elizabeth R.
(B) REGISTRATION NUMBER: 36,637
(C) REFERENCE/DOCKET NUMBER: H0498/7074
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-720-3500
(B) TELEFAX: 617-720-2441
(C) TELEX:
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 57...2606
(D) OTHER INFORMATION: VR1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Lys Gln Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Pro Ser Cys Phe Trp Arg Ile Arg Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr Asp Leu Ser Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Ile Asp Glu Ile Asn Arg Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Phe Ile Gly Gly Asn Cys Gln Asp Leu Leu ', Arg Val Met Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn Phe TAT TGT TTA TGT
GCC
ATA
GGT

Val Asn Phe Tyr Asp AspSerCys IleGly Leu Thr Tyr Cys Leu Ala TCA AAA TCC ATG

Gly Pro Trp Thr Leu LysLeuAla HisSer Ser Met Ser Lys Ser Met ', 150 155 160 GTT TTT CCA CTA

Pro Leu Phe Gly Phe AsnProAsn ArgAsp His Asp Val Phe Pro Leu ', CGG CTG CAT CAT GTA GCCCCCAAG ACACAT TTG TCC 635 CCC GTC CAG GAC

Arg Leu His His Val AlaProLys ThrHis Leu Ser Pro Val Gln Asp ATG TCC ATG TGG

His Gly Val Leu Phe HisPheArg ThrTrp Ile Gly Met Ser Met Trp ATC GAT GAC TTT

Leu Val Ser Asp Gln GlyIleGln LeuSer Asp Leu Ile Asp Asp Phe GAA CAA CAT GCT

Arg Glu Ser Arg Gly IleCysLeu PheVal Asn Met Glu Gln His Ala GAA ATG ATA GCT

Ile Pro Asn Gln Tyr MetThrArg ThrIle Tyr Asp.
Glu Met Ile Ala ATG AAG GTT TAT

Lys His Ile Thr Ser Ser Ala Val Ile Ile Gly Glu Met Lys Val Tyr ACT TTT AGA GAG

Met Asn Ser Leu Glu Ala Ser Arg Trp Glu Leu Gly Thr Phe Arg Glu ATC TCA CAA ATC

Ala Arg Arg Trp Ile Thr Thr Trp Asp Val Thr Asn Ile Ser Gln Ile TTC TTC CAT ACT

Lys Lys Asp Thr Leu Asn Leu Gly Ile Ile Phe Glu Phe Phe His Thr TTT TTA AAT CAA

His His Arg Glu Ile Pro Lys Lys Phe Met Thr Met Phe Leu Asn Gln AAA ATT TCT TTG

Asn Thr Ala Tyr Pro Val Asp His Thr Ile Glu Trp Lys Ile Ser Leu AAT AAG AAC ATG

Asn Tyr Phe Cys Ser Ile Ser Ser Ile Arg His His Asn Lys Asn Met AAC TGG ACA AAC

Ile Thr Phe Asn Thr Leu Glu Ser Leu His Tyr Asp Asn Trp Thr Asn AGT AAT TTG GTT

Val Ala Met Asp Glu Gly Tyr Tyr Asn Ala Tyr Ala Ser Asn Leu Val ACC ATT TTT GAG

Val Ala His Tyr His Glu Tyr Gln Gln Val Ser Gln Thr Ile Phe Glu AAA TTC ACT CAG

Lys Lys Ala Pro Lys Arg Tyr Ala Cys Gln Val Ser Lys Phe Thr Gln AAA ACG AAC GAA

Ser Leu Met Thr Arg Val Phe Pro Val Gly Leu Val Lys Thr Asn Glu CAT TGT ACA ATT

Asn Met Lys Arg Glu Asn Gln Glu Tyr Asp Phe Ile His Cys Thr Ile TTT GGA TTA ATA

Ile Trp Asn Pro Gln Gly Leu Lys Val Lys Gly Ser Phe Gly Leu Ile TGT CAA AAA TCT

Tyr Leu Pro Phe Pro Gln Arg Leu His Ile Asp Asp Cys Gln Lys Ser GCC TCA CCT TCC

Leu Glu Trp Lys Gly Gly Thr Gln Val Pro Ser Val Ala Ser Pro Ser Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile Tyr Gln Lys Glu Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn Glu Ile ~ TCC AAC GAA ACA GAT ATG GAA CAG TGT GTG AGG TGT CCA GAT GAT AAG 1739 _ Ser Asn Glu Thr Asp Met Glu Gln Cys Val Arg Cys Pro Asp Asp Lys Tyr AlaAsn IleGluGln ThrHisCysLeu SerArgAla ValSer Phe Leu AlaTyr GluAspSer LeuGlyMetAla LeuGlyCys MetAla Leu Ser PheSer AlaIleThr IleLeuIleLeu ValThrPhe ValLys Tyr Lys AspThr ProThrVal LysAlaAsnAsn ArgIleLeu SerTyr Ile Leu LeuIle SerLeuVal PheCysPheLeu CysSerLeu LeuPhe Ile Gly ProPro AspGlnVal ThrCysIlePhe GlnGlnThr ThrPhe Gly Val LeuPhe ThrValSer ValSerThrVal LeuAlaLys ThrIle Thr Val ValMet AlaPheLys LeuThrThrPro GlyArgArg MetArg Gly Met MetMet ThrGlyAla ProLysLeuVal IleProIle CysThr Leu Ile GlnLeu ValLeuCys GlyIleTrpLeu ValThrSer ProPro Phe Ile AspArg AspIleGln SerGluHisGly LysIleVal IleLeu Cys Asn LysGly SerValIle AlaPheHisVal ValLeuGly TyrLeu Gly Ser LeuAla LeuGlySer PheThrLeuAla PheLeuAla ArgAsn Leu .

ProAsp ThrPheAsn Glu Lys PheLeuThr PheSerMet LeuVal Ala PheCys SerValTrp IleThrPhe LeuProVal TyrHisSer ThrArg GlyArg ValMetVal ValValGlu ValPheSer IleLeuAla SerSer AlaGly LeuLeuMet CysIlePhe ValProLys CysTyrVal IleLeu IleArg ProAspSer AsnPheIle LysAsnHis LysGlyLys LeuLeu TATTGAAACTTTC GATATTCAAC TTATCTTATT

ATGGTATGAA CTTCAT
AATGTTAGAT

Tyr
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 850 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Lys Gln Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Pro Ser Cys Phe Trp Arg Ile Arg Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr Asp Leu Ser Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Ile Asp Glu Ile Asn Arg Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Phe Ile Gly Gly Asn Cys Gln Asp Leu Leu Arg Val Met Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys His Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val Ile Thr Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Ile Ile Thr Phe Glu His His Arg Phe Glu Ile Pro Lys Leu Asn Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ile Arg Met His His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ser Leu His Asn Tyr Asp Val Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu Tyr Ile Phe Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr Ala Cys Gln Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys Ile Gly Ser Tyr Leu Pro Cys Phe Pro Gln Arg Gln Lys Leu His Ile Ser Asp Asp Leu Glu Trp 
Ala Lys Gly Gly Thr Ser Pro Gln Val Pro Ser Ser Val Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile Tyr Gln Lys Glu Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asp Met Glu Gln Cys Val Arg Cys Pro Asp Asp Lys Tyr Ala Asn Ile Glu Gln Thr His Cys Leu Ser Arg Ala Val Ser Phe Leu Ala Tyr Glu Asp Ser Leu Gly Met Ala Leu Gly Cys Met Ala Leu Ser Phe Ser Ala Ile Thr Ile Leu Ile Leu Val Thr Phe Val Lys Tyr Lys Asp Thr Pro Thr Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly Pro Pro Asp Gln Val Thr Cys Ile Phe Gln Gln Thr Thr Phe Gly Val Leu Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met Arg Gly Met Met Met Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys Thr Leu Ile Gln Leu Val Leu Cys Gly Ile Trp Leu Val Thr Ser Pro Pro Phe Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile Leu Cys Asn Lys Gly Ser Val Ile Ala Phe His Val Val Leu Gly Tyr Leu Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr Arg Gly Arg Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala Ser Ser Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val Ile Leu Ile Arg Pro Asp Ser Asn Phe Ile Lys Asn His Lys Gly Lys Leu Leu Tyr (2) INFORMATION FOR SEQ ID N0:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2961 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 86...2509
(D) OTHER INFORMATION: VR2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GTGCAACTGT
GTGTGTGATG
TTTTTCTGCA
TCAGAAACGG
ATTTCACAGC

CAGATCCTAG
CAGAC

MetLysGln LeuCysThr PheThrIle TTG AAG

Ser LeuPhe Leu Phe SerLeuIle LeuCysCys TrpSerGlu Leu Lys AGC AGG

Pro CysPhe Trp Ile LysLysSer GluAspAsn AspGlyAsp Ser Arg CAA CAT

Leu ArgGlu Cys Phe TyrLeuTrp LysThrAsp GluProIle Gln His GAT AAT

Glu SerPhe Tyr Tyr AspLeuSer PheArgIle AlaGlySer Asp Asn TAT CTG

Glu GluLeu Leu Val MetPhePhe AlaThrAsp GluIleAsn Tyr Leu Lys Asn Pro Tyr Leu Leu Pro Asn Met Ser Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Leu Arg Ser Leu Asp Gln Glu Tyr A1a Gln Ile Asp Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp ' 125 130 135 Asp Ser Cys Ala Thr Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe CGC CGG CAT GTC GTA

', Asn ProAsn LeuArgAsp His Asp Leu Pro Val HisGlnVal Arg His j GCC CCCAAG GACACACAT TTG TCC GGC ATG TCC TTGATGTTT 688 CAT GTC

Ala ProLys AspThrHis Leu Ser Gly Met Ser LeuMetPhe His Val CTG TCA

His PheArg TrpThrTrp Ile Gly Val Ile Asp AspAspGln Leu Ser ', 205 210 215 AGA AGC

Gly IleGln PheLeuSer Asp Leu Glu Glu Gln ArgHisGly Arg Ser ', 220 225 230 ATC AAC

Ile CysLeu AlaPheVal Asn Met Pro Glu Met GlnIleTyr Ile Asn I

ACA ATG

Met ThrArg AlaThrIle Tyr Asp Gln Ile Thr SerSerAla Thr Met ATG ACT

', Lys ValVal IleIleTyr Gly Asp Asn Ser Leu GluAlaSer Met Thr GCT ATC

Phe ArgArg TrpGluGlu Leu Gly Arg Arg Trp IleThrThr Ala Ile Thr Gln Trp Asp Val Ile Thr Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Thr Ile Thr Phe Ala His His Lys Asp Glu Ile Pro Lys Phe Arg Asn Phe Met Gln Thr Lys Lys Thr Ala Lys Tyr Leu Val Asp I, ATT TCT CAT ACT ATT TTG GAG TGG AAT TAT TTT AAT TGT TCA ATC TCT 1168 IleSer ThrIle Leu Trp TyrPhe CysSer IleSer His Glu Asn Asn TTT AAC

LysAsn SerSerLys MetGlyHis ThrPhe AsnThr LeuGln Phe Asn ATG AGC

TrpThr AlaLeuHis AsnTyrAsp AlaLeu AspGlu GlyTyr Met Ser GTG ACC

AsnLeu TyrAsnAla ValTyrAla AlaHis TyrHis GluTyr Val Thr AAA AAA

IleLeu GlnGlnVal GluSerGln LysAla ProLys ArgTyr Lys Lys TCC AAA

PheThr AlaCysGln GlnValSer LeuMet ThrArg ValPhe Ser Lys AAC CAT

MetAsn ProValGly GluLeuVal MetLys ArgGlu AsnGln Asn His ATT TTT

CysThr GluTyrAsp IlePheIle TrpAsn ProGln GlyLeu Ile Phe TAT TGC

GlyLeu LysValLys ValGlySer LeuPro PhePro LysSer Tyr Cys TTG GCC

GlnGln LeuHisIle AlaAspAsp GluTrp MetGly GlyThr Leu Ala AGA GAT

SerVal AspMetGlu GlnCysVal CysPro AsnLys TyrAla Arg Asp CAA GTG

AsnLeu GluGlnThr HisCysLeu ArgThr SerPhe LeuAla Gln Val CTA ATG

TyrGlu AspProLeu GlyMetAla GlyCys AlaLeu SerPhe Leu Met GTC GTG

SerAla IleThrIle LeuValLeu ThrPhe LysTyr LysAsp Val Val CGC AGC

ThrPro IleValLys AlaAsnAsn IleLeu TyrIle LeuLeu Arg Ser TGT CTC

IleSer LeuValPhe CysPheLeu SerLeu PheIle GlyHis Cys Leu CAG ACA

ProAsp GlnValThr CysIleLeu GlnThr PheGly ValLeu Gln Thr .

TCT GTG AAA ATA

Phe Thr Val Ser Val Thr LeuAla Thr ThrValVal Ser Val Lys Ile ACT CCA AGG AGA

Met Ala Phe Lys Leu Thr GlyArg Met GlyMetMet Thr Pro Arg Arg AAG GTC ATT ACC

Met Thr Gly Ala Pro Leu IlePro Cys LeuIleGln Lys Val Ile Thr ATC TTG TCT CCC

Leu Val Leu Cys Gly Trp ValThr Pro PheIleAsp Ile Leu Ser Pro GAA GGG GTC CTT

Arg Asp Ile Gln Ser His LysIle Ile CysAsnLys Glu Gly Val Leu TTC GTC GGA TTG

Gly Ser Val Val Ala His ValLeu Tyr GlySerLeu Phe Val Gly Leu 700 . 705 710 ACT GCT GCT AAC

Ala Leu Gly Ser Phe Leu PheLeu Arg LeuProAsp Thr Ala Ala Asn AAG CTA AGC CTG

Thr Phe Asn Glu Ala Phe ThrPhe Met ValPheCys Lys Leu Ser Leu TTC CCT CAC ACC

Ser Val Trp Ile Thr Leu ValTyr Ser ArgGlyLys Phe Pro His Thr GAG TTC TTG TCT

Val Met Val Val Val Val SerIle Ala SerAlaGly Glu Phe Leu Ser TTT CCA TAT ATT

Leu Leu Met Cys Ile Val LysCys Val LeuIleArg Phe Pro Tyr Ile ATA AAC GGT TTG

Pro Asp Ser Asn Phe Gln HisLys Lys LeuTyr Ile Asn Gly Leu TAGATGATAT TCTTAATAAA
TCAACTTATC

AAAATAAAGT CAAACTGGAC
AATATACAGA

GAACTGGGAT CCAATATTTT
TCTCAATTGA

AGCCATGTAC GGTTACCCTA
TTAATTAATG

CTCTAGGCAT AAGGGTACTG
GCTGTCCTTG

CCAGTAATCA ATGGAGTTCT
ACATTATTCC

GACTTTATTC GAATAAATAA
AATGTTCTAT

AAAAAAA
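The protein records in this listing use three-letter residue codes, which most sequence-analysis tools do not accept directly. A minimal sketch of a converter to the standard one-letter code is given here; the function name `to_one_letter` is hypothetical, and the input is the first 16 residues of SEQ ID NO:2 as printed above. Because unknown tokens raise `KeyError`, the same function can also be used to flag OCR-damaged residues or stray page numbers in a transcribed listing.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(listing: str) -> str:
    """Convert whitespace-separated three-letter residues to a one-letter string.

    Hypothetical helper for illustration. Raises KeyError on any token that
    is not one of the 20 standard residue codes.
    """
    return "".join(THREE_TO_ONE[token] for token in listing.split())

# First 16 residues of SEQ ID NO:2 as given in the listing.
n_terminus = "Met Lys Gln Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe"
print(to_one_letter(n_terminus))
```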

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 808 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Lys Gln Leu Cys Thr Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile Lys Lys Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr Asp Leu Ser Phe Arg Ile Ala Gly Ser Glu Tyr Glu Leu Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Met Ser Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Leu Arg Ser Leu Asp Gln Glu Tyr Ala Gln Ile Asp Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Thr Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Thr Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Asp Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Thr Gln Trp Asp Val Ile Thr Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Thr Ile Thr Phe Ala His His Lys Asp Glu Ile Pro Lys Phe Arg Asn Phe Met Gln Thr Lys Lys Thr Ala Lys Tyr Leu Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Gly His Phe Thr Phe Asn Asn Thr Leu Gln Trp Thr Ala Leu His Asn Tyr Asp Met Ala Leu Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu Tyr Ile Leu Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr Ala Cys Gln Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Met Asn Pro Val Gly Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys Val Gly Ser Tyr Leu Pro Cys Phe Pro Lys Ser Gln Gln Leu His Ile Ala Asp Asp Leu 
Glu Trp Ala Met Gly Gly Thr Ser Val Asp Met Glu Gln Cys Val Arg Cys Pro Asp Asn Lys Tyr Ala Asn Leu Glu Gln Thr His Cys Leu Gln Arg Thr Val Ser Phe Leu Ala Tyr Glu Asp Pro Leu Gly Met Ala Leu Gly Cys Met Ala Leu Ser Phe Ser Ala Ile Thr Ile Leu Val Leu Val Thr Phe Val Lys Tyr Lys Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly His Pro Asp Gln Val Thr Cys Ile Leu Gln Gln Thr Thr Phe Gly Val Leu Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met Arg Gly Met Met Met Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys Thr Leu Ile Gln Leu Val Leu Cys Gly Ile Trp Leu Val Thr Ser Pro Pro Phe Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile Leu Cys Asn Lys Gly Ser Val Val Ala Phe His Val Val Leu Gly Tyr Leu Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr Arg Gly Lys Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala Ser Ser Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val Ile Leu Ile Arg Pro Asp Ser Asn Phe Ile Gln Asn His Lys Gly Lys Leu Leu Tyr (2) INFORMATION FOR SEQ ID N0:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2907 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...2409
(D) OTHER INFORMATION: VR3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

His Phe Tyr Leu Gly Ala Val Asp Lys Pro Ile Glu Asp Asn Phe Tyr.

TCA TTA AGA GCA GAG

Asn Leu LysPhe Ile Ala Ser GluTyr PheLeu Ser Leu Arg Ala Glu GTA TTT ACT ATC CCT

Leu Met PheAla Asp Glu Asn LysAsn TyrLeu Val Phe Thr Ile Pro CCC ATA ATG ATC AAC

Leu Asn ThrLeu Phe Ser Ile GlyGly CysHis Pro Ile Met Ile Asn TTA AGA GAT TAT AAT

Asp Leu GlyLeu Gln Ala Thr GlnIle GlyHis Leu Arg Asp Tyr Asn AAT GTT TTC TTA TGT

Met Phe AsnTyr Cys Tyr Asp AspSer AlaIle Asn Val Phe Leu Cys CTT GGA TGG TCC GCA

Gly Thr ProSer Lys Thr Leu AsnLeu MetHis Leu Gly Trp Ser Ala TCA CCA TTC TCA AAC

Ser Met LeuVal Phe Gly Phe AsnPro LeuHis Ser Pro Phe Ser Asn CAT CGG CAT CAA AAG

Asp Asp LeuHis Val His Val AlaThr AspThr His Arg His Gln Lys TTG CAT GTC ATG AGA

His Ser GlyIle Ser Leu Phe HisPhe TrpThr Leu His Val Met Arg ATA CTG TCA GAC CAG

Trp Gly ValIle Asp Asp Lys GlyIle PheLeu Ile Leu Ser Asp Gln GAT AGA AGC CAT TTA

Ser Leu GluGlu Gln Arg Gly IleCys AlaPhe Asp Arg Ser His Leu AAT ATC AAC ATA AGG

Val Met ProGlu Met Gln Tyr MetThr AlaThr Asn Ile Asn Ile Arg TAT AAA ATG TTA GTT

Ile Asp GlnIle Thr Ser Ala LysVal IleIle Tyr Lys Met Leu Val GGT ATG ACA GTA AGA

Tyr Glu AsnSer Leu Glu Ser PheArg TrpGlu Gly Met Thr Val Arg TTA GCT ATC ACA TGG

Asn Gly ArgArg Trp Ile Thr SerGln AspVal Leu Ala Ile Thr Trp ACA AAA TTC AAT GGG

Ile Asn LysGlu Thr Leu Leu PheHis ThrIle Thr Lys Phe Asn Gly Thr Phe Ala His Arg Arg Phe Glu Ile Pro Lys Phe Lys Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp His IleThr Asn Thr GluTrp Leu His Phe Asn Leu Thr Ala ATG GAT GGT TTG

Asn Tyr AspMetVal Ser Glu TyrAsn TyrAsn Ala Met Asp Gly Leu CAC TAC GAA TTT

Val Tyr AlaValAla Thr His HisIle GlnGln Val His Tyr Glu Phe GCA CCC AGA ACT

Glu Ser GlnLysLys Lys Lys PhePhe ValCys Gln Ala Pro Arg Thr ATG ACC GTA AAC

Gln Val SerSerLeu Lys Arg PheThr ProVal Gly Met Thr Val Asn AAG AGG AAT ACA

Glu Leu ValAsnMet His Glu GlnCys GluTyr Asp Lys Arg Asn Thr AAC CCA GGC TTA

Ile Phe LeuIleTrp Phe Gln LeuGly LysVal Lys Asn Pro Gly Leu CCT TTT CAG GAA

Ile Gly SerTyrLeu Cys Pro ArgGln LeuHis Ile Pro Phe Gln Glu TGG ATG GGA GTG

Ser Asp AspLeuGlu Ala Gly ThrSer ValPro Ser Trp Met Gly Val GCA ACT GGA AAA

Ser Val CysSerVal Cys Ala PheArg IleHis Gln Ala Thr Gly Lys TGC TTT TGT TGC

Lys Glu ThrAlaAsp Cys Asp ValGln ProGlu Asn Cys Phe Cys Cys ACA ATG CAG AAG

Glu Val SerAsnGlu Asp Glu CysVal CysPro Tyr Thr Met Gln Lys ATA AAA CAC TCA

Asp Lys TyrAlaAsn Glu Thr CysLeu ArgAla Val Ile Lys His Ser GAA CCA GGG CTA .

SerPhe Leu Tyr Glu Asp LeuGlyIle LeuGlyCys Ile Ala Pro Ala ACA

AlaLeu SerPheSer Ala Ile IleLeuValLeu IleThrPhe Leu Thr GTG

LysTyr LysAspThr Pro Ile LysAlaAsnAsn ArgIleLeu Ser Val GTC

TyrIle LeuLeuIle Ser Leu PheCysPheLeu CysSerLeu Leu Val GTC

PheIle GlyHisPro Asn Gln SerCysValLeu GlnGlnThr Thr Val TCT

PheGly ValPhePhe Thr Val ValSerThrVal LeuAlaLys Thr Ser AAG

IleThr ValValMet Ala Phe LeuThrThrPro GlyArgArg Met Lys GCA

ArgGlu MetLeuVal Thr Gly ProLysLeuVal IleProIle Cys Ala TGT

ThrLeu IleGlnPhe Val Leu GlyIleTrpLeu IleThrSer Pro Cys CAA

ProPhe IleAspArg Asp Ile SerGluHisGly LysIleVal Ile Gln ATT

LeuCys AsnLysGly Ser Val AlaPheHisVal ValLeuGly Tyr Ile AGC

LeuGly SerLeuAla Leu Gly PheThrLeuAla PheLeuAla Arg Ser GAA

AsnLeu ProAspThr Phe Asn AlaLysPheLeu ThrPheSer Met Glu ATC

LeuVal PheCysSer Val Trp ThrPheLeuPro ValTyrHis Ser Ile GTT

ThrArg GlyLysVal Met Val ValGluValPhe SerIleLeu Ala Val TGT

SerSer AlaGlyLeu Leu Met IlePheValPro LysCysTyr Val Cys AAT

IleLeu ValArgPro Asp Ser PheIleArgLys TyrLysAsp Lys Asn .

Phe Arg Tyr ATAAAAATTT AAATAATATA CAAATTTGAA

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 803 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
His Phe Tyr Leu Gly Ala Val Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Asn Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Ser Phe Asn Pro Asn Leu His Asp His Asp Arg Leu His His Val His Gln Val Ala Thr Lys Asp Thr His Leu Ser His Gly Ile Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Lys Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Leu Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asn Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val Ile Thr Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Thr Ile Thr Phe Ala His Arg Arg Phe Glu Ile Pro Lys Phe Lys Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile .

Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ala Leu His Asn Tyr Asp Met Val Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu His Ile Phe Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Phe Phe Thr Val Cys Gln Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe Leu Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Val Lys Ile Gly Ser Tyr Leu Pro Cys Phe Pro Gln Arg Gln Glu Leu His Ile Ser Asp Asp Leu Glu Trp Ala Met Gly Gly Thr Ser Val Val Pro Ser Ser Val Cys Ser Val Ala Cys Thr Ala Gly Phe Arg Lys Ile His Gln Lys Glu Thr Ala Asp Cys Cys Phe Asp Cys Val Gln Cys Pro Glu Asn Glu Val Ser Asn Glu Thr Asp Met Glu Gln Cys Val Lys Cys Pro Tyr Asp Lys Tyr Ala Asn Ile Glu Lys Thr His Cys Leu Ser Arg Ala Val Ser Phe Leu Ala Tyr Glu Asp Pro Leu Gly Ile Ala Leu Gly Cys Ile Ala Leu Ser Phe Ser Ala Ile Thr Ile Leu Val Leu Ile Thr Phe Leu Lys Tyr Lys Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Leu Ile Ser Leu Val Phe Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly His Pro Asn Gln Val Ser Cys Val Leu Gln Gln Thr Thr Phe Gly Val Phe Phe Thr Val Ser Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Met Ala Phe Lys Leu Thr Thr Pro Gly Arg Arg Met Arg Glu Met Leu Val Thr Gly Ala Pro Lys Leu Val Ile Pro Ile Cys Thr Leu Ile Gln Phe Val Leu Cys Gly Ile Trp Leu Ile Thr Ser Pro Pro Phe Ile Asp Arg Asp Ile Gln Ser Glu His Gly Lys Ile Val Ile Leu Cys Asn Lys Gly Ser Val Ile Ala Phe His Val Val Leu Gly Tyr Leu Gly Ser Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr Arg Gly Lys Val Met Val Val Val Glu Val Phe Ser Ile Leu Ala Ser Ser Ala Gly Leu Leu Met Cys Ile Phe Val Pro Lys Cys Tyr Val Ile Leu Val Arg Pro Asp Ser Asn Phe Ile Arg Lys Tyr Lys Asp Lys Phe Arg 
Tyr
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3625 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 117...2672
(D) OTHER INFORMATION: VR4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
', TGAATATGCA 60 ATAAACCTCA
CATTTGCACA
AAGAAATAAA
AGCTGGTAGA
AATCTGATGT

GCTGATATGC ATGGCACTTC TTAAGGCAGG

ACAATCCGCA AAAAAG
CTGCCCAGGT ATG

Met TTC AAT ACA

Phe IlePhe Met Gly Val Phe LeuLeu Ile Leu Leu Met Phe Asn Thr ', GCC AATTTC ATT GAT CCC AGG TTTTGG ATA TTG GAT GAA 215 TGC AGA AAT

Ala AsnPhe Ile Asp Pro Arg PheTrp Ile Leu Asp Glu Cys Arg Asn TTA GCT ATC

', Ile ThrAsp Glu Tyr Leu Gly SerCys Phe Leu Ala Ala Leu Ala Ile GAT AAC ACT

Val GlnThr Pro Ile Glu Lys TyrPhe Thr Leu Asn Phe Asp Asn Thr j 50 55 60 65 AAA TTG TTG

Leu LysThr Thr Lys Asn His TyrAla Ala Vai Phe Ala Lys Leu Leu CCT TTA AAT

Met AspGlu Ile Asn Arg Tyr AspLeu Pro Met Ser Leu Pro Leu Asn Ile Ile Arg Tyr Ser Leu Gly His Cys Asp Gly Lys Thr Val Thr Pro Thr Pro Tyr Leu Phe His Arg Lys Lys Gln Ser Pro Ile Pro Asn Tyr Phe Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro Asn Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu Ser ', 150 155 160 Pro Arg Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe Ser .

CCC TAT GCC GAC

AspAsp Glu Gln Tyr Tyr Leu Gln Met Pro Lys Thr Pro Tyr Ala Asp ATG TTC TAT TGG

SerLeu Ala Leu Ala Val Ser Ile Leu Leu Lys Asn Met Phe Tyr Trp ATC GAT GGA TTT

TrpIle Gly Leu Val Pro Asp Asp Gln Asn Gln Leu Ile Asp Gly Phe CAG AAC ATT GCC

LeuGlu Leu Lys Lys Ser Glu Lys Glu Cys Phe Phe Gln Asn Ile Ala GTT GTT CCA ACT

ValLys Met Ile Ser Asp Glu Ser Phe Gln Lys Glu Val Val Pro Thr ATT TCA AAT ATC

IleAsn Tyr Lys Gln Val Lys Leu Thr Val Ile Ile Ile Ser Asn Ile AAT GAT TTC TGG

TyrGly Glu Thr Tyr Phe Ile Leu Ile Arg Met Glu Asn Asp Phe Trp AGA ATC AAA AAT

ProPro Ile Leu Gln Ile Trp Thr Thr Gln Leu Phe Arg Ile Lys Asn GAC CAT TTC TCA

ProThr Ser Lys Thr Ile Ser Asp Thr Tyr Gly Leu Asp His Phe Ser CAT ATT TTT TTT

ThrPhe Leu Pro His Gly Glu Ser Gly Lys Asn Val His Ile Phe Phe CTC ACA TGT ATG

GinThr Trp Phe His Arg Asn Asp Leu Leu Val Pro Leu Thr Cys Met AAC GAC TCT AAA

GluTrp Lys Tyr Ile Ser Glu Ser Ala Asn Cys Ile Asn Asp Ser Lys TCT TCA TGG GAA

LeuLys Asn Ser Ser Asp Ala Phe Asp Leu Met Glu Ser Ser Trp Glu TTT AAT AAC AAT

LysLeu Asp Met Ala Ser Glu Ser His Ile Tyr Ala Phe Asn Asn Asn CAT CAT AAT CAG

ValHis Ala Ile Ala Ala Leu Glu Met Leu Gln Ala His His Asn Gln GAT AAA AGT TGC

AspAsn Gln Ala Ile Asn Gly Gly Ala Ser His Leu Asp Lys Ser Cys Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr Asp Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His Leu Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser ', 500 505 510 Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys ', 515 520 525 Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu ', Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly Val Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Met Ala Phe CysPhe SerAlaPheThr Val LeuCys PheVal Ala Val Val AAT AGC

Lys His HisAsp ThrProIleVal LysAla AsnArg LeuSer Asn Ser TTT TCC

Tyr Leu LeuLeu MetSerLeuMet PheCys LeuCys PhePhe Phe Ser GTC CAA

Phe Ile GlyLeu ProAsnLysVal IleCys LeuGln IleThr Val Gln ACA GCC

Phe Gly IleVal PheThrValAla ValSer ValLeu LysThr Thr Ala GTC AGA

', Val Thr ValVal LeuAlaPheLys ValThr ProGly ArgLeu Val Arg AGA TAC TTC CTT GTA TCA GGG ACA CTA AAC TAC ATT ATT CCT ATA TGT. 2231 Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys GTC TCT CCT

Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro ATC ATC ATT

Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile CTT GGA TAC

Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr TTG GCC AAG

Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys TTC AGC ATG

Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met TAC CAT AGC

Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser ATC TTG GCA

Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala ATT TAT ATC

Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr Ile AGA GAA AAA

Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys ACATAACCA

Ser Tyr Phe GGTTGCTCTA

AATCTTGCAC

CAATTTTATT

GTTGATAAGG

GGTTACACAT

ATAATCAGCA

AGAAAATACT

GAAATGTTCC

CAGGGATTCT

ATTCTCAACA

TACACAAGCT

CAGTGGGAGA

GCATTGGGGA

GTCAGTGGGG

AATAAATTAA

AAAA
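Each cDNA record above states a coding-sequence LOCATION, and each is paired with a protein record that states a LENGTH in amino acids. The two can be cross-checked with simple arithmetic: a span of `end - start + 1` bases should divide into the stated number of codons. The sketch below does this for the three cDNA/protein pairs whose values both appear above; the function name `cds_codons` is hypothetical, and the coordinates are assumed to be 1-based and inclusive.

```python
import re

def cds_codons(location: str) -> int:
    """Number of whole codons spanned by a 'start...end' LOCATION field.

    Hypothetical helper; assumes 1-based, inclusive coordinates.
    """
    match = re.fullmatch(r"(\d+)\.\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unrecognized LOCATION: {location!r}")
    start, end = map(int, match.groups())
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"{location} does not span a whole number of codons")
    return span // 3

# (cDNA SEQ ID NO, LOCATION, protein SEQ ID NO, stated protein LENGTH),
# all values copied from the records in this listing.
records = [
    (1, "57...2606", 2, 850),
    (3, "86...2509", 4, 808),
    (5, "1...2409", 6, 803),
]

for cdna_id, location, protein_id, length in records:
    codons = cds_codons(location)
    print(f"SEQ ID NO:{cdna_id} -> NO:{protein_id}: "
          f"{codons} codons, stated {length} aa, match={codons == length}")
```

For all three pairs the codon count equals the stated protein length, which suggests that the LOCATION ranges in this listing end at the last sense codon and do not include the stop codon.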

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 852 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met Phe Ile Phe Met Gly Val Phe Phe Leu Leu Asn Ile Thr Leu Leu Met Ala Asn Phe Ile Asp Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp Glu Ile Thr Asp Glu Tyr Leu Gly Leu Ser Cys Ala Phe Ile Leu Ala Ala Val Gln Thr Pro Ile Glu Lys Asp Tyr Phe Asn Thr Thr Leu Asn Phe Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe Ala Met Asp Glu Ile Asn Arg Tyr Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Arg Tyr Ser Leu Gly His Cys Asp Gly Lys Thr Val Thr Pro Thr Pro Tyr Leu Phe His Arg Lys Lys Gln Ser Pro Ile Pro Asn Tyr Phe Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro Asn Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu Ser Pro Arg Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe Ser Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Leu Lys Trp Asn Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr Glu Ile Asn Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile Ile Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe Pro Thr Ser Lys Thr Asp Ile Ser His Asp Thr Phe Tyr Gly Ser Leu Thr Phe Leu Pro His His Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Cys Leu Val Met Pro Glu Trp Lys Tyr Ile Asn Ser Glu Asp Ser Ala Ser Asn Cys Lys Ile Leu Lys Asn Ser Ser Ser Asp Ala Ser Phe Asp Trp Leu Met Glu Glu Lys Leu Asp Met Ala Phe Ser Glu Asn Ser His Asn Ile Tyr Asn Ala Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys Leu Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr Asp Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser 
His Leu Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly Val Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Met Ala Phe Cys Phe Ser Ala Phe Thr Ala Val Val Leu Cys Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr Val Thr Val Val Leu Ala Phe Lys Val Thr Val Pro Gly Arg Arg Leu Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr Ile Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys Ser Tyr Phe
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3125 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...2169 (D) OTHER INFORMATION: VR5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

AGT TCA CTT GGA

Ile Cys Asn Glu Glu Met Cys PheLeu Ser Pro Asn Ser Ser Leu Gly AGT AAG GAC TTC

Trp Asp Glu Ser Leu Phe Trp TyrLeu Ser Leu Ser Ser Lys Asp Phe CTT GGA AGT ATC

Pro His Ile Leu Gln Ser Tyr SerPhe Ser Phe Ser Leu Gly Ser Ile CCC TAT GCC AAG

Asp Asp Glu Gln Tyr Tyr Leu GlnMet Pro Asp Thr Pro Tyr Ala Lys ATG TTC TAT AAA

Ser Leu Ala Leu Ala Val Ser IleLeu Leu Trp Asn Met Phe Tyr Lys ATC GAC GGA CAA

Trp Ile Gly Leu Val Pro Asp Asp Gln Asn Phe Leu Ile Asp Gly Gln CAG AAC ATT TTT

Leu Glu Leu Lys Lys Ser Glu Lys Glu Cys Ala Phe Gln Asn Ile Phe Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr Glu Ile Tyr Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile Ile Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu ATT CAG
AGA
ATA
TGG
ATC
ACC
ACA
AAA
CAA
TTG
AAT

Pro Pro Leu Arg Ile Trp Thr Thr Lys Gln Leu Phe Ile Gln Ile Asn AGT ACA CAT TCA

Pro Thr Lys Asp Ile Ser AspThrPhe Tyr Gly Leu Ser Thr His Ser CTA CAC ATT TTT

Thr Phe Pro His Gly Glu SerGlyPhe Lys Asn Val Leu His Ile Phe TGG CAT ACA ATG

Gln Thr Phe Leu Arg Asn AspLeuTyr Leu Val Pro Trp His Thr Met AAA ATT GAC AAA

Glu Trp Tyr Asn Ser Glu Ser Ala Ser Asn Cys Ile Lys Ile Asp Lys AAG TCA TGG CTA ATG
AAC

Leu LysAsnSer Ser Asp Ala Ser Phe Asp Met Glu Gln Ser Trp Leu GCC AAC ATA

Lys LeuAspMet Phe Ser Asp Asn Ser His Tyr Asn Val Ala Asn Ile GCC AAT CTG

Val HisAlaIle His Ala Leu His Glu Met Gln Gln Ala Ala Asn Leu ATA AGT TCT

Asp AsnGlnAla Asp Asn Gly Lys Gly Ala His Cys Leu Ile Ser Ser TTT ACT AAT

Lys ValAsnSer Leu Arg Arg Thr Tyr Phe Pro Leu Gly Phe Thr Asn ATG CAG GAT

Asp LysValPhe Lys Gln Arg Val Ile Met Glu Tyr Asp Met Gln Asp GCG GGG ATT

Ile ValHisPhe Asn Leu Ser Gln His Leu Lys Met Lys Ala Gly Ile AGC CGA CAC

Leu GlyLysPhe Pro Tyr Leu Pro His Gly Ser His Leu Ser Arg His ATT AGA AAG

Tyr ValAspMet Glu Leu Ala Thr Gly Arg Met Pro Ser Ile Arg Lys GCA AGA AGA

Ser ValCysSer Asp Cys Ser Pro Gly Phe Leu Trp Lys Ala Arg Arg GCC CCC TGC

Glu GlyMetAla Cys Cys Phe Val Cys Ser Pro Glu Asn Ala Pro Cys GAG GTG AAT

Glu IleSerAsn Thr Asn Met Asp Gln Cys Cys Pro Glu Glu Val Asn AAC ATT CAG

Tyr GlnTyrAla Thr Glu Gln Asn Lys Cys Lys Gly Val Asn Ile Gln TAT GCA CTT

Thr PheLeuSer Glu Asp Pro Leu Gly Met Ala Leu Met Tyr Ala Leu TCT CTT TGT

Ala PheCysPhe Ala Phe Thr Ala Val Val Val Phe Val Ser Leu Cys ACT AAC AGA

Lys HisHisAsp Pro Ile Val Lys Ala Asn Ser Leu Ser Thr Asn Arg TAT CTATTACTC TCA CTC ATG TTC TGT TTT TCC TTT TTC.1536 ATG CTG TGC

Tyr Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg Leu Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Glu Cys Ile Phe Val Pro Lys Ile Tyr Ile Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys Ser Tyr Phe
CTTCGTTTTG ATTTCATGGA GATTGCCCTC TGGTAACTTC CAAAAACCGT TGATAAGGCA 2458
AAACCATCTA CCAAATCAAA TAATCAATGA GAAACACAGA CTAACTAAAT AATCAGCAAA 2578
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 723 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Ile Cys Asn Glu Glu Ser Met Cys Ser Phe Leu Leu Ser Gly Pro Asn Trp Asp Glu Ser Leu Ser Phe Trp Lys Tyr Leu Asp Ser Phe Leu Ser Pro His Ile Leu Gln Leu Ser Tyr Gly Ser Phe Ser Ser Ile Phe Ser Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Leu Lys Trp Asn Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe Val Lys Met Ile Ser Val Asp Glu Val Ser Phe Pro Gln Lys Thr Glu Ile Tyr Tyr Lys Gln Ile Val Lys Ser Leu Thr Asn Val Ile Ile Ile Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe Pro Thr Ser Lys Thr Asp Ile Ser His Asp Thr Phe Tyr Gly Ser Leu Thr Phe Leu Pro His His Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Tyr Leu Val Met Pro Glu Trp Lys Tyr Ile Asn Ser Glu Asp Ser Ala Ser Asn Cys Lys Ile Leu Lys Asn Ser Ser Ser Asp Ala Ser Phe Asp Trp Leu Met Glu Gln Lys Leu Asp Met Ala Phe Ser Asp Asn Ser His Asn Ile Tyr Asn Val Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys Leu Lys Val Asn Ser Phe Leu Arg Arg Thr Tyr Phe Thr Asn Pro Leu Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met Gln Asp Glu Tyr Asp Ile Val His Phe Ala Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His Leu Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Gln Asn Lys Cys Ile Gln Lys Gly Val Thr Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Met Ala Phe Cys Phe Ser Ala Phe Thr Ala Val Val Leu Cys Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr
Leu Leu Leu Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Leu Pro Asn Lys Val Ile Cys Val Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg Leu Arg Tyr Phe Leu Val Ser Gly Thr Leu Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Gln His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Leu Ala Phe Leu Ala Lys Asn Leu Pro Asp Ala Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Glu Cys Ile Phe Val Pro Lys Ile Tyr Ile Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys Ser Tyr Phe (2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1889 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GCCATTGTTT

GACCACAAAG

TGTACGGCTT

GAAAATATGG

CTAATGCGAA

CCCCATATTA

TTTAAGCACA

AAAAAATACC

TTTTCTGATA

TTACCTAGTC

GTGTACGCTG

TGTGAAAATG

ATTGAGGTGA

CTTAACCTCT

GCAAATGCTC

ATATTTTCAG

GTAACCCTGG

ATTTCTAATG

ACAGAGAAGA

GGGATGGCTC

ATATTTGTGA

ACTTTGCTCA

AACACAGTTG

GCCACTGTGT

AGAATGGTAA

CTGATCCAAC

GATGCTCATA

TTCCACTCTG

TTGTCAAGAA

GTATTCTTCT

ATGGTCGCCG

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 604 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Ser Leu Ser Leu Ala Ile Val Ser Leu Met Val His Phe Arg Trp Ser Trp Val Gly Leu Ile Leu Pro Asp Asp His Lys Gly Asn Lys Ile Leu Ser Asp Phe Arg Lys Glu Met Glu Arg Lys Arg Ile Cys Thr Ala Phe Val Lys Met Ile Pro Ala Thr Trp Thr Ser Ser Phe Val Lys Phe Trp Glu Asn Met Asp Asp Thr Asn Ile Ile Ile Ile Tyr Gly Asp Ile Asp Ser Leu Glu Gly Leu Met Arg Asn Ile Gly Gln Arg Leu Leu Thr Trp His Val Trp Val Met Asn Ile Glu Pro His Ile Ile Glu Tyr Asp Asn Tyr Phe Met Leu Asp Ser Phe His Gly Ser Leu Ile Phe Lys His Asn Tyr Arg Glu Asn Phe Glu Phe Thr Lys Phe Ile Arg Thr Val Asn Pro Lys Lys Tyr Pro Glu Asp Ile Tyr Leu Pro Lys Met Trp Tyr Leu Phe Phe Met Cys Ser Phe Ser Asp Ile Asn Cys Gln Val Leu Asp Ser Cys Gln Thr Asn Ala Ser Leu Asp Met Leu Pro Ser Gln Ile Phe Asp Val Val Met Ser Glu Glu Ser Thr Ser Ile Tyr Asn Ala Val Tyr Ala Val Ala His Ser Leu His Glu Met Arg Leu Gln Gln Leu Gln Thr Gln Pro Cys Glu Asn Glu Glu Gly Met Glu Phe Phe Pro Trp Gln Leu Asn Thr Phe Leu Lys Asp Ile Glu Val Arg Val Asn Ser Leu Asp Trp Arg Gln Arg Ile Asp Ala Glu Tyr Asp Ile Leu Asn Leu Trp Asn Leu Pro Lys Gly Leu Gly Leu Lys Val Lys Ile Gly Asn Phe Tyr Ala Asn Ala Pro Gln Gly Gln Gln Leu Ser Leu Ser Glu Gln Met Ile Gln Trp Pro Glu Ile Phe Ser Glu Ile Pro Gln Ser Val Cys Ser Glu Ser Cys Gly Pro Gly Phe Arg Lys Val Thr Leu Glu Asn Lys Ala Ile Cys Cys Tyr Asn Cys Thr Pro Cys Ala Asp Asn Glu Ile Ser Asn Glu Thr Asp Val Asp Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr Glu Lys Ser Asn Cys Tyr Gln Lys Ser Val Ser Phe Leu Gly Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Ser Ile Ala Leu Cys Leu Ser Ala Leu Thr Ala Phe Val Ile Gly Ile Phe Val Lys His Lys Asp Thr Pro Ile Val Lys Ala Asn Asn Gln Ala Leu Ser Tyr Thr Leu Leu Ile Thr Leu Lys Phe Cys Phe Leu Cys Ser Leu Asn Phe Ile Gly Gln Pro Asn Thr Val Ala Cys Ile Leu Gln Gln Thr Thr Phe Ala Val Ala Phe Thr Met Ala Leu Ala Thr Val Leu Ala Lys Ala Ile Thr Val Val Leu Ala Phe Lys Val Ser Phe Pro Gly Arg Met Val Arg Trp Leu Met Ile Ser Arg Gly Pro Asn Tyr Ile Ile
Pro Ile Cys Thr Leu Ile Gln Leu Leu Leu Cys Gly Ile Trp Met Ala Ile Ser Pro Pro Tyr Ile Asp Gln Asp Ala His Ile Glu His Gly His Ile Ile Ile Leu Cys Asn Lys Gly Ser Ala Val Ala Phe His Ser Val Leu Gly Tyr Leu Cys Phe Leu Ala Leu Gly Ser Tyr Thr Met Ala Phe Leu Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ser Lys Phe Ile Ser Leu Ser Met Leu Val Phe Phe Cys Val Trp Ile Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1889 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

ATGGCGACGA
AGGACACATC

TCGAAGTCTTCTGCATCCAAGCCGAATTC 1889
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 604 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Ser Leu Ser Leu Ala Ile Val Ser Leu Met Val His Phe Arg Trp Ser Trp Val Gly Leu Ile Leu Pro Asp Asp His Lys Gly Asn Lys Ile Leu Ser Asp Phe Arg Lys Glu Met Glu Arg Lys Arg Ile Cys Thr Ala Phe Val Lys Met Ile Pro Ala Thr Trp Thr Ser Ser Phe Val Lys Phe Trp Glu Asn Met Asp Asp Thr Asn Ile Ile Ile Ile Tyr Gly Asp Ile Asp Ser Leu Glu Gly Pro Met Arg Asn Ile Gly Gln Arg Leu Leu Thr Trp His Val Trp Val Met Asn Ile Glu Pro His Ile Ile Glu Tyr Asp Asn Tyr Phe Met Leu Asp Ser Phe His Gly Ser Leu Ile Phe Lys His Asn Tyr Arg Glu Asn Phe Glu Phe Thr Lys Phe Ile Arg Thr Val Asn Pro Lys Lys Tyr Pro Glu Asp Ile Tyr Leu Pro Lys Met Trp Tyr Leu Phe Phe Met Cys Ser Phe Ser Asp Ile Asn Cys Gln Val Leu Asp Ser Cys Gln Thr Asn Ala Ser Leu Asp Met Leu Pro Ser Gln Ile Phe Asp Val Val Met Ser Glu Glu Ser Thr Ser Ile Tyr Asn Ala Val Tyr Ala Val Ala His Ser Leu His Glu Met Arg Leu Gln Gln Leu Gln Thr Gln Pro Cys Glu Asn Glu Glu Gly Met Glu Phe Phe Pro Trp Gln Leu Asn Thr Phe Leu Lys Asp Ile Glu Val Arg Val Asn Ser Leu Asp Trp Arg Gln Arg Ile Asp Ala Glu Tyr Asp Ile Leu Asn Leu Trp Asn Leu Pro Lys Gly Leu Gly Leu Lys Val Lys Ile Gly Asn Phe Tyr Ala Asn Ala Pro Gln Gly Gln Gln Leu Ser Leu Ser Glu Gln Met Ile Gln Trp Pro Glu Ile Phe Ser Glu Val Pro Gln Ser Val Cys Ser Glu Ser Cys Arg Pro Gly Phe Arg Lys Val Ser Leu Asp Asp Lys Ala Ile Cys Cys Tyr Lys Cys Thr Pro Cys Ala Asp Asn Glu Ile Ser Asn Glu Thr Asp Val Asp Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr Glu Lys Ser Asn Cys Phe Pro Lys Ser Val Ser Phe Leu Ala Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Ser Ile Ala Leu Cys Leu Ser Ala Leu Thr Val Phe Val Ile Gly Ile Phe Val Lys Asn Arg Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Thr Leu Ser Tyr Ile Leu Leu Ile Thr Leu Thr Phe Cys Phe Leu Cys Ser Leu Asn Phe Ile Gly Gln Pro Asn Thr Ala Ala Cys Ile Leu Gln Gln Thr Thr Phe Ala Val Ala Phe Thr Met Ala Leu Ala Thr Val Leu Ala Lys Ala Ile Thr Val Val Leu Ala Phe Lys Ile Ser Phe Pro Gly Arg Met Leu Arg Trp Leu Met Ile Ser Arg Gly Pro
Arg Tyr Ile Ile Pro Ile Cys Thr Leu Ile Gln Leu Leu Leu Cys Gly Ile Trp Met Ala Thr Ser Pro Pro Phe Ile Asp Gln Asp Val Asn Thr Glu Asp Gly Tyr Ile Ile Leu Leu Cys Asn Lys Gly Ser Ala Val Ala Phe His Ser Val Leu Gly Tyr Leu Cys Phe Leu Ala Leu Gly Ser Tyr Thr Met Ala Phe Leu Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ser Lys Phe Leu Ser Phe Ser Met Leu Val Phe Phe Cys Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val
(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2561 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 80...349 (D) OTHER INFORMATION: VR8 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

TACATCAGAA

CTC TGT GCT TTC ACG ATT TCA TTG

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu TGC TGT AGT

Leu Phe Leu Lys Phe Ser Leu Ile Leu Trp Ser Glu Pro Cys Cys Ser GAT AAT CAA

Cys Phe Trp Arg Ile Lys Asn Ser Asp Asp Gly Asp Leu Asp Asn Gln GCT GAT GAT

Arg Glu Cys His Phe Tyr Leu Gly Ala Thr Pro Val Glu Ala Asp Asp AGG TTT TTA

Asn Phe Tyr Ser Ser Leu Leu Lys Phe Ser Leu Asp His Arg Phe Leu TGC CCC TAGCC

Ile Leu Thr Tyr Ala Thr Met Thr Gly Met Ser Ile Arg Cys Pro GTCTCCTTGA

CAGGGTATTC

GCTTTTGTTA

GATCAACAAA

ACTCTAGAAG

ACCTCACAAT

ACTATCACTT

ATGAACACTG

AATTGTTCAA

GAATGGACAT

AATGCTGTTT

CAGAAAAAGG

TCCGTGTGTA

GACTGCTGCT

GAACAGTGTG

TCAAGAGCTG

GCACTGTCCT

ACTCCCATTG

TTCTGCTTTC

CAGCAGACCA

ATAACTGTGG

ATGACAGGGG

GGAATCTGGT

AAGATTGTCA

(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile Lys Asn Ser Asp Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Gly Ala Ala Asp Thr Pro Val Glu Asp Asn Phe Tyr Ser Ser Leu Leu Lys Phe Arg Phe Ser Leu Asp His Leu Ile Leu Thr Tyr Ala Thr Met Thr Gly Cys Pro Met Ser Ile Arg
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2734 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 80...1387 (D) OTHER INFORMATION: VR9 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser

AAG GAT GAT

CysPhe TrpArgIle AsnSer AspAsnAsp Gly Leu Gln Lys Asp Asp TAC GCA GTT

ArgGlu CysHisPhe LeuGly AlaAspThr Pro Glu Asp Tyr Ala Val CTT TTT AGT

AsnPhe TyrSerSer LeuLys ArgIleAla Ala Glu Tyr Leu Phe Ser ATG GCT AAC

GluPhe LeuLeuVal PhePhe IleAspGlu Ile Arg Asn Met Ala Asn AAC TTG ATT

ProTyr LeuLeuPro IleThr MetPheSer Phe Gly Gly Asn Leu Ile TTG ATG ACA

AsnCys GlnAspLeu ArgVal AspGlnAla Tyr Gln Ile Leu Met Thr TTT TAT GAT

AsnGly HisMetAsn ValAsn PheCysTyr Leu Asp Ser Phe Tyr Asp ACA TCA TTA

CysAla IleGlyLeu GlyPro TrpLysThr Ser Lys Leu Thr Ser Leu ATG GTT TTT

AlaMet HisSerSer ProLeu PhePheGly Pro Asn Pro Met Val Phe GAC CCC GTA

AsnLeu ArgAspHis ArgLeu HisValHis Gln Ala Pro Asp Pro Val TCC ATG TTT

LysAsp ThrHisLeu HisGly ValSerLeu Met His Phe Ser Met Phe GGA ATC CAG

ArgTrp ThrTrpIle MetVal SerAspAsp Asp Gly Ile Gly Ile Gln TTA GAA GGG

GlnPhe LeuSerAsp ArgGlu SerGlnArg His Ile Cys Leu Glu Gly ATG GAA TAC

LeuAla PheValAsn IlePro AsnMetGln Ile Met Thr Met Glu Tyr GAT ATT GCA

ArgAla ThrIleTyr GlnGln MetThrSer Ser Lys Val Asp Ile Ala GAA TCT AGC

ValIle IleTyrGly MetAsn ThrLeuGlu Val Phe Arg Glu Ser Ser TGG GCT ATC ACA ACC
GAA TCA CAA

ArgTrp Glu Leu Gly Arg Arg Trp Ile Thr Gln Glu Ala Ile Thr Ser GTC AAA TTC AAT TTC

TrpAsp Ile Thr Asn Lys Asp Thr Leu Leu His Val Lys Phe Asn Phe ATC CAC GTT CCT TTA

GlyThr Thr Phe Ala His Arg Glu Ile Lys Asn Ile His Val Pro Leu ATG AAC AAA GTA ATT

LysPhe Gln Thr Met Thr Ala Tyr Pro Asp Ser Met Asn Lys Val Ile ATA AAT AAT ATA AAG

HisThr Leu Glu Trp Tyr Phe Cys Ser Ser Asn Ile Asn Asn Ile Lys AGA ATT AAC TTG TGG

SerIle Met His His Thr Phe Asn Thr Glu Thr Arg Ile Asn Leu Trp CAC ATG AGT GGT AGT

SerLeu Asn Tyr Asp Ala Met Asp Glu Tyr Leu His Met Ser Gly Ser GCT GTG ACC GAA ATT

TyrAsn Val Tyr Ala Ala His Tyr His Tyr Phe Ala Val Thr Glu Ile GTA AAA AAA AGA TTC

GlnGln Glu Ser Gln Lys Ala Pro Lys Tyr Thr Val Lys Lys Arg Phe CAG AAC TGAGGTGTCC
AGATGATAAG
TATGCCA

Ala Cys Gln Ile Trp Ser Val Gln Asn
TCACATTTGT GAAACACAAC GATACTCCCA TTGTGAAGGC CAATAACCGC ATTCTCAGCT 1594
ACATCCTGCT CATCTCTCTC GTCTTCTGC~ TTCTCTGCTC CCTGCTCTTC ATTGGACCTC 1654
GAGATATACA ATCTGAGCAT GGGAAGATTG TCATTCTTTG CAATAAAGGC TCAGTCATTG 1954
TCATGGTGGT TGTGGAGGTT TTCTCCATCT TGGCTTCTAG TGCAGGGTTG CTAATGTGTA 2194
(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 436 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile Lys Asn Ser Asp Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Gly Ala Ala Asp Thr Pro Val Glu Asp Asn Phe Tyr Ser Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Ile Asp Glu Ile Asn Arg Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Phe Ile Gly Gly Asn Cys Gln Asp Leu Leu Arg Val Met Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Met Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Gln Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val Ile Thr Asn Lys Lys Asp Phe Thr Leu Asn Leu Phe His Gly Thr Ile Thr Phe Ala His His Arg Val Glu Ile Pro Lys Leu Asn Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ile Arg Met His His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ser Leu His Asn Tyr Asp Met Ala Met Ser Asp Glu Gly Tyr Ser Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu Tyr Ile Phe Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Tyr Phe Thr Ala Cys Gln Gln Ile Trp Asn Ser Val
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2732 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 80...1375 (D) OTHER INFORMATION: VR10 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Phe Leu Ser Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn CCT TAT TTA AAC GTT GGT 400 CTT CCC ATA GGT
ACT
TTG
ATA
TTC
AGC
ATC

Pro Tyr Leu Leu Pro Asn Thr Leu Ile Phe Ser Ile Val Gly Ile Gly CAC TGT CAT GAT TTA TTG GGT CTG GAT CAA TCA TAT ACA ATA 448 AGA CAA

His Cys His Asp Leu Leu Gly Leu Asp Gln Ser Tyr Thr Ile Arg Gln GTT GAT

Asn Gly Arg Val Asn Phe Asn Tyr Phe Cys Tyr Leu Asp Ser Val Asp GGA AAA

Cys Asn Ile Gly Leu Thr Pro Ser Trp Lys Lys Ser Leu Leu Gly Lys Ala Met Asp Ser Ser Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro CGC CAT CCC

Asn Leu AspHis Asp Arg Leu Pro Val His Gln Val Ala Arg His Pro ACA GTC TTT

Lys Asp HisLeu Ser His Gly Met Ser Leu Met Phe His Thr Val Phe ACT TCA ATT

Arg Trp TrpIle Gly Leu Val Ile Asp Asp Asp Gln Gly Thr Ser Ile CTC AGC TGT

Gln Phe SerAsp Leu Arg Glu Glu Gln Arg His Gly Ile Leu Ser Cys TTT AAC ACA

Leu Ala ValAsn Met Ile Pro Glu Met Gln Ile Tyr Met Phe Asn Thr ACA ATG GTT

Arg Ala Ile Tyr Asp Lys Gln Ile Thr Ser Ser Ala Lys Thr Met Val ATT ACT AGA

Val Ile TyrGly Glu Met Asn Ser Leu Glu Val Ser Phe Ile Thr Arg GAA ATC CAA

Arg Trp AspLeu Gly Ala Arg Arg Trp Ile Thr Thr Ser Glu Ile Gln ATC TTC CAT

Trp Asp IleLeu Asn Lys Lys Glu Thr Leu Asn Leu Phe Ile Phe His ATC GTT AGG

Gly Pro ThrPhe Ala His His Lys Glu Ile Pro Lys Leu Ile Val Arg ATG AAA TCT

Asn Phe GlnThr Met Asn Thr Ala Tyr Pro Val Asp Ile Met Lys Ser ATA AAT AAC

His Thr LeuGlu Trp Asn Tyr Phe Cys Ser Ile Ser Lys Ile Asn Asn AAA AAC ACA

Ser Ser MetAsp Leu Phe Thr Ser Asn Thr Leu Glu Trp Lys Asn Thr CAC AGT TTG

Ala Leu AsnTyr Asp Met Ala Met Asp Glu Gly Tyr Asn His Ser Leu GCT ACC CTT

Tyr Asn ValTyr Val Ala Ala His Tyr His Glu His Ile Ala Thr Leu GTA GAA ACT

Gln Gln GluSer Gln Lys Lys Val His Asn Arg Tyr Phe Val Glu Thr CAG CCAGATGATA AGTATGCCAA
.C

Val Cys Gln Gln Ile
GGGATGGCTC TAGGCTGCAT GGCACTATCC TTCTCGGCCA TCACAATTCT AGTACTAGTC 1536
TCTACAGTGT TGGCCARAAC AATAACTGTG GTCATGGCTT TCAAGTTCAC TACTCCAGGA 1776
TATGACAAAG GTACATAAAT AAATAAACAC TTTCCCCACC1?~WAAAAAAAAAA,AAA 2732
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 432 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Phe Leu Ser Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Ile Phe Ser Ile Val Gly Gly His Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln Ile Asn Gly Arg Val Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Asn Ile Gly Leu Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys Leu Ala Met Asp Ser Ser Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asp Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Ile Ile Leu Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Pro Ile Thr Phe Ala His His Lys Val Glu Ile Pro Lys Leu Arg Asn Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp Leu Phe Thr Ser Asn Asn Thr Leu Glu Trp Thr Ala Leu His Asn Tyr Asp Met Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Val Ala Ala His Thr Tyr His Glu His Ile Leu Gln Gln Val Glu Ser Gln Lys Lys Val Glu His Asn Arg Tyr Phe Thr Val Cys Gln Gln Ile
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2962 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 81...1601 (D) OTHER INFORMATION: VR11 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

AATGTGTGTG
TGATGTTTTT
CTACATCAGA
AACGGATTTC
ACAACAACTC

ATG
AAG
AAG
CTC
TGT
GCT
TTC
ACT
ATT
TCA

Met Lys Lys Leu Cys Ala Phe Thr Ile Ser TTG AAG TGC GCA

Phe SerLeu Phe Ser Leu Ile Leu Cys Leu Thr Glu Leu Lys Cys Ala TGC AGG GAT TTG

Ser PheTrp Ile Lys Asn Ser Glu Ser Asp Gly Asp Cys Arg Asp Leu AGA CAT ATT GAA

Gln GluCys Phe Tyr Leu Trp Val Asp Lys Pro Ile Arg His Ile Glu AAT AAT AGA GAA .

Asp Asn Phe Tyr Asn Ser Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Ile Phe Ser Ile Val Gly Gly His Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln Ile Asn Gly Arg Val Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Asn Ile Gly Leu Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys Leu Ala Met Asp Ser Ser Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala
CCC AAG GAC ACA CAT TTA TCC CAT GGC ATG GTC TCC TTG ATG TTT CAT 686
Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asp Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Ile Ile Leu Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Pro Ile Thr Phe Ala His His Lys Val Glu Ile Pro Lys Leu

AAT CAA GCC TAC GAT

Arg Phe Met Thr MetAsnThr Lys Pro Val Ile Asn Gln Ala Tyr Asp CAT CTG TTT TGT TCT

Ser Thr Ile Glu TrpAsnTyr Asn Ser Ile Lys His Leu Phe Cys Ser AGC ATG TCC AAC GAA

Asn Ser Lys Asp LeuPheThr Asn Thr Leu Trp Ser Met Ser Asn Glu GCA AAC ATG GAT TAC

Thr Leu His Tyr AspMetAla Ser Glu Gly Asn Ala Asn Met Asp Tyr TAT GTT CAC TAC CAC

Leu Asn Ala Tyr ValAlaAla Thr His Glu Ile Tyr Val His Tyr His CAA GAG GTA CAC TAT

Leu Gln Val Ser GlnLysLys Glu Asn Arg Phe Gln Glu Val His Tyr GTT CAG ATG ACC TTT

Thr Cys Gln Val SerSerLeu Lys Arg Val Thr Val Gln Met Thr Phe CCG GAA AAG AGG CAG

Asn Val Gly Leu ValAsnMet His Glu Asn Cys Pro Glu Lys Arg Gln GAG ATT AAT CCA CTT

Thr Tyr Asp Phe IleIleTrp Phe Gln Gly Gly Glu Ile Asn Pro Leu AAA ATA CCT TTT AGT

Leu Leu Lys Gly SerTyrIle Cys Pro Lys Gln Lys Ile Pro Phe Ser CTT TCT TGG ATG ACA

Gln His Ile Asp AspLeuGlu Ala Gly Gly Ser Leu Ser Trp Met Thr TAGAACAGTG ACCTAC
TGTGAAATGT
CCAGATGATA
AGTATGCCAA

Ile AATTAAGTAA TATACAGATT

TAAATAAATA AACACTTTCC CCACAAAAAAAAAAAAP.AAA AAAAA 2962
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 507 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Met Lys Lys Leu Cys Ala Phe Thr Ile Ser Phe Leu Ser Leu Lys Phe Ser Leu Ile Leu Cys Cys Leu Thr Glu Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Ser Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Val Ile Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser Val Leu Asn Phe Arg Ile Ser Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Ile Phe Ser Ile Val Gly Gly His Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ser Tyr Thr Gln Ile Asn Gly Arg Val Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Asn Ile Gly Leu Thr Gly Pro Ser Trp Lys Lys Ser Leu Lys Leu Ala Met Asp Ser Ser Ile Pro Met Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asp Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Ile Ile Leu Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Pro Ile Thr Phe Ala His His Lys Val Glu Ile Pro Lys Leu Arg Asn Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp Leu Phe Thr Ser Asn Asn Thr Leu Glu Trp Thr Ala Leu His Asn Tyr Asp Met Ala Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Val Ala Ala His Thr Tyr His Glu His Ile Leu Gln Gln Val Glu Ser Gln Lys Lys Val Glu His Asn Arg Tyr Phe Thr Val Cys Gln Gln Val Ser Ser Leu Met Lys Thr Arg Val Phe Thr Asn Pro Val Gly Glu Leu Val Asn Met Lys His Arg Glu Asn Gln Cys Thr Glu Tyr Asp Ile Phe Ile Ile Trp Asn Phe Pro Gln Gly Leu Gly Leu Lys Leu Lys Ile Gly Ser Tyr Ile Pro Cys Phe Pro Lys Ser Gln Gln Leu His Ile Ser Asp Asp Leu Glu Trp 
Ala Met Gly Gly Thr Ser Ile
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2821 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 60...992 (D) OTHER INFORMATION: VR12 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Met Lys Gln Leu Cys Thr Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile Lys Lys Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr Asp Leu Ser Phe Arg Ile Ala Gly Ser Glu Tyr Glu Leu Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro

ACA TGA GTT TGA TGT TCT CCA TCA TTG GTG GAA ACT GTC ATG ATT TAT 396
Asn Met Ser Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Leu Arg Ser Leu Asp Gln Glu Tyr Ala Gln Ile Asp Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Thr Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Thr Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Asp Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Thr Gln Trp Asp Val Ile Thr Asn Lys Lys Arg Leu His Pro TGAGGTTTCC
AATGAAACAG
ATATGGAACA
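
The protein translations in this listing are written with three-letter residue codes, and in several rows the extraction has run codes together without spaces. As a reading aid, the following minimal Python sketch converts such a row to one-letter codes. The function name `to_one_letter` and the dictionary name `THREE_TO_ONE` are illustrative assumptions, not part of the listing; the mapping itself is the standard three-letter to one-letter amino acid code table.

```python
import re

# Standard three-letter -> one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_text):
    """Convert three-letter residue codes to a one-letter string.

    Codes may be separated by spaces or run together ("CysPheLeu").
    An unrecognised code raises KeyError, which helps flag OCR typos.
    """
    # Each code is one capital letter followed by two lowercase letters.
    codes = re.findall(r"[A-Z][a-z]{2}", listing_text)
    return "".join(THREE_TO_ONE[code] for code in codes)


if __name__ == "__main__":
    # Opening residues of the VR12 translation shown above.
    print(to_one_letter("Met Lys Gln Leu Cys Thr Phe Thr Ile Ser"))
```

Because the lookup raises `KeyError` on any token that is not a standard code, running a whole row through the function is also a quick way to spot character-level damage in the text.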

(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 311 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
Met Lys Gln Leu Cys Thr Phe Thr Ile Ser Leu Leu Phe Leu Lys Phe Ser Leu Ile Leu Cys Cys Trp Ser Glu Pro Ser Cys Phe Trp Arg Ile Lys Lys Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Trp Lys Thr Asp Glu Pro Ile Glu Asp Ser Phe Tyr Asn Tyr Asp Leu Ser Phe Arg Ile Ala Gly Ser Glu Tyr Glu Leu Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Met Ser Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Leu Arg Ser Leu Asp Gln Glu Tyr Ala Gln Ile Asp Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Thr Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Pro Phe Asn Pro Asn Leu Arg Asp His Asp Arg Leu Pro His Val His Gln Val Ala Pro Lys Asp Thr His Leu Ser His Gly Met Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Thr Gln Ile Met Thr Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Asp Met Asn Ser Thr Leu Glu Ala Ser Phe Arg Arg Trp Glu Glu Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Thr Gln Trp Asp Val Ile Thr Asn Lys Lys Arg Leu His Pro
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2773 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 3...1238 (D) OTHER INFORMATION: VR13 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

CGG ATA AAG AAT AGT GAA
GAT AAT GAT GGA

Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Asn Asp Gly AAA CCA

Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Gly Ala Val Asp Lys Pro GCA GCA

Ile Glu Asp Asn Phe Tyr Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala GAG ATC

Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile TCC ATC

Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Ile GCA TAT

Ile Gly Gly Asn Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr TAT TTA

Thr Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Ser

CGG

PheAsn ProAsnLeu HisAspHis Asp Leu HisHisValHis Gln Arg CAT

ValAla ThrLysAsp ThrHisLeu Ser Gly IleValSerLeu Met His CTG

PheHis PheArgTrp ThrTrpIle Gly Val IleSerAspAsp Asp Leu AGA

LysGly IleGlnPhe LeuSerAsp Leu Glu GluSerGlnArg His Arg ATC

GlyIle CysLeuAla PheValAsn Met Pro GluAsnMetGln Ile Ile AAA

TyrMet ThrArgAla ThrIleTyr Asp Gln IleMetThrSer Leu Lys ATG

AlaLys ValValIle IleTyrGly Glu Asn SerThrLeuGlu Val Met GCT

SerPhe ArgArgTrp GluAsnLeu Gly Arg ArgIleTrpIle Thr Ala AAA

ThrSer GlnTrpAsp ValIleThr Asn Lys GluPheThrLeu Asn Lys CAC

LeuPhe HisGlyThr IleThrPhe Ala Arg ArgPheGluIle Pro His AAC

LysPhe LysLysPhe MetGlnThr Met Thr AlaLysTyrPro Val Asn AAT

AspIle SerHisThr IleLeuGlu Trp Tyr PheAsnCysSer Ile Asn ATT

SerLys AsnSerSer LysMetAsp His Thr PheAsnAsnThr Leu Ile ATG

GluTrp ThrAlaLeu HisAsnTyr Asp Val MetSerAspGlu Gly Met GTG

TyrAsn LeuTyrAsn AlaValTyr Ala Ala HisThrTyrHis Glu Val AAA

HisIle PheGlnGln ValGluSer Gln Lys AlaLysProLys Arg Lys Phe Phe Thr Val Cys Gln Gln Gln Ile Trp Asn Ser Val TTTTCACTGTGTCTGTTTCTACAGTGTTGGCCAAAACAAT AACTGTGGTC ATGGCTTTCA 1610
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 412 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Ala Ser Cys Phe Trp Arg Ile Lys Asn Ser Glu Asp Asn Asp Gly Asp Leu Gln Arg Glu Cys His Phe Tyr Leu Gly Ala Val Asp Lys Pro Ile Glu Asp Asn Phe Tyr Asn Ser Leu Leu Lys Phe Arg Ile Ala Ala Ser Glu Tyr Glu Phe Leu Leu Val Met Phe Phe Ala Thr Asp Glu Ile Asn Lys Asn Pro Tyr Leu Leu Pro Asn Ile Thr Leu Met Phe Ser Ile Ile Gly Gly Asn Cys His Asp Leu Leu Arg Gly Leu Asp Gln Ala Tyr Thr Gln Ile Asn Gly His Met Asn Phe Val Asn Tyr Phe Cys Tyr Leu Asp Asp Ser Cys Ala Ile Gly Leu Thr Gly Pro Ser Trp Lys Thr Ser Leu Lys Leu Ala Met His Ser Ser Met Pro Leu Val Phe Phe Gly Ser Phe Asn Pro Asn Leu His Asp His Asp Arg Leu His His Val His Gln Val Ala Thr Lys Asp Thr His Leu Ser His Gly Ile Val Ser Leu Met Phe His Phe Arg Trp Thr Trp Ile Gly Leu Val Ile Ser Asp Asp Asp Lys Gly Ile Gln Phe Leu Ser Asp Leu Arg Glu Glu Ser Gln Arg His Gly Ile Cys Leu Ala Phe Val Asn Met Ile Pro Glu Asn Met Gln Ile Tyr Met Thr Arg Ala Thr Ile Tyr Asp Lys Gln Ile Met Thr Ser Leu Ala Lys Val Val Ile Ile Tyr Gly Glu Met Asn Ser Thr Leu Glu Val Ser Phe Arg Arg Trp Glu Asn Leu Gly Ala Arg Arg Ile Trp Ile Thr Thr Ser Gln Trp Asp Val Ile Thr Asn Lys Lys Glu Phe Thr Leu Asn Leu Phe His Gly Thr Ile Thr Phe Ala His Arg Arg Phe Glu Ile Pro Lys Phe Lys Lys Phe Met Gln Thr Met Asn Thr Ala Lys Tyr Pro Val Asp Ile Ser His Thr Ile Leu Glu Trp Asn Tyr Phe Asn Cys Ser Ile Ser Lys Asn Ser Ser Lys Met Asp His Ile Thr Phe Asn Asn Thr Leu Glu Trp Thr Ala Leu His Asn Tyr Asp Met Val Met Ser Asp Glu Gly Tyr Asn Leu Tyr Asn Ala Val Tyr Ala Val Ala His Thr Tyr His Glu His Ile Phe Gln Gln Val Glu Ser Gln Lys Lys Ala Lys Pro Lys Arg Phe Phe Thr Val Cys Gln Gln Gln Ile Trp Asn Ser Val
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3108 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 116...2527 (D) OTHER INFORMATION: VR14 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

TAAACATCTC
CTTTGCCTAA
AGAAATAAAA
GCTGGTAGAA
ATCTGATGTG

TGGCACTTCA AAAAG
CAATCCACAC ATG
TGCCCAGGTT

Met TTC GTC TTC AAT ACA

Phe Ile Met Glu Phe Leu Leu Ile LeuLeu Met Phe Val Phe Asn Thr TTC CCC TGC AGA AAT

Ala Asn Ile Asp Arg Phe Trp Ile LeuAsp Glu Phe Pro Cys Arg Asn GAT TTG TTA GCT ATC

Ile Met Glu Tyr Gly Ser Cys Phe LeuAla Ala Asp Leu Leu Ala Ile

GTT CAG ACA CCC GAA GATTAT TTCAACAAGACT CTTAAT GTT 310 ATT AAT

Val Gln Thr Pro Glu AspTyr PheAsnLysThr LeuAsn Val Ile Asn AAA CAC

Leu Lys Thr Thr Asn LysTyr AlaLeuAlaLeu ValPhe Ala Lys His AAC AAT

Met Asp Glu Ile Arg ProAsp LeuLeuProAsn MetSer Leu Asn Asn ACT GGC

Ile Ile Arg Tyr Leu ArgCys AspGlyLysThr ValIle Pro Thr Gly TTT AAA

Thr Pro Tyr Leu Arg LysLys GluSerProIle ProAsn Tyr Phe Lys

TTC TGT AAT GAA ACT TGTTCC TATCTGCTTACA GGACCC CAT 550 GAG ATG

Phe Cys Asn Glu Thr CysSer TyrLeuLeuThr GlyPro His Glu Met TTA TTC

Trp Glu Val Ser Gly TrpLys HisMetAsnSer PheLeu Ser Leu Phe CAG ACC

Pro Arg Ile Leu Leu TyrGly ProPheHisSer IlePhe Ser Gln Thr TAT TAT

Asp Asp Glu Gln Pro LeuTyr GlnMetAlaPro LysAsp Thr Tyr Tyr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Phe Ser Trp Asn GTC GAT CAA TTT
ATT

Trp Ile GlyLeuVal Pro Asp Asp Gly Asn Gln Leu Ile Asp Gln Phe CAG AAC GAA GCC

Leu Glu LeuLysLys Ser Glu Lys Ile Cys Phe Phe Gln Asn Glu Ala GTT GTT TTT ACT

Val Lys MetIleSer Asp Asp Ser Pro Gln Asn Glu Val Val Phe Thr ATT TCA ACA ATC

Met Tyr TyrAsnGln Val Met Ser Asn Val Ile Ile Ile Ser Thr Ile AAT GAT ATC TGG

Tyr Gly GluThrTyr Phe Ile Leu Phe Arg Met Glu Asn Asp Ile Trp AGA ATC ACA AAT

Pro Pro IleLeuGln Ile Trp Thr Lys Gln Leu Phe Arg Ile Thr Asn

ACC ACA TTC TCA
AGG
AAA
AAA
GAC
ATA
AGT

ProThrArgLys Asp SerHis Gly PheTyr GlySerLeu Lys Ile Thr CAC GGT GGT

ThrPheLeuPro His ValIle Ser PheLys AsnPheVal His Gly Gly CAT AGA TTA

GlnThrTrpPhe Leu AsnThr Asp TyrLeu ValMetGln His Arg Leu TTT TAT GCA

GluTrpLysTyr Asn GluAsp Ser SerThr CysLysIle Phe Tyr Ala TCA AAT GAT

LeuLysAsnAsn Ser AlaSer Phe TrpLeu MetGluGln Ser Asn Asp ACC AGT CAT

LysPheAspMet Phe GluAsn Ser AsnIle TyrAsnAla Thr Ser His GCC GCC ATG

ValHisAlaIle His LeuHis Glu AsnLeu GlnGlnAla Ala Ala Met ATA AAT GAG

AspAsnGlnAla Asp GlyLys Lys ProSer SerSerHis Ile Asn Glu AAC TTT ATT

CysLeuLysVal Ser LeuArg Arg TyrPhe ThrAsnPro Asn Phe Ile GTG ATG GTA

ProGlyAspLys Phe LysGln Arg IleMet HisAspGlu Val Met Val CAC GTG CAA

TyrAspIleVal Phe AsnLeu Ser HisLeu GlyIleLys His Val Gln AAG AGC CCA

MetLysLeuGly Phe ProTyr Leu HisGly ArgHisSer Lys Ser Pro GAC ATT ACA

HisLeuTyrVal Arg GluLeu Ala GlyArg ArgLysMet Asp Ile Thr TGC GCT CCT

ProSerSerVal Ser AspCys Ser GlyPhe ArgArgLeu Cys Ala Pro ATG GCC GTT

TrpLysGluGly Ala CysCys Phe CysSer ProCysPro Met Ala Val TCT GAG GTA

GluAsnGluIle Asn ThrThr Val LeuCys ValPheVal Ser Glu Val ACT ATT AAT

WO 99/00422 PCT/US98/13680

Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr Leu Leu Ser Leu CysPheLeuCys SerPhe Phe Leu Met Met Ser CCA ATC

Phe IleGly Leu Asn ArgAla CysValLeuGln GlnIle Thr Pro Ile TTC GTT

Phe GlyIle Val Thr MetAla SerThrValLeu AlaLys Thr Phe Val CTG GTC

Val ThrVal Val Ala PheLys ThrAspProGly ArgArg Leu Leu Val GTA CCC

Arg AsnPhe Leu Ser GlyThr AsnTyrIleIle ProIle Cys Val Pro TGT GCA

Ser LeuLeu Gln Val LeuCys IleTrpLeuAla ValSer Pro Cys Ala ATT ACT

Pro PheVal Asp Asp GluHis LeuHisGlyHis IleIle Ile Ile Thr GGC GCA

Val CysAsn Lys Ser ValThr PheTyrCysIle LeuGly Tyr Gly Ala 690 695 700 705 GCA TTC

Leu AlaCys Leu Leu GlyAsn SerValAlaPhe LeuAla Lys Ala Phe

AAT CTGCCT GAC TTC AATGAA AAGTTCTTGACC TTCAGC ATG 2326 ACA GCC

Asn LeuPro Asp Phe AsnGlu LysPheLeuThr PheSer Met Thr Ala AGT ACC

Leu ValPhe Cys Val TrpVal PheLeuProVal TyrHis Ser Ser Thr CAC GTG

Thr LysGly Lys Met ValAla GluIlePheSer IleLeu Ala His Val

TCC AGTGCT GGG CTT GGATGT TTTGTACCCAAG ATTTAT ATC 2470 ATC ATA

Ser SerAla Gly Leu GlyCys PheValProLys IleTyr Ile Ile Ile CCA TCG

Ile LeuMet Arg Glu ArgAsn ThrGlnLysIle ArgGlu Lys Pro Ser Ser Tyr Phe ACTAAACTCT
CTAATTATTA
CAATTTTATT

CTGTCAAATAAAAATATATTATATCCAAAAp,~~iAAAAAAAA p~AAAAAAAp,A 3108 AA

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 804 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
Met Phe Ile Phe Met Glu Val Phe Phe Leu Leu Asn Ile Thr Leu Leu Met Ala Asn Phe Ile Asp Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp Glu Ile Met Asp Glu Tyr Leu Gly Leu Ser Cys Ala Phe Ile Leu Ala Ala Val Gln Thr Pro Ile Glu Asn Asp Tyr Phe Asn Lys Thr Leu Asn Val Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe Ala Met Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Arg Tyr Thr Leu Gly Arg Cys Asp Gly Lys Thr Val Ile Pro Thr Pro Tyr Leu Phe Arg Lys Lys Lys Glu Ser Pro Ile Pro Asn Tyr Phe Cys Asn Glu Glu Thr Met Cys Ser Tyr Leu Leu Thr Gly Pro His Trp Glu Val Ser Leu Gly Phe Trp Lys His Met Asn Ser Phe Leu Ser Pro Arg Ile Leu Gln Leu Thr Tyr Gly Pro Phe His Ser Ile Phe Ser Asp Asp Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Ala Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Phe Ser Trp Asn Trp Ile Gly Leu Val Ile Pro Asp Asp Asp Gln Gly Asn Gln Phe Leu Leu Glu Leu Lys Lys Gln Ser Glu Asn Lys Glu Ile Cys Phe Ala Phe Val Lys Met Ile Ser Val Asp Asp Val Ser Phe Pro Gln Asn Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Asn Val Ile Ile Ile Tyr Gly Glu Thr Tyr Asn Phe Ile Asp Leu Ile Phe Arg Met Trp Glu Pro Pro Ile Leu Gln Arg Ile Trp Ile Thr Thr Lys Gln Leu Asn Phe Pro Thr Arg Lys Lys Asp Ile Ser His Gly Thr Phe Tyr Gly Ser Leu Thr Phe Leu Pro His His Gly Val Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Phe His Leu Arg Asn Thr Asp Leu Tyr Leu Val Met Gln Glu Trp Lys Tyr Phe Asn Tyr Glu Asp Ser Ala Ser Thr Cys Lys Ile Leu Lys Asn Asn Ser Ser Asn Ala Ser Phe Asp Trp Leu Met Glu Gln Lys Phe Asp Met Thr Phe Ser Glu Asn Ser His Asn Ile Tyr Asn Ala Val His Ala Ile Ala His Ala Leu His Glu Met Asn Leu Gln Gln Ala Asp Asn Gln Ala Ile Asp Asn Gly Lys Lys Glu Pro Ser Ser Ser His Cys Leu Lys Val Asn Ser Phe Leu Arg Arg Ile Tyr Phe Thr Asn Pro Pro Gly Asp Lys Val Phe Met Lys Gln Arg Val Ile Met His Asp Glu Tyr Asp Ile Val His Phe Val Asn Leu Ser Gln His Leu Gly Ile Lys Met Lys Leu Gly Lys Phe Ser Pro Tyr Leu Pro His Gly Arg His Ser His Leu Tyr 
Val Asp Arg Ile Glu Leu Ala Thr Gly Arg Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Arg Leu Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Thr Val Val Leu Cys Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr Leu Leu Leu Met Ser Leu Met Ser Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Leu Pro Asn Arg Ala Ile Cys Val Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Met Ala Val Ser Thr Val Leu Ala Lys Thr Val Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Arg Leu Arg Asn Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Val Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Thr Leu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Ile Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Asn Phe Ser Val Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys His Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Ile Leu Gly Cys Ile Phe Val Pro Lys Ile Tyr Ile Ile Leu Met Arg Pro Glu Arg Asn Ser Thr Gln Lys Ile Arg Glu Lys Ser Tyr Phe
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3689 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 39...419 (D) OTHER INFORMATION: VR15 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

GGAAAAAT ATG TTC ATT TTC ATG GGA

Met Phe Ile Phe Met Gly CTC ATG AAT

Val Phe Phe Leu Leu Asn Ile Thr Leu Ala Asn Phe Ile Leu Met Asn GAT GAA TAT

Pro Arg Cys Phe Trp Arg Ile Asn Leu Ile Thr Asp Glu Asp Glu Tyr GCG GCA ACT

Leu Gly Leu Ser Cys Thr Phe Ile Leu Val Gln Thr Pro Ala Ala Thr AAT GTT AAA

Glu Lys Asp Tyr Phe Asn Lys Thr Leu Leu Lys Thr Thr Asn Val Lys TTT GCA AAC

Asn His Lys Tyr Ala Leu Ala Leu Val Met Asp Glu Ile Phe Ala Asn TCT TTG ACT

Arg Asn Pro Asp Leu Leu Pro Asn Met Ile Ile Arg Tyr Ser Leu Thr ACA CCT TTT

Leu Gly Leu Cys Asp Gly Lys Thr Val Thr Pro Tyr Leu Thr Pro Phe TAATTATTTC TGTAATGAAG AGACTAT

His Lys Lys Lys Thr Lys Pro Tyr Pro GGATGTATCT

GCTTACCTAT

TCAGATGGCC

GAAATGGAAC

AGAGTTGAAG

TGTTGATGAT

ATCCACAAAT

AATGTGGGAA

TACCAGGAAG

CCATGGTGAG

AGATTTATAT

TTGTAAAATA

GTTTGACATG

CCATGCCCTC

AGGAGCCAGT

TCCTCTTGGG

TATTCACTTT

GACACTCTCA

CCTCTGTGTG

CAGCCTGCTG

CCTCTCCATT

GGATGGGAAT

GTTAACACTA

GCATTTTGGT

GCTGGTCCTC

TAACATCATA

TCATTTAAAT

CATTACATAT

TACCTTGACA

TGGATCAATG

TTCAGAAAGG

TGGCCTTCTG

ACACTCCTAT

TGTTCTGTTT

TACAGCAAAT

CAGTCACTGT

TGGTATCAGG

GTGCAATCTG

GCCATATCAT

ATTTGGCCTG

ACACATTCAA

TCACCTTCCT

TCTCCATCTT

TCATTTTAAT

GAACAAATAT

TTATAGTGCA

AGTATCATAT

TTCATTTTCT

TCATGGAGAT

CTTTGTGTAG

TCAAATAATC

TATTTTCTGA

CTTCAATCTA

ATATATTATA

(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 127 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Phe Ile Phe Met Gly Val Phe Phe Leu Leu Asn Ile Thr Leu Leu Met Ala Asn Phe Ile Asn Pro Arg Cys Phe Trp Arg Ile Asn Leu Asp Glu Ile Thr Asp Glu Tyr Leu Gly Leu Ser Cys Thr Phe Ile Leu Ala Ala Val Gln Thr Pro Thr Glu Lys Asp Tyr Phe Asn Lys Thr Leu Asn Val Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Val Phe Ala Met Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Arg Tyr Thr Leu Gly Leu Cys Asp Gly Lys Thr Val Thr Pro Thr Pro Tyr Leu Phe His Lys Lys Lys Thr Lys Pro Tyr Pro
(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3896 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 36...263 (D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

TGT GTT

Met Lys Asn Leu Cys Val TTG TGC CAT

Phe Thr Leu Ser Phe Phe Leu Leu Glu Phe Ser Leu Ile Leu Cys His GAA GAT AAT

Leu Thr Glu Pro Ile Cys Phe Trp Arg Ile Asn Asn Asn Glu Asp Asn GCA GTT GAG

Asp Gly Asp Leu Arg Ser Asp Cys Gly Phe Phe Leu Ala Ala Val Glu TTT TCT TTG

Gly Pro Thr Asp Asp Ser Tyr Asn Ile Ser Asp Leu Arg Phe Ser Leu ATCAGGTA

Asp His Leu Ile Leu Ser TTTTACATGG

ATCAGACATG

CCCAGAAGAC

ATCAACAGCA

TAGAAGGTGG

TATCACAAAT

CCATGTAGGT

CACAGTAAAC

GAACAGCAAT

CAAATATGAC

GGCCCACACC

CAAAGGAACA

TAACCCTGTT

TATTTTCATC

TTTGCCTTGT

AGGAGGATCA

AATTCATCAG

AGTTTCCAAT

AAAGGCTCAG

(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 76 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Met Lys Asn Leu Cys Val Phe Thr Leu Ser Phe Phe Leu Leu Glu Phe Ser Leu Ile Leu Cys His Leu Thr Glu Pro Ile Cys Phe Trp Arg Ile Asn Asn Asn Glu Asp Asn Asp Gly Asp Leu Arg Ser Asp Cys Gly Phe Phe Leu Ala Ala Val Glu Gly Pro Thr Asp Asp Ser Tyr Asn Ile Ser Asp Leu Arg Phe Ser Leu Asp His Leu Ile Leu Ser
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2811 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 962...2605 (D) OTHER INFORMATION: GoVN1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

ACAGATGCCA

CCTCGTACAC

GCCTATAAGT

GATAAATGCA

TGTGAAGTCC

TTTTGAACGG

TCCTAATTAC

GAAAACATCT

TGGGCCGTGT

CCCCAAAGAC

CTGGGTGGGA

AAAGGAGCTG

GGAATCATTG

TGTGATTATA

AAAGTATGAA

AACAATATAC

CAT CAT GGG

Met Leu Glu Leu Ala His Gly Thr Leu Thr Phe Ser Pro His His Gly CCT ATC AAG

Glu Ile Ser Asp Phe Thr Asn Phe Met Gln Glu Val Thr Pro Ile Lys TAT TTC AAT

Tyr Pro Glu Asp Ile Phe Leu His Ile Leu Trp Asn Gln Tyr Phe Asn TGT ATA CCC

Cys Pro Leu Leu His Ser Glu Cys Lys Ile Phe Glu Asn Cys Ile Pro CTG GTC ATG

Asn Ala Ser Leu Glu Leu Leu Pro Gly Gly Val Phe Glu Leu Val Met GTG GCC CAC

Thr Glu Glu Ser Tyr Asn Val Tyr Asn Ala Val Tyr Ala Val Ala His CCA CAG GAT

Ser Leu His Glu Lys Ala Leu His Gln Val Glu Ile Gln Pro Gln Asp CCT TTT CTG

Asn Lys Asp Arg Thr Ile Leu Phe Pro Trp Gln Leu His Pro Phe Leu Lys AsnIle Gln Ile AsnSer Gly Arg Val Leu Leu Val Asp Ile Asp ACG ATT

Trp LysLys Lys Asp ThrGluTyrAsp IleSer Asn TrpAsn Thr Ile CTT TTT

Phe ProThr Gly Ser LeuLeuValLys ValGly Thr AlaPro Leu Phe GGG ACA

Ser AlaPro Lys Glu GlnLeuSerIle SerGlu His IleAsn Gly Thr TTT AGT

Trp ProIle Gly Thr GluIleProLys SerVal Cys GluSer Phe Ser CAC CCT

Cys SerPro Gly Arg LysValIleLeu GluSer Lys AlaCys His Pro ACT AAC

Cys PheAsp Cys Pro CysProAspLys GluIle Ser GluThr Thr Asn TGT GCA

Asp ValGly Gln Val LysCysProGlu SerHis Tyr AsnThr Cys Ala TGC GAT

Glu LysSer His Leu LysLysThrMet ThrPhe Leu TyrAsn Cys Asp ACG TTC

Asp SerLeu Gly Gly LeuThrLeuMet SerLeu Gly PheVal Thr Phe GTT AAC

Val ThrGly Leu Ile GlyValPheIle IleHis Arg ThrPro Val Asn AAT CTC

Ile ValLys Ala Asn ArgSerLeuSer TyrIle Leu IleThr Asn Leu TTC CTT

Leu ThrLeu Cys Leu CysProLeuLeu PheIle Gly ProAsn Phe Leu ATC CTC

Thr AlaThr Cys Leu GlnGlnAsnLeu PheGly Leu PheThr Ile Leu

GTG GCTCTA TCC GTG TTGGCCAAAACT ATCACT GTA ATGGCA 2065 ACA GTT

Val AlaLeu Ser Val LeuAlaLysThr IleThr Val MetAla Thr Val GCT CTG

Phe LysIle Thr Pro GlyArgLysThr ArgTrp Leu IleLeu Ala Leu AGA GCC CCT CAG TTC ATC ATT CCA CTT TGT GCC CTG ATG CAA ATC CTT 2161 ArgAla Gln Ile Ile Pro Leu AlaLeu Met IleLeu Pro Phe Cys Gln GGG TGG CCT GAC

PheSer Ile Leu Gly Thr Ser ProPhe Val MetAsp Gly Trp Pro Asp TCT CAT ATT AAG

AlaHis Glu Gly His Ile Ile LeuCys Asn GlySer Ser His Ile Lys GGC TAC TAC ATG

AlaIle Phe Cys Thr Leu Ala LeuGly Val AlaPhe Gly Tyr Tyr Met TAC TTG AGG GAC

GlySer Leu Ala Phe Met Ser AsnLeu Pro ThrPhe Tyr Leu Arg Asp TCC GCC ATG TGC

AsnGlu Lys Leu Ala Phe Ser LeuMet Phe SerVal Ser Ala Met Cys ACA CTC AGC AAG

TrpVal Phe Pro Val Tyr His ThrThr Gly ValArg Thr Leu Ser Lys ATG ATG GCT AGC

ValAla Glu Phe Ser Ile Leu SerSer Ala IleLeu Met Met Ala Ser ATC GTC ATT AGA

ThrLeu Phe Pro Lys Cys Tyr ValLeu Phe ProGlu Ile Val Ile Arg ATA CCT AAA AGG

ArgAsn Leu Leu Asn Arg Glu ArgGln His SerLys Ile Pro Lys Arg GAA TAGCAGTCAA
GACAAACATT
GGCCTAGCAC
AAAATGTCTG

AsnSer Thr Glu CCTGCTATAT GATCACATGA
AAACAATTAG
TCCTTTGACT

TGATATTGCT TATTGACCAA
TCAAATTATG
TAAAATATGT

GTTCTTGTAT
GAAAAAAAAA
AAAAAAA

(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 548 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Met Leu Glu Leu Ala His Gly Thr Leu Thr Phe Ser Pro His His Gly Glu Ile Ser Asp Phe Thr Asn Phe Met Gln Glu Val Thr Pro Ile Lys Tyr Pro Glu Asp Ile Phe Leu His Ile Leu Trp Asn Gln Tyr Phe Asn

Cys Pro Leu Leu His Ser Glu Cys Lys Ile Phe Glu Asn Cys Ile Pro Asn Ala Ser Leu Glu Leu Leu Pro Gly Gly Val Phe Glu Leu Val Met Thr Glu Glu Ser Tyr Asn Val Tyr Asn Ala Val Tyr Ala Val Ala His Ser Leu His Glu Lys Ala Leu His Gln Val Glu Ile Gln Pro Gln Asp Asn Lys Asp Arg Thr Ile Leu Phe Pro Trp Gln Leu His Pro Phe Leu Lys Asn Ile Gln Leu Ile Asn Ser Val Gly Asp Arg Val Ile Leu Asp Trp Lys Lys Lys Thr Asp Thr Glu Tyr Asp Ile Ser Asn Ile Trp Asn Phe Pro Thr Gly Leu Ser Leu Leu Val Lys Val Gly Thr Phe Ala Pro Ser Ala Pro Lys Gly Glu Gln Leu Ser Ile Ser Glu His Thr Ile Asn Trp Pro Ile Gly Phe Thr Glu Ile Pro Lys Ser Val Cys Ser Glu Ser Cys Ser Pro Gly His Arg Lys Val Ile Leu Glu Ser Lys Pro Ala Cys Cys Phe Asp Cys Thr Pro Cys Pro Asp Lys Glu Ile Ser Asn Glu Thr Asp Val Gly Gln Cys Val Lys Cys Pro Glu Ser His Tyr Ala Asn Thr Glu Lys Ser His Cys Leu Lys Lys Thr Met Thr Phe Leu Asp Tyr Asn Asp Ser Leu Gly Thr Gly Leu Thr Leu Met Ser Leu Gly Phe Phe Val Val Thr Gly Leu Val Ile Gly Val Phe Ile Ile His Arg Asn Thr Pro Ile Val Lys Ala Asn Asn Arg Ser Leu Ser Tyr Ile Leu Leu Ile Thr Leu Thr Leu Cys Phe Leu Cys Pro Leu Leu Phe Ile Gly Leu Pro Asn Thr Ala Thr Cys Ile Leu Gln Gln Asn Leu Phe Gly Leu Leu Phe Thr Val Ala Leu Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Met Ala Phe Lys Ile Thr Ala Pro Gly Arg Lys Thr Arg Trp Leu Leu Ile Leu Arg Ala Pro Gln Phe Ile Ile Pro Leu Cys Ala Leu Met Gln Ile Leu Phe Ser Gly Ile Trp Leu Gly Thr Ser Pro Pro Phe Val Asp Met Asp Ala His Ser Glu His Gly His Ile Ile Ile Leu Cys Asn Lys Gly Ser Ala Ile Gly Phe Tyr Cys Thr Leu Ala Tyr Leu Gly Val Met Ala Phe Gly Ser Tyr Leu Leu Ala Phe Met Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ser Lys Ala Leu Ala Phe Ser Met Leu Met Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Thr Gly Lys Val Arg Val Ala Met Glu Met Phe Ser Ile Leu Ala Ser Ser Ala Ser Ile Leu Thr Leu Ile Phe Val Pro Lys Cys Tyr Ile Val Leu Phe Arg Pro Glu Arg Asn Ile Leu Pro Leu Asn Arg Glu Lys Arg Gln His Arg Ser Lys Asn Ser Glu Thr 
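
Each nucleic acid entry above gives its coding sequence as an inclusive nucleotide range, and the paired protein entry gives a length in amino acids. The two should agree: the range length divided by three should equal the stated residue count. The following is a minimal Python sketch of that consistency check, using only the ranges and lengths stated for SEQ ID NO:23 through SEQ ID NO:34. The names `codon_count`, `LISTING_ENTRIES`, and `mismatches` are illustrative assumptions, not part of the listing.

```python
def codon_count(start, end):
    """Number of codons in an inclusive nucleotide range start...end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a whole number of codons")
    return length // 3


# (label, CDS start, CDS end, stated protein length in amino acids),
# copied from the FEATURE and LENGTH fields of the entries above.
LISTING_ENTRIES = [
    ("SEQ ID NO:23/24", 60, 992, 311),
    ("SEQ ID NO:25/26", 3, 1238, 412),
    ("SEQ ID NO:27/28", 116, 2527, 804),
    ("SEQ ID NO:29/30", 39, 419, 127),
    ("SEQ ID NO:31/32", 36, 263, 76),
    ("SEQ ID NO:33/34", 962, 2605, 548),
]


def mismatches(entries):
    """Return labels whose CDS range does not match the stated length."""
    return [
        label
        for label, start, end, stated in entries
        if codon_count(start, end) != stated
    ]


if __name__ == "__main__":
    bad = mismatches(LISTING_ENTRIES)
    print("all consistent" if not bad else f"check: {bad}")
```

The ranges here cover exactly the translated residues, so no stop codon is counted; a listing that included the stop codon in the range would need `codon_count(start, end) - 1` instead.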
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3584 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 273...2576 (D) OTHER INFORMATION: GoVN2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

CAGAAAGAAT ATTTTTCCTT
ATGTTCATTT

CTCCACCATC CACTTCTCAT GGTGCTTTTG GAGAACAAAT

GGCAAATTTC
ATCGATCCCT

TTGAATGAAG TCAAGGAAAA CCTTCATCCT TGGAGCAGTT

AAACTTGGAT
ATAAATTGTG

TATTTCAATG ACAACTAAAA
AGACTTTGAA

TTAGCCTTTT
CA ATG

Met Glu GluIleAsnArg Asn AAT TCT GTT

ProAsp LeuLeu Pro Met Leu Ile LysHisThrLeu Ser Asn Ser Val ACT GAC ATA

TyrCys AspGly Asn Ala His Phe LysGluLysPhe Tyr Thr Asp Ile TAT TGT GAA

LysPro LeuPro Asn Val Asn Glu ThrMetCysSer Phe Tyr Cys Glu AAT GTA TCT

MetLeu IleGly Leu Trp Leu Leu ThrLeuPheLys Asp Asn Val Ser 60 65 70 TTT CGT CTT

LeuAsp IlePhe Ser Pro Phe Gln IleSerTyrGly Pro Phe Arg Leu AGT AAT CAA

PheHis SerIle Phe Asp Glu Phe ProTyrLeuTyr Gln Ser Asn Gln ACA CTA TTG

MetThr ProLys Asp Ser Ala Ala IleValSerPhe Leu Thr Leu Leu AAC GTT CTT

LeuTyr PheAsn Trp Trp Gly Val IleSerAspAsn Asp Asn Val Leu CTC GAG AAA

GluGly AsnGln Phe Ser Leu Lys GluThrGlnAsn Lys Leu Glu Lys TTT AAC ATG

GluIle CysPhe Ala Val Met Ser IleHisGluHis Ser Phe Asn Met _ --Ser Tyr GlnLysThrGlu MetTyrTyr AsnGlnIle ValMetSer Ser Thr Asn IleIleIleIle TyrGlyLys ThrAsnSer IleIleGlu Leu Ser Phe ArgMetTrpVal SerProVal IleGlnArg IleTrpVal Thr Asn Ser GluLeuAspPhe ProThrSer MetArgAsp PheThrHis Gly Thr Phe TyrGlyThrLeu ThrPheLeu HisHisHis GlyGluIle Ser Gly Phe ThrAsnPhePhe GluThrTrp AspHisLeu ArgSerArg Asp Leu Asn LeuLeuIlePro GluTrpLys TyrPheSer TyrAspAla Ser Gly Ser AsnCysLysIle LeuArgAsn TyrSerSer AsnAlaSer Leu Glu Trp IleThrGluGln LysPheHis MetAlaPhe AsnAspTyr Ser His Ser IleTyrAsnAla ValTyrAla MetAlaHis AlaLeuHis Glu Thr Asn LeuGlnGluVal AspAsnLys GluIleArg AsnGlyLys Gly Ala Ser ThrHisCysLeu LysValAsn SerPheLeu ArgLysThr His Phe Thr AsnSerHisGly GluArgVal IleMetLys GlnArgVal Arg Val Gln GluAspTyrAsp IleValHis IleGlnAsn PheSerGln His Leu Arg IleLysMetLys IleGlyLys PheSerPro TyrPheThr His Gly Gly ProPheHisLeu TyrGluAsp MetIleGln LeuAlaThr Gly .

Ser Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Lys Ser Trp Lys Glu Gly Met Ala Pro Cys Cys Phe Ile Cys Ser Leu Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys Ile Gln Lys Asp Val Ile Phe Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Ile Ala Phe Cys Leu Ser Ala Phe Thr Ala Val Val Leu Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Ile Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly His Pro Asn Arg Gly Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Ile Leu Ala Phe Lys Leu Arg Asp Pro Gly Arg Ser Leu Arg Asn Phe Leu Val Ser Gly Ala Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly His Ile Met Ile Val Cys Asn Lys Gly Ser Ile Met Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Thr Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys

TTC TTG ACC TTC AGC ATG CTA GTG TTC TGC AGT GTC TGG GTC ACC TTT 2405
Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Arg Gly Arg Val Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Phe Gly Cys Ile Phe Ala Pro Lys Ile Tyr Ile Ile Leu Met Lys Pro Glu Arg Asn Ser Ile Gln Lys Phe Arg Glu Lys Ser Tyr Phe
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Met Glu Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Val Ile Lys His Thr Leu Ser Tyr Cys Asp Gly Asn Thr Ala Asp His Ile Phe Lys Glu Lys Phe Tyr Lys Pro Leu Pro Asn Tyr Val Cys Asn Glu Glu Thr Met Cys Ser Phe Met Leu Ile Gly Leu Asn Trp Val Leu Ser Leu Thr Leu Phe Lys Asp Leu Asp Ile Phe Ser Phe Pro Arg Phe Leu Gln Ile Ser Tyr Gly Pro Phe His Ser Ile Phe Ser Asp Asn Glu Gln Phe Pro Tyr Leu Tyr Gln Met Thr Pro Lys Asp Thr Ser Leu Ala Leu Ala Ile Val Ser Phe Leu Leu Tyr Phe Asn Trp Asn Trp Val Gly Leu Val Ile Ser Asp Asn Asp Glu Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Thr Gln Asn Lys Glu Ile Cys Phe Ala Phe Val Asn Met Met Ser Ile His Glu His Ser Ser Tyr Gln Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Asn Ile Ile Ile Ile Tyr Gly Lys Thr Asn Ser Ile Ile Glu Leu Ser Phe Arg Met Trp Val Ser Pro Val Ile Gln Arg Ile Trp Val Thr Asn Ser Glu Leu Asp Phe Pro Thr Ser Met Arg Asp Phe Thr His Gly Thr Phe Tyr Gly Thr Leu Thr Phe Leu His His His Gly Glu Ile Ser Gly Phe Thr Asn Phe Phe Glu Thr Trp Asp His Leu Arg Ser Arg Asp Leu Asn Leu Leu Ile Pro Glu Trp Lys Tyr Phe Ser Tyr Asp Ala Ser Gly Ser Asn Cys Lys Ile Leu Arg Asn Tyr Ser Ser Asn Ala Ser Leu Glu Trp Ile Thr Glu Gln Lys Phe His Met Ala Phe Asn Asp Tyr Ser His Ser Ile Tyr Asn Ala Val Tyr Ala Met Ala His Ala Leu His Glu Thr Asn Leu Gln Glu Val Asp Asn Lys Glu Ile Arg Asn Gly Lys Gly Ala Ser Thr His Cys Leu Lys Val Asn Ser Phe Leu Arg Lys Thr His Phe Thr Asn Ser His Gly Glu Arg Val Ile Met Lys Gln Arg Val Arg Val Gln Glu Asp Tyr Asp Ile Val His Ile Gln Asn Phe Ser Gln His Leu Arg Ile Lys Met Lys Ile Gly Lys Phe Ser Pro Tyr Phe Thr His Gly Gly Pro Phe His Leu Tyr Glu Asp Met Ile Gln Leu Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys Ser Pro Gly Phe Arg Lys Ser Trp Lys Glu Gly Met Ala Pro Cys Cys Phe Ile Cys Ser Leu Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys Ile Gln Lys Asp Val Ile Phe Leu Ser Tyr Glu Asp 
Pro Leu Gly Met Ala Leu Ala Leu Ile Ala Phe Cys Leu Ser Ala Phe Thr Ala Val Val Leu Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Ile Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly His Pro Asn Arg Gly Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Ile Leu Ala Phe Lys Leu Arg Asp Pro Gly Arg Ser Leu Arg Asn Phe Leu Val Ser Gly Ala Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly His Ile Met Ile Val Cys Asn Lys Gly Ser Ile Met Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Leu Gly Ser Phe Thr Thr Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Arg Gly Arg Val Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Phe Gly Cys Ile Phe Ala Pro Lys Ile Tyr Ile Ile Leu Met Lys Pro Glu Arg Asn Ser Ile Gln Lys Phe Arg Glu Lys Ser Tyr Phe
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3578 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1181...3181 (D) OTHER INFORMATION: GoVN3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

AATTATTTTT

TGAAGCTTTC

CCAAGACATT

TTGGTCATGT

CTATCCACTT

TTCTACCTAA

TAAACAGTTT

GAAGATATTT

TCCTATACAT

CCCTAAAGTC

ACAATCACTG

CTACTGATGC

TCTTTGCACT

TTAGAACTCA

ATCACACCTG

AACTAAGTGA

GATATTTTTT

TAAAATTCCT

CATTTCACCC

CCT AAG GAC

Met Ala Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Leu Phe Val His Phe Ser Trp GCT GTT GAC GGT
TCA TAT

Asn TrpValGly ValVal AspAsp ProGly Glu Phe Ala Ser Asp Tyr AGA ATG AAC TGT

Ile LeuGluLeu ArgGlu GlnArg AsnPhe Leu Ala Arg Met Asn Cys ATT GAT TTA AAA

Phe ValSerIle ValSer AspAsn PheLeu Arg Tyr Ile Asp Leu Lys AAC AAG TCA GTT

Asn IleTyrTyr GlnIle MetSer AlaLys Val Ile Asn Lys Ser Val 70 75 80 85 AAA CCT GTG AGA

Ile TyrGlyAsp AspSer LeuGln AsnPhe Leu Trp Lys Pro Val Arg ATC ATC ACT CAG

Asn LeuPheAsp GlnArg TrpVal ThrSer Trp Asp Ile Ile Thr Gln AAT TTC AAT TAT

Met IleIleAsn GlyLys LeuLeu SerPhe Gly Thr Asn Phe Asn Tyr CAT TCT TCT AAA

Leu SerPheSer HisTyr GluLeu GlyPhe Thr Phe His Ser Ser Lys TAC AAC GAT TCT

Ile GlnThrAla ProSer TyrSer AspPhe Leu Gly Tyr Asn Asp Ser GTG AAT TTG TCT

Ile LeuTrpTrp TyrPhe CysSer SerLeu Glu Cys Val Asn Leu Ser AAT AAG ATA TGG

Lys AsnLeuGln CysPro GluAsn PheArg Leu Tyr Asn Lys Ile Trp GAA TTG ACT GAC

Arg HisHisPhe MetSer SerAsp ThrTyr Leu Tyr Glu Leu Thr Asp GCT TAC CAA CTT

Asn SerMetTyr ValAla ThrLeu GlnMet Leu Lys Ala Tyr Gln Leu TGG GAT AAA GAA

Gln AlaAspThr GlnIle AspGly GluPro Phe Asp Trp Asp Lys Glu CTC CTG ATC ATA

Ser TrpGlnMet SerPhe ArgAsn GlnPhe Asn Pro Leu Leu Ile Ile GTG AAT GAA GAT

Val GlyAspLys AsnLeu HisGlu LysLeu Thr Lys Val Asn Glu Asp CAG ACT CCA GTA .

Tyr GluIle HisGln Thr ThrPhe Pro Asn ValPheLys Leu Leu Pro TCC TTA GGT

Leu LysIle GlyThr Phe GlnAsn Ser His ArgGlnLeu Ser Leu Gly ATA AAC CAC

Tyr MetLeu LysGlu Met GluTrp Thr Gly GlnGlnSer Ile Asn His ATT AGT TTC

Pro ThrSer ValCys Ser ProCys Pro Gly ArgLysSer Ile Ser Phe GTT TTT ACA

Pro GlnLeu GlyLys Pro CysCys Asp Cys ProCysPro Val Phe Thr 345 350 355 ATG ATG TGT

Glu AsnGlu IleSer Asn ThrAsn Asn Gln IleLysCys Met Met Cys Leu Asn Asp Gln Tyr Ala Asn Pro Gly Gly Thr Arg Cys Leu Lys Lys Val Ile Val Phe Leu Gly Tyr Glu Asp Pro Leu Gly Met Ser Leu Ala Ile Leu Ala Leu Cys Phe Ser Ala Leu Thr Ala Phe Val Leu Ser Ile Phe Leu Lys His Gln Glu Thr Pro Thr Val Lys Ala Asn Asn Arg Thr Leu Ser Tyr Val Leu Leu Ile Ser Leu Ile Ser Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly His Pro Ser Phe Thr Thr Cys Ile Met Gln Gln ' 455 460 465 Thr Thr Phe Ala Val Val Phe Thr Val Ala Ala Ser Thr Val Leu Ala Lys Thr Ile Ile Val Ile Leu Ala Phe Lys Val Thr Asn Thr Ser Arg _ Lys Met Arg Trp Leu Leu Val Ser Gly Ala Pro Lys Phe Ile Ile Pro Ile Cys Thr Met Ile Gln Leu Ile Leu Cys Gly Ile Trp Leu Gly Thr Ser Pro Pro Phe Val Asp Ala Asp Gly His Val Glu Lys Gly His Ile.

CTT GCT TGT

Leu Ile Phe Cys Asn Lys Gly Ser Ile Phe Tyr ValLeu Leu Ala Cys AGT TTC GCA

Gly Tyr Leu Val Ser Ile Ala Ile Ala Thr Leu PhePhe Ser Phe Ala GAA GCC CTA

Ala Arg Asn Leu Pro Asp Thr Phe Asn Lys Phe ThrPhe Glu Ala Leu GTC ACC CCT

Ser Met Leu Val Phe Cys Ser Val Trp Phe Leu ValTyr Val Thr Pro GCT GTG TTC

His Ser Thr Lys Gly Lys Ser Met Val Glu Val CysIle Ala Val Phe TGC ATC CCA

Leu Ala Ser Ser Ala Gly Leu Leu Phe Phe Ala LysCys Cys Ile Pro AAA TCT AAG

Phe Ile Ile Leu Leu Arg Pro Glu Lys Phe Gln PheGln Lys Ser Lys ATTAAATTTT TCTGACACAC

Asn Ile His Ser Lys Ile TAGATCCAAA

AAAACACGTC

CTGTTTGCTG

GTTCTGAGTT

TGTTGTTGTG

CTCTATAATA AATAATTATG AGATAAATGC P~~i~AAAAAAAp~~e~AAAAAAAAAAAAAAA 3578 A
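
The coding-sequence LOCATION given for each cDNA in this listing can be cross-checked against the LENGTH stated for the corresponding protein record. The following is a minimal sketch of that check, not part of the original listing. It assumes the LOCATION ranges are 1-based and inclusive, and that the stated protein length equals the number of codons in the range. The function names `codon_count` and `check_records` and the `RECORDS` table are hypothetical; the coordinates and lengths are copied from the FEATURE and LENGTH lines for GoVN3 through GoVN7.

```python
# Hypothetical consistency check; names are illustrative, not from the patent.
# Assumption: LOCATION ranges are 1-based and inclusive.

def codon_count(start: int, end: int) -> int:
    """Return the number of codons spanned by a 1-based inclusive range."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"range {start}...{end} is not a multiple of 3")
    return span // 3

# (CDS start, CDS end) and stated protein length, as given in the listing.
RECORDS = {
    "GoVN3": ((1181, 3181), 667),  # SEQ ID NO:37 / NO:38
    "GoVN4": ((126, 2723), 866),   # SEQ ID NO:39 / NO:40
    "GoVN5": ((299, 2635), 779),   # SEQ ID NO:41 / NO:42
    "GoVN6": ((112, 1761), 550),   # SEQ ID NO:43 / NO:44
    "GoVN7": ((46, 2424), 793),    # SEQ ID NO:45 / NO:46
}

def check_records(records):
    """Map each clone name to True when codon count matches protein length."""
    return {
        name: codon_count(*location) == protein_length
        for name, (location, protein_length) in records.items()
    }

if __name__ == "__main__":
    for name, ok in check_records(RECORDS).items():
        print(name, "consistent" if ok else "MISMATCH")
```

Under these assumptions all five records agree, which is a useful sanity check when repairing OCR damage in the residue lines.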

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 667 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Met Ala Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Leu Phe Val His Phe Ser Trp Asn Trp Val Gly Ala Val Val Ser Asp Asp Asp Pro Gly Tyr Glu Phe Ile Leu Glu Leu Arg Arg Glu Met Gln Arg Asn Asn Phe Cys Leu Ala Phe Val Ser Ile Ile Val Ser Asp Asp Asn Leu Phe Leu Lys Arg Tyr Asn Ile Tyr Tyr Asn Gln Ile Lys Met Ser Ser Ala Lys Val Val Ile Ile Tyr Gly Asp Lys Asp Ser Pro Leu Gln Val Asn Phe Arg Leu Trp Asn Leu Phe Asp Ile Gln Arg Ile Trp Val Thr Thr Ser Gln Trp Asp Met Ile Ile Asn Asn Gly Lys Phe Leu Leu Asn Ser Phe Tyr Gly Thr Leu Ser Phe Ser His His Tyr Ser Glu Leu Ser Gly Phe Lys Thr Phe Ile Gln Thr Ala Tyr Pro Ser Asn Tyr Ser Asp Asp Phe Ser Leu Gly Ile Leu Trp Trp Val Tyr Phe Asn Cys Ser Leu Ser Leu Ser Glu Cys Lys Asn Leu Gln Asn Cys Pro Lys Glu Asn Ile Phe Arg Trp Leu Tyr Arg His His Phe Glu Met Ser Leu Ser Asp Thr Thr Tyr Asp Leu Tyr Asn Ser Met Tyr Ala Val Ala Tyr Thr Leu Gln Gln Met Leu Leu Lys Gln Ala Asp Thr Trp Gln Ile Asp Asp Gly Lys Glu Pro Glu Phe Asp Ser Trp Gln Met Leu Ser Phe Leu Arg Asn Ile Gln Phe Ile Asn Pro Val Gly Asp Lys Val Asn Leu Asn His Glu Glu Lys Leu Asp Thr Lys Tyr Glu Ile His Gln Thr Leu Thr Phe Leu Pro Asn Pro Val Phe Lys Leu Lys Ile Gly Thr Phe Ser Gln Asn Leu Ser His Gly Arg Gln Leu Tyr Met Leu Lys Glu Met Ile Glu Trp Asn Thr Gly His Gln Gln Ser Pro Thr Ser Val Cys Ser Ile Pro Cys Ser Pro Gly Phe Arg Lys Ser Pro Gln Leu Gly Lys Pro Val Cys Cys Phe Asp Cys Thr Pro Cys Pro Glu Asn Glu Ile Ser Asn Met Thr Asn Met Asn Gln Cys Ile Lys Cys Leu Asn Asp Gln Tyr Ala Asn Pro Gly Gly Thr Arg Cys Leu Lys Lys Val Ile Val Phe Leu Gly Tyr Glu Asp Pro Leu Gly Met Ser Leu Ala Ile Leu Ala Leu Cys Phe Ser Ala Leu Thr Ala Phe Val Leu Ser Ile Phe Leu Lys His Gln Glu Thr Pro Thr Val Lys Ala Asn Asn Arg Thr Leu Ser Tyr Val Leu Leu Ile Ser Leu Ile Ser Cys Phe Leu Cys Ser Leu Leu Phe Ile Gly His Pro Ser Phe Thr Thr Cys Ile Met Gln Gln Thr Thr Phe Ala Val Val Phe Thr Val Ala Ala Ser Thr Val Leu Ala Lys Thr Ile Ile Val Ile Leu Ala Phe Lys Val Thr Asn
Thr Ser Arg Lys Met Arg Trp Leu Leu Val Ser Gly Ala Pro Lys Phe Ile Ile Pro Ile Cys Thr Met Ile Gln Leu Ile Leu Cys Gly Ile Trp Leu Gly Thr Ser Pro Pro Phe Val Asp Ala Asp Gly His Val Glu Lys Gly His Ile Leu Ile Phe Cys Asn Lys Gly Ser Ile Leu Ala Phe Tyr Cys Val Leu Gly Tyr Leu Val Ser Ile Ala Ile Ala Ser Phe Thr Leu Ala Phe Phe Ala Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Val Tyr His Ser Thr Lys Gly Lys Ser Met Val Pro Ala Val Glu Val Cys Ile Leu Ala Ser Ser Ala Gly Leu Leu Phe Phe Cys Ile Phe Ala Lys Cys Phe Ile Ile Leu Leu Arg Pro Glu Lys Pro Lys Ser Phe Gln Phe Gln Asn Ile His Ser Lys Ile Lys
(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4467 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 126...2723 (D) OTHER INFORMATION: GoVN4 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

GAAACACCTG
TAGAAAAGGA
AACCTGAATA
CAGGTATAGC
ATCTTCTTGG

AGATGGGGAT
AATTGCTACC
TGTTTGCTGA
TCTGTGCAGC
AATTAACTAC

TCC AGG
CTC AGA
GCA GGA
AAA AAT
ATG CTC
ACC TTC
ATT TTA

Met Ser Arg Leu Arg Ala Gly Lys Asn Met Leu Thr Phe Ile Leu TTT ATT TAT

Leu Phe Leu Leu Asn Ile Pro Leu Phe Val Pro Ser Phe Phe Ile Tyr TGC AGA AAC

Pro Arg Phe Trp Ser Met Lys Lys Asn Glu Tyr Gln Asp Cys Arg Asn ACA CCT ATG

Leu Gly Gly Cys Met Phe Phe Ile Leu Ala Val Gln Gln Thr Pro Met GAG ACT GAA

Glu Lys Tyr Phe Ser His Ile Ser Asn Ile Gln Thr Pro Glu Thr Glu AAG ATC AAC

Asn Gln Tyr Pro Leu Thr Leu Ala Phe Ser Met Asn Glu Lys Ile Asn CCT TTC TCA

Asn Asn Asp Leu Leu Pro Asn Met Ser Leu Ala Phe Thr Pro Phe Ser AGT AAT TTT

Glu Tyr Cys Tyr Leu Glu Ser His His Lys Arg Leu Phe Ser Asn Phe AAA AAA GAC

Ser Leu Asn His Glu Ile Leu Pro Asn Phe Ile Cys Thr Lys Lys Asp GGA CTT AGT TTG ACT

Ile LysCys Gly Val Val Leu Thr SerLeuVal Thr Val Gly Leu Thr TTC ATA CGT

Thr LeuHis Ile Ile Leu Asn Asn PheGlnGln Phe Gln Phe Ile Arg GCT CTG AAT

Leu ThrTyr Gly His Phe His Pro CysAspHis Glu Phe Ala Leu Asn GAT GAT CTT

Pro HisLeu Tyr Gln Met Ala Ser ThrSerLeu Ala Ala Asp Asp Leu AGT TGG TTG

Leu ValSer Phe Ile Ile His Phe AsnTrpIle Gly Ala Ser Trp Leu CAT TTT AGA

Ile SerAsp Asn Asp Gln Gly Ile LeuSerTyr Leu Arg His Phe Arg TTT GCC ATT

Glu MetGlu Lys Asn Thr Val Cys PheValAsn Ile Pro Phe Ala Ile 240 245 250 255 AGA GCT AGC

Val AsnMet Asn Leu Tyr Met Ser GluValTyr Tyr Gln Arg Ala Ser GTT ATC ACA

Val MetThr Ser Ser Ala Asn Val IleTyrGly Asp Gly Val Ile Thr ATG TGG ATA

Asn ThrLeu Ala Val Ser Phe Arg AspSerLeu Gly Gln Met Trp Ile TGG GAT AAG

Arg LeuTrp Val Thr Thr Ser Gln ValThrPro Phe Lys Trp Asp Lys GGA ACT CAC

Asp PheThr Phe Asp Asn Gly Tyr PheGlyPhe Gly Arg Gly Thr His TAT TTT AAC

His SerGlu Ile Ser Gly Phe Lys ValGlnThr Leu Pro Tyr Phe Asn GTA AAG TAT

Phe LysTyr Ser Asp Glu Tyr Leu LeuGluTrp Met Val Val Lys Tyr TGT AAG TGC

Asn CysLys Ile Leu Glu Tyr Asn SerLeuLys Asn Ser Cys Lys Cys Phe Asn His Ser Leu Glu Trp Leu Met Thr His Thr Phe Asp Met Ala ATT ATT GAA GGG AGT TAT GAA ATA TAC AAT GCT GTG TAT GCT TTT GCC . 1370 Ile Ile GluGlySer TyrGluIle TyrAsnAla ValTyrAla PheAla His Ala LeuHisGlu MetThrLeu GlnAsnVal AspAsnVal LeuLeu Pro Asn TyrGluGlu GlnAsnTyr AsnCysLys MetValTyr SerPhe Leu Ser LysThrGln PheThrAsn ProValGly AspThrVal AsnMet Asn Gln ArgAsnLys LeuLysGlu GluTyrAsp IlePheTyr AsnTrp Asn Phe ProGlnGly LeuGlyPhe LysValLys IleGlyIle PheSer Pro Tyr PheProLys GlyGlnGln LeuHisLeu SerGluAsn LeuIle Glu Trp SerThrGly ArgIleGln MetProThr SerValCys SerAla Asp Cys GlyProGly PheArgLys ValTrpLys AsnGlyMet ProAla Cys Cys PheAspCys SerProCys ProGluAsn GluIleSer AsnGlu Thr Asn ValGluLeu CysValGln CysProGlu AspGlnTyr AlaAsn Gln Glu GlnAsnHis CysIleHis LysAlaArg IlePheLeu SerTyr Asp Glu ProLeuGly MetAlaLeu SerLeuMet AlaLeuCys LeuAla Ala Leu ThrValVal ValLeuGly ValPheVal LysHisHis ArgThr Pro Ile ValLysAla AsnAsnCys ThrLeuThr TyrIleLeu LeuIle Ala Leu IlePheCys PheLeuCys ProLeuPhe PheIleGly HisPro Asn Ser AlaThrCys IleLeuGln GlnIleThr PheGlyVal ValPhe.

Thr Val Ala Ile Ser Thr Val Leu Ala Lys Thr Thr Thr Val Ile Leu GTT

Ala Phe Arg Val Thr Ala Pro His Arg Met Met Lys Tyr Phe Leu Val 690 695 700 ATT

Ser Arg Ala Ser Asn Tyr Ile Ile Pro Ile Cys Thr Leu Ile Gln Ile ATT

Ile Val Cys Ala Ile Trp Leu Gly Ala Ser Pro Pro Ser Val Asp Ile GGT

Asp Ala Gln Ser Glu His Gly His Ile Ile Ile Ala Cys Asn Lys Gly GCC

Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala ACC

Phe Val Ser Phe Thr Leu Ala Phe Leu Ser Arg Asn Leu Pro Val Thr AGT

Phe Asn Glu Ala Lys Ser Met Thr Phe Ser Met Leu Val Phe Cys Ser GTT

Val Trp Val Thr Phe Leu Pro Val Tyr His Gly Thr Lys Gly Lys Val ATG

Met Val Ala Val Glu Ile Phe Ser Thr Leu Ala Ser Ser Ala Gly Met CCA

Leu Gly Cys Ile Phe Ala Pro Lys Cys Tyr Thr Ile Leu Phe Arg Pro ACT

Asp Arg Asn Ser Leu Gln Met Ile Arg Glu Lys Ser Ser Ser His Thr TCATAATCAC CAAATATTC

His Ile Leu CCCTCAATTT TAAGTGTATC ATAAAAGACA CAGTTGTGAA ATTTTCAAGG ACAGCACTAC 3432
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 866 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
Met Ser Arg Leu Arg Ala Gly Lys Asn Met Leu Thr Phe Ile Leu Leu Phe Phe Leu Leu Asn Ile Pro Leu Phe Val Pro Ser Phe Ile Tyr Pro Arg Cys Phe Trp Ser Met Lys Lys Asn Glu Tyr Gln Asp Arg Asn Leu Gly Thr Gly Cys Met Phe Phe Ile Leu Ala Val Gln Gln Pro Met Glu Lys Glu Tyr Phe Ser His Ile Ser Asn Ile Gln Thr Pro Thr Glu Asn Gln Lys Tyr Pro Leu Thr Leu Ala Phe Ser Met Asn Glu Ile Asn Asn Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ala Phe Thr Phe Ser Glu Tyr Ser Cys Tyr Leu Glu Ser His His Lys Arg Leu Phe Asn Phe Ser Leu Lys Asn His Glu Ile Leu Pro Asn Phe Ile Cys Thr Lys Asp Ile Lys Cys Gly Val Val Leu Thr Gly Leu Ser Leu Val Thr Thr Val Thr Leu His Ile Ile Leu Asn Asn Phe Ile Phe Gln Gln Phe Arg Gln Leu Thr Tyr Gly His Phe His Pro Ala Leu Cys Asp His Glu Asn Phe Pro His Leu Tyr Gln Met Ala Ser Asp Asp Thr Ser Leu Ala Leu Ala Leu Val Ser Phe Ile Ile His Phe Ser Trp Asn Trp Ile Gly Leu Ala Ile Ser Asp Asn Asp Gln Gly Ile His Phe Leu Ser Tyr Leu Arg Arg Glu Met Glu Lys Asn Thr Val Cys Phe Ala Phe Val Asn Ile Ile Pro Val Asn Met Asn Leu Tyr Met Ser Arg Ala Glu Val Tyr Tyr Ser Gln Val Met Thr Ser Ser Ala Asn Val Val Ile Ile Tyr Gly Asp Thr Gly Asn Thr Leu Ala Val Ser Phe Arg Met Trp Asp Ser Leu Gly Ile Gln Arg Leu Trp Val Thr Thr Ser Gln Trp Asp Val Thr Pro Phe Lys Lys Asp Phe Thr Phe Asp Asn Gly Tyr Gly Thr Phe Gly Phe Gly His Arg His Ser Glu Ile Ser Gly Phe Lys Tyr Phe Val Gln Thr Leu Asn Pro Phe Lys Tyr Ser Asp Glu Tyr Leu Val Lys Leu Glu Trp Met Tyr Val Asn Cys Lys Ile Leu Glu Tyr Asn Cys Lys Ser Leu Lys Asn Cys Ser Phe Asn His Ser Leu Glu Trp Leu Met Thr His Thr Phe Asp Met Ala Ile Ile Glu Gly Ser Tyr Glu Ile Tyr Asn Ala Val Tyr Ala Phe Ala His Ala Leu His Glu Met Thr Leu Gln Asn Val Asp Asn Val Leu Leu Pro Asn Tyr Glu Glu Gln Asn Tyr Asn Cys Lys Met Val Tyr Ser Phe Leu Ser Lys Thr Gln Phe Thr Asn Pro Val Gly Asp Thr Val Asn Met Asn Gln Arg Asn Lys Leu Lys Glu Glu Tyr Asp Ile Phe Tyr Asn Trp Asn Phe Pro Gln Gly Leu Gly Phe Lys Val Lys Ile Gly Ile Phe Ser Pro Tyr Phe Pro Lys 
Gly Gln Gln Leu His Leu Ser Glu Asn Leu Ile Glu Trp Ser Thr Gly Arg Ile Gln Met Pro Thr Ser Val Cys Ser Ala Asp Cys Gly Pro Gly Phe Arg Lys Val Trp Lys Asn Gly Met Pro Ala Cys Cys Phe Asp Cys Ser Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Val Glu Leu Cys Val Gln Cys Pro Glu Asp Gln Tyr Ala Asn Gln Glu Gln Asn His Cys Ile His Lys Ala Arg Ile Phe Leu Ser Tyr Asp Glu Pro Leu Gly Met Ala Leu Ser Leu Met Ala Leu Cys Leu Ala Ala Leu Thr Val Val Val Leu Gly Val Phe Val Lys His His Arg Thr Pro Ile Val Lys Ala Asn Asn Cys Thr Leu Thr Tyr Ile Leu Leu Ile Ala Leu Ile Phe Cys Phe Leu Cys Pro Leu Phe Phe Ile Gly His Pro Asn Ser Ala Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Val Val Phe Thr Val Ala Ile Ser Thr Val Leu Ala Lys Thr Thr Thr Val Ile Leu Ala Phe Arg Val Thr Ala Pro His Arg Met Met Lys Tyr Phe Leu Val Ser Arg Ala Ser Asn Tyr Ile Ile Pro Ile Cys Thr Leu Ile Gln Ile Ile Val Cys Ala Ile Trp Leu Gly Ala Ser Pro Pro Ser Val Asp Ile Asp Ala Gln Ser Glu His Gly His Ile Ile Ile Ala Cys Asn Lys Gly Ser Val Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe Val Ser Phe Thr Leu Ala Phe Leu Ser Arg Asn Leu Pro Val Thr Phe Asn Glu Ala Lys Ser Met Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Gly Thr Lys Gly Lys Val Met Val Ala Val Glu Ile Phe Ser Thr Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Ala Pro Lys Cys Tyr Thr Ile Leu Phe Arg Pro Asp Arg Asn Ser Leu Gln Met Ile Arg Glu Lys Ser Ser Ser His Thr His Ile Leu
(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2916 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 299...2635 (D) OTHER INFORMATION: GoVN5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

TATGCTCTTC

CTACAGAAGC

GGTGATTGTT

GTGGATGCTT

ATTTGGCC AT

Met ACC AAA

Arg Phe Ala Ile Glu Glu Ile Asn Ser Asn Pro His Leu Leu Pro Asn GAG GTA

Thr Ser Leu Gly Phe Glu Ile Asn Asn Val Pro His Gly Gln Arg Tyr TGA CAT

Thr Leu Val Lys Leu Phe Ser Ser Leu Ser Gly Ser Asn Tyr Asp Ile ACT TAC

Pro Asn Tyr Ile Ser Ala Ser Glu Ser Asn Ser Ala Ala Val Leu Thr GGA TCT

Gly Pro Ser Trp Thr Ile Ser Glu Cys Val Gly Thr Leu Leu Asp Leu CCT GAG

Tyr Lys Phe Pro Gln Leu Thr Phe Gly Pro Phe Asp Ser Leu Leu Ser AGA TAC

Glu Gln Arg Arg Phe Ser Ser Leu Tyr Gln Val Ala Pro Lys Asp Thr

100 105 110 Phe Leu Thr Pro Gly Ile Val Ser Leu Met Leu His Phe His Trp Asn Trp Val Gly Leu Phe Ile Ile Asp Asp Asp Lys Gly Ala Gln Thr Leu Ser Asp Leu Arg Asn Glu Met Asp Lys Asn Gly Val Cys Thr Ala Phe Val Glu Met Ile Pro Val Ile Lys Gly Ser Phe Phe Thr Lys Ser Trp Lys Asn His Val Gln Ile Leu Glu Ser Ser Ser Asn Val Ile Ile Ile Tyr Gly Asp Ser Asp Ser Leu Leu Ser Leu Ile Val Asn Ile Lys Gln Lys Leu Leu Thr Trp Lys Val Trp Val Leu Ile Ser Gln Trp Asp Val Ser Lys Phe Asp Asp Tyr Phe Met Val Asp Ser Leu His Gly Ala Leu Ile Phe Ser His His Arg Glu Glu Ile Pro Asn Phe Thr Asp Phe Met Gln Lys Tyr Asn Pro Ser Lys Tyr Pro Glu Asp Thr Tyr Leu His Val Leu Trp His Met Tyr Phe Asn Cys Ser Phe Val Lys Lys Asp Cys Lys Ile Val His Asn Cys Leu Pro Asn Ala Ser Leu Gly Phe Leu Pro Gly Asn Ile Phe Asp Met Ala Met Ser Glu Glu Ser Tyr Asn Val Tyr Asn Ala Val Tyr Ala Val Ala His Ser Leu His Glu Met Ile Leu Asn Gln Val Gln Phe Gln Thr His Glu Lys Gly Lys Lys Met Val Phe Phe Pro Trp Gln Leu His Pro Phe Leu Arg Glu Arg Gln Leu Ile Asn Gln Asn TTG GTC GTA

Gly Ala Asn Glu Asp Leu Asp Thr Arg Lys His Val Glu Cys Ser Tyr TTT TCT TGT

Asp Ile Leu Asn Phe Trp Asn Pro Lys Gly Gly Leu Asn Phe Leu Val AAG GGA GTC

Lys Val Gly Thr Phe Ser Pro Ala Pro Lys Gln Lys Leu Ser Glu Ser GTG GTC TCC

Ile Ser Ser Asn Met Ile Gln Ala Thr Gly Thr Glu Ile Trp Ser Pro CTG ATT CCA

Gln Ser Val Cys Ser Glu Ser His Pro Gly Arg Lys Thr Cys Phe His TTG CAT AGA

Gln Glu Gly Arg Val Ala Cys Phe Asp Cys Pro Cys Pro Cys Ile Glu AGA GTG TCC

Asn Glu Ile Ser Asn Glu Thr Val Asp Gln Val Lys Cys Asp Cys Pro AGA CTG AAC

Glu Thr His Tyr Ala Asn Ile Lys Ile His Leu Gln Lys Glu Cys Thr TGA GAA GAC ACT
TTG CTT

Val Thr Phe Leu Tyr Tyr Asp Pro Leu Gly Thr Leu Cys Asp Lys Phe ACT TGT GTT

Met Ser Leu Gly Phe Ser Ser Thr Ala Ala Leu Val Val Leu Val Phe CAT CAA TAA CCT
GGC TCT

Leu Lys Asn Arg Asp Thr Pro Val Lys Ala Asn Leu Ala Ile Asn Leu TTT TTT CTT

Ser Tyr Thr Leu Leu Ile Thr Met Leu Cys Leu Cys Pro Leu Phe Leu CAC TAT AAA

Leu Phe Ile Gly Arg Pro Ser Ala Ser Cys Leu Gln Gln Thr Ile Asn TGT CAC CAA

Ile Phe Gly Leu Leu Phe Thr Ala Leu Ser Val Leu Ala Val Thr Lys CTT TTC AAT

Thr Ile Thr Val Val Ile Ala Lys Ile Thr Pro Gly Arg Phe Ser Ile AAG TTT CTT

Arg Arg Trp Leu Leu Ile Ser Ala Pro Asn Ile Ile Pro Arg Phe Leu TCT TTG CTC

Cys Thr Leu Leu Gln Val Phe Leu Ser Gly Ile Trp Leu Thr Thr Ser Pro Pro Phe Ile Asp Lys Asp Ala His Ser Glu His Gly His Ile Ile Ile Ile Cys Asn Lys Gly Ser Ala Val Ala Phe His Cys Asn Leu Gly Tyr Leu Gly Ala Leu Ala Leu Val Ser Tyr Phe Met Ala Phe Leu Ser ', Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Ala Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His ' Ser Thr Lys Gly Lys Asn Met Val Ala Met Glu Val Phe Ser Ile Leu Ala Ser Ser Thr Ser Leu Leu Gly Ile Ile Phe Ala Pro Lys Cys Tyr 740 745 75p Leu Ile Leu Leu Arg Pro Glu Arg Asn Ser Leu Ser Tyr Ile Arg Asp T

Lys Thr Tyr Ala Lys Ser Ile Lys Pro Ser
(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 779 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
Met Arg Phe Ala Ile Glu Glu Ile Asn Ser Asn Pro His Leu Leu Pro Asn Thr Ser Leu Gly Phe Glu Ile Asn Asn Val Pro His Gly Gln Arg Tyr Thr Leu Val Lys Leu Phe Ser Ser Leu Ser Gly Ser Asn Tyr Asp Ile Pro Asn Tyr Ile Ser Ala Ser Glu Ser Asn Ser Ala Ala Val Leu

Thr Gly Pro Ser Trp Thr Ile Ser Glu Cys Val Gly Thr Leu Leu Asp Leu Tyr Lys Phe Pro Gln Leu Thr Phe Gly Pro Phe Asp Ser Leu Leu Ser Glu Gln Arg Arg Phe Ser Ser Leu Tyr Gln Val Ala Pro Lys Asp Thr Phe Leu Thr Pro Gly Ile Val Ser Leu Met Leu His Phe His Trp Asn Trp Val Gly Leu Phe Ile Ile Asp Asp Asp Lys Gly Ala Gln Thr Leu Ser Asp Leu Arg Asn Glu Met Asp Lys Asn Gly Val Cys Thr Ala Phe Val Glu Met Ile Pro Val Ile Lys Gly Ser Phe Phe Thr Lys Ser Trp Lys Asn His Val Gln Ile Leu Glu Ser Ser Ser Asn Val Ile Ile Ile Tyr Gly Asp Ser Asp Ser Leu Leu Ser Leu Ile Val Asn Ile Lys Gln Lys Leu Leu Thr Trp Lys Val Trp Val Leu Ile Ser Gln Trp Asp Val Ser Lys Phe Asp Asp Tyr Phe Met Val Asp Ser Leu His Gly Ala Leu Ile Phe Ser His His Arg Glu Glu Ile Pro Asn Phe Thr Asp Phe Met Gln Lys Tyr Asn Pro Ser Lys Tyr Pro Glu Asp Thr Tyr Leu His Val Leu Trp His Met Tyr Phe Asn Cys Ser Phe Val Lys Lys Asp Cys Lys Ile Val His Asn Cys Leu Pro Asn Ala Ser Leu Gly Phe Leu Pro Gly Asn Ile Phe Asp Met Ala Met Ser Glu Glu Ser Tyr Asn Val Tyr Asn Ala Val Tyr Ala Val Ala His Ser Leu His Glu Met Ile Leu Asn Gln Val Gln Phe Gln Thr His Glu Lys Gly Lys Lys Met Val Phe Phe Pro Trp Gln Leu His Pro Phe Leu Arg Glu Arg Gln Leu Ile Asn Gln Asn Gly Ala Asn Glu Asp Leu Asp Cys Thr Arg Lys Ser His Val Glu Tyr Asp Ile Leu Asn Phe Trp Asn Phe Pro Lys Gly Leu Gly Leu Asn Val Lys Val Gly Thr Phe Ser Pro Ser Ala Pro Lys Glu Gln Lys Leu Ser Ile Ser Ser Asn Met Ile Gln Trp Ala Thr Gly Ser Thr Glu Ile Pro Gln Ser Val Cys Ser Glu Ser Cys His Pro Gly Phe Arg Lys Thr His Gln Glu Gly Arg Val Ala Cys Cys Phe Asp Cys Ile Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asp Val Asp Gln Cys Val Lys Cys Pro Glu Thr His Tyr Ala Asn Ile Glu Lys Ile His Cys Leu Gln Lys Thr Val Thr Phe Leu Tyr Tyr Asp Asp Pro Leu Gly Lys Thr Leu Cys Phe Met Ser Leu Gly Phe Ser Ser Leu Thr Ala Ala Val Leu Val Val Phe Leu Lys Asn Arg Asp Thr Pro Ile Val Lys Ala Asn Asn Leu Ala Leu Ser Tyr Thr Leu Leu Ile Thr Leu Met Leu Cys Phe Leu Cys Pro Leu Leu Phe Ile 
Gly Arg Pro Ser Thr Ala Ser Cys Ile Leu Gln Gln Asn Ile Phe Gly Leu Leu Phe Thr Val Ala Leu Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Ile Ala Phe Lys Ile Thr Ser Pro Gly Arg Ile Arg Arg Trp Leu Leu Ile Ser Arg Ala Pro Asn Phe Ile Ile Pro Leu Cys Thr Leu Leu Gln Val Phe Leu Ser Gly Ile Trp Leu Thr Thr Ser Pro Pro Phe Ile Asp Lys Asp Ala His Ser Glu His Gly His Ile Ile Ile Ile Cys Asn Lys Gly Ser Ala Val Ala Phe His Cys Asn Leu Gly Tyr Leu Gly Ala Leu Ala Leu Val Ser Tyr Phe Met Ala Phe Leu Ser Arg Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Ala Phe Ser Met Leu Val Phe Cys Ser Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Asn Met Val Ala Met Glu Val Phe Ser Ile Leu Ala Ser Ser Thr Ser Leu Leu Gly Ile Ile Phe Ala Pro Lys Cys Tyr Leu Ile Leu Leu Arg Pro Glu Arg Asn Ser Leu Ser Tyr Ile Arg Asp Lys Thr Tyr Ala Lys Ser Ile Lys Pro Ser
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3307 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 112...1761 (D) OTHER INFORMATION: GoVN6 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
TAAGGCAGGA AAAAATGTTC ATTTTGATGG AAGTCTTCTT CTTCTTCCTT AACATTCCAC 60 Met Lys Leu Arg Asp Lys Asp Leu Ser Ile Thr Cys Ser Phe Ile Leu Glu Ala Val Gln Met Pro Thr Glu Asn Asp Tyr Phe Asn Gln Thr Leu Asn Ile 20 25 30 Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Ala Phe Ser Ile Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu AAA ACT
TAC

IleIleLys TyrProLeu GlyLeuCys AspGlyGln ThrThrLeuPro ThrProTyr LeuPheAsn GluIleTyr PheArgPro IleProAsnTyr PheCysAsn GluGluThr MetCysThr PheLeuLeu ThrGlyProHis TrpIleThr SerTyrSer PheTrpIle HisLeuAsn IlePheLeuSer ProSerMet AsnProLys AspThrSer LeuAlaLeu AlaMetValSer PheLeuLeu TyrPheLys TrpAsnTrp ValGlyLeu ValIleSerAsp AspAspGln GlyAsnGln PheLeuSer GluLeuLys LysGluSerLys IleLysGlu IleCysPhe AlaPheVal SerMetLeu AlaIleAspGlu IleSerPhe TyrHisLys ThrGluMet TyrTyrAsn GlnIleValMet SerSerThr AsnValIle IleIleTyr GlyLysThr GluSerIleIle GluLeuSer PheArgMet TrpGluSer ProValIle GlnArgIleTrp ValThrThr LysGluMet AsnPhePro ThrSerLys ArgAspLeuThr HisAspThr PheTyrGly ThrLeuThr PheLeuHis SerHisGlyGlu IleSerGly PheLysAsn PheValGln ThrTrpTyr HisLeuArgIle ThrAspLeu HisLeuVal MetProGlu TrpLysTyr PheAsnTyrGlu AlaSerAla SerAsnCys LysIleLeu LysAsnTyr SerSerSerAla Ser Leu Glu Leu Glu Gln Thr Phe Asp Val Phe Ser Trp Met Met Asp GAT TAT ATG CTC

Gly Ser Arg Ile Asn Ala Val Asn Ala Ala His Ala Asp Tyr Met Leu AAT CAC GCA GGG

His Glu Met Leu Leu Val Asp Asn Gln Ile Asp Asn Asn His Ala Gly AGT CAC TCC AAG

Lys Gly Ala Ser Cys Phe Lys Ile Asn Phe Leu Arg Ser His Ser Lys ACT CCT ATT AGA

Thr His Phe Asn Leu Gly Asp Arg Val Met Lys Glu Thr Pro Ile Arg CAA GAC ACT TCT

Glu Ile Leu Glu Tyr Asn Ile Phe His Trp Asn Phe Gln Asp Thr Ser GGT AAG TTC TTT

Gln His Ile Phe Val Lys Ile Gly Lys Ser Pro Tyr Gly Lys Phe Phe AGG TTT ATG GCT

Pro His Gly His His Leu Tyr Val Asp Ile Glu Leu Arg Phe Met Ala AGA ATG ACT AGT

Thr Gly Ser Lys Pro Ser Ser Val Cys Glu Asp Cys Arg Met Thr Ser AGA TTC GCA TTT

Pro Gly Tyr Arg Trp Lys Glu Gly Met Ala Cys Cys Arg Phe Ala Phe CCC CCT AAT ATG

Val Cys Ser Cys Glu Asn Ala Ile Ser Glu Thr Asn Pro Pro Asn Met GTG TGT GCC CGG

Asp Gln Cys Asn Pro Glu Tyr Gln Tyr Asn Thr Lys Val Cys Ala Arg ', GAC AAA TGC CAG AAT GTG ATG TTT CTA TAC AAA GAC 1701 ATT AAA AGC CCC

Asp Lys Cys Gln Asn Val Met Phe Leu Tyr Lys Asp Ile Lys Ser Pro GAC TGC TTT AAC

Leu Gly Asp Ser Leu His Ser Leu Leu Leu Cys Ile Asp Cys Phe Asn ACT GTGAAGCACC
ATGACACTCC
TATTGTGAAG
GCCAA

Ser Cys Cys Thr TATTAATCAC GTCTCTCTTG
TTCTGTTTTC TCTGCTCATT

ACAGAGCAAC CTGCATCTTA
CAGCAAATCA CATTTGGAAT

CTACAATTTT GGCAAAAACA
ATCACTGTGG TTCTGGCTTT

GAAGGTTGAG AAACTTCCTA
GTATTGGGTA CACTCAACTA

TGTTTCAATG TATTCTGTGT
GCAATCTGGC TAGCAGTTTC

ATGAACACAC TGAGTATGGC
CACATCATCA TTGTGTGCAA

(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 550 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Met Lys Leu Arg Asp Lys Asp Leu Ser Ile Thr Cys Ser Phe Ile Leu Glu Ala Val Gln Met Pro Thr Glu Asn Asp Tyr Phe Asn Gln Thr Leu Asn Ile Leu Lys Thr Thr Lys Asn His Lys Tyr Ala Leu Ala Leu Ala Phe Ser Ile Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Lys Tyr Pro Leu Gly Leu Cys Asp Gly Gln Thr Thr Leu Pro Thr Pro Tyr Leu Phe Asn Glu Ile Tyr Phe Arg Pro Ile Pro Asn Tyr Phe Cys Asn Glu Glu Thr Met Cys Thr Phe Leu Leu Thr Gly Pro His Trp Ile Thr Ser Tyr Ser Phe Trp Ile His Leu Asn Ile Phe Leu Ser Pro Ser Met Asn Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Phe Leu Leu Tyr Phe Lys Trp Asn Trp Val Gly Leu Val Ile Ser Asp Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Lys Ile Lys Glu Ile Cys Phe Ala Phe Val Ser Met Leu Ala Ile Asp Glu Ile Ser Phe Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Asn Val Ile Ile Ile Tyr Gly Lys Thr Glu Ser Ile Ile Glu Leu Ser Phe Arg Met Trp Glu Ser Pro Val Ile Gln Arg Ile Trp Val Thr Thr Lys Glu Met Asn Phe Pro Thr Ser Lys Arg Asp Leu Thr His Asp Thr Phe Tyr Gly Thr Leu Thr Phe Leu His Ser His Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Tyr His Leu Arg Ile Thr Asp Leu His Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu Ala Ser Ala Ser Asn Cys Lys Ile Leu Lys Asn Tyr Ser Ser 305 310 315 320 Ser Ala Ser Leu Glu Trp Leu Met Glu Gln Thr Phe Asp Met Val Phe Ser Asp Gly Ser Arg Asp Ile Tyr Asn Ala Val Asn Ala Met Ala His Ala Leu His Glu Met Asn Leu His Leu Val Asp Asn Gln Ala Ile Asp Asn Gly Lys Gly Ala Ser Ser His Cys Phe Lys Ile Asn Ser Phe Leu Arg Lys Thr His Phe Thr Asn Pro Leu Gly Asp Arg Val Ile Met Lys Glu Arg Glu Ile Leu Gln Glu Asp Tyr Asn Ile Phe His Thr Trp Asn Phe Ser Gln His Ile Gly Phe Lys Val Lys Ile Gly Lys Phe Ser Pro Tyr Phe Pro His Gly Arg His Phe His Leu Tyr Val Asp Met Ile Glu Leu Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Thr Glu Asp Cys Ser Pro Gly Tyr Arg Arg Phe Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Glu Asn Ala Ile Ser Asn Glu
Thr Asn Met Asp Gln Cys Val Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Lys Arg Asp Lys Cys Ile Gln Lys Asn Val Met Phe Leu Ser Tyr Lys Asp Pro Leu Gly Asp Asp Ser Cys Leu His Ser Leu Leu Phe Leu Cys Ile Asn Ser Cys Cys Thr
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3938 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 46...2424 (D) OTHER INFORMATION: GoVN7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

Met Ile Val Phe Phe Leu Leu Asn Ile Pro Leu Leu Met Ala Asn Ser Val Asp Pro Arg AAA GTC ATA GAT
ATA GAT

CysPhe TrpLysIle AsnLeuAsn Glu LysAsp AspLeu Asp Val Ile GTT CCT

ThrSer CysTyrPhe IleLeuGlu Ala GlnLeu MetGlu Lys Val Pro CTA ACC

AspTyr PheAsnGln ThrLeuAsn Val LysThr LysTyr Asn Leu Thr ATG ATA

ArgTyr AlaLeuAla LeuAlaPhe Thr AspGlu AsnArg Asn Met Ile ATT CAT

ProHis IleLeuPro AsnMetSer Leu IleLys ThrLeu Gly Ile His TTA CAA

HisCys AspGlyAsn IleProLeu Arg LeuAsn IlePhe Tyr Leu Gln GAA ATG

MetPro PheProAsn TyrGlyCys Asn GluThr CysSer Phe Glu Met TCT TTT

MetLeu MetGlyPro AsnLeuTrp Pro ValAsp PheIle His Ser Phe CAG TTC

LeuAsn IleLeuPhe ProHisPhe Leu IleSer GlyPro Phe Gln Phe TTT ATC

HisSer IlePheSer AspAsnGlu Gln ProTyr TyrGln Met Phe Ile GCA TCT

ThrPro LysAspThr SerLeuAla Leu MetVal PheIle Leu Ala Ser GTC GAT

TyrPhe AsnTrpAsn TrpValGly Leu LeuSer AsnAsp Glu Val Asp AAA CAC

GlyAsn GlnPheLeu ThrGluLeu Lys GluThr AsnThr Glu Lys His GCA GAG

IleCys PheAlaPhe ValAsnMet Met IleAsn AsnSer Ser Ala Glu CAA ATG

MetLys LysThrAsp MetTyrTyr Asn IleVal SerThr Ala Gln Met CCC ATT

AsnVal IleIleIle TyrGlyGlu Arg SerIle GluLeu Cys Pro Ile TTC AGA ACA TGG ACA TCT CCA GTC ATA CAG AGG ATA TGG GTT ACC AAA 921 Phe Arg Thr Trp Thr Ser Pro Val Ile Gln Arg Ile Trp Val Thr Lys Ser Glu Leu Tyr Phe Pro Thr Ser Lys Arg Asp Leu Ser His Gly Thr Phe Tyr Gly Thr Leu Ala Phe Gln Gln His His Asp Val Ile Ser Gly Phe Lys Asn PheVal Gln Thr Tyr His Leu Lys Ser Asp Leu Trp Met TAT TTA TTA AAGCCA GAG TGG TTC TTT GAA TAT GAA TCA GCA 1113 GGT ACC

Tyr Leu Leu LysPro Glu Trp Phe Phe Glu Tyr Glu Ser Ala Gly Thr AGT TCA

Ser Tyr Cys LysIle Leu Met Asn Ser Ser Asn Val Leu Glu Ser Ser 360 365 370 GAC AAT

Trp Leu Met GluGln Lys Phe Ile Ala Phe Asn Asp Ser His Asp Asn GCC CAT

Ser Ile Tyr AsnAla Val Tyr Met Ala His Ala Leu Glu Lys Ala His CAG AAA

Asn Leu Lys GlnIle Asp Asn Glu Ile Ser Tyr Gly Gly Ala Gln Lys CAC ATC

Ser Thr His CysLeu Lys Leu Ser Phe Leu Arg Thr His Phe His Ile GTG GTA

Thr Asn Pro PheGly Glu Arg Ile Met Lys Glu Arg Arg Val Val Val CAC CAA

Gln Glu Asp TyrAsp Ile Val Leu Gln Asn Cys Ser His Leu His Gln Arg Ile Lys Val Lys Ile Gly Gln Phe Ser Pro Tyr Phe Pro His Gly Gly Gln Phe His Leu Tyr Glu Asp Met Ile Asp Leu Ala Thr Gly Ser Arg Lys Met Pro Leu Ser Met Cys Ser Ala Asp Cys Arg Pro Gly Tyr Arg Lys Phe Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Asp Asn Glu Ile Ser Asn Glu Thr Thr Val Val Leu Trp ACT ATT AAT

ValPheVal LysHisHis Asp Pro Val LysAla Asn Arg Thr Ile Asn ATG CTC TTT

IleLeuSer TyrIleLeu Ile Ser Met PheCys Leu Cys Met Leu Phe CCT AGA ATC

SerPhePhe PheIleGly His Asn Gly ThrCys Leu Gln Pro Arg Ile TTC GTG ACA

GlnIleThr PheGlyIle Val Thr Ala ValSer Val Leu Phe Val Thr CTG TTT GAC

AlaLysThr IleThrVal Leu Ala Gln ValThr Thr Gly Leu Phe Asp GTA GGG TAC

ArgLysLeu ArgAsnPhe Leu Ser Thr ProAsn Ile Ile Val Gly Tyr TGC CTG TGG

ProIleCys SerLeuLeu Gln Thr Cys AlaIle Leu Ala Cys Leu Trp ATC GAA CAT

ValSerPro ProPheVal Asp Asp His SerGlu Gly His Ile Glu His GGA GTT TAC

IleIleIle ValCysAsn Lys Ser Met AlaPhe Cys Val Gly Val Tyr GCC GGA ATG

LeuGlyTyr LeuAlaPhe Leu Leu Ser PheThr Ala Phe Ala Gly Met ACA AAT TTC

LeuAlaLys AsnLeuPro Asp Phe Glu AlaLys Leu Thr Thr Asn Phe AGT TGG CTT

PheSerMet LeuValPhe Cys Val Ile ThrPhe Pro Val Ser Trp Leu GTC GTT ATT

TyrHisSer ThrLysGly Arg Met Ala ValGlu Phe Ser Val Val Ile ATG GGA GCA

IleLeuThr SerSerAla Gly Leu Cys ValPhe Pro Lys Met Gly Ala CCA AGA AAA

IleTyrIle IleLeuMet Lys Glu Ile LeuSer Arg Gln Pro Arg Lys TTTTAGAAAT
TCTGTCAAAT
GTACAGTTGT
T

GluLysSer ArgPhe CTCACTAGTT

CCATAAAATC

TGTAGTATTA

CAAGTACATT

ACAGGATTAC

GAATCAACAA

CAGAATACTG

GTAGAAGTTT

GAGCACCCTG

GAATACCAGC

ATACATAAGC

TCAGTGGAAG

GTGATGGTTT

GAGGAATTTG

TCAGTCTGTT

CCTGCCTGGA

GAACATGTAA

GTCTGTACAT

GATTTCCTCT

CACCGTAAAA

TATTAACATG

TGTACCTAAT

CACAAAATTC

ATAAATTTTC

(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 793 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
Met Ile Val Phe Phe Leu Leu Asn Ile Pro Leu Leu Met Ala Asn Ser Val Asp Pro Arg Cys Phe Trp Lys Ile Asn Leu Asn Glu Val Lys Asp Ile Asp Leu Asp Thr Ser Cys Tyr Phe Ile Leu Glu Ala Val Gln Leu Pro Met Glu Lys Asp Tyr Phe Asn Gln Thr Leu Asn Val Leu Lys Thr Thr Lys Tyr Asn Arg Tyr Ala Leu Ala Leu Ala Phe Thr Met Asp Glu Ile Asn Arg Asn Pro His Ile Leu Pro Asn Met Ser Leu Ile Ile Lys His Thr Leu Gly His Cys Asp Gly Asn Ile Pro Leu Arg Leu Leu Asn ' Gln Ile Phe Tyr Met Pro Phe Pro Asn Tyr Gly Cys Asn Glu Glu Thr Met Cys Ser Phe Met Leu Met Gly Pro Asn Leu Trp Pro Ser Val Asp Phe Phe Ile His Leu Asn Ile Leu Phe Pro His Phe Leu Gln Ile Ser Phe Gly Pro Phe His Ser Ile Phe Ser Asp Asn Glu Gln Phe Pro Tyr Ile Tyr Gln Met Thr Pro Lys Asp Thr Ser Leu Ala Leu Ala Met Val Ser Phe Ile Leu Tyr Phe Asn Trp Asn Trp Val Gly Leu Val Leu Ser Asp Asn Asp Glu Gly Asn Gln Phe Leu Thr Glu Leu Lys Lys Glu Thr His Asn Thr Glu Ile Cys Phe Ala Phe Val Asn Met Met Ala Ile Asn Glu Asn Ser Ser Met Lys Lys Thr Asp Met Tyr Tyr Asn Gln Ile Val Met Ser Thr Ala Asn Val Ile Ile Ile Tyr Gly Glu Arg Pro Ser Ile Ile Glu Leu Cys Phe Arg Thr Trp Thr Ser Pro Val Ile Gln Arg Ile Trp Val Thr Lys Ser Glu Leu Tyr Phe Pro Thr Ser Lys Arg Asp Leu Ser His Gly Thr Phe Tyr Gly Thr Leu Ala Phe Gln Gln His His Asp Val Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Trp Tyr His Leu Lys Ser Met Asp Leu Tyr Leu Leu Lys Pro Glu Trp Gly Phe Phe Glu Tyr Glu Thr Ser Ala Ser Tyr Cys Lys Ile Leu Met Ser Asn Ser Ser Asn Val Ser Leu Glu Trp Leu Met Glu Gln Lys Phe Asp Ile Ala Phe Asn Asp Asn Ser His Ser Ile Tyr Asn Ala Val Tyr Ala Met Ala His Ala Leu His Glu Lys Asn Leu Lys Gln Ile Asp Asn Gln Glu Ile Ser Tyr Gly Lys Gly Ala Ser Thr His Cys Leu Lys Leu His Ser Phe Leu Arg Thr Ile His Phe Thr Asn Pro Phe Gly Glu Arg Val Ile Met Lys Glu Arg Val Arg Val Gln Glu Asp Tyr Asp Ile Val His Leu Gln Asn Cys Ser Gln His Leu Arg Ile Lys Val Lys Ile Gly Gln Phe Ser Pro Tyr Phe Pro His Gly Gly Gln Phe His Leu Tyr Glu Asp Met Ile Asp Leu Ala Thr Gly 
Ser Arg Lys Met Pro Leu Ser Met Cys Ser Ala Asp Cys Arg Pro Gly Tyr Arg Lys Phe Trp Lys Glu Gly Met Ala Ala Cys Cys Phe Val Cys Ser Pro Cys Pro Asp Asn Glu Ile Ser Asn Glu Thr Thr Val Val Leu Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Ile Leu Ile Met Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly His Pro Asn Arg Gly Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Val Phe Thr Val Ala Val Ser Thr Val Leu Ala Lys Thr Ile Thr Val Leu Leu Ala Phe Gln Val Thr Asp Thr Gly Arg Lys Leu Arg Asn Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Thr Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Val Met Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Phe Leu Ala Leu Gly Ser Phe Thr Met Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ser Val Trp Ile Thr

Phe Leu Pro Val Tyr His Ser Thr Lys Gly Arg Val Met Val Ala Val Glu Ile Phe Ser Ile Leu Thr Ser Ser Ala Gly Met Leu Gly Cys Val Phe Ala Pro Lys Ile Tyr Ile Ile Leu Met Lys Pro Glu Arg Ile Leu Ser Lys Arg Gln Glu Lys Ser Arg Phe
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3359 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence (B) LOCATION: 59...2452 (D) OTHER INFORMATION: GoVNI3C

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

Met Val Ile Phe Phe Leu Leu Asn Ile Pro Phe Leu Leu Ala Asn Phe Met Asp Pro Arg Cys Phe Trp Lys Ile Asn Leu Asn Glu Ile Lys Asp Glu Val Leu Gly Met Thr Cys Ser Phe Ile Leu Glu Thr Val Gln Lys Thr Met Asp Lys Asp Tyr Phe Asn Gln Thr Leu Asn Val Leu Asn Thr Thr Thr Asn His Lys Tyr Ala Leu Ala Leu Ala Phe Thr Val Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Lys Tyr Asn Leu Gly His Cys Asp Gly Lys Thr Val Thr Thr Leu Ser Asp Leu Phe Asn Pro Asn Asn His Leu His Phe Pro Asn Tyr Leu Cys Asn Glu

AGG GAT TAT GTG TTT GGT TCT GCT TAC AGG ACC ACA TTG GAG AGC ATC 492

Gly Ile Met Cys Leu Val Leu Leu Thr Gly Pro His Trp Arg Ala Ser TTT CCT

Leu Tyr Leu Trp Ile Ser Val Tyr Val Tyr Leu Ser Pro His Phe Leu TGA ACA

Gln Leu Ser Tyr Gly Pro Phe Tyr Ser Ile Phe Ser Asp Asn Glu Gln AGC ATT

Tyr Pro Tyr Leu Tyr Gln Met Gly Pro Lys Asp Ser Ser Leu Ala Leu TGG GCT

Ala Met Val Ser Phe Ile Ile Tyr Phe Lys Trp Asn Trp Val Gly Leu GTT GAA

Phe Ile Ser Asp Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys CAT GAT

Lys Glu Ser Gln Thr Lys Asp Ile Cys Phe Ala Phe Val Asn Met Ile CTA CAA

Ser Val Ser Asp Val Ser Tyr Tyr His Lys Thr Glu Met Tyr Tyr Asn GGA AAC

Gln Ile Val Met Ser Ser Thr Lys Val Ile Ile Ile Tyr Gly Glu Thr AGT TAA

Asn Ser Ile Ile Glu Leu Ser Phe Arg Met Trp Ser Ser Pro Val Lys CAG TAA

Gln Arg Ile Trp Val Thr Thr Lys Gln Phe Asp Cys Pro Thr Ser Lys TCT ACA

Arg Asp Leu Thr His Gly Thr Phe Tyr Gly Thr Leu Thr Phe Leu His ACG GTA

His Tyr Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Arg Tyr GAA ATA

Asn Leu Arg Ser Thr Asp Leu Tyr Leu Val Met Pro Glu Trp Lys Tyr AAA CTA

Phe Asn Tyr Glu Ala Ser Ala Ser Asn Cys Lys Ile Leu Arg Asn Tyr TGA CAT

Leu Ser Asn Ile Ser Leu Glu Trp Leu Met Glu Gln Lys Phe Asp Met TGC CAT

Ser Phe Ser Asp Tyr Ser His Asn Ile Tyr Asn Ala Val Tyr Ala Ile

Ala His Ala Leu His Glu Lys Asn Leu Gln Glu Val Glu Asn Gln Ala Ile Asn Asn Ala Lys Gly Glu Asn Thr His Cys Leu Lys Leu Asn Ser Phe Leu Arg Lys Thr His Phe Thr Asn Ser Leu Gly Asn Arg Val Ile Met Lys Gln Arg Glu Val Val His Gly Asp Tyr Asn Ile Val His Met Trp Asn Phe Ser Gln Arg Leu Gly Ile Lys Val Lys Ile Gly Gln Phe Ser Pro His Phe Pro Gln Gly Gln Gln Leu His Leu Tyr Val Asp Met Thr Glu Leu Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys His Pro Gly Phe Arg Arg Ile Trp Lys Glu Glu Met Ala Ala Cys Cys Phe Val Cys Asn Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Met Val Val Phe Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn Arg Ala Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Phe Phe Thr Val Ala Ile Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Gln Leu Arg Ile Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile TAT TGA

Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp GGG CTC

Glu His Ser Glu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser GGC CTT

Ile Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe CAC ATT

Gly Ser Phe Thr Ile Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe CGC TGT

Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ala Val GGT CAT

Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val Met GAT GCT

Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu ACC AGA

Gly Cys Ile Phe Ala Pro Lys Val Tyr Ile Ile Leu Met Arg Pro Asp TGAAAAGGTA

Arg Asn Ser Ile His Lys Ile Arg Glu Lys Ser Tyr Phe AGTTACAGAG

GAAGTACCAT

GCTTAGTATC

TTTGCTTTCA

ATATAACCTT

TAACTAAAAA

TACTTGACAG

GCTGAAAATG

ATCTGAGAAC

CCATAGGAAT

CCACTAACAA

ATTGCCTGGT

GGATTGGGGA

TATTGGATGA

AAAAAAA
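
As a consistency check on the listing, the Coding Sequence LOCATION fields can be compared with the lengths of the translated proteins. The following is a minimal Python sketch and is not part of the original filing. The function name `cds_codon_count` is hypothetical, and the sketch assumes that LOCATION ranges are 1-based, inclusive, and cover whole codons only.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a 1-based, inclusive nucleotide range.

    Assumption (not stated in the listing): the LOCATION field spans
    complete codons, so its length must be a multiple of three.
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not precede start")
    if length % 3 != 0:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return length // 3


# LOCATION values copied from the FEATURE entries of the listing.
codons_seq47 = cds_codon_count(59, 2452)  # SEQ ID NO:47
codons_seq49 = cds_codon_count(3, 2087)   # SEQ ID NO:49
print(codons_seq47, codons_seq49)
```

Under these assumptions the range 59...2452 of SEQ ID NO:47 corresponds to 798 codons, and the range 3...2087 of SEQ ID NO:49 corresponds to 695 codons, the latter agreeing with the 695 amino acids stated for SEQ ID NO:50.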

(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 798 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

Met Val Ile Phe Phe Leu Leu Asn Ile Pro Phe Leu Leu Ala Asn Phe Met Asp Pro Arg Cys Phe Trp Lys Ile Asn Leu Asn Glu Ile Lys Asp Glu Val Leu Gly Met Thr Cys Ser Phe Ile Leu Glu Thr Val Gln Lys Thr Met Asp Lys Asp Tyr Phe Asn Gln Thr Leu Asn Val Leu Asn Thr Thr Thr Asn His Lys Tyr Ala Leu Ala Leu Ala Phe Thr Val Asp Glu Ile Asn Arg Asn Pro Asp Leu Leu Pro Asn Met Ser Leu Ile Ile Lys Tyr Asn Leu Gly His Cys Asp Gly Lys Thr Val Thr Thr Leu Ser Asp Leu Phe Asn Pro Asn Asn His Leu His Phe Pro Asn Tyr Leu Cys Asn Glu Gly Ile Met Cys Leu Val Leu Leu Thr Gly Pro His Trp Arg Ala Ser Leu Tyr Leu Trp Ile Ser Val Tyr Val Tyr Leu Ser Pro His Phe Leu Gln Leu Ser Tyr Gly Pro Phe Tyr Ser Ile Phe Ser Asp Asn Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Gly Pro Lys Asp Ser Ser Leu Ala Leu Ala Met Val Ser Phe Ile Ile Tyr Phe Lys Trp Asn Trp Val Gly Leu Phe Ile Ser Asp Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Gln Thr Lys Asp Ile Cys Phe Ala Phe Val Asn Met Ile Ser Val Ser Asp Val Ser Tyr Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Lys Val Ile Ile Ile Tyr Gly Glu Thr Asn Ser Ile Ile Glu Leu Ser Phe Arg Met Trp Ser Ser Pro Val Lys Gln Arg Ile Trp Val Thr Thr Lys Gln Phe Asp Cys Pro Thr Ser Lys Arg Asp Leu Thr His Gly Thr Phe Tyr Gly Thr Leu Thr Phe Leu His His Tyr Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Arg Tyr Asn Leu Arg Ser Thr Asp Leu Tyr Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu Ala Ser Ala Ser Asn Cys Lys Ile Leu Arg Asn Tyr Leu Ser Asn Ile Ser Leu Glu Trp Leu Met Glu Gln Lys Phe Asp Met Ser Phe Ser Asp Tyr Ser His Asn Ile Tyr Asn Ala Val Tyr Ala Ile Ala His Ala Leu His Glu Lys Asn Leu Gln Glu Val Glu Asn Gln Ala Ile Asn Asn Ala Lys Gly Glu Asn Thr His Cys Leu Lys Leu Asn Ser Phe Leu Arg Lys Thr His Phe Thr Asn Ser Leu Gly Asn Arg Val Ile Met Lys Gln Arg Glu Val Val His Gly Asp Tyr Asn Ile Val His Met Trp Asn Phe Ser Gln Arg Leu Gly Ile Lys Val Lys Ile Gly Gln Phe Ser Pro His Phe Pro Gln Gly Gln Gln Leu His Leu Tyr Val Asp Met Thr Glu Leu 
Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys His Pro Gly Phe Arg Arg Ile Trp Lys Glu Glu Met

Ala Ala Cys Cys Phe Val Cys Asn Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Met Val Val Phe Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn Arg Ala Thr Cys Ile Leu Gln Gln Ile Thr Phe Gly Ile Phe Phe Thr Val Ala Ile Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Gln Leu Arg Ile Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Ile Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe Gly Ser Phe Thr Ile Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ala Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Ala Pro Lys Val Tyr Ile Ile Leu Met Arg Pro Asp Arg Asn Ser Ile His Lys Ile Arg Glu Lys Ser Tyr Phe
(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3012 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 3...2087 (D) OTHER INFORMATION: GoVNI3B
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

Val Tyr Leu Ser Pro His Phe Leu Gln Leu Ser Tyr Gly Pro Phe Tyr Ser Ile Phe Ser Asp Asn Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Gly Pro Lys Asp Ser Ser Leu Ala Leu Ala Met Val Ser Phe Ile Ile AAG GAT CAA

Tyr Phe Lys Trp Asn Trp Val Gly Leu Phe Ile Ser Asp Asp Asp Gln CAA AAG GAT

Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Gln Thr Lys Asp

ATT TGC TTT GCC TTT GTG AAC ATG ATA TCA GTC AGT GAT GTT TCA TAC 287

Ile Cys Phe Ala Phe Val Asn Met Ile Ser Val Ser Asp Val Ser Tyr AAA TCC ACA

Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr ATT TTG AGC

Lys Val Ile Ile Ile Tyr Gly Glu Thr Asn Ser Ile Ile Glu Leu Ser ATG ACC ACA

Phe Arg Met Trp Ser Ser Pro Val Lys Gln Arg Ile Trp Val Thr Thr TTT GGC ACA

Lys Gln Phe Asp Cys Pro Thr Ser Lys Arg Asp Leu Thr His Gly Thr GGG TCT GGC

Phe Tyr Gly Thr Leu Thr Phe Leu His His Tyr Gly Glu Ile Ser Gly AAT GAT TTA

Phe Lys Asn Phe Val Gln Thr Arg Tyr Asn Leu Arg Ser Thr Asp Leu GTA TCA GCA

Tyr Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu Ala Ser Ala TGT CTG GAA

Ser Asn Cys Lys Ile Leu Arg Asn Tyr Leu Ser Asn Ile Ser Leu Glu ATG AGT CAC

Trp Leu Met Glu Gln Lys Phe Asp Met Ser Phe Ser Asp Tyr Ser His TAC GAG AAA

Asn Ile Tyr Asn Ala Val Tyr Ala Ile Ala His Ala Leu His Glu Lys CAA GGA GAA

Asp Leu Gln Glu Phe Glu Asn Gln Ala Ile Asn Asn Ala Lys Gly Glu CAC CAC TTC

Asn Thr His Cys Leu Lys Leu Asn Ser Phe Leu Arg Lys Thr His Phe TCT GTA GTG

Thr Asn Ser Leu Gly Asn Arg Val Ile Met Lys Gln Arg Glu Val Val GAC CGC CTT

His Gly Asp Tyr Asn Ile Val His Met Trp Asn Phe Ser Gln Arg Leu TTT

Gly Ile Lys Val Lys Ile Gly Gln Phe Ser Pro His Phe Pro Gln Gly GCT

Gln Gln Leu His Leu Tyr Val Asp Met Thr Glu Leu Ala Thr Gly Ser CAT

Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys His Pro Gly Phe TTT

Arg Arg Ile Trp Lys Glu Glu Met Ala Ala Cys Cys Phe Val Cys Asn ATG

Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys AAG

Ala Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys CCC

Ile Gln Lys Gly Val Ile Val Leu Ser Tyr Glu Asp Pro Leu Gly Met ACA

Ala Leu Ala Leu Ile Ala Phe Cys Phe Ser Ala Phe Thr Val Val Val GTG

Phe Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn ATG

Asn Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser Leu Met Phe Cys Phe GCA

Leu Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn Arg Ala Thr Cys Ile GCT

Leu Gln Gln Ile Thr Phe Gly Ile Phe Phe Thr Val Ala Ile Ser Thr AAA

Val Leu Ala Lys Thr Ile Thr Val Val Leu Ala Phe Lys Val Thr Asp ACA

Pro Gly Arg Gln Leu Arg Ile Phe Leu Val Ser Gly Thr Pro Asn Tyr TGT

Ile Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp CAC

Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu
His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Ile Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe Gly Ser Phe Thr Ile Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ala Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Ala Pro Lys Val Tyr Ile Ile Leu Met Arg Pro Asp Arg Asn Ser Ile His Lys Ile Arg Glu Lys Ser Tyr Phe CATATATCTA

ATGTACTGAT

TTCTGGAAGA

TGAGGATTTC

ACATAGAAAG

TCAACAAAGA

AATACTGTCT

TTCTTTATCT

TCCTGTGGTT

TACAAAGCAG

AGTCAGCCTA

TCCAGCCTCA

AGGTCTGGGG

AGAATGAATC

19~AAAAAAAAA

(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 695 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
Val Tyr Leu Ser Pro His Phe Leu Gln Leu Ser Tyr Gly Pro Phe Tyr Ser Ile Phe Ser Asp Asn Glu Gln Tyr Pro Tyr Leu Tyr Gln Met Gly Pro Lys Asp Ser Ser Leu Ala Leu Ala Met Val Ser Phe Ile Ile Tyr Phe Lys Trp Asn Trp Val Gly Leu Phe Ile Ser Asp Asp Asp Gln Gly Asn Gln Phe Leu Ser Glu Leu Lys Lys Glu Ser Gln Thr Lys Asp Ile Cys Phe Ala Phe Val Asn Met Ile Ser Val Ser Asp Val Ser Tyr Tyr His Lys Thr Glu Met Tyr Tyr Asn Gln Ile Val Met Ser Ser Thr Lys Val Ile Ile Ile Tyr Gly Glu Thr Asn Ser Ile Ile Glu Leu Ser Phe Arg Met Trp Ser Ser Pro Val Lys Gln Arg Ile Trp Val Thr Thr Lys Gln Phe Asp Cys Pro Thr Ser Lys Arg Asp Leu Thr His Gly Thr Phe Tyr Gly Thr Leu Thr Phe Leu His His Tyr Gly Glu Ile Ser Gly Phe Lys Asn Phe Val Gln Thr Arg Tyr Asn Leu Arg Ser Thr Asp Leu Tyr Leu Val Met Pro Glu Trp Lys Tyr Phe Asn Tyr Glu Ala Ser Ala Ser Asn Cys Lys Ile Leu Arg Asn Tyr Leu Ser Asn Ile Ser Leu Glu Trp Leu Met Glu Gln Lys Phe Asp Met Ser Phe Ser Asp Tyr Ser His Asn Ile Tyr Asn Ala Val Tyr Ala Ile Ala His Ala Leu His Glu Lys Asp Leu Gln Glu Phe Glu Asn Gln Ala Ile Asn Asn Ala Lys Gly Glu Asn Thr His Cys Leu Lys Leu Asn Ser Phe Leu Arg Lys Thr His Phe Thr Asn Ser Leu Gly Asn Arg Val Ile Met Lys Gln Arg Glu Val Val His Gly Asp Tyr Asn Ile Val His Met Trp Asn Phe Ser Gln Arg Leu Gly Ile Lys Val Lys Ile Gly Gln Phe Ser Pro His Phe Pro Gln Gly Gln Gln Leu His Leu Tyr Val Asp Met Thr Glu Leu Ala Thr Gly Ser Arg Lys Met Pro Ser Ser Val Cys Ser Ala Asp Cys His Pro Gly Phe Arg Arg Ile Trp Lys Glu Glu Met Ala Ala Cys Cys Phe Val Cys Asn Pro Cys Pro Glu Asn Glu Ile Ser Asn Glu Thr Asn Met Asp Gln Cys Ala Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys Ile Gln Lys Gly Val Ile Val Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Ile Ala Phe Cys Phe Ser Ala Phe Thr Val Val Val Phe Trp Val Phe Val Lys His His Asp Thr Pro Ile Val Lys Ala Asn Asn Arg Ile Leu Ser Tyr Leu Leu Ile Val Ser Leu Met Phe Cys Phe Leu Cys Ser Phe Phe Phe Ile Gly Tyr Pro Asn Arg Ala Thr Cys Ile Leu Gln Gln Ile Thr 
Phe Gly Ile Phe Phe Thr Val Ala Ile Ser Thr Val Leu Ala Lys Thr Ile Thr Val Val Leu Ala Phe Lys Val Thr Asp Pro Gly Arg Gln Leu Arg Ile Phe Leu Val Ser Gly Thr Pro Asn Tyr Ile

Ile Pro Ile Cys Ser Leu Leu Gln Cys Ile Leu Cys Ala Ile Trp Leu Ala Val Ser Pro Pro Phe Val Asp Ile Asp Glu His Ser Glu His Gly His Ile Ile Ile Val Cys Asn Lys Gly Ser Ile Thr Ala Phe Tyr Cys Val Leu Gly Tyr Leu Ala Cys Leu Ala Phe Gly Ser Phe Thr Ile Ala Phe Leu Ala Lys Asn Leu Pro Asp Thr Phe Asn Glu Ala Lys Phe Leu Thr Phe Ser Met Leu Val Phe Cys Ala Val Trp Val Thr Phe Leu Pro Val Tyr His Ser Thr Lys Gly Lys Val Met Val Ala Val Glu Ile Phe Ser Ile Leu Ala Ser Ser Ala Gly Met Leu Gly Cys Ile Phe Ala Pro Lys Val Tyr Ile Ile Leu Met Arg Pro Asp Arg Asn Ser Ile His Lys Ile Arg Glu Lys Ser Tyr Phe
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 435 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

(2) INFORMATION FOR SEQ ID NO:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 145 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
Gln Thr Leu Ser Tyr Thr Leu Leu Val Ser Leu Thr Leu Cys Phe Leu Ser Ser Ser Leu Phe Ile Gly Arg Pro Ser Pro Ala Thr Cys Leu Leu Ser Gln Thr Thr Phe Ala Ala Val Phe Thr Val Ala Val Phe Phe Cys Arg Ala Phe Gln Ala Ile Arg Pro Glu Ser Arg Ile Arg Lys Trp Met Gly Pro Gln Lys Thr Asn Ser Val Val Phe Leu Cys Ser Phe Thr Gln Val Thr Leu Cys Gly Ile Trp Leu Gly Thr Glu Pro Pro Phe Val Asn Lys Asp Pro Gln Phe Met Pro Gly Tyr Ile Ile Ile Gln Cys Asn Glu Gly Ser Val Thr Ala Phe Tyr Ser Val Leu Gly Tyr Leu Gly Phe Leu Val Leu Gly Ser Leu Ala Val Ala Phe Leu Ala Arg Asn Leu Pro Asp Ala
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 474 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 338 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 182 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

GACAAAATAT GAATTCT

(2) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

GTACTCTTCA GAATTCT

(2) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

Asn Met Asp Gln Cys Ala Asn Cys Pro Glu Tyr Gln Tyr Ala Asn Thr Glu Lys Asn Lys Cys Ile Gln Lys Gly Val Ile Val Leu Ser Tyr Glu Asp Pro Leu Gly Met Ala Leu Ala Leu Ile Ala Phe Cys Phe Ser Ala Phe Thr Val
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1079 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
Met Ala Ser Tyr Ser Cys Cys Leu Ala Leu Leu Ala Leu Ala Trp His Ser Ser Ala Tyr Gly Pro Asp Gln Arg Ala Gln Lys Lys Gly Asp Ile

Ile Leu Gly Gly Leu Phe Pro Ile His Phe Gly Val Ala Ala Lys Asp Gln Asp Leu Lys Ser Arg Pro Glu Ser Val Glu Cys Ile Arg Tyr Asn Phe Arg Gly Phe Arg Trp Leu Gln Ala Met Ile Phe Ala Ile Glu Glu Ile Asn Ser Ser Pro Ser Leu Leu Pro Asn Met Thr Leu Gly Tyr Arg Ile Phe Asp Thr Cys Asn Thr Val Ser Lys Ala Leu Glu Ala Thr Leu Ser Phe Val Ala Gln Asn Lys Ile Asp Ser Leu Asn Leu Asp Glu Phe Cys Asn Cys Ser Glu His Ile Pro Ser Thr Ile Ala Val Val Gly Ala Thr Gly Ser Gly Val Ser Thr Ala Val Ala Asn Leu Leu Gly Leu Phe Tyr Ile Pro Gln Val Ser Tyr Ala Ser Ser Ser Arg Leu Leu Ser Asn Lys Asn Gln Tyr Lys Ser Phe Leu Arg Thr Ile Pro Asn Asp Glu His Gln Ala Thr Ala Met Ala Asp Ile Ile Glu Tyr Phe Arg Trp Asn Trp Val Gly Thr Ile Ala Ala Asp Asp Asp Tyr Gly Arg Pro Gly Ile Glu Lys Phe Arg Glu Glu Ala Glu Glu Arg Asp Ile Cys Ile Asp Phe Ser Glu Leu Ile Ser Gln Tyr Ser Asp Glu Glu Glu Ile Gln Gln Val Val Glu Val Ile Gln Asn Ser Thr Ala Lys Val Ile Val Val Phe Ser Ser Gly Pro Asp Leu Glu Pro Leu Ile Lys Glu Ile Val Arg Arg Asn Ile Thr Gly Arg Ile Trp Leu Ala Ser Glu Ala Trp Ala Ser Ser Ser Leu Ile Ala Met Pro Glu Tyr Phe His Val Val Gly Gly Thr Ile Gly Phe Gly Leu Lys Ala Gly Gln Ile Pro Gly Phe Arg Glu Phe Leu Gln Lys Val His Pro Arg Lys Ser Val His Asn Gly Phe Ala Lys Glu Phe Trp Glu Glu Thr Phe Asn Cys His Leu Gln Glu Gly Ala Lys Gly Pro Leu Pro Val Asp Thr Phe Val Arg Ser His Glu Glu Gly Gly Asn Arg Leu Leu Asn Ser Ser Thr Ala Phe Arg Pro Leu Cys Thr Gly Asp Glu Asn Ile Asn Ser Val Glu Thr Pro Tyr Met Asp Tyr Glu His Leu Arg Ile Ser Tyr Asn Val Tyr Leu Ala Val Tyr Ser Ile Ala His Ala Leu Gln Asp Ile Tyr Thr Cys Leu Pro Gly Arg Gly Leu Phe Thr Asn Gly Ser Cys Ala Asp Ile Lys Lys Val Glu Ala Trp Gln Val Leu Lys His Leu Arg His Leu Asn Phe Thr Asn Asn Met Gly Glu Gln Val Thr Phe Asp Glu Cys Gly Asp Leu Val Gly Asn Tyr Ser Ile Ile Asn Trp His Leu Ser Pro Glu Asp Gly Ser Ile Val Phe Lys Glu Val Gly Tyr Tyr Asn Val Tyr Ala Lys Lys Gly Glu Arg Leu Phe Ile Asn Glu Glu Lys Ile Leu Trp Ser Gly
Phe Ser Arg Glu Val Pro Phe Ser Asn Cys Ser Arg Asp Cys Gln Ala Gly Thr Arg Lys Gly Ile Ile Glu Gly Glu Pro Thr Cys Cys Phe Glu Cys Val Glu Cys Pro Asp Gly Glu Tyr Ser Gly Glu Thr Asp Ala Ser Ala Cys Asp Lys Cys Pro Asp Asp Phe Trp Ser Asn Glu Asn His Thr Ser Cys Ile Ala Lys Glu Ile Glu Phe Leu Ala Trp Thr Glu Pro Phe Gly Ile Ala Leu Thr Leu Phe Ala Val Leu Gly Ile Phe Leu Thr Ala Phe Val Leu Gly Val Phe Ile Lys Phe Arg Asn Thr Pro Ile Val Lys Ala Thr Asn Arg Glu Leu Ser Tyr Leu Leu Leu Phe Ser Leu Leu Cys Cys Phe Ser Ser Ser Leu Phe Phe Ile Gly Glu Pro Gln Asp Trp Thr Cys Arg Leu Arg Gln Pro Ala Phe Gly Ile Ser Phe Val Leu Cys Ile Ser Cys Ile Leu Val Lys Thr Asn Arg Val Leu Leu Val Phe Glu Ala Lys Ile Pro Thr Ser Phe His Arg Lys Trp Trp Gly Leu Asn Leu Gln Phe Leu Leu Val Phe Leu Cys Thr Phe Met Gln Ile Leu Ile Cys Ile Ile Trp Leu Tyr Thr Ala Pro Pro Ser Ser Tyr Arg Asn His Glu Leu Glu Asp Glu Ile Ile Phe Ile Thr Cys His Glu Gly Ser Leu Met Ala Leu Gly Ser Leu Ile Gly Tyr Thr Cys Leu Leu Ala Ala Ile Cys Phe Phe Phe Ala Phe Lys Ser Arg Lys Leu Pro Glu Asn Phe Asn Glu Ala Lys Phe Ile Thr Phe Ser Met Leu Ile Phe Phe Ile Val Trp Ile Ser Phe Ile Pro Ala Tyr Ala Ser Thr Tyr Gly Lys Phe Val Ser Ala Val Glu Val Ile Ala Ile Leu Ala Ala Ser Phe Gly Leu Leu Ala Cys Ile Phe Phe Asn Lys Val Tyr Ile Ile Leu Phe Lys Pro Ser Arg Asn Thr Ile Glu Glu Val Arg Ser Ser Thr Ala Ala His Ala Phe Lys Val Ala Ala Arg Ala Thr Leu Arg Arg Pro Asn Ile Ser Arg Lys Arg Ser Ser Ser Leu Gly Gly Ser Thr Gly Ser Ile Pro Ser Ser Ser Ile Ser Ser Lys Ser Asn Ser Glu Asp Arg Phe Pro Gln Pro Glu Arg Gln Lys Gln Gln Gln Pro Leu Ser Leu Thr Gln Gln Glu Gln Gln Gln Gln Pro Leu Thr Leu His Pro Gln Gln Gln Gln Gln Pro Gln Gln Pro Arg Cys Lys Gln Lys Val Ile Phe Gly Ser Gly Thr Val Thr Phe Ser Leu Ser Phe Asp Glu Pro Gln Lys Asn Ala Met Ala His Arg Asn Ser Met Arg Gln Asn Ser Leu Glu Ala Gln Arg Ser Asn Asp Thr Leu Gly Arg His Gln Ala Leu Leu Pro Leu Gln Cys Ala Asp Ala Asp Ser Glu
Met Thr Ile Gln Glu Thr Gly Leu Gln Gly Pro Met Val Gly Asp His Gln Pro Glu Met Glu Ser Ser Asp Glu Met Ser Pro Ala Leu Val Met Ser Thr Ser Arg Ser Phe Val Ile Ser Gly Gly Gly Ser Ser Val

Thr Glu Asn Val Leu His Ser
(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 3...3 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 12...12 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 15...15 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 18...18 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 6...6 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 9...9 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 12...12 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 18...18 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 21...21 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 3...3 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 9...9 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 12...12 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 13...13 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 24...24 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:

(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 2...2 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 5...5 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 8...8 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 11...11 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 14...14 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 20...20 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 26...26 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 29...29 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:

(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 3...3 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 6...6 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 9...9 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 12...12 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 16...16 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 24...24 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
A~SNYTNR TNTTYNGYTT YYTNTG 26
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 2...2 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 5...5 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 11...11 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 17...17 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 20...20 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 23...23 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
RNATNSWRAA NAYYTCNACN RCNACCAT 28
(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 6...6 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 9...9 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 12...12 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 15...15 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 21...21 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ix) FEATURE:
(A) NAME/KEY: Modified Base (B) LOCATION: 3...3 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 6...6 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 12...12 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 15...15 (D) OTHER INFORMATION: Inosine (A) NAME/KEY: Modified Base (B) LOCATION: 24...24 (D) OTHER INFORMATION: Inosine (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
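
The primer entries above pair each degenerate sequence with a FEATURE table of inosine positions, so the two can be cross-checked wherever the sequence text survives. The following is a minimal Python sketch, not part of the original filing. The helper name `n_positions` is hypothetical, and the sketch assumes that inosine is written as `N` in the printed sequence and that positions are counted from 1 with spaces ignored. It uses the 28-base primer of SEQ ID NO:65 as printed in this listing.

```python
def n_positions(sequence: str) -> list[int]:
    """Return the 1-based positions of 'N' in a primer, ignoring spaces.

    Assumption: 'N' marks the inosine positions listed under FEATURE.
    """
    bases = sequence.replace(" ", "").upper()
    return [index for index, base in enumerate(bases, start=1) if base == "N"]


# Sequence and inosine locations copied from the SEQ ID NO:65 entry.
seq65 = "RNATNSWRAA NAYYTCNACN RCNACCAT"
listed_inosines = [2, 5, 11, 17, 20, 23]

primer_length = len(seq65.replace(" ", ""))
positions_match = n_positions(seq65) == listed_inosines
print(primer_length, positions_match)
```

Under these assumptions the printed primer has the stated length of 28 bases and its `N` positions coincide with the six inosine locations in the FEATURE table.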

(2) INFORMATION FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2550 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

TGAAGTTTTC

ATAGTGAAGA

ATGAACCTAT

AATATGAGTT

TTTTACCCAA

GAGTTATGGA

GTTATTTAGA

AACTGGCAAT

GCGACCATGA

ATGGCATGGT

ATGATGACCA

TCTGTTTAGC

CAATATATGA

TGAACTCTAC

GGATCACAAC

TCCATGGGAT

TGCAAACAAT

ATTATTTTAA

ACACCTTGGA

ATTTGTACAA

TAGAGTCTCA

CCTTGATGAA

GGGAAAATCA

GATTAAAAGT

TATCTGATGA

GTAGTGTGGC

GCTTTGATTG

GTGTGAGGTG

CTGTATCATT

CCTTCTCAGC

CTGTGAAGGC

TTCTCTGCTC

CCACATTTGG

TGGTCATGGC

GGGCACCTAA

GGTTGGTCAC

TCATTCTTTG

CCTTGGCTCT

ATGAAGCCAA

TCCCTGTCTA

TGGCTTCTAG

TTAGACCAGA

(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2424 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2409 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

TTCACTTTTA

TGCTACTGAT

CATCATTGGT

AAATGGACAT

TCTTACAGGA

GGTTTTCTTT

TCAAGTAGCC

TAGATGGACT

AGATTTAAGA

AGAAAACATG

TTTAGCAAAA

AAGATGGGAA

CACAAATAAA

CAGATTTGAG

AGTAGATATT

CAGCAGTAAA

CTATGATATG

CCACACCTAC

AAGATTTTTC

CCCTGTTGGA

TTTCCTCATT

ACCTTGTTTT

AGGAACATCA

AATTCATCAG

GGTTTCCAAT

CATAGAGAAA

GGGGATAGCT

CACATTTTTG

CATCCTGCTC

AAACCAGGTC

TTCTACAGTG

AAGAAGAATG

CCTAATCCAA

AGATATACAA

CTTCCATGTT

CTTGGCTAGG

GGTGTTCTGC

CATGGTGGTT

CTTTGTCCCA

CAAAGATAAA

(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2556 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

CACTTCTCAT

TAACGGATGA

TTGAAAAAGA

ATGCTTTGGC

ATATGTCTTT

CACCATATTT

AGAGTATGTG

AGTACCTGGA

CCATCTTCAG

CTCTAGCATT

TCATCCCAGA

ACAAAGAAAT

AAAAAACTGA

ATGGAGAAAC

AGAGAATATG

ATGACACATT

AAAATTTTGT

AGTGGAAATA

CATCTGATGC

ATAGTCATAA

TGCAACAGGC

AGGTAAACTC

TGAAGCAAAG

AACACCTTGG

ACTCTCACTT

CTGTGTGCAG

CCTGCTGTTT

ATCAATGCGT

AGAAAGGTGT

CCTTCTGCTT

CTCCTATTGT

TCTGTTTTCT

AGCAAATCAC

TCACTGTGGT

TATCAGGGAC

CAATCTGGCT

ACATCATCAT

TGGCCTGCCT

CATTCAATGA

CCTTCCTCCC

CTATCTTGGC

TTTTAATGAG

(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2169 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

GGATGAATCT

GCTTTCCTAT

TCAGATGGCC

GAAATGGAAT

AGAGTTGAAG

TGTTGATGAA

ATTAACAAAT

AATGTGGGAA

TACCAGTAAG

CCATGGTGAG

AGATTTATAT

TTGTAAAATA

GCTTGACATG

CCATGCCCTC

AGGAGCCAGT

TCCTCTTGGG

TGTTCACTTT

CCCATATTTA

AGGAAGAAGA

ATTATGGAAG

AATTTCTAAT

CACAGAACAG

GGGGATGGCA

TGTCTTTGTG

TCTATTACTC

AAACAAAGTC

TTCCACAGTT

AAGAAGATTG

CCTACTCCAA

TGATGAACAC

ATTCTACTGT

CTTGGCCAAG

AGTGTTCTGC

CATGGTTGCT

TTTTGTACCC

CAGGGAAAAA

(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1889 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

GCCATTGTTT

GACCACAAAG

TGTACGGCTT

GAAAATATGG

CTAATGCGAA

CCCCATATTA

TTTAAGCACA

AAAAAATACC

TTTTCTGATA

TTACCTAGTC

GTGTACGCTG

TGTGAAAATG

ATTGAGGTGA

CTTAACCTCT

GCAAATGCTC

ATATTTTCAG

GTAACCCTGG

ATTTCTAATG

ACAGAGAAGA

GGGATGGCTC

ATATTTGTGA

ACTTTGCTCA

AACACAGTTG

GCCACTGTGT

AGAATGGTAA

CTGATCCAAC

GATGCTCATA

TTCCACTCTG

TTGTCAAGAA

GTATTCTTCT

ATGGTCGCCG

(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1889 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

ATGGCGACGA AGGACACATC
TCTTTCACTT GCCATTGTTT

(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 270 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:

(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1308 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:

TCTCATCTTG

TAATGACGGA

TGAAGATAAT

TCTTCTCGTA

CATAACTTTG

CCAAGCATAT

TGATTCATGT

GCACTCTTCG

CCGGCTGCCC

CTCCTTGATG

TTTCACTTTA GATGGACTTG GATAGGAATG GTCATCTCAG ATGATGACCA 660 GGGTATTCAG

TTTTGTTAAT

TCAACAAATT

TCTAGAAGTA

CTCACAATGG

TATCACTTTT

GAACACTGCC

TTGTTCAATA

ATGGACATCA

TGCTGTTTAT

GAAAAAGGCA

(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1296 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

TCTCATCTTG

TAGTGATGGA

TGAAGATAAT

TCTTCTGGTA

CATAACTTTG

TCAATCATAT

TGATTCATGT

AACATAGGCC TTACAGGACC ATCATGGAAA AAATCCTAA AACTGGCAAT 480 GGATTCTTCA

CCGGCTGCCC

CTCCTTGATG

GGGTATTCAG

TTTTGTTAAT

TAAACAAATT

TCTAGAAGTA

CTCACAATGG

TATCACTTTT

GAACACTGCC

TTGTTCAATC

TCTAAGAACA GCAGTAAAAT GGATCTT'!'TT ACATCCAACA ACACATTGGA1140 ATGGACAGCA

CTGCACAACT ATGATATGGC CATGAGTGAT GAAiGGTTACA ATTTGTATAA1200 TGCTGTTTAT

GAAAAAGGTA

GAACACAACA GATATTTCAC TGTTTGTCAG CAGATA 1296
(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1521 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

TCTCATCTTG

TAGTGATGGA

TGAAGATAAT

TCTTCTGGTA

CATAACTTTG

TCAATCATAT

TGATTCATGT

GGATTCTTCA

CCGGCTGCCC

CTCCTTGATG

GGGTATTCAG

TTTTGTTAAT

TAAACAAATT

TCTAGAAGTA

CTCACAATGG

TATCACTTTT

GAACACTGCC

TTGTTCAATC

ATGGACAGCA

TGCTGTTTAT

GAAAAAGGTA

AACCAGGGTA

GTGTACAGAG

GAAAATAGGA

TTTGGAATGG

(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 933 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:

TCTCATCTTG

TAATGATGGA

TGAAGATAGT

TCTTCTGGTA

CATGAGTTTG

TCAAGAATAT

TGATTCATGT

GCATTCTTCA

CCGGCTGCCC

CTCCTTGATG

GGGTATTCAG

TTTTGTTAAT

TACACAAATT

TCTAGAAGCA

(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1236 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:

GCAAAGGGAA

TAATTCACTT

TTTTGCTACT

CTCCATCATT

AATAAATGGA

AGGTCTTACA

ACTGGTTTTC

CCATCAAGTA

TTTTAGATGG

CTCAGATTTA

CCCAGAAAAC

GTCTTTAGCA

TAGAAGATGG

CATCACAAAT

CCGCAGATTT

CCCAGTAGAT

GAACAGCAGT

CAACTATGAT

GGCCCACACC

CAAAAGATTT

(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2412 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:

GGCCAATTTC

ATATTTGGGA

TTATTTCAAC

ATTGGTGTTT

GATTATAAGA

TACACTTTGG GCCGTTGTGA Z'GGAAAAACT GTAATACCTA CACCATATTT 360 ATTTCGTAAA

TTCCTATCTG

CAGCTTCTTA

TGATGATGAA

GGCAATGGTC

TGATGACCAA

GGAAACCAAT TTCTTTTAGA GTT'GAAGAAA CAGAGTGAAA ACAAGGAAAT 720 TTGCTTTGCC

AATGTACTAC

ATACAATTTC

GATCACCACA

CTATGGATCA

(2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 381 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:

(2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 228 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

(2) INFORMATION FOR SEQ ID NO:84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1644 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
ATGTTAGAAT TGGCCCATGG CACTCTaACT TTCTCACCCC ATCATGGGGA 60 GATTTCTGAT

TTTTCTTCAC

AATCTTTGAA

GCTGGTCATG

TCTCCATGAG

TATATTATTT

TGGTGATCGT

TATTTGGAAT

TGCTCCCAAG

TACAGAGATT

CCTGGAGAGC

AAGCCTGCCT GTTaCTTTGA CTGCACTCCT TGCCCAGATA AAGAGATTTC 720 CAACGAGACA

GAAGAGTCAC

GGGACTCACA

TATAATCCAC

GCTCATCACT

AGCCACATGT

AGTGTTGGCC

GACAAGATGG

GCAAATCCTT

TCACTCTGAA

CTGTACTCTG

CAGGAATCTT

CTGCAGTGTC

GGCTATGGAA

CCCTAAGTGC

AAAAAGACAG

(2) INFORMATION FOR SEQ ID NO:85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2304 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:

TATAAAACAT

ATTTTATAAG

TATAGGGCTG

TCCACGTTTC

ATTTCCATAT

CTTCTTACTT

CAATCAATTT

CTCTCAGAGT TG~?~AAAAAGA GACCCAAAAC AAGGAAATTT GCTTTGCCTT480 TGTTAACATG

TCAAATAGTG

ATGTCATCAA CAAATATTAT TATCATTTAT GGGAX~AACAA ACAGTATCAT600 TGAATTGAGC

AGAGTTGGAT

GACATTTCTA

CCATCTCAGA

TGCCTCAGGA

TCCAATGCCT

AGTCATAGTA

CAAGAGGTTG

GTAAACTCAT

AAACAGAGAG

CACCTTCGGA

TTTCACTTAT

GTGTGCAGTG

TGCTGTTTTA

CAATGTGTGA

AAAGACGTGA

TTCTGTTTGT

CCTATTGTGA

TGTTTTCTCT

CAAATCACAT

ACTGTCATTC

TCTGGTGCAC

ATTTGGCTAG

ATCATGATTG

GCCTGCCTGG

TTCAACGAAG

TTTCTCCCTG

ATCTTGGCAT

TTAATGAAAC

(2) INFORMATION FOR SEQ ID NO:86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2001 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:

CCATTTCAGC

TATCTTGGAA

CATTGTTAGT

GATGTCATCA

CTTTAGACTA

TATGATCATA

ACATCACTAT

CTACAGTGAT

ATTATCTGAA

CAGGCACCAT

TGCTGTGGCT

TGATGGAAAA

ATTTATAAAC

GTATGAGATT

AACATTTTCC

GTGGAACACA

ATTCAGAAAA

AGAAAATGAA

GTATGCCAAT

AGATCCATTG

TGTACTTAGT

TCTCAGCTAT

TGGTCATCCC

TGTAGCTGCA

TAATACAAGT

AATTTGCACA

TGTTGATGCT

ATTTTCTGTA

(2) INFORMATION FOR SEQ ID NO:87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2598 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:

CTTTCTCCTG

TATGAAGAAG

AGCAGTGCAA

CAAAAGTATC CTCTCACCTT GGCTTTI"TCC ATGAATGAAA TCAACAACAA 300 CCCTGATCTT

TTGCCAAATA TGTCTTTAGC AT't'TACATTC TCAGAATATA GTTaTTATTT360 GGAATCCCAC

TTTTATCTGT

AACTGTGACA

TTATGGACAC

GGCCTCTGAT

GATACATCTC TAGCCG'TTGC TCTCGTCTCC TTCATAATTC ATTTCAGTTG 660 GAGAAGAGAG

TATGAATTTA

AAATGTTGTT

GGACTCTCTA

TAAGAAAGAC

TTCACATTTG ATAATGGATA TGGAACTTTT GGTTTTC',GAC ACCGCCACAG1020 TGAGATTTCT

ATATTTGGTA

GTCACTGAAG

CATGGCCATT

ACTCCATGAG

ATGACTC1'TC AAAATGTTGA TAATGTTCTC CTTCCCAATT ATGAAGAACA 1320 AAATTATAAT

TGCAAGATGG TTTATTCCTT Tt'TGAGCAAG ACTCAATTCA CAAFrTCCTGT1380 TGGAGACACT

GTGAATATGA ATCAAAGAAA CAAACTGAAG GAAGAGTACG ACATTrTCTA 1440 CAATTGGAAT

TTTTCCAAAA

TATACAGATG

GAAC'aAATQGA

TAATGAGACA

GCAGAATCAC

GGCTCTTTCC

TGTGAAACAT

GCTCATCGCA

AGCTACCTGC

TGTGTTGGCC

GATGAAGTAC

TCAAATTATT

ACAGTCTGAG

CTGTGTCCTG

CAGAAACCTG

CTGCAGTGTC

GGCTGTTGAG

TCCAAAATGC

GAAGTCATCT

(2) INFORMATION FOR SEQ ID NO:88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2337 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:

CACATCCCTG

ACTTTTTAGC

GAGCAATTCT

ACTCCTGGAT

TGAACAAAGA

TGGCATTGTA

TGATGACAAA

CTGCACAGCA

GAAAAATCAT

TGATTCTCTA

GGTACTGATC

GCATGGAGCT

GCAGAAGTAC

GTACTTCAAT

TGCCTCCCTG

CAATGTATAC

AGTACAATTT

CCCCTTTCTA

AGGGAAAGAC AACTCATCAA TCAGAATGGA GCGAATGA,AG ATCTGGATTG1140 TACCAGGAAG

TGGGCTAAAT

CATATCTTCT

CAGTGAGAGC

CTTTGACTGC

TGTGAAGTGT

TGTGACATTT

TTTCTCCTCA

TGTCAAGGCC

TCTCTGTCCC

CATTTTTGGG

GGTTATAGCC

GGCCCCTAAT

GCTGACAACC

CATCATTTGC

ACTAGCCCTA

TGAAGCCAAG

CCCTGTCTAC

GGCTTCCAGT

AAGACCAGAA

ACCTTCT

(2) INFORMATION FOR SEQ ID NO:89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1650 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

AACAAAAAAC

TCCTGATCTT

ACAAACTACA

TTATTTCTGT

AAT'Cv4AGAGA CTATGTGTAC ATTTCTACTT ACAGGACCGC ATTGGATAAC360 ATCTTATAGT

CACATCCCTA

CCTTGTCATC

CAAAATCAAG

TTATCATAAA

CATTTATGGG

AAAAC14,GAGA GTATTATTGA GTTGAGCTTC AGAATGTGGG AATCTCCAGT720 TATCCAGAGA

AACTCATGAC

CTTTAAAAAT

GCCAGAGTGG

CTATTCATCC

TGATGGAAGT

GAATCTGCAC

CTTTAAGATA

GATTATGAAA

TTCTCAGCAC

CAGGCACTTT

ATCCTCTGTG

GGCAGCCTGC

TATGGATCAG

CATTCAGAAA

TCATAGCCTT

(2) INFORMATION FOR SEQ ID NO:90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2379 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:

TGATCCCAGG

AAGTTGTTAC

GACTCTGAAT

AATGGATGAA

TACATTGGGC

CACTGTGATG GAAATATCCC ACTCL'G3CTTA CTTAATCAAA TATTTTATAT360 GCCTTTTCCT

GAATTTGTGG

TCAGATTTCC

CTATCAGATG

CTTCAACTGG

CACAGAGTTG

GGCAATCAAT

GTCAACCGCA

CAGAACATGG

CCCAACAAGT

ACACCATGAT

CATGGATTTA

TATTTATTAA AGCCAGAGT'G GGGTTTCTTT GAATATGAAA CCTCAGCATC1080 TTACTGTAAA

GAAGTTTGAC

GGCCCATGCT

CAAAGGAGCA

CAATCCTTTT

CATTGTTCAC

CAGCCCATAT

CACAGGAAGT

AAAATTCTGG

TGAAATTTCT

TATTGTGAAG

CTTTCTGTGC

AATCACATTT

TGTGCTTCTG

GGGGACACCC

TTGGCTAGCA

CATAATTGTG

CTTCCTGGCC

CAATGAAGCC

CCTTCCTGTC

TTTGACATCC

AATGAAACCA

GAGAGAATTC TATCCAAP~AG ACAGGAGAAA TCACGTTTC 2379
(2) INFORMATION FOR SEQ ID NO:91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2394 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

GGATCCCAGA

GACTTGTTCC

GACTCTGAAT

AGTGGATGAA

CAATTTGGGT

TAATCATCTC

TACAGGACCA

TCCACATTTC

ATATCCTTAT

CTTCATAATT

CAATCAATTT

TGTGAACATG

CCAAATTGTG

TGAATTGAGC

ACAATTTGAT

TACATTTCTA

CAATCTCAGA

AGCCTCAGCA

GCTAATGGAA

TGTATATGCC

AATAAACAAT

GACCCACTTC

TGGAGACTAT

GATAGGACAA

GACTGAGTTG

TCCTGGATTC

CTGCCCTGAA

CCATGACACT

ACTCATGTTC

TATCTTACAG

CAAAACAATC

CTTTTTGGTA

TCTGTGTGCA

GCATGGCCAC

GGGATACTTG

GCCTGACACA

CTGGGTCACC

(2) INFORMATION FOR SEQ ID NO:92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2085 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

CATCTTCAGT

ACTAGCATTG

TATCTCAGAT

CAAGGATATT

TAAAACTGAA

TGGGGAAACA

GAGAATATGG

TGGCACATTC

AAATTTTGTA

GTGGAAATAT

ATCCAATATC

TAGTCACAAC

GCAAGAATTT

GCTAAACTCA

GAAACAGAGA

ACGCCTTGGG

GTTACACTTA

AGTGTGCAGT

GCAGATTGCC ATCCTGGATT CAGAAC,AATC TGGAA('~GAGG AAATGGCAGC1140 CTGCTGTTTT

TCAGTGTGCG

GAAAGGTGTG

ATTCTGTTTC

TCCTATTGTG

CTGTTTTCTG

GCAAATCACA

CACTGTGGTT

ATCGGGGACA

AATCTGGCTA

CATCATCATT

GGCCTGCCTG

GCCTTTGGAA GCTTCACTAT AGCTT1'CTTG GCAAAGAACC TGCCTGACAC1860 ATTCAACGAA

CTTCCTCCCT

CATCTTGGCA

TTTAATGAGA


Claims (61)

Claims
1. A family of isolated pheromone receptor polypeptides, each of said isolated polypeptides comprising from amino terminus to carboxyl terminus:
(a) an amino-terminal extracellular domain containing from 30 to 600 amino acids;
(b) a transmembrane region comprising:
(i) seven non-contiguous transmembrane domains designated TM1, TM2, TM3, TM4, TM5, TM6 and TM7,
(ii) three non-contiguous extracellular domains designated EC2, EC3 and EC4, and
(iii) three non-contiguous intracellular domains designated IC1, IC2, and IC3,
wherein the transmembrane domains, the extracellular domains and the intracellular domains are attached to one another from amino terminus to carboxyl terminus in the order TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7, and wherein the transmembrane region has at least about 35% homology and a length approximately equal to a transmembrane region of a polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50; and
(c) a carboxyl-terminal intracellular domain containing from 5 to 200 amino acids;
wherein the pheromone receptor polypeptides are expressed in a Gαo protein-expressing vomeronasal organ neuron or are expressed in another olfactory organ neuron in an animal which does not possess a vomeronasal organ.
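The structural limits recited in claim 1 (domain order and terminal-domain lengths) can be checked mechanically. The following is a minimal sketch only: the function name, argument layout, and example lengths are hypothetical and do not come from the specification, and the homology limitation is not checked here.

```python
# Illustrative only: helper name and inputs are assumptions, not part of the patent.
EXPECTED_ORDER = ["TM1", "IC1", "TM2", "EC2", "TM3", "IC2", "TM4",
                  "EC3", "TM5", "IC3", "TM6", "EC4", "TM7"]


def matches_claimed_topology(n_term_len, region_labels, c_term_len):
    """Return True if a domain layout fits the order and length ranges of claim 1.

    n_term_len    -- residues in the amino-terminal extracellular domain (30 to 600)
    region_labels -- domain labels of the transmembrane region, N- to C-terminus
    c_term_len    -- residues in the carboxyl-terminal intracellular domain (5 to 200)
    """
    return (30 <= n_term_len <= 600
            and list(region_labels) == EXPECTED_ORDER
            and 5 <= c_term_len <= 200)


labels = "TM1-IC1-TM2-EC2-TM3-IC2-TM4-EC3-TM5-IC3-TM6-EC4-TM7".split("-")
print(matches_claimed_topology(450, labels, 40))   # True
print(matches_claimed_topology(10, labels, 40))    # False: N-terminal domain too short
```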
2. The polypeptides of claim 1, wherein the transmembrane region of each of said polypeptides has at least between about 60% and about 90% homology to the transmembrane region of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 34, 36, 38, 40, 42, 44, 46, 48, and 50.
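The homology thresholds in claims 1 and 2 are percentages over a region. As a rough illustration, percent identity between two already-aligned sequences can be computed as below. This is a sketch under stated assumptions: the claims do not say here how homology is to be measured, so simple identity over non-gap columns is an assumption, and the two short peptide strings are invented for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns with identical residues.

    Both inputs must already be aligned to equal length, with '-' marking gaps.
    Columns where either sequence has a gap are skipped (an assumption; other
    conventions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented example peptides, one-letter amino acid code.
score = percent_identity("LSIFV-L", "LSLFVAL")
print(round(score, 1))   # 83.3
print(score >= 35.0)     # True: would meet an "at least about 35%" threshold
```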
3. The polypeptides of claims 1 or 2, wherein the non-contiguous intracellular domains of each of said polypeptides has at least between about 60% and about 90% homology to the non-contiguous intracellular domains of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 34, 36, 38, 40, 42, 44, 46, 48, and 50.
4. The polypeptides of claim 1, wherein the extracellular domain of each of said polypeptides has at least between about 50% and about 90% homology to the extracellular domain of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50.
5. The polypeptides of claim 2, wherein the extracellular domain of each of said polypeptides has at least between about 50% and about 90% homology to the extracellular domain of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50.
6. The polypeptides of claim 3, wherein the extracellular domain of each of said polypeptides has at least between about 50% and about 90% homology to the extracellular domain of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, and 50.
7. The polypeptides of claims 1 or 2, wherein the extracellular domain contains at least between about 50 and about 500 amino acids.
8. The polypeptides of claim 3, wherein the extracellular domain contains at least between about 50 and about 500 amino acids.
9. The polypeptides of claims 4, 5 or 6, further comprising a signal sequence attached to the amino terminus of the extracellular domain.
10. The polypeptides of claim 9, wherein the signal sequence is selected from the group of signal sequences of a pheromone receptor polypeptide of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
11. A method for identifying a nucleic acid encoding a pheromone receptor polypeptide, comprising:
(1) contacting a mixture of nucleic acid molecules with at least one nucleic acid probe of a nucleic acid selected from the group consisting of
(a) a nucleic acid molecule selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55 that encodes a pheromone receptor polypeptide;
(b) a unique fragment of (a);
(c) a human homolog of (a) or (b); and
(d) a set of degenerate primers of any of (a), (b) or (c); and
(2) identifying the sequences within the mixture that hybridize to the probe.
12. The method of claim 11, wherein the mixture is a genomic library.
13. The method of claim 11, wherein the mixture is a cDNA library.
14. The method of claim 11, wherein the nucleic acid probe contains a detectable label.
15. The method of claim 11, wherein the at least one nucleic acid probe is a pair of degenerate polymerase chain reaction primers that amplify a unique fragment of a nucleic acid molecule selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55, the method further comprising the step of subjecting the mixture to a polymerase chain reaction amplification reaction prior to selecting a member of the mixture which hybridizes to the nucleic acid probe.
16. The method of claim 15, wherein the pair of degenerate polymerase chain reaction primers is selected from the group consisting of SEQ ID NOs. 60 and 61, SEQ ID NOs. 62 and 63, SEQ ID NOs. 64 and 63, SEQ ID NOs. 64 and 65, and SEQ ID NOs. 66 and 67.
17. The method of claim 16, wherein the pair of polymerase chain reaction primers is selected from the group consisting of SEQ ID NOs. 60 and 61, SEQ ID NOs. 62 and 63, and SEQ ID NOs. 64 and 63.
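A degenerate primer, as recited in claims 15 to 17, stands for a pool of concrete oligonucleotides. The sketch below expands a primer written in standard IUPAC nucleotide codes into that pool. The example primer is hypothetical; it is not one of SEQ ID NOs. 60 to 67, which are not reproduced in this section.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer: R = A or G, Y = C or T, so four concrete sequences.
print(expand_degenerate("ATGRAY"))
# ['ATGAAC', 'ATGAAT', 'ATGGAC', 'ATGGAT']
```

The pool size is the product of the number of bases each position allows, so it grows quickly with the number of ambiguous positions.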
18. An isolated nucleic acid molecule (a) which hybridizes under high or low stringency conditions to a molecule consisting of a nucleic acid sequence selected from the group consisting of SEQ ID NO. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 54, and 55, and which codes for a pheromone receptor, (b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon sequence due to the degeneracy of the genetic code, and (c) complements of (a) and (b).
19. The nucleic acid molecule of claim 18, wherein the pheromone receptor is expressed in the vomeronasal organ or is expressed in another olfactory organ in an animal which does not possess a vomeronasal organ.
20. The nucleic acid molecule of claim 18, wherein the pheromone receptor is expressed in a Gαo protein-expressing vomeronasal organ neuron.
21. The nucleic acid molecule of claim 18, wherein the pheromone receptor is a G-protein coupled receptor.
22. The isolated nucleic acid molecule of claim 18, wherein the pheromone receptor has an amino acid sequence selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
23. The isolated nucleic acid molecule of claim 18, wherein the isolated nucleic acid molecule is selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, that encodes a pheromone receptor polypeptide.
24. The isolated nucleic acid molecule of claim 18, wherein the isolated molecule comprises a molecule having a sequence which encodes a pheromone receptor unique fragment, wherein said unique fragment is selected from the group consisting of a pheromone receptor extracellular domain, a pheromone receptor transmembrane domain, a pheromone receptor intracellular domain, a pheromone receptor extracellular domain coupled to at least one transmembrane domain, and at least one pheromone receptor transmembrane domain coupled to a pheromone receptor intracellular domain.
25. The isolated nucleic acid molecule of claim 18, wherein the pheromone receptor extracellular domain, the pheromone receptor transmembrane domain and the pheromone receptor intracellular domain have amino acid sequences selected from the group of sequences identified as these domains in SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
26. The isolated nucleic acid molecule of claim 18, wherein the unique fragment is selected from the group consisting of between 12 and 4000, between 12 and 2000, between 12 and 1000, between 12 and 500, between 12 and 250, between 12 and 100, between 12 and 50, and between 12 and 25, nucleotides in length.
27. An isolated nucleic acid molecule, comprising
(a) a molecule having a sequence selected from the group consisting of SEQ ID NO. 51, 53, 54, 55, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, and 92, and which codes for a pheromone receptor;
(b) nucleic acid molecules that differ from the nucleic acid molecules of (a) in codon sequence due to the degeneracy of the genetic code, and
(c) complements of (a) and (b).
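Element (c) of claims 18 and 27 covers complements of the recited sequences. A minimal sketch of computing a reverse complement follows; the function name is illustrative, and the input is the first ten-base block printed for SEQ ID NO:74 in the listing above, used only as sample data.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT)[::-1]


block = "ATGGCGACGA"  # first block shown for SEQ ID NO:74
print(reverse_complement(block))  # TCGTCGCCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check when handling both strands.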
28. An expression vector comprising the isolated nucleic acid molecule of claims 18-27 operably linked to a promoter.
29. A host cell transformed or transfected with the isolated nucleic acid molecule of claims 18-27.
30. A host cell transformed or transfected with the isolated nucleic acid molecule of the expression vector of claim 28.
31. An isolated polypeptide encoded by the isolated nucleic acid molecule of claims 18-27.
32. The isolated polypeptide of claim 31, wherein the isolated polypeptide has a pheromone receptor activity.
33. The isolated polypeptide of claim 31, wherein the isolated polypeptide comprises a polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
34. The isolated polypeptide of claim 33, wherein the isolated polypeptide is a fragment of a peptide selected from the group consisting of an extracellular domain, a transmembrane domain and an intracellular domain, wherein the foregoing domains have amino acid sequences selected from the group of sequences identified as these domains of a pheromone receptor polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
35. A vaccine containing an isolated polypeptide selected from the group consisting of the isolated polypeptides of claim 31, 32, 33, and 34.
36. A method for controlling fertility in an animal, comprising:
administering to an animal in need of such treatment, an effective amount of the vaccine of claim 35 to elicit an immune response to the isolated polypeptide.
37. An isolated binding polypeptide which binds selectively to a polypeptide of claim 1, 2, 4, 5, 6, 8, 10, 31, 32, 33, and 34, provided that the isolated binding polypeptide does not bind to a G-protein coupled receptor other than a Gαo-coupled pheromone receptor.
38. The isolated binding polypeptide of claim 37, wherein the binding polypeptide binds to a polypeptide selected from the group consisting of SEQ ID NO. 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50 and 52.
39. The isolated binding polypeptide of claim 37, wherein the binding polypeptide is an antibody fragment selected from the group consisting of a Fab fragment, a F(ab)2 fragment or a fragment including a CDR3 region selective for a pheromone receptor polypeptide.
40. The isolated binding polypeptide of claim 38, wherein the binding polypeptide is an antibody fragment selected from the group consisting of a Fab fragment, a F(ab)2 fragment or a fragment including a CDR3 region selective for a pheromone receptor polypeptide.
41. An affinity matrix comprising:
a solid support to which is coupled an isolated binding polypeptide selected from the group consisting of the binding polypeptides of any of claims 37-40.
42. A method for isolating a pheromone receptor, comprising:
contacting a composition containing a putative pheromone receptor with the affinity matrix of claim 41 under conditions to permit the pheromone receptor to selectively bind to the binding polypeptides coupled to the solid support; and isolating the polypeptides that bind to the affinity matrix.
43. A composition comprising:
the polypeptide of claim 1, 2, 4, 5, 6, 8, 10, 31, 32, 33, or 34; and a pharmaceutically acceptable carrier.
44. A composition comprising:
the nucleic acid molecule of any of claims 18-28; and a pharmaceutically acceptable carrier.
45. A composition comprising:
the binding polypeptide of claim 37; and a pharmaceutically acceptable carrier.
46. A composition comprising:
the binding polypeptide of claims 38, 39 or 40; and a pharmaceutically acceptable carrier.
47. A method for modulating a pheromone receptor activity in a cell, comprising:
administering to the cell an amount of the isolated binding polypeptide of claim 37 effective to modulate pheromone receptor activity in the cell.
48. A method for modulating a pheromone receptor activity in a cell, comprising:
administering to the cell an amount of the isolated binding polypeptide of claim 38, 39, or 40 effective to modulate pheromone receptor activity in the cell.
49. The method of claim 47, wherein modulating a pheromone receptor activity comprises reducing the pheromone receptor activity.
50. The method of claim 48, wherein modulating a pheromone receptor activity comprises reducing the pheromone receptor activity.
51. The method of claim 47, wherein the pheromone receptor activity is selected from the group consisting of a signal transduction activity and a ligand binding activity.
52. The method of claim 48, wherein the pheromone receptor activity is selected from the group consisting of a signal transduction activity and a ligand binding activity.
53. The method of claim 47, wherein the cell is a vertebrate cell, preferably a mammalian cell.
54. The method of claim 48, wherein the cell is a vertebrate cell, preferably a mammalian cell.
55. The method of claim 47, wherein the cell is an invertebrate cell, preferably an insect cell.
56. The method of claim 48, wherein the cell is an invertebrate cell, preferably an insect cell.
57. A method for reducing the binding of a pheromone having a binding domain to a pheromone receptor having a ligand binding site that selectively binds to the binding domain of the pheromone, comprising:
contacting the pheromone receptor with an agent which binds to the binding domain for a time effective to reduce binding of the pheromone to the ligand binding site of the pheromone receptor.
58. The method of claim 57, wherein the agent is an antibody which binds to the binding domain.
59. A method for decreasing pheromone receptor mediated signal transduction activity in a subject comprising:
administering to a subject in need of such treatment an agent that selectively binds to an isolated nucleic acid molecule of claim 1 or an expression product thereof, in an amount effective to decrease pheromone receptor mediated signal transduction activity in the subject.
60. The method of claim 59, wherein the agent is selected from the group consisting of an antisense nucleic acid and a binding polypeptide.
61. A method for identifying lead compounds for a pharmacological agent useful in the diagnosis or treatment of disease associated with pheromone binding to a pheromone receptor polypeptide containing a ligand binding site that selectively binds to a binding domain of the pheromone, comprising
forming a mixture comprising a pheromone receptor polypeptide or unique fragment thereof containing a ligand binding site, a molecule protein containing a binding domain which selectively binds the pheromone receptor ligand binding site, and a candidate pharmacological agent,
incubating the mixture under conditions which, in the absence of the candidate pharmacological agent, permit a first amount of selective binding of the molecule containing a ligand binding domain by the pheromone receptor ligand binding site, and
detecting a test amount of selective binding of the molecule containing the binding domain by the pheromone receptor ligand binding site,
wherein reduction of the test amount of selective binding relative to the first amount of selective binding indicates that the candidate pharmacological agent is a lead compound for a pharmacological agent which disrupts selective binding of a molecule containing a binding domain by a pheromone receptor containing a ligand binding site and
wherein increase of the test amount of selective binding relative to the first amount of selective binding indicates that the candidate pharmacological agent is a lead compound for a pharmacological agent which enhances selective binding of a molecule containing a binding domain by a pheromone receptor polypeptide containing a ligand binding site.
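The decision rule at the end of claim 61 compares a test amount of binding against the first (no-agent) amount. It can be sketched as below. This is an illustration under assumptions: the function name, return strings, and numeric values are hypothetical, and a real assay would compare replicate measurements with a statistical threshold rather than two single numbers.

```python
def classify_candidate(first_amount: float, test_amount: float) -> str:
    """Classify a candidate agent by its effect on selective binding.

    first_amount -- binding measured without the candidate agent
    test_amount  -- binding measured with the candidate agent present
    """
    if test_amount < first_amount:
        return "lead compound: disrupts selective binding"
    if test_amount > first_amount:
        return "lead compound: enhances selective binding"
    return "no change in selective binding"


# Hypothetical readings in arbitrary units.
print(classify_candidate(100.0, 40.0))   # lead compound: disrupts selective binding
print(classify_candidate(100.0, 160.0))  # lead compound: enhances selective binding
```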
CA002294473A 1997-06-30 1998-06-30 Novel family of pheromone receptors Abandoned CA2294473A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US5128497P 1997-06-30 1997-06-30
US60/051,284 1997-06-30
PCT/US1998/013680 WO1999000422A1 (en) 1997-06-30 1998-06-30 Novel family of pheromone receptors

Publications (1)

Publication Number Publication Date
CA2294473A1 true CA2294473A1 (en) 1999-01-07

Family

ID=21970359

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002294473A Abandoned CA2294473A1 (en) 1997-06-30 1998-06-30 Novel family of pheromone receptors

Country Status (4)

Country Link
EP (1) EP0996635A4 (en)
JP (1) JP2002511871A (en)
CA (1) CA2294473A1 (en)
WO (1) WO1999000422A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001025431A1 (en) * 1999-10-01 2001-04-12 The Rockefeller University Primate, particularly human, vomeronasal-like receptor
US20020155444A1 (en) * 2000-02-17 2002-10-24 Herman Ronald C. Human VNO cDNA libraries
ATE432350T1 (en) * 2000-06-16 2009-06-15 Incyte Corp G-PROTEIN COUPLED RECEPTORS
AU2001287111A1 (en) * 2000-09-07 2002-03-22 Zymogenetics Inc. Human vomeronasal receptor-3
JP2004508843A (en) * 2000-09-22 2004-03-25 ケムコム エス.エー. Olfactory and pheromone G protein-coupled receptors
EP2008691A1 (en) * 2007-06-29 2008-12-31 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Vaginal odorants
JP2023184465A (en) * 2022-06-16 2023-12-28 花王株式会社 Analysis method for G protein-coupled receptors

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5030722A (en) * 1988-03-30 1991-07-09 The Johns Hopkins University Odorant-binding protein from rat
DE69634134D1 (en) * 1995-10-19 2005-02-03 Univ Columbia CLONING OF PHEROMONE RECEPTORS FROM VERTEBRATES AND THEIR USES

Also Published As

Publication number Publication date
WO1999000422A9 (en) 1999-04-15
EP0996635A1 (en) 2000-05-03
WO1999000422A1 (en) 1999-01-07
EP0996635A4 (en) 2003-08-27
JP2002511871A (en) 2002-04-16

Legal Events

Date Code Title Description
FZDE Discontinued