AU698800B2 - Genetic markers for breast and ovarian cancer

Publication number
AU698800B2
Authority
AU
Australia
Legal status
Expired
Application number
AU55668/96A
Other versions
AU5566896A (en
Inventor
Lori Friedman
Mary-Claire King
Ming Lee
Eric Lynch
Beth Ostermeyer
Sarah Rowel
Csilla Szabo
Current Assignee
University of California
Original Assignee
University of California

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/82 Translation products from oncogenes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/136 Screening for pharmacological compounds

Abstract

Specific BRCA1 mutations, PCR primers and hybridization probes are used in nucleic acid-based methods for the diagnosis of inheritable breast cancer susceptibility. Additionally, binding agents, such as antibodies, specific for peptides encoded by the subject BRCA1 mutants are used to identify expression products of diagnostic mutations/rare alleles in patient-derived fluid or tissue samples. Compositions with high binding affinity for transcription or translation products of the disclosed BRCA1 mutations and alleles are used in therapeutic intervention. Such products include antisense nucleic acids, peptides encoded by the subject nucleic acids, and binding agents, such as antibodies, specific for such peptides.

Description

WO 96/33271 PCT/US96/05621
Genetic Markers for Breast and Ovarian Cancer
INTRODUCTION
Field of the Invention
The field of the invention is genetic markers for inheritable breast cancer susceptibility.
Background
The largest proportion of inherited breast cancer described so far has been attributed to a genetic locus, the BRCA1 locus, on chromosome 17q21 (Hall et al. 1990 Science 250:1684-1689; Narod et al. 1991 Lancet 338:82-83; Easton et al. 1993 Am J Hum Genet 52:678-701).
Background material on the genetic markers for breast cancer screening is found in the Jan 29, 1993 issue of Science, vol 259, especially pages 622-625; see also King et al., 1993 J Amer Med Assoc 269:1975-198. Other relevant research papers include King (1992) Nature Genet 2:125-126; Merette et al. (1992) Amer J Human Genet 50:515-519; NIH/CEPH Collaborative Mapping Group (1992) Science 258:67-86.
Risks of breast cancer to women inheriting the locus are extremely high, exceeding before age 50 and reaching 80% by age 65 (Newman et al. 1988 Proc Natl Acad Sci USA 85:3044-3048; Hall et al. 1992 Amer J Human Genet 50:1235-1242; Easton et al. 1993).
Epidemiological evidence for inherited susceptibility to ovarian cancer is even stronger (Cramer et al. 1983 J Natl Cancer Inst 71:711-716; Schildkraut and Thompson 1988 Amer J Epidemiol 128:456-466; Schildkraut et al. 1989 Amer J Hum Genet 45:521-529). According to one study, more than 90% of families with multiple relatives with breast and ovarian cancer trace disease susceptibility to chromosome 17q21 (Easton et al. 1993).
The link between increasing risk of breast and ovarian cancer and inherited susceptibility to these diseases lies in the application of genetics to diagnosis and prevention. Creating molecular tools for earlier diagnosis and developing ways to reverse the first steps of tumorigenesis may be the most effective means of breast and ovarian cancer control.
Our laboratory previously mapped the heritable breast cancer susceptibility gene locus (BRCA1 locus) to a 50 cM region of chromosome 17q (Hall et al. 1990). More recently, we developed new polymorphisms at ERBB2 (Hall and King 1991 Nucl Acids Res 19:2515), THRA (Bowcock et al. 1993 Amer J Human Genet 52:718-722), EDH17B (Friedman et al. 1993 Hum Molec Genet 2:821), and multiple anonymous loci (Anderson et al. 1993 Genomics 17:616-623), ultimately developing a high density map of 17q12-q21 (Anderson et al. 1993; see also Simard et al. 1993 Human Molec Genet 2:1193-1199). We also added families to the genetic study; there are now 100 families for whom transformed lymphocyte lines have been established and all informative relatives genotyped. We used our new markers and the many chromosome 17q polymorphisms developed in the past three years to test linkage in our families, refining the region first to 8 cM (Hall et al. 1992), then to 4 cM (Bowcock et al. 1993), then to 1 Mb based on polymorphisms from our high density map (Anderson et al. 1993; see also Flejter et al., 1993 Genomics 17:624-631). We disclose here a number of mutations in BRCA1 which correlate with disease.
Relevant Literature
The predicted amino acid sequence for a BRCA1 cDNA and familial studies of this gene were described by Miki et al. (1994) Science 266, 66-71 and Futreal et al. (1994) Science 266, 120-122. A study of Canadian cancer families is described in Simard et al. (1994) Nature Genetics 8, 392-398. A collaborative survey of BRCA1 mutations is described in Shattuck-Eidens et al. (1995) JAMA 273, 535-541.
SUMMARY OF THE INVENTION
The invention discloses methods and compositions useful in the diagnosis and treatment of breast and ovarian cancer associated with mutations and/or rare alleles of BRCA1, a breast cancer susceptibility gene. Specific genetic probes diagnostic of inheritable breast cancer susceptibility and methods of use are provided. Labelled nucleic acid probes comprising sequences complementary to specified BRCA1 alleles are hybridized to clinical nucleic acid samples. Linkage analysis and inheritance patterns of the disclosed markers are used to diagnose genetic susceptibility. In addition, BRCA1 mutations and/or rare alleles are directly identified by hybridization, polymorphism and/or sequence analysis. In another embodiment, labelled binding agents, such as antibodies, specific for peptides encoded by the subject nucleic acids are used to identify expression products of diagnostic mutations or alleles in patient-derived fluid or tissue samples. For therapeutic intervention, the invention provides compositions which can functionally interfere with the transcription or translation products of the breast and ovarian cancer susceptibility associated mutations and/or rare alleles within BRCA1. Such products include antisense nucleic acids, competitive peptides encoded by the subject nucleic acids, and high affinity binding agents, such as antibodies, specific for e.g. translation products of the disclosed BRCA1 mutations and alleles.
For the purposes of this specification it will be clearly understood that the word "comprising" means "including but not limited to", and that the word "comprises" has a corresponding meaning.
DESCRIPTION OF SPECIFIC EMBODIMENTS
We disclose here methods and compositions for determining the presence or absence of BRCA1 mutations and rare alleles or translation products thereof which are useful in the diagnosis of breast and ovarian cancer susceptibility. Tumorigenic BRCA1 alleles include BRCA1 allele #5803 (SEQ ID NO:1), 9601 (SEQ ID NO:2), 9815 (SEQ ID NO:3), 8403 (SEQ ID NO:4), 8203 (SEQ ID NO:5), 388 (SEQ ID NO:6), 6401 (SEQ ID NO:7), 4406 (SEQ ID NO:8), 10201 (SEQ ID NO:9), 7408 (SEQ ID NO:10), 582 (SEQ ID NO:11) or 77 (SEQ ID NO:12). These nucleic acids or fragments capable of specifically hybridizing with the corresponding allele in the presence of other BRCA1 alleles under stringent conditions find broad diagnostic and therapeutic application. Gene products of the disclosed mutant and/or rare BRCA1 alleles also find a broad range of therapeutic and diagnostic applications. For example, mutant and/or rare allelic BRCA1 peptides are used to generate specific binding compounds. Binding reagents are used diagnostically to distinguish non-tumorigenic wild-type and tumorigenic BRCA1 translation products.
The subject nucleic acids (including fragments thereof) may be single or double stranded and are isolated, partially purified, and/or recombinant. An "isolated" nucleic acid is present as other than a naturally occurring chromosome or transcript in its natural state and isolated from (not joined in sequence to) at least one nucleotide with which it is normally associated on a natural chromosome; a partially pure nucleic acid constitutes at least about 10%, preferably at least about and more preferably at least about 90% by weight of total nucleic acid present in a given fraction; and a recombinant nucleic acid is joined in sequence to at least one nucleotide with which it is not normally associated on a natural chromosome.
Fragments of the disclosed alleles are sufficiently long for use as specific hybridization probes for detecting endogenous alleles, and particularly to distinguish the disclosed critical rare or mutant alleles which correlate with cancer susceptibility from other BRCA1 alleles, including alleles encoding the BRCA1 translation product displayed in Miki et al. (1994) supra, under stringent conditions. Preferred fragments are capable of hybridizing to the corresponding mutant allele under stringency conditions characterized by a hybridization buffer comprising 0% formamide in 0.9 M saline/0.09 M sodium citrate (SSC) buffer at a temperature of 37°C and remaining bound when subject to washing at 42°C with the SSC buffer at 37°C. More preferred fragments will hybridize in a hybridization buffer comprising 20% formamide in 0.9 M saline/0.09 M sodium citrate (SSC) buffer at a temperature of 42°C and remaining bound when subject to washing at 42°C with 2 X SSC buffer at 42°C. In any event, the fragments are necessarily of length sufficient to be unique to the corresponding allele; i.e. each has a nucleotide sequence at least long enough to define a novel oligonucleotide, usually at least about 14, 16, 18, 20, 22, or 24 bp in length, though such a fragment may be joined in sequence to other nucleotides, which may be nucleotides that naturally flank the fragment.
In many applications, the nucleic acids are labelled with directly or indirectly detectable signals or means for amplifying a detectable signal. Examples include radiolabels, luminescent (e.g. fluorescent) tags, components of amplified tags such as antigen-labelled antibody, biotin-avidin combinations, etc. The nucleic acids can be subject to purification, synthesis, modification, sequencing, recombination, incorporation into a variety of vectors, expression, transfection, administration or methods of use disclosed in standard manuals such as Molecular Cloning, A Laboratory Manual (2nd Ed., Sambrook, Fritsch and Maniatis, Cold Spring Harbor), Current Protocols in Molecular Biology (Eds. Ausubel, Brent, Kingston, Moore, Seidman, Smith and Struhl, Greene Publ. Assoc., Wiley-Interscience, NY, NY, 1992) or that are otherwise known in the art.
The subject nucleic acids are used in a wide variety of nucleic acid-based diagnostic methods that are known to those in the art. Exemplary methods include their use as allele-specific oligonucleotide probes (ASOs), in ligase mediated methods for detecting mutations, as primers in PCR-based methods, direct sequencing methods wherein the clinical BRCA1 nucleic acid sequence is compared with the disclosed mutations and rare alleles, etc. The subject nucleic acids are capable of detecting the presence of a critical mutant or rare BRCA1 allele in a sample and distinguishing the mutant or rare allele from other BRCA1 alleles. For example, where the subject nucleic acids are used as PCR primers or hybridization probes, the subject primer or probe comprises an oligonucleotide complementary to a strand of the mutant or rare allele of length sufficient to selectively hybridize with the mutant or rare allele. Generally, these primers and probes comprise at least 16 bp to 24 bp complementary to the mutant or rare allele and may be as large as is convenient for the hybridization conditions.
Where the critical mutation is a deletion of wild-type sequence, useful primers/probes require wild-type sequences flanking (both sides) the deletion with at least 2, usually at least 3, more usually at least 4, most usually at least 5 bases. Where the mutation is an insertion or substitution which exceeds about 20 bp, it is generally not necessary to include wild-type sequence in the probes/primers. For insertions or substitutions of fewer than 5 bp, preferred nucleic acid portions comprise and flank the substitution/insertion with at least 2, preferably at least 3, more preferably at least 4, most preferably at least 5 bases. For substitutions or insertions from about 5 to about 20 bp, it is usually necessary to include both the entire insertion/substitution and at least 2, usually at least 3, more usually at least 4, most usually at least 5 bases of wild-type sequence of at least one flank of the substitution/insertion.
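The flanking rules in the preceding paragraph can be sketched in code. This is a minimal illustration and not part of the original disclosure: the function names `required_flank` and `deletion_probe`, the toy sequence in the usage note, and the treatment of "about 5" and "about 20" as hard cut-offs are assumptions made for the example. Reading "comprise and flank" for changes under 5 bp as meaning both sides is likewise an assumption.

```python
from typing import Tuple


def required_flank(kind: str, length_bp: int) -> Tuple[int, int]:
    """Return (minimum wild-type bases per flank, number of flanks required).

    Encodes the stated minimum of 2 bases; the text's preferred value is 5.
    The 5 bp and 20 bp boundaries are treated as exact, which is an assumption.
    """
    if kind == "deletion":
        return (2, 2)  # wild-type sequence on both sides of the deletion
    if kind in ("insertion", "substitution"):
        if length_bp > 20:
            return (0, 0)  # wild-type sequence generally not needed
        if length_bp < 5:
            return (2, 2)  # assumed: "flank" read as both sides
        return (2, 1)      # whole change plus at least one flank
    raise ValueError(f"unknown mutation kind: {kind!r}")


def deletion_probe(wild_type: str, start: int, deleted_bp: int, flank: int = 5) -> str:
    """Build a junction-spanning probe for a deletion (hypothetical helper).

    `start` is the 0-based index of the first deleted base in `wild_type`.
    The probe joins `flank` bases upstream of the deletion to `flank` bases
    downstream of it, so it matches only the deleted allele across the junction.
    """
    if flank < 2:
        raise ValueError("the text calls for at least 2 flanking bases")
    left = wild_type[max(0, start - flank):start]
    right = wild_type[start + deleted_bp:start + deleted_bp + flank]
    if len(left) < flank or len(right) < flank:
        raise ValueError("not enough wild-type sequence on one side")
    return left + right
```

For instance, with the made-up sequence `AAAAACCCCCGGGGGTTTTT` and a 5-base deletion starting at index 5, `deletion_probe` joins the five bases on each side of the deleted `CCCCC` run. A real probe would also need to meet the overall length guidance given earlier in the text.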
In addition to their use as diagnostic genetic probes and primers, BRCA1 nucleic acids are used to effect a variety of gene-based therapies. See, e.g. Zhu et al. (1993) Science 261, 209-211; Gutierrez et al. (1992) Lancet 339, 715-721; Gary Nabel lab (Dec 1993), Proc. Nat'l. Acad Sci USA. For example, therapeutic nucleic acids are used to modulate cellular expression or intracellular concentration or availability of a tumorigenic BRCA1 translation product by introducing into cells complements of the disclosed nucleic acids. These nucleic acids are typically antisense: single-stranded sequences comprising complements of the disclosed relevant BRCA1 mutant. Antisense modulation of the expression of a given mutant may employ antisense nucleic acids operably linked to gene regulatory sequences. Cells are transfected with a vector comprising such a sequence with a promoter sequence oriented such that transcription of the gene yields an antisense transcript capable of binding to the endogenous tumorigenic BRCA1 allele or transcript. Transcription of the antisense nucleic acid may be constitutive or inducible and the vector may provide for stable extrachromosomal maintenance or integration. Alternatively, single-stranded antisense nucleic acids that bind to BRCA1 genomic DNA or mRNA may be administered to the target cell, in or temporarily isolated from a host, at a concentration that results in a substantial reduction in expression of the targeted translation product.
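The notion of an antisense sequence as the complement of a target strand can be illustrated with a short sketch. This is not taken from the disclosure: the function names `reverse_complement` and `antisense_rna` and the toy input are hypothetical, and the code handles only unambiguous A/C/G/T bases.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sense_dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    seq = sense_dna.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("only unambiguous A/C/G/T bases are supported")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """Return the antisense RNA (5'->3') that would pair with the sense strand.

    The sense strand is given in the DNA alphabet; the result uses U for T.
    """
    return reverse_complement(sense_dna).replace("T", "U")
```

Given the made-up sense fragment `GATTACA`, `reverse_complement` yields `TGTAATC` and `antisense_rna` yields `UGUAAUC`. Selecting an effective antisense target in practice involves considerations, such as secondary structure and specificity, that this sketch does not attempt to model.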
Various techniques may be employed for introducing the nucleic acids into viable cells. The techniques vary depending upon whether one is using the subject compositions in culture or in vivo in a host. Various techniques which have been found efficient include transfection with a retrovirus and viral coat protein-liposome mediated transfection; see Dzau et al., Trends in Biotech 11, 205-210 (1993). In some situations it is desirable to provide the nucleic acid source with an agent which targets the target cells, such as an antibody specific for a surface membrane protein on the target cell, a ligand for a receptor on the target cell, etc. Where liposomes are employed, proteins which bind to a surface membrane protein associated with endocytosis may be used for targeting and/or to facilitate uptake, e.g. capsid proteins or fragments thereof tropic for a particular cell type, antibodies for proteins which undergo internalization in cycling, proteins that target intracellular localization and enhance intracellular half-life. In liposomes, the decoy concentration in the lumen will generally be in the range of about 0.1 µM to 20 µM. For other techniques, the application rate is determined empirically, using conventional techniques to determine desired ranges. Usually, application of the subject therapeutics will be local, so as to be administered at the site of interest. Various techniques can be used for providing the subject compositions at the site of interest, such as injection, use of catheters, trocars, projectiles, pluronic gel, stents, sustained drug release polymers or other device which provides for internal access. Systemic administration of the nucleic acid using lipofection or liposomes with tissue targeting (e.g. antibody) may also be employed.
The invention also provides isolated translation products of the disclosed BRCA1 alleles which distinguish the wild-type BRCA1 gene product. For example, for alleles which encode a truncated tumorigenic translation product, the C-terminus is used to differentiate wild-type BRCA1. Accordingly, the invention provides the translation product of BRCA1 allele #5803 (SEQ ID NO:13), 9601 (SEQ ID NO:14), 9815 (SEQ ID NO:15), 8203 (SEQ ID NO:17), 388 (SEQ ID NO:18), 6401 (SEQ ID NO:19), 4406 (SEQ ID NO:20), 10201 (SEQ ID NO:21), 7408 (SEQ ID NO:22), 582 (SEQ ID NO:23) or 77 (SEQ ID NO:24), or a C-terminus fragment thereof; and that of #8403 (SEQ ID NO:16), or a fragment thereof comprising Gly at position 61.
The subject mutant and/or rare allelic BRCA1 translation products comprise an amino acid sequence which provides a target for distinguishing the product from that of other BRCA1 alleles.
Preferred fragments are capable of eliciting the production of a peptide-specific antibody, in vivo or in vitro, capable of distinguishing a protein comprising the immunogenic peptide from a wild-type BRCA1 translation product. The fragments are necessarily unique to the disclosed allele translation product in that they are not found in any previously known protein and have a length at least long enough to define a novel peptide, from about 5 to about 25 residues, preferably from 6 to residues in length, depending on the particular amino acid sequence.
The subject translation products (including fragments) are either isolated, i.e. unaccompanied by at least some of the material with which they are associated in their natural state; partially purified, i.e. constituting at least about preferably at least about 10%, and more preferably at least about 50% by weight of the total translation product in a given sample; or pure, i.e. at least about 60%, preferably at least 80%, and more preferably at least about by weight of total translation product. Included in the subject translation product weight are any atoms, molecules, groups, etc. covalently coupled to the subject translation products, such as detectable labels, glycosylations, phosphorylations, etc. The subject translation products may be isolated, purified, modified or joined to other compounds in a variety of ways known to those skilled in the art depending on what other components are present in the sample and to what, if anything, the translation product is covalently linked.
Binding agents specific for the disclosed tumorigenic BRCA1 genes and gene products find particular use in cancer diagnosis. The selected method of diagnosis will depend on the nature of the tumorigenic BRCA1 mutant/rare allele and its transcription or translation product(s). For example, soluble secreted translation products of the disclosed alleles may be detected in a variety of physiologic fluids using a binding agent with a detectable label such as a radiolabel, fluorescer, etc. Detection of membrane bound or intracellular products generally requires preliminary isolation of cells (e.g. blood cells) or tissue (e.g. breast biopsy tissue). A wide variety of specific binding assays, e.g. ELISA, may be used.
BRCA1 gene product-specific binding agents are produced in a variety of ways using the compositions disclosed herein. For example, structural x-ray crystallographic and/or NMR data of the mutant and/or rare allelic BRCA1 translation products are used to rationally design binding molecules of determined structure or complementarity. Also, the disclosed mutant and/or rare allelic BRCA1 translation products are used as immunogens to generate specific polyclonal or monoclonal antibodies. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, for general methods. Specific antibodies are readily modified to a monovalent form, such as Fab, Fab', or Fv.
Other mutant and/or rare allelic BRCA1 gene-product specific agents are screened from large libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of saccharide, peptide, and nucleic acid based compounds. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily producible. Additionally, natural and synthetically produced libraries and compounds are readily modified through conventional chemical, physical, and biochemical means. See, e.g. Houghten et al. and Lam et al. (1991) Nature 354, 84 and 81, respectively, and Blake and Litzi-Davis (1992) Bioconjugate Chem 3, 510.
Useful binding agents are identified with assays employing a compound comprising mutant and/or rare allelic BRCA1 peptides or encoding nucleic acids. A wide variety of in vitro, cell-free binding assays, especially assays for specific binding to immobilized compounds comprising the subject nucleic acid or translation product find convenient use. See, e.g. Fodor et al (1991) Science 251, 767 for the light directed parallel synthesis method. Such assays are amenable to scale-up, high throughput usage suitable for volume drug screening.
Useful agents are typically those that bind the targeted mutant and/or rare allelic BRCA1 gene product with high affinity and specificity and distinguish the tumorigenic BRCA1 mutants/rare alleles from the wild-type BRCA1 gene product. Candidate agents comprise functional chemical groups necessary for structural interactions with proteins and/or DNA, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups, more preferably at least three. The candidate agents often comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the aforementioned functional groups. Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, sterols, isoprenoids, purines, pyrimidines, derivatives, structural analogs or combinations thereof, and the like. Where the agent is or is encoded by a transfected nucleic acid, said nucleic acid is typically DNA or RNA.
Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural and synthetically produced libraries and compounds are readily modified through conventional chemical, physical, and biochemical means to enhance efficacy, stability, pharmaceutical compatibility, and the like. In addition, known pharmacological agents may be subject to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc., to produce structural analogs.
Therapeutic applications typically involve binding to and functional disruption of a tumorigenic BRCA1 gene product by an administered high affinity binding agent. For therapeutic uses, the compositions and agents disclosed herein may be administered by any convenient way. Small organics are preferably administered orally; other compositions and agents are preferably administered parenterally, conveniently in a pharmaceutically or physiologically acceptable carrier, e.g. phosphate buffered saline, or the like. Typically, the compositions are added to a retained physiological fluid such as blood or synovial fluid. Generally, the amount administered will be empirically determined, typically in the range of about 10 to 1000 µg/kg of the recipient. For peptide agents, the concentration will generally be in the range of about 50 to 500 µg/ml in the dose administered. Other additives may be included, such as stabilizers, bactericides, etc. These additives will be present in conventional amounts.
The following examples are offered by way of illustration and not by way of limitation.
EXAMPLES
Example 1. Positional cloning
Contig construction.
YACs. Primers flanking polymorphic repeats in the 4 Mb region of linkage were used to amplify pools from the CEPH, Washington University, and CEPH megaYAC libraries available. 39 YACs were selected. Of these, 23 were tested for chimerism by FISH and 12 found to be chimeric.
YACs were aligned to each other by attempting to amplify each YAC with primer pairs from known sequence tagged sites (STSes). More STSes were defined by sequencing the ends of YACs, and these new STSes were used in further alignment and YAC identification.
Cosmids. A gridded cosmid library of chromosome 17 was prepared. Alu-Alu PCR products of YACs were hybridized to the cosmid grids and positively hybridizing cosmids used for subsequent studies. Contigs were constructed in two ways. Cosmids with the same restriction patterns were aligned; and, the unique sequences flanking polymorphic markers and our sequenced cDNAs were used as STSes.
Physical mapping by pulsed field gel electrophoresis. Physical distances were estimated by pulsed field gel electrophoresis, using DNA from lymphocyte cell lines of BRCA1-linked patients and of controls. DNA samples were digested with NotI, MluI, RsrII, NruI, SacII, and EclXI. Filters were probed with single-copy sequences isolated from cosmids and later with cDNA clones. Multiple unrelated linked patients and controls were screened to detect large insertions or deletions associated with BRCA1. Results of PFGE were used to define the region first used to screen cDNA libraries as 1 Mb and the current linked region as 500 kb.
Screening cDNA libraries. We began library screening when the linked region defined by meiotic recombination was ~1 Mb. The first question was what library would optimize the length of cDNA clones, representation of both 5' and 3' ends of genes, and the chances that BRCA1 would be expressed. We chose to use a random primed cDNA library cloned into λgt10 from cultured (not transformed) fibroblasts from a human female. This library was selected because it had inserts averaging 1.8 kb, with 80% of inserts between 1 and 4 kb, was constructed from cultured fibroblasts known to be "leaky" in gene expression, and was known to include 5' ends of genes. We simultaneously screened three other libraries (from ovary, fetal brain, and mouse mammary epithelium). With one exception (described below), all transcripts from these libraries cross-hybridized to transcripts from the fibroblast library.
The fibroblast library was screened with YAC DNA isolated by PFGE. Pure YAC DNA (100 nanograms) was random primed with both α-32P-dATP (6000 mCi/mmole) and 32P-dCTP (3000 mCi/mmole), and used immediately after labelling. Filters from the library were prehybridized with human placental DNA for 24-48 hours. Labelled YAC DNA was hybridized to the filters for 48 hours at 65°C. Approximately 250 transcripts were selected by screening with 7 YACs and then cross-hybridized. We also used pools of cosmids from the linked region to screen the fibroblast library. We selected 122 transcripts and cross-hybridized them to clones previously detected by the YACs.
Example 2. Cloning BRCA1 and its characterization.
A. Screening for mutations in candidate genes. We initially identified 24 genes in the 1 Mb BRCA1 region defined by meiotic recombination, respective locations on the YAC contig, sizes of representative cDNA clones, numbers of replicates in the library, sizes of transcripts, homologies to known genes, and variants detected. Candidate genes were characterized in the following ways:
Cross-hybridizing clones. cDNA clones isolated from the library are hybridized against each other. Cross-hybridizing clones are considered "siblings" of the clone used as a probe and represent the same gene.
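Grouping cross-hybridizing clones into sibships can be pictured as finding connected components in a graph whose nodes are clones and whose edges are observed cross-hybridizations. The sketch below is only an illustration under that assumption; it is not the procedure used in the laboratory work. The function name `group_sibships` and the clone labels in the usage note are hypothetical, and treating cross-hybridization as transitive is a simplification.

```python
from typing import Dict, Iterable, List, Tuple


def group_sibships(clones: Iterable[str],
                   cross_hybridizing: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group clones into sibships using a union-find over hybridization pairs.

    Assumes (for illustration) that cross-hybridization is transitive, so that
    any chain of cross-hybridizing clones belongs to one sibship.
    """
    parent: Dict[str, str] = {c: c for c in clones}

    def find(c: str) -> str:
        # Walk to the root, halving the path along the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in cross_hybridizing:
        parent[find(a)] = find(b)  # merge the two sibships

    groups: Dict[str, List[str]] = {}
    for c in parent:
        groups.setdefault(find(c), []).append(c)
    # Sort for a stable, readable result.
    return sorted(sorted(members) for members in groups.values())
```

With made-up clones `c1` to `c5` and observed pairs `(c1, c2)`, `(c2, c3)` and `(c4, c5)`, the function returns two sibships: one containing `c1`, `c2` and `c3`, and one containing `c4` and `c5`.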
Mapping back. At least one clone from each sibship is mapped back to total human genomic DNA, to cosmids, to YACs, and to somatic cell hybrid lines, some of which contain deletions of 17q and one of which has chromosome 17 as its only human chromosome.
Subcloning and sequencing. One of the longest clones from each sibship is subcloned into M13 and sequenced manually by standard methods, constructing new primers at the end of each fragment to continue sequencing until the end of the clone is reached.
Extending sequences with sibs. In order to find clones that contain more of the gene, the last sequencing primer for the clone and primers made from λgt10 are used to amplify sibs of the first clone. Sibs that amplify the longest fragments are selected, subcloned, and sequenced. This process is continued until we reach the size of the transcript defined by Northern blot and/or until the 3' sequence is a polyA tail and the 5' sequence has features of the beginning of the coding region.
Southerns. To identify insertion or deletion mutations, genomic DNA from 20 unrelated patients from families with breast cancer linked to 17q ("linked patients") and controls are digested with BamHI/TaqI and independently with HindII/HinfI. Each cDNA clone is used to screen Southern blots. Variants have been detected in two genes. Both of these variants are RFLPs, occurring in equal frequency in linked patients and in controls.
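The logic behind an RFLP on a Southern blot can be sketched in a few lines: a sequence change that creates or destroys a restriction site changes the fragment lengths produced by the digest. The following is a minimal illustration only. The recognition sites for BamHI (G^GATCC), TaqI (T^CGA) and HinfI (G^ANTC) are the standard published ones; the two toy alleles and the function names are invented for illustration and are not taken from the BRCA1 region.

```python
import re

# Recognition sequence and cut offset (bases from the start of the site
# to the top-strand cut) for three commonly used restriction enzymes.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "TaqI": ("TCGA", 1),     # T^CGA
    "HinfI": ("GANTC", 1),   # G^ANTC (N = any base)
}


def cut_positions(seq, enzyme):
    """Return the 0-based cut positions of one enzyme on a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead lets overlapping sites be found; N matches any base.
    pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
    return [m.start() + offset for m in re.finditer(pattern, seq)]


def fragment_lengths(seq, enzymes):
    """Return fragment lengths, in order, after a combined digest."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Hypothetical alleles: allele_b has lost the TaqI site (TCGA -> TCAA).
    allele_a = "AAAAGGATCCTTTTTTCGACCCC"
    allele_b = "AAAAGGATCCTTTTTTCAACCCC"
    print(fragment_lengths(allele_a, ["BamHI", "TaqI"]))
    print(fragment_lengths(allele_b, ["BamHI", "TaqI"]))
```

In this sketch the two alleles give different fragment patterns with the same double digest, which is what a probe would reveal as a band shift. Whether such a variant matters clinically is a separate question, settled by comparing its frequency in linked patients and controls.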
Northerns. To identify splice mutations and/or length mutations, we prepared total RNA and polyA+ RNA from germline DNA (from lymphoblast lines) of 20 unrelated linked patients, from ovarian and breast tissues, from fibroblasts, from a HeLa cell line, and from breast cancer cell lines. Northern blots are screened with each gene.
Detection of small mutations. To screen for germline point mutations in patients without encountering introns, we prepared cDNA from poly-A+ mRNA from lymphoblast cell lines of unrelated linked patients and from controls. cDNA has also been made from 65 malignant ovarian cancers from patients not selected for family history. Primers are constructed every ~200 basepairs along the sequence and used to amplify these cDNAs. Genomic DNA has also been prepared from cell lines from all family members (linked and unlinked), from malignant and normal cells from paraffin blocks from their breast and ovarian surgeries, and from malignant and normal cells from 29 breast tumors not selected for family history. For sequences without introns, cDNA and gDNA lengths are equal, and the gDNA samples are amplified as well.
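Placing primers roughly every 200 basepairs amounts to tiling the cDNA with short amplicons. The sketch below shows one way to compute such windows. The 20 bp overlap between neighbouring amplicons and the function name are assumptions made for illustration; the text specifies only the approximate 200 bp spacing.

```python
def tile_amplicons(seq_len, step=200, overlap=20):
    """Return (start, end) windows, 0-based and half-open, covering seq_len bases.

    Window starts are spaced `step` bases apart and each window extends
    `overlap` bases into the next one, so no base falls between amplicons.
    """
    windows = []
    start = 0
    while start < seq_len:
        end = min(start + step + overlap, seq_len)
        windows.append((start, end))
        if end == seq_len:
            break  # the last window already reaches the end of the sequence
        start += step
    return windows


if __name__ == "__main__":
    # 5656 bp is the length stated for SEQ ID NO:1 in the sequence listing.
    amplicons = tile_amplicons(5656)
    print(len(amplicons), amplicons[0], amplicons[-1])
```

Each window would then be given a forward primer at its start and a reverse primer at its end; primer design itself (melting temperature, specificity) is outside the scope of this sketch.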
Two mutation detection methods are used to screen each sequence. Amplified products are screened for SSCPs using modifications that enable electrophoresis to be done with only one set of running conditions (Keen et al. 1991 Trends Genet 7:5; Soto and Sukumar 1992 PCR Meth Appl 2:96-98). In order to screen longer segments of DNA (100-1500 bp) and to detect variants missed by SSCP, sequences are also screened for point mutations by CCM (Cotton 1993 Mutation Res 285:125-144) using essentially the protocol of Grompe et al. 1989 Proc Natl Acad Sci USA 86:5888-5892. An endonuclease developed for mismatch detection reduces the toxicity of the method (Youil et al. 1993 Amer J Hum Genet 53 (supplement): abstract 1257).
Polymorphism or mutation. Variants are screened in cases and controls to distinguish polymorphisms from a critical mutation. Linkage of breast cancer to each variant is tested in all informative families.
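The comparison of cases and controls can be illustrated with a small calculation. The text does not name a statistical test; the sketch below uses Fisher's exact test on a 2x2 table purely as an assumed example, and the counts in the usage section are hypothetical. A variant seen at similar frequency in both groups (a large p-value) behaves like a polymorphism, whereas a variant confined to linked patients is a candidate critical mutation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: cases with / without the variant
    c, d: controls with / without the variant
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x variant carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts: 20 patients and 20 controls.
    print(fisher_exact_two_sided(10, 10, 10, 10))  # equal frequency in both groups
    print(fisher_exact_two_sided(8, 12, 0, 20))    # variant seen only in patients
```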
Example 3. Characterize BRCA1 mutations in germline DNA and breast cancer patients' tumors.
A. BRCA1 mutations in chromosome 17q-linked families. Our series of families includes large extended kindreds in which breast and ovarian cancer (and in one family prostatic cancer) are linked to 17q21, with individual lod scores 1.5. Since linked patients in these families carry mutations in BRCA1, we have identified their mutations first.
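For readers unfamiliar with lod scores, the quantity is the base-10 logarithm of the likelihood of the data at a given recombination fraction θ relative to the likelihood under free recombination (θ = 0.5). The sketch below covers only the simplest textbook case, phase-known meioses scored as recombinant or non-recombinant. It is an assumption-laden illustration and not the linkage analysis actually applied to these kindreds; the counts are hypothetical.

```python
from math import log10


def lod_score(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta.

    LOD(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n )
    Valid here for 0 < theta <= 0.5.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    r, n = recombinants, meioses
    return r * log10(theta) + (n - r) * log10(1 - theta) - n * log10(0.5)


if __name__ == "__main__":
    # Hypothetical family: 1 recombinant among 10 informative meioses.
    for theta in (0.05, 0.1, 0.2, 0.3, 0.5):
        print(theta, round(lod_score(1, 10, theta), 3))
```

Scores from independent families at the same θ can be added, which is why several moderately sized kindreds can together give strong evidence of linkage.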
Table 1 summarizes critical BRCA1 mutations and rare alleles:

Family  Exon      U14680 nt                Mutation                                                     Amino acid change  Predicted effect
5803    3         200-253                  exon 3 deleted (54 bp)                                       27 Stop            protein truncation
9601    3         230                      deletion AA                                                  39 Stop            protein truncation
9815    Intron 5  splice donor, bp +1      substitution G to A -> 22 bp deletion in RNA                 64 Stop            protein truncation
8403    5         300                      substitution T to G                                          Cys 61 Gly         lose zinc-binding motif
8203    Intron 5  splice acceptor, bp -11  substitution T to G -> 59 bp insertion of intron into RNA    81 Stop            protein truncation
388     11        1048                     deletion A                                                   313 Stop           protein truncation
6401    11        2415                     deletion AG                                                  Ser 766 Stop       protein truncation
4406    11        2800                     deletion AA                                                  901 Stop           protein truncation
10201   11        2863                     deletion TC                                                  Ser 915 Stop       protein truncation
7408    11        3726                     substitution C to T                                          Arg 1203 Stop      protein truncation
582     11        418,                     deletion TCAA                                                1364 Stop          protein truncation
77      24        5677                     insertion A                                                  Tyr 1853 Stop      protein truncation
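Most entries in Table 1 lead to a premature stop codon, either directly (a C to T substitution turning an Arg codon into a stop) or through a frameshift (deletion of one, two or four bases). The sketch below reproduces both mechanisms on a short invented coding sequence using the standard genetic code. The toy sequence and the function name are hypothetical and bear no relation to the actual BRCA1 sequence or its numbering.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna):
    """Translate from the first base up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    normal = "ATGGCTCGAGAATGTAGCCTGTAA"            # toy open reading frame
    nonsense = normal[:6] + "T" + normal[7:]        # C to T: CGA (Arg) -> TGA (stop)
    frameshift = normal[:10] + normal[12:]          # deletion of AA shifts the frame
    for name, seq in (("normal", normal), ("nonsense", nonsense), ("frameshift", frameshift)):
        print(name, translate(seq))
```

In both mutant cases the translated product is shorter than the normal one, which is the "protein truncation" effect listed in the last column of the table.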
B. Germline BRCA1 mutations among breast cancer patients in the general population.
From each breast cancer patient, not selected for family history, a 30 ml sample of whole blood is drawn into acid citrate dextrose. DNA from the blood is extracted and stored at in 3 aliquots. Germline mutations in BRCA1 are identified using the approaches described above and by directly sequencing new mutations. Paraffin-embedded tumor specimens from the same patients are screened for alterations of p53, HER2, PRAD1, and ER. Germline BRCA1 mutations are tested in the tumor blocks.
A preliminary estimate of risk associated with different BRCA1 mutations is obtained from relatives of patients with germline alterations. For each patient with a germline BRCA1 mutation, DNA is extracted from a blood sample from each surviving sister and mother (and, for older patients, brothers as well) and tested for the presence of the proband's BRCA1 mutation. To ascertain men at risk of prostatic cancer, brothers of breast cancer patients diagnosed after age 55 are also interviewed and sampled. Paraffin blocks from deceased relatives who had cancer are also screened. The frequency of breast, ovarian, or prostatic cancer among relatives carrying BRCA1 mutations is a first estimate of risk of these cancers associated with different mutations.
C. Somatic alterations of BRCA1 in breast tumors.
Malignant cells are dissected from normal cells from paraffin blocks. By identifying BRCA1 mutations in these series, we estimate the frequency of somatic BRCA1 alterations, determine BRCA1 mutations characteristic of any particular stage of tumor development, and evaluate their association with prognosis.
D. Characterizing mutant and rare alleles of BRCA1. Mutant or rare BRCA1 allele function and pattern of expression during development are characterized using transformed cells expressing the allele and knockout or transgenic mice. For example, phenotypic changes in the animal or cell line, such as growth rate and anchorage independence, are determined. In addition, several methods are used to study loss-of-function mutations, including replacing normal genes with their mutant alleles (BRCA1-/BRCA1-) by homologous recombination in embryonic stem (ES) cells and replacing mutant alleles with their normal counterparts in differentiated cultured cells (Capecchi 1989 Science 244:1288-1292; Weissman et al. 1987 Science 236:175-180; Wang et al. 1993 Oncogene 8:279-288). Breast carcinoma cell lines are screened for mutation at the BRCA1 locus and a mutant BRCA1 line is selected. Normal and mutant cDNAs of BRCA1 are subcloned into an expression vector carrying genes which confer resistance to ampicillin and geneticin (Baker et al. 1990 Nature 249:912-915). Subclones are transfected into mutant BRCA1 breast cancer cells. Geneticin-resistant colonies are isolated and examined for any change in tumorigenic phenotype, such as colony formation in soft agar, increased growth rate, and/or tumor formation in athymic nude mice. In vivo functional demonstrations involve introducing the normal BRCA1 gene into a breast carcinoma cell line mutant at BRCA1 and injecting these BRCA1+ cells into nude mice. Changes in tumorigenic growth, compared to nude mice injected with BRCA1 mutant breast carcinoma cells, are readily observed. For example, correcting the mutant gene decreases the ability of the breast carcinoma cells to form tumors in nude mice (Weissman et al. 1987; Wang et al. 1993).
All publications and patent applications cited in this specification are herein incorporated by reference as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT: KING, Mary-Claire; FRIEDMAN, Lori; OSTERMEYER, Beth; ROWELL, Sarah; LYNCH, Eric; SZABO, Csilla; LEE, Ming
(ii) TITLE OF INVENTION: GENETIC MARKERS FOR BREAST AND OVARIAN CANCER
(iii) NUMBER OF SEQUENCES: 24
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Science Technology Law Group
STREET: 268 Bush Street, Suite 3200
CITY: San Francisco
STATE: California
COUNTRY: USA
ZIP: 94104
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: US
FILING DATE:
CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
NAME: OSMAN, Richard A
REGISTRATION NUMBER: 36,627
REFERENCE/DOCKET NUMBER: A-59563-3/RAO
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: (415) 343-4341
TELEFAX: (415) 343-4342
TELEX:
INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 5656 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC 60
CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA 120
TGGATTTATC TGCTCTTCGC GTTGAAGAAG TACAAAATGT CATTAATGCT ATGCAGAAAA 180
TCTTAGAGTG TCCCATCTGA TTTTGCATGC TGAAACTTCT CAACCAGAAG AAAGGGCCTT 240
CACAGTGTCC TTTATGTAAG AATGATATAA CCAAAAGGAG CCTACAAGAA AGTACGAGAT 300
TTAGTCAACT TGTTGAAGAG CTATTGAAAA TCATTTGTGC TTTTCAGCTT GACACAGGTT 360
TGGAGTATGC AAACAGCTAT AATTTTGCAA AAAAGGAAAA TAACTCTCCT GAACATCTAA 420
AAGATGAAGT TTCTATCATC CAAAGTATGG GCTACAGAAA CCGTGCCAAA AGACTTCTAC 480
AGAGTGAACC
TTGGAACTGT
ACATTGAATT
TGGGAGATCA
ATTCTGCAAA
ATCAACCCAG
AAAAGTATCA
ATGCCAGCTC
TAGAAAAGGC
ACAGATGGGC
AGGTAGATCT
CATGCTCAGA
TTCAGAAAGT
ATGATGGGGA
TAGATGAATA
CTTTAATATG
AAATATTTGG
AAAATCTAAT
CAAATAAATT
AGAAAGCAGA
CGGAGCAGI~k
GTGATTCTAT
CTTTCAAAAC
ATATCCACAA
ATATTCATGC
TGCAAATTGA
CAGTCAGGCA
AGAAGAGTAA
AGCTGAAGrT
AAGAATTTGT
AAGTGTCTAA
AAACTGAAAG
CGAAAATCCT
GAGAACTCTG
GGGATCTGAT
AGAATTGTTA
AAAGGCTGCT
TAATAATGAT
GGGTAGTTCT
ATTACAGCAT
TGAAT'rCTGT
TGGAAGTAAG
GAATGCTGAT
GAATCCTAGA
TAATG. jTGG
GTCTGAATCA
TTCTGGTTCT
TAAAAGTGAA
GAAAACCTAT
TATAGGAGCA
AA.AGCGTAAA
TTTGGCAGTT
TGGTCAAGTG
TCAGAATGAG
GAAAGCTGAA
TTCAAAAGCA
GCTTGAACTA
TAGTTGTTCT
CAGCAGAA
C;UXGCCAAAT
AACAAATGCA
CAATCCTAGC
TAATGCTGAA
ATCTGTAGAG
TCCTTGCAGG
AGGACAAAGC
TCTTCTGAAG
CAAATCACCC
TGTGAATTTT
TTGAACACCA
GTTTCAAACT
GAGAACAGCA
AATAAAGCA
GAAACATGTA
CCCCTGTGTG
GATACTGAAG
TTTTCCAGAA
AATGCCAAAG
TCAGAGAAAA
AGAGTTCACT
CGGAAGAAGG
TTTGTTACTG
.AGGAGACCTA
CAAAAGACTC
ATGAATATTA
AAAAATCCTA
CCTATAAGCA
CCTAAAAAGA
GTAGTCAGTA
AGCAGTGAAG
CTACAACTCA
GAACAGACAA
CCTGGTTCTT
CTTCCAAGAG
GACCCCAAAG
AGTAGCAGTA
AAACCAGTCT CAGTGTCCAA AGCGGATACA ACCTCAAAAG ATACCGTTAA TAAGGCAACT i
CTCAAGGAAC
CTGAGACGGA
CTGAGAAGCG
TGCATGTGGA
GTTTATTACT
AACAGCCTGG
ATGATAGGCG
AGAGAAAAGA
ATGTTCCTTG
GTGATGAACT
TAGCTGATGT
TAGACTTACT
CCAAATCAGT
CAAGCCTCCC
AGCCACAGAT
CATCAGGCCT
CTGAAATGAT
CTAATAGTGG
ACCCAATAGA
GCAGTAkTAAG
ATAGGCTGAG
GAAATCTAAG
AGATAAAGAA
TGGAAGGTAA
GTAAAAGACA
TTACTAAGTG
AAGAAAAAGA
ATCTCATGTT
TTTCATTGGT
CAGGGATGAA
TGTAACAAAT
TGCAGCTGAG
GCCATGTGGC
CACTAAAGAC
CTTAGCAAGG
GACTCCCAGC
ATGGAATAAG
GATAACACTA
GTTAGGTTCT
ATTGGACGTT
GGCCAGTGAT
AGAGAGTAAT
CAACTTAAGC
AATACAAGAG
TCATCCTGAG
AAATCAGGGA
TCATGAGAAT
ATCACTCGAA
CAATATGGAA
GAGGAAGTCT
CCCACCTAAT
AAAAAAGTAC
AGAACCTGCA
TGACAGCGAT
TTCAAATACC
AGAGAAACTA
AAGTGGAGAA
ACCTGGTACT
CTCTCTAACC
ACGTCTGTCT
TATTGCAGTG
ATCAGTTTGG
ACTGAACATC
AGGCATCCAG
ACAAATACTC
AGAATGAATG
AGCCAACATA
ACAGAAAAAA
CAGAAACTGC
AATAGCAGCA
GATGACTCAC
CTAAATGAGG
CCTCATGAGG
ATTGAAGACA
CATGTAACTG
CGTCCCCTCA
GATTTTATCA
ACTAACCAAA
AAAACAAAAG
AAAGAATCTG
CTCGAATTAA
TCTACCAGGC
TGTACTGAAT
AACCAAATGC
ACTGGAGCCA
ACTTTCCCAG
AGTGAACTTA
GAAACAGTTA
AGGGTTTTGC
GATTATGGCA
540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 WO 96133271 WO 9633271PCTIUS96/05621 CTCAGGAAA:3 TATCTCGTrTA ATAAATGTGT GAGTCAGTGT CCAAAGATAA TAGAAATGAC ACAGTCGGGA AACAAGCATA ATACATTCAA GGTTTCAAAG AAGAGGAATG TGCAACATTC TCACTTTTGA ATGTGAACAA CTGTACAGAC AGTTAATATC TTGATAATGC CAAATGTAGT GAGGCAACGA AACTGGACTC GTATACCACC ACTTTTTCCC TAGAGGAAAA CTTTGAGGAA TTCCAAGTAC AGTGAGCACA
CCAGCTCAAG
ATGAAATAGG
AATTGAATGC
CTGGAAGTA
CTGTTAATAC
GTAGTCATGC
TAAAGGAAGA
AAAGCGTCCA
CTCAGGGTTA
AGGATGAAGA
CTCAGTCTAC
ATTTATTATC
CATCTCAGGA
AGTGCAGTGA
GTTCTTCCAA
AATTGGTTTC
AAAGCATGGA
CAATATTAAT
TTCCAGTGAT
TATGCTTAGA
TTGTA.XGCAT
AGATTTCTCT
ATCTCAGGTT
TACTAGTTTT
GAAAGGAGAG
CCGAAGAGGG
GCTTCCCTGC
TAGGCATAGC
ATTGAAGAAT
ACATCACCTT
ATTGGAAGAC
ACAAATGAGG
AGATGATGAA
TTCAAACTTA
CTGGAAGTTA
GCAGCATTTG
ACAGAAGGCT
GAAATGGAAG
CGCCAGTCAT
TCTGCCCACT
AAGGAAGAAA
ACTGCAGGCT
ATCAAAGGAG
ATTACTCCAA
ATCAAGTCAT
CATTCAATGT
ATTAGCCGTA
GAAGTAGGTT
GAAAACATTC
TTAGGGGTTT
CCTGAAATAA
CCATATCTGA
TGTTCTGAGA
GCTGAAAATG
CTTAGCAGGA
GCCAAGAAAT
TTCCAACACT
ACCGTTGCTA
AGCTTAAATG
AGTGAGGAAA
TTGACTGCAA
CATCAGTCTG
GAAAGAGGAA
GGTGAAGCAG
GCACTCTAGG
AAAkACCCCAA
TTAAGTATCC
AAAGTGAACT
TTGCTCCGTT
CTGGGTCCTT
ATCAAGGAAA
TTCCTGTGGT
GCTCTAGGTT
ATAAACATGG
TTGTTAAA.AC
CACCTGAAAG
ATAACATTAG
CCAGTACTAA
AAGCAGAACT
TGCAACCTGA
AAAAGCAAGA
TTTCAGATAA
CACCTGATGA
ACATTAAGGA
GTCCTAGCCC
TAGAGTCCTC
TGTTATTTGG
CCGAGTGTCT
ACTGCAGTAA
CAAAATGTTC
ATACAAACAC
AA.AGCCAGGG
CGGGCTTGGA
CATCTGGGTG
GAAGGCAAAA
GGGACTAATT
ATTGGGACAT
TGATGCTCAG
TTCAAATCCA
AAAGAAACAA
GAATGAGTCT
TGGTCAGAA.A
TTGTCTATCA
ACTTTTACAA
TAAAkTGTAAG
AGAAATGGGA
AGAA.AATGTT
TGAAGTGGGC
AGGTAGAAAC
GGTCTATAAA
ATATGAAGAA
CTTAGAACAG
CCTGTTAGAT
AAGTTCTGCT
TTTCACCCAT
AGAAGAGAAC
TAAAGTAAAC
GTCTAAGAAC
CCAGGTAATA
TGCTAGCTTG
CCAGGATCCT
AGTTGGTCTG
AGAZAAAAT
TGAGAGTGAA
AACCACTCAG
TGAACTAGAA
ACAGAACCAA
CATGGTTGTT
GAAGTTAACC
TATTTGCAGA
GGAAATGCAG
AGTCCAAAAG
AATATCA.AGC
GATAAGCCAG
TCTCAGTTCA
AACCCATATC
AAAAATCTGC
AATGAGAACA
TTTAAAGAAG
TCCAGTATTA
AGAGGGCCAA
C-AAGTCTTC
GTAG-TTCAGA
CCTATGGGAA
GATGGTGAAA
GTTTTTAGCA
ACACATTTGG
TTATCTAGTG
AATATACCTT
AC-IGAGGAGA
TTGGCAAAGG
TTTTCTTCAC
TTCTTGATTG
AGTGACAAGG
CAAGAAGAGC
ACAAGCGTCT
CAGAGGGATA
GCTGTGTTAG
2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 CCATGCAACA TAACCTGATA AAGCTCCAGC AGGA-AATGG 17
T
c WO 96/33271
AACAGCATGG
TTGAGGACCT
AAAGTAGTGA
TGTCTGCAGA
CTAAATGCCC
ATAGAAACTA
TGGAAGAGTC
AGGGAACCCC
CTTCTGAAGA
CATTGAAAGT
CTACTGATAC
TGACAGCTTC
CAGAAGAATT
TAATTACTGA
GGACACTGAA
TGACCCAGTC
ATGTGGTCAA
AGATCTTCAG
AACTGGAATG
CCCTTGGCAC
ATGGCTTCCA
TGGACAGTGT
GAGCCAGCCT
GCGAAATCCA
ATACCCTATA
TAGTTCTACC
ATCATMAGAT
CCCATCTCAA
TGGGCCACAC
TTACCTGGAA
CAGAGCCCCA
TCCCCAATTG
TGCTGGGTAT
AACAGAAAGG
TATGCTCGTG
AGAGACTACT
ATATTTTCTA
TATTAAAGAA
TGGAAGAAAC
GGGGCTAGAA
GATGGTACAG
AGGTGTCCAC
TGCAATTGGG
AGCACTCTAC
TCTAACAGCT
GAACAAAGCA
AGCCAGAATC
AGTAAAAATA
GATAGGTGGT
GAGGAGCTCA
GATTTGACGG
TCTGGAATCA
GAGTCAGCTC
AAAGTTGCAG
AATGCAATGG
GTCAACAAAA
TACAAGTTTG
CATGTTGTTA
GGAATTGCGG
AGAAAAATGC
CACCAAGGTC
ATCTGTTIGCT
CTGTGTGGTG
CCAATTGTGG
CAGATGTGTG
CAGTGCCAGG
ACCCTTCCAT
CATCAGAAAA
CAGAAGGCCT
AAGAACCAGG
ACATGCACAG
TTAAGGTTGT
AAACATCTTA
GCCTCTTCTC
GTGTTGGCAA
AATCTGCCCA
AAGAAAGTGT
GAATGTCCAT
CCAGAAAACA
TGAAAACAGA
GAGGAAAATr TGAATG.Z1
CAAAGCGAL-.:
ATGGGCCCTT
CTTCTGTGGT
TTGTGCAGCC
AGGCACCTGT
AGCTGGACAC
CATAAGTGAC TCTTCTGCCC AGCAGTATTA ACTTCACAGA TTCTGCTGAC AAGTTTGAGG
AGTGGAAAGG
TTGCTCTGGG
TGATGTGGAG
CTTGCCAAGG
TGATGACCCT
CATACCATCT
GAGTCCAGCT
GAGCAGGGAG
GGTGGTGTCT
CCACATCACT
TGCTGAGTTT
qGTAGTTAGC 3iC TTTGAA
GAATCC
CACCAACATG
GAAGGAGCTT
AGATGCCTGG
GGTGACCCGA
CTACCTGATA
TCATCCC CTT
AGTCTTC-AGA
GAGCAACAGC
CAAGATCTAG
GAATCTGATC
TCAACCTCTG
GCTGCTCATA
AAGCCAGAAT
GGCCTGACCC
TTAACTAATC
GTGTGTGAAC
TATTTCTGGG
GTCAGAGGAG
CAGGACAGAA
CCCACAGATC
TCATCATTCA
ACAGAGGACA
GAGTGGGTGT
CCCCAGATCC
4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5656 ado CCCACAGCCA CTACTG INFORMATION FOR SEQ ID NO: 2: SEQUENCE CHARACTERISTICS: 50 LENGTH: 5709 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA TGGATTTATC TGCTCTTC' C GTTGAAGAAG TACAAAATGT CATTAATGCT ATGCAGAAAA WO 96/33271 P~U9/52 PCT/US96/05621
TCTTAGAGTG
ATATTTTGCA
CCTTTATGTA
CTTGTTGAA.G
GCAAACAGCT
GTTTCTATCA
CCCGAAAATC
GTGAGAACTC
TTGGGATCTG
CAMAGAATTGT
AAAAAGGCTG
AGTAATAATG
CAGGGTAGTT
TCATTACAGC
GCTGAATTCT
GCTGGAAGTA
CTGAATGCTG
GAGAATCCTA
GTTAATGAGT
GAGTCTGAAT
TATTCTGGTT
TGTAAAAGTG
GGGAAAACCT
ATTATAGGAG
TTAAAGCGTA
GATTTGGCAG
AATGGTCAAG
ATTCAGAATG
ACGAAAGCTG
AATTCAAAAG
GCGCTTGAAC
GATAGTTGTT
TCCCATCTGT
AATTTTGCAT
AGAATGATAT
AGCTATTGAA
ATAATTTTGC
TCCAAAGTAT
CTTCCTTGCA
TGAGGACAAA
ATTCTTCTGA
T'iCAAATCAC
CTTGTGAATT
ATTTGAACAC
CTGTTTCAAA.
ATGAGAACAG
GTAATAAAAG
AGGAAACATG
ATCCCCTGTG
GAGATACTGA
GGTTTTCCAG
CAAATGCCAA
CTTCAGAGAA
AAAGAGTTCA
ATCGGAAGkA
CATTTGTTAC
AXAGGAGACC
TTCAAAAGhC
TGATGAATAT
AGAAAAATCC
AACCTATAAG
CACCTAAAAA
TAGTAGTCAG
CTAGCAGTGA
CTGGAGTTGA
GCTGAAACTT
AACCAAAAGG
AATCATTTGT
AAAAAAGGAA
GGGCTACAGA
GGAAACCAGT
GCAGCGGATA
AGATACCGTT
CCCTCAAGGA
TTCTGAGACG
CACTGAGAAG
CTTGCATGTG
CAGTTTATTA
CAAACAGCCT
TAATGATAGG
TGAGAGAAA
AGATGTTCCT
AAGTGATr(AA AGTAGCTG k'
AATAGACTTA
CTCCAAATCA
GGCAAGCCTC
TGAGCCACAG
TACATCAGGC
TCCTGAAATG
TACTAATAGT
TAACCCAATA
CAGCAGTATA
GAATAGGCTG
TAGAAATCTA
AGAGATAAAG
TCAAGGAACC
CTCAACCAGA
AGCCTACAAG
GCTTTTCAGC
A.ATAACTCTC
AACCGTGCCA
CTCAGTGTCC
CAACCTCAAA
AATAAGGCAA
ACCAGGGATG
GATGTAACAA
CGTGCAGCTG
GAGCCATGTG
CTCACTAAAG
GGCTTAGCAA
CGGACTCCCA
GAATGGAATA
TGGATAACAC,
CT1GrYOPG-.T ,,3A-VTGGACG
CTGGCCAGTG
GTAGAGAGTA
CCCAACTTAA
ATAATACAAG
CTTCATCCTG
ATAAATCAGG
GGTCATGAGA
GAATCACTCG
AGCAATATGG
AGGAGGAAGT
AGCCCACCTA
AAAAAAAAGT
TGTCTCCACA
AGAAAGGGCC
AAAGTACGAG
TTGACACAGG
CTGAACATCT
AAAGACTTCT
AACTCTCTAA
AGACGTCTGT
CTTATTGCAG
AAATCAGTTT
ATACTGAACA
AGAGGCATCC
GCACAAATAC
ACAGAATGAA
GGAGCCAACA
GC. 4 :AGAA.AA
AGCAGAAACT
TAAATAGCAG
i'TGATGACTC
TTCTAAATGA
ATCCTCATGA
ATATU -,',AGA
GCCATGTAAC
AGCGTCCCCT
AGGATTTT.AT
GAACTAACCA
ATAAAACAAA
AAAAAGAATC
AACTCGAATT
CTTCTACCAG
ATTGTACTGA
ACAACCAAAT
GTGTGACCAC
TTCACAGTGT
ATTTAGTCAA
TTTGGAGTAT
AAAAGATGAA
ACAGAGTGAA
CCTTGGAACT
CTACATTGAA
TGTGGGAGAT
GGATTCTGCA
TCATCAACCC
AGAAAAGTAT
TCATGCCAGC
TGTAGAAAAG
TAACAGATGG
AAAGGTAGAT
GCCATGCTCA
CATTCAGAAA
ACATGATGGG
GGTAGATGAA
GGCTTTAATA
CAAAATATTT
TGAAAATCTA
CACAAATAAA
CAAGAAAGCA
AACGGAGCAG
AGGTGATTCT
TGCTTTCAAA
AAATATCCAC
GCATATTCAT
ATTGCAAATT
GCCAGTCAGG
240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 WO 96/33271 WO 9633271PCTIUS96/05621
CACAGCAGAA
AACAAGCCAA
TTAACAAATG
GTCAATCCTA
AATAATGCTG
AGATCTGTAG
AGTATCTCGT
GTGAGTCAGT
AATAGAAATG
GAAACAAGCA
AAGGTTTuU-
TGTGCAACAT
(UkATGTGAAC
ACAGTTAATA
GCCAAATGTA
GAAACTGGAC
CCACTTTTTC
AACTTTGAGG
ACAGTGAGCA
AGCAATATTA
cGrTCCAGTG
GCTATGCTTA
AATTGTAAGC
ACAGATTTCT
GCATCTCAGG
GATACTAGTT
CAGAAAGGAG
TACCGAAGAG
GAGCTTCCCT
AC"'AGGCATA
TCATTGAAGA
GAACATCACC
ACCTACAACT CATGGAAGGT AAAGAT.CCTG CAACTGGAGC CAAGAAGAGT
ATGAACAGAC
CACCTGGTTC
GCCTTCCAAG
AAGACCCCAA
AGAGTAGCAG
TACTGGAAGT
GTGCAGCATT
ACACAGAAGG
TAGAAATr(;A AGCGCCA-7
TCTCTGCCCA
AAAAGGAAGA
TCACTGCAGG
GTATCAAAGG
TCATTACTCC
CCATCAAGTC
AAXCATTCAAT
CAATTAGCCG
ATGAAGTAGG
ATGAAAACAT
GATTAGGGGT
ATCCTGAAAT
CTCCATATCT
TTTGTTCTGA
TTGCTGAAA
AGCTTAGCAG
GGGCCAAGAA
GCTTCCAACA
GCACCGTTGC
ATAGCTTAAA
TTAGTGAGGA
AAGTAAAAGA CATGACAGCG
TTTTACTAAG
AGAAGAAAAA
AGATCTCATG
TATTTCATTG
TAGCACTCTA
TGAMAACCCC
CTTTAAGTAT
AGAAAGTGAA
ATTTGCTCCG
CTCTGGGTCC
AAATCAAGGA
CTTTCCTGTrG
AGGCTCTAGG
AAATAAACAT
ATTTGTTAAA
GTCACCTGAA
TAATAACATT
TTCCAGTACT
TCAAGCAGAA
TTTGCAACCT
AAAAAAGCAA
GATTTCAGAT
GACACCTGAT
TGACATTAAG
GAGTCCTAGC
ATTAGAGTCC
CTTGTTATTC
TACCGAGTGT
TGACTGCAGT
AACAAAATGT
TGT C AAATA
GAAGAGAAAC
TTAAGTGGAG
GTACCTGGTA
GGGAAGGCAA
AAGGGACTAA
CCATTGGGAC
CTTGATGCTC
TTTTCAA.ATC
TTAAAGAAAC
AAGAATGAGT
GTTGGTCAGA
TTTTGTCTAT
GGACTTTTAC
ACTAA.ATGTA
AGAGAAATGG
AGAGAAkAATG
AATGAAGTGG
CTAGGTAGAA
GAGGTCTATA
GAATATGAAG
AACTTAGAAC
GACCTGTTAG
GAAAGTTCTG
CCTTTCACCC
TCAGAAGAGA
GGTAAAGTAA
CTGTCTAAGA
AACCAGGTAA
TCTGCTAGCT
ATACTTTCCC
CCAGTGAACT
TAGAAACAGT
AAAGGGTTTT
CTGATTATGG
AAACAGAACC
TTCATGGTTG
ATGAAGTTAA
AGTATTTGCA
CAGGAAATGC
AAAGTCCAA
CTAATATCAA
&PAGATAAGCC
CATCTCAGTT
AAAACCCATA
AGAAAAATCT
GAAATGAGAA
TTTTTAAAGA
GCTCCAGTAT
ACAGAGGGCC
AACAAAGTCT
AAGTAGTTCA
AGCCTATGGG
ATGATGGTGA
CTGTTTTTAG
ATACACATTT
ACTTATCTAG
ACAATATACC
ACACAGAGGA
TATTGGCAAA
TGTTTTCTTC
AGAGCTGAAG
TAAAGAATTT
TAAAGTGTCT
GCAAACTGAA
CACTCAGGAA
AAATAAATGT
TTCCAA.AGAT
CCACAGTCGG
GAATACATTC
AGAAGAGGAA
AGTCACTTTT
GCCTGTACAG
AGTTGATAAT
CAGAGGCAAC
TCGTATACCA
GCTAGAGGAA
CATTCCAAGT
AGCCAGCTCA
TAATGAAATA
AAAATTGAAT
TCCTGGAAGT
GACTGTTAAT
AAGTAGTCAT
AATAAAGGA.A
CAAA.AGCGTC
GGCTCAGGGT
TGAGGATGAA
TTCTCAGTCT
GAATTTATTA
GGCATCTCAG
ACAGTGCAGT
2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020
A"
WO 96133271 PCTIUS96/05621
GAATTGGAAG
AAACAAATGA
TCAGATGATG
GATTCAAACT
TGCTCAGGGC
CATAACCTGA
GGGAGCCAGC
CTGCGAAATC
GAATACCCTA
GATAGTTCTA
CCATCATTAG
TACCCATCTC
TCTGGGCCAC
CCTTACCTGG
GACAGAGCCC
GTTCCCCAAT
ACTGCTGGGT
TCAACAGAAA
TTTATGCTCG
GAAGAGACTA
AAATATTTTC
TCTATTAAAC
AATGGAAGAA
AGGGGGCTAG
TGGATGGTAC
ACAGGTGTCC
CATGCAATTG
GTAGCACTCT
CACTACTGA
ACTTGACTGC
GGCATCAGTC
AAGAAAGAGG
TAGGTGAAGC
TATCCTCTCA
TAAAGCTCCA
CTTCTAACAG
CAGAACAAAG
TAAGCCAGAA
CCAGTAAAAA
ATGATAGGTG
AAC3AGGAGCT
ACGATTTGAC
AATCTGGAAT
CAGAGTCAGC
TGAAAGTTGC
ATAATGCAAT
GGGTCAACAA
TGTACAAGTT
CTCATGTTGT
TAGGAATTGC
AAAGAAAAAT
ACCACCAAGG
AAATCTOTTG
AGCTGTGTGG
ACCCAATTGT
GGCAGATGTG
AAATACAAAC
TGAAAGCCAG
AACGGGCTTG
AGCATCTGGG
GAGTGACATT
GCAGGAAATG
CTACCCTTCC
CACATCAGAA
TCCAGAAGC
TAA.AGAACCA
GTACATGCAC
CATTAAGGTT
GGAA?.CATCT
CAGCCTCTTC
TCGTGTTGGC
AGAATCTGCC
GGAAGAAAGT
AAGAATGTCC
TGCCAGAAAA
TATGAAAACA
GGGAGGAAAA
GCTGAATGAG
TCCAkAGCGA
CTATGGGCCC
TGCTTCTGTG
GGTTGTGCAG
TGAGGCACCT
Ak-WAGGATC
GGAGTTGGTC
GAAGAAAATA
TGTGAGAGTG
TTAACCACTC
GCTGAACTAG
ATCATAAGTG
AAAGCAGTAT
CTTTCTGCTG
GGAGTGGAAA
AGTTGCTCTG
GTTGATGTGG
TACTTGCCAA
TCTGATGACC
AACATACCAT
CAGAGTCCAG
GTGAGCAGGG
ATGGTGGTGT
CACCACATCA
GATCZTGAGT
TGGGTAGTTA
CATGATTTTG
GCAAGAGAAT
TTCACCAACA
GTGAAGGAGC
CCAGATGCCT
GTGGTGACCC
ACCTACCTGA
CTTTCTTGAT
TGAGTGACAA
ATCAAGAAGA
AAACAAGCGT
AGCAGAGGGA
AAGCTGTGTT
ACTCTTCTGC
TAAC..LTCACA
ACAAGTTTGA
GGTCATCCCC
GGAGTCTTCA
AGGAGCAACA
GGCAAGATCT
CTGAATCTGA
CTTCAACCTC
CTGCTGCTCA
AGAAGCCAGA
CTGGCCTGAC
CTTTAACTAA
TTGTGTGTGA
GCTATTTCTG
AAGTCAGAGG
CCCAGGACAG
TGCCCACAGA
TTTCATCATT
GGACAGAGGA
GAGAGTGGGT
TACCCCAGAT
TGGTTCTTCC
GGAATTGGTT
GCAAAGCATG
CTCTGAAGAC
TACCATGCAA
AGAACAGCAT
CCTTGAGGAC
GAAAAGTAGT
GGTGTCTGCA
TTCTAAATGC
GAATAGAAAC
GCTGGAAGAG
AGAGGGAACC
TCCTTCTGAA
TGCATTGAAA
TACTACTGAT
ATTGACAGCT
CCCAGAAGAA
TCTAATTACT
ACGGACACTG
GGTGACCCAG
AGATGTGGTC
AAGATCTTC
TCAACTGGAA
CACCCTTGGC
CAATGGCTTC
GTTGGACAGT
CCCCCACAGC
4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 510.J 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5709 toLl ACCAGTGCCA GGAGCTGGAC INFORMATION FOR SEQ ID NO:3: Wi SEQUENCE CHARACTERISTICS: LENGTH: 5689 base pairs TYPE: nucleic acid p 21 WO 96/33271 WO 9633271PCTIUS96/05621 STRAflDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: CDNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG CCTGCGCTCA GGAGGCCTTC TGGATTTATC TGCTCTTCGC TCTTAGAGTG TCCCATCTGT ACATATTTTG CAAATTTTGC GTCCTTTATG AGCCTACAAG AATCATTTGT GCTTTTCAGC AAAAAAGGAA AATAACTCTC GGGCTACAGA AACCGTGCCA GGAA.ACCAGT CTCAGTGTCC GCAGCGGATA CAACCTCAAA AGATACCGTT AATAAGGCAA CCCTCAAGGA ACCAGGGATG TTCTGAGACG GATGTAACAA CACTGAGAAG CGTGCAGCTG CTTGCA.TGTG GAGCCATGTG CAGTTTATTA CTCACTAAAG CAAACAGCCT GGCTTAGCAA TCAATGATAGG CGGACTCCCA TGAGAGAAAA GAATGGAATA AGATGTTCCT TGGATAACAC AAGTGATGAA CTGTTAGGTT AGTAGCTGAT GTATTGGACG AATAGACTTA CTGGCCAGTG CTCCAAATCA GTAGAGAGTA GGCAAGCCTC CCCAACTTAA TGAGCCACAG ATAATACAAG TACATCAGGC CTTCATCCTG
ACCCTCTGCT
GTTGA.AGAAG
CTGGAGTTGA
ATGCTGAAAC
AAAGTACGAG
TTGACACAGG
CTGAACATCT
AAAGACTTCT
AACTCTC1TAA AGACGTC'2GT
CTTATTGCAG
AAATCAGTTT
ATACTGAACA
AGAGGCATCC
GCACAAATAC
ACAGAATGAA
GGAGCCAACA
GCACAGAAAA
AGCAGAAACT
TAAATAGCAG
CTGATGACTC
TTCTAAATGA
ATCCTCATGA
ATATTGAAGA
GCCATGTAAC
AGCGTCCCCT
AGGATTTTAT
CTGGGT AAAG
TACAAA-ATGT
TCAAGGAACC
TTCTCAACCA
ATTTAGTCAA
TTTGGAGTAT
AAAAGATGAA
ACAGAGTGAA
CCTTGGAACT
CTACATTGAA
TGTGGGAGAT
GGATTCTGCA
TCATCAACCC
AGAAAAGTAT
TCATGCCAGC
TGTAGAAAAG
TAACAGATGG
AAAGGTAGAT
GCCATGCTCA
CATTCAGAAA
ACATGATGGG
GGTAGATGAA
GGCTTTAATA
CAAAATATTT
TGAAAATCTA
CACAAATAAA
CAAGAAAGCA
GTTTCTCAGA
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
CTTGTTGAAG
GCAAACAGCT
GTTTCTATCA
CCC-.AAAATC
GTGAGAACTC
TTGGGATCTG
CAAGAATTGT
AAAAAGGCTG
AGTAATAATG
CAGGGTAGTT
TCATTACAGC
GCTGAATTCT
GCTGGAAGTA
CTGAATGCTG
GAGAATCCTA
GTTAATGAGT
GAGTCTGAAT
TATTCTGGTT
TGTAAAAGTG
GGGAAAACCT
ATTATAGGAG
TTAAAGCGTA
GATTTGGCAG
TAACTGGGCC
CAGAAAGAAA
ATGCAGAAAA
A.AGTGTGACC
CCTTCACAGT
AGCTATTGAA
ATAATTTTGC
TCCAAAGTAT
CTTCCTTGCA
TGAGGACAAA
ATTCTTCTGA
TACAAATCAC
CTTGTGAATT
ATTTGAACAC
CTGTTTCAAA
ATGAGAACAG
GTAATAAAAG
AGGAAACATG
ATCCCCTGTG
GAC-ATACTGA
GGTTTTCCAG
CAAATGCCAA
CTTCAGAGAA
AAAGAGTTCA
ATCGGAAGAA
CATTTGTTAC
AAAGGAGACC
TTCAAAAGAC
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 WO 96/3327 1 PCTIUS96/05621
TCCTGAAATG
TACTAATAGT
TAACCCAATA
CAGCAGTATA
GAATAGGCTG
TAGAAATCTA
AGAGATAAAG
CATGGAAGGT
AAGTAAAAGA
TTTTACTAAG
AGAAGAAAAA
AGATCTCATG
TATTTCATTG
TAGCACTCTA
TGAAAACCCC
CTTTAAGTAT
AGAAAGTGAA
ATTTGCTCCG
CTCTGGGTCC
AAATCAAGGA
CTTTCCTGTG
AGGCTCTAGG
AAATAAACAT
ATTTGTTAAA
GTCACCTGAA
TAATAACATT
TTCCAGTACT
TCAAGCAGAA
TTTGCAACCT
AAAAAAGCAA
GATTTCAGAT
GACACCTGAT
ATAAATCAGG
GGTCATGAGA
GAATCACTCG
AGCAATATGG
AGGAGGAAGT
AGCCCACCTA
AAAAAAAGT
AAAGAACCTG
CATGACAGCG
TGTTCAAATA
GAAGAGAAAC
TTAAGTGGAG
GTACCTGGTA
GGGAAGGCA
AAGGGACTAA
CCATTGGGAC
CTTGATGCTC
TTTTCAAATC
TTAAAGAAAC
AAGAATGAGT
GTTGGTCAGA
TTTTGTCTAT
GGACTTTTAC
ACTAAATGTA
AGAGAAATGG
AGAGAAAATG
AATGAAGTGG
CTAGGTAGAA
GAGGTCTATA
GAATATGAAG
AACTTAGAAC
GACCTGTTAG
GAACTAACCA
ATAAA.AC:AA
AAAAAGAATC
AACTCGAATT
CTTCTACCAG
ATTGTACTGA
ACAACCAAAT
CAACTGGAGC
ATACTTTCCC
CCAGTGAACT
TAGAAACAGT
AAAGGGTTTT
CTGATTATGG
AAACAGAACC
TTCATGGTTG
ATGAAGTTAA
AGTATTTGCA
CAGGAAATGC
AAAGTCCAAA
CTAATATCAA
AAGATAAGCC
CATCTCAGTT
AAAACCCATA
AGAAAAATCT
GAAATGAGA-k
TTTTTAAAGA
GCTCCAGTAT
ACAGAGGGCC
AACAAAGTCT
AAGTAGTTCA
AGCCTATGGG
ATGATGGTGA
AACGGAGCAG AATGGTCAAG AGGTGATTCT ATTCAGAATG TGCTTTCAAA ACGAAAGCTG AAATATCCAC PATTCAAAAG
GCATATTCAT
ATTGCAJAATT
GCCAGTCAGG
CAAGAAGAGT
AGAGCTGAAG
TAAAGAATTT
TAAAGTGTCT
GCAAACTGAA
CACTCAGGAA
AAATAAATGT
TTCCAAAGAT
CCACAGTCGG
GAATACATTC
AGAAGAGGAA
AGTCACTTTT
GCCTGTACAG
AGTTGATAAT
CAGAGGCAAC
TCGTATACCA
GCTAGAGGAA
CATTCCAAGT
AGCCAGCTCA
TAATGAAATA
AAAATTGAAT
TCCTGGAAGT
GACTGTTAAT
AAGTAGTCAT
AATAAAGGAA
GCGCTTGAAC
GATAGTTGTT
CACAGCAGAA
AACAAGCCAA
TTAACAAATG
GTCAATCCTA
AATAATGCTG
AGATCTGTAG
AGTATCTCGT
GTGAGTCAGT
AATAGAAATG
GAAACAAGCA
AAGGTTTCAA
TGTGCAACAT
GAATGTGAAC
ACAGTTAATA
GCCAAATGTA
GAAACTGGAC
CCACTT'TTC
AACTTTGAGG
ACAGTGAGCA
AGCAATATTA
GGTTCCAGTG
GCTATGCTTA
AATTGTAAGC
ACAGATTTCT
GCATCTCAGG
GATACTAGTT
TGATGAATAT
AGAAAAATCC
AACCTATAAG
CACCTAAAAA
TAGTAGTCAG
CTAGCAGTGA
ACCTACAACT
ATGAACAGAC
CACCTGGTTC
GCCTTCCAAG
A.AGACCCCAA
AGAGTAGCAG
TACTGGAAGT
GTGCAGCATT
ACACAGAAGG
TAGAAATGGA
AGCGCCAGTC
TCTCTGCCCA
AAAAGGAAGA
TCACTGCAGG
GTATCAAAGG
TCATTACTCC
CCATCAAGTC
AACATTCAAT
CAATTAGCCG
ATGAAGTAGG
ATGAAAkACAT
GATTAGGGGT
ATCCTGAAAT
CTCCATATCT
TTTGTTCTGA
TTGCTGAAAA
1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 WO 96/33271 PTU9/52 PCTIUS96105621
TGACATTAAG
GAGTCCTAGC
ATTAGAGTCC
CTTGTTATTT
TACCGAGTGT
TGACTGCAGT
AACAAAATGT
AAATACAAAC
TGAAAGCCAG
AACGGGCTTG
AGCATCTGGG
GAGTGACATT
GCAGGAAATG
CTACCCTTCC
CACATCZLGAA
TCCAGAAGGC
TAAAACCA
GTACATGCAC
CATTAAGGTT
GGAAACATCT
CAGCCTCTTC
TCGTGTTGGC
AGAATfCTGCC
GGAAGAAAGT
A.AGAATGTCC
TGCCAGAAAA
TATGAAAACA
GGGAGGAAAA
GCTGAATGAG
TCCAAAGCGA
CTATGGGCCC
TGCTTCTGTG
GAAAGTTCTG
CCTTTCACCC
TCAGAAGAGA
GGTA.AAGTAA
C .'GTCTAAGA
AACCAGGTAA
TCTGCTAGCT
ACCCAGGATC
GGAGTTGGTC
GAAGAAAATA
TGTGAGAGTG
TTAACCACTC
GCTGAACTAG
ATCATAAGTG
AAAGCAGTAT
CTTTCTGCTG
GGAGTCGAA
AGTTGCTCTG
GTTGATGTGG
TACTTGCCAA
TCTGATGACC
AACATACCAT
CAGAGTCCAG
GTGAGCAGGG
ATGGTGGTGT
CACCACATCA
GATGCTGAGT
TGGGTAGTTA
CATGATTTTG
GCAAGAGAAT
TTCACCAACA
GT1GAAGGAGC
CTGTTTTTAG
ATACACATTT
ACTTATCTAG
ACAATATACC
ACACAGAGGA
TATTGGCAAA
TGTTTTCTTC
CTTTCTTGAT
TGAGTGACAA
ATCAAGAAGA
AAACAAGCGT
AGCAGAGGGA
A.AGCTGTGTT
ACTCTTCTGC
TAACTTCACA
ACAAGTTTGA
GGTCATCCCC
GGAGTCTTCA
AGGAGCAACA
GGCAAGATCT
CTGAATCTGA
CTTCAACCTC
CTGCTGCTCA
AGAAGCCAGA
CTGGCCTGAC
CTTTAACTAA
TTGTGTGTGA
GCTATTTCTG
AAGTCAGAGG
CCCAGGACAG
TGCCCACAGA
TTTCATCATT
CAAAAGCGTC
GGCTCAGGGT
TGAGGATGAA
TTCTCAGTCT
GAATTTATTA
GGCATCTCAG
ACAGTGCAGT
TGGTTCTTCC
GGAATTGGTT
GCAAAGCATG
CTCTGAAGAC
TACCATGCAA
AGAACAGCAT
CCTTGAGGAC
GAAAAGTAGT
GGTGTCTGCA
TTCTAAATGC
GAATAGAAAC
GCTGGAAGAG
AGAGGGAACC
TCCTTCTGAA
TGCATTG.AA
TACTArTGAT
ATTGACAGCT
CCCAGAAGAA
TCTAATTACT
ACGGACACTG
GGTGACCCAG
AGATGTGGTC
AAAGATCTTC
TCAACTGGAA
CACCCTTGGC
CAGAA.AGGAG
TACCGAAGAG
GAGCTTCCCT
ACTAGGCATA
TCATTGAAGA
GAACATCACC
Gi-, ,TGGAAG
AACAAATGA
TCAGATGATG
GATTCAAACT
TGCTCAGGGC
CATAACCTGA
GGGAGCCAGC
CTGCGAAATC
GAAIACCCTA
GATAGTTCTA
CCATCATTAG
TACCCATCTC
TCTGGGCCAC
CCTTACCTGG
GACAGAGCCC
GTTCCCCAAT
ACTGCTGGGT
TCAACAGAAA
TTTATGCTCG
GAAGAGACTA
AAATATTTTC
TCTATTAAAG
AATGGAAGAA
AGGGGGCTAG
TGGATGGTAC
ACAGGTGTCC
AGCTTAGCAG
GGGCCAAGAA
GCTTCCAACA
GCACCGTTGC
ATAGCTTAAA
TTAGTGAGGA
ACTTGACTGC
GGCATCAGTC
AAGAAAGAGG
TAGGTGAAGC
TATCCTCTCA
TAAAGCTCCA
CTTCTAACAG
CAGAACAAAG
TAAGCCAGAA
CCAGTAAAAA
ATGATAGGTG
AAGAGGAGCT
ACGATTTGAC
AATCTGGAAT
CAGAGTCAGC
TGAAAGTTGC
ATAATGCAAT
GGGTCAACAA
TGTACAAGTT
CTCATGTTGT
TAGGAATTGC
AAAGAAAAAT
AC CACCAAGG
AAATCTGTTG
.AGCTGIGTGG
ACCCAATTGT
3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 Ii' p WO 96/33271
PCTJUS
GGTTGTGCAG CCAGATGCCT GGACAGAGGA CA.ATGGCTTC CATGCAATTG GGCAGATGTG TGAGGCACCT GTGGTGACCC GAGAGTGGGT GTTGGACAGT GTAGCACTCT ACCAGTGCCA GGAGCTGGAC ACCTP.CCTGA TACCCCAGAT CCCCCACAGC CACTACTGA INFORMATION F I SEQ ID NO: 4: SEQUENCE CHARACTERISTICS: LENGTH: 5711 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 96/05621 AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG
CCTGCGCTCA
TGGATTTATC
TCTTAGAGTG
ACATATTTTG
GTCCTC2TATG
AACTTGTTGA.
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
?-CAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGGCTGGAAG
ATCTGAIXTGC
CAGAGAATCC
AAGTTAATGA
GGGAGTCTGA
GGAGGCCTTC
TGCTCTTCGC
TCCCATCTGT
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGA'
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
TAAGGAAACA
TGATCCCCTG
TAGAGATACT
GTGGTTTTCC
ATCAAATGCC
ACCCTCTGCT
GTTGAAGAAG
CTGGAGTTGA
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
TGTAATGAT.A
TGTGAGAGAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
CTGGGTAAAG
TACAAAATGT
TCAAGGAACC
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
GGCGGACTCC
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
GTTTCTCAGA
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGCACAGAA
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TAACTGGGCC
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGG
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
AAAAAGGTAG
CTGCCATGCT
AGCATTCAGA
TCACATGATG
GAGGTAGATG
5580 5640 5689 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380
AATATTCTGG
TATGTAAAAG
TTGGGAAAAC
TAATTATAGG
AATTAAAGCG
CAGATTTGGC
AGAATGGTCA
CTATTCAGAA
AAACGAAAGC
ACAATTCAAA
ATGCGCTTGA
TTGATAGTTG
GGCACAGCAG
GTAACAAGCC
AGTTAACAAA
TTGTCAATCC
CTAATAATGC
AAAGATCTGT
AAAGTATCTC
GTGTGAGTCA
ATAATAGAAA
GGGAAACAAG
TCAAGGTTTC
AATGTGCAAC
TTGAATGTGA
AGACAGTTAA
ATGCCAAATG
ACGAAACTGG
CACCACTTTT
AAAACTTTGA
GTACAGTGAG
CAAGCAATAT
TTCTTCAGAG
TGAAAGAGTT
CTATCGGAAG
AGCATTTGTT
TAAAAGGAGA
AGTTCAAAAG
AGTGATGAAT
TGAGAAAAAT
TGAACCTATA
AGCACCTAAA
ACTAGTAGTC
TTCTAGCAGT
AAACCTACAA
AAATGAACAG
TGCACCTGGT
TAGCCTTCCA
TGAAGACCCC
AGAGAGTAGC
GTTACTGGAA
GTGTGCAGCA
TGACACAGAA
CATAGAAATG
AAAGCGCCAG
ATTCTCTGCC
ACAAAAGGAA
TATCACTGCA
TAGTATCAAA
ACTCATTACT
TCCCATCAAG
GGAACATTCA
CACAATTAGC
TAATGAAGTA
AAAATAGACT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
CTCATGGAAG
ACAAGTAAAA
TCTTTTACTA
AGAGAAGAAA
AAAGATCTCA
AGTATTTCAT
GTTAGCACTC
TTTGAAAACC
GGCTTTAAGT
GAAGAAAGTG
TCATTTGCTC
CACTCTGGGT
GAAAATCAAG
GGCTTTCCTG
GGAGGCTCTA
CCAAATAAAC
TCATTTGTTA
ATGTCACCTG
CGTAATAACA
GGTTCCAGTA
TACTGGCCAG
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GTAAAGAACC
GACATGACAG
AGTGTTCAAA
AAGAAGAGAA
TGTTAAGTGG
TGGTACCTGG
TAGGGAAGGC
CCAAGGGACT
ATCCATTGGG
AACTTGATGC
CGTTTTCAAA
CCTTAAAGAA
GAAAGAATGA
TGGTTGGTCA
GGTTTTGTCT
ATGGACTTTT
AAACTAAATG
AAAGAGAAAT
TTAGAGAAAA
CTAATGAAGT
TGATCCTCAT
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
TGCAACTGGA
CGATACTTTC
TACCAGTGAA
ACTAGAAACA
AGAAAGGGTT
TACTGATTAT
AAAAACAGAA
AATTCATGGT
ACATGAAGTT
TCAGTATTTG
TCCAGGAAAT
ACAAAGTCCA
GTCTAATATC
GAAAGATAAG
ATCATCTCAG
ACAAAACCCA
TAAGAAAAAT
GGGAAATGAG
TGTTTTTAAA
GGGCTCCAGT
GAGGCTTTAA
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
GCCAAGAAGA
CCAGAGCTGA
CTTAAAGAAT
GTTAAAGTGT
TTGCAAACTG
GGCACTCAGG
CCAAATAAAT
TGTTCCAAAG
AACCACAGTC
CAGAATACAT
GCAGAAGAGG
AAAGTCACTT
AAGCCTGTAC
CCAGTTGATA
TTCAGAGGCA
TATCGTATAC
CTGCTAGAGG
AACATTCCAA
GAAGCCAGCT
ATTAATGAAA
1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300
TAGGTTCCAG
ATGCTATGCT
GTAATTGTAA
ATACAGATTT
ATGCATCTCA
AAGATACTAG
TCCAGAAAGG
GTTACCGAAG
AAGAGCTTCC
CTACTAGGCA
TATCATTGAA
AGGAACATCA
GTGAATTGGA
CCAAACAAAT
TTTCAGATGA
TGGATTCAAA
ACTGCTCAGG
AACATAACCT
ATGGGAGCCA
ACCTGCGAAA
GTGAATACCC
CAGATAGTTC
GCCCATCATT
ACTACCCATC
AGTCTGGGCC
CCCCTTACCT
AAGACAGAGC
AAGTTCCCCA
ATACTGCTGG
CTTCAACAGA
AATTTATGCT
CTGAAGAGAC
TGATGAAAAC
TAGATTAGGG
GCATCCTGAA
CTCTCCATAT
GGTTTGTTCT
TTTTGCTGAA
AGAGCTTAGC
AGGGGCCAAG
CTGCTTCCAA
TAGCACCGTT
GAATAGCTTA
CCTTAGTGAG
AGACTTGACT
GAGGCATCAG
TGAAGAAAGA
CTTAGGTGAA
GCTATCCTCT
GATAAAGCTC
GCCTTCTAAC
TCCAGAACAA
TATAAGCCAG
TACCAGTAAA
AGATGATAGG
TCAAGAGGAG
ACACGATTTG
GGAATCTGGA
CCCAGAGTCA
ATTGAAAGTT
GTATAATGCA
AAGGGTCAAC
CGTGTACAAG
TACTCATGTT
ATTCAAGCAG
GTTTTGCAAC
ATAAAAAAGC
CTGATTTCAG
GAGACACCTG
AATGACATTA
AGGAGTCCTA
AAATTAGAGT
CACTTGTTAT
GCTACCGAGT
AATGACTGCA
GAAACAAAAT
GCAAATACAA
TCTGAAAGCC
GGAACGGGCT
GCAGCATCTG
CAGAGTGACA
CAGCAGGAAA
AGCTACCCTT
AGCACATCAG
AATCCAGAAG
AATAAAGAAC
TGGTACATGC
CTCATTAAGG
ACGGAAACAT
ATCAGCCTCT
GCTCGTGTTG
GCAGAATCTG
ATGGAAGAAA
AAAAGAATGT
AACTAGGTAG
CTGAGGTCTA
AAGAATATGA
ATAACTTAGA
ATGACCTGTT
AGGAAAGTTC
GCCCTTTCAC
CCTCAGAAGA
TTGGTAAAGT
GTCTGTCTAA
GTAACCAGGT
GTTCTGCTAG
ACACCCAGGA
AGGGAGTTGG
TGGAAGAAAA
GGTGTGAGAG
TTTTAACCAC
TGGCTGAACT
CCATCATAAG
AAAAAGCAGT
GCCTTTCTGC
CAGGAGTGGA
ACAGTTGCTC
TTGTTGATGT
CTTACTTGCC
TCTCTGATGA
GCAACATACC
CCCAGAGTCC
GTGTGAGCAG
CCATGGTGGT
AAACAGAGGG
TAAACAAAGT
AGAAGTAGTT
ACAGCCTATG
AGATGATGGT
TGCTGTTTTT
CCATACACAT
GAACTTATCT
AAACAATATA
GAACACAGAG
AATATTGGCA
CTTGTTTTCT
TCCTTTCTTG
TCTGAGTGAC
TAATCAAGAA
TGAAACAAGC
TCAGCAGAGG
AGAAGCTGTG
TGACTCTTCT
ATTAACTTCA
TGACAAGTTT
AAGGTCATCC
TGGGAGTCTT
GGAGGAGCAA
AAGGCAAGAT
CCCTGAATCT
ATCTTCAACC
AGCTGCTGCT
GGAGAAGCCA
GTCTGGCCTG
CACTTTAACT
GTTTGTGTGT
CCAAAATTGA
CTTCCTGGAA
CAGACTGTTA
GGAAGTAGTC
GAAATAAAGG
AGCAAAAGCG
TTGGCTCAGG
AGTGAGGATG
CCTTCTCAGT
GAGAATTTAT
AAGGCATCTC
TCACAGTGCA
ATTGGTTCTT
AAGGAATTGG
GAGCAAAGCA
GTCTCTGAAG
GATACCATGC
TTAGAACAGC
GCCCTTGAGG
CAGAAAAGTA
GAGGTGTCTG
CCTTCTAAAT
CAGAATAGAA
CAGCTGGAAG
CTAGAGGGAA
GATCCTTCTG
TCTGCATTGA
CATACTACTG
GAATTGACAG
ACCCCAGAAG
AATCTAATTA
GAACGGACAC
3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220
TTTGCCAGAA AACACCACAT GTTATGAAAA CAGATGCTGA
TGAAATATTT TCTAGGAATT GCGGGAGGAA AATGGGTAGT TAGCTATTTC TGGGTGACCC
AGTCTATTAA AGAAAGAAAA ATGCTGAATG AGCATGATTT TGAAGTCAGA GGAGATGTGC
TCAATGGAAG AAACCACCAA GGTCCAAAGC GAGCAAGAGA ATCCCAGGAC AGAAAGATCT
TCAGGGGGCT AGAAATCTGT TGCTATGGGC CCTTCACCAA CATGCCCACA GATCAACTGC
AATGGATGGT ACAGCTGTGT GGTGCTTCTG TGGTGAAGGA GCTTTCATCA TTCACCCTTG
GCACAGGTGT CCACCCAATT GTGGTTGTGC AGCCAGATGC CTGGACAGAG GACAATGGCT
TCCATGCAAT TGGGCAGATC TGTGAGGCAC CTGTGGTGAC CCGAGAGTGG GTGTTGGACA
GTGTAGCACT CTACCAGTGC CAGGAGCTGG ACACCTACCT GATACCCCAG ATCCCCCACA
GCCACTACTG A
INFORMATION FOR SEQ ID NO:5:
SEQUENCE CHARACTERISTICS:
LENGTH: 59 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TGTCCTTAAA AGGTTGATAA TCACTTGCTG AGTGTGTTTC TCAAACAAGT TAATTTCAG
INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS:
LENGTH: 5710 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
5280 5340 5400 5460 5520 5580 5640 5700 5711 59
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG
CCTGCGCTCA
TGGATTTATC
TCTTAGAGTG
ACATATTTTG
GTCCTTTATG
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
GGAGGCCTTC
TGCTCTTCGC
TCCCATCTGT
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
ACCCTCTGCT
GTTGAAGAAG
CTGGAGTTGA
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
CTGGGTAAAG
TACAAAATGT
TCAAGGAACC
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
GTTTCTCAGA
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
TAACTGGGCC
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGT
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
CTGTGAGAAC
AATTGGGATC
ATCAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGCTGGAAGT
TCTGAATGCT
AGAGAATCCT
AGTTAATGAG
GGAGTCTGAA
ATATTCTGGT
ATGTAAAAGT
TGGGAAAACC
AATTATAGGA
ATTAAAGCGT
AGATTTGGCA
GAATGGTCAA
TATTCAGAAT
AACGAAAGCT
CAATTCAAAA
TGCGCTTGAA
TGATAGTTGT
GCACAGCAGA
TAACAAGCCA
GTTAACAAAT
TGTCAATCCT
TAATAATGCT
AAGATCTGTA
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGAA
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
AAGGAAACAT
GATCCCCTGT
AGAGATACTG
TGGTTTTCCA
TCAAATGCCA
TCTTCAGAGA
GAAAGAGTTC
TATCGGAAGA
GCATTTGTTA
AAAAGGAGAC
GTTCAAAAGA
GTGATGAATA
GAGAAAAATC
GAACCTATAA
GCACCTAAAA
CTAGTAGTCA
TCTAGCAGTG
AACCTACAAC
AATGAACAGA
GCACCTGGTT
AGCCTTCCAA
GAAGACCCCA
GAGAGTAGCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACGCC
GTAATGATAG
GTGAGAGAAA
AAGATGTTCC
GAAGTGATGA
AAGTAGCTGA
AAATAGACTT
ACTCCAAATC
AGGCAAGCCT
CTGAGCCACA
CTACATCAGG
CTCCTGAAAT
TTACTAATAG
CTAACCCAAT
GCAGCAGTAT
AGAATAGGCT
GTAGAAATCT
AAGAGATAAA
TCATGGAAGG
CAAGTAAAAG
CTTTTACTAA
GAGAAGAAAA
AAGATCTCAT
GTATTTCATT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
TGGCTTAGCA
GCGGACTCCC
AGAATGGAAT
TTGGATAACA
ACTGTTAGGT
TGTATTGGAC
ACTGGCCAGT
AGTAGAGAGT
CCCCAACTTA
GATAATACAA
CCTTCATCCT
GATAAATCAG
TGGTCATGAG
AGAATCACTC
AAGCAATATG
GAGGAGGAAG
AAGCCCACCT
GAAAAAAAAG
TAAAGAACCT
ACATGACAGC
GTGTTCAAAT
AGAAGAGAAA
GTTAAGTGGA
GGTACCTGGT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AGGAGCCAAC
AGCACAGAAA
AAGCAGAAAC
CTAAATAGCA
TCTGATGACT
GTTCTAAATG
GATCCTCATG
AATATTGAAG
AGCCATGTAA
GAGCGTCCCC
GAGGATTTTA
GGAACTAACC
AATAAAACAA
GAAAAAGAAT
GAACTCGAAT
TCTTCTACCA
AATTGTACTG
TACAACCAAA
GCAACTGGAG
GATACTTTCC
ACCAGTGAAC
CTAGAAACAG
GAAAGGGTTT
ACTGATTATG
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
ATAACAGATG
AAAAGGTAGA
TGCCATGCTC
GCATTCAGAA
CACATGATGG
AGGTAGATGA
AGGCTTTAAT
ACAAAATATT
CTGAAAATCT
TCACAAATAA
TCAAGAAAGC
AAACGGAGCA
AAGGTGATTC
CTGCTTTCAA
TAAATATCCA
GGCATATTCA
AATTGCAAAT
TGCCAGTCAG
CCAAGAAGAG
CAGAGCTGAA
TTAAAGAATT
TTAAAGTGTC
TGCAAACTGA
GCACTCAGGA
660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520
AAGTATCTCG TTACTGGAAG TTAGCACTCT AGGGAAGGCA AAAACAGAAC CAAATAAATG
TGTGAGTCAG
TAATAGAAAT
GGAAACAAGC
CAAGGTTTCA
ATGTGCAACA
TGAATGTGAA
GACAGTTAAT
TGCCAAATGT
CGAAACTGGA
ACCACTTTTT
AAACTTTGAG
TACAGTGAGC
AAGCAATATT
AGGTTCCAGT
TGCTATGCTT
TAATTGTAAG
TACAGATTTC
TGCATCTCAG
AGATACTAGT
CCAGAAAGGA
TTACCGAAGA
AGAGCTTCCC
TACTAGGCAT
ATCATTGAAG
GGAACATCAC
TGAATTGGAA
CAAACAAATG
TTCAGATGAT
GGATTCAAAC
CTGCTCAGGG
ACATAACCTG
TGTGCAGCAT
GACACAGAAG
ATAGAAATGG
AAGCGCCAGT
TTCTCTGCCC
CAAAAGGAAG
ATCACTGCAG
AGTATCAAAG
CTCATTACTC
CCCATCAAGT
GAACATTCAA
ACAATTAGCC
AATGAAGTAG
GATGAAAACA
AGATTAGGGG
CATCCTGAAA
TCTCCATATC
GTTTGTTCTG
TTTGCTGAAA
GAGCTTAGCA
GGGGCCAAGA
TGCTTCCAAC
AGCACCGTTG
AATAGCTTAA
CTTAGTGAGG
GACTTGACTG
AGGCATCAGT
GAAGAAAGAG
TTAGGTGAAG
CTATCCTCTC
ATAAAGCTCC
TTGAAAACCC
GCTTTAAGTA
AAGAAAGTGA
CATTTGCTCC
ACTCTGGGTC
AAAATCAAGG
GCTTTCCTGT
GAGGCTCTAG
CAAATAAACA
CATTTGTTAA
TGTCACCTGA
GTAATAACAT
GTTCCAGTAC
TTCAAGCAGA
TTTTGCAACC
TAAAAAAGCA
TGATTTCAGA
AGACACCTGA
ATGACATTAA
GGAGTCCTAG
AATTAGAGTC
ACTTGTTATT
CTACCGAGTG
ATGACTGCAG
AAACAAAATG
CAAATACAAA
CTGAAAGCCA
GAACGGGCTT
CAGCATCTGG
AGAGTGACAT
AGCAGGAAAT
CAAGGGACTA
TCCATTGGGA
ACTTGATGCT
GTTTTCAAAT
CTTAAAGAAA
AAAGAATGAG
GGTTGGTCAG
GTTTTGTCTA
TGGACTTTTA
AACTAAATGT
AAGAGAAATG
TAGAGAAAAT
TAATGAAGTG
ATTCATGGTT
CATGAAGTTA
CAGTATTTGC
CCAGGAAATG
CAAAGTCCAA
TCTAATATCA
AAAGATAAGC
TCATCTCAGT
CAAAACCCAT
AAGAAAAATC
GGAAATGAGA
GTTTTTAAAG
GGCTCCAGTA
ACTAGGTAGA AACAGAGGGC TGAGGTCTAT AAACAAAGTC AGAATATGAA GAAGTAGTTC
GTTCCAAAGA
ACCACAGTCG
AGAATACATT
CAGAAGAGGA
AAGTCACTTT
AGCCTGTACA
CAGTTGATAA
TCAGAGGCAA
ATCGTATACC
TGCTAGAGGA
ACATTCCAAG
AAGCCAGCTC
TTAATGAAAT
CAAAATTGAA
TTCCTGGAAG
AGACTGTTAA
GAAGTAGTCA
AAATAAAGGA
GCAAAAGCGT
TGGCTCAGGG
GTGAGGATGA
CTTCTCAGTC
AGAATTTATT
AGGCATCTCA
CACAGTGCAG
TTGGTTCTTC
AGGAATTGGT
AGCAAAGCAT
TCTCTGAAGA
ATACCATGCA
TAGAACAGCA
TAACTTAGAA
TGACCTGTTA
GGAAAGTTCT
CCCTTTCACC
CTCAGAAGAG
TGGTAAAGTA
TCTGTCTAAG
TAACCAGGTA
TTCTGCTAGC
CACCCAGGAT
GGGAGTTGGT
GGAAGAAAAT
GTGTGAGAGT
TTTAACCACT
GGCTGAACTA
CAGCCTATGG
GATGATGGTG
GCTGTTTTTA
CATACACATT
AACTTATCTA
AACAATATAC
AACACAGAGG
ATATTGGCAA
TTGTTTTCTT
CCTTTCTTGA
CTGAGTGACA
AATCAAGAAG
GAAACAAGCG
CAGCAGAGGG
GAAGCTGTGT
2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440
TGGGAGCCAG CCTTCTAACA GCTACCCTTC CATCATAAGT GACTCTTCTG CCCTTGAGGA
CCTGCGAAAT
TGAATACCCT
AGATAGTTCT
CCCATCATTA
CTACCCATCT
GTCTGGGCCA
CCCTTACCTG
AGACAGAGCC
AGTTCCCCAA
TACTGCTGGG
TTCAACAGAA
ATTTATGCTC
TGAAGAGACT
GAAATATTTT
GTCTATTAAA
CAATGGAAGA
CAGGGGGCTA
ATGGATGGTA
CACAGGTGTC
CCATGCAATT
TGTAGCACTC
CCAGAACAAA
ATAAGCCAGA
ACCAGTAAAA
GATGATAGGT
CAAGAGGAGC
CACGATTTGA
GAATCTGGAA
CCAGAGTCAG
TTGAAAGTTG
TATAATGCAA
AGGGTCAACA
GTGTACAAGT
ACTCATGTTG
CTAGGAATTG
GAAAGAAAAA
AACCACCAAG
GAAATCTGTT
CAGCTGTGTG
CACCCAATTG
GGGCAGATGT
TACCAGTGCC
GCACATCAGA
ATCCAGAAGG
ATAAAGAACC
GGTACATGCA
TCATTAAGGT
CGGAAACATC
TCAGCCTCTT
CTCGTGTTGG
CAGAATCTGC
TGGAAGAAAG
AAAGAATGTC
TTGCCAGAAA
TTATGAAAAC
CGGGAGGAAA
TGCTGAATGA
GTCCAAAGCG
GCTATGGGCC
GTGCTTCTGT
TGGTTGTGCA
GTGAGGCACC
AGGAGCTGGA
AAAAGCAGTA
CCTTTCTGCT
AGGAGTGGAA
CAGTTGCTCT
TGTTGATGTG
TTACTTGCCA
CTCTGATGAC
CAACATACCA
CCAGAGTCCA
TGTGAGCAGG
CATGGTGGTG
ACACCACATC
AGATGCTGAG
ATGGGTAGTT
GCATGATTTT
AGCAAGAGAA
CTTCACCAAC
GGTGAAGGAG
GCCAGATGCC
TGTGGTGACC
CACCTACCTG
TTAACTTCAC
GACAAGTTTG
AGGTCATCCC
GGGAGTCTTC
GAGGAGCAAC
AGGCAAGATC
CCTGAATCTG
TCTTCAACCT
GCTGCTGCTC
GAGAAGCCAG
TCTGGCCTGA
ACTTTAACTA
TTTGTGTGTG
AGCTATTTCT
GAAGTCAGAG
TCCCAGGACA
ATGCCCACAG
CTTTCATCAT
TGGACAGAGG
CGAGAGTGGG
ATACCCCAGA
AGAAAAGTAG
AGGTGTCTGC
CTTCTAAATG
AGAATAGAAA
AGCTGGAAGA
TAGAGGGAAC
ATCCTTCTGA
CTGCATTGAA
ATACTACTGA
AATTGACAGC
CCCCAGAAGA
ATCTAATTAC
AACGGACACT
GGGTGACCCA
GAGATGTGGT
GAAAGATCTT
ATCAACTGGA
TCACCCTTGG
ACAATGGCTT
TGTTGGACAG
TCCCCCACAG
4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5710
CCACTACTGA
INFORMATION FOR SEQ ID NO:7:
SEQUENCE CHARACTERISTICS:
LENGTH: 5709 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC
CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG TTCATTGGAA CAGAAAGAAA
TGGATTTATC TGCTCTTCGC GTTGAAGAAG TACAAAATGT CATTAATGCT ATGCAGAAAA
TCTTAGAGTG TCCCATCTGT CTGGAGTTGA TCAAGGAACC TGTCTCCACA AAGTGTGACC
ACATATTTTG
GTCCTTTATG
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
ATCAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGGCTGGAAG
ATCTGAATGC
CAGAGAATCC
AAGTTAATGA
GGGAGTCTGA
AATATTCTGG
TATGTAAAAG
TTGGGAAAAC
TAATTATAGG
AATTAAAGCG
CAGATTTGGC
AGAATGGTCA
CTATTCAGAA
AAACGAAAGC
ACAATTCAAA
ATGCGCTTGA
TTGATAGTTG
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGAA
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
TAAGGAAACA
TGATCCCCTG
TAGAGATACT
GTGGTTTTCC
ATCAAATGCC
TTCTTCAGAG
TGAAAGAGTT
CTATCGGAAG
AGCATTTGTT
TAAAAGGAGA
AGTTCAAAAG
AGTGATGAAT
TGAGAAAAAT
TGAACCTATA
AGCACCTAAA
ACTAGTAGTC
TTCTAGCAGT
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
TGTAATGATA
TGTGAGAGAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
AAAATAGACT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
GGCGGACTCC
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
TACTGGCCAG
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGCACAGAA
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TGATCCTCAT
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
CCTTCACAGT
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
AAAAAGGTAG
CTGCCATGCT
AGCATTCAGA
TCACATGATG
GAGGTAGATG
GAGGCTTTAA
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160
GGCACAGCAG AAACCTACAA CTCATGGAAG GTAAAGAACC TGCAACTGGA GCCAAGAAGA
GTAACAAGCC
AGTTAACAAA
TTGTCAATCC
CTAATAATGC
AAAGATCTGT
AGTATCTCGT
GTGAGTCAGT
AATAGAAATG
GAAACAAGCA
AAGGTTTCAA
TGTGCAACAT
GAATGTGAAC
ACAGTTAATA
GCCAAATGTA
GAAACTGGAC
CCACTTTTTC
AACTTTGAGG
ACAGTGAGCA
AGCAATATTA
GGTTCCAGTG
GCTATGCTTA
AATTGTAAGC
ACAGATTTCT
GCATCTCAGG
GATACTAGTT
CAGAAAGGAG
TACCGAAGAG
GAGCTTCCCT
ACTAGGCATA
TCATTGAAGA
GAACATCACC
GAATTGGAAG
AAATGAACAG
TGCACCTGGT
TAGCCTTCCA
TGAAGACCCC
AGAGTAGCAG
TACTGGAAGT
GTGCAGCATT
ACACAGAAGG
TAGAAATGGA
AGCGCCAGTC
TCTCTGCCCA
AAAAGGAAGA
TCACTGCAGG
GTATCAAAGG
TCATTACTCC
CCATCAAGTC
AACATTCAAT
CAATTAGCCG
ATGAAGTAGG
ATGAAAACAT
GATTAGGGGT
ATCCTGAAAT
CTCCATATCT
TTTGTTCTGA
TTGCTGAAAA
AGCTTAGCAG
GGGCCAAGAA
GCTTCCAACA
GCACCGTTGC
ATAGCTTAAA
TTAGTGAGGA
ACTTGACTGC
ACAAGTAAAA
TCTTTTACTA
AGAGAAGAAA
AAAGATCTCA
TATTTCATTG
TAGCACTCTA
TGAAAACCCC
CTTTAAGTAT
AGAAAGTGAA
ATTTGCTCCG
CTCTGGGTCC
AAATCAAGGA
CTTTCCTGTG
AGGCTCTAGG
AAATAAACAT
ATTTGTTAAA
GTCACCTGAA
TAATAACATT
TTCCAGTACT
TCAAGCAGAA
TTTGCAACCT
AAAAAAGCAA
GATTTCAGAT
GACACCTGAT
TGACATTAAG
GAGTCCTAGC
ATTAGAGTCC
CTTGTTATTT
TACCGAGTGT
TGACTGCAGT
AACAAAATGT
AAATACAAAC
GACATGACAG
AGTGTTCAAA
AAGAAGAGAA
TGTTAAGTGG
GTACCTGGTA
GGGAAGGCAA
AAGGGACTAA
CCATTGGGAC
CTTGATGCTC
TTTTCAAATC
TTAAAGAAAC
AAGAATGAGT
GTTGGTCAGA
TTTTGTCTAT
GGACTTTTAC
ACTAAATGTA
AGAGAAATGG
AGAGAAAATG
AATGAAGTGG
CTAGGTAGAA
GAGGTCTATA
GAATATGAAG
AACTTAGAAC
GACCTGTTAG
GAAAGTTCTG
CCTTTCACCC
TCAGAAGAGA
GGTAAAGTAA
CTGTCTAAGA
AACCAGGTAA
TCTGCTAGCT
ACCCAGGATC
CGATACTTTC
TACCAGTGAA
ACTAGAAACA
AGAAAGGGTT
CTGATTATGG
AAACAGAACC
TTCATGGTTG
ATGAAGTTAA
AGTATTTGCA
CAGGAAATGC
AAAGTCCAAA
CTAATATCAA
AAGATAAGCC
CATCTCAGTT
AAA.ACCCATA
AGAAAAATCT
GAAATGAGAA
TTTTTAAAGA
GCTCCAGTAT
ACAGAGGGCC
AACAAAGTCT
AAGTAGTTCA
AGCCTATGGG
ATGATGGTGA
CTGTTTTTAG
ATACACATTT
ACTTATCTAG
ACAATATACC
ACACAGAGGA
TATTGGCAAA
TGTTTTCTTC
CTTTCTTGAT
CCAGAGCTGA
CTTAAAGAAT
GTTAAAGTGT
TTGCAAACTG
CACTCAGGAA
AAATAAATGT
TTCCAAAGAT
CCACAGTCGG
GAATACATTC
AGAAGAGGAA
AGTCACTTTT
GCCTGTACAG
AGTTGATAAT
CAGAGGCAAC
TCGTATACCA
GCTAGAGGAA
CATTCCAAGT
AGCCAGCTCA
TAATGAAATA
AAAATTGAAT
TCCTGGAAGT
GACTGTTAAT
AAGTAGTCAT
AATAAAGGAA
CAAAAGCGTC
GGCTCAGGGT
TGAGGATGAA
TTCTCAGTCT
GAATTTATTA
GGCATCTCAG
ACAGTGCAGT
TGGTTCTTCC
2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080
AAACAAATGA
TCAGATGATG
GATTCAAACT
TGCTCAGGGC
CATAACCTGA
GGGAGCCAGC
CTGCGAAATC
GAATACCCTA
GATAGTTCTA
CCATCATTAG
TACCCATCTC
TCTGGGCCAC
CCTTACCTGG
GACAGAGCCC
GTTCCCCAAT
ACTGCTGGGT
TCAACAGAAA
TTTATGCTCG
GAAGAGACTA
AAATATTTTC
TCTATTAAAG
AATGGAAGAA
AGGGGGCTAG
TGGATGGTAC
ACAGGTGTCC
CATGCAATTG
GTAGCACTCT
CACTACTGA
GGCATCAGTC
AAGAAAGAGG
TAGGTGAAGC
TATCCTCTCA
TAAAGCTCCA
CTTCTAACAG
CAGAACAAAG
TAAGCCAGAA
CCAGTAAAAA
ATGATAGGTG
AAGAGGAGCT
ACGATTTGAC
AATCTGGAAT
CAGAGTCAGC
TGAAAGTTGC
ATAATGCAAT
GGGTCAACAA
TGTACAAGTT
CTCATGTTGT
TAGGAATTGC
AAAGAAAAAT
ACCACCAAGG
AAATCTGTTG
AGCTGTGTGG
ACCCAATTGT
GGCAGATGTG
ACCAGTGCCA
TGAAAGCCAG
AACGGGCTTG
AGCATCTGGG
GAGTGACATT
GCAGGAAATG
CTACCCTTCC
CACATCAGAA
TCCAGAAGGC
TAAAGAACCA
GTACATGCAC
CATTAAGGTT
GGAAACATCT
CAGCCTCTTC
TCGTGTTGGC
AGAATCTGCC
GGAAGAAAGT
AAGAATGTCC
TGCCAGAAAA
TATGAAAACA
GGGAGGAAAA
GCTGAATGAG
TCCAAAGCGA
CTATGGGCCC
TGCTTCTGTG
GGTTGTGCAG
TGAGGCACCT
GGAGCTGGAC
GGAGTTGGTC
GAAGAAAATA
TGTGAGAGTG
TTAACCACTC
GCTGAACTAG
ATCATAAGTG
AAAGCAGTAT
CTTTCTGCTG
GGAGTGGAAA
AGTTGCTCTG
GTTGATGTGG
TACTTGCCAA
TCTGATGACC
AACATACCAT
CAGAGTCCAG
GTGAGCAGGG
ATGGTGGTGT
CACCACATCA
GATGCTGAGT
TGGGTAGTTA
CATGATTTTG
GCAAGAGAAT
TTCACCAACA
GTGAAGGAGC
CCAGATGCCT
GTGGTGACCC
ACCTACCTGA
TGAGTGACAA
ATCAAGAAGA
AAACAAGCGT
AGCAGAGGGA
AAGCTGTGTT
ACTCTTCTGC
TAACTTCACA
ACAAGTTTGA
GGTCATCCCC
GGAGTCTTCA
AGGAGCAACA
GGCAAGATCT
CTGAATCTGA
CTTCAACCTC
CTGCTGCTCA
AGAAGCCAGA
CTGGCCTGAC
CTTTAACTAA
TTGTGTGTGA
GCTATTTCTG
AAGTCAGAGG
CCCAGGACAG
TGCCCACAGA
TTTCATCATT
GGACAGAGGA
GAGAGTGGGT
TACCCCAGAT
GGAATTGGTT
GCAAAGCATG
CTCTGAAGAC
TACCATGCAA
AGAACAGCAT
CCTTGAGGAC
GAAAAGTAGT
GGTGTCTGCA
TTCTAAATGC
GAATAGAAAC
GCTGGAAGAG
AGAGGGAACC
TCCTTCTGAA
TGCATTGAAA
TACTACTGAT
ATTGACAGCT
CCCAGAAGAA
TCTAATTACT
ACGGACACTG
GGTGACCCAG
AGATGTGGTC
AAAGATCTTC
TCAACTGGAA
CACCCTTGGC
CAATGGCTTC
GTTGGACAGT
CCCCCACAGC
4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5709
INFORMATION FOR SEQ ID NO:8:
SEQUENCE CHARACTERISTICS:
LENGTH: 5709 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG
CCTGCGCTCA
TGGATTTATC
TCTTAGAGTG
ACATATTTTG
GTCCTTTATG
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
ATCAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGGCTGGAAG
ATCTGAATGC
CAGAGAATCC
AAGTTAATGA
GGGAGTCTGA
AATATTCTGG
TATGTAAAAG
TTGGGAAAAC
TAATTATAGG
AATTAAAGCG
CAGATTTGGC
AGAATGGTCA
GGAGGCCTTC
TGCTCTTCGC
TCCCATCTGT
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGAA
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
TAAGGAAACA
TGATCCCCTG
TAGAGATACT
GTGGTTTTCC
ATCAAATGCC
TTCTTCAGAG
TGAAAGAGTT
CTATCGGAAG
AGCATTTGTT
TAAAAGGAGA
AGTTCAAAAG
AGTGATGAAT
ACCCTCTGCT
GTTGAAGAAG
CTGGAGTTGA
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
TGTAATGATA
TGTGAGAGAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
AAAATAGACT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CTGGGTAAAG
TACAAAATGT
TCAAGGAACC
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
GGCGGACTCC
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
TACTGGCCAG
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
GTTTCTCAGA
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGCACAGAA
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TGATCCTCAT
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
TAACTGGGCC
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGT
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
AAAAAGGTAG
CTGCCATGCT
AGCATTCAGA
TCACATGATG
GAGGTAGATG
GAGGCTTTAA
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800
CTATTCAGAA
AAACGAAAGC
ACAATTCAAA
ATGCGCTTGA
TTGATAGTTG
GGCACAGCAG
GTAACAAGCC
AGTTAACAAA
TTGTCAATCC
CTAATAATGC
AAAGATCTGT
AAAGTATCTC
GTGTGAGTCA
ATAATAGAAA
GGGAAACAAG
TCAAGGTTTC
AATGTGCAAC
GAATGTGAAC
ACAGTTAATA
GCCAAATGTA
GAAACTGGAC
CCACTTTTTC
AACTTTGAGG
ACAGTGAGCA
AGCAATATTA
GGTTCCAGTG
GCTATGCTTA
AATTGTAAGC
ACAGATTTCT
GCATCTCAGG
GATACTAGTT
CAGAAAGGAG
TGAGAAAAAT
TGAACCTATA
AGCACCTAAA
ACTAGTAGTC
TTCTAGCAGT
AAACCTACAA
AAATGAACAG
TGCACCTGGT
TAGCCTTCCA
TGAAGACCCC
AGAGAGTAGC
GTTACTGGAA
GTGTGCAGCA
TGACACAGAA
CATAGAAATG
AAAGCGCCAG
ATTCTCTGCC
AAAAGGAAGA
TCACTGCAGG
GTATCAAAGG
TCATTACTCC
CCATCAAGTC
AACATTCAAT
CAATTAGCCG
ATGAAGTAGG
ATGAAAACAT
GATTAGGGGT
ATCCTGAAAT
CTCCATATCT
TTTGTTCTGA
TTGCTGAAAA
AGCTTAGCAG
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
CTCATGGAAG
ACAAGTAAAA
TCTTTTACTA
AGAGAAGAAA
AAAGATCTCA
AGTATTTCAT
GTTAGCACTC
TTTGAAAACC
GGCTTTAAGT
GAAGAAAGTG
TCATTTGCTC
CACTCTGGGT
AAATCAAGGA
CTTTCCTGTG
AGGCTCTAGG
AAATAAACAT
ATTTGTTAAA
GTCACCTGAA
TAATAACATT
TTCCAGTACT
TCAAGCAGAA
TTTGCAACCT
AAAAAAGCAA
GATTTCAGAT
GACACCTGAT
TGACATTAAG
GAGTCCTAGC
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GTAAAGAACC
GACATGACAG
AGTGTTCAAA
AAGAAGAGAA
TGTTAAGTGG
TGGTACCTGG
TAGGGAAGGC
CCAAGGGACT
ATCCATTGGG
AACTTGATGC
CGTTTTCAAA
CCTTAAAGAC
AAGAATGAGT
GTTGGTCAGA
TTTTGTCTAT
GGACTTTTAC
ACTAAATGTA
AGAGAAATGG
AGAGAAAATG
AATGAAGTGG
CTAGGTAGAA
GAGGTCTATA
GAATATGAAG
AACTTAGAAC
GACCTGTTAG
GAAAGTTCTG
CCTTTCACCC
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
TGCAACTGGA
CGATACTTTC
TACCAGTGAA
ACTAGAAACA
AGAAAGGGTT
TACTGATTAT
AAAAACAGAA
AATTCATGGT
ACATGAAGTT
TCAGTATTTG
TCCAGGAAAT
AAAGTCCAAA
CTAATATCAA
AAGATAAGCC
CATCTCAGTT
AAAACCCATA
AGAAAAATCT
GAAATGAGAA
TTTTTAAAGA
GCTCCAGTAT
ACAGAGGGCC
AACAAAGTCT
AAGTAGTTCA
AGCCTATGGG
ATGATGGTGA
CTGTTTTTAG
ATACACATTT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
GCCAAGAAGA
CCAGAGCTGA
CTTAAAGAAT
GTTAAAGTGT
TTGCAAACTG
GGCACTCAGG
CCAAATAAAT
TGTTCCAAAG
AACCACAGTC
CAGAATACAT
GCAGAAGAGG
AGTCACTTTT
GCCTGTACAG
AGTTGATAAT
CAGAGGCAAC
TCGTATACCA
GCTAGAGGAA
CATTCCAAGT
AGCCAGCTCA
TAATGAAATA
AAAATTGAAT
TCCTGGAAGT
GACTGTTAAT
AAGTAGTCAT
AATAAAGGAA
CAAAAGCGTC
GGCTCAGGGT
1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720
TACCGAAGAG
GAGCTTCCCT
ACTAGGCATA
TCATTGAAGA
GAACATCACC
GAATTGGAAG
AAACAAATGA
TCAGATGATG
GATTCAAACT
TGCTCAGGGC
CATAACCTGA
GGGAGCCAGC
CTGCGAAATC
GAATACCCTA
GATAGTTCTA
CCATCATTAG
TACCCATCTC
TCTGGGCCAC
CCTTACCTGG
GACAGAGCCC
GTTCCCCAAT
ACTGCTGGGT
TCAACAGAAA
TTTATGCTCG
GAAGAGACTA
AAATATTTTC
TCTATTAAAG
AATGGAAGAA
AGGGGGCTAG
TGGATGGTAC
ACAGGTGTCC
CATGCAATTG
GGGCCAAGAA
GCTTCCAACA
GCACCGTTGC
ATAGCTTAAA
TTAGTGAGGA
ACTTGACTGC
GGCATCAGTC
AAGAAAGAGG
TAGGTGAAGC
TATCCTCTCA
TAAAGCTCCA
CTTCTAACAG
CAGAACAAAG
TAAGCCAGAA
CCAGTAAAAA
ATGATAGGTG
AAGAGGAGCT
ACGATTTGAC
AATCTGGAAT
CAGAGTCAGC
TGAAAGTTGC
ATAATGCAAT
GGGTCAACAA
TGTACAAGTT
CTCATGTTGT
TAGGAATTGC
AAAGAAAAAT
ACCACCAAGG
AAATCTGTTG
AGCTGTGTGG
ACCCAATTGT
GGCAGATGTG
ATTAGAGTCC
CTTGTTATTT
TACCGAGTGT
TGACTGCAGT
AACAAAATGT
AAATACAAAC
TGAAAGCCAG
AACGGGCTTG
AGCATCTGGG
GAGTGACATT
GCAGGAAATG
CTACCCTTCC
CACATCAGAA
TCCAGAAGGC
TAAAGAACCA
GTACATGCAC
CATTAAGGTT
GGAAACATCT
CAGCCTCTTC
TCGTGTTGGC
AGAATCTGCC
GGAAGAAAGT
AAGAATGTCC
TGCCAGAAAA
TATGAAAACA
GGGAGGAAAA
GCTGAATGAG
TCCAAAGCGA
CTATGGGCCC
TGCTTCTGTG
GGTTGTGCAG
TGAGGCACCT
TCAGAAGAGA
GGTAAAGTAA
CTGTCTAAGA
AACCAGGTAA
TCTGCTAGCT
ACCCAGGATC
GGAGTTGGTC
GAAGAAAATA
TGTGAGAGTG
TTAACCACTC
GCTGAACTAG
ATCATAAGTG
AAAGCAGTAT
CTTTCTGCTG
GGAGTGGAAA
AGTTGCTCTG
GTTGATGTGG
TACTTGCCAA
TCTGATGACC
AACATACCAT
CAGAGTCCAG
GTGAGCAGGG
ATGGTGGTGT
CACCACATCA
GATGCTGAGT
TGGGTAGTTA
CATGATTTTG
GCAAGAGAAT
TTCACCAACA
GTGAAGGAGC
CCAGATGCCT
GTGGTGACCC
ACTTATCTAG
ACAATATACC
ACACAGAGGA
TATTGGCAAA
TGTTTTCTTC
CTTTCTTGAT
TGAGTGACAA
ATCAAGAAGA
AAACAAGCGT
AGCAGAGGGA
AAGCTGTGTT
ACTCTTCTGC
TAACTTCACA
ACAAGTTTGA
GGTCATCCCC
GGAGTCTTCA
AGGAGCAACA
GGCAAGATCT
CTGAATCTGA
CTTCAACCTC
CTGCTGCTCA
AGAAGCCAGA
CTGGCCTGAC
CTTTAACTAA
TTGTGTGTGA
GCTATTTCTG
AAGTCAGAGG
CCCAGGACAG
TGCCCACAGA
TTTCATCATT
GGACAGAGGA
GAGAGTGGGT
TGAGGATGAA
TTCTCAGTCT
GAATTTATTA
GGCATCTCAG
ACAGTGCAGT
TGGTTCTTCC
GGAATTGGTT
GCAAAGCATG
CTCTGAAGAC
TACCATGCAA
AGAACAGCAT
CCTTGAGGAC
GAAAAGTAGT
GGTGTCTGCA
TTCTAAATGC
GAATAGAAAC
GCTGGAAGAG
AGAGGGAACC
TCCTTCTGAA
TGCATTGAAA
TACTACTGAT
ATTGACAGCT
CCCAGAAGAA
TCTAATTACT
ACGGACACTG
GGTGACCCAG
AGATGTGGTC
AAAGATCTTC
TCAACTGGAA
CACCCTTGGC
CAATGGCTTC
GTTGGACAGT
3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640
GTAGCACTCT ACCAGTGCCA GGAGCTGGAC ACCTACCTGA TACCCCAGAT CCCCCACAGC
CACTACTGA
5700 5709
INFORMATION FOR SEQ ID NO:9:
SEQUENCE CHARACTERISTICS:
LENGTH: 5709 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG
CCTGCGCTCA
TGGATTTATC
TCTTAGAGTG
ACATATTTTG
GTCCTTTATG
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
ATCAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGGCTGGAAG
ATCTGAATGC
CAGAGAATCC
AAGTTAATGA
GGGAGTCTGA
AATATTCTGG
GGAGGCCTTC
TGCTCTTCGC
TCCCATCTGT
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGAA
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
TAAGGAAACA
TGATCCCCTG
TAGAGATACT
GTGGTTTTCC
ATCAAATGCC
TTCTTCAGAG
ACCCTCTGCT
GTTGAAGAAG
CTGGAGTTGA
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
TGTAATGATA
TGTGAGAGAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
AAAATAGACT
CTGGGTAAAG
TACAAAATGT
TCAAGGAACC
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
GGCGGACTCC
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
TACTGGCCAG
GTTTCTCAGA
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGCACAGAA
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TGATCCTCAT
TAACTGGGCC
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGT
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
AAAAAGGTAG
CTGCCATGCT
AGCATTCAGA
TCACATGATG
GAGGTAGATG
GAGGCTTTAA
60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440
TATGTAAAAG
TTGGGAAAAC
TAATTATAGG
AATTAAAGCG
CAGATTTGGC
AGAATGGTCA
CTATTCAGAA
AAACGAAAGC
ACAATTCAAA
ATGCGCTTGA
TTGATAGTTG
GGCACAGCAG
GTAACAAGCC
AGTTAACAAA
TTGTCAATCC
CTAATAATGC
AAAGATCTGT
AAAGTATCTC
GTGTGAGTCA
ATAATAGAAA
GGGAAACAAG
TCAAGGTTTC
AATCTGCAAC
TTGAATGTGA
ACAGTTAATA
GCCAAATGTA
GAAACTGGAC
CCACTTTTTC
AACTTTGAGG
ACAGTGAGCA
AGCAATATTA
GGTTCCAGTG
TGAAAGAGTT
CTATCGGAAG
AGCATTTGTT
TAAAAGGAGA
AGTTCAAAAG
AGTGATGAAT
TGAGAAAAAT
TGAACCTATA
AGCACCTAAA
ACTAGTAGTC
TTCTAGCAGT
AAACCTACAA
AAATGAACAG
TGCACCTGGT
TAGCCTTCCA
TGAAGACCCC
AGAGAGTAGC
GTTACTGGAA
GTGTGCAGCA
TGACACAGAA
CATAGAAATG
AAAGCGCCAG
ATTCTCTGCC
ACAAAAGGAA
TCACTGCAGG
GTATCAAAGG
TCATTACTCC
CCATCAAGTC
AACATTCAAT
CAATTAGCCG
ATGAAGTAGG
ATGAAAACAT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
CTCATGGAAG
ACAAGTAAAA
TCTTTTACTA
AGAGAAGAAA
AAAGATCTCA
AGTATTTCAT
GTTAGCACTC
TTTGAAAACC
GGCTTTAAGT
GAAGAAAGTG
TCATTTGCTC
CACTCTGGGT
GAAAATCAAG
CTTTCCTGTG
AGGCTCTAGG
AAATAAACAT
ATTTGTTAAA
GTCACCTGAA
TAATAACATT
TTCCAGTACT
TCAAGCAGAA
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GTAAAGAACC
GACATGACAG
AGTGTTCAAA
AAGAAGAGAA
TGTTAAGTGG
TGGTACCTGG
TAGGGAAGGC
CCAAGGGACT
ATCCATTGGG
AACTTGATGC
CGTTTTCAAA
CCTTAAAGAA
GAAAGAATGA
GTTGGTCAGA
TTTTGTCTAT
GGACTTTTAC
ACTAAATGTA
AGAGAAATGG
AGAGAAAATG
AATGAAGTGG
CTAGGTAGAA
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
TGCAACTGGA
CGATACTTTC
TACCAGTGAA
ACTAGAAACA
AGAAAGGGTT
TACTGATTAT
AAAAACAGAA
AATTCATGGT
ACATGAAGTT
TCAGTATTTG
TCCAGGAAAT
ACAAAGTCCA
GTAATATCAA
AAGATAAGCC
CATCTCAGTT
AAAACCCATA
AGAAAAATCT
GAAATGAGAA
TTTTTAAAGA
GCTCCAGTAT
ACAGAGGGCC
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
GCCAAGAAGA
CCAGAGCTGA
CTTAAAGAAT
GTTAAAGTGT
TTGCAAACTG
GGCACTCAGG
CCAAATAAAT
TGTTCCAAAG
AACCACAGTC
CAGAATACAT
GCAGAAGAGG
AAAGTCACTT
GCCTGTACAG
AGTTGATAAT
CAGAGGCAAC
TCGTATACCA
GCTAGAGGAA
CATTCCAAGT
AGCCAGCTCA
TAATGAAATA
AAAATTGAAT
1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360
GCTATGCTTA
AATTGTAAGC
ACAGATTTCT
GCATCTCAGG
GATACTAGTT
CAGAAAGGAG
TACCGAAGAG
GAGCTTCCCT
ACTAGGCATA
TCATTGAAGA
GAACATCACC
GAATTGGAAG
AAACAAATGA
TCAGATGATG
GATTCAAACT
TGCTCAGGGC
CATAACCTGA
GGGAGCCAGC
CTGCGAAATC
GAATACCCTA
GATAGTTCTA
CCATCATTAG
TACCCATCTC
TCTGGGCCAC
CCTTACCTGG
GACAGAGCCC
GTTCCCCAAT
ACTGCTGGGT
TCAACAGAAA
TTTATGCTCG
GAAGAGACTA
AAATATTTTC
GATTAGGGGT
ATCCTGAAAT
CTCCATATCT
TTTGTTCTGA
TTGCTGAAAA
AGCTTAGCAG
GGGCCAAGAA
GCTTCCAACA
GCACCGTTGC
ATAGCTTAAA
TTAGTGAGGA
ACTTGACTGC
GGCATCAGTC
AAGAAAGAGG
TAGGTGAAGC
TATCCTCTCA
TAAAGCTCCA
CTTCTAACAG
CAGAACAAAG
TAAGCCAGAA
CCAGTAAAAA
ATGATAGGTG
AAGAGGAGCT
ACGATTTGAC
AATCTGGAAT
CAGAGTCAGC
TGAAAGTTGC
ATAATGCAAT
GGGTCAACAA
TGTACAAGTT
CTCATGTTGT
TAGGAATTGC
TTTGCAACCT
AAAAAAGCAA
GATTTCAGAT
GACACCTGAT
TGACATTAAG
GAGTCCTAGC
ATTAGAGTCC
CTTGTTATTT
TACCGAGTGT
TGACTGCAGT
AACAAAATGT
AAATACAAAC
TGAAAGCCAG
AACGGGCTTG
AGCATCTGGG
GAGTGACATT
GCAGGAAATG
CTACCCTTCC
CACATCAGAA
TCCAGAAGGC
TAAAGAACCA
GTACATGCAC
CATTAAGGTT
GGAAACATCT
CAGCCTCTTC
TCGTGTTGGC
AGAATCTGCC
GGAAGAAAGT
AAGAATGTCC
TGCCAGAAAA
TATGAAAACA
GGGAGGAAAA
GAGGTCTATA
GAATATGAAG
AACTTAGAAC
GACCTGTTAG
GAAAGTTCTG
CCTTTCACCC
TCAGAAGAGA
GGTAAAGTAA
CTGTCTAAGA
AACCAGGTAA
TCTGCTAGCT
ACCCAGGATC
GGAGTTGGTC
GAAGAAAATA
TGTGAGAGTG
TTAACCACTC
GCTGAACTAG
ATCATAAGTG
AAAGCAGTAT
CTTTCTGCTG
GGAGTGGAAA
AGTTGCTCTG
GTTGATGTGG
TACTTGCCAA
TCTGATGACC
AACATACCAT
CACAGTCCAG
GTGAGCAGGG
ATGGTGGTGT
CACCACATCA
GATGCTGAGT
TGGGTAGTTA
AACAAAGTCT
AAGTAGTTCA
AGCCTATGGG
ATGATGGTGA
CTGTTTTTAG
ATACACATTT
ACTTATCTAG
ACAATATACC
ACACAGAGGA
TATTGGCAAA
TGTTTTCTTC
CTTTCTTGAT
TGAGTGACAA
ATCAAGAAGA
AAACAAGCGT
AGCAGAGGGA
AAGCTGTGTT
ACTCTTCTGC
TAACTTCACA
ACAAGTTTGA
GGTCATCCCC
GGAGTCTTCA
AGGAGCAACA
GGCAAGATCT
CTGAATCTGA
CTTCAACCTC
CTGCTGCTCA
AGAAGCCAGA
CTGGCCTGAC
CTTTAACTAA
TTGTGTGTGA
GCTATTTCTG
TCCTGGAAGT
GACTGTTAAT
AAGTAGTCAT
AATAAAGGAA
CAAAAGCGTC
GGCTCAGGGT
TGAGGATGAA
TTCTCAGTCT
GAATTTATTA
GGCATCTCAG
ACAGTGCAGT
TGGTTCTTCC
GGAATTGGTT
GCAAAGCATG
CTCTGAAGAC
TACCATGCAA
AGAACAGCAT
CCTTGAGGAC
GAAAAGTAGT
GGTGTCTGCA
TTCTAAATGC
GAATAGAAAC
GCTGGAAGAG
AGAGGGAACC
TCCTTCTGAA
TGCATTGAAA
TACTACTGAT
ATTGACAGCT
CCCAGAAGAA
TCTAATTACT
ACGGACACTG
GGTGACCCAG
3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280
TCTATTAAAG AAAGAAAAAT GCTGAATGAG CATGATTTTG AAGTCAGAGG AGATGTGGTC
AATGGAAGAA ACCACCAAGG TCCAAAGCGA GCAAGAGAAT CCCAGGACAG AAAGATCTTC
AGGGGGCTAG AAATCTGTTG CTATGGGCCC TTCACCAACA TGCCCACAGA TCAACTGGAA
TGGATGGTAC AGCTGTGTGG TGCTTCTGTG GTGAAGGAGC TTTCATCATT CACCCTTGGC
GT GGTTGTGCAG CCAGATGCCT GGACAGAGGA CAATGGCTTC
CATGCAATTG GGCAGATGTG TGAGGCACCT GTGGTGACCC GAGAGTGGGT GTTGGACAGT
GTAGCACTCT ACCAGTGCCA GGAGCTGGAC ACCTACCTGA TACCCCAGAT CCCCCACAGC
5340 5400 5460 5520 5580 5640 5700 5709
INFORMATION FOR SEQ ID NO:10:
SEQUENCE CHARACTERISTICS:
LENGTH: 5711 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC
CCTGCGCTCA
TGGATTTATC
TCTTAGAGTG
ACATATTTTG
GTCCTTTATG
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
ATCAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGAGGCCTTC
TGCTCTTCGC
TCCCATCTGT
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGAA
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
ACCCTCTGCT
GTTGAAGAAG
CTGGAGTTGA
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
CTGGGTAAAG
TACAAAATGT
TCAAGGAACC
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGT
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080
GGGCTGGAAG
ATCTGAATGC
CAGAGAATCC
AAGTTAATGA
GGGAGTCTGA
AATATTCTGG
TATGTAAAAG
TTGGGAAAAC
TAATTATAGG
AATTAAAGCG
CAGATTTGGC
AGAATGGTCA
CTATTCAGAA
AAACGAAAGC
ACAATTCAAA
ATGCGCTTGA
TTGATAGTTG
GGCACAGCAG
GTAACAAGCC
AGTTAACAAA
TTGTCAATCC
CTAATAATGC
AAAGATCTGT
AAAGTATCTC
GTGTGAGTCA
ATAATAGAAA
GGGAAACAAG
TCAAGGTTTC
AATGTGCAAC
TTGAATGTGA
AGACAGTTAA
ATGCCAAATG
TAAGGAAACA TGTAATGATA GGCGGACTCC CAGCACAGAA AAAAAGGTAG TGATCCCCTG TGTGAGAGAA
TAGAGATACT
GTGGTTTTCC
ATCAAATGCC
TTCTTCAGAG
TGAAAGAGTT
CTATCGGAAG
AGCATTTGTT
TAAAAGGAGA
AGTTCAAAAG
AGTGATGAAT
TGAGAAAAAT
TGAACCTATA
AGCACCTAAA
ACTAGTAGTC
TTCTAGCAGT
AAACCTACAA
AAATGAACAG
TGCACCTGGT
TAGCCTTCCA
TGAAGACCCC
AGAGAGTAGC
GTTACTGGAA
GTGTGCAGCA
TGACACAGAA
CATAGAAATG
AAAGCGCCAG
ATTCTCTGCC
ACAAAAGGAA
TATCACTGCA
TAGTATCAAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
AAAATAGACT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
CTCATGGAAG
ACAAGTAAAA
TCTTTTACTA
AGAGAAGAAA
AAAGATCTCA
AGTATTTCAT
GTTAGCACTC
TTTGAAAACC
GGCTTTAAGT
GAAGAAAGTG
TCATTTGCTC
CACTCTGGGT
GAAAATCAAG
GGCTTTCCTG
GGAGGCTCTA
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
TACTGGCCAG
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GTAAAGAACC
GACATGACAG
AGTGTTCAAA
AAGAAGAGAA
TGTTAAGTGG
TGGTACCTGG
TAGGGAAGGC
CCAAGGGACT
ATCCATTGGG
AACTTGATGC
CGTTTTCAAA
CCTTAAAGAA
GAAAGAATGA
TGGTTGGTCA
GGTTTTGTCT
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TGATCCTCAT
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
TGCAACTGGA
CGATACTTTC
TACCAGTGAA
ACTAGAAACA
AGAAAGGGTT
TACTGATTAT
AAAAACAGAA
AATTCATGGT
ACATGAAGTT
TCAGTATTTG
TCCAGGAAAT
ACAAAGTCCA
GTCTAATATC
GAAAGATAAG
ATCATCTCAG
CTGCCATGCT
AGCATTCAGA
TCACATGATG
GAGGTAGATG
GAGGCTTTAA
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
GCCAAGAAGA
CCAGAGCTGA
CTTAAAGAAT
GTTAAAGTGT
TTGCAAACTG
GGCACTCAGG
CCAAATAAAT
TGTTCCAAAG
AACCACAGTC
CAGAATACAT
GCAGAAGAGG
AAAGTCACTT
AAGCCTGTAC
CCAGTTGATA
TTCAGAGGCA
1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000
ACGAAACTGG
CACCACTTTT
AAAACTTTGA
GTACAGTGAG
CAAGCAATAT
TAGGTTCCAG
ATGCTATGCT
GTAATTGTAA
ATACAGATTT
ATGCATCTCA
AAGATACTAG
TCCAGAAAGG
GTTACTGAAG
AAGAGCTTCC
CTACTAGGCA
TATCATTGAA
AGGAACATCA
GTGAATTGGA
CCAAACAAAT
TTTCAGATGA
TGGATTCAAA
ACTGCTCAGG
AACATAACCT
ATGGGAGCCA
ACCTGCGAAA
GTGAATACCC
CAGATAGTTC
GCCCATCATT
ACTACCCATC
AGTCTGGGCC
CCCCTTACCT
ACTCATTACT
TCCCATCAAG
GGAACATTCA
CACAATTAGC
TAATGAAGTA
TGATGAAAAC
TAGATTAGGG
GCATCCTGAA
CTCTCCATAT
GGTTTGTTCT
TTTTGCTGAA
AGAGCTTAGC
AGGGGCCAAG
CTGCTTCCAA
TAGCACCGTT
GAATAGCTTA
CCTTAGTGAG
AGACTTGACT
GAGGCATCAG
TGAAGAAAGA
CTTAGGTGAA
GCTATCCTCT
GATAAAGCTC
GCCTTCTAAC
TCCAGAACAA
TATAAGCCAG
TACCAGTAAA
AGATGATAGG
TCAAGAGGAG
ACACGATTTG
GGAATCTGGA
CCAAATAAAC
TCATTTGTTA
ATGTCACCTG
CGTAATAACA
GGTTCCAGTA
ATTCAAGCAG
GTTTTGCAAC
ATAAAAAAGC
CTGATTTCAG
GAGACACCTG
AATGACATTA
AGGAGTCCTA
AAATTAGAGT
CACTTGTTAT
GCTACCGAGT
AATGACTGCA
GAAACAAAAT
GCAAATACAA
TCTGAAAGCC
GGAACGGGCT
GCAGCATCTG
CAGAGTGACA
CAGCAGGAAA
AGCTACCCTT
AGCACATCAG
AATCCAGAAG
AATAAAGAAC
TGGTACATGC
CTCATTAAGG
ACGGAAACAT
ATCAGCCTCT
ATGGACTTTT
AAACTAAATG
AAAGAGAAAT
TTAGAGAAAA
CTAATGAAGT
AACTAGGTAG
CTGAGGTCTA
AAGAATATGA
ATAACTTAGA
ATGACCTGTT
AGGAAAGTTC
GCCCTTTCAC
CCTCAGAAGA
TTGGTAAAGT
GTCTGTCTAA
GTAACCAGGT
GTTCTGCTAG
ACACCCAGGA
AGGGAGTTGG
TGGAAGAAAA
GGTGTGAGAG
TTTTAACCAC
TGGCTGAACT
CCATCATAAG
AAAAAGCAGT
GCCTTTCTGC
CAGGAGTGGA
ACAGTTGCTC
TTGTTGATGT
CTTACTTGCC
TCTCTGATGA
ACAAAACCCA
TAAGAAAAAT
GGGAAATGAG
TGTTTTTAAA
GGGCTCCAGT
AAACAGAGGG
TAAACAAAGT
AGAAGTAGTT
ACAGCCTATG
AGATGATGGT
TGCTGTTTTT
CCATACACAT
GAACTTATCT
AAACAATATA
GAACACAGAG
AATATTGGCA
CTTGTTTTCT
TCCTTTCTTG
TCTGAGTGAC
TAATCAAGAA
TGAAACAAGC
TCAGCAGAGG
AGAAGCTGTG
TGACTCTTCT
ATTAACTTCA
TGACAAGTTT
AAGGTCATCC
TGGGAGTCTT
GGAGGAGCAA
AAGGCAAGAT
CCCTGAATCT
TATCGTATAC
CTGCTAGAGG
AACATTCCAA
GAAGCCAGCT
ATTAATGAAA
CCAAAATTGA
CTTCCTGGAA
CAGACTGTTA
GGAAGTAGTC
GAAATAAAGG
AGCAAAAGCG
TTGGCTCAGG
AGTGAGGATG
CCTTCTCAGT
GAGAATTTAT
AAGGCATCTC
TCACAGTGCA
ATTGGTTCTT
AAGGAATTGG
GAGCAAAGCA
GTCTCTGAAG
GATACCATGC
TTAGAACAGC
GCCCTTGAGG
CAGAAAAGTA
GAGGTGTCTG
CCTTCTAAAT
CAGAATAGAA
CAGCTGGAAG
CTAGAGGGAA
GATCCTTCTG
3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920
AAGACAGAGC CCCAGAGTCA GCTCGTGTTG GCAACATACC ATCTTCAACC TCTGCATTGA
AAGTTCCCCA ATTGAAAGTT GCAGAATCTG CCCAGAGTCC AGCTGCTGCT CATACTACTG
ATACTGCTGG GTATAATGCA
CTTCAACAGA AAGGGTCAAC
AATTTATGCT CGTGTACAAG
CTGAAGAGAC TACTCATGTT
TGAAATATTT TCTAGGAATT
AGTCTATTAA AGAAAGAAAA
TCAATGGAAG AAACCACCAA
TCAGGGGGCT AGAAATCTGT
AATGGATGGT ACAGCTGTGT
GCACAGGTGT CCACCCAATT
TCCATGCAAT TGGGCAGATG
GTGTAGCACT CTACCAGTGC
GCCACTACTG A
ATGGAAGAAA
AAAAGAATGT
TTTGCCAGAA
GTTATGAAAA
GCGGGAGGAA
ATGCTGAATG
GGTCCAAAGC
TGCTATGGGC
GGTGCTTCTG
GTGGTTGTGC
TGTGAGGCAC
CAGGAGCTGG
GTGTGAGCAG
CCATGGTGGT
AACACCACAT
CAGATGCTGA
AATGGGTAGT
AGCATGATTT
GAGCAAGAGA
CCTTCACCAA
TGGTGAAGGA
AGCCAGATGC
CTGTGGTGAC
ACACCTACCT
GGAGAAGCCA
GTCTGGCCTG
CACTTTAACT
GTTTGTGTGT
TAGCTATTTC
TGAAGTCAGA
ATCCCAGGAC
CATGCCCACA
GCTTTCATCA
CTGGACAGAG
CCGAGAGTGG
GATACCCCAG
GAATTGACAG
ACCCCAGAAG
AATCTAATTA
GAACGGACAC
TGGGTGACCC
GGAGATGTGG
AGAAAGATCT
GATCAACTGG
TTCACCCTTG
GACAATGGCT
GTGTTGGACA
ATCCCCCACA
4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5711
INFORMATION FOR SEQ ID NO:11:
SEQUENCE CHARACTERISTICS:
LENGTH: 5707 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG GTTTCTCAGA TAACTGGGCC
CCTGCGCTCA
TGGATTTATC
TCTTAGAGTG
ACATATTTTG
GTCCTTTATG
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
GGAGGCCTTC
TGCTCTTCGC
TCCCATCTGT
CAAATTTTGC
TAAGAATGAT
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
ACCCTCTGCT
GTTGAAGAAG
CTGGAGTTGA
ATGCTGAAAC
ATAACCAAAA
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
CTGGGTAAAG
TACAAAATGT
TCAAGGAACC
TTCTCAACCA
GGAGCCTACA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGT
AGATTTAGTC
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
ATCAAGAATT GTTACAAATC
CAAAAAAGGC TGCTTGTGAA
CCAGTAATAA TGATTTGAAC
ATCAGGGTAG TTCTGTTTCA
GCTCATTACA GCATGAGAAC
AGGCTGAATT CTGTAATAAA
GGGCTGGAAG TAAGGAAACA
ATCTGAATGC TGATCCCCTG
CAGAGAATCC TAGAGATACT
AAGTTAATGA GTGGTTTTCC
GGGAGTCTGA ATCAAATGCC
AATATTCTGG TTCTTCAGAG
TATGTAAAAG TGAAAGAGTT
TTGGGAAAAC CTATCGGAAG
TAATTATAGG AGCATTTGTT
AATTAAAGCG TAAAAGGAGA
CAGATTTGGC AGTTCAAAAG
AGAATGGTCA AGTGATGAAT
CTATTCAGAA TGAGAAAAAT
AAACGAAAGC TGAACCTATA
ACAATTCAAA AGCACCTAAA
ATGCGCTTGA ACTAGTAGTC
TTGATAGTTG TTCTAGCAGT
GGCACAGCAG AAACCTACAA
GTAACAAGCC AAATGAACAG
AGTTAACAAA TGCACCTGGT
TTGTCAATCC TAGCCTTCCA
CTAATAATGC TGAAGACCCC
AAAGATCTGT AGAGAGTAGC
AAAGTATCTC GTTACTGGAA
GTGTGAGTCA GTGTGCAGCA
ATAATAGAAA TGACACAGAA
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
TGTAATGATA
TGTGAGAGAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
AAAATAGACT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
CTCATGGAAG
ACAAGTAAAA
TCTTTTACTA
AGAGAAGAAA
AAAGATCTCA
AGTATTTCAT
GTTAGCACTC
TTTGAAAACC
GGCTTTAAGT
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
GGCGGACTCC
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
TACTGGCCAG
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GTAAAGAACC
GACATGACAG
AGTGTTCAAA
AAGAAGAGAA
TGTTAAGTGG
TGGTACCTGG
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGCACAGAA
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TGATCCTCAT
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
TGCAACTGGA
CGATACTTTC
TACCAGTGAA
ACTAGAAACA
AGAAAGGGTT
TACTGATTAT
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
AAAAAGGTAG
CTGCCATGCT
AGCATTCAGA
TCACATGATG
GAGGTAGATG
GAGGCTTTAA
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
GCCAAGAAGA
CCAGAGCTGA
CTTAAAGAAT
GTTAAAGTGT
TTGCAAACTG
GGCACTCAGG
CCAAATAAAT
TGTTCCAAAG
AACCACAGTC
780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640
TAGGGAAGGC AAAAACAGAA
CCAAGGGACT AATTCATGGT
ATCCATTGGG ACATGAAGTT
GGGAAACAAG
TCAAGGTTTC
AATGTGCAAC
TTGAATGTGA
AGACAGTTAA
ATGCCAAATG
ACGAAACTGG
CACCACTTTT
AAAACTTTGA
GTACAGTGAG
CAAGCAATAT
TAGGTTCCAG
ATGCTATGCT
GTAATTGTAA
ATACAGATTT
ATGCATCTCA
AAGATACTAG
TCCAGAAAGG
GTTACCGAAG
AAGAGCTTCC
CTACTAGGCA
TATCATTGAA
AGGAACATCA
GTGAATTGGA
CCAAACAAAT
TTTCAGATGA
TTCAAACTTA
CTCAGGGCTA
TAACCTGATA
GAGCCAGCCT
GCGAAATCCA
ATACCCTATA
CATAGAAATG
AAAGCGCCAG
ATTCTCTGCC
ACAAAAGGAA
TATCACTGCA
TAGTATCAAA
ACTCATTACT
TCCCATCAAG
GGAACATTCA
CACAATTAGC
TAATGAAGTA
TGATGAAAAC
TAGATTAGGG
GCATCCTGAA
CTCTCCATAT
GGTTTGTTCT
TTTTGCTGAA
AGAGCTTAGC
AGGGGCCAAG
CTGCTTCCAA
TAGCACCGTT
GAATAGCTTA
CCTTAGTGAG
AGACTTGACT
GAGGCATCAG
TGAAGAAAGA
GGTGAAGCAG
TCCTCTCAGA
AAGCTCCAGC
TCTAACAGCT
GAACAAAGCA
AGCCAGAATC
GAAGAAAGTG
TCATTTGCTC
CACTCTGGGT
GAAAATCAAG
GGCTTTCCTG
GGAGGCTCTA
CCAAATAAAC
TCATTTGTTA
ATGTCACCTG
CGTAATAACA
GGTTCCAGTA
ATTCAAGCAG
GTTTTGCAAC
ATAAAAAAGC
CTGATTTCAG
GAGACACCTG
AATGACATTA
AGGAGTCCTA
AAATTAGAGT
CACTTGTTAT
GCTACCGAGT
AATGACTGCA
GAAACAAAAT
GCAAATACAA
TCTGAAAGCC
GGAACGGGCT
CATCTGGGTG
GTGACATTTT
AGGAAATGGC
ACCCTTCCAT
CATCAGAAAA
CAGAAGGCCT
AACTTGATGC
CGTTTTCAAA
CCTTAAAGAA
GAAAGAATGA
TGGTTGGTCA
GGTTTTGTCT
ATGGACTTTT
AAACTAAATG
AAAGAGAAAT
TTAGAGAAAA
CTAATGAAGT
AACTAGGTAG
CTGAGGTCTA
AAGAATATGA
ATAACTTAGA
ATGACCTGTT
AGGAAAGTTC
GCCCTTTCAC
CCTCAGAAGA
TTGGTAAAGT
GTCTGTCTAA
GTAACCAGGT
GTTCTGCTAG
ACACCCAGGA
AGGGAGTTGG
TGGAAGAAAA
TGAGAGTGAA
AACCACTCAG
TGAACTAGAA
CATAAGTGAC
AGCAGTATTA
TTCTGCTGAC
TCAGTATTTG
TCCAGGAAAT
ACAAAGTCCA
GTCTAATATC
GAAAGATAAG
ATCATCTCAG
ACAAAACCCA
TAAGAAAAAT
GGGAAATGAG
TGTTTTTAAA
GGGCTCCAGT
AAACAGAGGG
TAAACAAAGT
AGAAGTAGTT
ACAGCCTATG
AGATGATGGT
TGCTGTTTTT
CCATACACAT
GAACTTATCT
AAACAATATA
GAACACAGAG
AATATTGGCA
CTTGTTTTCT
TCCTTTCTTG
TCTGAGTGAC
TAAGAAGAGC
ACAAGCGTCT
CAGAGGGATA
GCTGTGTTAG
TCTTCTGCCC
ACTTCACAGA
AAGTTTGAGG
CAGAATACAT
GCAGAAGAGG
AAAGTCACTT
AAGCCTGTAC
CCAGTTGATA
TTCAGAGGCA
TATCGTATAC
CTGCTAGAGG
AACATTCCAA
GAAGCCAGCT
ATTAATGAAA
CCAAAATTGA
CTTCCTGGAA
CAGACTGTTA
GGAAGTAGTC
GAAATAAAGG
AGCAAAAGCG
TTGGCTCAGG
AGTGAGGATG
CCTTCTCAGT
GAGAATTTAT
AAGGCATCTC
TCACAGTGCA
ATTGGTTCTT
AAGGAATTGG
AAAGCATGGA
CTGAAGACTG
CCATGCAACA
AACAGCATGG
TTGAGGACCT
AAAGTAGTGA
TGTCTGCAGA
2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560
TAGTTCTACC
ATCATTAGAT
CCCATCTCAA
TGGGCCACAC
TTACCTGGAA
CAGAGCCCCA
TCCCCAATTG
TGCTGGGTAT
AACAGAAAGG
TATGCTCGTG
AGAGACTACT
ATATTTTCTA
TATTAAAGAA
TGGAAGAAAC
GGGGCTAGAA
GATGGTACAG
AGGTGTCCAC
TGCAATTGGG
AGCACTCTAC
CTACTGA
AGTAAAAATA AAGAACCAGG AGTGGAAAGG TCATCCCCTT CTAAATGCCC
GATAGGTGGT
GAGGAGCTCA
GATTTGACGG
TCTGGAATCA
GAGTCAGCTC
AAAGTTGCAG
AATGCAATGG
GTCAACAAAA
TACAAGTTTG
CATGTTGTTA
GGAATTGCGG
AGAAAAATGC
CACCAAGGTC
ATCTGTTGCT
CTGTGTGGTG
CCAATTGTGG
CAGATGTGTG
CAGTGCCAGG
ACATGCACAG
TTAAGGTTGT
AAACATCTTA
GCCTCTTCTC
GTGTTGGCAA
AATCTGCCCA
AAGAAAGTGT
GAATGTCCAT
CCAGAAAACA
TGAAAACAGA
GAGGAAAATG
TGAATGAGCA
CAAAGCGAGC
ATGGGCCCTT
CTTCTGTGGT
TTGTGCAGCC
AGGCACCTGT
AGCTGGACAC
TTGCTCTGGG
TGATGTGGAG
CTTGCCAAGG
TGATGACCCT
CATACCATCT
GAGTCCAGCT
GAGCAGGGAG
GGTGGTGTCT
CCACATCACT
TGCTGAGTTT
GGTAGTTAGC
TGATTTTGAA
AAGAGAATCC
CACCAACATG
GAAGGAGCTT
AGATGCCTGG
GGTGACCCGA
CTACCTGATA
AGTCTTCAGA
GAGCAACAGC
CAAGATCTAG
GAATCTGATC
TCAACCTCTG
GCTGCTCATA
AAGCCAGAAT
GGCCTGACCC
TTAACTAATC
GTGTGTGAAC
TATTTCTGGG
GTCAGAGGAG
CAGGACAGAA
CCCACAGATC
TCATCATTCA
ACAGAGGACA
GAGTGGGTGT
CCCCAGATCC
GTTTCTCAGA
TTCATTGGAA
CATTAATGCT
TGTCTCCACA
GAAGAAAGGG
AGAAAGTACG
ATAGAAACTA
TGGAAGAGTC
AGGGAACCCC
CTTCTGAAGA
CATTGAAAGT
CTACTGATAC
TGACAGCTTC
CAGAAGAATT
TAATTACTGA
GGACACTGAA
TGACCCAGTC
ATGTGGTCAA
AGATCTTCAG
AACTGGAATG
CCCTTGGCAC
ATGGCTTCCA
TGGACAGTGT
CCCACAGCCA
TAACTGGGCC
CAGAAAGAAA
ATGCAGAAAA
AAGTGTGACC
CCTTCACAGT
AGATTTAGTC
4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5707
INFORMATION FOR SEQ ID NO:12:
SEQUENCE CHARACTERISTICS:
LENGTH: 5712 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
AGCTCGCTGA GACTTCCTGG ACCCCGCACC AGGCTGTGGG
CCTGCGCTCA GGAGGCCTTC ACCCTCTGCT CTGGGTAAAG
TGGATTTATC TGCTCTTCGC GTTGAAGAAG TACAAAATGT
TCTTAGAGTG TCCCATCTGT CTGGAGTTGA TCAAGGAACC
ACATATTTTG CAAATTTTGC ATGCTGAAAC TTCTCAACCA
GTCCTTTATG TAAGAATGAT ATAACCAAAA GGAGCCTACA
AACTTGTTGA
ATGCAAACAG
AAGTTTCTAT
AACCCGAAAA
CTGTGAGAAC
AATTGGGATC
ATCAAGAATT
CAAAAAAGGC
CCAGTAATAA
ATCAGGGTAG
GCTCATTACA
AGGCTGAATT
GGGCTGGAAG
ATCTGAATGC
CAGAGAATCC
AAGTTAATGA
GGGAGTCTGA
AATATTCTGG
TATGTAAAAG
TTGGGAAAAC
TAATTATAGG
AATTAAAGCG
CAGATTTGGC
AGAATGGTCA
CTATTCAGAA
AAACGAAAGC
ACAATTCAAA
ATGCGCTTGA
TTGATAGTTG
GGCACAGCAG
GTAACAAGCC
AGTTAACAAA
AGAGCTATTG
CTATAATTTT
CATCCAAAGT
TCCTTCCTTG
TCTGAGGACA
TGATTCTTCT
GTTACAAATC
TGCTTGTGAA
TGATTTGAAC
TTCTGTTTCA
GCATGAGAAC
CTGTAATAAA
TAAGGAAACA
TGATCCCCTG
TAGAGATACT
GTGGTTTTCC
ATCAAATGCC
TTCTTCAGAG
TGAAAGAGTT
CTATCGGAAG
AGCATTTGTT
TAAAAGGAGA
AGTTCAAAAG
AGTGATGAAT
TGAGAAAAAT
TGAACCTATA
AGCACCTAAA
ACTAGTAGTC
TTCTAGCAGT
AAACCTACAA
AAATGAACAG
TGCACCTGGT
AAAATCATTT
GCAAAAAAGG
ATGGGCTACA
CAGGAAACCA
AAGCAGCGGA
GAAGATACCG
ACCCCTCAAG
TTTTCTGAGA
ACCACTGAGA
AACTTGCATG
AGCAGTTTAT
AGCAAACAGC
TGTAATGATA
TGTGAGAGAA
GAAGATGTTC
AGAAGTGATG
AAAGTAGCTG
AAAATAGACT
CACTCCAAAT
AAGGCAAGCC
ACTGAGCCAC
CCTACATCAG
ACTCCTGAAA
ATTACTAATA
CCTAACCCAA
AGCAGCAGTA
AAGAATAGGC
AGTAGAAATC
GAAGAGATAA
CTCATGGAAG
ACAAGTAAAA
TCTTTTACTA
GTGCTTTTCA
AAAATAACTC
GAAACCGTGC
GTCTCAGTGT
TACAACCTCA
TTAATAAGGC
GAACCAGGGA
CGGATGTAAC
AGCGTGCAGC
TGGAGCCATG
TACTCACTAA
CTGGCTTAGC
GGCGGACTCC
AAGAATGGAA
CTTGGATAAC
AACTGTTAGG
ATGTATTGGA
TACTGGCCAG
CAGTAGAGAG
TCCCCAACTT
AGATAATACA
GCCTTCATCC
TGATAAATCA
GTGGTCATGA
TAGAATCACT
TAAGCAATAT
TGAGGAGGAA
TAAGCCCACC
AGAAAAAAAA
GTAAAGAACC
GACATGACAG
AGTGTTCAAA
GCTTGACACA
TCCTGAACAT
CAAAAGACTT
CCAACTCTCT
AAAGACGTCT
AACTTATTGC
TGAAATCAGT
AAATACTGAA
TGAGAGGCAT
TGGCACAAAT
AGACAGAATG
AAGGAGCCAA
CAGCACAGAA
TAAGCAGAAA
ACTAAATAGC
TTCTGATGAC
CGTTCTAAAT
TGATCCTCAT
TAATATTGAA
AAGCCATGTA
AGAGCGTCCC
TGAGGATTTT
GGGAACTAAC
GAATAAAACA
CGAAAAAGAA
GGAACTCGAA
GTCTTCTACC
TAATTGTACT
GTACAACCAA
TGCAACTGGA
CGATACTTTC
TACCAGTGAA
GGTTTGGAGT
CTAAAAGATG
CTACAGAGTG
AACCTTGGAA
GTCTACATTG
AGTGTGGGAG
TTGGATTCTG
CATCATCAAC
CCAGAAAAGT
ACTCATGCCA
AATGTAGAAA
CATAACAGAT
AAAAAGGTAG
CTGCCATGCC
AGCATTCAGA
TCACATGATG
GAGGTAGATG
GAGGCTTTAA
GACAAAATAT
ACTGAAAATC
CTCACAAATA
ATCAAGAAAG
CAAACGGAGC
AAAGGTGATT
TCTGCTTTCA
TTAAATATCC
AGGCATATTC
GAATTGCAAA
ATGCCAGTCA
GCCAAGAAGA
CCAGAGCTGA
CTTAAAGAAT
420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280
TTGTCAATCC TAGCCTTCCA AGAGAAGAAA AAGAAGAGAA ACTAGAAACA GTTAAAGTGT
CTAATAATGC
AAAGATCTGT
AAAGTATCTC
GTGTGAGTCA
ATAATAGAAA
GGGAAACAAG
TCAAGGTTTC
AATGTGCAAC
TTGAATGTGA
AGACAGTTAA
ATGCCAAATG
ACGAAACTGG
CACCACTTTT
AAAACTTTGA
GTACAGTGAG
CAAGCAATAT
TAGGTTCCAG
ATGCTATGCT
GTAATTGTAA
ATACAGATTT
ATGCATCTCA
AAGATACTAG
TCCAGAAAGG
GTTACCGAAG
AAGAGCTTCC
CTACTAGGCA
TATCATTGAA
AGGAACATCA
TGAAGACCCC
AGAGAGTAGC
GTTACTGGAA
GTGTGCAGCA
TGACACAGAA
CATAGAAATG
AAAGCGCCAG
ATTCTCTGCC
ACAAAAGGAA
TATCACTGCA
TAGTATCAAA
ACTCATTACT
TCCCATCAAG
GGAACATTCA
CACAATTAGC
TAATGAAGTA
TGATGAAAAC
TAGATTAGGG
GCATCCTGAA
CTCTCCATAT
GGTTTGTTCT
TTTTGCTGAA
AGAGCTTAGC
AGGGGCCAAG
CTGCTTCCAA
TAGCACCGTT
GAATAGCTTA
CCTTAGTGAG
AAAGATCTCA
AGTATTTCAT
GTTAGCACTC
TTTGAAAACC
GGCTTTAAGT
GAAGAAAGTG
TCATTTGCTC
CACTCTGGGT
GAAAATCAAG
GGCTTTCCTG
GGAGGCTCTA
CCAAATAAAC
TCATTTGTTA
ATGTCACCTG
CGTAATAACA
GGTTCCAGTA
ATTCAAGCAG
GTTTTGCAAC
ATAAAAAAGC
CTGATTTCAG
GAGACACCTG
AATGACATTA
AGGAGTCCTA
AAATTAGAGT
CACTTGTTAT
GCTACCGAGT
AATGACTGCA
GAAACAAAAT
GCAAATACAA
TCTGAAAGCC
GGAACGGGCT
TGTTAAGTGG
TGGTACCTGG
TAGGGAAGGC
CCAAGGGACT
ATCCATTGGG
AACTTGATGC
CGTTTTCAAA
CCTTAAAGAA
GAAAGAATGA
TGGTTGGTCA
GGTTTTGTCT
ATGGACTTTT
AAACTAAATG
AAAGAGAAAT
TTAGAGAAAA
CTAATGAAGT
AACTAGGTAG
CTGAGGTCTA
AAGAATATGA
ATAACTTAGA
ATGACCTGTT
AGGAAAGTTC
GCCCTTTCAC
CCTCAGAAGA
TTGGTAAAGT
GTCTGTCTAA
GTAACCAGGT
GTTCTGCTAG
ACACCCAGGA
AGGGAGTTGG
TGGAAGAAAA
AGAAAGGGTT
TACTGATTAT
AAAAACAGAA
AATTCATGGT
ACATGAAGTT
TCAGTATTTG
TCCAGGAAAT
ACAAAGTCCA
GTCTAATATC
GAAAGATAAG
ATCATCTCAG
ACAAAACCCA
TAAGAAAAAT
GGGAAATGAG
TGTTTTTAAA
GGGCTCCAGT
AAACAGAGGG
TAAACAAAGT
AGAAGTAGTT
ACAGCCTATG
AGATGATGGT
TGCTGTTTTT
CCATACACAT
GAACTTATCT
AAACAATATA
GAACACAGAG
AATATTGGCA
CTTGTTTTCT
TCCTTTCTTG
TCTGAGTGAC
TAATCAAGAA
TTGCAAACTG
GGCACTCAGG
CCAAATAAAT
TGTTCCAAAG
AACCACAGTC
CAGAATACAT
GCAGAAGAGG
AAAGTCACTT
AAGCCTGTAC
CCAGTTGATA
TTCAGAGGCA
TATCGTATAC
CTGCTAGAGG
AACATTCCAA
GAAGCCAGCT
ATTAATGAAA
CCAAAATTGA
CTTCCTGGAA
CAGACTGTTA
GGAAGTAGTC
GAAATAAAGG
AGCAAAAGCG
TTGGCTCAGG
AGTGAGGATG
CCTTCTCAGT
GAGAATTTAT
AAGGCATCTC
TCACAGTGCA
ATTGGTTCTT
AAGGAATTGG
GAGCAAAGCA
2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200
GTGAATTGGA AGACTTGACT
CCAAACAAAT GAGGCATCAG
TTTCAGATGA TGAAGAAAGA
TGGATTCAAA
ACTGCTCAGG
AACATAACCT
ATGGGAGCCA
ACCTGCGAAA
GTGAATACCC
CAGATAGTTC
GCCCATCATT
ACTACCCATC
AGTCTGGGCC
CCCCTTACCT
AAGACAGAGC
AAGTTCCCCA
ATACTGCTGG
CTTCAACAGA
AATTTATGCT
CTGAAGAGAC
TGAAATATTT
AGTCTATTAA
TCAATGGAAG
TCAGGGGGCT
AATGGATGGT
GCACAGGTGT
TCCATGCAAT
GTGTAGCACT
CTTAGGTGAA CCAGCATCTG GGTGTGAGAG TGAAACAAGC GTCTCTGAAG
GCTATCCTCT
GATAAAGCTC
GCCTTCTAAC
TCCAGAACAA
TATAAGCCAG
TACCAGTAAA
AGATGATAGG
TCAAGAGGAG
ACACGATTTG
GGAATCTGGA
CCCAGAGTCA
ATTGAAAGTT
GTATAATGCA
AAGGGTCAAC
CGTGTACAAG
TACTCATGTT
TCTAGGAATT
AGAAAGAAAA
AAACCACCAA
AGAAATCTGT
ACAGCTGTGT
CCACCCAATT
TGGGCAGATG
CTACCAGTGC
CAGAGTGACA
CAGCAGGAAA
AGCTACCCTT
AGCACATCAG
AATCCAGAAG
AATAAAGAAC
TGGTACATGC
CTCATTAAGG
ACGGAAACAT
ATCAGCCTCT
GCTCGTGTTG
GCAGAATCTG
ATGGAAGAAA
AAAAGAATGT
TTTGCCAGAA
GTTATGAAAA
GCGGGAGGAA
ATGCTGAATG
GGTCCAAAGC
TGCTATGGGC
GGTGCTTCTG
GTGGTTGTGC
TGTGAGGCAC
CAGGAGCTGG
TTTTAACCAC
TGGCTGAACT
CCATCATAAG
AAAAAGCAGT
GCCTTTCTGC
CAGGAGTGGA
ACAGTTGCTC
TTGTTGATGT
CTTACTTGCC
TCTCTGATGA
GCAACATACC
CCCAGAGTCC
GTGTGAGCAG
CCATGGTGGT
AACACCACAT
CAGATGCTGA
AATGGGTAGT
AGCATGATTT
GAGCAAGAGA
CCTTCACCAA
TGGTGAAGGA
AGCCAGATGC
CTGTGGTGAC
ACACCTAACC
TCAGCAGAGG
AGAAGCTGTG
TGACTCTTCT
ATTAACTTCA
TGACAAGTTT
AAGGTCATCC
TGGGAGTCTT
GGAGGAGCAA
AAGGCAAGAT
CCCTGAATCT
ATCTTCAACC
AGCTGCTGCT
GGAGAAGCCA
GTCTGGCCTG
CACTTTAACT
GTTTGTGTGT
TAGCTATTTC
TGAAGTCAGA
ATCCCAGGAC
CATGCCCACA
GCTTTCATCA
CTGGACAGAG
CCGAGAGTGG
TGATACCCCA
GATACCATGC
TTAGAACAGC
GCCCTTGAGG
CAGAAAAGTA
GAGGTGTCTG
CCTTCTAAAT
CAGAATAGAA
CAGCTGGAAG
CTAGAGGGAA
GATCCTTCTG
TCTGCATTGA
CATACTACTG
GAATTGACAG
ACCCCAGAAG
AATCTAATTA
GAACGGACAC
TGGGTGACCC
GGAGATGTGG
AGAAAGATCT
GATCAACTGG
TTCACCCTTG
GACAATGGCT
GTGTTGGACA
GATCCCCCAC
4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5712
AGCCACTACT GA
INFORMATION FOR SEQ ID NO:13:
SEQUENCE CHARACTERISTICS:
LENGTH: 26 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
1 5 10
Ala Met Gln Lys Ile Leu Glu Cys Pro Ile
INFORMATION FOR SEQ ID NO:14:
SEQUENCE CHARACTERISTICS:
LENGTH: 38 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
1 5 10
Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
25
Glu Pro Val Ser Thr Val
INFORMATION FOR SEQ ID NO:15:
SEQUENCE CHARACTERISTICS:
LENGTH: 63 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
1 5 10
Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
25
Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met
40
Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu
55
INFORMATION FOR SEQ ID NO:16:
SEQUENCE CHARACTERISTICS:
LENGTH: 1863 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
WO 96/33271 Met Asp 1 Ala Met Glu Pro Leu Lys Lys Asn Gin Leu Thr Gly Asn Ser Gly Tyr 130 Pro Ser 145 Thr Val Ser Val Lys Ala Pro Gin 210 Ala Cys 225 Pi:o Ser His Pro Pro Cys Ser Leu 290 Cys Asn 305 Trp, Ala Leu Gin Val Leu Asp Val1 Leu Pro 115 Arg Leu Arg Tyr Thr 195 Gly Giu Asn Giu Gly 275 Leu Lys G ly PCTIUS96/05621 I Ile Asn .i Ile Lys 2 Cys Met Leu Cys ;Phe Ser i Leu Asp s Giu Asn i Ser Met )Giu Asn a Leu Gly 160 i Lys Thr 175 Val Asn SIle Thr 3 Lys Ala 3 His Gin 240 i Giu Arg 255 5 Val Giu i Asn Ser i. Giu Phe 3 Asn Arg 320 Ser Thr 335 52 .3
WO 96/33271 Giu Lys Lys Val Asp Leu Asn Ale 340 PCTIUS96/05621 Arg Lys Giu 350 Asp Pro Leu Cys Giu 345 Trp Asp Trp 385 Gly Asn Ala Ser 'y r 465 Leu Pro His Pro Val.
545 Ser Giu Asn Asn Leu 625 Ile Gin Gin Trp Arg Giu Asp 420 Pro Val Lys Gly Asn 500 Asp Ile Ile Asn Phe 580 I.Su Arg Ser Cys Val.
660 Cys 360 Asn Leu Lys
C,Y
Leu 440 Ile Pro Thr Arg Lys 520 Thr Gly Pro Ala Ile 600 Ser Ser Giu Arg Giu Giu Asp 400 Leu Leu His Thr Asn 480 Arg Leu Thr Gin Asp 560 Lys Ser Lys Giu Gin 640 Asi.
Lys
Glu Pro Ala Thr Gly Ala Lys Lys Ser Asn Lys Pro Asn Glu Gln Thr
675 680 685
Ser Lys Arg His Asp Ser Asp Thr Phe Pro Glu Leu Lys Leu Thr Asn
690 695 700
Ala Pro Gly Ser Phe Thr Lys Cys Ser Asn Thr Ser Glu Leu Lys Glu
705 710 715 720
Phe Val Asn Pro Ser Leu Pro Arg Glu Glu Lys Glu Glu Lys Leu Glu
725 730 735
Thr Val Lys Val Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu
740 745 750
Ser Gly Glu Arg Val Leu Gln Thr Glu Arg Ser Val Glu Ser Ser Ser
755 760 765
Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly Thr Gln Glu Ser Ile Ser
770 775 780
Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys
785 790 795 800
Cys Val Ser Gln Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His
805 810 815
Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro
820 825 830
Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu
835 840 845
Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser
850 855 860
Lys Arg Gln Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu
865 870 875 880
Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser
885 890 895
Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys
900 905 910
Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly
915 920 925
Phe Pro Val Val Gly Gln Lys Asp Lys Pro Val Asp Asn Ala Lys Cys
930 935 940
Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly
945 950 955 960
Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gln Asn
965 970 975
Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr
980 985 990
Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met
995 1000 1005
Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser
1010 1015 1020
Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Glu Ala Ser
1025 1030 1035 1040
Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser
1045 1050 1055
Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gln Ala Glu Leu
1060 1065 1070
Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val
1075 1080 1085
Leu Gln Pro Glu Val Tyr Lys Gln Ser Leu Pro Gly Ser Asn Cys Lys
1090 1095 1100
His Pro Glu Ile Lys Lys Gln Glu Tyr Glu Glu Val Val Gln Thr Val
1105 1110 1115 1120
Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gln Pro
1125 1130 1135
Met Gly Ser Ser His Ala Ser Gln Val Cys Ser Glu Thr Pro Asp Asp
1140 1145 1150
Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn
1155 1160 1165
Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gln Lys Gly
1170 1175 1180
Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gln
1185 1190 1195 1200
Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu
1205 1210 1215
Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gln His Leu Leu Phe Gly
1220 1225 1230
Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala
1235 1240 1245
Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys
1250 1255 1260
Asn Ser Leu Asn Asp Cys Ser Asn Gln Val Ile Leu Ala Lys Ala Ser
1265 1270 1275 1280
Gln Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe
1285 1290 1295
Ser Ser Gln Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr
1300 1305 1310
Gln Asp Pro Phe Leu Ile Gly Ser Ser Lys Gln Met Arg His Gln Ser
1315 1320 1325
Glu Ser Gln Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp
1330 1335 1340
Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gln Glu Glu Gln Ser
1345 1350 1355 1360
Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr
1365 1370 1375
Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gln Ser Asp Ile Leu
1380 1385 1390
Thr Thr Gln Gln Arg Asp Thr Met Gln His Asn Leu Ile Lys Leu Gln
1395 1400 1405
Gln Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gln His Gly Ser Gln
1410 1415 1420
Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Glu
1425 1430 1435 1440
Asp Leu Arg Asn Pro Glu Gln Ser Thr Ser Glu Lys Ala Val Leu Thr
1445 1450 1455
Ser Gln Lys Ser Ser Glu Tyr Pro Ile Ser Gln Asn Pro Glu Gly Leu
1460 1465 1470
Ser Ala Asp Lys Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn
1475 1480 1485
Lys Glu Pro Gly Val Glu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu
1490 1495 1500
Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gln Asn Arg
1505 1510 1515 1520
Asn Tyr Pro Ser Gln Glu Glu Leu Ile Lys Val Val Asp Val Glu Glu
1525 1530 1535
Gln Gln Leu Glu Glu Ser Gly Pro His Asp Leu Thr Glu Thr Ser Tyr
1540 1545 1550
Leu Pro Arg Gln Asp Leu Glu Gly Thr Pro Tyr Leu Glu Ser Gly Ile
1555 1560 1565
Ser Leu Phe Ser Asp Asp Pro Glu Ser Asp Pro Ser Glu Asp Arg Ala
1570 1575 1580
Pro Glu Ser Ala Arg Val Gly Asn Ile Pro Ser Ser Thr Ser Ala Leu
1585 1590 1595 1600
Lys Val Pro Gln Leu Lys Val Ala Glu Ser Ala Gln Ser Pro Ala Ala
1605 1610 1615
Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Glu Glu Ser Val
1620 1625 1630
Ser Arg Glu Lys Pro Glu Leu Thr Ala Ser Thr Glu Arg Val Asn Lys
1635 1640 1645
Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Glu Glu Phe Met Leu
1650 1655 1660
Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile
1665 1670 1675 1680
Thr Glu Glu Thr Thr His Val Val Met Lys Thr Asp Ala Glu Phe Val
1685 1690 1695
Cys Glu Arg Thr Leu Lys Tyr Phe Leu Gly Ile Ala Gly Gly Lys Trp
1700 1705 1710
Val Val Ser Tyr Phe Trp Val Thr Gln Ser Ile Lys Glu Arg Lys Met
1715 1720 1725
Leu Asn Glu His Asp Phe Glu Val Arg Gly Asp Val Val Asn Gly Arg
1730 1735 1740
Asn His Gln Gly Pro Lys Arg Ala Arg Glu Ser Gln Asp Arg Lys Ile
1745 1750 1755 1760
Phe Arg Gly Leu Glu Ile Cys Cys Tyr Gly Pro Phe Thr Asn Met Pro
1765 1770 1775
Thr Asp Gln Leu Glu Trp Met Val Gln Leu Cys Gly Ala Ser Val Val
1780 1785 1790
Lys Glu Leu Ser Ser Phe Thr Leu Gly Thr Gly Val His Pro Ile Val
1795 1800 1805
Val Val Gln Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile
1810 1815 1820
Gly Gln Met Cys Glu Ala Pro Val Val Thr Arg Glu Trp Val Leu Asp
1825 1830 1835 1840
Ser Val Ala Leu Tyr Gln Cys Gln Glu Leu Asp Thr Tyr Leu Ile Pro
1845 1850 1855
Gln Ile Pro His Ser His Tyr
1860
INFORMATION FOR SEQ ID NO:17:
SEQUENCE CHARACTERISTICS:
LENGTH: 80 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asn
1 5 10 15
Ala Met Gln Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys
25 30
Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met
40 45
Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Ser Gln Cys Pro Leu Cys
55 60
Lys Asn Asp Ile Thr Lys Ser Val Leu Lys Arg Leu Ile Ile Thr Cys
70 75 80
(ii) M (xi) S Met A 1 Ala M Glu P Leu L Lys A Gin Thr Asn Gly Pro 145 Thr Ser Lys Pro Ala 225 Pro His PCTfUS9/O5621 WO 961332i 1 INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 312 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: PCT[US96105621 Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gln Asn Val Ile Asni Ala Glu Leu Lys Gln Thr Asn Gly Pro 145 Thr Ser Lys Pro Ala 225 Pro His Leu Lys Gln Lys 70 Leu Ala Ala Thr 150 Arg Leu Ser Asp Glu 230 Leu Gln Lys Met Cys Ser Asp Asn Met Asn Gly 160 Thr Asn Thr Al a Gln 240 Arg Glu -jo WO 96/33271 PCT/US96/05621 Pro Cys Gly Thr Asn Thr His Ala Ser Ser Leu Gin His Glu Asn Ser 275 280 285 Ser Leu Leu Leu Thr Lys Asp Arg Met Asn Val Glu Lys Ala Glu Phe 290 295 300 Cys Asn Lys Ser Lys Arg Leu Ala 305 310 INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 765 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gin Asn Val Ile Asn 1 5 10 Ala Met Gin Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys 25 Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 40 Leu Lys Leu Leu Asn Gin Lys Lys Gly Pro Ser Gin Cys Pro Leu Cys 55 Lys Asn Asp Ile Thr Lys Arg Ser Leu Gin Glu Ser Thr Arg Phe Ser 65 70 75 Gin Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gin Leu Asp 90 Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn 100 105 110 Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gin Ser Met 115 120 125 Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gin Ser Glu Pro Glu Asn I 130 135 140 Pro Ser Leu Gin Glu Thr Ser Leu Ser Val Gin Leu Ser Asn Leu Gly 145 150 155 160 Thr Val Arg Thr Leu Arg Thr Lys Gin Arg Ile Gin Pro Gin Lys Thr 165 170 175 Ser Val Tyr Ile Glu Leu Gly Ser Asp Ser Ser Glu Asp Thr Val Asn 180 185 190 Lys Ala Thr Tyr Cys Ser Val Gly Asp Gin Glu Leu Leu Gin Ile Thr 195 
200 205 Pro Gin Gly Thr Arg Asp Glu Ile Ser Leu Asp Ser Ala Lys Lys Ala 210 215 220 59 a 46 WO 96/33271 PCT/US96/05621 Ala Cys Giu Phe Ser Glu Thr Asp Val. Thr Asn Thr Giu His His Gin 225 230 235 240 4G 55 Asn Giu Gly 275 Leu Lys G ly Lys Lys 355 Pro Ser Ser Val Asp 435 Ser Lys Ile Thr Giu 515 Met Asn Asn Lys 260 Thr Leu Ser Ser Val 340 Gin Trp Arg Giu Asp 420 Pro Val.
Lys Gly Asn 500 Asp Ile Ile Leu Gin Thr Lys Gin 310 Giu Leu Leu Thr Asp 390 Asn Tyr Giu Ser Ser 470 Phe Leu Ile Gin Asn 550 Asn Giy His Asp 295 Pro Thr Asn Pro Leu 375 Glu Ala Ser Ala Asn 455 Leu Val Lys Lys Giy 535 Ser Thr Ser 265 Ser Me t Leu Asn Asp 345 Ser Ser Leu Vai Ser 425 Ile Glu Asn Giu Lys 505 Ala Asn His Giu 250 Vai Ser Asn Aia Asp 330 Pro Giu Ser Gly Ala 410 Ser Cys Asp Leu Pro 490 Arg Asp Gin Arg Asn Gin Giu 300 Ser Arg Cys Pro 380 Asp Val Lys Ser Ile 460 His Ile Pro Aia Glu 540 Aia Leu His 285 Lys Gin Thr Giu Arg
Lys Asp Leu Ile Giu 445 Ph Vai Ile Thr Vai 525 Gin Aia His 270 Giu Ala His Pro Arg 350 Asp Val Ser Asp Asp 430 Arg Giy Thr Gin Ser 510 Gin Asn Arg Giu Ser Phe Arg 320 Thr Giu Giu Giu Asp 400 Leu Leu His Thr Asn 480 Arg Leu Thr Gin
Giu Asn Lys Thr Lys Giy 555 WO 96/33271 Ser Ile Giu Ser Asn Met Asn Arg 610 Leu Val 625 Ile Asp Gin Met Giu Pro Ser Lys 690 Ala Pro 705 Phe Vai Thr Val Giu Ser Lys Ile 620 Cys Lys Leu Pro Leu 700 Ser Giu Lys Vali PCT[US96/05621 Leu Giu Lys 575 5cr Ile Ser 590 Pro Lys Lys Ala Leu Giu Giu Leu Gin 640 Lys Tyr Asn 655 Giu Ciy Lys 670 Giu Gin Thr Leu Thr Asn Leu Lys Giu 720 Lys Leu Glu 735 Leu Met Leu 750 Ser Gly Arg Vai Leu Gin Thr 760 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: (ii) (xi) Met 1 Ala Glu Leu LENGTH: 900 amino acids TYPE: amino acid STRAN~DEDNESS: single TOPOLOGY: linear MOLECULE TYPE: protein SEQUENCE DESCRIPTION: SEQ ID Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gin Asn Vai Ile As,, 5 10 Met Gin Lys Ile Leu Giu Cys Pro Ile Cys Leu Glu Leu Ile Lys 25 Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 40 Lys Leu Leu Asn Gin Lys Lys Gly Pro 5cr Gin Cys Pro Leu Cys 55 61 AGTTAACAAA TGCACCTGGT TCTTTTACTA AGTGTTCAAA TACCAUTUIA~ i '~TTAAG TTTCAGATGA T WO 96/33271 Lys Asn Asp Ile PCTIUS96/05621 Arg Ser Leu Gin Giu Ser Thr Arg Phe Ser WO 96/33271 Gly C- Thr Lys Gin Thr Asn Giy Pro 145 Thr Ser LysJ Pro( Aia 225 Pro HisI ProC Ser1 CysP 305 Trp TrpA 'spV TrpP 385 Leu Giy Ser Tyr 130 Ser Vai Vai Aia Gin 210 Cys Ser Pro Cys Leu 290 Asn Aia Lys Asn Vai 370 Phe Vai Leu Pro 115 Arg Leu Arg Tyr Thr 195 Giy Giu Asn Giu Giy 275 Leu Lys Giy Lys Lys 355 Pro SerJ Giu Giu 100 Giu Asn Gin Thr Ile 180 Tyr Thr Phe Asn Lys 260 Thr Leu Ser Ser Vai 340 Gin Trp Arg Giu Leu Tyr Aia His Leu Arg Aia Giu Thr 150 Leu Arg 165 Giu Leu Cys Ser Arg Asp Ser Giu 230 Asp Leu 245 Tyr Gin Asn Thr Thr Lys Lys Gin 310 Lys Giu 325 Asp Leu Lys LeuI Ile ThrI Ser Asp 390 Leu Asn Lys Lys i3 5 Ser Thr Giy Vai Giu 215 Thr Asn Giy His Asp 295 Pro Thr Asn ProI Leu 375 GiuI Lys Ser Asp 120 Arg Leu Lys Ser Giy 200 Ile Asp Thr Ser Aia 280 ArgI GiyI Cysj AiaJ Cys 360 Asn LeuI Ile Tyr 105 Giu Leu I S erI Asp 185 Asp SerI ValI ThrC 265 MetP LeuP AsnP AspF 345 Ser G SerS LeuG Ile 90 Asn Vai Leu Vai Arg 
170 Ser Gin Leu Thr Giu 250 Vai Ser Asn Aia Asp 330 ProI Giu Ser !Cys LPhe Ser Gin Gin Ilie Ser GiuI Asp Asn1 235 LysI SerI LeuC Vai Arg S 315 ArgP LeuC AsnF IleG Aia Aia Ile S er i4 0 Leu Gin Giu Leu Ser 220 Thr Arg Asn Gin Giu 300 Ser Arg Cys Pro GinI LPhe LLys Ile i2 Ser Pro Asp Leu 205 AiaI GiuI Aiaj LeuI His 285 LysI GinI ThrI Giu ArgP 365 Lys N Gin Lys Gin Pro Asn Gin Thr 190 Gin Lys His Aia His 270 Giu Aia His Pro Arg 350 Asp Val LLeu Giu Ser Giu Leu Lys 175 Vai Ile Lys His Giu 255 Vai Asn Giu Asn Ser 335 Lys Thr Asn IAsp iAsn 7Met Asn- Giy 160 Thr Asn Thr Aia Gin 240 Arg Giu Ser Phe Arg 320 Thr Giu Giu Giu I110 Asn C Aia S Ser I Tyr 1 465 Leu Pro His Pro Vai 545 Ser Giu Asn Asn Leu 625 Ile Gin Giu Ser Aia 705 ~a 380 Giy Ser 395 Asp Asp Ser His
TTITCAGATGA TGAAGAAAGA GGAACGGGCT TGGAAGAAAA. TAATCAAGA GAGCAAAGCA 4200 j WO 96/33271 PCTJUS96/05621 Gly Glu Ser Giu Ser Asn Ala Lys Val Ala Asp Val Leu Asp Val Leu 405 410 415 Asn Ala Ser Tyr 465 Leu Pro His Pro Val 545 Ser Giu Asn Asn Leu 625 Ile Gin Giu Ser Ala 705 Phe Giu Ser Lys 450 Arg Ile Leu Pro Giu 530 Met Ile S er Met Arg 610 Val Asp Met Pro Lys 690 prQ, Val Glu His Glu Ala Ala 485 Lys Phe Asn Thr Giu 565 Lys Giu Arg Arg Ser 645 Arg Gly Asp Phe Sr 725 Gly Leu 440 Ile Pro Thr Arg Lys 520 Thr Gly Pro Ala Ile 600 Ser Ser Giu Arg Lys 680 Thr Cys Arg Asp 430 Arg Giy Thr Gin Ser 510 Gin Asn Lys Leu Ser 590 Pro Ala Giu Lys Giu 670 Giu Leu Leu Lys
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: WO 96/33271 Thr Val Lys Val Ser Asn Asn Ala Glu Asp 740 745 Ser Gly Glu Arg Val Leu Gln Thr Glu Arg 755 760 Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly 770 775 Leu Leu Glu Val Ser Thr Leu Gly Lys Ala 785 790 Cys Val Ser Gln Cys Ala Ala Phe Glu Asn 805 810 Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr 820 825 Leu Gly His Glu Val Asn His Ser Arg Glu 835 840 Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gln 850 855 Lys Arg Gln Ser Phe Ala Pro Phe Ser Asn 865 870 Glu Cys Ala Thr Phe Ser Ala His Ser Gly 885 890 Lys Ser His Phe 900 INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 914 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: Met Asp Leu Ser Ala Leu Arg Val Glu Glu 1 5 10 Ala Met Gln Lys Ile Leu Glu Cys Pro Ile 20 Glu Pro Val Ser Thr Lys Cys Asp His Ile 35 Leu Lys Leu Leu Asn Gln Lys Lys Gly Pro Lys Asn Asp Ile Thr Lys Arg Ser Leu Gln 70 Gln Leu Val Glu Glu Leu Leu Lys Ile Ile 90 Pro Ser Thr Lys 795 Pro Glu Thr Asn Pro 875 Ser Val Cys Phe Ser Glu Cys PCT/US96/05621 Leu Met Leu 750 Ser Ser Ser Ser Ile Ser Pro Asn Lys 800 Leu Ile His 815 Lys Tyr Pro 830 Glu Met Glu Lys Val Ser Ala Glu Glu 880 Thr Lys Ser 895 ^r Ile Ile Cys Leu Phe Leu WO 96/3327 1 Thr G Asn S Gly T 1 Pro S 145 Thr V Ser V Lys A P ro G 2 Ala C 225 Pro S His P Pro C Ser L 2 Cys A 305 Trp A Giu L Trp A Asp V 3 Trp P 385 Gly G Asn G PCT/US96/05621 Lys Glu Asn 110 Gin Ser Met Pro Glu Asn Asn Leu Gly 160 Gin Lys Thr 175 Thr Val Asn 190 Gin Ile Thr Lys Lys Ala His His Gin 240 Ala Giu Arg 255 His Val Giu 270 Giu Asn Ser Ala Giu Phe His Asn Arg 320 Pro Ser Thr 335 Arg Lys Giu 350 Asp Thr Giu Val Asn Giu Ser His Asp 400 Asp Vai Leu 415 Asp Leu Leu 430 WO 96/33271 PC' )6/05621 Ala Ser Asp Pro His Giu Ala Leu Ile Cys Lys Ser Giu Arg Val His 435 440 445 Ser Lys Ser Val Glu Ser Asn Ile Giu Asp Lys Ile Phe Gly Lys Thr 4590 455 460 Tyr Arg Lys Lys Ala Ser Leu Pro Asn Leu Ser His Val Thr Glu Asn 
465 470 475 480 Leu Ile Ile Gly Ala Phe Val Thr Giu Pro Gln Ile Ile Gin Giu Arg 485 490 495 Pro Leu Thi Asn Lys Leu Lys Arg Lys Arg Arg Pro Thr Ser Gly Leu- 500 505 510 His Pro Giu Asp Phe Ile Lys Lys Ala Asp Leu Ala Val Gin Lys Thr 515 520 525 Pro Glu Met Ile Asn Gin Gly Thr Asn Gin Thr Giu Gin Asn Gly Gin 530 535 540 Val Met Asn Ile Thr Asn Ser Gly His Giu Asn Lys Thr Lys Gly Asp 545 550 555 560 Ser Ile Gin Asn Giu Lys Asn Pro Asn Pro Ile Giu Ser Leu Giu Lys 565 570 575 Giu Ser Ala Phe Lys Thr Lys Ala Giu Pro Ile Ser Ser Ser Ile Ser 580 585 590 Asn Met Giu Leu Giu Leu Asn Ile K.As Asn S3er Lys Ala Pro Lys Lys 595 600 605 Asn Arg Leu Arg Arg Lys Ser Ser Thr Arg His Ile His Ala Leu Giu 610 615 620 Leu Vai Val Ser Arg Asn Leu Ser Pro Pro Asn Cys Thr Giu Leu Gin 625 630 635 640 Ile Asp Ser Cys Ser Ser Ser Glu Giu Ile Lys Lys Lys Lys Tyr Asn 645 650 655 Gin Met Pro Vai Arg His Ser Arg Asr~ Leu Gin Leu Met Giu Giy Lys 45660 665 670 Giu Pro Ala Thr Giy Ala Lys Lys Ser Asn Lys Pro Asn Giu Gin Thr 675 680 685 Ser Lys Arg His Asp Ser Asp Thr Phe Pro Giu Leu Lys Leu Thr Asn 690 695 700 Ala Pro Giy Ser Phe Thr Lys Cys Ser Asn Thr Ser Giu Leu Lys Giu 705 710 715 720 Phe Val Asn Pro Ser Leu Pro Arg Giu Giu Lys Glu Giu Lys Leu Giu 725 730 735 Thr Vai Lys Ser Asn Asn Ala Glu Asp Pro Lys Asp Leu Met Leu 60740 745 750 Ser Gly Giu Arg Vai Leu Gin Thr Glu Arg Ser Val Giu Ser Ser Ser 755 760 765 000 b/U WO 96/33271 Ile Ser Leu Val Pro Gly Thr Asp Tyr Gly 770 775 Leu Leu Glu Val Ser Thr Leu Gly Lys Ala 785 790 Cys Val Se; Gln Cys Ala Ala Phe Glu Asn 805 810 Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr 820 825 Leu Gly His Glu Val Asn His Ser Arg Glu 835 840 Glu Ser Glu Leu Asp Ala Gln Tyr Leu Gin 850 855 Lys Arg Gin Ser Phe Ala Pro Phe Ser Asn 865 870 Glu Cys Ala Thr Phe Ser Ala His Ser Gly 885 890 Pro Lys Val Thr Pne Glu Cys Glu Gin Lys 900 905 Asn Glu INFORMATION FOR SEQ ID NO.22: SEQUENCE CHARACTERISTICS: LENGTH: 1202 amino acids 33 TYPE: e.mino acid STRANDEDNESS: single TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: Met Asp Leu Ser Ala Leu Arg Val Glu Glu 1 5 Ala Met Gin Lys Ile Leu Glu Cys Pro Ile Glu Prn Val Ser Thr Lys Cys Asp His Ile 35 Leu Lys Leu Leu Asn Gin Lys Lys Gly Pro Lys Asn Asp Ile Thr Lys Arg Ser Leu Gin Gin Leu Val Glu Glu Leu Leu Lys Ile Ile Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn 100 105 PCT/US96/05621 Ser Ile Ser Pro Asn Lys 800 Leu Ile His 815 Lys Tyr Pro 830 Glu Met Glu Lys Val Ser Ala Glu Glu 880 Lys Gin Ser 895 Gin Cly Lys 910 67 ~u
1000 1005 WO 96/33271 Asn Ser Pro Glu His Leu Lys 115 PCT1US96/05621 Giu Val. Ser Ile Ile Gin Ser Met 125 Arg Leu Arg Tyr Thr 195 Giy Giu Asn Giu Gly 275 Leu Lys Giy Lys Lys 355 Pro Ser Ser Vai Asp 435 Leu Val1 Arg 170 Ser Gin Leu Thr Giu 2 Val Ser Asn Ala Asp 330 Pro G iu Ser Gly Ala 410 Ser Cys Glu Pro Giu Ser Pro Asp Leu 205 Ala Glu Ala Leu His 285 Lys Gin Thr Glu Arg 365 Lys Asp Leu Ile Glu 445 Asn Gin Thr 190 Gin Lys His Aia His 270 Giu Aia His Pro Arg 350 Asp Val Ser Asp Asp 430 Arg Asn Giy 160 Thr Asn Thr Ala Gin 240 Arg Giu Ser Phe Arg 320 Thr Giu Glu Giu Asp 400 Leu Leu His 153U1335 14 1340 WO096/33,: -1 Ser I Tyr 1 465 Leu Pro I His I Pro Val 1 545 Ser Giu Asn I Asn I Leu 1 625 Ile I Gin I GiuI Ser I Ala 1 705 Phe I ThrI SerC Ile Ile 460 His Ile Pro Ala Giu 540 Lys Giu Ser Lys Ile 620 Cys Lys Leu Pro Leu 700 Ser Giu Lys Val Gin 780 PCTIUS96/05621 Giy Lys Thr Thr Giu Asn 480 Gin Giu Arg 495 Ser Giy Leu 510 Gin Lys Thr Asn Gly Gin Lys Giy Asp 560 Leu Giu Lys 575 Ser Ile Ser 590 Pro Lys Lys Aia Leu Giu Giu Leu Gin 640 Lys Tyr Asn 655 Giu Giy Lys 670 Giu Gin Thr Leu Thr Asn Leu Lys Giu 720 Lys Leu Giu 735 Leu Met Leu 750 Ser Ser Ser Ser Ile Ser 1 (59 1665 1670 1675 160 56 WO 96/33271 PCTUS96/05621 Leu Leu Glu Val Ser Thr Leu Gly Lys Ala Lys Thr Glu Pro Asn Lys 785 790 795 800 Cys Val Ser Gin Cys Ala Ala Phe Glu Asn Pro Lys Gly Leu Ile His 805 810 815 Gly Cys Ser Lys Asp Asn Arg Asn Asp Thr Glu Gly Phe Lys Tyr Pro 820 825 Leu Gly His Glu Val Asn His Ser Arg Glu Thr Ser Ile Glu Met Glu 835 840 845 Glu Ser Glu Leu Asp Ala Gin Tyr Leu Gin Asn Thr Phe Lys Val Ser- 850 855 860 Lys Arg Gin Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu 865 870 875 880 Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gin Ser 885 890 895 Pro Lys Val Thr Phe Glu Cys Glu Gin Lys Glu Glu Asn Gin Gly Lys 900 905 910 Asn Glu Ser Asn Ile Lys Pro Val Gin Thr Val Asn Ile Thr Ala Gly 915 920 925 Phe Pro Val Val Gly Gin Lys Asp Lys Pro Val Asp Asn Ala Lys Cys 930 935 940 Ser Ile Lys 
Gly Gly Ser Arg Phe Cys Leu Ser Ser Gln Phe Arg Gly 945 950 955 960 Asn Glu Thr Gly Leu Ile Thr Pro ADr' Lys His Gly Leu Leu Gin Asn 965 70 975 Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ie Lys Ser Phe Val Lys Thr 980 985 990 Lys Cys Lys Lys Asn Leu Leu Glu Glu Asn Phe Glu Glu His Ser Met 995 1000 1005 Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser 1010 1015 1020 Thr Ile Ser Arg Asn Asn Ile Arg Glu Asn Val Phe Lys Glu Ala Ser 1025 1030 1035 1040 Ser Ser Asn Ile Asn Glu Val Gly Ser Ser Thr Asn Glu Val Gly Ser 1045 1050 loss Ser Ile Asn Glu Ile Gly Ser Ser Asp Glu Asn Ile Gin Ala Glu Leu 1060 1065 1070 Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val 1075 1080 1085 Leu Gin Pro Glu Val Tyr Lys Gin Ser Leu Pro Gly Ser Asn Cys Lys 1090 1095 1100 His Pro Glu Ile Lys Lys Gin Glu Tyr Glu Glu Val Val Gin Thr Val 1105 1110 1115 1120 L. 70 75 57
WO 96/33271 PCTIUS96/05621 Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gin Pro 1125 1130 1135 Met Gly Ser Ser His Ala Ser Gin Val Cys Ser Glu Thr Pro Asp Asp 1140 1145 1150 Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn 1155 1160 1165 Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gin Lys Gly 1170 1175 1180 Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gin 1185 1190 1193 1200 Gly Tyr INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 1363 amino acids TYPE: amino acid ;.IlNDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gin Asn Val Ile Asn 1 5 10 Ala Met Gin Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Leu Ile Lys 20 25 Glu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 40 Leu Lys Leu Leu Asn Gin Lys Lys Gly Pro Ser Gin Cys Pro Leu Cys 55 Lys Asn Asp Ile Thr Lys Arg Ser Leu Gin Glu Ser Thr Arg Phe Ser 70 75 Gin Leu Val Glu Glu Leu Leu Lys Ile Ile Cys Ala Phe Gin Leu Asp 90 Thr Gly Leu Glu Tyr Ala Asn Ser Tyr Asn Phe Ala Lys Lys Glu Asn 100 105 110 Asn Ser Pro Glu His Leu Lys Asp Glu Val Ser Ile Ile Gin Ser Met x 115 120 125 Gly Tyr Arg Asn Arg Ala Lys Arg Leu Leu Gin Ser Glu Pro Glu Asn 130 135 140 Pro Ser Leu Gln Glu Thr Ser Leu Ser Val Gin Leu Ser Asn Leu Gly 145 150 155 160 Thr Val Arg Thr Leu Arg Thr Lys Gin Arg Ile Gin Pro Gin Lys Thr 165 170 175 71
f.Lb Z.Lw J x 260 265 58 WO 96/3327 1 Ser Val Lys Ala Pro Gin 210 Ala Cys 225 Pro Ser His Pro Pro Cys Ser Leu 290 Cys Asn 305 Trp Ala Glu Lys Trp Asn Asp Val 370 Trp Phe 385 Gly Glu Asn Giu Ala Ser Ser Lys 450 Tyr Arg 465 Leu Ile Giu Cys Arg Ser Asp 245 Tyr Asn Thr Lys Lys 325 Asp Lys Ile Ser Ser 405 Glu His Glu Ala Ala 485 Leu S er Asp Giu 230 Leu Gin Thr Lys Gin 310 Glu Leu Leu Thr Asp 390 Asn Tyr Glu Ser Ser 470 Phe PCTIUS96105621 Thr Vai Asn 190 Gin Ile Thr Lys Lys Ala His His Gin 240 Ala Giu Arg 255 His Val Glu 270 Giu Asn Ser Ala Glu Phe His Asn Arg 320 Pro Ser Thr 335 Arg Lys Glu 350 Asp Thr Glu Val Asn Giu Ser His Asp 400 Asp Val Leu 415 Asp Leu Leu 430 Arg Val His Gly Lys Thr Thr Glu Asn 480 Gin Glu Arg 495 Ser Gly Leu 510 Pro Leu Thr Asn Lys Leu Lys Arg 500 Pro Gin G.Ly Tnr arg A~sp ti.u .L.Ltemt- 210 215 220 WO 96/33271 His Pro Giu Asp Phe Ile Lys 515 PCTUS96/05621 Ala Asp Leu Ala Val Gin Lys Thr 525 Met PAsn Gin Ala Glu 595 Leu Va 1 Ser Pro Ala 675 Arg Gly Asn Lys Glu 755 Leu Glu Ser Ser His 835 Ile Ile Asn Phe 580 Leu Arg Ser Cys Val 660 Thr His Ser Pro Val1 740 Arg Val1 Val Gin Lys 820 Giu Gin Asn 550 Lys Thr Leu Lys Asn 630 Ser His Ala Ser Thr 710 Leu Asn Leu Gly Thr 790 Ala Asn Asn Giy Gly Giu 575 Ile Lys Leu Leu Tyr 655 Gly Gin Thr Lys Leu 735 Met S er Ile Asn Ile 815 Tyr Met Gin Asp 560 Lys Ser Lys Giu Gin 640 Asn Lys Thr Asn Giu 720 Giu Leu Ser Ser Lys 800 His Pro Glu va. 
o on flS %ZC y f2l± Z U &mLWX *UU.Jjjy 545 550 555 560 WO 96/33271 PCTIUS96/05621 Giu Ser Glu Leu Asp Ala Gln Tyr Leu Gln Asn Thr Phe Lys Val Ser 850 855 860 Lys Arg Gin Ser Phe Ala Pro Phe Ser Asn Pro Gly Asn Ala Glu Glu 865 870 875 880 Glu Cys Ala Thr Phe Ser Ala His Ser Gly Ser Leu Lys Lys Gln Ser 885 890 895 Pro Lys Val Thr Phe Glu Cys Glu Gln Lys Glu Glu Asn Gln Gly Lys 900 905 910 Asn Glu Ser Asn Ile Lys Pro Val Gln Thr Val Asn Ile Thr Ala Gly- 915 920 925 Phe Pro Val Val Gly Gin Lys Asp Lys Pro Val Asp Asn Ala Lys Cys 930 935 940 Ser Ile Lys Gly Gly Ser Arg Phe Cys Leu Ser Ser Gin Phe Arg Gly 945 950 955 960 Asn Glu Thr Gly Leu Ile Thr Pro Asn Lys His Gly Leu Leu Gin Asn 965 970 975 Pro Tyr Arg Ile Pro Pro Leu Phe Pro Ile Lys Ser Phe Val Lys Thr 980 985 990 Lys Cys Lys Lys Asn Leu Leu Glu Giu Asn Phe Giu Giu His Ser Met 995 1000 1005 Ser Pro Glu Arg Glu Met Gly Asn Glu Asn Ile Pro Ser Thr Val Ser 1010 1015 1020 Thr Ile Ser Arg Asn Asn Ile Arg Giu Asn Val Phe Lys Glu Ala Ser 1025 1030 1035 1040 Ser Ser Asn Ile Asn Giu Val Gly Ser Ser Thr Asn Giu Val Gly Ser 1045 1050 1055 Ser Ilie Asn Glu Ile Gly Ser Ser Asp Giu Asn Ile Gin Ala Giu Leu 1060 1065 1070 Gly Arg Asn Arg Gly Pro Lys Leu Asn Ala Met Leu Arg Leu Gly Val 1075 1080 1085 Leu Gin Pro Giu Val Tyr Lys Gin Ser Leu Pro Gly Ser Asn Cys Lys 1090 1095 1100 His Pro Glu Ile Lys Lys Gin Giu Tyr Giu Glu Val Val Gin Thr Val 1105 1110 1115 1120 Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gin Pro 1125 1130 1135 Met Gly Ser Ser His Ala Ser Gin Val Cys Ser Giu Thr Pro Asp Asp 1140 1145 1150 Leu Leu Asp Asp Gly Glu Ile Lys Giu Asp Thr Ser Phe Ala Giu Asn 1155 1160 1165 Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gin Lys Gly 1170 1175 1180 Leu Lys Leu Leu Asn Gin Lys Lys Gly Pro Ser Gin Cys Pro Leu Cys 55 61 W 9633271 pCTIUS96/05621 Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gin 1185 1190 1195 1200 Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu 1205 1210 1215 Ser Ser 
Glu Asp Glu Glu Leu Pro Cys Phe Gin His Leu Leu Phe Gly 1220 1225 1230 Lys Val Asn Asn Ile Pro Ser Gln Ser Thr Arg His Ser Thr Val Ala 1235 1240 1245 Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys 1250 1255 1260 Asn Ser Leu Asn Asp Cys Ser Asn Gin Val Ile Leu Ala Lys Ala Ser 1265 1270 1275 1280 Gin Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe 1285 1290 1295 Ser Ser Gin Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr 1300 1305 1310 Gin Asp Pro Phe Leu Ile Gly Ser Ser Lys Gin Met Arg His Gin Ser 1315 1320 1325 Glu Ser Gin Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp 1330 1335 1340 Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Lys Lys Ser Lys Ala Trp 1345 1350 1355 1360 Ile Gin Thr INFORMATION FOR SEQ ID NO:24: SEQUENCE CHARACTERISTICS: LENGTH: 1852 amino acids TYPE. amino acid STRAIKDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: Met Asp Leu Ser Ala Leu Arg Val Glu Glu Val Gin Asn Val Ile Asn 1 5 10 Ala Met Gin Lys Ile Leu Glu Cys Pro Ile Cys Leu Glu Lou Ile Lys 25 Giu Pro Val Ser Thr Lys Cys Asp His Ile Phe Cys Lys Phe Cys Met 40 Leu Lys Leu Leu Asn Gin Lys Lys Gly Pro Ser Gin Cys Pro Leu Cys 55 Lys Asn Asp Ile Thr Lys Arg Ser Leu Gin Glu Ser Thr Arg Phe Ser 70 75 Trp Phe Ser Arg Ser Asp Glu Leu Leu Gly Ser Asp Asp Ser His Asp 385 390 395 400 WO 96/33271 Gin Leu Thr Gly Asn Ser Gly Tyr 130 Pro Ser 145 Thr Val Ser Val Lys Ala Pro Gin 210 Ala Cys 225 Pro Ser Hius Pro Pro Cys Ser Leu 290 Cys Asn 305 Trp Ala Giu Lys Trp Asn Asp Val 370 Trp Phe 385 Gly Giu PCT/US96/05621l Gin Leu Asp Lys Giu Asn 110 Gin Ser Met Pro Giu Asn Asn Leu Gly 160 Gin Lys Thr 175 Thr Val Asn 190 GinIlie Thr Lys Lys Ala H-is His Gin 240 Ala Giu Arg 255 His Val Giu 270 Giu Asn Ser i±a Giu Phe s Asn An Pro Ser Thr Arg Lys Giu '150 Thr C Val Asii Giu Ser His Asp 400 Asp Val Leu 415 WO 96/3,
Se Ty 46 Lei Pr( HiE, Prc Val 545 Ser Giu Asn Asn Leu.
625 Ilie Gin Glu Ser Ala 705 Phe V Thr v '1 4 PheVa Afl roS~ Leu Pro Arg Glu Glu LYS GiU UluU Lyb 1 ~7257375 WO 96/3327 1 Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile 420 425 PCTIUS96/05621 Asp Leu Leu 430 Ala Ser Tyr 465 Leu Pro His Pro Val 545 Ser Glu Asn Asn Leu 625 Ile Gin Glu Ser Al a 705 Phe Thr Asp 435 S er Lys Ile Thr Glu 515 Met Asn Gin Al a Glu 595 Leu Val1 Ser Pro Al a 675 Arg Gly Asn Lys Pro Val Lys Gly Asn 500 Asp Ile Ile Asn Phe 580 Leu Arg Ser Cys Val 660 Thr His S er Pro Val 740 Ile Glu Asn Giu Lys 505 Ala Asn His Asn Glu 585 His Thr Pro Glu Asn 665 Ser Phe S er Glu Glu 745 Arg Gly Thr Gin Ser 510 Gin Asn Lys Leu Ser 590 Pro Ala Giu Lys Giu 670 Giu Leu Leu Lys Leti 750 Val Lys Gilu Gi u 495 Gly Lys Gly Gly Giu 575 Ile Lys Leu Leu Tyr 655 Gly Gin Thr Lys Leu 735 Met His Thr Asn 480 Arg Leu Thr Gin Asp 560 Lys S er Lys Glu Gin 640 Asn Lys Thr Asn Glu 720 Glu Leu Gin Leu Val Giu Giu Leu Leu Lys Ile Ile Cys Ala Phe Gin Leu Asp 90 WO 96/33271 PCTUJS96/05621 Ser Gly Giu Arg Val Leu Gin Thr Giu Arg Ser Val Giu Ser Ser Ser 755 760 765 Ile Ser Leu Vai 770 Leu Leu Giu Val 785 Cys Val Ser Gin Gly Cys Ser Lys 820 Leu Giy His Giu 835 Giu Ser Giu Leu 850 Lys Arg Gin Ser 865 Giu Cys Ala Thr Pro LyG Vai Thr 900 Asn Giu Ser Asn 915 Phe Pro Vai Val 930 Ser Ile Lys Gly 945 Asn Giu Thr Gly Pro Tyr Arg Ile 980 Lys Cys Lys Lys 995 Ser Pro GJlu Arg 1010 Thr Ile Ser Arg 1025 Ser Ser Asn Ile Pro Gly Thr Asp Tyr 775 Ser Thr Leu Gly Lys 790 Cys Ala Ala Phe Giu 805 Asp Asn Arg Asn Asp 825 Val Asn His Ser Arg 840 Asp Ala Gin Tyr Leu 855 Phe Ala Pro Phe Ser 870 Phe Ser Ala His Ser 885 Phe Giu Cys Giu Gin 905 Ile Lys Pro Vai Gin 920 Giy Gin Lys Asp Lys 935 Giy Ser Arg Phe Cys 950 Leu Ile Thr Pro Asin 965 Pro Pro Leu Phe Pro 985 Asn Leu Leu Giu Giu 1000 Giu Met Gly Asn Glu 10i5 Asn Asn Ile Arg Giu 1030 Asn Giu Vai Giy Ser 1045 Ile Giy Ser Ser Asp 0 1065 Gly Thr Gin Giu Ser Ile Ser 780 Ala Lys Thr Giu Pro Asn Lys 795 800 Asn Pro Lys Gly Leu Ile His 810 815 Thr Giu Giy Phe 
Lys Tyr Pro.
830 Giu Thr Ser Ile Glu Met Giu 845 Gin Asn Thr Phe Lys Vai Ser 860 Asn Pro Gly Asn Aia Giu Giu 875 880 Giy Ser Leu Lys Lys Gin Ser 890 895 Lys Giu Giu Asn Gin Giy Lys 910 Thr Vai Asn Ile Thr Aia Gly 925 Pro Val Asp Asn Ala Lys Cys 940 Leu Ser Ser Gin Phe Arg Giy 955 960 Lys His Giy Leu Leu Gin Asn 970 975 Ile Lys Ser Phe Vai Lys Thr 990 Asn Phe Giu Giu His Ser Met i005 Asn Ile Pro Ser Thr Vai Ser 1020 Asn Vai Phe Lys Giu Aia Ser 1035 1040 Ser Thr Asn Giu Vai Giy Ser 1050 1055 Giu Asn Ile Gin Ala Giu Leu 1070 Ala Met Leu Arg Leu Giy Val 1085 Ser Ile Asn Giu 106( Gly Arg Asn 1075 Arg Gly Pro Lys Leu Asn 1080 Asn Glu Val Asp Glu Tyr Ser Gly Ser Ser Glu Lys Ile Asp Leu Leu 420 425 430 51
WO 96/33271 PCT/US96/05621 Leu Gin Pro Glu Val Tyr Lys Gin Ser Leu Pro Gly Ser Asn Cys Lys 1090 1095 1100 His Pro Glu Ile Lys Lys Gin Glu Tyr Glu Glu Val Val Gin Thr Val 1105 1110 1115 1120 Asn Thr Asp Phe Ser Pro Tyr Leu Ile Ser Asp Asn Leu Glu Gin Pro 1125 1130 1135 Met Gly Ser Ser His Ala Ser Gin Val Cys Ser Glu Thr Pro Asp Asp 1140 1145 1150 Leu Leu Asp Asp Gly Glu Ile Lys Glu Asp Thr Ser Phe Ala Glu Asn 1155 1160 1165 Asp Ile Lys Glu Ser Ser Ala Val Phe Ser Lys Ser Val Gin Lys Gly 1170 1175 1180 Glu Leu Ser Arg Ser Pro Ser Pro Phe Thr His Thr His Leu Ala Gin 1185 1190 1195 1200 Gly Tyr Arg Arg Gly Ala Lys Lys Leu Glu Ser Ser Glu Glu Asn Leu 1205 1210 1215 Ser Ser Glu Asp Glu Glu Leu Pro Cys Phe Gin His Leu Leu Phe Gly 1220 1225 1230 Lys Val Asn Asn Ile Pro Ser Gin Ser Thr Arg His Ser Thr Val Ala 1235 1240 1245 Thr Glu Cys Leu Ser Lys Asn Thr Glu Glu Asn Leu Leu Ser Leu Lys 1250 1255 1260 Asn Ser Leu Asn Asp Cys Ser Asn Gin Val Ile Leu Ala Lys Ala Ser 1265 1270 1275 1280 Gin Glu His His Leu Ser Glu Glu Thr Lys Cys Ser Ala Ser Leu Phe 1285 1290 1295 Ser Ser Gin Cys Ser Glu Leu Glu Asp Leu Thr Ala Asn Thr Asn Thr 1300 1305 1310 Gin Asp Pro Phe Leu Ile Gly Ser Ser Lys Gin Met Arg His Gln Ser 1315 1320 1325 Glu Ser Gin Gly Val Gly Leu Ser Asp Lys Glu Leu Val Ser Asp Asp 1330 1335 1340 Glu Glu Arg Gly Thr Gly Leu Glu Glu Asn Asn Gin Glu Glu Gin Ser 1345 1350 1355 1360 Met Asp Ser Asn Leu Gly Glu Ala Ala Ser Gly Cys Glu Ser Glu Thr 1365 1370 1375 Ser Val Ser Glu Asp Cys Ser Gly Leu Ser Ser Gin Ser Asp Ile Leu 1380 1385 1390 Thr Thr Gin Gin Arg Asp Thr Met Gin His Asn Leu Ile Lys Leu Gin 1395 1400 1405 Gin Glu Met Ala Glu Leu Glu Ala Val Leu Glu Gin His Gly Ser Gin 1410 1415 1420 I 79 i WO 96/33271 PCTIUS96/05621 Pro Ser Asn Ser Tyr Pro Ser Ile Ile Ser Asp Ser Ser Ala Leu Giu 1425 1430 1435 1440 Asp Leu Arg Asn Pro Giu Gin Ser Thr Ser Giu Lys Ala Val Leu Thr 1445 1450 1455 Ser Gln Lys Ser Ser Giu Tyr Pro Ile Ser Gln Asn Pro Giu Gly Leu 1460 1465 1470 Ser Ala Asp Lys 
Phe Glu Val Ser Ala Asp Ser Ser Thr Ser Lys Asn 1475 1480 1485 Lys Giu Pro Gly Vai Giu Arg Ser Ser Pro Ser Lys Cys Pro Ser Leu 1490 1495 1500 Asp Asp Arg Trp Tyr Met His Ser Cys Ser Gly Ser Leu Gin Asn Arg 1505 1510 1515 1520 Asn Tyr Pro Ser Gin Glu Giu Leu Ile Lys Val Val Asp Vai Glu Gl-i 1525 1530 1535 Gin Gin Leu Giu Giu Ser Gly Pro His Asp Leu Thr Giu Thr Ser Tyr 1540 1545 1550 L~u Pro Arg Gin Asp Leu Giu Gly Thr Pro Tyr Leu Giu Ser Gly Ile 1555 1560 1565 Ser Leu Phe Ser Asp Asp Pro Giu Ser Asp Pro Ser Giu Asp Arg Ala 1570 1575 1580 Pro Glu Ser Ala Arg Val Gly Asn lie Pro ber Ser Thr Ser Ala Leu 1585 1590 1595 1600 Lys Val Pro Gin Leu Lys Val Ala Giu Ser Ala Gin Ser Pro Ala Ala 1605 1610 1615 Ala His Thr Thr Asp Thr Ala Gly Tyr Asn Ala Met Giu Glu Ser Val 1620 1625 1630 Ser Arg Giu Lys Pro Giu Leu Thr Ala Ser Thr Giu Arg Val Asn Lys 1635 1640 1645 Arg Met Ser Met Val Val Ser Gly Leu Thr Pro Giu Glu Phe Met Leu 1650 1655 1660 Val Tyr Lys Phe Ala Arg Lys His His Ile Thr Leu Thr Asn Leu Ile 1665 1670 1675 1680 Thr Giu Giu Thr Thr His Val Vai Met Lys Thr Asp Aia Giu Phe Val 1685 1690 1 395 Cys Giu Arg Thr Leu Lys Tyr Phe Leu Giy Ile Ala Giy Gly Lys Trp 1700 1705 1710 Val Val Ser Tyr Phe Trp Val Thr Gin Ser Ile Lys Giu Arg Lys Met 1715 1720 1725 Le~u Asn Giu His Asp Phe Giu Val Arg Giy Asp Val Val Asn Giy Arg 1730 1735 1740 Asn His Gin Gly Pro Lys Arg Ala Arg Giu Ser Gin Asp Arg Lys Ile 1745 1750 1755 1760 WO 96/33271 PCTIUS96/05621 Phe Arg Gly Leu Giu Ile Cys Cys Tyr GlY Pro Phe Thr Asn Met Pro 1765 1770 1775 Thr Asp Gin Leu Giu Trp Met Val Gin Leu Cys Giy Ala Ser Vai Val 51780 1785 1790 Lys Giu Leu Ser Ser Phe Thr Leu Gly Thr Giy Vai His Pro Ile Val 1795 1800 1805 Vai Val Gin Pro Asp Ala Trp Thr Glu Asp Asn Gly Phe His Ala Ile 1810 1815 1820 Gly Gin Met Cys Giu Ala Pro Val Val Thr Arg Giu Trp Val Leu Asp 1825 1830 1835 1840 Ser Val Ala Leu Tyr Gin Cys Gin Giu Leu Asp Thr 1845 IRC;0 81 A9
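The sequence listings above give each protein as three-letter residue codes interleaved with position numbers. The following is a minimal Python sketch for turning such a run into a one-letter string. It assumes whitespace-separated tokens and a small, hand-picked table of likely OCR confusions (for example `Giu` for `Glu` and `Gin` for `Gln`). The function name `parse_listing` and the `OCR_FIXES` table are illustrative assumptions, not part of the patent.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumed OCR confusions (lowercase "l" read as "i"); extend with care.
OCR_FIXES = {"Giu": "Glu", "Gin": "Gln", "Vai": "Val", "Giy": "Gly", "Aia": "Ala"}


def parse_listing(text):
    """Return (one_letter_sequence, unrecognized_tokens) for a listing run."""
    residues = []
    skipped = []
    for token in text.split():
        if token.isdigit():
            continue  # position numbers printed under the residues
        token = OCR_FIXES.get(token, token)
        code = THREE_TO_ONE.get(token)
        if code is None:
            skipped.append(token)  # report, never guess, unknown tokens
        else:
            residues.append(code)
    return "".join(residues), skipped


# First line of SEQ ID NO:17 as it appears in the scanned text.
seq, skipped = parse_listing(
    "Met Asp Leu Ser Ala Leu Arg Val Giu Giu Val Gln Asn Val Ile Asn 1 5 10 15"
)
print(seq, skipped)
```

Tokens that cannot be mapped are returned separately rather than silently corrected, so damaged residues stay visible for manual checking against the original publication.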

Claims (12)

1. An isolated nucleic acid comprising BRCA1 allele #5803 (SEQ ID NO:1), 9601 (SEQ ID NO:2), 9815 (SEQ ID NO:3), 8203 (SEQ ID NO:5), 388 (SEQUENCE ID NO:6), or a fragment thereof, wherein said fragment is capable of specifically hybridizing with said allele in the presence of wild-type BRCA1.
2. An isolated nucleic acid comprising BRCA1 allele #5803 (SEQ ID NO:1) or a fragment thereof, wherein said fragment is capable of specifically hybridizing with said allele in the presence of wild-type BRCA1.
3. An isolated nucleic acid comprising BRCA1 allele #9601 (SEQ ID NO:2) or a fragment thereof, wherein said fragment is capable of specifically hybridizing with said allele in the presence of wild-type BRCA1.
4. An isolated nucleic acid comprising BRCA1 allele #9815 (SEQ ID NO:3) or a fragment thereof, wherein said fragment is capable of specifically hybridizing with said allele in the presence of wild-type BRCA1.
5. An isolated nucleic acid comprising BRCA1 allele #8203 (SEQ ID NO:5) or a fragment thereof, wherein said fragment is capable of specifically hybridizing with said allele in the presence of wild-type BRCA1.
6. An isolated nucleic acid comprising BRCA1 allele #388 (SEQUENCE ID NO:6), or a fragment thereof, wherein said fragment is capable of specifically hybridizing with said allele in the presence of wild-type BRCA1.
7. An isolated translation product of BRCA1 allele #5803 (SEQ ID NO:13), 9601 (SEQ ID NO:14), 9815 (SEQ ID NO:15), 8203 (SEQ ID NO:17), 388 (SEQ ID NO:18), or a C-terminus fragment thereof.
8. An isolated translation product of BRCA1 allele #5803 (SEQ ID NO:13) or a C-terminus fragment thereof.
9. An isolated translation product of BRCA1 allele #9601 (SEQ ID NO:14) or a C-terminus fragment thereof.
10. An isolated translation product of BRCA1 allele #9815 (SEQ ID NO:15) or a C-terminus fragment thereof.
11. An isolated translation product of BRCA1 allele #8203 (SEQ ID NO:17) or a C-terminus fragment thereof.
12. An isolated translation product of BRCA1 allele #388 (SEQ ID NO:18) or a C-terminus fragment thereof.
13. A method of screening a patient for a cancer susceptibility, said method comprising the steps of: isolating from said patient a first nucleic acid comprising at least one BRCA1 allele or fragment thereof; contacting said sample with a second nucleic acid according to any one of claims 1, 2, 3, 4, 5 and 6, under conditions whereby said second nucleic acid is capable of specifically hybridizing with said first nucleic acid; detecting the presence or absence of specific hybridization of said second nucleic acid with said first nucleic acid; wherein the presence of specific hybridization of said second nucleic acid to said first nucleic acid is diagnostic of a cancer susceptibility.
14. A method of screening a patient for a cancer susceptibility, said method comprising the steps of: isolating from said patient a composition comprising a first translation product of at least one BRCA1 allele; contacting said first translation product with a BRCA1 gene product specific binding agent specific for a protein or C-terminal fragment thereof according to any one of claims 7, 8, 9, 10, 11 and 12 under conditions wherein said reagent is capable of specifically binding said first translation product; detecting the presence or absence of specifically bound complexes of said reagent and said first translation product; wherein the presence of said complexes correlates with a cancer susceptibility.
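Claims 1 to 6 and 13 turn on a fragment that hybridizes with a mutant allele but not with wild-type BRCA1. As a rough in-silico analogy only, the Python sketch below treats exact reverse-complement matching as a stand-in for hybridization under stringent conditions. That is a strong simplification: real specificity depends on probe length, temperature, and salt. The sequences are arbitrary toy strings, not taken from the sequence listing, and all function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def binds(probe, target):
    """Exact-match stand-in for hybridization of a probe to a target strand."""
    return reverse_complement(probe) in target.upper()


def is_allele_specific(probe, mutant, wild_type):
    """True if the probe pairs with the mutant strand but not the wild type."""
    return binds(probe, mutant) and not binds(probe, wild_type)


# Arbitrary toy strands differing by one base (G -> T at index 10).
wild_type = "ACGTTGCAAGGCTTACCGTA"
mutant = "ACGTTGCAAGTCTTACCGTA"

spanning_probe = reverse_complement("GCAAGTCTTAC")  # covers the changed base
flanking_probe = reverse_complement("ACGTTGCA")     # lies outside the change

print(is_allele_specific(spanning_probe, mutant, wild_type))  # True
print(is_allele_specific(flanking_probe, mutant, wild_type))  # False
```

Only a probe that spans the altered position distinguishes the two strands. A probe drawn from shared flanking sequence pairs with both and so would not satisfy the "in the presence of wild-type" condition in this simplified model.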
AU55668/96A 1995-04-19 1996-04-19 Genetic markers for breast and ovarian cancer Expired AU698800B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/425,061 US5622829A (en) 1993-12-08 1995-04-19 Genetic markers for breast, ovarian, and prostatic cancer
US08/425061 1995-04-19
PCT/US1996/005621 WO1996033271A2 (en) 1995-04-19 1996-04-19 Genetic markers for breast and ovarian cancer

Publications (2)

Publication Number Publication Date
AU5566896A AU5566896A (en) 1996-11-07
AU698800B2 AU698800B2 (en) 1998-11-05

Family

ID=23684980

Family Applications (1)

Application Number Title Priority Date Filing Date
AU55668/96A Expired AU698800B2 (en) 1995-04-19 1996-04-19 Genetic markers for breast and ovarian cancer

Country Status (9)

Country Link
US (3) US5622829A (en)
EP (1) EP0821733B1 (en)
JP (1) JPH11503915A (en)
AT (1) ATE280226T1 (en)
AU (1) AU698800B2 (en)
CA (1) CA2217668C (en)
DE (1) DE69633664T2 (en)
ES (1) ES2231808T3 (en)
WO (1) WO1996033271A2 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040014051A1 (en) * 2002-07-18 2004-01-22 Isis Pharmaceuticals Inc. Antisense modulation of breast cancer-1 expression
US5605798A (en) 1993-01-07 1997-02-25 Sequenom, Inc. DNA diagnostic based on mass spectrometry
US5622829A (en) * 1993-12-08 1997-04-22 The Regents Of The University Of California Genetic markers for breast, ovarian, and prostatic cancer
US5643722A (en) 1994-05-11 1997-07-01 Trustees Of Boston University Methods for the detection and isolation of proteins
US6210941B1 (en) 1997-06-27 2001-04-03 The Trustees Of Boston University Methods for the detection and isolation of proteins
US6403303B1 (en) * 1996-05-14 2002-06-11 Visible Genetics Inc. Method and reagents for testing for mutations in the BRCA1 gene
US6428955B1 (en) 1995-03-17 2002-08-06 Sequenom, Inc. DNA diagnostics based on mass spectrometry
US5830655A (en) 1995-05-22 1998-11-03 Sri International Oligonucleotide sizing using cleavable primers
US6083698A (en) * 1995-09-25 2000-07-04 Oncormed, Inc. Cancer susceptibility mutations of BRCA1
DE19629938C1 (en) * 1996-07-24 1997-11-27 Gsf Forschungszentrum Umwelt Monoclonal antibodies specific for mutated E-Cadherin peptide sequences
US5905026A (en) 1996-08-27 1999-05-18 Cornell Research Foundation, Inc. Method of detecting expression of and isolating the protein encoded by the BRCA1 gene
WO1998012327A2 (en) * 1996-09-20 1998-03-26 Board Of Regents, The University Of Texas System Compositions and methods comprising bard1 and other brca1 binding proteins
AU5093898A (en) * 1996-10-31 1998-05-22 Jennifer Lescallett Primers for amplification of brca1
US20030129589A1 (en) * 1996-11-06 2003-07-10 Hubert Koster Dna diagnostics based on mass spectrometry
US5965377A (en) * 1997-03-24 1999-10-12 Baystate Medical Center Method for determining the presence of mutated BRCA protein
EP0878552A1 (en) * 1997-05-13 1998-11-18 Erasmus Universiteit Rotterdam Molecular detection of chromosome aberrations
AU754748B2 (en) * 1997-06-04 2002-11-21 Rijksuniversiteit Te Leiden A diagnostic test kit for determining a predisposition for breast and ovarian cancer, materials and methods for such determination
US7014993B1 (en) * 1997-08-21 2006-03-21 The Board Of Trustees Of The University Of Arkansas Extracellular serine protease
US6207370B1 (en) 1997-09-02 2001-03-27 Sequenom, Inc. Diagnostics based on mass spectrometric detection of translated target polypeptides
US6030832A (en) * 1997-11-21 2000-02-29 Myriad Genetics, Inc. Carboxy-terminal BRCA1 interacting protein
CA2327542C (en) * 1998-05-04 2011-11-22 Dako A/S Method and probes for the detection of chromosome aberrations
US6723564B2 (en) 1998-05-07 2004-04-20 Sequenom, Inc. IR MALDI mass spectrometry of nucleic acids using liquid matrices
AU2004202605B2 (en) 1998-07-13 2007-10-25 Board Of Regents, The University Of Texas System Cancer treatment methods using antibodies to aminophospholipids
AU5128799A (en) * 1998-07-24 2000-02-14 Government Of The United States Of America, As Represented By The Secretary Of The Department Of Health And Human Services, The Pb 39, a gene dysregulated in prostate cancer, and uses thereof
US7226731B1 (en) 1998-07-24 2007-06-05 The United States Of America, As Represented By The Secretary Of The Department Of Health And Human Services PB 39, a gene dysregulated in prostate cancer, and uses thereof
EP1234585A3 (en) 1998-09-04 2004-01-21 The Regents Of The University Of Michigan Methods and compositions for the prevention or treatment of cancer
AU5460500A (en) 1999-06-04 2000-12-28 Massachusetts Institute Of Technology Compositions and methods for the screening of compounds to enhance or reduce apoptosis
US6306628B1 (en) 1999-08-25 2001-10-23 Ambergen, Incorporated Methods for the detection, analysis and isolation of Nascent proteins
US20030064372A1 (en) * 2000-06-22 2003-04-03 Bodnar Jackie S. Gene and sequence variation associated with lipid disorder
WO2002010436A2 (en) * 2000-07-28 2002-02-07 The Brigham And Women's Hospital, Inc. Prognostic classification of breast cancer
US6703204B1 (en) 2000-07-28 2004-03-09 The Brigham & Women's Hospital, Inc. Prognostic classification of breast cancer through determination of nucleic acid sequence expression
US7465540B2 (en) * 2000-09-21 2008-12-16 Luminex Corporation Multiple reporter read-out for bioassays
US20030188326A1 (en) * 2000-11-03 2003-10-02 Dana Farber Cancer Institute Methods and compositions for the diagnosis of cancer susceptibilities and defective DNA repair mechanisms and treatment thereof
WO2002099418A1 (en) * 2001-06-04 2002-12-12 Research Development Foundation Premalignant, serially-transplantable breast tissue lines and uses thereof
WO2003104474A2 (en) * 2002-06-07 2003-12-18 Myriad Genetics, Inc Large deletions in human brca1 gene and use thereof
NZ584715A (en) 2002-07-15 2011-12-22 Univ Texas Peptides binding to phosphatidylethanolamine and their use in treating viral infections and cancer
ES2410587T3 (en) 2004-01-22 2013-07-02 University Of Miami Topical formulations of coenzyme Q10 and methods of use
EP2136831B1 (en) 2007-03-02 2012-09-12 The Cleveland Clinic Foundation Anti-angiogenic peptides
KR20100102110A (en) 2007-11-09 2010-09-20 페레그린 파마수티컬즈, 인크 Anti-vegf antibody compositions and methods
US8167949B2 (en) * 2008-01-25 2012-05-01 Aesculap Implant Systems, Llc Hydrostatic interbody
CN111253483B (en) * 2020-03-02 2021-07-30 江苏莱森生物科技研究院有限公司 anti-BRCA 1 monoclonal antibody and application thereof

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4742000A (en) * 1986-05-02 1988-05-03 University Of Chicago Antibody to human progesterone receptor and diagnostic materials and methods
DK1013666T3 (en) * 1990-06-27 2006-08-14 Univ Princeton Protein complex p53 / p90
US5597707A (en) * 1993-04-15 1997-01-28 Bristol-Myers Squibb Company Tumor associated antigen recognized by the murine monoclonal antibody L6, its oligonucleotide sequence and methods for their use
US5622829A (en) * 1993-12-08 1997-04-22 The Regents Of The University Of California Genetic markers for breast, ovarian, and prostatic cancer
WO1995019369A1 (en) * 1994-01-14 1995-07-20 Vanderbilt University Method for detection and treatment of breast cancer

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NATURE GENETICS VOL. 8 PP387-391 *
NATURE GENETICS VOL. 8 PP399-404 *
SCIENCE VOL. 266 PP120-122 *

Also Published As

Publication number Publication date
US6512091B1 (en) 2003-01-28
WO1996033271A3 (en) 1997-03-20
WO1996033271A2 (en) 1996-10-24
US5821328A (en) 1998-10-13
ES2231808T3 (en) 2005-05-16
CA2217668A1 (en) 1996-10-24
EP0821733B1 (en) 2004-10-20
EP0821733A2 (en) 1998-02-04
DE69633664T2 (en) 2006-02-23
CA2217668C (en) 1999-11-16
US5622829A (en) 1997-04-22
ATE280226T1 (en) 2004-11-15
JPH11503915A (en) 1999-04-06
AU5566896A (en) 1996-11-07
DE69633664D1 (en) 2004-11-25
