EP0880586A1 - Lyst1 and lyst2 gene compositions and methods of use - Google Patents

Lyst1 and lyst2 gene compositions and methods of use

Info

Publication number
EP0880586A1
Authority
EP
European Patent Office
Prior art keywords
seq
lyst2
lyst
segment
lyst1
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP97904209A
Other languages
German (de)
French (fr)
Inventor
Stephen F. Kingsmore
Maria D. F. S. Barbosa-Alleyne
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Florida
Original Assignee
University of Florida

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P: SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P37/00: Drugs for immunological or allergic disorders
    • A61P37/02: Immunomodulators
    • A61P37/04: Immunostimulants
    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K: ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00: Genetically modified animals
    • A01K2217/05: Animals comprising random inserted nucleic acids (transgenic)
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00: Medicinal preparations containing peptides

Definitions

  • The present invention relates generally to the field of molecular biology. More particularly, certain embodiments concern methods and compositions comprising novel DNA segments and proteins derived from mammalian species. More particularly, the invention provides Lyst1 and Lyst2 gene compositions of murine origin and the homologous LYST1 and LYST2 gene compositions of human origin. Various methods for making and using these LYST/Lyst DNA segments, native peptides and synthetic protein derivatives are disclosed, such as, for example, the use of DNA segments as diagnostic probes and templates for protein production, and the use of LYST1, Lyst1, LYST2, and Lyst2 proteins, fusion protein carriers and Lyst-derived peptides in various pharmacological and immunological applications.
  • Chediak-Higashi syndrome (CHS) is an autosomal recessive immune deficiency disease that maps to chromosome (Chr) 1q42-q43 (Goodrich and Holcombe, 1995; Barrat et al., 1996; Fukai et al., 1996).
  • Affected individuals have giant, perinuclear lysosomes, defective granulocyte, NK and cytolytic T cell function, and die prematurely of infection or malignancy (Beguez Cesar, 1943; Blume et al., 1968; Wolff et al., 1972; Blume and Wolff, 1972; Root et al., 1972; Roder et al., 1982; Baetz et al., 1995).
  • CHS patients also exhibit partial oculocutaneous albinism, platelet storage pool deficiency and neurologic defects such as peripheral neuropathy and ataxia (Windhorst et al., 1968; Meyers et al., 1974; Maeda et al., 1989; Pettit and Berdal, 1984; Misra et al.
  • CH gene product LYST1
  • LYST1 CH gene product
  • The isolation and sequencing of the Chediak-Higashi gene (LYST1) from both murine and human sources has now provided methods of detecting CHS at the gene level, such as by various assays making use of the gene, gene segments and/or the encoded proteins or polypeptides.
  • The gene provides a tool for understanding and controlling mechanisms of regulation of protein trafficking to lysosomes, and particularly the contribution of vesicular sorting to diverse cellular functions.
  • An immediate result of the identification of the LYST1 gene is the ability to perform linkage analysis and to identify individuals at risk of having progeny carrying the mutated gene.
  • The inventors have shown that the murine gene, Lyst1, and BG sequences are derived from a single gene with alternatively spliced mRNAs. In an important embodiment, the inventors have also identified the human homolog of the bg gene (Lyst1), LYST1. LYST1 maps within the CHS critical region and is mutated in several CHS patients.
  • 2.1 LYST and Lyst Gene Compositions
  • DNA segment refers to a DNA molecule that has been isolated free of total genomic DNA of a particular species. Therefore, a DNA segment encoding LYST/Lyst refers to a DNA segment that contains LYST or Lyst coding sequences yet is isolated away from, or purified free from, total genomic DNA of the species from which the DNA segment is obtained. Included within the term "DNA segment" are DNA segments and smaller fragments of such segments, and also recombinant vectors, including, for example, plasmids, cosmids, phagemids, phage, viruses, and the like. Preferred LYST genes are the LYST1 and LYST2 genes of human origin, while preferred Lyst genes are the Lyst1 and Lyst2 genes of murine origin.
  • a DNA segment comprising an isolated or purified LYST/Lyst gene refers to a DNA segment including a LYST or Lyst coding sequence and, in certain aspects, regulatory sequences, isolated substantially away from other naturally occurring genes or protein encoding sequences.
  • the term "gene" is used for simplicity to refer to a functional protein, polypeptide or peptide encoding unit.
  • this functional term includes genomic sequences, extra-genomic and plasmid-encoded sequences, and smaller engineered gene segments that express, or may be adapted to express, proteins, polypeptides or peptides.
  • Such segments may be naturally isolated, or modified synthetically by the hand of man. Preferred DNAs are those which comprise one or more LYST genes, with human LYST1 and LYST2 genes being particularly preferred, or one or more Lyst genes, with murine Lyst1 and Lyst2 genes being particularly preferred.
  • isolated substantially away from other coding sequences means that the gene of interest, in this case a gene encoding a LYST/Lyst protein or peptide, forms the significant part of the coding region of the DNA segment, and that the DNA segment does not contain large portions of naturally-occurring coding DNA, such as large chromosomal fragments or other functional genes or polypeptide coding regions. Of course, this refers to the DNA segment as originally isolated, and does not exclude genes or coding regions later added to the segment by the hand of man.
  • the invention concerns isolated DNA segments and recombinant vectors incorporating DNA sequences that encode a LYST/Lyst species that includes within its amino acid sequence an amino acid sequence essentially as set forth in SEQ ID NO:2,
  • the invention concerns isolated DNA segments and recombinant vectors incorporating DNA sequences that include within their sequence a nucleotide sequence essentially as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13
  • SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14 means that the sequence substantially corresponds to a portion of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14, and has relatively few amino acids that are not identical to, or a biologically functional equivalent of, the amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or
  • sequences that have between about 70% and about 80%, or more preferably, between about 81% and about 90%, or even more preferably, between about 91% and about 99%, of amino acids that are identical or functionally equivalent to the amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14 will be sequences that are "essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14"
  • the invention concerns isolated DNA segments and recombinant vectors that include within their sequence a nucleic acid sequence essentially as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13
  • the term "essentially as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13" is used in the same sense as described above and means that the nucleic acid sequence substantially corresponds to a portion of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 and has relatively few codons that are not identical, or functionally equivalent, to the codons of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11
  • amino acid and nucleic acid sequences may include additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences, and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence meets the criteria set forth above, including the maintenance of biological protein activity where protein expression is concerned
  • additional residues such as additional N- or C-terminal amino acids or 5' or 3' sequences
  • terminal sequences particularly applies to nucleic acid sequences that may, for example, include various non-coding sequences flanking either of the 5' or 3' portions of the coding region or may include various upstream or downstream regulatory or structural genes
  • nucleic acid sequences that are “complementary” are those that are capable of base-pairing according to the standard Watson-Crick complementarity rules
  • complementary sequences means nucleic acid sequences that are substantially complementary, as may be assessed by the same nucleotide comparison set forth above, or as defined as being capable of hybridizing to the nucleic acid segment of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 under relatively stringent conditions such as those described herein
  • nucleic acid segments of the present invention may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is therefore contemplated that a nucleic acid fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol.
  • nucleic acid fragments may be prepared that include a short contiguous stretch identical to or complementary to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13, such as about 14 nucleotides, and that are up to about 10,000 or about 5,000 base pairs in length, with segments of about 3,000 being preferred in certain cases. DNA segments with total lengths of about 2,000, about 1,000, about 500, about 200, about 100 and about 50 base pairs in length (including all intermediate
  • intermediate lengths means any length between the quoted ranges, such as 14, 15, 16, 17, 18, 19, 20, etc.; 21, 22, 23, etc.; 30, 31, 32, etc.; 50, 51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.; including all integers through the 200-500, 501-1,000, 1,001-2,000, 2,001-3,000, 3,001-5,000, 5,001-10,000 ranges, up to and including sequences of about 12,001, 12,002, 12,003, 13,001, 13,002 and the like
  • this invention is not limited to the particular nucleic acid sequences disclosed in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13, or to the particular amino acid sequences as disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14
  • Recombinant vectors and isolated DNA segments may therefore variously include the LYST or Lyst coding regions themselves, coding regions bearing selected alterations or modifications in the basic coding region, or they may encode larger polypeptides that nevertheless include LYST, Lyst, LYST-like, or Lyst-like coding regions, or may encode biologically functional equivalent proteins or peptides that have variant amino acid sequences.
  • fusion proteins and peptides e.g., where the LYST or Lyst coding regions are aligned within the same expression unit with other proteins or peptides having desired functions, such as for purification or immunodetection purposes (e.g., proteins that may be purified by affinity chromatography and enzyme label coding regions, respectively).
  • Recombinant vectors form further aspects of the present invention
  • Particularly useful vectors are contemplated to be those vectors in which the coding portion of the DNA segment, whether encoding a full length protein or smaller peptide, is positioned under the control of a promoter
  • the promoter may be in the form of the promoter that is naturally associated with a LYST1, Lyst1, LYST2, or Lyst2 gene, as may be obtained by isolating the 5' non-coding sequences located upstream of the coding segment, for example, using recombinant cloning and/or PCR™ technology, in connection with the compositions disclosed herein
  • a recombinant or heterologous promoter is intended to refer to a promoter that is not normally associated with a LYST/Lyst gene in its natural environment
  • promoters may include LYST or Lyst promoters normally associated with other genes, and/or promoters isolated from any bacterial, viral, eukaryotic, or mammalian cell
  • The use of promoter and cell type combinations for protein expression is generally known to those of skill in the art of molecular biology; see, for example, Sambrook et al., 1989.
  • the promoters employed may be constitutive, or inducible, and can be used under the appropriate conditions to direct high level expression of the introduced DNA segment, such as is advantageous in the large-scale production of recombinant proteins or peptides.
  • Prokaryotic expression of nucleic acid segments of the present invention may be performed using methods known to those of skill in the art, and will likely comprise expression vectors and promoter sequences such as those obtained from tac, trp, lac, lacUV5 or T7.
  • Where expression of the recombinant LYST1, LYST2, Lyst1 or Lyst2 proteins is desired in eukaryotic cells, a number of expression systems are available and known to those of skill in the art.
  • An exemplary eukaryotic promoter system contemplated for use in high-level expression is the Pichia expression vector system (Pharmacia LKB Biotechnology).
  • DNA segments that encode peptide antigens from about 15 to about 100 amino acids in length, or more preferably, from about 15 to about 50 amino acids in length are contemplated to be particularly useful.
  • the LYST or Lyst genes and DNA segments may also be used in connection with somatic expression in an animal or in the creation of a transgenic animal.
  • the use of a recombinant vector that directs the expression of the full length or active LYST/Lyst protein is particularly contemplated.
  • Expression of a LYST/Lyst transgene in animals is particularly contemplated to be useful in the production of anti-LYST/Lyst antibodies for use in passive immunization methods, the detection of LYST/Lyst proteins, and the purification of
  • nucleic acid sequences disclosed herein also have a variety of other uses. For example, they also have utility as probes or primers in nucleic acid hybridization embodiments.
  • nucleic acid segments that comprise a sequence region that consists of at least a 14 nucleotide long contiguous sequence that has the same sequence as, or is complementary to, a 14 nucleotide long contiguous sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 will find particular utility. Longer contiguous identical or complementary sequences, e.g., those of about 20, 30, 40, 50, 100, 200, 500, 1000 (including all intermediate lengths) and even up to full length sequences will also be of use in certain embodiments.
  • The ability of such nucleic acid probes to specifically hybridize to LYST/Lyst-encoding sequences will enable them to be of use in detecting the presence of complementary sequences in a given sample.
  • sequence information for the preparation of mutant species primers, or primers for use in preparing other genetic constructions
  • Nucleic acid molecules having sequence regions consisting of contiguous nucleotide stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so, identical or complementary to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 are particularly contemplated as hybridization probes for use in, e.g., Southern and Northern blotting. This would allow LYST/Lyst structural or regulatory genes to be analyzed, both in diverse cell types and also in various bacterial cells.
  • the total size of fragment, as well as the size of the complementary stretch(es), will ultimately depend on the intended use or application of the particular nucleic acid segment. Smaller fragments will generally find use in hybridization embodiments, wherein the length of the contiguous complementary region may be varied, such as between about 14 and about 100 nucleotides, but larger contiguous complementarity stretches may be used, according to the length of the complementary sequences one wishes to detect.
  • A hybridization probe of about 14-25 nucleotides in length allows the formation of a duplex molecule that is both stable and selective. Molecules having contiguous complementary sequences over stretches greater than 14 bases in length are generally preferred, though, in order to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained.
  • nucleic acid molecules having gene-complementary stretches of 15 to 25 contiguous nucleotides, or even longer where desired
  • Hybridization probes may be selected from any portion of any of the sequences disclosed herein. All that is required is to review the sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 and to select any continuous portion of the sequence, from about 14-25 nucleotides in length up to and including the full length sequence, that one wishes to utilize as a probe or primer
  • the choice of probe and primer sequences may be governed by various factors, such as, by way of example only, one may wish to employ primers from towards the termini of the total sequence
  • nucleic acid segment that includes a contiguous sequence from within SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13, may alternatively be described as preparing a nucleic acid fragment
  • fragments may also be obtained by other techniques such as, e.g., by mechanical shearing or by restriction enzyme digestion
  • Small nucleic acid segments or fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer
  • fragments may be obtained by application of nucleic acid reproduction technology, such as the PCR™ technology of U.S. Patent 4,683,202 (incorporated herein by reference), by introducing selected sequences into recombinant vectors for recombinant production, and by other recombinant DNA techniques generally known to those of skill in the art of molecular biology.
  • nucleotide sequences of the invention may be used for their ability to selectively form duplex molecules with complementary stretches of the entire LYST/Lyst gene or gene fragments
  • relatively stringent conditions, e.g., one will select relatively low salt and/or high temperature conditions, such as provided by about 0.02 M to about 0.15 M NaCl at temperatures of 50°C to 70°C
  • Such selective conditions tolerate little, if any, mismatch between the probe and the template or target strand, and would be particularly suitable for isolating LYST or Lyst genes.
  • nucleic acid sequences of the present invention in combination with an appropriate means, such as a label, for determining hybridization.
  • appropriate indicator means include fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of giving a detectable signal.
  • fluorescent label or an enzyme tag such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other environmentally undesirable reagents.
  • In the case of enzyme tags, colorimetric indicator substrates are known that can be employed to provide a means, visible to the human eye or spectrophotometrically, to identify specific hybridization with complementary nucleic acid-containing samples.
  • the hybridization probes described herein will be useful both as reagents in solution hybridization as well as in embodiments employing a solid phase.
  • the test DNA or RNA
  • the test DNA is adsorbed or otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic acid is then subjected to specific hybridization with selected probes under desired conditions.
  • the selected conditions will depend on the particular circumstances based on the particular criteria required (depending, for example, on the G+C content, type of target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). Following washing of the hybridized surface so as to remove nonspecifically bound probe molecules, specific hybridization is detected, or even quantitated, by means of the label.
  • Particular aspects of the invention concern the use of plasmid vectors for the cloning and expression of recombinant peptides, and particular peptide epitopes comprising either native, or site-specifically mutated LYST or Lyst proteins, peptides, or epitopes
  • recombinant vectors for the cloning and expression of recombinant peptides
  • peptide epitopes comprising either native, or site-specifically mutated LYST or Lyst proteins, peptides, or epitopes
  • Prokaryotic hosts are preferred for expression of the peptide compositions of the present invention
  • An example of a preferred prokaryotic host is E. coli, and in particular, E.
  • Enterobacteriaceae species such as Salmonella typhimurium and Serratia marcescens, or even other Gram-negative hosts including various Pseudomonas species may be used in the recombinant expression of the genetic constructs disclosed herein
  • plasmid vectors containing replicon and control sequences which are derived from species compatible with the host cell are used in connection with these hosts.
  • the vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells
  • E. coli may be typically transformed using vectors such as pBR322, or any of its derivatives (Bolivar et al., 1977)
  • pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. pBR322, its derivatives, or other microbial plasmids or bacteriophage may also contain, or be modified to contain, promoters which can be used by the microbial organism for expression of endogenous proteins.
  • phage vectors containing replicon and control sequences that are compatible with the host microorganism can be used as transforming vectors in connection with these hosts
  • bacteriophage such as λGEM™-11 may be utilized in making a recombinant vector which can be used to transform susceptible host cells such as E. coli LE392
  • promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979) or the tryptophan (trp) promoter system (Goeddel et al., 1980)
  • β-lactamase (penicillinase)
  • lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979)
  • tryptophan (trp) promoter system
  • eukaryotic microbes such as yeast cultures may also be used in conjunction with the methods disclosed herein.
  • Saccharomyces cerevisiae, or common bakers' yeast is the most commonly used among eukaryotic microorganisms, although a number of other species may also be employed for such eukaryotic expression systems.
  • the plasmid YRp7, for example, is commonly used (Stinchcomb et al., 1979; Kingsman et al., 1979; Tschemper et al., 1980).
  • This plasmid already contains the trp1 gene, which provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example ATCC 44076 or PEP4-1 (Jones, 1977).
  • the presence of the trp1 lesion as a characteristic of the yeast host cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.
  • Suitable promoting sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 1968; Holland et al., 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase.
  • 3-phosphoglycerate kinase (Hitzeman et al., 1980)
  • other glycolytic enzymes (Hess et al., 1968; Holland et al., 1978)
  • enolase glyceraldehyde-3-phosphat
  • the termination sequences associated with these genes are also ligated into the expression vector 3' of the sequence desired to be expressed to provide polyadenylation of the mRNA and termination.
  • Other promoters which have the additional advantage of transcription controlled by growth conditions are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization.
  • Any plasmid vector containing a yeast-compatible promoter, an origin of replication, and termination sequences is suitable.
  • cultures of cells derived from multicellular organisms may also be used as hosts in the routine practice of the disclosed methods.
  • any such cell culture is workable, whether from vertebrate or invertebrate culture
  • interest has been greatest in vertebrate cells, and propagation of vertebrate cells in culture (tissue culture) has become a routine procedure in recent years
  • useful host cell lines are VERO and HeLa cells, Chinese hamster ovary (CHO) cell lines, and WI-38, BHK, COS-7, 293 and MDCK cell lines
  • Expression vectors for such cells ordinarily include (if necessary) an origin of replication, a promoter located in front of the gene to be expressed, along with any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional terminator sequences
  • control functions on the expression vectors are often obtained from viral material
  • promoters are derived from polyoma, Adenovirus 2, and most frequently Simian Virus 40 (SV40)
  • the early and late promoters of SV40 virus are particularly useful because both are obtained easily from the virus as a fragment which also contains the SV40 viral origin of replication (Fiers et al., 1978). Smaller or larger SV40 fragments may also be used, provided there is included the approximately 250 bp sequence extending from the HindIII site toward the BglI site located in the viral origin of replication.
  • promoter or control sequences normally associated with the desired gene sequence provided such control sequences are compatible with the host cell systems
  • the origin of replication may be obtained from either construction of the vector to include an exogenous origin, such as may be derived from SV40 or other viral (e.g., Polyoma, Adeno, VSV, BPV) source, or may be obtained from the host cell chromosomal replication mechanism. If the vector is integrated into the host cell chromosome, the latter is often sufficient.
  • an exogenous origin such as may be derived from SV40 or other viral (e.g., Polyoma, Adeno, VSV, BPV) source, or may be obtained from the host cell chromosomal replication mechanism. If the vector is integrated into the host cell chromosome, the latter is often sufficient.
  • polypeptides may be present in quantities below the detection limits of the Coomassie brilliant blue staining procedure usually employed in the analysis of SDS/PAGE gels, or that their presence may be masked by an inactive polypeptide of similar Mr
  • detection techniques may be employed advantageously in the visualization of particular polypeptides of interest. Immunologically-based techniques such as Western blotting using enzymatically-, radiolabel-, or fluorescently-tagged antibodies described herein are considered to be of particular use in this regard.
  • the peptides of the present invention may be detected by using antibodies of the present invention in combination with secondary antibodies having affinity for such primary antibodies. This secondary antibody may be enzymatically- or radiolabeled, or alternatively, fluorescently- or colloidal gold-tagged. Means for the labeling and detection of such two-step secondary antibody techniques are well-known to those of skill in the art.
  • LYST/Lyst gene is intended to mean a LYST or Lyst gene from a mammalian source, with human LYST and murine Lyst genes being most preferred
  • LYST genes are those genes derived from human sources
  • Lyst genes are those genes derived from murine sources
  • LYST1 and LYST2 genes are two genes of the "LYST/Lyst" family which are isolated from humans, while Lyst1 and Lyst2 represent two genes of the "LYST/Lyst" family which are their murine homologs, respectively
  • LYST/Lyst protein is intended to mean a LYST or Lyst protein isolated from a mammalian source, with human and murine peptides being most preferred
  • LYST proteins are those proteins encoded by LYST genes derived from human sources
  • Lyst proteins are those proteins encoded by Lyst genes derived from murine sources
  • LYST1 and LYST2 are the proper designations of two proteins of the "LYST/Lyst" protein family which are isolated from humans, while Lyst1 and Lyst2 represent the two homologous proteins of the LYST/Lyst protein family isolated from murines
  • Lyst1-I and Lyst1-II are terms used to represent two isoforms of the murine Lyst1
  • LYST1-I and LYST1-II are terms used to represent two isoforms of the human LYST1
  • Lyst2-I and Lyst2-II would represent two isoforms of the murine Lyst2 protein
  • LYST2-I and LYST2-II would represent two isoforms of the human LYST2 protein
  • the present invention also concerns recombinant host cells for expression of an isolated LYST1, Lyst1, LYST2, or Lyst2 gene.
  • a host cell may be employed for this purpose, but certain advantages may be found in using a bacterial host cell such as E. coli, S. typhimurium, B. subtilis, or others. Expression in eukaryotic cells is also contemplated such as those derived from yeast, insect, or mammalian cell lines. These recombinant host cells may be employed in connection with "overexpressing" the LYST1, Lyst1, LYST2, or Lyst2 protein, that is, increasing the level of expression over that found naturally in mammalian cells.
  • a suitable vector for expression in mammalian cells is that described in U.S. Patent 5,168,050, incorporated herein by reference.
  • the coding segment employed encodes a protein or peptide of interest (e.g., the LYST1, Lyst1, LYST2, or Lyst2 protein) and does not include any coding or regulatory sequences that would have an adverse effect on cells. Therefore, it will also be understood that useful nucleic acid sequences may include additional residues, such as additional non-coding sequences flanking either of the 5' or 3' portions of the coding region or may include various regulatory sequences.
  • After identifying an appropriate epitope-encoding nucleic acid molecule, it may be inserted into any one of the many vectors currently known in the art, so that it will direct the expression and production of the protein or peptide epitope of interest (e.g., the LYST1, Lyst1, LYST2, or Lyst2 protein) when incorporated into a host cell
  • the coding portion of the DNA segment is positioned under the control of a promoter.
  • the promoter may be in the form of the promoter which is naturally associated with a LYST1-, Lyst1-, LYST2-, or Lyst2-encoding nucleic acid segment, as may be obtained by isolating the 5' non-coding sequences located upstream of the coding segment, for example, using recombinant cloning and/or PCR™ technology, in connection with the compositions disclosed herein.
  • Direct amplification of nucleic acids using the PCR™ technology of U.S. Patents 4,683,195 and 4,683,202 (herein incorporated by reference) is particularly contemplated to be useful in such methodologies.
  • a recombinant or heterologous promoter is intended to refer to a promoter that is not normally associated with a LYST1-, Lyst1-, LYST2-, or Lyst2-encoding DNA segment in its natural environment
  • promoters may include those normally associated with other genes, and/or promoters isolated from any other bacterial, viral, eukaryotic, or mammalian cell
  • the use of recombinant promoters to achieve protein expression is generally known to those of skill in the art of molecular biology; for example, see Sambrook et al. (1989).
  • the promoters employed may be constitutive, or inducible, and can be used under the appropriate conditions to direct high level or regulated expression of the introduced DNA segment.
  • the currently preferred promoters are those such as CMV, RSV LTR, the SV40 promoter alone, and the SV40 promoter in combination with the SV40 enhancer.
  • the expression of recombinant LYST1, Lyst1, LYST2, or Lyst2 protein is carried out using prokaryotic expression systems, and in particular bacterial systems such as E.
  • nucleic acid segments of the present invention may be performed using methods known to those of skill in the art, and will likely comprise expression vectors and promoter sequences such as those obtained from tac, trp, lac, lacUV5 or T7 promoters
  • LYST1, Lyst1, LYST2, or Lyst2 protein and LYST1-, Lyst1-, LYST2-, or Lyst2-derived epitopes: once a suitable clone or clones have been obtained, whether they be native sequences or genetically-modified, one may proceed to prepare an expression system for the recombinant preparation of the LYST1, Lyst1, LYST2, or Lyst2 protein or peptides derived from one or more of the LYST1, Lyst1, LYST2, or Lyst2 proteins
  • the engineering of DNA segment(s) for expression in a prokaryotic or eukaryotic system may be performed by techniques generally known to those of skill in recombinant expression. It is believed that virtually any expression system may be employed in the expression of LYST1, Lyst1, LYST2, or Lyst2 proteins or epitopes derived from such proteins
  • DNA sequences encoding the desired epitope may be separately expressed in various eukaryotic systems as is well-known to those of skill in the art.
  • Genomic sequences are suitable for eukaryotic expression, as the host cell will, of course, process the genomic transcripts to yield functional mRNA for translation into protein
  • eukaryotic expression system may be utilized for the expression of the proteins of the present invention, or of peptides or epitopes derived from such proteins, e.g., baculovirus-based, glutamine synthase-based or dihydrofolate reductase-based systems may be employed
  • plasmid vectors incorporating an origin of replication and an efficient eukaryotic promoter as exemplified by the eukaryotic vectors of the pCMV series, such as pCMV5
  • pCMV5 eukaryotic vectors of the pCMV series
  • an appropriate polyadenylation site
  • the poly-A addition site is placed about 30 to 2000 nucleotides "downstream" of the termination site of the protein at a position prior to transcription termination.
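The placement rule for the poly-A addition site can be sketched as a simple coordinate check. This is only an illustration, not part of the disclosed method: the function name, the 1-based coordinate convention, and treating "about 30 to 2000" as hard limits are all assumptions made for the example.

```python
def poly_a_site_in_range(termination_site: int, poly_a_site: int,
                         min_gap: int = 30, max_gap: int = 2000) -> bool:
    """Check whether a poly-A addition site lies within the stated window.

    Hypothetical helper: both arguments are nucleotide positions on the same
    strand of the construct, and the "about 30 to 2000 nucleotides downstream"
    guideline is treated as a strict inclusive range for illustration.
    """
    gap = poly_a_site - termination_site  # distance downstream of the termination site
    return min_gap <= gap <= max_gap


# Example with made-up coordinates: termination site at 3000, poly-A site at 3150.
print(poly_a_site_in_range(3000, 3150))
```

A site placed upstream of the termination site gives a negative gap and is rejected by the same comparison.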
  • any of the commonly employed host cells can be used in connection with the expression of the LYST1, Lyst1, LYST2, or Lyst2 proteins and epitopes derived therefrom in accordance herewith. Examples include cell lines typically employed for eukaryotic expression such as 239, AtT-20, HepG2, VERO, HeLa, CHO, WI 38, BHK, COS-7, RIN and MDCK cell lines. It is further contemplated that the proteins, peptides, or epitopic peptides derived from native or recombinant LYST1, Lyst1, LYST2, or Lyst2 proteins may be "overexpressed", i.e., expressed in increased levels relative to its natural expression in human cells, or even relative to the expression of other proteins in a recombinant host cell containing LYST1-, Lyst1-, LYST2-, or Lyst2-encoding DNA segments.
  • overexpression may be assessed by a variety of methods, including radiolabeling and/or protein purification
  • facile and direct methods are preferred, for example, those involving SDS/PAGE and protein staining or Western blotting, followed by quantitative analyses, such as densitometric scanning of the resultant gel or blot.
  • a specific increase in the level of the recombinant protein or peptide in comparison to the level in natural LYST1-, Lyst1-, LYST2-, or Lyst2-producing cells is indicative of overexpression, as is a relative abundance of the specific protein in relation to the other proteins produced by the host cell and, e.g., visible on a gel.
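The two indicators of overexpression described above (an increase relative to natural producing cells, and relative abundance among the host cell's proteins) can be sketched as simple ratios of densitometric band intensities. This is a minimal sketch under stated assumptions: the function names, the use of a loading-control intensity for normalization, and the example numbers are hypothetical and are not taken from the specification.

```python
from typing import Sequence


def fold_overexpression(recombinant_band: float, native_band: float,
                        recombinant_loading: float = 1.0,
                        native_loading: float = 1.0) -> float:
    """Ratio of loading-normalized band intensity, recombinant vs. native cells.

    Hypothetical helper: intensities are arbitrary densitometry units from a
    scanned gel or blot; the loading values are an assumed normalization step.
    """
    if min(recombinant_loading, native_loading, native_band) <= 0:
        raise ValueError("native band and loading intensities must be positive")
    return (recombinant_band / recombinant_loading) / (native_band / native_loading)


def relative_abundance(target_band: float, all_bands: Sequence[float]) -> float:
    """Fraction of total lane intensity contributed by the target band."""
    total = sum(all_bands)
    if total <= 0:
        raise ValueError("total lane intensity must be positive")
    return target_band / total


# Made-up densitometry readings for illustration only.
print(fold_overexpression(900.0, 150.0))
print(relative_abundance(25.0, [25.0, 50.0, 25.0]))
```

A fold value above 1 would be read as increased expression relative to the natural level; what threshold counts as meaningful is left to the experimenter.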
  • engineered or "recombinant" cell is intended to refer to a cell into which a recombinant gene, such as a gene encoding LYST1, Lyst1, LYST2, or Lyst2 has been introduced. Therefore, engineered cells are distinguishable from naturally occurring cells which do not contain a recombinantly introduced gene. Engineered cells are thus cells having a gene or genes introduced through the hand of man.
  • Recombinantly introduced genes will either be in the form of a single structural gene, an entire genomic clone comprising a structural gene and flanking DNA, or an operon or other functional nucleic acid segment which may also include genes positioned either upstream and/or downstream of the promoter, regulatory elements, with or without introns, or a cDNA clone comprising the structural gene itself, or even genes not naturally associated with the particular gene of interest.
  • constitutive eukaryotic promoters include viral promoters such as the cytomegalovirus (CMV) promoter, the Rous sarcoma long-terminal repeat (LTR) sequence, or the SV40 early gene promoter. The use of these constitutive promoters will ensure a high, constant level of expression of the introduced genes.
  • CMV cytomegalovirus
  • LTR Rous sarcoma long-terminal repeat
  • the level of expression from the introduced genes of interest can vary in different clones, or genes isolated from different strains or bacteria
  • the level of expression of a particular recombinant gene can be chosen by evaluating different clones derived from each transfection study; once that line is chosen, the constitutive promoter ensures that the desired level of expression is permanently maintained
  • promoters that are specific for cell type used for engineering such as the insulin promoter in insulinoma cell lines, or the prolactin or growth hormone promoters in anterior pituitary cell lines.
  • a further aspect of the invention is the preparation of immunological compositions, and in particular anti-LYST/Lyst antibodies for diagnostic and therapeutic methods relating to the detection and diagnosis of CHS.
  • Methods for diagnosing CHS and the detection of LYST/Lyst-encoding nucleic acid segments in clinical samples using nucleic acid compositions are also obtained from the invention.
  • the nucleic acid sequences encoding LYST/Lyst are useful as diagnostic probes using conventional techniques such as in Southern hybridization analyses or Northern hybridization analyses to detect the presence of LYST/Lyst nucleic acid segments within a clinical sample from a patient suspected of having such a condition.
  • nucleic acid sequences as disclosed in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11 and SEQ ID NO:13 are preferable as probes for such hybridization analyses
  • a method of generating an immune response in an animal generally involves administering to an animal a pharmaceutical composition comprising an immunologically effective amount of a peptide composition disclosed herein
  • Preferred peptide compositions include the peptide disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14.
  • the invention also encompasses LYST/Lyst and LYST/Lyst-derived peptide antigen compositions together with pharmaceutically-acceptable excipients, carriers, diluents, adjuvants, and other components, such as additional peptides, antigens, or outer membrane preparations, as may be employed in the formulation of particular vaccines
  • Antibodies may be of several types including those raised in heterologous donor animals or human volunteers immunized with the LYST/Lyst gene product, monoclonal antibodies (mAbs) resulting from hybridomas derived from fusions of B cells from immunized animals or humans with compatible myeloma cell lines, so-called "humanized" mAbs resulting from expression of gene fusions of combinatorial determining regions of mAb-encoding genes from heterologous species with genes encoding human antibodies, or LYST/Lyst-reactive antibody-containing fractions of plasma from human donors suspected of having CHS. It is contemplated that any of the techniques described above might be used for the vaccination of subjects for the purpose of antibody production. Optimal dosing of such antibodies is highly dependent upon the pharmacokinetics of the specific antibody population in the particular species to be treated
  • the present invention also provides methods of generating an immune response, which methods generally comprise administering to an animal, a pharmaceutically-acceptable composition comprising an immunologically effective amount of a LYST/Lyst peptide composition
  • Preferred animals include mammals, and particularly humans. Other preferred animals include murines, bovines, equines, porcines, canines, and felines
  • the composition may include partially or significantly purified LYST/Lyst peptide epitopes, obtained from natural or recombinant sources, which proteins or peptides may be obtainable naturally or either chemically synthesized, or alternatively produced in vitro from recombinant host cells expressing DNA segments encoding such epitopes. Smaller peptides that include reactive epitopes, such as those between about 10 and about 50, or even between about 50 and about 100 amino acids in length will often be preferred
  • the antigenic proteins or peptides may also be combined with other agents, such as other LYST/Lyst -related
  • immunologically effective amount: an amount of a peptide composition that is capable of generating an immune response in the recipient animal. This includes both the generation of an antibody response (B cell response), and/or the stimulation of a cytotoxic immune response (T cell response).
  • an immune response will have utility in both the production of useful bioreagents, e.g., CTLs and, more particularly, reactive antibodies, for use in diagnostic embodiments, and will also have utility in various prophylactic or therapeutic embodiments. Therefore, although these methods for the stimulation of an immune response include vaccination regimens and treatment regimens, it will be understood that achieving either of these end results is not necessary for practicing these aspects of the invention
  • Further means contemplated by the inventors for generating an immune response in an animal include administering to the animal, or human subject, a pharmaceutically-acceptable composition comprising an immunologically effective amount of a nucleic acid composition encoding a LYST/Lyst epitope, or an immunologically effective amount of an attenuated live organism that includes and expresses such a nucleic acid composition
  • the "immunologically effective amounts" are those amounts capable of stimulating a B cell and/or T cell response
  • Immunoformulations of this invention may comprise native, or synthetically-derived antigenic peptide fragments from these proteins. As such, antigenic functional equivalents of the proteins and peptides described herein also fall within the scope of the present invention
  • an "antigenically functional equivalent" protein or peptide is one that incorporates an epitope that is immunologically cross-reactive with one or more epitopes derived from any of the particular proteins disclosed. Antigenically functional equivalents, or epitopic sequences, may be first designed or predicted and then tested, or may simply be directly tested for cross-reactivity
  • the present invention concerns immunodetection methods and associated kits. It is contemplated that the proteins or peptides of the invention may be employed to detect antibodies having reactivity therewith, or, alternatively, antibodies prepared in accordance with the present invention, may be employed to detect LYST/Lyst proteins or peptides. Either type of kit may be used in the immunodetection of compounds, present within clinical samples, that are indicative of CHS. The kits may also be used in antigen or antibody purification, as appropriate.
  • the preferred immunodetection methods will include first obtaining a sample suspected of containing a LYST/Lyst-reactive antibody, such as a biological sample from a patient, and contacting the sample with a first LYST/Lyst protein or peptide under conditions effective to allow the formation of an immunocomplex (primary immune complex). One then detects the presence of any primary immunocomplexes that are formed
  • LYST/Lyst proteins include LYST1 and LYST2 from human origins, and Lyst1 and Lyst2 proteins derived from murine origins
  • Detection of primary immune complexes is generally based upon the detection of a label or marker, such as a radioactive, fluorescent, biological or enzymatic label, with enzyme tags such as alkaline phosphatase, urease, horseradish peroxidase and glucose oxidase being suitable
  • the particular antigen employed may itself be linked to a detectable label, wherein one would then simply detect this label, thereby allowing the amount of bound antigen present in the composition to be determined.
  • the primary immune complexes may be detected by means of a second binding ligand that is linked to a detectable label and that has binding affinity for the first protein or peptide.
  • the second binding ligand is itself often an antibody, which may thus be termed a "secondary" antibody
  • the primary immune complexes are contacted with the labeled, secondary binding ligand, or antibody, under conditions effective and for a period of time sufficient to allow the formation of secondary immune complexes
  • the secondary immune complexes are then generally washed to remove any non-specifically bound labeled secondary antibodies and the remaining bound label is then detected
  • sample suspected of containing the antibodies of interest may be employed.
  • exemplary samples include clinical samples obtained from a patient, such as blood or serum samples, bronchoalveolar fluid, ear swabs, sputum samples, middle ear fluid or even perhaps urine samples. This allows for the diagnosis of CHS and related disorders.
  • the clinical samples may be from veterinary sources and may include such domestic animals as cattle, sheep, and goats. Samples from feline, canine, and equine sources may also be used in accordance with the methods described herein.
  • the present invention contemplates the preparation of kits that may be employed to detect the presence of LYST/Lyst-specific antibodies in a sample
  • kits in accordance with the present invention will include a suitable protein or peptide together with an immunodetection reagent, and a means for containing the protein or peptide and reagent.
  • the immunodetection reagent will typically comprise a label associated with a LYST/Lyst protein or peptide, or associated with a secondary binding ligand.
  • exemplary ligands might include a secondary antibody directed against the first LYST/Lyst or peptide or antibody, or a biotin or avidin (or streptavidin) ligand having an associated label.
  • kits may contain antigen or antibody-label conjugates either in fully conjugated form, in the form of intermediates, or as separate moieties to be conjugated by the user of the kit.
  • the container means will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which the antigen may be placed, and preferably suitably allocated. Where a second binding ligand is provided, the kit will also generally contain a second vial or other container into which this ligand or antibody may be placed.
  • the kits of the present invention will also typically include a means for containing the vials in close confinement for commercial sale, such as, e.g., injection or blow-molded plastic containers into which the desired vials are retained.
  • it may be desirable to administer LYST- or Lyst-encoding proteins to the human or animal subject in a pharmaceutically acceptable composition comprising an immunologically effective amount of LYST or Lyst proteins or peptides mixed with other excipients, carriers, or diluents which may improve or otherwise alter stimulation of B cell and/or T cell responses, or immunologically inert salts, organic acids and bases, carbohydrates, and the like, which promote stability of such mixtures.
  • Immunostimulatory excipients may include salts of aluminum (often referred to as Alums), simple or complex fatty acids and sterol compounds, physiologically acceptable oils, polymeric carbohydrates, chemically or genetically modified protein toxins, and various particulate or emulsified combinations thereof.
  • Alums aluminum
  • Attenuated organisms may be engineered to express recombinant LYST or Lyst proteins or peptides, and the organisms themselves may be delivery vehicles for the invention.
  • Pox-, polio-, adeno-, or other viruses, and bacteria such as Salmonella, Shigella, Listeria, Streptococcus species may also be used in conjunction with the methods and compositions disclosed herein.
  • the naked DNA technology has been shown to be suitable for protection against infectious organisms.
  • DNA segments could be used in a variety of forms including naked DNA and plasmid DNA, and may be administered to the subject in a variety of ways including parenteral, mucosal, and so-called microprojectile-based "gene-gun" inoculations.
  • the use of LYST or Lyst nucleic acid compositions of the present invention in such immunization techniques is thus proposed to be useful as a vaccination strategy against Lyme disease.
  • an optimal dosing schedule of a vaccination regimen may include as many as five to six, but preferably three to five, or even more preferably one to three administrations of the immunizing entity given at intervals of as few as two to four weeks, to as long as five to ten years, or occasionally at even longer intervals.
  • Blockade of such degranulation using dominant-negatively acting truncated Lyst peptides may reasonably be expected to be efficacious in inflammatory and autoimmune diseases such as asthma, urticaria, inflammatory bowel disease, systemic lupus erythematosus, rheumatoid arthritis, psoriasis, systemic vasculitis, glomerulonephritis, multiple sclerosis, and post-angioplasty restenosis. Proof of this principle is documented in Clark et al. (1982), who demonstrated that bg mice are protected from lupus nephritis.
  • Lyst peptides that mimic or augment Lyst function may reasonably be expected to be efficacious in the treatment of neoplasia. Proof of this principle is documented in Aboud et al. (1993) and Hayakawa et al. (1986), who demonstrate that bg mice and CHS patients are susceptible to development of neoplasia, and have more aggressive neoplasms with accelerated metastases.
  • Lyst2 is thought to act to regulate degranulation of vesicles within cells in the brain and kidney. Blockade of such degranulation using dominant-negatively acting truncated Lyst2 peptides may reasonably be expected to be efficacious for the treatment of neurologic and renal degenerative diseases such as Alzheimer's disease, motor neuron disease, Parkinson's disease, acute tubular necrosis, glomerulonephritis and glomerulosclerosis.
  • Drugs that mimic the action of dominant-negatively acting truncated Lyst2 peptides. Lyst2 is thought to act to regulate degranulation of vesicles within cells in the brain and kidney. Blockade of such degranulation using dominant-negatively acting truncated Lyst2 peptides may reasonably be expected to be efficacious for the treatment of neurologic and renal degenerative diseases such as Alzheimer's disease, motor neuron disease, Parkinson's disease, acute tubular necrosis, glomerulonephritis and glomerulosclerosis.
  • FIG. 1B Autoradiographs of corresponding Southern blots from the gels shown in FIG. 1A hybridized with pBR322 (which cross-hybridizes to pYAC4). YAC clone numbers are shown above each panel and molecular size standards (in kilobases) to the left of each panel. 1380 is the host S.
  • FIG. 2 STS content mapping of bg critical region YAC and P1 clones
  • the presence of an STS (y-axis) in a YAC/P1 clone (x-axis) is indicated by a filled box. Each contig is identified by the degree of shading of the box
  • the bg critical region extends from proximal to D13MM34 to the interval between D13Mit207 and D13Mit162/D13Mit305 (crossover location indicated by a double line)
  • STS used for isolation of YAC clones were Nid5' for 151H1, 195A8, 64F5, 93E4, 68E12, and 55F3; Estm9 for 148E11; D13MM34 for 165F7; D13Sfk13 for 84A8; and D13Mit207 for 135G3 and 148H8. P1 clones 8591 and 8592 were identified with D13Sfk13. YAC clone 64F5 is
  • FIG. 3A Genetic mapping of bg on mouse Chr 1 Haplotype analysis of proximal mouse
  • FIG. 3B Genetic mapping of bg on mouse Chr 1 Haplotype analysis of proximal mouse Chr 1 genetic markers in 111 (C57BL/6J -W sh -bg J x Mus domesticus PAC)F1 ×
  • FIG. 3C Genetic mapping of bg on mouse Chr 1 Haplotype analysis of proximal mouse Chr 1 genetic markers in 111 (C57BL/6J- A -bg J x Mus musculus PWK)F1 × C57BL/6J- bg backcross mice. Closed boxes represent the homozygous C3H pattern and open boxes the F1 pattern. Number of mice of each haplotype are indicated
  • FIG. 3D Genetic mapping of bg on mouse Chr 1
  • Composite linkage map of mouse Chr 13 in the vicinity of bg Loci are positioned according to their Approximate relative positions of loci were ascertained by integration of data from the above three backcrosses and from
  • FIG. 4 Autoradiograph of a pulsed-field gel Southern blot of mouse DNA probed with Nd
  • FIG. 5A DNA sequence of the CH gene (LYST1) from position 1 to position 1400. The DNA sequence continues in FIG. 5B
  • FIG. 5B Continuation of the DNA sequence of the CH gene in FIG. 5A beginning at position 1401 and continuing to position 2800
  • FIG. 5C Continuation of the DNA sequence of the CH gene in FIG. 5B beginning at position 2801 and continuing to position 3514
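The way the 3514-position CH gene sequence is divided across FIGS. 5A to 5C (1400 positions per panel, with a shorter final panel) can be reproduced with a short helper. This is an illustrative sketch only; the function name and the 1-based inclusive coordinate convention are assumptions chosen to match how the figure legends state positions.

```python
from typing import List, Tuple


def figure_panels(total_length: int, panel_size: int = 1400) -> List[Tuple[int, int]]:
    """Split a sequence of total_length positions into consecutive panels.

    Returns 1-based, inclusive (start, end) pairs; the last panel may be
    shorter than panel_size. Hypothetical helper for illustration.
    """
    if total_length < 1 or panel_size < 1:
        raise ValueError("total_length and panel_size must be positive")
    panels = []
    start = 1
    while start <= total_length:
        end = min(start + panel_size - 1, total_length)
        panels.append((start, end))
        start = end + 1
    return panels


# Panel boundaries for a 3514-position sequence split into 1400-position panels.
for label, (start, end) in zip("ABC", figure_panels(3514)):
    print(f"FIG. 5{label}: positions {start}-{end}")
```

Slicing a Python string with these pairs requires converting to 0-based indexing, i.e. `seq[start - 1:end]`.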
  • FIG. 6 Amino acid sequence of the CH protein
  • FIG. 7A Genetic mapping of the Bl gene
  • FIG. 7B Genetic mapping of the bg and Bl genes
  • FIG. 7C Detailed map showing the localization of the bg and Bl genes
  • FIG. 8 Bl cDNA clones.
  • FIG. 9A Deletion of part of Bl in bg11J. Probe used in Southern analysis is probe A from FIG. 8
  • FIG. 9B Deletion of part of Bl in bg11J. Probe used in Southern analysis is probe B from FIG. 8
  • FIG. 9C Deletion of part of Bl in bg11J. Probe used in Southern analysis is probe C from FIG. 8. In bg11J a deletion from bp 1250 to 2400 was observed
  • FIG. 10 Physical mapping of Bl gene within bg critical region
  • FIG. 11 Genetic and physical map of the bg non-recombinant interval on mouse chromosome 13 showing the location of Lyst. Mouse chromosome 13 is shown by the horizontal line with the centromere on the left
  • the bg critical region is delineated by chromosome crossovers (denoted with an X) in animals 134 and 137 of an interspecific mouse backcross [C57BL/6- ⁇ V x (C57BL/6J- x CAST/EiJ)F1]
  • Microsatellite markers D13Mit172 and D13Mit239 flank bg proximally; D13Mit162 and D13Mit305 lie distal to bg (indicated by turquoise circles). YAC and P1 clones identified by PCR™ screening (Kusumi et al., 1993; Pierce et al., 1992) with oligonucleotides corresponding to Nid or D13Sfk13 are shown above the chromosome. Novel sequence-tagged sites (STS, indicated by
  • FIG. 12A Intragenic deletion of Lyst in bg11J DNA. Southern blot identification of an intragenic Lyst deletion in bg11J DNA. A Southern blot was sequentially hybridized
  • FIG. 12B Intragenic deletion of Lyst in bg11J DNA. Southern blot identification of an intragenic Lyst deletion in bg11J DNA. A Southern blot was sequentially hybridized (Barbosa et al., 1995) with 3 Lyst probes. This panel shows the probe (nucleotides 2,835-
  • FIG. 12C Intragenic deletion of Lyst in bg11J DNA. Southern blot identification of an intragenic Lyst deletion in bg11J DNA. A Southern blot was sequentially hybridized (Barbosa et al., 1995) with 3 Lyst probes. Shown in this panel are results when the probe
  • FIG. 12D Intragenic deletion of Lyst in bg11J DNA. PCR™ analysis of the bg11J deletion. C57BL/10J, C3HeB/FeJ, C57BL/6J and C57BL/6J-bg11J genomic DNA and Lyst cDNA were used as templates in the PCR™ reactions. Amplicons illustrated correspond to Lyst cDNA nucleotides 1,337-1,837, which represent exon ⁇ and are upstream from the deletion. No amplicon was observed in control PCR™ reactions performed without template. More than 30 other STSs that had been localized within the bg non-recombinant interval PCR™ amplified normally from bg11J DNA
  • FIG. 12E Intragenic deletion of Lyst in bg11J DNA. PCR™ analysis of the bg11J deletion. C57BL/10J, C3HeB/FeJ, C57BL/6J and C57BL/6J-bg11J genomic DNA and Lyst cDNA were used as templates in the PCR™ reactions. Amplicons illustrated correspond to nucleotides 2,670-3,210, which represent exon ⁇ , which is deleted in bg11J DNA. No amplicon was observed in control PCR™ reactions performed without template. More than 30 other STSs that had been localized within the bg non-recombinant interval PCR™ amplified normally from bg11J DNA. FIG. 12F.
  • FIG. 12G Intragenic deletion of Lyst in bg11J DNA. Genomic structure of Lyst in the vicinity of the bg11J deletion. Lyst exons ( ⁇ , ⁇ , ⁇ , ⁇ , and ⁇ ) are depicted by black boxes, and intervening introns by a solid line.
  • Nucleotides of the mouse Lyst cDNA that correspond to exonic boundaries are indicated above the boxes. The 3' end of exon ⁇ , and all of exons ⁇ and ⁇ , are deleted in bg11J DNA
  • the region of Lyst protein that is deleted in bg11J contains a pair of helices with N-terminal phosphorylation sites
  • Genomic structure and intronic sequences were ascertained by sequence analysis of nested PCR™ products, performed with exonic primers and P1 clone DNA as template (Kingsmore et al., 1994). Boundaries of the bg11J deletion were determined by PCR™ of genomic DNA.
  • FIG. 13A Northern blot analysis of mouse and human Lyst. Northern blots of 2 µg poly(A)+ RNA from various mouse tissues (Clontech) hybridized with probes that correspond to nucleotides 4,423-4,621 of mouse Lyst cDNA
  • FIG. 13B Northern blot analysis of mouse and human Lyst. Northern blots of 2 µg poly(A)+ RNA from various mouse tissues (Clontech) hybridized with probes that correspond to nucleotides 1,430-2,457 (exon ⁇ ) of mouse Lyst cDNA. Molecular size standards (in kb) are shown to the left. Hybridization of mouse mRNA with probes from mouse Lyst exons ⁇ and ⁇ gave identical results to those shown with exon ⁇ , whereas probes from exons ⁇ , ⁇ , and ⁇ gave results identical to those shown in FIG. 13A.
  • FIG. 13C Northern blot analysis of mouse and human Lyst.
  • Northern blot of 2 µg poly(A)+ RNA from various human lymphoid tissues, hybridized with a probe that corresponds to nucleotides 357-800 of human LYST cDNA. Molecular size standards (in kb) are shown to the left.
  • FIG. 13D Northern blot analysis of mouse and human Lyst. Northern blot of 2 µg poly(A)+ RNA from human cancer cell lines, hybridized with a probe that corresponds to nucleotides 357-800 of human LYST cDNA. Molecular size standards (in kb) are shown to the left
  • FIG. 14A Mutation analysis of LYST cDNA from CHS patients. A Northern blot of 2 µg aliquots of lymphoblastoid poly(A)+ RNA from CHS patients and a control. The probe used for hybridization corresponds to nucleotides 490 to 817 of LYST
  • FIG. 14B SSCP analysis of cDNA corresponding to LYST nucleotides 439 to 806. Each lane contains samples from individual patients as indicated. Note the appearance of an extra band in lanes corresponding to patients 371 and 373
  • FIG. 14C Sequence chromatograms showing mutations in LYST cDNA clones from patients 371 and 373
  • the upper part is normal human Z7SJcDNA sequence
  • the arrows indicate the positions of a G insertion (patient 371) and C to T substitution (patient 373)
  • the antisense strand of LYST is shown
  • FIG. 15A. Physical mapping of LYST. Monochromosomal somatic cell hybrid blot (BIOS Laboratories, New Haven, Connecticut) containing DNA from 24 somatic cell hybrid cell lines and three control DNAs (human, hamster, or mouse) digested with EcoRI. The cell line and chromosome number are indicated at the top of the figure. *Mix lane consists of 1.5 mg human DNA and 4.5 mg mouse DNA. **Human/hamster somatic cell hybrid. All others are human/mouse hybrids. Molecular size standards (in kb) are shown to the right. The blot was hybridized with a probe corresponding to nucleotides 2923-4865 of human LYST cDNA.
  • FIG. 15B. Southern blot of CHS critical region YACs digested with TaqI. The YAC coordinates are indicated at the top and molecular size standards (in kb) are shown to the left. The probe used for the hybridization corresponds to LYST nucleotides 490 to 817.
  • FIG. 15C. The same Southern blot shown in panel B, rehybridized with a probe corresponding to LYST nucleotides 4551 to 4977. A third LYST probe (corresponding to nucleotides 3032-4722) also hybridized to the same YAC clones.
  • FIG. 15D. Physical map of human chromosome 1 showing the location of LYST within a YAC contig of the CHS critical region (Barrat et al., 1996). The upper part represents chromosome 1. The microsatellite markers D1S179 (centromeric) and WI-12396 (telomeric) flank the CHS locus. YAC clones are shown below the chromosome. The figure is not drawn to scale.
  • FIG. 16A. Genomic organization of LYST. Schematic representation of PCR™ clones corresponding to the human LYST cDNA (Genbank accession number U70064). The solid and open bars represent the LYST coding region and the 5' UTR, respectively.
  • Nucleotide 5095 corresponds to the transition between sequences conserved with Lyst (Barbosa et al., 1996) and BG (Perou et al., 1996a).
  • The three human ESTs identified by database searches with the mouse sequence (#1, #2 and #3; Genbank accession numbers: #1, L77889; #2, W26957; #3, H51623) are shown at the top.
  • Clones #4, #5, #6 and #8 are RT-PCR™ products.
  • Clone #7 is a 2 kb 5' RACE product.
  • FIG. 16B. Alternative splicing of mouse Lyst. Solid boxes represent Lyst exons ⁇ and ⁇. Splicing of exon ⁇ to exon ⁇ occurs in the Lyst-I mRNA (12 kb).
  • The hatched box represents the intronic region that forms the 3' end of the Lyst-II ORF (5.9 kb).
  • An asterisk indicates a stop codon and an 'A' indicates a polyadenylation signal within the intron. Nucleotide positions indicated are from Genbank accession number L77884.
  • FIG. 16C. Detection of Lyst-I and Lyst-II by RT-PCR™ and genomic PCR™.
  • RNAse-treated mouse melanocyte RNA was reverse transcribed and amplified with primers F1/R1 (expected amplicon size 273 bp) or F1/R2 (expected amplicon size 560 bp). RNAse-treated C57BL/6J DNA was amplified with primers F1/R1.
  • primer sequences are:
  • The term "LYST1 gene" is used to refer to a gene or DNA coding region that encodes a Chediak-Higashi protein, polypeptide or peptide.
  • A LYST1 gene is a gene that hybridizes, under relatively stringent hybridization conditions (see, e.g., Maniatis et al., 1982), to DNA sequences presently known to include LYST1 gene sequences. It will, of course, be understood that one or more than one gene encoding LYST1 proteins or peptides may be used in the methods and compositions of the invention.
  • the nucleic acid compositions and methods disclosed herein may entail the administration of one, two, three, or more, genes or gene segments. The maximum number of genes that may be used is limited only by practical considerations, such as the effort involved in simultaneously preparing a large number of gene constructs or even the possibility of eliciting a significant adverse cytotoxic effect.
  • The term "LYST2 gene" is used to refer to a gene or DNA coding region that encodes a LYST2 protein, polypeptide or peptide.
  • A LYST2 gene is a gene that hybridizes, under relatively stringent hybridization conditions (see, e.g., Maniatis et al., 1982), to DNA sequences presently known to include LYST2 gene sequences. It will, of course, be understood that one or more than one gene encoding LYST2 proteins or peptides may be used in the methods and compositions of the invention.
  • the nucleic acid compositions and methods disclosed herein may entail the administration of one, two, three, or more, genes or gene segments. The maximum number of genes that may be used is limited only by practical considerations, such as the effort involved in simultaneously preparing a large number of gene constructs or even the possibility of eliciting a significant adverse cytotoxic effect.
  • Lyst genes disclosed herein may be combined on a single genetic construct under control of one or more promoters, or they may be prepared as separate constructs of the same or different types.
  • Certain gene combinations may be designed to, or their use may otherwise result in, achieving synergistic effects on formation of an immune response, or the development of antibodies to gene products encoded by such nucleic acid segments, or in the production of diagnostic and treatment protocols for, among other things, Chediak-Higashi Syndrome. Any and all such combinations are intended to fall within the scope of the present invention. Indeed, many synergistic effects have been described in the scientific literature, so that one of ordinary skill in the art would readily be able to identify likely synergistic gene combinations, or even gene-protein combinations.
  • A nucleic acid segment or gene could be administered in combination with further agents, such as, e.g., proteins or polypeptides or various pharmaceutically active agents. So long as genetic material forms part of the composition, there is virtually no limit to other components which may also be included, given that the additional agents do not cause a significant adverse effect upon contact with the target cells or tissues.
  • kits comprising, in suitable container means, a LYST or Lyst composition of the present invention in a pharmaceutically acceptable formulation represent another aspect of the invention.
  • the LYST or Lyst composition may be native LYST or Lyst protein, truncated LYST or Lyst protein, site-specifically mutated LYST or Lyst-encoding DNAs, or LYST- or Lyst- derived peptide epitopes, or alternatively antibodies which bind the native LYST or Lyst gene product, truncated LYST or Lyst protein, site-specifically mutated LYST or Lyst protein, or LYST- or Lyst-encoded peptide epitopes.
  • the LYST or Lyst composition may be nucleic acid segments encoding one or more native LYST or Lyst proteins, truncated LYST or Lyst proteins, site-specifically mutated LYST or Lyst proteins, or peptide epitope derivatives of LYST or Lyst.
  • Such nucleic acid segments may be DNA or RNA, and may be either native, recombinant, or mutagenized nucleic acid segments.
  • kits may comprise a single container means that contains the LYST or Lyst composition.
  • the container means may, if desired, contain a pharmaceutically acceptable sterile excipient, having associated with it, the LYST or Lyst composition and, optionally, a detectable label or imaging agent.
  • the formulation may be in the form of a gelatinous composition, e.g., a collagenous- LYST or Lyst composition, or may even be in a more fluid form.
  • the container means may itself be a syringe, pipette, or other such like apparatus, from which the LYST or Lyst composition may be applied to a tissue site, injected into an animal, or otherwise administered as needed.
  • the single container means may contain a dry, or lyophilized, mixture of a LYST or Lyst composition which may or may not require pre-wetting before use.
  • kits of the invention may comprise distinct container means for each component.
  • one container would contain the LYST or Lyst composition, either as a sterile DNA solution or in a lyophilized form, and the other container would include the matrix, which may or may not itself be pre-wetted with a sterile solution, or be in a gelatinous, liquid or other syringeable form.
  • kits may also comprise a second or third container means for containing a sterile, pharmaceutically acceptable buffer, diluent or solvent.
  • Such a solution may be required to formulate the LYST or Lyst component into a more suitable form for application to the body, e.g., as a topical preparation, or alternatively, in oral, parenteral, or intravenous forms.
  • all components of a kit could be supplied in a dry form (lyophilized), which would allow for "wetting" upon contact with body fluids.
  • the presence of any type of pharmaceutically acceptable buffer or solvent is not a requirement for the kits of the invention.
  • the kits may also comprise a second or third container means for containing a pharmaceutically acceptable detectable imaging agent or composition.
  • the container means will generally be a container such as a vial, test tube, flask, bottle, syringe or other container means, into which the components of the kit may be placed.
  • the matrix and gene components may also be aliquoted into smaller containers, should this be desired.
  • the kits of the present invention may also include a means for containing the individual containers in close confinement for commercial sale, such as, e.g., injection or blow-molded plastic containers into which the desired vials or syringes are retained.
  • kits of the invention may also comprise, or be packaged with, an instrument for assisting with the placement of the ultimate LYST or Lyst composition within the body of an animal.
  • an instrument may be a syringe, pipette, forceps, or any such medically approved delivery vehicle.
  • nucleic acid segments disclosed herein will be used to transfect appropriate host cells.
  • Technology for introduction of DNA into cells is well-known to those of skill in the art. Four general methods for delivering a nucleic segment into cells have been described.
  • the inventors contemplate the use of liposomes and/or nanocapsules for the introduction of particular peptides or nucleic acid segments into host cells. Such formulations may be preferred for the introduction of pharmaceutically-acceptable formulations of the nucleic acids, peptides, and/or antibodies disclosed herein.
  • liposomes are generally known to those of skill in the art (see, for example, Couvreur et al., 1977, which describes the use of liposomes and nanocapsules in the targeted antibiotic therapy of intracellular bacterial infections and diseases). Recently, liposomes were developed with improved serum stability and circulation half-times (Gabizon and Papahadjopoulos, 1988; Allen and Choun, 1987).
  • Nanocapsules can generally entrap compounds in a stable and reproducible way (Henry-Michelland et al., 1987). To avoid side effects due to intracellular polymeric overloading, such ultrafine particles (sized around 0.1 µm) should be designed using polymers able to be degraded in vivo. Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these requirements are contemplated for use in the present invention, and such particles are easily made, as described (Couvreur et al., 1977, 1988).
  • Liposomes are formed from phospholipids that are dispersed in an aqueous medium and spontaneously form multilamellar concentric bilayer vesicles (also termed multilamellar vesicles, MLVs).
  • SUVs: small unilamellar vesicles
  • Phospholipids can form a variety of structures other than liposomes when dispersed in water, depending on the molar ratio of lipid to water. At low ratios the liposome is the preferred structure.
  • the physical characteristics of liposomes depend on pH, ionic strength and the presence of divalent cations. Liposomes can show low permeability to ionic and polar substances, but at elevated temperatures undergo a phase transition which markedly alters their permeability.
  • the phase transition involves a change from a closely packed, ordered structure, known as the gel state, to a loosely packed, less-ordered structure, known as the fluid state. This occurs at a characteristic phase-transition temperature and results in an increase in permeability to ions, sugars and drugs.
  • Liposomes interact with cells via four different mechanisms: endocytosis by phagocytic cells of the reticuloendothelial system such as macrophages and neutrophils; adsorption to the cell surface, either by nonspecific weak hydrophobic or electrostatic forces, or by specific interactions with cell-surface components; fusion with the plasma cell membrane by insertion of the lipid bilayer of the liposome into the plasma membrane, with simultaneous release of liposomal contents into the cytoplasm; and transfer of liposomal lipids to cellular or subcellular membranes, or vice versa, without any association of the liposome contents. It often is difficult to determine which mechanism is operative, and more than one may operate at the same time.
  • the present invention contemplates an antibody that is immunoreactive with a polypeptide of the invention.
  • one of the uses for LYST- or Lyst-derived epitopic peptides according to the present invention is to generate antibodies.
  • Reference to antibodies throughout the specification includes whole polyclonal and monoclonal antibodies (mAbs), and parts thereof, either alone or conjugated with other moieties.
  • Antibody parts include Fab and F(ab')2 fragments and single chain antibodies.
  • the antibodies may be made in vivo in suitable laboratory animals or in vitro using recombinant DNA techniques.
  • an antibody is a polyclonal antibody. Means for preparing and characterizing antibodies are well known in the art (see, e.g., Harlow and Lane, 1988).
  • a polyclonal antibody is prepared by immunizing an animal with an immunogen comprising a polypeptide of the present invention and collecting antisera from that immunized animal.
  • an animal used for production of antisera is a rabbit, a mouse, a rat, a hamster or a guinea pig. Because of the relatively large blood volume of rabbits, a rabbit is a preferred choice for production of polyclonal antibodies.
  • Antibodies, both polyclonal and monoclonal, specific for LYST- or Lyst-derived epitopes may be prepared using conventional immunization techniques, as will be generally known to those of skill in the art.
  • a composition containing antigenic epitopes of particular proteins can be used to immunize one or more experimental animals, such as a rabbit or mouse, which will then proceed to produce specific antibodies against LYST- or Lyst-derived peptides.
  • Polyclonal antisera may be obtained, after allowing time for antibody generation, simply by bleeding the animal and preparing serum samples from the whole blood.
  • the amount of immunogen composition used in the production of polyclonal antibodies varies depending upon the nature of the immunogen, as well as the animal used for immunization.
  • a variety of routes can be used to administer the immunogen (subcutaneous, intramuscular, intradermal, intravenous and intraperitoneal).
  • the production of polyclonal antibodies may be monitored by sampling blood of the immunized animal at various points following immunization. A second, booster injection, also may be given. The process of boosting and titering is repeated until a suitable titer is achieved.
  • the immunized animal can be bled and the serum isolated and stored, and/or the animal can be used to generate mAbs (below).
  • polyclonal antisera is derived from a variety of different "clones," i.e., B-cells of different lineage.
  • mAbs by contrast, are defined as coming from antibody-producing cells with a common B-cell ancestor, hence their "mono" clonality.
  • Polyclonal antisera according to the present invention is produced against peptides that are predicted to comprise whole, intact epitopes. It is believed that these epitopes are, therefore, more stable in an immunologic sense and thus express a more consistent immunologic target for the immune system. Under this model, the number of potential B-cell clones that will respond to this peptide is considerably smaller and, hence, the homogeneity of the resulting sera will be higher.
  • the present invention provides for polyclonal antisera where the clonality, i.e., the percentage of clones reacting with the same molecular determinant, is at least 80%. Even higher clonality - 90%, 95% or greater - is contemplated.
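  • The clonality figure defined above is a simple proportion, and it can be sketched as a short calculation. The function names, the per-determinant counts and the example data below are hypothetical illustrations and do not come from the specification; only the 80% default threshold is taken from the text.

```python
def clonality_percent(clone_counts, determinant):
    """Percentage of reactive clones that recognise `determinant`.

    clone_counts maps a determinant label to the number of clones
    reacting with it (hypothetical input format).
    """
    total = sum(clone_counts.values())
    if total == 0:
        raise ValueError("no reactive clones counted")
    return 100.0 * clone_counts.get(determinant, 0) / total


def meets_clonality(clone_counts, determinant, threshold=80.0):
    """True if clonality is at least `threshold` percent (80% per the text)."""
    return clonality_percent(clone_counts, determinant) >= threshold


# Hypothetical example: 85 of 100 clones react with the same determinant.
counts = {"epitope_A": 85, "epitope_B": 10, "epitope_C": 5}
print(clonality_percent(counts, "epitope_A"))
print(meets_clonality(counts, "epitope_A"))
```

  • Passing threshold=90.0 or threshold=95.0 checks the higher clonality levels that the text also contemplates.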
  • To obtain mAbs, one would also initially immunize an experimental animal, often preferably a mouse, with a LYST- or Lyst-containing composition. One would then, after a period of time sufficient to allow antibody generation, obtain a population of spleen or lymph cells from the animal. The spleen or lymph cells can then be fused with cell lines, such as human or mouse myeloma strains, to produce antibody-secreting hybridomas. These hybridomas may be isolated to obtain individual clones which can then be screened for production of antibody to the desired peptide.
  • Hybridomas which produce mAbs to the selected antigens are identified using standard techniques, such as ELISA and Western blot methods. Hybridoma clones can then be cultured in liquid media and the culture supernatants purified to provide the LYST- or Lyst-specific mAbs.
  • the mAbs of the present invention will also find useful application in immunochemical procedures, such as ELISA and Western blot methods, as well as other procedures such as immunoprecipitation, immunocytological methods, etc., which may utilize antibodies specific to the LYST or Lyst protein.
  • anti-LYST/Lyst antibodies may be used in immunoabsorbent protocols to purify native or recombinant LYST/Lyst proteins or LYST/Lyst-derived peptide species or synthetic or natural variants thereof.
  • the antibodies disclosed herein may be employed in antibody cloning protocols to obtain cDNAs or genes encoding LYST/Lyst proteins from other species or organisms, or to identify proteins having significant homology to the LYST/Lyst gene product.
  • Anti-LYST/Lyst antibodies will also be useful in immunolocalization studies to analyze the distribution of cells expressing LYST/Lyst protein during particular cellular activities, or for example, to determine the cellular or tissue-specific distribution of LYST/Lyst under different physiological conditions.
  • a particularly useful application of such antibodies is in purifying native or recombinant LYST/Lyst proteins, for example, using an antibody affinity column. The operation of all such immunological techniques will be known to those of skill in the art in light of the present disclosure.
  • Recombinant clones expressing the "LYST family" nucleic acid segments may be used to prepare purified recombinant LYST protein (rLYST), purified rLYST-derived peptide antigens as well as mutant or variant recombinant protein species in significant quantities.
  • the selected antigens, and variants thereof, are proposed to have significant utility in diagnosing and treating CHS.
  • rLYSTs, peptide variants thereof, and/or antibodies against such rLYSTs may also be used in immunoassays to detect the presence of LYST or as vaccines or immunotherapeutics to treat CHS and related disorders.
  • Second generation proteins will typically share one or more properties in common with the full-length antigen, such as a particular antigenic/immunogenic epitopic core sequence.
  • Epitopic sequences can be obtained from relatively short molecules prepared from knowledge of the peptide, or encoding DNA sequence information.
  • variant molecules may not only be derived from selected immunogenic/ antigenic regions of the protein structure, but may additionally, or alternatively, include one or more functionally equivalent amino acids selected on the basis of similarities or even differences with respect to the natural sequence.
  • a polyclonal antibody is prepared by immunizing an animal with an immunogenic composition in accordance with the present invention and collecting antisera from that immunized animal
  • a wide range of animal species can be used for the production of antisera
  • the animal used for production of antisera is a rabbit, a mouse, a rat, a hamster, a guinea pig or a goat. Because of the relatively large blood volume of rabbits, a rabbit is a preferred choice for production of polyclonal antibodies.
  • a given composition may vary in its immunogenicity. It is often necessary therefore to boost the host immune system, as may be achieved by coupling a peptide or polypeptide immunogen to a carrier.
  • exemplary and preferred carriers are keyhole limpet hemocyanin (KLH) and bovine serum albumin (BSA).
  • Other albumins such as ovalbumin, mouse serum albumin or rabbit serum albumin can also be used as carriers.
  • Means for conjugating a polypeptide to a carrier protein are well known in the art and include glutaraldehyde, m-maleimidobenzoyl-N-hydroxysuccinimide ester, carbodiimide and bis-diazotized benzidine.
  • mAbs may be readily prepared through use of well-known techniques, such as those exemplified in U.S. Patent 4,196,265, incorporated herein by reference.
  • this technique involves immunizing a suitable animal with a selected immunogen composition, e.g., a purified or partially purified protein, polypeptide or peptide.
  • the immunizing composition is administered in a manner effective to stimulate antibody producing cells.
  • Rodents such as mice and rats are preferred animals; however, the use of rabbit, sheep or frog cells is also possible.
  • the use of rats may provide certain advantages (Goding, 1986), but mice are preferred, with the BALB/c mouse being most preferred as this is most routinely used and generally gives a higher percentage of stable fusions.
  • somatic cells with the potential for producing antibodies, specifically B-lymphocytes (B-cells), are selected for use in the mAb generating protocol.
  • These cells may be obtained from biopsied spleens, tonsils or lymph nodes, or from a peripheral blood sample. Spleen cells and peripheral blood cells are preferred, the former because they are a rich source of antibody-producing cells that are in the dividing plasmablast stage, and the latter because peripheral blood is easily accessible.
  • a panel of animals will have been immunized and the spleen of the animal with the highest antibody titer will be removed and the spleen lymphocytes obtained by homogenizing the spleen with a syringe.
  • a spleen from an immunized mouse contains approximately 5 x 10^7 to 2 x 10^8 lymphocytes.
  • the antibody-producing B lymphocytes from the immunized animal are then fused with cells of an immortal myeloma cell, generally one of the same species as the animal that was immunized. Myeloma cell lines suited for use in hybridoma-producing fusion procedures preferably are non-antibody-producing, have high fusion efficiency, and have enzyme deficiencies that render them incapable of growing in certain selective media which support the growth of only the desired fused cells (hybridomas).
  • any one of a number of myeloma cells may be used, as are known to those of skill in the art (Goding, 1986; Campbell, 1984).
  • where the immunized animal is a mouse, one may use P3-X63/Ag8, X63-Ag8.653, NS1/1.Ag 4 1, Sp210-Ag14, FO, NSO/U, MPC-11, MPC11-X45-GTG 1.7 and S194/5XX0 Bul; for rats, one may use R210.RCY3, Y3-Ag 1.2.3, IR983F and 4B210; and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6 are all useful in connection with human cell fusions.
  • The NS-1 myeloma cell line is also termed P3-NS-1-Ag4-1.
  • Another mouse myeloma cell line that may be used is the 8-azaguanine-resistant mouse murine myeloma SP2/0 non-producer cell line.
  • Methods for generating hybrids of antibody-producing spleen or lymph node cells and myeloma cells usually comprise mixing somatic cells with myeloma cells in a 2:1 ratio, though the ratio may vary from about 20:1 to about 1:1, respectively, in the presence of an agent or agents (chemical or electrical) that promote the fusion of cell membranes. Fusion methods using Sendai virus have been described (Kohler and Milstein, 1975, 1976), and those using polyethylene glycol (PEG), such as 37% (v/v) PEG, by Gefter et al. (1977). The use of electrically induced fusion methods is also appropriate (Goding, 1986).
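  • The mixing ratio described above translates directly into a cell-count calculation. The following is a minimal sketch; the function name and the range check are illustrative assumptions, while the 2:1 default and the 20:1 to 1:1 bounds (somatic cells to myeloma cells) are the figures quoted in the text.

```python
def myeloma_cells_needed(somatic_cells, ratio=2.0):
    """Number of myeloma cells to mix with `somatic_cells` for fusion.

    `ratio` is the somatic:myeloma ratio expressed as ratio:1.
    The text gives 2:1 as usual and about 20:1 to about 1:1 as the range.
    """
    if not 1.0 <= ratio <= 20.0:
        raise ValueError("ratio outside the 1:1 to 20:1 range given in the text")
    return somatic_cells / ratio


# Hypothetical example: 1 x 10^8 spleen lymphocytes, within the range
# the text gives for one immunized mouse spleen.
print(myeloma_cells_needed(1e8))        # usual 2:1 ratio
print(myeloma_cells_needed(1e8, 20.0))  # 20:1 extreme
```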
  • the selective medium is generally one that contains an agent that blocks the de novo synthesis of nucleotides in the tissue culture media.
  • Exemplary agents are aminopterin, methotrexate, and azaserine. Aminopterin and methotrexate block de novo synthesis of both purines and pyrimidines, whereas azaserine blocks only purine synthesis.
  • Where aminopterin or methotrexate is used, the media is supplemented with hypoxanthine and thymidine as a source of nucleotides (HAT medium).
  • the preferred selection medium is HAT. Only cells capable of operating nucleotide salvage pathways are able to survive in HAT medium.
  • the myeloma cells are defective in key enzymes of the salvage pathway, e.g., hypoxanthine phosphoribosyl transferase (HPRT), and they cannot survive.
  • the B-cells can operate this pathway, but they have a limited life span in culture and generally die within about two weeks. Therefore, the only cells that can survive in the selective media are those hybrids formed from myeloma and B-cells.
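  • The selection logic described above can be written out as a small truth-table model. This is only a sketch of the reasoning in the text, not a simulation of cell culture: the class, field and function names are hypothetical, and the model assumes that a fused cell inherits the salvage pathway from the B-cell parent and immortality from the myeloma parent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_salvage_pathway: bool  # e.g. functional HPRT
    immortal: bool             # can grow indefinitely in culture


def survives_hat(cell):
    """A cell persists in HAT medium only if it can use the salvage
    pathway (de novo synthesis is blocked) and does not die off in culture."""
    return cell.has_salvage_pathway and cell.immortal


def fuse(b_cell, myeloma):
    """Model a hybrid as having any trait carried by either parent (assumption)."""
    return Cell(
        "hybridoma",
        b_cell.has_salvage_pathway or myeloma.has_salvage_pathway,
        b_cell.immortal or myeloma.immortal,
    )


b_cell = Cell("B-cell", has_salvage_pathway=True, immortal=False)
myeloma = Cell("myeloma", has_salvage_pathway=False, immortal=True)
hybridoma = fuse(b_cell, myeloma)

survivors = [c.name for c in (b_cell, myeloma, hybridoma) if survives_hat(c)]
print(survivors)
```

  • Only the hybrid satisfies both conditions, which is the outcome the text describes for HAT selection.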
  • This culturing provides a population of hybridomas from which specific hybridomas are selected. Typically, selection of hybridomas is performed by culturing the cells by single-clone dilution in microtiter plates, followed by testing the individual clonal supernatants (after about two to three weeks) for the desired reactivity.
  • the assay should be sensitive, simple and rapid, such as radioimmunoassays, enzyme immunoassays, cytotoxicity assays, plaque assays, dot immunobinding assays, and the like.
  • the selected hybridomas would then be serially diluted and cloned into individual antibody-producing cell lines, which clones can then be propagated indefinitely to provide mAbs
  • the cell lines may be exploited for mAb production in two basic ways
  • a sample of the hybridoma can be injected (often into the peritoneal cavity) into a histocompatible animal of the type that was used to provide the somatic and myeloma cells for the original fusion
  • the injected animal develops tumors secreting the specific mAb produced by the fused cell hybrid
  • the body fluids of the animal, such as serum or ascites fluid, can then be tapped to provide mAbs in high concentration.
  • the individual cell lines could also be cultured in vitro, where the mAbs are naturally secreted into the culture medium from which they can be readily obtained in high concentrations. mAbs produced by either means may be further purified, if desired, using filtration, centrifugation and various chromatographic methods such as HPLC or affinity chromatography.
  • native and synthetically-derived peptides and peptide epitopes of the invention will find utility as immunogens, e.g., in connection with vaccine development, or as antigens in immunoassays for the detection of reactive antibodies.
  • preferred immunoassays of the invention include the various types of enzyme linked immunosorbent assays (ELISAs), as are known to those of skill in the art.
  • The utility of LYST-derived proteins and peptides is not limited to such assays; other useful embodiments include RIAs and other non-enzyme linked antibody binding assays and procedures.
  • proteins or peptides incorporating LYST, rLYST, or LYST- derived protein antigen sequences are immobilized onto a selected surface, preferably a surface exhibiting a protein affinity, such as the wells of a polystyrene microtiter plate.
  • a nonspecific protein that is known to be antigenically neutral with regard to the test antisera, such as bovine serum albumin (BSA) or casein, onto the well.
  • the immobilizing surface is contacted with the antisera or clinical or biological extract to be tested in a manner conducive to immune complex (antigen/antibody) formation.
  • Such conditions preferably include diluting the antisera with diluents such as BSA, bovine gamma globulin (BGG) and phosphate buffered saline (PBS)/Tween™. These added agents also tend to assist in the reduction of nonspecific background.
  • the layered antisera is then allowed to incubate for, e.g., from 2 to 4 hours, at temperatures preferably on the order of about 25° to about 27°C.
  • the antisera-contacted surface is washed so as to remove non-immunocomplexed material.
  • a preferred washing procedure includes washing with a solution such as PBS/Tween™, or borate buffer. Following formation of specific immunocomplexes between the test sample and the bound antigen, and subsequent washing, the occurrence and the amount of immunocomplex formation may be determined by subjecting the complex to a second antibody having specificity for the first.
  • the second antibody will preferably be an antibody having specificity for human antibodies.
  • the second antibody will preferably have an associated detectable label, such as an enzyme label, that will generate a signal, such as color development upon incubating with an appropriate chromogenic substrate.
  • the amount of label is quantified by incubation with a chromogenic substrate such as urea and bromocresol purple or 2,2'-azino-di-(3-ethyl-benzthiazoline)-6-sulfonic acid (ABTS) and H2O2, in the case of peroxidase as the enzyme label. Quantitation is then achieved by measuring the degree of color generation, e.g., using a visible spectrum spectrophotometer.
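  • As an illustration of how spectrophotometer readings might be turned into a positive or negative call, the sketch below scores a well against negative-control wells. The text states only that color generation is measured; the cutoff rule used here (mean of the negative controls plus three standard deviations) is a common laboratory convention adopted as an assumption, and the function names and absorbance values are hypothetical.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_controls, n_sd=3.0):
    """Absorbance cutoff: mean of negative controls plus n_sd standard
    deviations (an assumed convention, not taken from the specification)."""
    if len(negative_controls) < 2:
        raise ValueError("need at least two negative-control readings")
    return mean(negative_controls) + n_sd * stdev(negative_controls)


def is_positive(sample_absorbance, negative_controls, n_sd=3.0):
    """True if the sample reading exceeds the negative-control cutoff."""
    return sample_absorbance > elisa_cutoff(negative_controls, n_sd)


# Hypothetical absorbance readings from negative-control wells.
negatives = [0.05, 0.06, 0.07]
print(is_positive(0.50, negatives))
print(is_positive(0.08, negatives))
```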
  • ELISAs may be used in conjunction with the invention
  • proteins or peptides incorporating antigenic sequences of the present invention are immobilized onto a selected surface, preferably a surface exhibiting a protein affinity such as the wells of a polystyrene microtiter plate
  • a nonspecific protein that is known to be antigenically neutral with regard to the test antisera such as bovine serum albumin (BSA), casein or solutions of powdered milk
  • the anti-LYST protein antibodies of the present invention are particularly useful for the isolation of LYST protein antigens by immunoprecipitation. Immunoprecipitation involves the separation of the target antigen component from a complex mixture, and is used to discriminate or isolate minute amounts of protein.
  • the antibodies of the present invention are useful for the close juxtaposition of two antigens. This is particularly useful for increasing the localized concentration of antigens, e.g., enzyme-substrate pairs.
  • compositions of the present invention will find great use in immunoblot or western blot analysis.
  • the anti-LYST antibodies may be used as high-affinity primary reagents for the identification of proteins immobilized onto a solid support matrix, such as nitrocellulose, nylon or combinations thereof. In conjunction with immunoprecipitation, followed by gel electrophoresis, these may be used as a single step reagent for use in detecting antigens against which secondary reagents used in the detection of the antigen cause an adverse background.
  • the antigens studied are immunoglobulins (precluding the use of immunoglobulins binding bacterial cell wall components), the antigens studied cross-react with the detecting agent, or they migrate at the same relative molecular weight as a cross-reacting signal.
  • Immunologically-based detection methods in conjunction with Western blotting are considered to be of particular use in this regard
  • compositions disclosed herein may be orally administered, for example, with an inert diluent or with an assimilable edible carrier, or they may be enclosed in hard or soft shell gelatin capsules, or they may be compressed into tablets, or they may be incorporated directly with the food of the diet
  • the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like.
  • Such compositions and preparations should contain at least 0.1% of active compound.
  • the percentage of active compound in the compositions and preparations may, of course, be varied and may conveniently be between about 2 and about 60% of the weight of the unit
  • amount of active compounds in such therapeutically useful compositions is such that a suitable dosage will be obtained.
  • The tablets, troches, pills, capsules and the like may also contain the following:
  • a binder such as gum tragacanth, acacia, cornstarch, or gelatin
  • excipients such as dicalcium phosphate
  • a disintegrating agent such as corn starch, potato starch, alginic acid and the like
  • a lubricant such as magnesium stearate
  • a sweetening agent such as sucrose, lactose or saccharin may be added or a flavoring agent, such as peppermint, oil of wintergreen, or cherry flavoring.
  • the dosage unit form may contain, in addition to materials of the above type, a liquid carrier.
  • Various other materials may be present as coatings or to otherwise modify the physical form of the dosage unit.
  • tablets, pills, or capsules may be coated with shellac, sugar or both.
  • a syrup or elixir may contain the active compounds, sucrose as a sweetening agent, methyl and propylparabens as preservatives, and a dye and flavoring, such as cherry or orange flavor.
  • any material used in preparing any dosage unit form should be pharmaceutically pure and substantially non-toxic in the amounts employed.
  • the active compounds may be incorporated into sustained-release preparations and formulations.
  • the active compounds may also be administered parenterally or intraperitoneally.
  • Solutions of the active compounds as free base or pharmacologically acceptable salts can be prepared in water suitably mixed with a surfactant, such as hydroxypropylcellulose.
  • Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms.
  • the pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions.
  • the form must be sterile and must be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms, such as bacteria and fungi.
  • the carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils.
  • the proper fluidity can be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants.
  • the prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like.
  • isotonic agents, for example, sugars or sodium chloride
  • Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
  • Sterile injectable solutions are prepared by incorporating the active compounds in the required amount in the appropriate solvent with various of the other ingredients enumerated above, as required, followed by filtered sterilization.
  • dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the required other ingredients from those enumerated above.
  • the preferred methods of preparation are vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
  • pharmaceutically acceptable carrier includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like.
  • the use of such media and agents for pharmaceutical active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredient, its use in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions.
  • the polypeptide may be incorporated with excipients and used in the form of non-ingestible mouthwashes and dentifrices
  • a mouthwash may be prepared incorporating the active ingredient in the required amount in an appropriate solvent, such as a sodium borate solution (Dobell's Solution).
  • the active ingredient may be incorporated into an antiseptic wash containing sodium borate, glycerin and potassium bicarbonate
  • the active ingredient may also be dispersed in dentifrices, including gels, pastes, powders and slurries.
  • the active ingredient may be added in a therapeutically effective amount to a paste dentifrice that may include water, binders, abrasives, flavoring agents, foaming agents, and humectants.
  • pharmaceutically-acceptable refers to molecular entities and compositions that do not produce an allergic or similar untoward reaction when administered to a human.
  • the preparation of an aqueous composition that contains a protein as an active ingredient is well understood in the art.
  • injectables are prepared either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection can also be prepared
  • the preparation can also be emulsified
  • composition can be formulated in a neutral or salt form
  • Pharmaceutically-acceptable salts include the acid addition salts (formed with the free amino groups of the protein) and which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like
  • Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, histidine, procaine and the like
  • solutions will be administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective
  • the formulations are easily administered in a variety of dosage forms such as injectable solutions, drug release capsules and the like
  • for parenteral administration in an aqueous solution, for example, the solution should be suitably buffered if necessary and the liquid diluent first rendered isotonic with sufficient saline or glucose
  • aqueous solutions are especially suitable for intravenous, intramuscular, subcutaneous and intraperitoneal administration
  • sterile aqueous media which can be employed will be known to those of skill in the art in light of the present disclosure
  • one dosage could be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or injected at the proposed site of infusion, (see for example, "Remington's Pharmaceutical Sciences” 15th Edition, pages 1035-1038 and 1570-1580)
  • the present invention is also directed to protein or peptide compositions, free from total cells and other peptides, which comprise a purified protein or peptide which incorporates an epitope that is immunologically cross-reactive with one or more of the antibodies of the present invention
  • incorporating an epitope(s) that is immunologically cross-reactive with one or more anti-LYST protein antibodies is intended to refer to a peptide or protein antigen which includes a primary, secondary or tertiary structure similar to an epitope located within a LYST polypeptide.
  • the level of similarity will generally be to such a degree that monoclonal or polyclonal antibodies directed against the LYST polypeptide will also bind to, react with, or otherwise recognize, the cross-reactive peptide or protein antigen.
  • Various immunoassay methods may be employed in conjunction with such antibodies, such as, for example, Western blotting, ELISA, RIA, and the like, all of which are known to those of skill in the art.
  • the identification of LYST-derived epitopes, such as those derived from the LYST gene or LYST-like gene products and/or their functional equivalents, suitable for use in vaccines is a relatively straightforward matter
  • the methods described in several other papers, and software programs based thereon, can also be used to identify epitopic core sequences (see, for example, Jameson and Wolf, 1988; Wolf et al, 1988; U.S. Patent Number 4,554,101).
  • the amino acid sequence of these "epitopic core sequences" may then be readily incorporated into peptides, either through the application of peptide synthesis or recombinant technology.
  • Preferred peptides for use in accordance with the present invention will generally be on the order of about 5 to about 25 amino acids in length, and more preferably about 8 to about 20 amino acids in length. It is proposed that shorter antigenic peptide sequences will provide advantages in certain circumstances, for example, in the preparation of vaccines or in immunologic detection assays. Exemplary advantages include the ease of preparation and purification, the relatively low cost and improved reproducibility of production, and advantageous biodistribution.
  • an epitopic core sequence is a relatively short stretch of amino acids that is "complementary" to, and therefore will bind, antigen binding sites on LYST protein epitope-specific antibodies. Additionally or alternatively, an epitopic core sequence is one that will elicit antibodies that are cross-reactive with antibodies directed against the peptide compositions of the present invention. It will be understood that in the context of the present disclosure, the term "complementary" refers to amino acids or peptides that exhibit an attractive force towards each other. Thus, certain epitope core sequences of the present invention may be operationally defined in terms of their ability to compete with or perhaps displace the binding of the desired protein antigen with the corresponding protein-directed antisera
  • the size of the polypeptide antigen is not believed to be particularly crucial, so long as it is at least large enough to carry the identified core sequence or sequences
  • the smallest useful core sequence expected by the present disclosure would generally be on the order of about 5 amino acids in length, with sequences on the order of 8 or 25 being more preferred
  • this size will generally correspond to the smallest peptide antigens prepared in accordance with the invention
  • the size of the antigen may be larger where desired, so long as it contains a basic epitopic core sequence.
  • Suitable competition assays include protocols based upon immunohistochemical assays, ELISAs, RIAs, Western or dot blotting and the like
  • one of the binding components, generally the known element, such as the LYST gene product or LYST-derived peptides, or a known antibody, will be labeled with a detectable label and the test components, that generally remain unlabeled, will be tested for their ability to reduce the amount of label that is bound to the corresponding reactive antibody or antigen
  • in a competition study between LYST and any test antigen, one would first label LYST with a detectable label, such as, e.g., biotin or an enzymatic, radioactive or fluorogenic label, to enable subsequent identification
  • the known antibody would be immobilized, e.g., by attaching to an ELISA plate
  • the ability of the mixture to bind to the antibody would be determined by detecting the presence of the specifically bound label. This value would then be compared to a control value in which no potentially competing (test) antigen was included in the incubation
  • the assay may be any one of a range of immunological assays based upon hybridization, and the reactive antigens would be detected by means of detecting their label, e.g., using streptavidin in the case of biotinylated antigens or by using a chromogenic substrate in connection with an enzymatic label or by simply detecting a radioactive or fluorescent label
  • An antigen that binds to the same antibody as LYST, for example, will be able to effectively compete for binding to and thus will significantly reduce LYST binding, as evidenced by a reduction in the amount of label detected
  • the reactivity of the labeled antigen, e.g., a LYST composition, in the absence of any test antigen would be the control high value
  • the control low value would be obtained by incubating the labeled antigen with an excess of unlabeled LYST antigen, when competition would occur and reduce binding
  • a significant reduction in labeled antigen reactivity in the presence of a test antigen is indicative of a test antigen that is "cross-reactive", i.e., that has binding affinity for the same antibody.
  • a significant reduction in terms of the present application, may be defined as a reproducible (i.e., consistently observed) reduction in binding.
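  • The comparison of a test reading against the control high and control low values described above can be reduced to a simple calculation. The following is a minimal sketch only and is not part of the disclosure: the function name, the scaling to a 0 to 100% range, and the example readings are illustrative assumptions.

```python
def percent_inhibition(test_signal, control_high, control_low):
    """Express the loss of bound label caused by a test antigen as a
    percentage of the window between the two controls.

    control_high: labeled antigen alone (no competing antigen).
    control_low:  labeled antigen plus an excess of unlabeled LYST antigen.
    """
    window = control_high - control_low
    if window <= 0:
        raise ValueError("control_high must exceed control_low")
    return 100.0 * (control_high - test_signal) / window


# Hypothetical absorbance readings, for illustration only.
readings = {"test antigen A": 0.35, "test antigen B": 1.15}
for name, signal in readings.items():
    value = percent_inhibition(signal, control_high=1.20, control_low=0.20)
    print(name, round(value, 1))
```

  • A reading close to the control low value gives a result near 100%, and a reading close to the control high value gives a result near 0%. What counts as a significant reduction remains a matter of reproducibility, as stated above, rather than a fixed numeric cutoff.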
  • peptidyl compounds described herein may be formulated to mimic the key portions of the peptide structure.
  • Such compounds which may be termed peptidomimetics, may be used in the same manner as the peptides of the invention and hence are also functional equivalents.
  • the generation of a structural functional equivalent may be achieved by the techniques of modelling and chemical design known to those of skill in the art. It will be understood that all such sterically similar constructs fall within the scope of the present invention.
  • Syntheses of epitopic sequences, or peptides which include an antigenic epitope within their sequence are readily achieved using conventional synthetic techniques such as the solid phase method (e.g., through the use of a commercially-available peptide synthesizer such as an Applied Biosystems Model 430A Peptide Synthesizer). Peptide antigens synthesized in this manner may then be aliquoted in predetermined amounts and stored in conventional manners, such as in aqueous solutions or, even more preferably, in a powder or lyophilized state pending use.
  • peptides may be readily stored in aqueous solutions for fairly long periods of time if desired, e.g., up to six months or more, in virtually any aqueous solution without appreciable degradation or loss of antigenic activity.
  • agents including buffers such as Tris or phosphate buffers to maintain a pH of about 7.0 to about 7.5.
  • agents which will inhibit microbial growth such as sodium azide or Merthiolate.
  • when the peptides are stored in a lyophilized or powdered state, they may be stored virtually indefinitely, e.g., in metered aliquots that may be rehydrated with a predetermined amount of water (preferably distilled) or buffer prior to use.
  • Site-specific mutagenesis is a technique useful in the preparation of individual peptides, or biologically functional equivalent proteins or peptides, through specific mutagenesis of the underlying DNA.
  • the technique, well-known to those of skill in the art, further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed.
  • a primer of about 14 to about 25 nucleotides in length is preferred, with about 5 to about 10 residues on both sides of the junction of the sequence being altered.
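  • The primer geometry described above (the altered bases flanked by about 5 to about 10 unchanged residues on each side) can be sketched as follows. This is an illustrative assumption only: the function name, the 0-based indexing, and the example template are hypothetical, and the primer is returned in the sense of the strand supplied, so its reverse complement would be needed to anneal to that same strand.

```python
def mutagenic_primer(template, position, new_bases, flank=10):
    """Return a primer carrying `new_bases` in place of the template bases
    that start at 0-based `position`, flanked on each side by `flank`
    unchanged template nucleotides."""
    if not 5 <= flank <= 10:
        raise ValueError("flank should be about 5 to 10 nucleotides")
    end = position + len(new_bases)
    if position - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence for the flanks")
    left = template[position - flank:position]
    right = template[end:end + flank]
    return left + new_bases + right


# Hypothetical 40-nt template; replace the single base at index 20 with T.
template = "ATGGCTAGCAAGCTTGGATCCGAATTCCTGCAGGTCGACT"
primer = mutagenic_primer(template, position=20, new_bases="T", flank=10)
print(primer, len(primer))
```

  • With a single altered base and 10-nt flanks the primer is 21 nucleotides long, which falls inside the preferred range of about 14 to about 25 nucleotides.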
  • the technique of site-specific mutagenesis is well known in the art, as exemplified by various publications.
  • the technique typically employs a phage vector which exists in both a single stranded and double stranded form
  • Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage. These phage are readily commercially-available and their use is generally well-known to those skilled in the art.
  • Double-stranded plasmids are also routinely employed in site-directed mutagenesis, which eliminates the step of transferring the gene of interest from a plasmid to a phage
  • site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector or melting apart the two strands of a double-stranded vector which includes within its sequence a DNA sequence which encodes the desired peptide
  • An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically.
  • This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand.
  • a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation.
  • This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected which include recombinant vectors bearing the mutated sequence arrangement.
  • the preparation of sequence variants of the selected peptide-encoding DNA segments using site-directed mutagenesis is provided as a means of producing potentially useful species and is not meant to be limiting, as there are other ways in which sequence variants of peptides and the DNA sequences encoding them may be obtained.
  • recombinant vectors encoding the desired peptide sequence may be treated with mutagenic agents, such as hydroxylamine, to obtain sequence variants
  • BIOLOGICAL FUNCTIONAL EQUIVALENTS: Modifications and changes may be made in the structure of the peptides of the present invention and DNA segments which encode them and still obtain a functional molecule that encodes a protein or peptide with desirable characteristics.
  • the following is a discussion based upon changing the amino acids of a protein to create an equivalent, or even an improved, second-generation molecule.
  • the amino acid changes may be achieved by changing the codons of the DNA sequence, according to the codon chart listed in TABLE 1.
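  • As an illustration of choosing a codon change for a desired amino acid change, the sketch below looks up the codon for the target residue that differs from the existing codon at the fewest positions. TABLE 1 is not reproduced in this section, so the sketch assumes the standard genetic code; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def closest_codon(codon, target_amino_acid):
    """Codon for the target residue with the fewest base differences
    from the existing codon (first match wins on ties)."""
    def differences(candidate):
        return sum(a != b for a, b in zip(codon, candidate))
    return min(codons_for(target_amino_acid), key=differences)


# Lysine (AAA) to arginine, one of the exemplary conservative substitutions.
print(CODON_TABLE["AAA"], "->", closest_codon("AAA", "R"))
```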
  • amino acids may be substituted for other amino acids in a protein structure without appreciable loss of interactive binding capacity with structures such as, for example, antigen-binding regions of antibodies or binding sites on substrate molecules. Since it is the interactive capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid sequence substitutions can be made in a protein sequence, and, of course, its underlying DNA coding sequence, and nevertheless obtain a protein with like properties. It is thus contemplated by the inventors that various changes may be made in the peptide sequences of the disclosed compositions, or corresponding DNA sequences which encode said peptides, without appreciable loss of their biological utility or activity.
  • the hydropathic index of amino acids may be considered. The importance of the hydropathic amino acid index in conferring interactive biologic function on a protein is generally understood in the art (Kyte and Doolittle, 1982, incorporated herein by reference). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant protein, which in turn defines the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like
  • Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics (Kyte and Doolittle, 1982); these are isoleucine (+4.5), valine (+4.2), leucine (+3.8), phenylalanine (+2.8), cysteine/cystine (+2.5), methionine (+1.9), alanine (+1.8), glycine (-0.4), threonine (-0.7), serine (-0.8), tryptophan (-0.9),
  • hydrophilicity values have been assigned to amino acid residues: arginine (+3.0), lysine (+3.0), aspartate (+3.0 ± 1), glutamate (+3.0 ± 1), serine (+0.3), asparagine (+0.2), glutamine (+0.2), glycine (0), threonine (-0.4), proline (-0.5 ± 1), alanine (-0.5), histidine (-0.5), cysteine (-1.0), methionine (-1.3), valine (-1.5), leucine (-1.8), isoleucine (-1.8), tyrosine (-2.3), phenylalanine (-2.5), tryptophan (-3.4).
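  • One way the hydrophilicity values listed above might be applied is to average them over a sliding window and report the most hydrophilic stretch as a candidate epitopic region. The following is a minimal sketch under stated assumptions: the six-residue window, the function name, and the example peptide are illustrative choices, and the ± 1 ranges are ignored in favor of the central values.

```python
# Central hydrophilicity values as listed above, keyed by one-letter code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (start_index, peptide, mean_value) for the window with the
    highest mean hydrophilicity; the earliest window wins on ties."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best = None
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        mean = sum(HYDROPHILICITY[aa] for aa in peptide) / window
        if best is None or mean > best[2]:
            best = (start, peptide, mean)
    return best


# Hypothetical 16-residue peptide, for illustration only.
print(most_hydrophilic_window("MLVFAGKDERSTWLIV"))
```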
  • amino acid substitutions are generally therefore based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions which take various of the foregoing characteristics into consideration are well known to those of skill in the art and include arginine and lysine, glutamate and aspartate, serine and threonine, glutamine and asparagine, and valine, leucine and isoleucine.
  • Positional cloning represents an approach to disease gene identification based solely upon chromosomal location. In the 10 years since its inception, positional cloning has become established as a general, relatively efficient mode of identification of genes causing mammalian Mendelian disorders (Collins, 1995).
  • proximal mouse Chr 13 adjacent to the extra-toes (Xt) locus is rich in mutant phenotypes, and represents an interval where a regional approach to disease gene identification may be synergistic. Xt is homologous to the human disorder Greig cephalopolysyndactyly; using a positional candidate approach, mutations in a zinc-finger gene (Gli3) were shown to underlie Xt (Vortkamp et al, 1992; Hui and Joyner, 1993). Very close to Xt lies the recessive mutation progressive motor neuronopathy (pmn), a model for Werdnig-Hoffmann spinal muscular atrophy (0 recombinants in 246 meioses; Brunialti et al, 1995). The recessive mutation crinkled (cr) maps approximately 2 cM proximal to Xt (23 recombinants in 1197 meioses; Swank et al, 1991; Lyon et al, 1967).
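  • The recombinant counts quoted above can be turned into approximate map distances as follows. This is a sketch under stated assumptions, not the authors' calculation: it equates 1% recombination with 1 cM (reasonable only for short intervals), and for the zero-recombinant case it reports a one-sided upper confidence limit with an assumed 95% level.

```python
def map_distance_cM(recombinants, meioses):
    """Recombination fraction expressed in centimorgans,
    treating 1% recombination as 1 cM (short intervals only)."""
    return 100.0 * recombinants / meioses


def upper_bound_cM(meioses, confidence=0.95):
    """Upper confidence limit on map distance when no recombinants are
    observed: solves (1 - r) ** meioses = 1 - confidence for r."""
    r = 1.0 - (1.0 - confidence) ** (1.0 / meioses)
    return 100.0 * r


# cr to Xt: 23 recombinants in 1197 meioses.
print(round(map_distance_cM(23, 1197), 2))
# pmn to Xt: 0 recombinants in 246 meioses.
print(round(upper_bound_cM(246), 2))
```

  • The first value is consistent with the approximately 2 cM quoted for cr; the second indicates how close pmn must lie to Xt given that no recombinants were seen.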
  • (2) bg is associated with a characteristic cellular phenotype (giant, perinuclear, dysfunctional lysosomes) offering the possibility of screening candidate genes by genetic complementation, and
  • a mouse genomic DNA library constructed in the vector pYAC4 (Kusumi et al, 1993; Research Genetics Inc.) was screened by PCR™ with primers derived from STS flanking bg. False positive PCR™ products were minimized by raising annealing temperatures, and addition of an enhancer of polymerase specificity as necessary (Perfect Match, Stratagene, La Jolla, CA). Veracity of PCR™ products was checked by product digestion with suitable restriction endonucleases, and by inclusion of control yeast DNA in all PCR™ reactions. Individual colonies of yeast clones containing YACs of interest were isolated on plates and frozen in 50% glycerol to prevent occurrence of microdeletions.
  • YAC clones were grown in liquid YPD medium, converted to spheroplasts at exponential growth using Zymolyase (ICN Pharmaceuticals, Costa Mesa, CA), and chromosomal DNA purified in agarose.
  • YAC DNA was separated from host yeast chromosomes using preparative pulsed field gel electrophoresis (PFGE) with low melting point agarose (SeaPlaque™ GTG, FMC Bioproducts, Rockland, ME), and excised with a sterile blade.
  • a mouse genomic DNA library constructed in the vector P1 (Pierce et al, 1992; Genome Systems Inc., St. Louis, MO) was screened by PCR™ with primers derived from STS flanking bg. Stabs corresponding to positive clones were streaked on kanamycin plates, and DNA prepared from individual colonies as described (Pierce et al, 1992).
  • Blocks were then washed, treated with phenylmethylsulfonylfluoride, washed again, and digested with 2-10 units/µg DNA of restriction endonucleases (Boehringer-Mannheim Biochemicals, Indianapolis, IN), if necessary. PFGE was carried out in 1% agarose gels (Fastlane, FMC BioProducts) at 14°C in 1X TBE using a Gene Navigator unit (Pharmacia, Piscataway, NJ). Separation of 50-1500 kb DNA molecules was achieved using pulses ramped from 70-145 sec at 145 V for 46 h.
  • IRE-PCR™ was performed essentially as described using mouse B1 repetitive element primers and PFGE-purified YAC DNA as template (Hunter et al, 1993; Simmler et al, 1991)
  • the B1 repetitive element-specific primers used were 5'-CCAGGACACCAGGGCTACAGAG-3' (SEQ ID NO:75) (forward primer, derived from the 3'-end of B1) and/or 5'-CCCGAGTGCTGGGATTAAAG-3' (SEQ ID NO:76) (reverse primer, derived from the 5'-end of B1).
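  • Basic properties of the two B1 primers given above can be checked with a few lines of code. This is a sketch for orientation only: the helper names are hypothetical, and the 2 °C per A/T plus 4 °C per G/C rule used here is a rough rule of thumb rather than the method by which the annealing temperatures in this disclosure were chosen.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    return sum(base in "GC" for base in primer) / len(primer)


def rough_tm(primer):
    """Rule-of-thumb melting temperature in deg C:
    2 per A or T plus 4 per G or C."""
    gc = sum(base in "GC" for base in primer)
    return 2 * (len(primer) - gc) + 4 * gc


def reverse_complement(primer):
    """Sequence of the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


B1_FORWARD = "CCAGGACACCAGGGCTACAGAG"  # SEQ ID NO:75
B1_REVERSE = "CCCGAGTGCTGGGATTAAAG"    # SEQ ID NO:76

for name, seq in (("forward", B1_FORWARD), ("reverse", B1_REVERSE)):
    print(name, len(seq), round(gc_fraction(seq), 2), rough_tm(seq))
```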
  • Inter-B1 PCR™ was performed with the forward primer alone, the reverse primer alone, or both primers together.
  • PCR™ amplification reactions were performed using 40 ng of YAC DNA, 1 µM of each primer, and 200 µM of each dNTP in a 20 µl reaction. Cycling parameters were 95°C for 2 min, followed by 32 cycles of 94°C for 20 sec, 55°C for 30 sec, and 72°C for 2 min.
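  • The final concentrations stated above translate into pipetting volumes once stock concentrations are fixed. The sketch below applies the usual dilution relation (stock concentration x stock volume = final concentration x final volume); the stock concentrations of 10 µM primer, 2 mM each dNTP, and 20 ng/µl template are assumptions made purely for illustration and are not given in the source.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock needed so that the reaction reaches final_conc.
    final_conc and stock_conc must be in the same units."""
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * reaction_volume_ul / stock_conc


REACTION_UL = 20.0

# Assumed stock concentrations (hypothetical, see lead-in).
primer_ul = stock_volume_ul(1.0, 10.0, REACTION_UL)      # uM
dntp_ul = stock_volume_ul(200.0, 2000.0, REACTION_UL)    # uM
template_ul = 40.0 / 20.0                                 # ng / (ng per ul)

print("each primer:", primer_ul, "ul")
print("dNTP mix:", dntp_ul, "ul")
print("template:", template_ul, "ul")
```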
  • IRE-PCR™ products were isolated either by band excision from low-melting agarose gels, or by TA subcloning (Invitrogen). IRE-PCR™ products were sequenced, screened for the presence of common mouse repetitive element sequences, and nonrepetitive regions of the sequence used to design oligonucleotides suitable for sequence tagged sites (STS).
  • cDNA was generated from mouse spleen by reverse transcription using random- and oligo(dT)-priming, ligated to amplification cassettes, and PCR™ amplified.
  • Preparative PFGE was used to purify YAC 195A8 DNA, which was biotin-labelled, denatured, and hybridized in solution to the denatured cDNA pool.
  • Direct selection amplicons were cycle sequenced with standard M13 forward and reverse primers. Oligonucleotides suitable for STS were designed using direct selection product sequences
  • PCR™ amplification reactions were performed using 40 ng of template DNA (YAC clone, P1 clone, S. cerevisiae strain 1380, or C57BL/6J genomic DNA), 1 µM of each primer, and 200 µM of each dNTP in a 20 µl reaction as described (Barbosa et al, 1995). Cycling parameters were 95°C for 2 min, followed by 34 cycles of 94°C for 20 sec, 45-58°C for 30 sec, and 72°C for 20 sec.
  • Amplification products were separated on 3% agarose gels, and visualized by ethidium bromide staining, or by end-labeling one of the primers using [γ-32P]ATP and T4 polynucleotide kinase, and separation of products on 6% denaturing polyacrylamide gels, with autoradiographic visualization. Simple sequence length polymorphism (SSLP) primers were as described (Dietrich et al, 1994; Research Genetics Inc., Huntsville, AL). Novel STS primer sequences, amplicon sizes, and annealing temperatures are summarized in Table 2.
  • PCR™ using markers genetically mapped within the bg critical region.
  • YAC clone sizes as determined by PFGE, Southern blotting and hybridization with pBR322, are illustrated in FIG. 1.
  • YAC clones were examined for chimerism, microdeletions, and overlaps by STS content mapping.
  • SSLP were the first source of STS to be utilized.
  • the genomic region encompassing bg is particularly rich in such SSLP (38 have been localized within a 2 cM interval containing bg; Dietrich et al, 1994). Additional proximal chromosome 13 STS were generated using IRE-PCR™ and direct selection.
  • IRE-PCR™ represents a rapid and facile method with which to saturate a genomic region with novel STS for initial characterization of YAC clones and contig development (Hunter et al, 1993; Simmler et al, 1991).
  • IRE-PCR™ was performed using YAC DNA as template and primers derived from ends of the mouse repetitive element B1 which were oriented in opposite directions.
  • IRE-PCR™ products were subcloned, sequenced, and nonrepetitive regions used to design oligonucleotides suitable for sequence tagged sites.
  • 12 novel STS (D13Sfk1-D13Sfk12) were developed by this method (Table 2), and physically assigned to Chr 13 YAC and P1 clones by PCR™ (FIG. 2).
  • Nid cDNA fragments among these products confirmed the efficacy of the selection procedure in enriching for YAC 195A8-encoded genes. Furthermore, of 8 STS corresponding to novel direct selection products, 7 mapped back to YAC 195A8 by PCR™ analysis (D13Sfk13-D13Sfk19; Table 2, FIG. 2). D13Sfk13 and D13Sfk18 also hybridized sufficiently well to Southern blots to permit physical mapping adjacent to Nid on a polymorphic NotI fragment (1100-kb in DBA/2J DNA and 1150-kb in SB/LeJ DNA).
  • D13Sfk13 was also genetically mapped within the bg critical region in 504 backcross mice [C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/Ei)F1] using a TaqI polymorphism.
  • YAC and P1 clones were typed for the presence or absence of STS derived from SSLP, IRE-PCR™ amplicons, and direct selection products.
  • STS content mapping enabled examination of clones for chimerism and microdeletions.
  • One YAC clone, 64F5, was chimeric. This YAC, while 580-kb in size (FIG. 1), contained only D13Mit44, and not STS derived from the 5'- or 3'-ends of Nid (FIG. 2).
  • Another YAC clone (84A8) contained an internal deletion which included D13Sfk6 (FIG. 2). Furthermore, the physical size of 84A8 (370-kb) was considerably smaller than expected: the distance between the other genetic markers it encompassed was approximately 600-kb, confirming a substantial genomic deletion within this YAC. Some YAC clones have been reported to be unstable in culture, and become progressively smaller with time (Nehls et al, 1995). YAC 84A8 may exhibit such instability.
  • STS content mapping also enabled ordering of YAC and P1 clones within the bg critical region and integration of clones into 2 contigs (FIG. 2).
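  • The grouping of clones into contigs by shared STS content can be illustrated with a small sketch. Everything in it is hypothetical: the clone names, the STS names, and the rule that a single shared STS is enough to link two clones are assumptions for illustration, and the real assignments are those shown in FIG. 2.

```python
def build_contigs(sts_content):
    """Group clones into contigs: clones sharing at least one STS are
    linked, and each connected group of linked clones forms one contig."""
    unvisited = set(sts_content)
    contigs = []
    for clone in sts_content:
        if clone not in unvisited:
            continue
        unvisited.discard(clone)
        stack, members = [clone], []
        while stack:
            current = stack.pop()
            members.append(current)
            # Clones not yet placed that share an STS with the current one.
            linked = [c for c in unvisited
                      if sts_content[c] & sts_content[current]]
            for c in linked:
                unvisited.discard(c)
                stack.append(c)
        contigs.append(sorted(members))
    return contigs


# Hypothetical STS typing results (clone -> set of STS that amplified).
sts_content = {
    "YAC_A": {"sts1", "sts2"},
    "YAC_B": {"sts2", "sts3"},
    "P1_C": {"sts3"},
    "YAC_D": {"sts7", "sts8"},
    "YAC_E": {"sts8", "sts9"},
}
print(build_contigs(sts_content))
```

  • A clone that carries STS known to be far apart but lacks the STS between them would be flagged separately as a candidate for an internal deletion or chimerism, as was done for the clones discussed above.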
  • Contig 1 comprised 7 YAC and 2 P1 clones, extended from D13Sfk19 to D13Sfk2, and was approximately 1150-kb in length. The orientation of this contig with respect to the centromere was not established.
  • the second contig (contig 2) consisted of 2 YAC clones. It extended from D13Mit207 (proximal) to D13Sfk10 (distal), and was approximately 1000-kb in length. Contig 2 spanned the crossover defining the distal border of the bg critical region (FIG. 2).
  • Direct selection products identified from YAC 195A8 using splenocyte cDNA not only allowed STS content mapping of Chr 13 YACs, but also constitute candidate genes for bg and cr.
  • Novel Sequence Tagged Sites Isolated from bg Critical Region YACs by IRE-PCR™ (D13Sfk1-D13Sfk12) or Direct Selection (D13Sfk13-D13Sfk19)
  • This example illustrates the generation of a high resolution genetic map of proximal Chr 13 in the vicinity of bg, and the identification of two genes which are tightly linked to bg.
  • C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1 backcross mice were bred and maintained as described (Barbosa et al., 1995).
  • C57BL/6J-bg x PA F1 x C57BL/6J-bgJ backcross mice used have been described (Holcombe et al., 1991).
  • RNA prepared from liver, spleen and kidney of C57BL/6J-+/+, C57BL/6J-bgJ, SB/LeJ-bg, and C3H/HeJ-bg2J mice using standard techniques was separated on formaldehyde agarose gels, transferred to Zeta-probe membranes (Bio-Rad Laboratories), and hybridized as previously described (Kingsmore et al., 1994).
  • RNA was prepared from liver of C57BL/6J-+/+, C57BL/6J-bgJ, SB/LeJ-bg, and C3H/HeJ-bg2J mice by extraction with phenol/guanidine isothiocyanate (TRIzol®, Gibco BRL,
  • the template for quantitative RT-PCR™ assays was 1-10 ng of first-strand cDNA, which had been synthesized from total RNA with an oligo(dT) primer and Moloney murine leukemia virus reverse transcriptase (Stratagene, La Jolla, CA)
  • the nidogen (Nid) primers used for RT-PCR™ correspond to bp 3805-3822 and bp 3938-3955 of the mouse Nid cDNA (Durkin et al., 1988)
  • the Estm9 primers used were
  • RT-PCR™ products were amplified from bg, bgJ, bg2J, and +/+ RNA with Nid primers or Estm9 primers F1-R1 or F2-R2. Quantitative RT-PCR™ of aldolase A, which is constitutively expressed, was also performed to ensure that equal amounts of bg, bgJ, bg2J, and +/+ template were used.
  • PCR™ reactions were performed in a 50 µl volume containing 1-20 ng of cDNA, 1 µM of each primer, 200 µM each dNTP, 10 mM Tris-HCl, pH 8.8, 50 mM KCl, 1.5 mM MgCl2, and 1.25 U AmpliTaq® DNA polymerase (Perkin-Elmer Cetus, Norwalk, CT). Cycling profiles consisted of an initial denaturation (94°C for 2 min) followed by 25 cycles of 94°C for 30 sec, 55-58°C for 30 sec, and 72°C for 1 minute per kb of expected product length. PCR™ products were separated by electrophoresis on agarose gels, and quantified by intensity of ethidium bromide staining.
  • PCR™ amplification reactions were performed using 40 ng of genomic DNA, 1 µM of each primer (Dietrich et al., 1994; Research Genetics, Inc., Huntsville, AL), and 200 µM of each dNTP in a 20 µl reaction as described (Barbosa et al., 1995). Cycling parameters were 95°C for 2 min, followed by 36-38 cycles of 94°C for 20 sec, 58°C for 30 sec, and 72°C for 10 sec. Where possible, amplification products (20 µl) were separated on 3% agarose gels, and visualized by ethidium bromide staining.
  • SSLP with allele sizes differing among strains by less than 8 bp were typed by end-labeling one of the primers using [γ-32P]ATP and T4 polynucleotide kinase, separation of amplification products (4 µl) on 6% denaturing polyacrylamide gels, and visualization by autoradiography. SSLP allele sizes are summarized in FIG. 3A, FIG. 3B, FIG. 3C and FIG. 3D.
  • Blocks were then washed, treated with phenylmethylsulfonylfluoride, washed again, and digested with 2-10 units/µg DNA of restriction endonucleases (Boehringer Mannheim Biochemicals).
  • PFGE was carried out in 1% agarose gels (Fastlane, FMC BioProducts) at 14°C in 1X TBE using a Gene Navigator system (Pharmacia, Piscataway, NJ). Separation of 50-1500 kb DNA molecules was achieved using pulses ramped from 70-145 sec at 145 V for 46 hr; 1000-6000 kb DNA was resolved by pulses of 15-90 min at 50 V for 6 or 10 days.
  • the third backcross was established between C57BL/6J-bgJ mice and Mus castaneus (CAST/EiJ), and 504 [C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1] progeny were generated.
  • Mus castaneus was chosen as the second parent in the latter intersubspecific backcross due to the increased likelihood of detection of DNA polymorphism in comparison to intraspecific crosses.
  • Mice were phenotyped for the presence or absence of a beige-colored coat. Penetrance of bg in all of the crosses was complete (359 of 726 backcross mice [49%] exhibited a beige-colored coat).
  • Linkage relationships were determined using segregation analysis (Green, 1981), and the best gene order decided by minimization of crossover events and elimination of double crossover events (Bishop, 1985). Haplotype analysis for each cross is shown in FIG. 3A, FIG. 3B, FIG. 3C, and FIG. 3D.
  • A composite linkage map of proximal mouse Chr 13, derived by integration of these 3 crosses, is shown in FIG. 3D.
  • the combined results delimit the region containing bg to a 0.24 ± 0.17 cM interval on Chr 13, flanked proximally by the genetic markers D13Mit172 and D13Mit239, and distally by Gli3, D13Mit56, D13Mit162, D13Mit237, D13Mit240, and D13Mit305. bg cosegregated with 6 genetic markers (Nid, Estm9, D13Mit44, D13Mit114, D13Mit134 and D13Mit207).
  • Backcross mice with recombination events which define the bg nonrecombinant interval were derived from the [C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1] backcross.
  • Southern blots were generated with DNA from 6 bg alleles: SB/LeJ-bg, C57BL/6J-bgJ,
  • Expression of Nid and Estm9 in bg mice was examined by northern blot analysis and quantitative RT-PCR™. Hybridization of northern blots of liver and kidney RNA from +/+, bg, bgJ, and bg2J with probes for Nid and Estm9 yielded signals of similar size and intensity in bg and +/+ RNA. Furthermore, no difference in amplicon size or amount was observed upon quantitative RT-PCR™ using liver or kidney RNA from +/+, bg, bgJ, and bg2J mice and oligonucleotides for Nid or Estm9, indicating expression of Nid and Estm9 to be grossly intact in bg.
  • 5.2.2.5 PHYSICAL MAPPING OF PROXIMAL MOUSE CHR 13 IN THE VICINITY OF BG
  • the DBA/2 fragment identified with Nid was 25-50 kb smaller than the corresponding band identified in control DNA (FIG. 3A, FIG. 3B, FIG. 3C, and FIG. 3D).
  • No differences in band size were observed among other strains or upon reprobing of PFGE-Southern blots with Gli3 or Estm9. Since fragment size differences were observed with many rare-cutting restriction endonucleases, including several which are methylation-insensitive, it is unlikely that they are merely interstrain differences in DNA methylation or point mutations. Instead, it is suggested that a genomic rearrangement has occurred in the DBA/2 mouse at a distance of less than 900 kb from Nid (FIG. 3D).
  • the rearrangement may represent a small (25-50 kb) genomic deletion in the DBA/2 mouse.
  • the functional significance of such a putative rearrangement is uncertain. Interestingly, a similar phenomenon was recently described in the vicinity of the human nidogen gene (Goodrich and Holcombe, 1995): upon hybridization to pulsed field gel electrophoresis Southern blots of human genomic DNA digested with SalI, nidogen identified polymorphic band sizes in Caucasian populations.
  • homozygosity for one NID allele was observed, suggesting the possibility of linkage of human CHS and NID (Goodrich and Holcombe, 1995). Definitive mapping of human CHS, however, must await identification of the mouse bg gene.
  • the interstrain differences in pulsed field restriction fragment length provide a physical landmark within the bg nonrecombinant interval.
  • bg candidate genes can be easily screened for physical linkage with Nid as a means of determining whether or not they lie within the bg nonrecombinant interval.
  • The bg locus, which is the mouse homolog of human CHS, has been localized to a genomic interval corresponding to approximately one four-hundredth of mouse Chr 13. This represents an important intermediate step in the positional cloning of bg, and thereby human CHS.
  • Quantitative reverse transcription (RT)-PCR™ demonstrated a moderate decrease in Lyst mRNA in bg and bgJ liver, and a gross reduction in bg2J (Lyst OD after normalization for β-actin mRNA: +/+, 1.00; bg2J/bg2J, 0.19; bg/bg, 0.28; bgJ/bgJ, 0.40).
  • a commensurate reduction in bg2J transcript abundance was noted by using several primer pairs derived from different regions of the Lyst cDNA. Aberrant Lyst RT-PCR™ products were not observed.
  • the molecular basis of the decrease in Lyst mRNA in bg2J is not yet known, but it is reminiscent of the leaky ablation of
  • the predicted open reading frame (ORF) of Lyst was 4,635 nucleotides, encoding a protein of 1,545 amino acids and relative molecular mass 172,500 (Mr 172.5K) (FIG. 13a).
  • Nucleotides 51-74 are rich in CG nucleotides, a common feature of the 5' region of housekeeping genes. Comparison with DNA databases indicated that Lyst is novel, and resembles only uncharacterized human expressed sequence tags (ESTs).
  • the sequence of a cDNA clone corresponding to one such human EST (Genbank accession number L77889) matched the 5' region of mouse Lyst (nucleotide identity was 76% in the 5' untranslated region (UTR) and 91% in the ORF, and amino-acid identity was 97%; FIG. 13c). Another human EST matched the 3' region of the mouse Lyst coding domain (Genbank accession number W26957).
  • the human clones identified restriction fragments that were indistinguishable from mouse Lyst1; physical mapping of the human clones to the same region of the mouse genome as Lyst indicates that they are indeed homologous to Lyst.
  • CHS and bg represent homologous disorders, as their clinical features (Blume and Wolff, 1972) and defects in lysosomal transport (Burkhardt et al., 1993) are identical. Homology of bg and CHS is supported by genetic complementation studies; fusion of fibroblasts from bg mice and CHS patients failed to reverse lysosomal abnormalities, in contrast to fusions with normal cells (Perou and Kaplan, 1993). Furthermore, recent genetic linkage studies have shown that CHS maps within a linkage group conserved between human Chromosome 1q43 and the bg region on mouse Chromosome 13.
  • LYST mutations in CHS patients were sought by sequencing LYST lymphoblast and fibroblast cDNAs corresponding to these ESTs from 10 CHS patients.
  • a single-base insertional mutation was found at nucleotides 117-118 of the LYST coding domain, resulting in a frame shift and termination after amino acid 62 (FIG. 13c).
  • the domain in stathmin that matches Lyst is helical and has heptad repeats that participate in coiled-coil interactions with other proteins (Sobel, 1991; Maucuer et al., 1995).
  • the stathmin-like region of Lyst is also predicted to be helical and to form coiled coils. However, it is the charged residues, rather than the hydrophobic ones, that are conserved between Lyst and stathmin, suggesting that the sequence similarity is not primarily due to conserved secondary structure.
  • this region of Lyst potentially encodes a coiled-coil protein-interaction domain that may regulate microtubule-mediated lysosome transport.
  • Lyst is not predicted to have transmembrane helices.
  • the C-terminal tetrapeptide (CYSP; amino acids 1,542-1,545) is strikingly similar to known prenylation sites, which could provide attachment to lysosomal/late endosomal membranes through thioester linkage with the cysteine.
  • Lyst contains 25 sites of potential phosphorylation by PKC, 36 by casein kinase II (CKII) (many of which overlap those of PKC), two by cAMP-dependent protein kinase, and one by tyrosine kinase (FIG. 13b).
  • CKII casein kinase II
  • Lyst seems to contain helical bundles with clusters of phosphorylation sites at either end. Stathmin also has an N-terminal phosphorylation site and helix motif, and these Lyst domains may have a similar 'signal relaying' function to stathmin (Sobel, 1991; Maucuer et al., 1995). Furthermore, phosphorylation of these positions could provide a control mechanism by causing a conformational shift in the bundles, thereby affecting interactions with other molecules.
  • RT-PCR™ using total RNA and by sequencing of human ESTs similar in sequence to mouse Lyst.
  • the primers used to amplify the cDNA between bp 1891 and 3050 were derived from the mouse Lyst sequence. Human primers were designed from the sequence of the PCR™ product (1159 bp) and used to amplify the flanking sequences.
  • PCR™ products were cloned using a TA cloning kit (Invitrogen Corporation, San Diego, California) and both strands were cycle sequenced. The sequences were analyzed with the GCG Package (Devereux et al., 1984) and searches of the National Center for Biotechnology Information database were performed using the BLAST network server (Altschul et al., 1990) (National Library of Medicine, via INTERNET) and the Whitehead Institute Sequence Analysis Programs (MIT, Cambridge, Massachusetts).
  • each PCR™ product was mixed with an equal volume of denaturing buffer and heated to 95°C for 3 min, after which the samples were loaded onto 0.8 mm thick, 10% native polyacrylamide gels. Gels were run at ambient temperature at 9 W for 6-10 hours, depending on the size of the PCR™ product. Bands were visualized by silver-staining (Beidler et al., 1982).
  • PCR™ products spanning the mutation site in patient 371 were transferred to nylon membranes using a slot blot apparatus. Approximately 5 ng of each PCR™ product was treated with a denaturing solution (0.5 M NaOH, 1.5 M NaCl), split in half and loaded in duplicate. Two 17-mer oligonucleotides were synthesized that span the region containing the mutation. One contained the sequence of the normal allele (5'-CGCACATGGCAACCCTT-3') (SEQ ID NO 73), while the other contained the sequence of the mutant allele (5'-GCACATGGGCAACCCTT-3') (SEQ ID NO 74). These were end-labeled with ⁇ 3 P-dATP using T4 polynucleotide kinase and hybridized to the membranes at 50°C. Hybridization and wash buffers were as described (Church and Gilbert, 1984). Membranes were sequentially washed at 45°C, 55°C and 65°C for 10 min each and exposed to X-ray film.
  • Reverse-transcription and PCR™ confirmed that nucleotides 1-4706 of Lyst also represent the previously undetermined 5' end of the BG open reading frame (FIG. 15c).
  • a full length cDNA was assembled from nucleotides 1-4706 of Lyst, the 2 kb 3' RACE-PCR™ clone and 6824 nucleotides of BG cDNA
  • This 11,817 bp cDNA sequence corresponds to the largest mRNA observed in Northern blots (~12 kb) (Goodrich and Holcombe, 1995).
  • Lyst-II corresponds to a smaller (~4 kb) mRNA observed on Northern blots. Lyst-I and Lyst-II are both present in poly(A)+ RNA from many mouse tissues (FIG. 15b).
  • the putative Lyst-I protein is of relative molecular mass 425,287 (Mr 425K) while that of Lyst-II is predicted to be of Mr 172.5K.
  • the partial LYST1 cDNA sequence (Genbank Accession number U70064, 7.1 kb) was assembled by alignment of these clones with mouse Lyst1 cDNA.
  • Human LYST1 has 82% predicted amino acid identity with mouse Lyst1 over 1,990 amino acids.
  • the predicted human LYST1 amino acid sequence contains a 6 amino acid insertion relative to mouse Lyst1 at residue 1,039.
  • Another group has published the sequence of the human LYST1 cDNA (Nagle et al., 1996).
  • the cDNA sequence of the present invention differs at 4 nucleotides and 3 predicted amino acids from that of Nagle et al. (1996).
  • This 13.5 kb cDNA sequence corresponds to the largest mRNA (LYST1-isoform I) observed on northern blots of human tissues (caption in FIG. 2). These northern blots also demonstrated the existence of a smaller LYST isoform (~4.5 kb, designated LYST-isoform II) that was similar in size to the smaller mouse Lyst1 mRNA, and that appeared to differ in distribution of expression in human tissues from LYST1-isoform I.
  • genomic derivation of human LYST1-isoform II was the same as mouse Lyst1-isoform II
  • sequence of the 3' end of the human LYST1-II isoform was sought by cloning human LYST1 intron F' using PCR™ of human genomic DNA with primers derived from LYST1 exon F and mouse intron F' (caption in FIG. 2).
  • a 2 kb human LYST probe was assigned to human chromosome 1 by hybridization to human-rodent somatic cell hybrid DNA (FIG. 16). All of the bands that segregated with human DNA hybridized only to somatic cell hybrids containing human chromosome 1 DNA.
  • FIG. 16b and FIG. 16c (Barrat et al., 1996).
  • SSCP Single-strand conformation polymorphism
  • Patient 371 had previously been shown to have a frame-shift mutation with a G insertion at nucleotide 118 of the coding domain (FIG. 4c) [Barbosa et al., 1996]
  • FIG. 4c Frame-shift mutation with a G insertion at nucleotide 118 of the coding domain
  • Lymphoblasts from all of these patients contain the giant perinuclear lysosomal vesicles that are the hallmark of CHS. Patients 369, 370, and 371 had typical clinical presentations of CHS, with recurrent childhood infections and oculocutaneous albinism. The parents of patients 369 and 370 are known not to have been consanguineous. In contrast, the clinical course of patients 372 and 373 was milder. Lymphoblasts were immortalized from patient 372 at 27 years of age. He had oculocutaneous albinism, recurrent skin infections, and peripheral neuropathy. Patient 373 has not had systemic infections and is alive at age 37. Patient 373 does, however, have hypopigmented hair and irides as well as peripheral neuropathy.
  • LYST-I mRNA was found to be most abundant in thymus (adult and fetal), peripheral blood leukocytes, bone marrow, and several regions of the adult brain. In contrast, no LYST-I mRNA was detected in fetal brain. Negligible LYST-I transcription was also apparent in heart, lung, kidney, or liver at any developmental stage.
  • Lyst (Lysosomal trafficking regulator)
  • Lyst Lysosomal trafficking regulator
  • Lyst-I contains intron-derived sequence at the 3' end. Lyst-II corresponds to a smaller mRNA observed on Northern blots. While several other genes generate an alternative C-terminus by incomplete splicing (Myers et al, 1995, Sugimoto et al, 1995, Sygiyama et al, 1996, Zhao and Manlley, 1996, Van De Wetering et al, 1996), the bg gene is unique in that the predicted structures of the two C-termini are quite different.
  • the C-terminus of Lyst-I contains a 'WD'-repeat domain that is similar to the β-subunit of heterotrimeric G proteins and which may assume a propeller-like secondary structure (Lambright et al.
  • Lyst-II has a C-terminal prenylation motif that could provide attachment to the lysosomal membrane. Although the prenylation signal is absent from Lyst-I, it contains a hydrophobic region that is predicted to be membrane associated. The significance of these divergent features is increased by the fact that Lyst is not predicted to have transmembrane helices.
  • Stathmin is a coiled-coil phosphoprotein thought to regulate microtubule polymerization and to act as a relay for intracellular signal transduction (Sobel, 1991; Belmont and Mitchison, 1996). This region of LYST may encode a coiled-coil protein interaction domain and may regulate microtubule-mediated lysosome trafficking. Intriguingly, a defect in microtubule dynamics has previously been documented in CHS (Oliver et al., 1975) and intact microtubules are required for maintenance of lysosomal morphology and trafficking (Matteoni and Kreis, 1987, Swanson et al, 1987, Swanson et al, 1992, Oka and
  • LYST-I and LYST-II transcripts were abundant in the latter brain tissues, peripheral blood leukocytes, and bone marrow. Only the smaller LYST isoforms were expressed in several tissues, including heart, fetal heart, aorta, thyroid gland, salivary gland, kidney, liver, fetal liver, appendix, lung, fetal lung, and fetal brain.
  • the developmental pattern of LYST mRNA isoform expression in brain was particularly interesting, since only the smaller LYST isoforms were expressed in fetal brain, whereas the largest isoform (LYST-I) predominated in many regions of the adult brain.
  • PEPTIDE SEQUENCE OF LONG ISOFORM (SEQ ID NO: 8)
  • PEPTIDE SEQUENCE OF SHORT ISOFORM (SEQ ID NO: 10)
  • Lyst2 was identified in a search for human genes similar in sequence to Lyst1 (the CH gene). Mouse Lyst1 cDNA sequence was compared with Genbank sequences, and significant similarity (52%) was noted between residues 3275 to 3413 of Lyst1 (Genbank Accession number U70015) and R17955. R17955 is an uncharacterized human expressed sequence tag 292 bp in length.
  • the corresponding partial length cDNA clone (#32273) was obtained from Image consortium. This cDNA clone was derived from a cDNA library of human infant brain, and is 1979 bp in length. The clone was designated human LYST2.
  • the LYST2 clone was sequenced using standard methodologies. The DNA sequence is given below (SEQ ID NO: 11).
  • This DNA sequence corresponds to the 3' end of the coding domain of human LYST2 and the 3' untranslated region.
  • Amino acids 2 to 140 of the predicted human LYST2 protein share only 51.8% amino acid identity with amino acids 3275 to 3413 of mouse and human Lyst1.
  • the C-terminal residues of LYST2 are not similar to LYST1, but do have a similar predicted secondary structure:
  • This region of LYST1 contains WD repeats and is predicted to assume a propeller-like secondary structure, similar to the beta subunit of heterotrimeric G proteins.
  • the corresponding region of LYST2 also contains WD repeats and is also similar in sequence to the beta subunit of heterotrimeric G proteins (30.4% identity from LYST2 amino acid 285 to 418 to the guanine nucleotide-binding protein beta subunit-like protein P49027).
  • the stop codons of mouse Lyst1 and human LYST2 occur approximately the same distance from the matching region
  • mice revealed Lyst2 to map to mouse Chromosome 3 between D3Mit21 and D3Mit22. This contrasts with Lyst, which maps on mouse Chromosome 13. Pulsed field gel electrophoresis blots of mouse DNA hybridized with a Lyst2 probe showed a single band, indicating that Lyst2 is a single genetic locus.
  • Lyst2 is abundantly expressed in mouse brain, moderately expressed in mouse kidney, and weakly expressed in mouse heart, lung, skeletal muscle, and testis. Lyst2 is not expressed in mouse spleen or liver.
  • LYST2 was expressed as follows: moderate expression was observed in melanoma cells, and weak expression in HeLa cells, colorectal carcinoma cells, and in spleen, lymph node, thymus, and appendix. No expression was detected in peripheral blood leukocytes, bone marrow, fetal liver, lung carcinoma, or leukemia cell lines (K562, MOLT4, Raji, HL60).
  • the major transcript was 13 kb in size in human RNA
  • LYST2 appears to be similar in size to the largest LYST1 mRNA, but has a very different tissue distribution of expression, being abundantly expressed only in brain. LYST2 appears to be a brain-specific homologue of LYST1, and may function to regulate protein trafficking to the lysosome and late endosome within the brain.
  • the relative abundance of LYST2 mRNA isoforms in human tissues at different developmental stages was examined by sequential hybridization of a poly(A) + RNA dot blot with a LYST2 cDNA probe
  • the quantity of poly(A)+ RNA loaded on the blot was normalized to eight housekeeping genes (phospholipase, ribosomal protein S9, tubulin, a highly basic 23-kDa protein, glyceraldehyde-3-phosphate dehydrogenase, hypoxanthine guanine phosphoribosyl transferase, β-actin, and ubiquitin) to allow estimation of the relative abundance of LYST2 mRNA isoforms in different tissues
  • LYST2 transcripts were detected in all brain regions and in kidney. LYST2 transcripts were detected in those regions at all developmental stages.
  • a mouse embryo (day 14.5 post-coitum) cDNA library was hybridized with a probe corresponding to human LYST2. Two clones were isolated and sequenced. They contained overlapping sequences that were assembled by alignment with human LYST2 and represent 2543 bp of cDNA sequence.
  • Mouse Lyst2 shares 98% amino acid identity with human LYST
  • Beguez-Cesar, "Neutropenia cronica maligna familiar con granulaciones atipicas de los leucocitos," Bol. Soc. Cubana Pediat., 15:900-922, 1943. Beidler, Hilliard and Rill, "Ultrasensitive staining of nucleic acids with silver," Anal. Biochem.,
  • mice J Cell Biol, 67:774-788, 1975. Brunialti et al., "The mouse mutation progressive motor neuronopathy (pmn) maps to chromosome 13," Genomics, 29:131-135, 1995. Burkhardt, Wiebel, Hester and Argon, "The giant organelles in beige and Chediak-Higashi fibroblasts are derived from late endosomes and mature lysosomes," J. Exp. Med., 178:1845-1856, 1993. Campbell, "Monoclonal Antibody Technology, Laboratory Techniques in Biochemistry and Molecular Biology," Vol. 13, Burden and Von Knippenberg, Eds., pp. 75-83, Elsevier,
  • Kingsmore et al., "A 6000 kilobase segment of chromosome 1 is conserved in human and mouse," EMBO J., 8:4073-4080, 1989. Kingsmore et al., "Glycine receptor β-subunit gene mutation in spastic mouse associated with
  • Genomics, 22:202-204, 1994. Matteoni and Kreis, "Translocation and clustering of endosomes and lysosomes depends on microtubules," J. Cell Biol., 105:1253-1265, 1987. Maucuer, Camonis, and Sobel, Proc. Natl. Acad. Sci. USA, 92:3100-3104, 1995. McBride et al., Genomics, 6:219-225, 1990.
  • mice selectively lack lysosomal elastase and cathepsin G," J. Exp. Med.
  • Windhorst, Zelickson and Good, "A human pigmentary dilution based on a heritable subcellular structural defect - the Chediak-Higashi syndrome," J. Invest. Dermatol., 50:9-18, 1968. Wolf et al., Comput. Appl. Biosci., 4(1):187-91, 1988.
  • Wolff, Dale, Clark, Root, and Kimball "The Chediak-Higashi syndrome studies of host defenses," Ann. Intern. Med., 76:293-306, 1972 Wong and Neumann, "Electric field mediated gene transfer," Biochem. Biophys. Res. Commun.
  • Chediak-Higashi syndrome Defects expressed by cultured melanocytes," Lab. Invest.,
  • TTCAAATCCC TTTTACTTCA GTCAAGCCAT GGATTTAGTT CAAGAATTTA TCCAGCACCA 240
  • AACAGCACCA GACCTGGGAT TTCTGAGAAA GAGTGCTGAC AGCGTGCGTG GATTCCAGTC 2100
  • TTCTCTCCTC ATACAACAGG GAACTGTGAA AATCCTTCTA GGCGGGTTCT TGAATATTTT 2880
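
Several figures quoted in the excerpts above can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and is not part of the disclosure. It checks that the quoted 4,635-nucleotide ORF corresponds to 1,545 codons, derives the mean residue mass implied by the quoted relative molecular mass of 172,500, and tests whether the mutant allele-specific oligonucleotide (SEQ ID NO 74) differs from the normal one (SEQ ID NO 73) by a single inserted base. All function names are hypothetical, and the assumption that the two 17-mers overlap with a one-base offset is inferred from the sequences themselves rather than stated in the text.

```python
"""Consistency checks on values quoted in the excerpts (illustrative sketch)."""

# Values quoted in the text for the short Lyst isoform.
ORF_NUCLEOTIDES = 4635
PROTEIN_RESIDUES = 1545
RELATIVE_MOLECULAR_MASS = 172_500

# Allele-specific 17-mers quoted for patient 371 (SEQ ID NO 73 and 74).
NORMAL_OLIGO = "CGCACATGGCAACCCTT"
MUTANT_OLIGO = "GCACATGGGCAACCCTT"


def codons_in_orf(nucleotides: int) -> int:
    """Return the number of whole codons; raise if the length is not a multiple of 3."""
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return nucleotides // 3


def mean_residue_mass(total_mass: float, residues: int) -> float:
    """Average mass per amino acid residue implied by a total mass."""
    return total_mass / residues


def single_base_insertions(reference: str, candidate: str) -> list[tuple[int, str]]:
    """List (index, base) pairs whose removal from candidate yields reference.

    Several indices are returned when the inserted base falls inside a run of
    identical bases, because the exact position is then ambiguous.
    """
    if len(candidate) != len(reference) + 1:
        return []
    hits = []
    for i in range(len(candidate)):
        if candidate[:i] + candidate[i + 1:] == reference:
            hits.append((i, candidate[i]))
    return hits


if __name__ == "__main__":
    print("codons:", codons_in_orf(ORF_NUCLEOTIDES))
    print("mean residue mass:",
          round(mean_residue_mass(RELATIVE_MOLECULAR_MASS, PROTEIN_RESIDUES), 2))
    # Assumption: the mutant 17-mer is offset by one base relative to the
    # normal 17-mer, so the comparison uses the normal oligo minus its first base.
    print("insertions:", single_base_insertions(NORMAL_OLIGO[1:], MUTANT_OLIGO))
```

Under that offset assumption, every candidate position reported for the inserted base lies within a run of G residues, which is consistent with the G insertion described for patient 371. The codon count matching the quoted residue count suggests that the quoted ORF length does not include a stop codon, although the text does not say so explicitly.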

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Immunology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Toxicology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Zoology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

Disclosed are compositions comprising murine Lyst1 and Lyst2 genes and human LYST1 and LYST2 genes. Also disclosed are the Lyst1, Lyst2, LYST1, and LYST2 proteins encoded by these genes, respectively. Also disclosed are methods of using these genes in identifying patients with Chediak-Higashi Syndrome and detecting CHS-related nucleic acid and/or protein sequences. Also disclosed are methods for the recombinant expression of LYST1, Lyst1, LYST2, and Lyst2 polypeptides, antibodies raised against these polypeptides, and therapeutic approaches to treatment of autoimmune diseases and certain types of tumors. Assays for detection of the gene mutations resulting in CH Syndrome, as well as diagnostic probes for the detection of Lyst1, Lyst2, LYST1, and LYST2 genes are also provided.

Description

DESCRIPTION
LYST1 AND LYST2 GENE COMPOSITIONS AND METHODS OF USE
1. Background of the Invention
The present application is a continuation in part of U.S. Provisional Patent Application Serial No. 60/XXX,XXX, filed December 23, 1996 and of U.S. Provisional Patent Application Serial No. 60/XXX,XXX, filed December 20, 1996, which is a continuation in part of U.S. Provisional Patent Application Serial No. 60/011,146, filed February 1, 1996, the entire contents of which are specifically incorporated herein by reference. The United States government has certain rights in the present invention pursuant to Grants AI 39651 and 5P30-AR 41943 from the National Institutes of Health.
1.1 Field of the Invention
The present invention relates generally to the field of molecular biology. More particularly, certain embodiments concern methods and compositions comprising novel DNA segments, and proteins derived from mammalian species. More particularly, the invention provides Lyst1 and Lyst2 gene compositions from murine origins and the homologous LYST1 and LYST2 gene compositions from human origins. Various methods for making and using these LYST/Lyst DNA segments, native peptides and synthetic protein derivatives are disclosed, such as, for example, the use of DNA segments as diagnostic probes and templates for protein production, and the use of LYST1, Lyst1, LYST2, and Lyst2 proteins, fusion protein carriers and Lyst-derived peptides in various pharmacological and immunological applications.
1.2 Description of the Related Art
1.2.1 Chediak-Higashi (CH) Syndrome
Chediak-Higashi syndrome (CHS) is an autosomal recessive immune deficiency disease that maps on chromosome (Chr) 1q42-q43 (Goodrich and Holcombe, 1995; Barrat et al., 1996; Fukai et al., 1996). Affected individuals have giant, perinuclear lysosomes, defective granulocyte, NK and cytolytic T cell function, and die prematurely of infection or malignancy (Beguez Cesar, 1943; Blume et al., 1968; Wolff et al., 1972; Blume and Wolff, 1972; Root et al., 1972; Roder et al., 1982; Baetz et al., 1995). CHS patients also exhibit partial oculocutaneous albinism, platelet storage pool deficiency and neurologic defects such as peripheral neuropathy and ataxia (Windhorst et al., 1968; Meyers et al., 1974; Maeda et al., 1989; Pettit and Berdal, 1984; Misra et al., 1991). Recently it was demonstrated that intracellular protein transport to and from the lysosome is disordered in CHS (Baetz et al., 1995; Brandt et al., 1975; Burkhardt et al., 1993; Zhao et al., 1994). Such functional defects in secretory lysosomes of granular cells (leukocytes, melanocytes, megakaryocytes and cerebellar Purkinje cells) provide a unifying hypothesis that can explain the diverse clinical features of CHS (Griffiths, 1996).
As an antecedent to identification of the human CHS gene, the inventors undertook positional cloning of the mouse mutation beige (bg), which had long been considered homologous to CHS. The clinical and pathologic features of CHS and bg are very similar, and bg maps on proximal mouse Chr 13 within a linkage group conserved with human chromosome 1q42-q43 (the position of the CHS locus) (Jenkins et al., 1991). Additional evidence that human CHS and bg mice were homologous disorders came from interspecific genetic complementation studies, which demonstrated that fusion of bg mouse and human CHS fibroblasts failed to reverse lysosomal morphologic abnormalities (Penner and Prieur, 1987).
Recently the inventors' group and one other succeeded in identifying the gene that is defective in bg mice (Perou et al., 1996a). However, the reported bg candidate cDNA sequences (Lyst and BG) were different. Both sequences were isolated from the same yeast artificial chromosome (YAC) clone. This YAC had been authenticated by mapping within the bg critical region and by restoration of normal lysosomal morphology to bg fibroblasts upon transfection (Perou et al., 1996a; Perou et al., 1996b). Furthermore, both of the candidate gene sequences contained mutations in different bg alleles.
1.3 Deficiencies in the Prior Art
Methods for the treatment and diagnosis of Chediak-Higashi Syndrome have not been developed because the sequence of the CH gene has not been identified in mice or humans.
Despite some recent studies in mice, there is only speculation that a linkage similar to that found in beige mice might exist in the human gene (Owen et al., 1986). There is some evidence indicating that the CH mutation is located in the same gene in mouse, mink and human (Perou and Kaplan, 1993); however, except for the beige mouse, the locus of the mutation has not been identified. CHS patients have been reported to suffer from several serious medical conditions, including impaired natural killer cell activity (Haliotis et al., 1980) and defective lymphocyte-mediated antibody-dependent cell-mediated cytotoxicity (ADCC) against tumor cell targets (Klein et al., 1980). Despite the recognition of these deficiencies, little progress in treatment has been achieved, mainly because the gene harboring the mutation leading to these impairments has not yet been identified.
Chediak-Higashi Syndrome occurs only in a small minority of the population. However, there is a growing realization of the potential role of the CH gene product (LYST1) in developing treatments for conditions such as systemic autoimmune disease and possibly certain types of malignancy related to the regulation of protein trafficking within cells by the CH gene (LYST1). Therefore, what is lacking in the prior art is the isolation and characterization of the CH gene from mice and humans, useful in the development of treatments and assays for autoimmune diseases such as CHS and certain forms of cancer.
2. Summary of the Invention
Positional cloning of the mouse CHS homolog is facilitated by the existence of numerous remutations at the bg locus. All have arisen spontaneously, with the exception of the SΕfh i-bg allele, which was induced by radiation. The present invention addresses one or more of the foregoing or other problems associated with the detection of Chediak-Higashi Syndrome in humans. Both the mouse gene and the homologous human gene have been cloned and sequenced.
The isolation and sequencing of the Chediak-Higashi gene (LYST1) from both murine and human sources has now provided methods of detecting CHS at the gene level, such as by various assays making use of the gene, gene segments and/or the encoded proteins or polypeptides. In addition to its practical value, the gene provides a tool for understanding and controlling mechanisms of regulation of protein trafficking to lysosomes, and particularly the contribution of vesicular sorting to diverse cellular functions. An immediate result of the identification of the LYST1 gene is the ability to perform linkage analysis and to identify individuals at risk of having progeny carrying the mutated gene. The inventors have shown that the murine Lyst1 and BG sequences are derived from a single gene with alternatively spliced mRNAs. In an important embodiment, the inventors have also identified the human homolog of the bg gene (Lyst1), LYST1. LYST1 maps within the CHS critical region and is mutated in several CHS patients.
2.1 LYST and Lyst Gene Compositions
As used herein, the term "DNA segment" refers to a DNA molecule that has been isolated free of total genomic DNA of a particular species. Therefore, a DNA segment encoding LYST/Lyst refers to a DNA segment that contains LYST or Lyst coding sequences yet is isolated away from, or purified free from, total genomic DNA of the species from which the DNA segment is obtained. Included within the term "DNA segment" are DNA segments and smaller fragments of such segments, and also recombinant vectors, including, for example, plasmids, cosmids, phagemids, phage, viruses, and the like. Preferred LYST genes are the LYST1 and LYST2 genes of human origin, while preferred Lyst genes are the Lyst1 and Lyst2 genes of murine origin.
Similarly, a DNA segment comprising an isolated or purified LYST/Lyst gene refers to a DNA segment including a LYST or Lyst coding sequence and, in certain aspects, regulatory sequences, isolated substantially away from other naturally occurring genes or protein encoding sequences. In this respect, the term "gene" is used for simplicity to refer to a functional protein, polypeptide or peptide encoding unit. As will be understood by those in the art, this functional term includes genomic sequences, extra-genomic and plasmid-encoded sequences, and smaller engineered gene segments that express, or may be adapted to express, proteins, polypeptides or peptides. Such segments may be naturally isolated, or modified synthetically by the hand of man. Preferred DNAs are those which comprise one or more LYST genes, with human LYST1 and LYST2 genes being particularly preferred, or one or more Lyst genes, with murine Lyst1 and Lyst2 genes being particularly preferred.
"Isolated substantially away from other coding sequences" means that the gene of interest, in this case a gene encoding a LYST/Lyst protein or peptide, forms the significant part of the coding region of the DNA segment, and that the DNA segment does not contain large portions of naturally-occurring coding DNA, such as large chromosomal fragments or other functional genes or polypeptide coding regions. Of course, this refers to the DNA segment as originally isolated, and does not exclude genes or coding regions later added to the segment by the hand of man.
In particular embodiments, the invention concerns isolated DNA segments and recombinant vectors incorporating DNA sequences that encode a LYST/Lyst species that includes within its amino acid sequence an amino acid sequence essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14. In other particular embodiments, the invention concerns isolated DNA segments and recombinant vectors incorporating DNA sequences that include within their sequence a nucleotide sequence essentially as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13.
The term "a sequence essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14" means that the sequence substantially corresponds to a portion of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14, and has relatively few amino acids that are not identical to, or a biologically functional equivalent of, the amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14. The term "biologically functional equivalent" is well understood in the art and is further defined in detail herein (for example, see Illustrative Embodiments). Accordingly, sequences that have between about 70% and about 80%, or more preferably, between about 81% and about 90%, or even more preferably, between about 91% and about 99%, of amino acids that are identical or functionally equivalent to the amino acids of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14 will be sequences that are "essentially as set forth in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14".
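By way of illustration only, the percentage ranges recited above may be sketched in Python as follows. This is a minimal sketch that assumes the two amino acid sequences have already been aligned to equal length, and it counts identical residues only; biologically functional equivalents are not modeled. The function names `percent_identity` and `identity_tier` and the tier labels are hypothetical and form no part of the disclosure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the approximate ranges recited in the text.

    The cut-offs (70, 81, 91) are taken from the recited ranges; treating them
    as hard boundaries is an assumption, since the text says "about".
    """
    if pct >= 91.0:
        return "even more preferred (about 91% or higher)"
    if pct >= 81.0:
        return "more preferred (about 81% to about 90%)"
    if pct >= 70.0:
        return "within scope (about 70% to about 80%)"
    return "below the recited ranges"
```

A fuller treatment would first align the sequences and would score conservative substitutions as functional equivalents; both steps are omitted here.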
In certain other embodiments, the invention concerns isolated DNA segments and recombinant vectors that include within their sequence a nucleic acid sequence essentially as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. The term "essentially as set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13" is used in the same sense as described above and means that the nucleic acid sequence substantially corresponds to a portion of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 and has relatively few codons that are not identical, or functionally equivalent, to the codons of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. Again, DNA segments that encode proteins exhibiting LYST, Lyst, LYST-like, or Lyst-like activity will be most preferred.
It will also be understood that amino acid and nucleic acid sequences may include additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences, and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence meets the criteria set forth above, including the maintenance of biological protein activity where protein expression is concerned. The addition of terminal sequences particularly applies to nucleic acid sequences that may, for example, include various non-coding sequences flanking either of the 5' or 3' portions of the coding region or may include various upstream or downstream regulatory or structural genes.
Naturally, the present invention also encompasses DNA segments that are complementary, or essentially complementary, to the sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13. Nucleic acid sequences that are "complementary" are those that are capable of base-pairing according to the standard Watson-Crick complementarity rules. As used herein, the term "complementary sequences" means nucleic acid sequences that are substantially complementary, as may be assessed by the same nucleotide comparison set forth above, or as defined as being capable of hybridizing to the nucleic acid segment of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 under relatively stringent conditions such as those described herein.
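The Watson-Crick complementarity rules referred to above may be illustrated with a short Python sketch. It assumes plain DNA strings written 5' to 3' and containing only A, C, G and T, and it tests exact complementarity only; "substantially complementary" sequences and hybridization behavior are not modeled. The function names are hypothetical.

```python
# Translation table implementing A<->T and G<->C pairing for DNA.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the strand that would base-pair with `seq`, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_exact_complement(a: str, b: str) -> bool:
    """True if `a` and `b` pair perfectly along their full length."""
    return a.upper() == reverse_complement(b).upper()
```

For example, `reverse_complement("ATGC")` yields `"GCAT"`, and the two strings are reported as exact complements of one another.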
The nucleic acid segments of the present invention, regardless of the length of the coding sequence itself, may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is therefore contemplated that a nucleic acid fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol. For example, nucleic acid fragments may be prepared that include a short contiguous stretch identical to or complementary to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13, such as about 14 nucleotides, and that are up to about 10,000 or about 5,000 base pairs in length, with segments of about 3,000 being preferred in certain cases. DNA segments with total lengths of about 2,000, about 1,000, about 500, about 200, about 100 and about 50 base pairs (including all intermediate lengths) are also contemplated to be useful.
It will be readily understood that "intermediate lengths", in these contexts, means any length between the quoted ranges, such as 14, 15, 16, 17, 18, 19, 20, etc.; 21, 22, 23, etc.; 30, 31, 32, etc.; 50, 51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.; including all integers through the 200-500, 501-1,000, 1,001-2,000, 2,001-3,000, 3,001-5,000, 5,001-10,000 ranges, up to and including sequences of about 12,001, 12,002, 12,003, 13,001, 13,002 and the like.
It will also be understood that this invention is not limited to the particular nucleic acid sequences disclosed in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13, or to the particular amino acid sequences disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14. Recombinant vectors and isolated DNA segments may therefore variously include the LYST or Lyst coding regions themselves, coding regions bearing selected alterations or modifications in the basic coding region, or they may encode larger polypeptides that nevertheless include LYST, Lyst, LYST-like, or Lyst-like coding regions, or may encode biologically functional equivalent proteins or peptides that have variant amino acid sequences.
If desired, one may also prepare fusion proteins and peptides, e.g., where the LYST or Lyst coding regions are aligned within the same expression unit with other proteins or peptides having desired functions, such as for purification or immunodetection purposes (e.g., proteins that may be purified by affinity chromatography and enzyme label coding regions, respectively).
Recombinant vectors form further aspects of the present invention. Particularly useful vectors are contemplated to be those vectors in which the coding portion of the DNA segment, whether encoding a full length protein or smaller peptide, is positioned under the control of a promoter. The promoter may be in the form of the promoter that is naturally associated with a LYST1, Lyst1, LYST2, or Lyst2 gene, as may be obtained by isolating the 5' non-coding sequences located upstream of the coding segment, for example, using recombinant cloning and/or PCR™ technology, in connection with the compositions disclosed herein.
In other embodiments, it is contemplated that certain advantages will be gained by positioning the coding DNA segment under the control of a recombinant, or heterologous, promoter. As used herein, a recombinant or heterologous promoter is intended to refer to a promoter that is not normally associated with a LYST/Lyst gene in its natural environment. Such promoters may include LYST or Lyst promoters normally associated with other genes, and/or promoters isolated from any bacterial, viral, eukaryotic, or mammalian cell. Naturally, it will be important to employ a promoter that effectively directs the expression of the DNA segment in the cell type, organism, or even animal, chosen for expression. The use of promoter and cell type combinations for protein expression is generally known to those of skill in the art of molecular biology; for example, see Sambrook et al., 1989. The promoters employed may be constitutive, or inducible, and can be used under the appropriate conditions to direct high level expression of the introduced DNA segment, such as is advantageous in the large-scale production of recombinant proteins or peptides.
Prokaryotic expression of nucleic acid segments of the present invention may be performed using methods known to those of skill in the art, and will likely comprise expression vectors and promoter sequences such as those obtained from tac, trp, lac, lacUV5 or T7. When expression of the recombinant LYST1, LYST2, Lyst1 or Lyst2 proteins is desired in eukaryotic cells, a number of expression systems are available and known to those of skill in the art. An exemplary eukaryotic promoter system contemplated for use in high-level expression is the Pichia expression vector system (Pharmacia LKB Biotechnology).
In connection with expression embodiments to prepare recombinant LYST1, LYST2, Lyst1 or Lyst2 proteins and peptides, it is contemplated that longer DNA segments will most often be used, with DNA segments encoding the entire LYST1, LYST2, Lyst1 or Lyst2 protein, or functional domains, epitopes, ligand binding domains, subunits, etc., being most preferred. However, it will be appreciated that the use of shorter DNA segments to direct the expression of LYST1, LYST2, Lyst1 or Lyst2 peptides or epitopic core regions, such as may be used to generate anti-LYST or anti-Lyst antibodies, also falls within the scope of the invention. DNA segments that encode peptide antigens from about 15 to about 100 amino acids in length, or more preferably, from about 15 to about 50 amino acids in length, are contemplated to be particularly useful.
The LYST or Lyst genes and DNA segments may also be used in connection with somatic expression in an animal or in the creation of a transgenic animal. Again, in such embodiments, the use of a recombinant vector that directs the expression of the full length or active LYST/Lyst protein is particularly contemplated. Expression of a LYST/Lyst transgene in animals is particularly contemplated to be useful in the production of anti-LYST/Lyst antibodies for use in passive immunization methods, the detection of LYST/Lyst proteins, and the purification of LYST/Lyst protein in large quantity.
In addition to their use in directing the expression of LYST/Lyst, the nucleic acid sequences disclosed herein also have a variety of other uses. For example, they also have utility as probes or primers in nucleic acid hybridization embodiments. As such, it is contemplated that nucleic acid segments that comprise a sequence region that consists of at least a 14 nucleotide long contiguous sequence that has the same sequence as, or is complementary to, a 14 nucleotide long contiguous sequence of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 will find particular utility. Longer contiguous identical or complementary sequences, e.g., those of about 20, 30, 40, 50, 100, 200, 500, 1000 (including all intermediate lengths) and even up to full length sequences will also be of use in certain embodiments.
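The 14 nucleotide criterion just described may be sketched as a simple membership test. The following Python sketch assumes plain A/C/G/T strings and exact matching; the function names and the use of a k-mer set are illustrative assumptions, and the reference sequence passed in stands for any one of the disclosed nucleotide sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick complement of `seq`, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """All contiguous stretches of length k in `seq` (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_contiguous_stretch(segment: str, reference: str, k: int = 14) -> bool:
    """True if `segment` contains a k-nucleotide stretch identical or
    complementary to a k-nucleotide stretch of `reference`."""
    segment, reference = segment.upper(), reference.upper()
    # Stretches of the reference itself cover the "identical" case; stretches
    # of its reverse complement cover the "complementary" case.
    targets = kmers(reference, k) | kmers(reverse_complement(reference), k)
    return any(segment[i:i + k] in targets for i in range(len(segment) - k + 1))
```

Raising `k` to 20, 30 or more corresponds to the longer contiguous stretches mentioned above.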
The ability of such nucleic acid probes to specifically hybridize to LYST/Lyst-encoding sequences will enable them to be of use in detecting the presence of complementary sequences in a given sample. However, other uses are envisioned, including the use of the sequence information for the preparation of mutant species primers, or primers for use in preparing other genetic constructions.
Nucleic acid molecules having sequence regions consisting of contiguous nucleotide stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so, identical or complementary to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 are particularly contemplated as hybridization probes for use in, e.g., Southern and Northern blotting. This would allow LYST/Lyst structural or regulatory genes to be analyzed, both in diverse cell types and also in various bacterial cells. The total size of the fragment, as well as the size of the complementary stretch(es), will ultimately depend on the intended use or application of the particular nucleic acid segment. Smaller fragments will generally find use in hybridization embodiments, wherein the length of the contiguous complementary region may be varied, such as between about 14 and about 100 nucleotides, but larger contiguous complementary stretches may be used, according to the length of the complementary sequences one wishes to detect.
The use of a hybridization probe of about 14-25 nucleotides in length allows the formation of a duplex molecule that is both stable and selective. Molecules having contiguous complementary sequences over stretches greater than 14 bases in length are generally preferred, though, in order to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained. One will generally prefer to design nucleic acid molecules having gene-complementary stretches of 15 to 25 contiguous nucleotides, or even longer where desired.
Hybridization probes may be selected from any portion of any of the sequences disclosed herein. All that is required is to review the sequence set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 and to select any continuous portion of the sequence, from about 14-25 nucleotides in length up to and including the full length sequence, that one wishes to utilize as a probe or primer. The choice of probe and primer sequences may be governed by various factors; by way of example only, one may wish to employ primers from towards the termini of the total sequence.
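Selecting "any continuous portion of the sequence" may be illustrated by enumerating every window of a chosen length. The Python sketch below is an assumption-laden illustration: the function name `candidate_probes` is hypothetical, and reporting the G+C fraction of each window is merely one plausible selection aid, not a criterion stated in this disclosure.

```python
from typing import Iterator, Tuple


def candidate_probes(sequence: str, length: int = 14) -> Iterator[Tuple[int, str, float]]:
    """Yield (start, probe, gc_fraction) for every contiguous window of `length`.

    `start` is a 0-based offset into `sequence`; `gc_fraction` is the share of
    G and C bases in the window, between 0.0 and 1.0.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    sequence = sequence.upper()
    for start in range(len(sequence) - length + 1):
        probe = sequence[start:start + length]
        gc = (probe.count("G") + probe.count("C")) / length
        yield start, probe, gc
```

A sequence of N nucleotides yields N - length + 1 candidates; windows near offset 0 or near the last offset correspond to primers taken from towards the termini.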
The process of selecting and preparing a nucleic acid segment that includes a contiguous sequence from within SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, or SEQ ID NO:13 may alternatively be described as preparing a nucleic acid fragment. Of course, fragments may also be obtained by other techniques such as, e.g., by mechanical shearing or by restriction enzyme digestion. Small nucleic acid segments or fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer. Also, fragments may be obtained by application of nucleic acid reproduction technology, such as the PCR™ technology of U.S. Patent 4,683,202 (incorporated herein by reference), by introducing selected sequences into recombinant vectors for recombinant production, and by other recombinant DNA techniques generally known to those of skill in the art of molecular biology.
Accordingly, the nucleotide sequences of the invention may be used for their ability to selectively form duplex molecules with complementary stretches of the entire LYST/Lyst gene or gene fragments. Depending on the application envisioned, one will desire to employ varying conditions of hybridization to achieve varying degrees of selectivity of probe towards target sequence. For applications requiring high selectivity, one will typically desire to employ relatively stringent conditions to form the hybrids, e.g., one will select relatively low salt and/or high temperature conditions, such as provided by about 0.02 M to about 0.15 M NaCl at temperatures of 50°C to 70°C. Such selective conditions tolerate little, if any, mismatch between the probe and the template or target strand, and would be particularly suitable for isolating LYST or Lyst genes.
Of course, for some applications, for example, where one desires to prepare mutants employing a mutant primer strand hybridized to an underlying template or where one seeks to isolate LYST or Lyst sequences from related species, functional equivalents, or the like, less stringent hybridization conditions will typically be needed in order to allow formation of the heteroduplex. In these circumstances, one may desire to employ conditions such as about 0.15 M to about 0.9 M salt, at temperatures ranging from 20°C to 55°C. Cross-hybridizing species can thereby be readily identified as positively hybridizing signals with respect to control hybridizations. In any case, it is generally appreciated that conditions can be rendered more stringent by the addition of increasing amounts of formamide, which serves to destabilize the hybrid duplex in the same manner as increased temperature. Thus, hybridization conditions can be readily manipulated, and thus will generally be a method of choice depending on the desired results.
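The qualitative effect of salt and formamide on duplex stability may be illustrated with a widely quoted empirical approximation for melting temperature, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 675/N, less a correction per percent formamide. The formula and the default formamide coefficient of 0.63 °C per percent are textbook approximations assumed here for illustration; they are not taken from this disclosure, published variants use somewhat different constants, and the function name `estimate_tm` is hypothetical.

```python
import math


def estimate_tm(probe: str, na_molar: float, formamide_pct: float = 0.0,
                formamide_coeff: float = 0.63) -> float:
    """Rough melting temperature (degrees C) of a perfectly matched DNA duplex.

    Assumed empirical approximation; suitable only for comparing conditions,
    not for predicting an exact Tm.
    """
    if not probe or na_molar <= 0:
        raise ValueError("probe must be non-empty and salt concentration positive")
    probe = probe.upper()
    n = len(probe)
    gc_pct = 100.0 * (probe.count("G") + probe.count("C")) / n
    tm = 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_pct - 675.0 / n
    # Formamide destabilizes the duplex, lowering Tm in the same direction as
    # an increase in temperature would act on the hybrid.
    return tm - formamide_coeff * formamide_pct
```

Under this approximation, lowering the salt concentration from 0.9 M toward 0.02 M lowers the estimated Tm, so that a wash at a fixed temperature becomes more stringent, in keeping with the low salt and high temperature conditions described above for high selectivity.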
In certain embodiments, it will be advantageous to employ nucleic acid sequences of the present invention in combination with an appropriate means, such as a label, for determining hybridization. A wide variety of appropriate indicator means are known in the art, including fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of giving a detectable signal. In preferred embodiments, one will likely desire to employ a fluorescent label or an enzyme tag, such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator substrates are known that can be employed to provide a means, visible to the human eye or detectable spectrophotometrically, to identify specific hybridization with complementary nucleic acid-containing samples.
In general, it is envisioned that the hybridization probes described herein will be useful both as reagents in solution hybridization as well as in embodiments employing a solid phase. In embodiments involving a solid phase, the test DNA (or RNA) is adsorbed or otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic acid is then subjected to specific hybridization with selected probes under desired conditions. The selected conditions will depend on the particular circumstances based on the particular criteria required (depending, for example, on the G+C content, type of target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). Following washing of the hybridized surface so as to remove nonspecifically bound probe molecules, specific hybridization is detected, or even quantitated, by means of the label.
2.2 Recombinant Host Cells and Vectors
Particular aspects of the invention concern the use of plasmid vectors for the cloning and expression of recombinant peptides, and particular peptide epitopes comprising either native or site-specifically mutated LYST or Lyst proteins, peptides, or epitopes. The generation of recombinant vectors, transformation of host cells, and expression of recombinant proteins is well-known to those of skill in the art. Prokaryotic hosts are preferred for expression of the peptide compositions of the present invention. An example of a preferred prokaryotic host is E. coli, and in particular, E. coli strains JM101, XL1-Blue™, RR1, LE392, B, X1776 (ATCC31537), and W3110 (F-, λ-, prototrophic, ATCC273325). Alternatively, other Enterobacteriaceae species such as Salmonella typhimurium and Serratia marcescens, or even other Gram-negative hosts including various Pseudomonas species, may be used in the recombinant expression of the genetic constructs disclosed herein.
In general, plasmid vectors containing replicon and control sequences which are derived from species compatible with the host cell are used in connection with these hosts. The vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells. For example, E. coli may be typically transformed using vectors such as pBR322, or any of its derivatives (Bolivar et al., 1977). pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. pBR322, its derivatives, or other microbial plasmids or bacteriophage may also contain, or be modified to contain, promoters which can be used by the microbial organism for expression of endogenous proteins.
In addition, phage vectors containing replicon and control sequences that are compatible with the host microorganism can be used as transforming vectors in connection with these hosts. For example, bacteriophage such as λGEM™-11 may be utilized in making a recombinant vector which can be used to transform susceptible host cells such as E. coli LE392.
Those promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979) or the tryptophan (trp) promoter system (Goeddel et al., 1980). The use of recombinant and native microbial promoters is well-known to those of skill in the art, and details concerning their nucleotide sequences and specific methodologies are in the public domain, enabling a skilled worker to construct particular recombinant vectors and expression systems for the purpose of producing compositions of the present invention.
In addition to the preferred embodiment of expression in prokaryotes, eukaryotic microbes, such as yeast cultures, may also be used in conjunction with the methods disclosed herein. Saccharomyces cerevisiae, or common bakers' yeast, is the most commonly used among eukaryotic microorganisms, although a number of other species may also be employed for such eukaryotic expression systems. For expression in Saccharomyces, the plasmid YRp7, for example, is commonly used (Stinchcomb et al., 1979; Kingsman et al., 1979; Tschemper et al., 1980). This plasmid already contains the trp1 gene which provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example ATCC44076 or PEP4-1 (Jones, 1977). The presence of the trp1 lesion as a characteristic of the yeast host cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.
Suitable promoting sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 1968; Holland et al., 1978), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In constructing suitable expression plasmids, the termination sequences associated with these genes are also ligated into the expression vector 3' of the sequence desired to be expressed to provide polyadenylation of the mRNA and termination. Other promoters, which have the additional advantage of transcription controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization. Any plasmid vector containing a yeast-compatible promoter, an origin of replication, and termination sequences is suitable.
In addition to microorganisms, cultures of cells derived from multicellular organisms may also be used as hosts in the routine practice of the disclosed methods. In principle, any such cell culture is workable, whether from vertebrate or invertebrate culture. However, interest has been greatest in vertebrate cells, and propagation of vertebrate cells in culture (tissue culture) has become a routine procedure in recent years. Examples of such useful host cell lines are VERO and HeLa cells, Chinese hamster ovary (CHO) cell lines, and W138, BHK, COS-7, 293 and MDCK cell lines. Expression vectors for such cells ordinarily include (if necessary) an origin of replication, a promoter located in front of the gene to be expressed, along with any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional terminator sequences.
For use in mammalian cells, the control functions on the expression vectors are often obtained from viral material. For example, commonly used promoters are derived from polyoma, Adenovirus 2, and most frequently Simian Virus 40 (SV40). The early and late promoters of SV40 virus are particularly useful because both are obtained easily from the virus as a fragment which also contains the SV40 viral origin of replication (Fiers et al., 1978). Smaller or larger SV40 fragments may also be used, provided there is included the approximately 250 bp sequence extending from the HindIII site toward the BglI site located in the viral origin of replication.
Further, it is also possible, and often desirable, to utilize promoter or control sequences normally associated with the desired gene sequence, provided such control sequences are compatible with the host cell systems.
The origin of replication may be provided either by construction of the vector to include an exogenous origin, such as may be derived from SV40 or other viral (e.g., Polyoma, Adeno, VSV, BPV) source, or by the host cell chromosomal replication mechanism. If the vector is integrated into the host cell chromosome, the latter is often sufficient.
It will be further understood that certain of the polypeptides may be present in quantities below the detection limits of the Coomassie brilliant blue staining procedure usually employed in the analysis of SDS/PAGE gels, or that their presence may be masked by an inactive polypeptide of similar Mr. Although not necessary to the routine practice of the present invention, it is contemplated that other detection techniques may be employed advantageously in the visualization of particular polypeptides of interest. Immunologically-based techniques such as Western blotting using enzymatically-, radiolabel-, or fluorescently-tagged antibodies described herein are considered to be of particular use in this regard. Alternatively, the peptides of the present invention may be detected by using antibodies of the present invention in combination with secondary antibodies having affinity for such primary antibodies. This secondary antibody may be enzymatically- or radiolabeled, or alternatively, fluorescently- or colloidal gold-tagged. Means for the labeling and detection of such two-step secondary antibody techniques are well-known to those of skill in the art.
2.3 Recombinant Expression of one or more LYST Gene Products
As used throughout, a "LYST/Lyst" gene is intended to mean a LYST or Lyst gene from a mammalian source, with human LYST and murine Lyst genes being most preferred. In keeping with the genetic nomenclature schemes known to those of skill in the art, "LYST" genes are those genes derived from human sources while "Lyst" genes are those genes derived from murine sources. Thus, LYST1 and LYST2 genes are two genes of the "LYST/Lyst" family which are isolated from humans, while Lyst1 and Lyst2 represent two genes of the "LYST/Lyst" family which are their murine homologs, respectively.
In analogous fashion, a "LYST/Lyst" protein is intended to mean a LYST or Lyst protein isolated from a mammalian source, with human and murine peptides being most preferred. In keeping with the genetic nomenclature schemes known to those of skill in the art, "LYST" proteins are those proteins encoded by LYST genes derived from human sources while "Lyst" proteins are those proteins encoded by Lyst genes derived from murine sources. Thus, LYST1 and LYST2 are the proper designations of two proteins of the "LYST/Lyst" protein family which are isolated from humans, while Lyst1 and Lyst2 represent the two homologous proteins of the LYST/Lyst protein family isolated from murines.
Because there are long and short isoforms of these proteins, the inventors have referred throughout the specification to "Lyst1 isoform I," "Lyst1 isoform II," and so forth to distinguish between the two isoforms. Such isoform designations may also be abbreviated as "Lyst1-I" or "Lyst1-II," and so forth. Human protein isoforms may be referred to in corresponding manner: "LYST1-I" and "LYST1-isoform I" describe the long isoform of the human protein, while "LYST1-II" and "LYST1-isoform II" are terms used to describe the short isoform of the human protein. Therefore, Lyst1-I and Lyst1-II are terms used to represent the two isoforms of murine Lyst1, and LYST1-I and LYST1-II are terms used to represent the two isoforms of human LYST1. Similarly, Lyst2-I and Lyst2-II would represent two isoforms of the murine Lyst2 protein, while LYST2-I and LYST2-II would represent two isoforms of the human LYST2 protein.
The present invention also concerns recombinant host cells for expression of an isolated LYST1, Lyst1, LYST2, or Lyst2 gene. It is contemplated that virtually any host cell may be employed for this purpose, but certain advantages may be found in using a bacterial host cell such as E. coli, S. typhimurium, B. subtilis, or others. Expression in eukaryotic cells is also contemplated, such as those derived from yeast, insect, or mammalian cell lines. These recombinant host cells may be employed in connection with "overexpressing" the LYST1, Lyst1, LYST2, or Lyst2 protein, that is, increasing the level of expression over that found naturally in mammalian cells. As is well known to those of skill in the art, many such vectors and host cells are readily available for the recombinant expression of proteins; one particular detailed example of a suitable vector for expression in mammalian cells is that described in U.S. Patent 5,168,050, incorporated herein by reference. However, there is no requirement that a highly purified vector be used, so long as the coding segment employed encodes a protein or peptide of interest (e.g., the LYST1, Lyst1, LYST2, or Lyst2 protein) and does not include any coding or regulatory sequences that would have an adverse effect on cells. Therefore, it will also be understood that useful nucleic acid sequences may include additional residues, such as additional non-coding sequences flanking either of the 5' or 3' portions of the coding region, or may include various regulatory sequences.
After identifying an appropriate epitope-encoding nucleic acid molecule, it may be inserted into any one of the many vectors currently known in the art, so that it will direct the expression and production of the protein or peptide epitope of interest (e.g., the LYST1, Lyst1, LYST2, or Lyst2 protein) when incorporated into a host cell. In a recombinant expression vector, the coding portion of the DNA segment is positioned under the control of a promoter. The promoter may be in the form of the promoter which is naturally associated with a LYST1-, Lyst1-, LYST2-, or Lyst2-encoding nucleic acid segment, as may be obtained by isolating the 5' non-coding sequences located upstream of the coding segment, for example, using recombinant cloning and/or PCR™ technology, in connection with the compositions disclosed herein. Direct amplification of nucleic acids using the PCR™ technology of U.S. Patents 4,683,195 and 4,683,202 (herein incorporated by reference) is particularly contemplated to be useful in such methodologies.
In certain embodiments, it is contemplated that particular advantages will be gained by positioning the LYST1-, Lyst1-, LYST2-, or Lyst2-encoding DNA segment under the control of a recombinant, or heterologous, promoter. As used herein, a recombinant or heterologous promoter is intended to refer to a promoter that is not normally associated with a LYST1-, Lyst1-, LYST2-, or Lyst2-encoding DNA segment in its natural environment. Such promoters may include those normally associated with other genes, and/or promoters isolated from any other bacterial, viral, eukaryotic, or mammalian cell. Naturally, it will be important to employ a promoter that effectively directs the expression of the DNA segment in the particular cell containing the vector comprising the LYST1-, Lyst1-, LYST2-, or Lyst2-encoding nucleic acid segment.
The use of recombinant promoters to achieve protein expression is generally known to those of skill in the art of molecular biology; for example, see Sambrook et al. (1989). The promoters employed may be constitutive or inducible, and can be used under the appropriate conditions to direct high level or regulated expression of the introduced DNA segment. For eukaryotic expression, the currently preferred promoters are those such as CMV, RSV LTR, the SV40 promoter alone, and the SV40 promoter in combination with the SV40 enhancer. In certain embodiments, the expression of recombinant LYST1, Lyst1, LYST2, or Lyst2 protein is carried out using prokaryotic expression systems, and in particular bacterial systems such as E. coli. Such prokaryotic expression of nucleic acid segments of the present invention may be performed using methods known to those of skill in the art, and will likely comprise expression vectors and promoter sequences such as those obtained from tac, trp, lac, lacUV5 or T7 promoters.
For the expression of the LYST1, Lyst1, LYST2, or Lyst2 protein and LYST1-, Lyst1-, LYST2-, or Lyst2-derived epitopes, once a suitable clone or clones have been obtained, whether they be native sequences or genetically-modified, one may proceed to prepare an expression system for the recombinant preparation of the LYST1, Lyst1, LYST2, or Lyst2 protein or peptides derived from one or more of the LYST1, Lyst1, LYST2, or Lyst2 proteins. The engineering of DNA segment(s) for expression in a prokaryotic or eukaryotic system may be performed by techniques generally known to those of skill in recombinant expression. It is believed that virtually any expression system may be employed in the expression of LYST1, Lyst1, LYST2, or Lyst2 proteins or epitopes derived from such proteins.
Alternatively, it may be desirable in certain embodiments to express the gene products or derived epitopes in eukaryotic expression systems. The DNA sequences encoding the desired epitope (either native or mutagenized) may be separately expressed in various eukaryotic systems as is well-known to those of skill in the art.
It is proposed that transformation of host cells with DNA segments encoding such epitopes will provide a convenient means for obtaining the protein or peptide of interest. Genomic sequences are suitable for eukaryotic expression, as the host cell will, of course, process the genomic transcripts to yield functional mRNA for translation into protein.
It is similarly believed that almost any eukaryotic expression system may be utilized for the expression of the proteins of the present invention, or of peptides or epitopes derived from such proteins; e.g., baculovirus-based, glutamine synthase-based or dihydrofolate reductase-based systems may be employed. In preferred embodiments it is contemplated that plasmid vectors incorporating an origin of replication and an efficient eukaryotic promoter, as exemplified by the eukaryotic vectors of the pCMV series, such as pCMV5, will be of most use.
For expression in this manner, one would position the coding sequences adjacent to and under the control of the promoter. It is understood in the art that to bring a coding sequence under the control of such a promoter, one positions the 5' end of the transcription initiation site of the transcriptional reading frame of the protein between about 1 and about 50 nucleotides "downstream" of (i.e., 3' of) the chosen promoter.
Where eukaryotic expression is contemplated, one will also typically desire to incorporate into the transcriptional unit which includes nucleic acid sequences encoding the LYST/Lyst gene product or LYST/Lyst-derived peptides, an appropriate polyadenylation site (e.g., 5'-AATAAA-3') if one was not contained within the original cloned segment. Typically, the poly-A addition site is placed about 30 to 2000 nucleotides "downstream" of the termination site of the protein at a position prior to transcription termination.
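By way of illustration only, the placement rule for the polyadenylation signal described above may be checked computationally. The following is a minimal sketch, not a method of the disclosure: the function name `find_polya_signals`, its parameters, and the toy sequence are hypothetical, and the distance is assumed to be measured from the first nucleotide after the stop codon to the first nucleotide of the signal.

```python
def find_polya_signals(seq, stop_end, min_dist=30, max_dist=2000, signal="AATAAA"):
    """Return 0-based start positions of `signal` in `seq` that lie between
    `min_dist` and `max_dist` nucleotides downstream of `stop_end`.

    `stop_end` is the index of the first nucleotide after the stop codon
    (an assumed convention for "downstream of the termination site").
    """
    seq = seq.upper()
    hits = []
    start = seq.find(signal, stop_end)
    while start != -1:
        distance = start - stop_end
        if min_dist <= distance <= max_dist:
            hits.append(start)
        # Continue searching one nucleotide further along for later copies.
        start = seq.find(signal, start + 1)
    return hits


# Toy transcript: a 9-nt coding region (ATG GCC TAA), a 40-nt spacer,
# the AATAAA signal, and a short tail. None of this is LYST/Lyst sequence.
coding_region = "ATGGCCTAA"
transcript = coding_region + "C" * 40 + "AATAAA" + "G" * 10
hits = find_polya_signals(transcript, stop_end=len(coding_region))
```

In this toy transcript the signal begins 40 nucleotides after the stop codon and is therefore reported; a signal lying closer than about 30 nucleotides would be ignored.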
It is contemplated that virtually any of the commonly employed host cells can be used in connection with the expression of the LYST1, Lyst1, LYST2, or Lyst2 proteins and epitopes derived therefrom in accordance herewith. Examples include cell lines typically employed for eukaryotic expression such as 293, AtT-20, HepG2, VERO, HeLa, CHO, WI 38, BHK, COS-7, RIN and MDCK cell lines. It is further contemplated that the proteins, peptides, or epitopic peptides derived from native or recombinant LYST1, Lyst1, LYST2, or Lyst2 proteins may be "overexpressed", i.e., expressed in increased levels relative to their natural expression in human cells, or even relative to the expression of other proteins in a recombinant host cell containing LYST1-, Lyst1-, LYST2-, or Lyst2-encoding DNA segments. Such overexpression may be assessed by a variety of methods, including radiolabeling and/or protein purification. However, facile and direct methods are preferred, for example, those involving SDS/PAGE and protein staining or Western blotting, followed by quantitative analyses, such as densitometric scanning of the resultant gel or blot. A specific increase in the level of the recombinant protein or peptide in comparison to the level in natural LYST1-, Lyst1-, LYST2-, or Lyst2-producing cells is indicative of overexpression, as is a relative abundance of the specific protein in relation to the other proteins produced by the host cell and, e.g., visible on a gel.
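The two densitometric criteria for overexpression described above may be expressed as simple ratios. The sketch below is illustrative only; the function names, the use of a loading-control band for normalization, and all numerical intensities are assumptions introduced for the example and are not measurements from the present work.

```python
def fold_overexpression(recombinant_band, recombinant_loading,
                        natural_band, natural_loading):
    """Ratio of loading-normalized band intensities (recombinant / natural).

    Each band intensity is divided by the intensity of a loading-control band
    from the same lane before the two lanes are compared.
    """
    if min(recombinant_loading, natural_loading, natural_band) <= 0:
        raise ValueError("loading controls and the natural band must be positive")
    return (recombinant_band / recombinant_loading) / (natural_band / natural_loading)


def fraction_of_lane(band, all_bands_in_lane):
    """Relative abundance of one band among all quantified bands in a lane."""
    return band / sum(all_bands_in_lane)


# Hypothetical densitometry readings in arbitrary units.
fold = fold_overexpression(recombinant_band=1200.0, recombinant_loading=100.0,
                           natural_band=150.0, natural_loading=100.0)
share = fraction_of_lane(1200.0, [1200.0, 400.0, 400.0])
```

With these invented readings, the recombinant lane shows an eight-fold increase over the natural level, and the band of interest accounts for 60% of the quantified signal in its lane.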
As used herein, the term "engineered" or "recombinant" cell is intended to refer to a cell into which a recombinant gene, such as a gene encoding LYST1, Lyst1, LYST2, or Lyst2, has been introduced. Therefore, engineered cells are distinguishable from naturally occurring cells which do not contain a recombinantly introduced gene. Engineered cells are thus cells having a gene or genes introduced through the hand of man. Recombinantly introduced genes will either be in the form of a single structural gene, an entire genomic clone comprising a structural gene and flanking DNA, or an operon or other functional nucleic acid segment which may also include genes positioned either upstream and/or downstream of the promoter, regulatory elements, with or without introns, or a cDNA clone comprising the structural gene itself, or even genes not naturally associated with the particular gene of interest.
Where the introduction of a recombinant version of one or more of the foregoing genes is required, it will be important to introduce the gene such that it is under the control of a promoter that effectively directs the expression of the gene in the cell type chosen for engineering. In general, one will desire to employ a promoter that allows constitutive (constant) expression of the gene of interest. Commonly used constitutive eukaryotic promoters include viral promoters such as the cytomegalovirus (CMV) promoter, the Rous sarcoma long-terminal repeat (LTR) sequence, or the SV40 early gene promoter. The use of these constitutive promoters will ensure a high, constant level of expression of the introduced genes. The inventors have noticed that the level of expression from the introduced genes of interest can vary in different clones, or genes isolated from different strains or bacteria. Thus, the level of expression of a particular recombinant gene can be chosen by evaluating different clones derived from each transfection study; once that line is chosen, the constitutive promoter ensures that the desired level of expression is permanently maintained. It may also be possible to use promoters that are specific for the cell type used for engineering, such as the insulin promoter in insulinoma cell lines, or the prolactin or growth hormone promoters in anterior pituitary cell lines.
2.4 Detection of LYST/Lyst Gene Products
A further aspect of the invention is the preparation of immunological compositions, and in particular anti-LYST/Lyst antibodies for diagnostic and therapeutic methods relating to the detection and diagnosis of CHS. Methods for diagnosing CHS and for detecting LYST/Lyst-encoding nucleic acid segments in clinical samples using nucleic acid compositions are also provided by the invention. The nucleic acid sequences encoding LYST/Lyst are useful as diagnostic probes using conventional techniques such as Southern hybridization analyses or Northern hybridization analyses to detect the presence of LYST/Lyst nucleic acid segments within a clinical sample from a patient suspected of having such a condition. In a preferred embodiment, nucleic acid sequences as disclosed in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11 and SEQ ID NO:13 are preferable as probes for such hybridization analyses.
2.5 Methods for Producing an Immune Response
Also disclosed is a method of generating an immune response in an animal. The method generally involves administering to an animal a pharmaceutical composition comprising an immunologically effective amount of a peptide composition disclosed herein. Preferred peptide compositions include the peptide disclosed in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, or SEQ ID NO:14.
The invention also encompasses LYST/Lyst and LYST/Lyst-derived peptide antigen compositions together with pharmaceutically-acceptable excipients, carriers, diluents, adjuvants, and other components, such as additional peptides, antigens, or outer membrane preparations, as may be employed in the formulation of particular vaccines.
Antibodies may be of several types including those raised in heterologous donor animals or human volunteers immunized with the LYST/Lyst gene product, monoclonal antibodies (mAbs) resulting from hybridomas derived from fusions of B cells from immunized animals or humans with compatible myeloma cell lines, so-called "humanized" mAbs resulting from expression of gene fusions of complementarity determining regions of mAb-encoding genes from heterologous species with genes encoding human antibodies, or LYST/Lyst-reactive antibody-containing fractions of plasma from human donors suspected of having CHS. It is contemplated that any of the techniques described above might be used for the vaccination of subjects for the purpose of antibody production. Optimal dosing of such antibodies is highly dependent upon the pharmacokinetics of the specific antibody population in the particular species to be treated.
Using the peptide antigens described herein, the present invention also provides methods of generating an immune response, which methods generally comprise administering to an animal a pharmaceutically-acceptable composition comprising an immunologically effective amount of a LYST/Lyst peptide composition. Preferred animals include mammals, and particularly humans. Other preferred animals include murines, bovines, equines, porcines, canines, and felines. The composition may include partially or significantly purified LYST/Lyst peptide epitopes, obtained from natural or recombinant sources, which proteins or peptides may be obtained naturally, chemically synthesized, or alternatively produced in vitro from recombinant host cells expressing DNA segments encoding such epitopes. Smaller peptides that include reactive epitopes, such as those between about 10 and about 50, or even between about 50 and about 100 amino acids in length, will often be preferred. The antigenic proteins or peptides may also be combined with other agents, such as other LYST/Lyst-related peptides or nucleic acid compositions, if desired.
By "immunologically effective amount" is meant an amount of a peptide composition that is capable of generating an immune response in the recipient animal. This includes both the generation of an antibody response (B cell response), and/or the stimulation of a cytotoxic immune response (T cell response). The generation of such an immune response will have utility in both the production of useful bioreagents, e.g., CTLs and, more particularly, reactive antibodies, for use in diagnostic embodiments, and will also have utility in various prophylactic or therapeutic embodiments. Therefore, although these methods for the stimulation of an immune response include vaccination regimens and treatment regimens, it will be understood that achieving either of these end results is not necessary for practicing these aspects of the invention. Further means contemplated by the inventors for generating an immune response in an animal include administering to the animal, or human subject, a pharmaceutically-acceptable composition comprising an immunologically effective amount of a nucleic acid composition encoding a LYST/Lyst epitope, or an immunologically effective amount of an attenuated live organism that includes and expresses such a nucleic acid composition. The "immunologically effective amounts" are those amounts capable of stimulating a B cell and/or T cell response.
Immunoformulations of this invention, whether intended for vaccination, treatment, or for the generation of antibodies useful in the detection of CHS, may comprise native or synthetically-derived antigenic peptide fragments from these proteins. As such, antigenic functional equivalents of the proteins and peptides described herein also fall within the scope of the present invention.
An "antigenically functional equivalent" protein or peptide is one that incorporates an epitope that is immunologically cross-reactive with one or more epitopes derived from any of the particular proteins disclosed. Antigenically functional equivalents, or epitopic sequences, may be first designed or predicted and then tested, or may simply be directly tested for cross-reactivity.
The identification or design of suitable epitopes, and/or their functional equivalents, suitable for use in immunoformulations, vaccines, or simply as antigens (e.g., for use in detection protocols), is a relatively straightforward matter. For example, one may employ the methods of Hopp, as enabled in U.S. Patent 4,554,101, incorporated herein by reference, that teaches the identification and preparation of epitopes from amino acid sequences on the basis of hydrophilicity. The methods described in several other papers, and software programs based thereon, can also be used to identify epitopic core sequences; for example, Chou and Fasman (1974a,b, 1978a,b, 1979), Jameson and Wolf (1988), Wolf et al. (1988), and Kyte and Doolittle (1982) address this subject. The amino acid sequence of these "epitopic core sequences" may then be readily incorporated into peptides, either through the application of peptide synthesis or recombinant technology.
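As an illustration of how a hydropathy scale may be used to flag candidate epitopic core sequences, the sketch below computes a sliding-window profile using the published Kyte and Doolittle (1982) scale, on which more negative values indicate more hydrophilic residues. It is not the method of Hopp referenced above; the function names, the window length of 7 residues, and the toy sequence are assumptions made for the example.

```python
# Kyte and Doolittle (1982) hydropathy values; negative = hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Mean hydropathy of every contiguous window along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(protein, window=7):
    """Return (start index, segment, mean hydropathy) of the lowest-scoring window."""
    profile = hydropathy_profile(protein, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, protein.upper()[start:start + window], profile[start]


# Toy sequence: a charged stretch flanked by hydrophobic residues.
toy_protein = "LLLLLLL" + "KDERKDE" + "VVVVVVV"
start, segment, mean_score = most_hydrophilic_window(toy_protein)
```

For the toy sequence, the charged stretch in the middle is selected as the most hydrophilic window. Any such prediction would still need to be confirmed by testing for immunological cross-reactivity.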
It is proposed that the use of shorter antigenic peptides, e.g., about 25 to about 50, or even about 15 to 25 amino acids in length, that incorporate epitopes of the LYST/Lyst protein will provide advantages in certain circumstances, for example, in the preparation of vaccines or in immunologic detection assays. Exemplary advantages include the ease of preparation and purification, the relatively low cost and improved reproducibility of production, and advantageous biodistribution.
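Candidate peptides in the length range mentioned above can be enumerated by tiling a protein sequence with overlapping windows. The following is a minimal sketch under stated assumptions: the function name `tile_peptides`, the default length of 20 residues, the step of 5 residues, and the placeholder sequence are all hypothetical and are not taken from the sequences of the invention.

```python
def tile_peptides(protein, length=20, step=5):
    """Return (start index, peptide) pairs for overlapping windows of `length`.

    `length` is restricted to the roughly 15 to 50 residue range discussed
    in the text; `step` sets the offset between consecutive peptides.
    """
    if not 15 <= length <= 50:
        raise ValueError("length should be between 15 and 50 residues")
    if step < 1:
        raise ValueError("step must be at least 1")
    return [(i, protein[i:i + length])
            for i in range(0, len(protein) - length + 1, step)]


# Placeholder 40-residue sequence made of the 20 standard amino acids, twice.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 2
peptides = tile_peptides(placeholder, length=20, step=10)
```

A smaller step gives more overlap between neighboring peptides, at the cost of synthesizing and testing more of them.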
In still further embodiments, the present invention concerns immunodetection methods and associated kits. It is contemplated that the proteins or peptides of the invention may be employed to detect antibodies having reactivity therewith, or, alternatively, antibodies prepared in accordance with the present invention may be employed to detect LYST/Lyst proteins or peptides. Either type of kit may be used in the immunodetection of compounds, present within clinical samples, that are indicative of CHS. The kits may also be used in antigen or antibody purification, as appropriate.
In general, the preferred immunodetection methods will include first obtaining a sample suspected of containing a LYST/Lyst-reactive antibody, such as a biological sample from a patient, and contacting the sample with a first LYST/Lyst protein or peptide under conditions effective to allow the formation of an immunocomplex (primary immune complex). One then detects the presence of any primary immunocomplexes that are formed. Preferable LYST/Lyst proteins include LYST1 and LYST2 from human origins, and Lyst1 and Lyst2 proteins derived from murine origins.
Contacting the chosen sample with the LYST/Lyst protein or peptide under conditions effective to allow the formation of (primary) immune complexes is generally a matter of simply adding the protein or peptide composition to the sample. One then incubates the mixture for a period of time sufficient to allow the added antigens to form immune complexes with, i.e., to bind to, any antibodies present within the sample. After this time, the sample composition, such as a tissue section, ELISA plate, dot blot or western blot, will generally be washed to remove any non-specifically bound antigen species, allowing only those specifically bound species within the immune complexes to be detected.
The detection of immunocomplex formation is well known in the art and may be achieved through the application of numerous approaches known to the skilled artisan and described in various publications, such as, e.g., Nakamura et al. (1987), incorporated herein by reference. Detection of primary immune complexes is generally based upon the detection of a label or marker, such as a radioactive, fluorescent, biological or enzymatic label, with enzyme tags such as alkaline phosphatase, urease, horseradish peroxidase and glucose oxidase being suitable. The particular antigen employed may itself be linked to a detectable label, wherein one would then simply detect this label, thereby allowing the amount of bound antigen present in the composition to be determined. Alternatively, the primary immune complexes may be detected by means of a second binding ligand that is linked to a detectable label and that has binding affinity for the first protein or peptide. The second binding ligand is itself often an antibody, which may thus be termed a "secondary" antibody. The primary immune complexes are contacted with the labeled, secondary binding ligand, or antibody, under conditions effective and for a period of time sufficient to allow the formation of secondary immune complexes. The secondary immune complexes are then generally washed to remove any non-specifically bound labeled secondary antibodies and the remaining bound label is then detected.
For diagnostic purposes, it is proposed that virtually any sample suspected of containing the antibodies of interest may be employed. Exemplary samples include clinical samples obtained from a patient such as blood or serum samples, bronchoalveolar fluid, ear swabs, sputum samples, middle ear fluid or even perhaps urine samples. This allows for the diagnosis of CHS and related disorders. Furthermore, it is contemplated that such embodiments may have application to non-clinical samples, such as in the titering of antibody samples, in the selection of hybridomas, and the like. Alternatively, the clinical samples may be from veterinary sources and may include such domestic animals as cattle, sheep, and goats. Samples from feline, canine, and equine sources may also be used in accordance with the methods described herein.
In related embodiments, the present invention contemplates the preparation of kits that may be employed to detect the presence of LYST/Lyst-specific antibodies in a sample. Generally speaking, kits in accordance with the present invention will include a suitable protein or peptide together with an immunodetection reagent, and a means for containing the protein or peptide and reagent.
The immunodetection reagent will typically comprise a label associated with a LYST/Lyst protein or peptide, or associated with a secondary binding ligand. Exemplary ligands might include a secondary antibody directed against the first LYST/Lyst protein or peptide or antibody, or a biotin or avidin (or streptavidin) ligand having an associated label. Detectable labels linked to antibodies that have binding affinity for a human antibody are also contemplated, e.g., for protocols where the first reagent is a LYST/Lyst peptide that is used to bind to a reactive antibody from a human sample. Of course, as noted above, a number of exemplary labels are known in the art and all such labels may be employed in connection with the present invention. The kits may contain antigen or antibody-label conjugates either in fully conjugated form, in the form of intermediates, or as separate moieties to be conjugated by the user of the kit.
The container means will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which the antigen may be placed, and preferably suitably allocated. Where a second binding ligand is provided, the kit will also generally contain a second vial or other container into which this ligand or antibody may be placed. The kits of the present invention will also typically include a means for containing the vials in close confinement for commercial sale, such as, e.g., injection or blow-molded plastic containers into which the desired vials are retained.
2.6 Formulation as Vaccines
It is expected that to achieve an "immunologically effective formulation" it may be desirable to administer LYST or Lyst proteins to the human or animal subject in a pharmaceutically acceptable composition comprising an immunologically effective amount of LYST or Lyst proteins or peptides mixed with other excipients, carriers, or diluents which may improve or otherwise alter stimulation of B cell and/or T cell responses, or immunologically inert salts, organic acids and bases, carbohydrates, and the like, which promote stability of such mixtures. Immunostimulatory excipients, often referred to as adjuvants, may include salts of aluminum (often referred to as Alums), simple or complex fatty acids and sterol compounds, physiologically acceptable oils, polymeric carbohydrates, chemically or genetically modified protein toxins, and various particulate or emulsified combinations thereof. LYST or Lyst proteins or peptides within these mixtures, or each variant if more than one are present, would be expected to comprise about 0.0001 to 1.0 milligrams, or more preferably about 0.001 to 0.1 milligrams, or even more preferably less than 0.1 milligrams per dose.
It is also contemplated that attenuated organisms may be engineered to express recombinant LYST or Lyst proteins or peptides, and the organisms themselves be delivery vehicles for the invention. Pox-, polio-, adeno-, or other viruses, and bacteria such as Salmonella, Shigella, Listeria, Streptococcus species may also be used in conjunction with the methods and compositions disclosed herein.
The naked DNA technology, often referred to as genetic immunization, has been shown to be suitable for protection against infectious organisms. Such DNA segments could be used in a variety of forms including naked DNA and plasmid DNA, and may be administered to the subject in a variety of ways including parenteral, mucosal, and so-called microprojectile-based "gene-gun" inoculations. The use of LYST or Lyst nucleic acid compositions of the present invention in such immunization techniques is thus proposed to be useful as a vaccination strategy against Lyme disease.
It is recognized by those skilled in the art that an optimal dosing schedule of a vaccination regimen may include as many as five to six, but preferably three to five, or even more preferably one to three administrations of the immunizing entity given at intervals of as few as two to four weeks, to as long as five to ten years, or occasionally at even longer intervals.
2.7 USE OF LYST1 PEPTIDES/APTAMERS AS PHARMACEUTICALS THAT BLOCK OR MIMIC LYST1 FUNCTION
Lyst regulates degranulation of lysosomes, late endosomes and acidic secretory granules primarily in leukocytes. Blockade of such degranulation using dominant-negatively acting truncated Lyst peptides may reasonably be expected to be efficacious in inflammatory and autoimmune diseases such as asthma, urticaria, inflammatory bowel disease, systemic lupus erythematosus, rheumatoid arthritis, psoriasis, systemic vasculitis, glomerulonephritis, multiple sclerosis, and post-angioplasty restenosis. Proof of this principle is documented in Clark et al. (1982), who demonstrated that bg mice are protected from lupus nephritis.
2.8 USE OF PHARMACEUTICAL COMPOUNDS THAT BLOCK OR MIMIC LYST1 FUNCTION
Lyst regulates degranulation of lysosomes, late endosomes and acidic secretory granules primarily in leukocytes. Blockade of such degranulation using dominant-negatively acting truncated Lyst peptides may reasonably be expected to be efficacious in inflammatory and autoimmune diseases such as asthma, urticaria, inflammatory bowel disease, systemic lupus erythematosus, rheumatoid arthritis, psoriasis, systemic vasculitis, glomerulonephritis, multiple sclerosis, and post-angioplasty restenosis. Proof of this principle is documented in Clark et al. (1982), who demonstrated that bg mice are protected from lupus nephritis.
Lyst peptides that mimic or augment Lyst function may reasonably be expected to be efficacious in the treatment of neoplasia. Proof of this principle is documented in Aboud et al. (1993) and Hayakawa et al. (1986), who demonstrate that bg mice and CHS patients are susceptible to development of neoplasia, and have more aggressive neoplasms with accelerated metastases.
2.9 USE OF LYST2 PEPTIDES/APTAMERS AS PHARMACEUTICAL AGENTS THAT BLOCK LYST2 FUNCTION OR REPRODUCE LYST2 FUNCTIONS
Lyst2 is thought to act to regulate degranulation of vesicles within cells in the brain and kidney. Blockade of such degranulation using dominant-negatively acting truncated Lyst2 peptides may reasonably be expected to be efficacious for the treatment of neurologic and renal degenerative diseases such as Alzheimer's disease, motor neuron disease, Parkinson's disease, acute tubular necrosis, glomerulonephritis and glomerulosclerosis.
2.10 USE OF PHARMACEUTICAL COMPOUNDS THAT BLOCK OR MIMIC LYST2 FUNCTIONS
Drugs that mimic the action of dominant-negatively acting truncated Lyst2 peptides are also contemplated. Lyst2 is thought to act to regulate degranulation of vesicles within cells in the brain and kidney. Blockade of such degranulation using dominant-negatively acting truncated Lyst2 peptides may reasonably be expected to be efficacious for the treatment of neurologic and renal degenerative diseases such as Alzheimer's disease, motor neuron disease, Parkinson's disease, acute tubular necrosis, glomerulonephritis and glomerulosclerosis.
3. BRIEF DESCRIPTION OF THE DRAWINGS
The drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.
FIG. 1A. Ethidium bromide-stained pulsed field gels of DNA from clones derived from a mouse YAC library. YAC clone numbers are shown above each panel and molecular size standards (in kilobases) to the left of each panel. 1380 is the host S. cerevisiae strain and does not contain a YAC. Sizes of YAC clones are 151H1 = 950-kb, 195A8 = 650-kb, 64F5 = 580-kb, 93E4 = 370-kb, 68E12 = 500-kb, 55F3 = 550-kb, 135G3 = 750-kb, 148H8 = 1000-kb, 84A8 = 370-kb, 148E11 = 650-kb, 165F7 = 500-kb.
FIG. 1B. Autoradiographs of corresponding Southern blots from the gels shown in FIG. 1A, hybridized with pBR322 (which cross-hybridizes to pYAC4). YAC clone numbers are shown above each panel and molecular size standards (in kilobases) to the left of each panel. 1380 is the host S. cerevisiae strain and does not contain a YAC. Sizes of YAC clones are 151H1 = 950-kb, 195A8 = 650-kb, 64F5 = 580-kb, 93E4 = 370-kb, 68E12 = 500-kb, 55F3 = 550-kb, 135G3 = 750-kb, 148H8 = 1000-kb, 84A8 = 370-kb, 148E11 = 650-kb, 165F7 = 500-kb.
FIG. 2. STS content mapping of bg critical region YAC and P1 clones. The presence of an STS (y-axis) in a YAC/P1 clone (x-axis) is indicated by a filled box. Each contig is identified by the degree of shading of the box. The bg critical region extends from proximal to D13MM34 to the interval between D13Mit207 and D13Mit162/D13Mit305 (crossover location indicated by a double line). STSs used for isolation of YAC clones were Nid5' for 151H1, 195A8, 64F5, 93E4, 68E12, and 55F3; Estm9 for 148E11; D13MM34 for 165F7; D13Sfk13 for 84A8; and D13Mit207 for 135G3 and 148H8. P1 clones 8591 and 8592 were identified with D13Sfk13. YAC clone 64F5 is chimeric, and YAC clone 84A8 has acquired an internal deletion which includes D13Sfk6. The relative orientation with respect to the centromere of the contig composed of the 9 clones 195A8-55F3 has not been established. The position of clones 165F7 and 148E11 with respect to this contig has not been established.
FIG. 3A. Genetic mapping of bg on mouse Chr 13. Haplotype analysis of proximal mouse Chr 13 genetic markers in 504 C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1 backcross mice. Closed boxes represent the homozygous C3H pattern and open boxes the F1 pattern. The number of mice of each haplotype is indicated.
FIG. 3B. Genetic mapping of bg on mouse Chr 13. Haplotype analysis of proximal mouse Chr 13 genetic markers in 111 (C57BL/6J-Wsh-bgJ x Mus domesticus PAC)F1 x C57BL/6J-bg backcross mice. Closed boxes represent the homozygous C3H pattern and open boxes the F1 pattern. The number of mice of each haplotype is indicated.
FIG. 3C. Genetic mapping of bg on mouse Chr 13. Haplotype analysis of proximal mouse Chr 13 genetic markers in 111 (C57BL/6J-A-bgJ x Mus musculus PWK)F1 x C57BL/6J-bg backcross mice. Closed boxes represent the homozygous C3H pattern and open boxes the F1 pattern. The number of mice of each haplotype is indicated.
FIG. 3D. Genetic mapping of bg on mouse Chr 13. Composite linkage map of mouse Chr 13 in the vicinity of bg. Loci are positioned according to their approximate relative positions, which were ascertained by integration of data from the above three backcrosses and from Dietrich et al. (1994).
FIG. 4. Autoradiograph of a pulsed field gel Southern blot of mouse DNA probed with Nid. Restriction endonucleases are shown above the panel and molecular size standards (in kilobases) to the left. +/+ = DBA/2J DNA, bg = SB-bg/bg DNA. Upon reprobing this blot with Gli 3 or Estm9, all DBA/2J and SB-bg bands were of identical size.
FIG. 5A. DNA sequence of the CH gene (LYST1) from position 1 to position 1400. The DNA sequence continues in FIG. 5B.
FIG. 5B. Continuation of the DNA sequence of the CH gene in FIG. 5A, beginning at position 1401 and continuing to position 2800.
FIG. 5C. Continuation of the DNA sequence of the CH gene in FIG. 5B, beginning at position 2801 and continuing to position 3514.
FIG. 6. Amino acid sequence of the CH protein
FIG. 7A. Genetic mapping of the Bl gene
FIG. 7B. Genetic mapping of the bg and Bl genes
FIG. 7C. Detailed map showing the localization of the bg and Bl genes.
FIG. 8. Bl cDNA clones.
FIG. 9A. Deletion of part of Bl in bg11J. Probe used in Southern analysis is probe A from FIG. 8.
FIG. 9B. Deletion of part of Bl in bg. Probe used in Southern analysis is probe B from FIG. 8.
FIG. 9C. Deletion of part of Bl in bg11J. Probe used in Southern analysis is probe C from FIG. 8. (In bg11J, a deletion from bp 1250 to 2400 was observed.)
FIG. 10. Physical mapping of Bl gene within bg critical region
FIG. 11. Genetic and physical map of the bg non-recombinant interval on mouse chromosome 13 showing the location of Lyst. Mouse chromosome 13 is shown by the horizontal line with the centromere on the left. The bg critical region is delineated by chromosome crossovers (denoted with an X) in animals 134 and 137 of an interspecific mouse backcross [C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1]. Microsatellite markers D13Mit172 and D13Mit239 flank bg proximally; D13Mit162 and D13Mit305 lie distal to bg (indicated by turquoise circles). YAC and P1 clones identified by PCR™ screening (Kusumi et al., 1993; Pierce et al., 1992) with oligonucleotides corresponding to Nid or D13Sfk13 are shown above the chromosome. Novel sequence-tagged sites (STS, indicated by dark blue circles), generated by inverse repetitive element PCR or direct cDNA selection, were used to order clones within the contiguous array. Novel mouse chromosome 13 STSs are numbered 1-18, corresponding to D13Sfk1 to D13Sfk18, respectively. Lyst was isolated from YAC 195A8, a 650-kb clone, by using direct cDNA selection. The physical locations of Lyst-associated STSs on YAC and P1 clones are shown in red (MGD accession number MGD-PMEX-13).
FIG. 12A. Intragenic deletion of Lyst in bg11J DNA. Southern blot identification of an intragenic Lyst deletion in bg11J DNA. A Southern blot was sequentially hybridized (Barbosa et al., 1995) with 3 Lyst probes. This panel shows the probe (nucleotides 1,262-3,433 of Lyst cDNA) which extends upstream from the bg11J deletion. Restriction endonucleases are indicated at the bottom of the panel, and molecular size standards (in kb) are shown to the left. Similar results were obtained with 3 additional restriction endonucleases. The bg11J mutation was discovered in 1992 at The Jackson Laboratory in a C57BL/6J-jb mouse at generation N4 after transfer from B6C3Fe-a/a-jb. The mutation jb had, in turn, been discovered 14 generations earlier in B6C3Fe-a/a-hyh mice at generation N3 after transfer from C57BL/10J. The hyh mutation arose in C57BL/10J mice, and was maintained in that strain until transfer at F15. Thus the possible contributors of genetic information to bg11J include C57BL/6J, C3HeB/FeJ and C57BL/10J. Southern blots were prepared from genomic DNA of all potential progenitor mouse strains, but only C57BL/10J, C57BL/6J and C57BL/6J-bg11J are shown.
FIG. 12B. Intragenic deletion of Lyst in bg11J DNA. Southern blot identification of an intragenic Lyst deletion in bg11J DNA. A Southern blot was sequentially hybridized (Barbosa et al., 1995) with 3 Lyst probes. This panel shows that the probe (nucleotides 2,835-3,433 of Lyst cDNA) is completely deleted. Restriction endonucleases are indicated at the bottom of each panel, and molecular size standards (in kb) are shown to the left.
FIG. 12C. Intragenic deletion of Lyst in bg11J DNA. Southern blot identification of an intragenic Lyst deletion in bg11J DNA. A Southern blot was sequentially hybridized (Barbosa et al., 1995) with 3 Lyst probes. Shown in this panel are results when the probe (nucleotides 3,594-4,237 of Lyst cDNA) extends downstream from the bg11J deletion. Restriction endonucleases are indicated at the bottom of each panel, and molecular size standards (in kb) are shown to the left. Similar results were obtained with 3 additional restriction endonucleases.
FIG. 12D. Intragenic deletion of Lyst in bg11J DNA. PCR™ analysis of the bg11J deletion. C57BL/10J, C3HeB/FeJ, C57BL/6J and C57BL/6J-bg11J genomic DNA and Lyst cDNA were used as templates in the PCR™ reactions. Amplicons illustrated correspond to Lyst cDNA nucleotides 1,337-1,837, which represent exon β and are upstream from the deletion. No amplicon was observed in control PCR™ reactions performed without template. More than 30 other STSs that had been localized within the bg non-recombinant interval PCR™ amplified normally from bg11J DNA.
FIG. 12E. Intragenic deletion of Lyst in bg11J DNA. PCR™ analysis of the bg11J deletion. C57BL/10J, C3HeB/FeJ, C57BL/6J and C57BL/6J-bg11J genomic DNA and Lyst cDNA were used as templates in the PCR™ reactions. Amplicons illustrated correspond to nucleotides 2,670-3,210, which represent exon γ, which is deleted in bg11J DNA. No amplicon was observed in control PCR™ reactions performed without template. More than 30 other STSs that had been localized within the bg non-recombinant interval PCR™ amplified normally from bg11J DNA.
FIG. 12F. Intragenic deletion of Lyst in bg11J DNA. PCR™ analysis of the bg11J deletion. C57BL/10J, C3HeB/FeJ, C57BL/6J and C57BL/6J-bg11J genomic DNA and Lyst cDNA were used as templates in the PCR™ reactions. Amplicons illustrated correspond to nucleotides 4,913-5,433, which represent an exon downstream from the deletion. No amplicon was observed in control PCR™ reactions performed without template.
FIG. 12G. Intragenic deletion of Lyst in bg11J DNA. Genomic structure of Lyst in the vicinity of the bg11J deletion. Lyst exons (α, β, γ, δ, ε, and φ) are depicted by black boxes, and intervening introns by a solid line. Nucleotides of the mouse Lyst cDNA that correspond to exonic boundaries are indicated above the boxes. The 3' end of exon β, and all of exons γ and δ, are deleted in bg11J DNA. The region of Lyst protein that is deleted in bg11J contains a pair of helices with N-terminal phosphorylation sites. Genomic structure and intronic sequences were ascertained by sequence analysis of nested PCR™ products, performed with exonic primers and P1 clone DNA as template (Kingsmore et al., 1994). Boundaries of the bg11J deletion were determined by PCR™ of genomic DNA.
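The coordinates given in the legends to FIGS. 12B, 12D, 12E and 12F constrain where the intragenic Lyst deletion can begin and end on the cDNA. The following Python sketch illustrates that reasoning only; it is not a method described in this specification. It assumes, as a simplification, that a probe or amplicon reported as deleted lies entirely inside the deletion and that an amplicon reported as upstream or downstream of the deletion lies entirely outside it. The names `Region` and `deletion_bounds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int      # first Lyst cDNA nucleotide (1-based, inclusive)
    end: int        # last Lyst cDNA nucleotide (inclusive)
    deleted: bool   # True if the legend reports the region as deleted


# Coordinates copied from the figure legends; the deleted/retained flags
# follow the wording of each legend (assumption: all-or-nothing overlap).
REGIONS = [
    Region("PCR amplicon, FIG. 12D", 1337, 1837, deleted=False),
    Region("PCR amplicon, FIG. 12E", 2670, 3210, deleted=True),
    Region("Southern probe, FIG. 12B", 2835, 3433, deleted=True),
    Region("PCR amplicon, FIG. 12F", 4913, 5433, deleted=False),
]


def deletion_bounds(regions):
    """Return ((min_start, max_start), (min_end, max_end)) for the deletion."""
    deleted = [r for r in regions if r.deleted]
    retained = [r for r in regions if not r.deleted]
    # The deletion must cover every region reported as deleted.
    covered_start = min(r.start for r in deleted)
    covered_end = max(r.end for r in deleted)
    # It must not reach into any retained region on either side.
    last_upstream = max(r.end for r in retained if r.end < covered_start)
    first_downstream = min(r.start for r in retained if r.start > covered_end)
    start_range = (last_upstream + 1, covered_start)
    end_range = (covered_end, first_downstream - 1)
    return start_range, end_range


start_range, end_range = deletion_bounds(REGIONS)
print("deletion starts between cDNA nucleotides", start_range)
print("deletion ends between cDNA nucleotides", end_range)
```

Under these assumptions the deletion would begin somewhere between nucleotides 1,838 and 2,670 and end somewhere between nucleotides 3,433 and 4,912 of the Lyst cDNA. The probes of FIGS. 12A and 12C are left out of the sketch because the legends describe them only as extending upstream or downstream from the deletion, so they may partly overlap it.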
FIG. 13A. Northern blot analysis of mouse and human Lyst. Northern blots of 2 μg poly(A)+ RNA from various mouse tissues (Clontech) hybridized with probes that correspond to nucleotides 4,423-4,621 of mouse Lyst cDNA.
FIG. 13B. Northern blot analysis of mouse and human Lyst. Northern blots of 2 μg poly(A)+ RNA from various mouse tissues (Clontech) hybridized with probes that correspond to nucleotides 1,430-2,457 (exon β) of mouse Lyst cDNA. Molecular size standards (in kb) are shown to the left. Hybridization of mouse mRNA with probes from mouse Lyst exons α and γ gave identical results to those shown with exon β, whereas probes from exons δ, ε, and φ gave results identical to those shown in FIG. 13A.
FIG. 13C. Northern blot analysis of mouse and human Lyst. Northern blot of 2 μg poly(A)+ RNA from various human lymphoid tissues, hybridized with a probe that corresponds to nucleotides 357-800 of human LYST cDNA. Molecular size standards (in kb) are shown to the left.
FIG. 13D. Northern blot analysis of mouse and human Lyst. Northern blot of 2 μg poly(A)+ RNA from human cancer cell lines, hybridized with a probe that corresponds to nucleotides 357-800 of human LYST cDNA. Molecular size standards (in kb) are shown to the left.
FIG. 14A. Mutation analysis of LYST cDNA from CHS patients. A Northern blot of 2 μg aliquots of lymphoblastoid poly(A)+ RNA from CHS patients and a control. The probe used for hybridization corresponds to nucleotides 490 to 817 of LYST.
FIG. 14B. SSCP analysis of cDNA corresponding to LYST nucleotides 439 to 806. Each lane contains samples from individual patients as indicated. Note the appearance of an extra band in lanes corresponding to patients 371 and 373.
FIG. 14C. Sequence chromatograms showing mutations in LYST cDNA clones from patients 371 and 373. The upper part is normal human LYST cDNA sequence. The arrows indicate the positions of a G insertion (patient 371) and C to T substitution (patient 373). The antisense strand of LYST is shown.
FIG. 15A. Physical mapping of LYST. Monochromosomal somatic cell hybrid blot (BIOS Laboratories, New Haven, Connecticut) containing DNA from 24 somatic cell hybrid cell lines and three control DNAs (human, hamster, or mouse) digested with EcoRI. The cell line and chromosome number are indicated at the top of the figure. *Mix lane consists of 1.5 mg human DNA and 4.5 mg mouse DNA. **Human/hamster somatic cell hybrid. All others are human/mouse hybrids. Molecular size standards (in kb) are shown to the right. The blot was hybridized with a probe corresponding to nucleotides 2923-4865 of human LYST cDNA.
FIG. 15B. Southern blot of CHS critical region YACs digested with TaqI. The YAC coordinates are indicated at the top and molecular size standards (in kb) are shown to the left. The probe used for the hybridization corresponds to LYST nucleotides 490 to 817.
FIG. 15C. The same Southern blot shown in panel B, rehybridized with a probe corresponding to LYST nucleotides 4551 to 4977. A third LYST probe (corresponding to nucleotides 3032-4722) also hybridized to the same YAC clones.
FIG. 15D. Physical map of human chromosome 1 showing the location of LYST within a YAC contig of the CHS critical region (Barrat et al., 1996). The upper part represents chromosome 1. The microsatellite markers D1S179 (centromeric) and WI-12396 (telomeric) flank the CHS locus. YAC clones are shown below the chromosome. The figure is not drawn to scale.
FIG. 16A. Genomic organization of LYST. Schematic representation of PCR™ clones corresponding to the human LYST cDNA (Genbank accession number U70064). The solid and open bars represent the LYST coding region and the 5' UTR, respectively. Nucleotide 5095 corresponds to the transition between sequences conserved with Lyst (Barbosa et al., 1996) and BG (Perou et al., 1996a). The three human ESTs identified by database searches with the mouse sequence (#1, #2 and #3; Genbank accession numbers #1, L77889; #2, W26957; #3, H51623) are shown at the top. Clones #4, #5, #6 and #8 are RT-PCR™ products. Clone #7 is a 2 kb 5' RACE product.
FIG. 16B. Alternative splicing of mouse Lyst. Solid boxes represent Lyst exons σ and τ. Splicing of exon σ to exon τ occurs in the Lyst-I mRNA (12 kb). The hatched box represents the intronic region that forms the 3' end of the Lyst-II ORF (5.9 kb). An asterisk indicates a stop codon and an 'A' indicates a polyadenylation signal within the intron. Nucleotide positions indicated are from Genbank accession numbers L77884 (Lyst-II) and U70015 (Lyst-I).
FIG. 16C. Detection of Lyst-I and Lyst-II by RT-PCR™ and genomic PCR™. DNAse-treated mouse melanocyte RNA was reverse transcribed and amplified with primers F1/R1 (expected amplicon size 273 bp) or F1/R2 (expected amplicon size 560 bp). RNAse-treated C57BL/6J DNA was amplified with primers F1/R1. The primer sequences are:
F1, 5'-TGTGGAATACATCCAATGAATCCGAGAGTGC-3';
F2, 5'-GAGCCAAGAAAGAGGCTGAT-3';
R1, 5'-GGTTTCGGACTCAAAAGTTTGTCGGAACTT-3';
R2, 5'-GAGACCCATATGGAGATTTC-3'.
4. DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
4.1 LYST-ENCODING NUCLEIC ACID SEGMENTS
As used herein, the term "LYST1 gene" is used to refer to a gene or DNA coding region that encodes a Chediak-Higashi protein, polypeptide or peptide.
The definition of a "LYST1 gene", as used herein, is a gene that hybridizes, under relatively stringent hybridization conditions (see, e.g., Maniatis et al., 1982), to DNA sequences presently known to include LYST1 gene sequences. It will, of course, be understood that one gene, or more than one gene, encoding LYST1 proteins or peptides may be used in the methods and compositions of the invention. The nucleic acid compositions and methods disclosed herein may entail the administration of one, two, three, or more genes or gene segments. The maximum number of genes that may be used is limited only by practical considerations, such as the effort involved in simultaneously preparing a large number of gene constructs or even the possibility of eliciting a significant adverse cytotoxic effect.
As used herein, the term "LYST2 gene" is used to refer to a gene or DNA coding region that encodes a LYST2 protein, polypeptide or peptide.
The definition of a "LYST2 gene", as used herein, is a gene that hybridizes, under relatively stringent hybridization conditions (see, e.g., Maniatis et al., 1982), to DNA sequences presently known to include LYST2 gene sequences. It will, of course, be understood that one gene, or more than one gene, encoding LYST2 proteins or peptides may be used in the methods and compositions of the invention. The nucleic acid compositions and methods disclosed herein may entail the administration of one, two, three, or more genes or gene segments. The maximum number of genes that may be used is limited only by practical considerations, such as the effort involved in simultaneously preparing a large number of gene constructs or even the possibility of eliciting a significant adverse cytotoxic effect.
In those embodiments involving multiple genes of the present invention, the LYST and Lyst genes disclosed herein may be combined on a single genetic construct under control of one or more promoters, or they may be prepared as separate constructs of the same or different types.
Thus, an almost endless combination of different genes and genetic constructs may be employed.
Certain gene combinations may be designed to, or their use may otherwise result in, achieving synergistic effects on formation of an immune response, or the development of antibodies to gene products encoded by such nucleic acid segments, or in the production of diagnostic and treatment protocols for, among other things, Chediak-Higashi Syndrome. Any and all such combinations are intended to fall within the scope of the present invention. Indeed, many synergistic effects have been described in the scientific literature, so that one of ordinary skill in the art would readily be able to identify likely synergistic gene combinations, or even gene-protein combinations.
It will also be understood that, if desired, the nucleic acid segment or gene could be administered in combination with further agents, such as, e.g., proteins or polypeptides or various pharmaceutically active agents. So long as genetic material forms part of the composition, there is virtually no limit to other components which may also be included, provided that the additional agents do not cause a significant adverse effect upon contact with the target cells or tissues.
4.2 THERAPEUTIC AND DIAGNOSTIC KITS
Therapeutic kits comprising, in suitable container means, a LYST or Lyst composition of the present invention in a pharmaceutically acceptable formulation represent another aspect of the invention. The LYST or Lyst composition may be native LYST or Lyst protein, truncated LYST or Lyst protein, site-specifically mutated LYST or Lyst-encoding DNAs, or LYST- or Lyst- derived peptide epitopes, or alternatively antibodies which bind the native LYST or Lyst gene product, truncated LYST or Lyst protein, site-specifically mutated LYST or Lyst protein, or LYST- or Lyst-encoded peptide epitopes. In other embodiments, the LYST or Lyst composition may be nucleic acid segments encoding one or more native LYST or Lyst proteins, truncated LYST or Lyst proteins, site-specifically mutated LYST or Lyst proteins, or peptide epitope derivatives of LYST or Lyst. Such nucleic acid segments may be DNA or RNA, and may be either native, recombinant, or mutagenized nucleic acid segments.
The kits may comprise a single container means that contains the LYST or Lyst composition. The container means may, if desired, contain a pharmaceutically acceptable sterile excipient, having associated with it, the LYST or Lyst composition and, optionally, a detectable label or imaging agent. The formulation may be in the form of a gelatinous composition, e.g., a collagenous-LYST or Lyst composition, or may even be in a more fluid form. The container means may itself be a syringe, pipette, or other such like apparatus, from which the LYST or Lyst composition may be applied to a tissue site, injected into an animal, or otherwise administered as needed. However, the single container means may contain a dry, or lyophilized, mixture of a LYST or Lyst composition, which may or may not require pre-wetting before use.
Alternatively, the kits of the invention may comprise distinct container means for each component. In such cases, one container would contain the LYST or Lyst composition, either as a sterile DNA solution or in a lyophilized form, and the other container would include the matrix, which may or may not itself be pre-wetted with a sterile solution, or be in a gelatinous, liquid or other syringeable form.
The kits may also comprise a second or third container means for containing a sterile, pharmaceutically acceptable buffer, diluent or solvent. Such a solution may be required to formulate the LYST or Lyst component into a more suitable form for application to the body, e.g., as a topical preparation, or alternatively, in oral, parenteral, or intravenous forms. It should be noted, however, that all components of a kit could be supplied in a dry form (lyophilized), which would allow for "wetting" upon contact with body fluids. Thus, the presence of any type of pharmaceutically acceptable buffer or solvent is not a requirement for the kits of the invention. The kits may also comprise a second or third container means for containing a pharmaceutically acceptable detectable imaging agent or composition.
The container means will generally be a container such as a vial, test tube, flask, bottle, syringe or other container means, into which the components of the kit may be placed. The matrix and gene components may also be aliquoted into smaller containers, should this be desired. The kits of the present invention may also include a means for containing the individual containers in close confinement for commercial sale, such as, e.g., injection or blow-molded plastic containers into which the desired vials or syringes are retained.
Irrespective of the number of containers, the kits of the invention may also comprise, or be packaged with, an instrument for assisting with the placement of the ultimate LYST or Lyst composition within the body of an animal. Such an instrument may be a syringe, pipette, forceps, or any such medically approved delivery vehicle.
4.3 METHODS OF NUCLEIC ACID DELIVERY AND DNA TRANSFECTION
In certain embodiments, it is contemplated that the nucleic acid segments disclosed herein will be used to transfect appropriate host cells. Technology for introduction of DNA into cells is well-known to those of skill in the art. Four general methods for delivering a nucleic acid segment into cells have been described:
(1) chemical methods (Graham and VanDerEb, 1973);
(2) physical methods such as microinjection (Capecchi, 1980), electroporation (Wong and Neumann, 1982; Fromm et al., 1985) and the gene gun (Yang et al., 1990);
(3) viral vectors (Clapp, 1993; Eglitis and Anderson, 1988); and
(4) receptor-mediated mechanisms (Curiel et al., 1991; Wagner et al., 1992).
4.4 LIPOSOMES AND NANOCAPSULES
In certain embodiments, the inventors contemplate the use of liposomes and/or nanocapsules for the introduction of particular peptides or nucleic acid segments into host cells. Such formulations may be preferred for the introduction of pharmaceutically-acceptable formulations of the nucleic acids, peptides, and/or antibodies disclosed herein. The formation and use of liposomes is generally known to those of skill in the art (see for example, Couvreur et al., 1977, which describes the use of liposomes and nanocapsules in the targeted antibiotic therapy of intracellular bacterial infections and diseases). Recently, liposomes were developed with improved serum stability and circulation half-times (Gabizon and Papahadjopoulos, 1988; Allen and Choun, 1987).
Nanocapsules can generally entrap compounds in a stable and reproducible way (Henry-Michelland et al., 1987). To avoid side effects due to intracellular polymeric overloading, such ultrafine particles (sized around 0.1 μm) should be designed using polymers able to be degraded in vivo. Biodegradable polyalkyl-cyanoacrylate nanoparticles that meet these requirements are contemplated for use in the present invention, and such particles are easily made, as described (Couvreur et al., 1977, 1988).
Liposomes are formed from phospholipids that are dispersed in an aqueous medium and spontaneously form multilamellar concentric bilayer vesicles (also termed multilamellar vesicles (MLVs)). MLVs generally have diameters of from 25 nm to 4 μm. Sonication of MLVs results in the formation of small unilamellar vesicles (SUVs) with diameters in the range of 200 to 500 Å, containing an aqueous solution in the core.
In addition to the teachings of Couvreur et al. (1988), the following information may be utilized in generating liposomal formulations. Phospholipids can form a variety of structures other than liposomes when dispersed in water, depending on the molar ratio of lipid to water. At low ratios the liposome is the preferred structure. The physical characteristics of liposomes depend on pH, ionic strength and the presence of divalent cations. Liposomes can show low permeability to ionic and polar substances, but at elevated temperatures undergo a phase transition which markedly alters their permeability. The phase transition involves a change from a closely packed, ordered structure, known as the gel state, to a loosely packed, less-ordered structure, known as the fluid state. This occurs at a characteristic phase-transition temperature and results in an increase in permeability to ions, sugars and drugs.
Liposomes interact with cells via four different mechanisms: endocytosis by phagocytic cells of the reticuloendothelial system such as macrophages and neutrophils; adsorption to the cell surface, either by nonspecific weak hydrophobic or electrostatic forces, or by specific interactions with cell-surface components; fusion with the plasma cell membrane by insertion of the lipid bilayer of the liposome into the plasma membrane, with simultaneous release of liposomal contents into the cytoplasm; and transfer of liposomal lipids to cellular or subcellular membranes, or vice versa, without any association of the liposome contents. It often is difficult to determine which mechanism is operative, and more than one may operate at the same time.
4.5 METHODS FOR PREPARING ANTIBODY COMPOSITIONS
In another aspect, the present invention contemplates an antibody that is immunoreactive with a polypeptide of the invention. As stated above, one of the uses for LYST- or Lyst-derived epitopic peptides according to the present invention is to generate antibodies. Reference to antibodies throughout the specification includes whole polyclonal and monoclonal antibodies (mAbs), and parts thereof, either alone or conjugated with other moieties. Antibody parts include Fab and F(ab)2 fragments and single chain antibodies. The antibodies may be made in vivo in suitable laboratory animals or in vitro using recombinant DNA techniques. In a preferred embodiment, an antibody is a polyclonal antibody. Means for preparing and characterizing antibodies are well known in the art (see, e.g., Harlow and Lane, 1988).
Briefly, a polyclonal antibody is prepared by immunizing an animal with an immunogen comprising a polypeptide of the present invention and collecting antisera from that immunized animal. A wide range of animal species can be used for the production of antisera. Typically an animal used for production of anti-antisera is a rabbit, a mouse, a rat, a hamster or a guinea pig. Because of the relatively large blood volume of rabbits, a rabbit is a preferred choice for production of polyclonal antibodies.
Antibodies, both polyclonal and monoclonal, specific for LYST- or Lyst-derived epitopes may be prepared using conventional immunization techniques, as will be generally known to those of skill in the art. A composition containing antigenic epitopes of particular proteins can be used to immunize one or more experimental animals, such as a rabbit or mouse, which will then proceed to produce specific antibodies against LYST- or Lyst-derived peptides. Polyclonal antisera may be obtained, after allowing time for antibody generation, simply by bleeding the animal and preparing serum samples from the whole blood.
The amount of immunogen composition used in the production of polyclonal antibodies varies upon the nature of the immunogen, as well as the animal used for immunization. A variety of routes can be used to administer the immunogen (subcutaneous, intramuscular, intradermal, intravenous and intraperitoneal). The production of polyclonal antibodies may be monitored by sampling blood of the immunized animal at various points following immunization. A second, booster injection, also may be given. The process of boosting and titering is repeated until a suitable titer is achieved. When a desired level of immunogenicity is obtained, the immunized animal can be bled and the serum isolated and stored, and/or the animal can be used to generate mAbs (below).
One of the important features obtained from the present invention is a polyclonal sera that is relatively homogenous with respect to the specificity of the antibodies therein. Typically, polyclonal antisera is derived from a variety of different "clones," i.e., B-cells of different lineage. mAbs, by contrast, are defined as coming from antibody-producing cells with a common B-cell ancestor, hence their "mono" clonality.
When peptides are used as antigens to raise polyclonal sera, one would expect considerably less variation in the clonal nature of the sera than if a whole antigen were employed. Unfortunately, if incomplete fragments of an epitope are presented, the peptide may very well assume multiple (and probably non-native) conformations. As a result, even short peptides can produce polyclonal antisera with relatively plural specificities and, unfortunately, an antisera that does not react or reacts poorly with the native molecule.
Polyclonal antisera according to the present invention is produced against peptides that are predicted to comprise whole, intact epitopes. It is believed that these epitopes are, therefore, more stable in an immunologic sense and thus express a more consistent immunologic target for the immune system. Under this model, the number of potential B-cell clones that will respond to this peptide is considerably smaller and, hence, the homogeneity of the resulting sera will be higher. In various embodiments, the present invention provides for polyclonal antisera where the clonality, i.e., the percentage of clones reacting with the same molecular determinant, is at least 80%. Even higher clonality (90%, 95% or greater) is contemplated.
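The clonality figure defined above is a simple proportion, and a short Python sketch may make the arithmetic concrete. The sketch assumes that each responding clone has already been assigned, by some assay not specified here, to the single molecular determinant it recognizes. The function name `clonality` and the example data are hypothetical and are not drawn from this specification.

```python
from collections import Counter


def clonality(determinants):
    """Percentage of clones that react with the most common determinant."""
    if not determinants:
        raise ValueError("at least one clone is required")
    counts = Counter(determinants)
    _, top_count = counts.most_common(1)[0]
    return 100.0 * top_count / len(determinants)


# Hypothetical example: 20 clones, 17 of which recognize the same determinant.
clones = ["epitope_A"] * 17 + ["epitope_B"] * 2 + ["epitope_C"]
pct = clonality(clones)
meets_threshold = pct >= 80.0  # the "at least 80%" level stated in the text

print(f"clonality = {pct:.1f}%, meets 80% threshold: {meets_threshold}")
```

In this made-up data set, 17 of 20 clones share a determinant, giving a clonality of 85%, which would satisfy the 80% level but not the 90% or 95% levels.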
To obtain mAbs, one would also initially immunize an experimental animal, often preferably a mouse, with a LYST- or Lyst-containing composition. One would then, after a period of time sufficient to allow antibody generation, obtain a population of spleen or lymph cells from the animal. The spleen or lymph cells can then be fused with cell lines, such as human or mouse myeloma strains, to produce antibody-secreting hybridomas. These hybridomas may be isolated to obtain individual clones which can then be screened for production of antibody to the desired peptide.
Following immunization, spleen cells are removed and fused, using a standard fusion protocol, with plasmacytoma cells to produce hybridomas secreting mAbs against the LYST or Lyst protein. Hybridomas which produce mAbs to the selected antigens are identified using standard techniques, such as ELISA and Western blot methods. Hybridoma clones can then be cultured in liquid media and the culture supernatants purified to provide the LYST- or Lyst-specific mAbs.
It is proposed that the mAbs of the present invention will also find useful application in immunochemical procedures, such as ELISA and Western blot methods, as well as other procedures such as immunoprecipitation, immunocytological methods, etc. which may utilize antibodies specific to the LYST or Lyst protein In particular, anti-LYST/Lyst antibodies may be used in immunoabsorbent protocols to purify native or recombinant LYST/Lyst proteins or LYST/Lyst-derived peptide species or synthetic or natural variants thereof The antibodies disclosed herein may be employed in antibody cloning protocols to obtain cDNAs or genes encoding LYST/Lyst proteins from other species or organisms, or to identify proteins having significant homology to the LYST Lyst gene product. They may also be used in inhibition studies to analyze the effects of LYST/Lyst protein in cells, tissues, or whole animals Anti- LYST/Lyst antibodies will also be useful in immunolocalization studies to analyze the distribution of cells expressing LYST/Lyst protein during particular cellular activities, or for example, to determine the cellular or tissue-specific distribution of LYST/Lyst under different physiological conditions. A particularly useful application of such antibodies is in purifying native or recombinant LYST/Lyst proteins, for example, using an antibody affinity column. The operation of all such immunological techniques will be known to those of skill in the art in light of the present disclosure.
4.6 RECOMBINANT EXPRESSION OF "LYST FAMILY" PEPTIDES
Recombinant clones expressing the "LYST family" nucleic acid segments may be used to prepare purified recombinant LYST protein (rLYST), purified rLYST-derived peptide antigens as well as mutant or variant recombinant protein species in significant quantities. The selected antigens, and variants thereof, are proposed to have significant utility in diagnosing and treating CHS. For example, it is proposed that rLYSTs, peptide variants thereof, and/or antibodies against such rLYSTs may also be used in immunoassays to detect the presence of LYST or as vaccines or immunotherapeutics to treat CHS and related disorders. Additionally, by application of techniques such as DNA mutagenesis, the present invention allows the ready preparation of so-called "second generation" molecules having modified or simplified protein structures. Second generation proteins will typically share one or more properties in common with the full-length antigen, such as a particular antigenic/immunogenic epitopic core sequence. Epitopic sequences can be obtained from relatively short molecules prepared from knowledge of the peptide, or encoding DNA sequence information. Such variant molecules may not only be derived from selected immunogenic/antigenic regions of the protein structure, but may additionally, or alternatively, include one or more functionally equivalent amino acids selected on the basis of similarities or even differences with respect to the natural sequence.
4.7 ANTIBODY COMPOSITIONS AND FORMULATIONS THEREOF
Means for preparing and characterizing antibodies are well known in the art (see, e.g., Harlow and Lane (1988), incorporated herein by reference). The methods for generating mAbs generally begin along the same lines as those for preparing polyclonal antibodies. Briefly, a polyclonal antibody is prepared by immunizing an animal with an immunogenic composition in accordance with the present invention and collecting antisera from that immunized animal. A wide range of animal species can be used for the production of antisera. Typically the animal used for production of antisera is a rabbit, a mouse, a rat, a hamster, a guinea pig or a goat. Because of the relatively large blood volume of rabbits, a rabbit is a preferred choice for production of polyclonal antibodies.
As is well known in the art, a given composition may vary in its immunogenicity. It is often necessary therefore to boost the host immune system, as may be achieved by coupling a peptide or polypeptide immunogen to a carrier. Exemplary and preferred carriers are keyhole limpet hemocyanin (KLH) and bovine serum albumin (BSA). Other albumins such as ovalbumin, mouse serum albumin or rabbit serum albumin can also be used as carriers. Means for conjugating a polypeptide to a carrier protein are well known in the art and include glutaraldehyde, m-maleimidobenzoyl-N-hydroxysuccinimide ester, carbodiimide and bis-diazotized benzidine.
mAbs may be readily prepared through use of well-known techniques, such as those exemplified in U.S. Patent 4,196,265, incorporated herein by reference. Typically, this technique involves immunizing a suitable animal with a selected immunogen composition, e.g., a purified or partially purified protein, polypeptide or peptide. The immunizing composition is administered in a manner effective to stimulate antibody-producing cells. Rodents such as mice and rats are preferred animals; however, the use of rabbit, sheep or frog cells is also possible. The use of rats may provide certain advantages (Goding, 1986), but mice are preferred, with the BALB/c mouse being most preferred as this is most routinely used and generally gives a higher percentage of stable fusions.
Following immunization, somatic cells with the potential for producing antibodies, specifically B-lymphocytes (B-cells), are selected for use in the mAb generating protocol. These cells may be obtained from biopsied spleens, tonsils or lymph nodes, or from a peripheral blood sample. Spleen cells and peripheral blood cells are preferred, the former because they are a rich source of antibody-producing cells that are in the dividing plasmablast stage, and the latter because peripheral blood is easily accessible. Often, a panel of animals will have been immunized and the spleen of the animal with the highest antibody titer will be removed and the spleen lymphocytes obtained by homogenizing the spleen with a syringe. Typically, a spleen from an immunized mouse contains about 5 x 10^7 to about 2 x 10^8 lymphocytes.
The antibody-producing B lymphocytes from the immunized animal are then fused with cells of an immortal myeloma cell line, generally one of the same species as the animal that was immunized. Myeloma cell lines suited for use in hybridoma-producing fusion procedures preferably are non-antibody-producing, have high fusion efficiency, and have enzyme deficiencies that render them incapable of growing in certain selective media which support the growth of only the desired fused cells (hybridomas).
Any one of a number of myeloma cells may be used, as are known to those of skill in the art (Goding, 1986; Campbell, 1984). For example, where the immunized animal is a mouse, one may use P3-X63/Ag8, X63-Ag8.653, NS1/1.Ag 4 1, Sp210-Ag14, FO, NSO/U, MPC-11, MPC11-X45-GTG 1.7 and S194/5XX0 Bul; for rats, one may use R210.RCY3, Y3-Ag 1.2.3, IR983F and 4B210; and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6 are all useful in connection with human cell fusions.
One preferred murine myeloma cell is the NS-1 myeloma cell line (also termed P3-NS-1-Ag4-1), which is readily available from the NIGMS Human Genetic Mutant Cell Repository by requesting cell line repository number GM3573. Another mouse myeloma cell line that may be used is the 8-azaguanine-resistant mouse myeloma SP2/0 non-producer cell line.
Methods for generating hybrids of antibody-producing spleen or lymph node cells and myeloma cells usually comprise mixing somatic cells with myeloma cells in a 2:1 ratio, though the ratio may vary from about 20:1 to about 1:1, respectively, in the presence of an agent or agents (chemical or electrical) that promote the fusion of cell membranes. Fusion methods using Sendai virus have been described (Kohler and Milstein, 1975, 1976), and those using polyethylene glycol (PEG), such as 37% (v/v) PEG, by Gefter et al. (1977). The use of electrically induced fusion methods is also appropriate (Goding, 1986).
Fusion procedures usually produce viable hybrids at low frequencies, about 1 x 10^-6 to about 1 x 10^-8. However, this does not pose a problem, as the viable, fused hybrids are differentiated from the parental, unfused cells (particularly the unfused myeloma cells that would normally continue to divide indefinitely) by culturing in a selective medium. The selective medium is generally one that contains an agent that blocks the de novo synthesis of nucleotides in the tissue culture media. Exemplary and preferred agents are aminopterin, methotrexate, and azaserine. Aminopterin and methotrexate block de novo synthesis of both purines and pyrimidines, whereas azaserine blocks only purine synthesis. Where aminopterin or methotrexate is used, the media is supplemented with hypoxanthine and thymidine as a source of nucleotides (HAT medium). Where azaserine is used, the media is supplemented with hypoxanthine.
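The cell numbers, mixing ratio and fusion frequencies quoted above can be combined into a rough yield estimate. The following is a minimal sketch only: it assumes the quoted frequency is expressed per input somatic cell, which the text does not state explicitly, and the function names are hypothetical.

```python
def myeloma_cells_needed(somatic_cells: float, ratio: float = 2.0) -> float:
    """Myeloma cells required for a somatic:myeloma mixing ratio of `ratio`:1.

    The default of 2.0 reflects the usual 2:1 ratio; the text allows
    values from about 20:1 down to about 1:1.
    """
    if somatic_cells < 0 or ratio <= 0:
        raise ValueError("somatic_cells must be >= 0 and ratio must be > 0")
    return somatic_cells / ratio


def expected_hybrids(somatic_cells: float, fusion_frequency: float) -> float:
    """Expected number of viable hybrids.

    ASSUMPTION: fusion_frequency is the number of viable hybrids
    per input somatic cell (about 1e-6 to 1e-8 in the text).
    """
    if somatic_cells < 0 or not 0.0 <= fusion_frequency <= 1.0:
        raise ValueError("invalid cell count or frequency")
    return somatic_cells * fusion_frequency


if __name__ == "__main__":
    spleen_cells = 1e8  # within the 5e7 to 2e8 range cited for a mouse spleen
    print(myeloma_cells_needed(spleen_cells))
    for freq in (1e-6, 1e-7, 1e-8):
        print(freq, expected_hybrids(spleen_cells, freq))
```

Under these assumptions a single spleen yields on the order of one to a hundred viable hybrids, which is why a selection step is needed to recover them from the unfused background.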
The preferred selection medium is HAT. Only cells capable of operating nucleotide salvage pathways are able to survive in HAT medium. The myeloma cells are defective in key enzymes of the salvage pathway, e.g., hypoxanthine phosphoribosyl transferase (HPRT), and they cannot survive. The B-cells can operate this pathway, but they have a limited life span in culture and generally die within about two weeks. Therefore, the only cells that can survive in the selective media are those hybrids formed from myeloma and B-cells.
This culturing provides a population of hybridomas from which specific hybridomas are selected. Typically, selection of hybridomas is performed by culturing the cells by single-clone dilution in microtiter plates, followed by testing the individual clonal supernatants (after about two to three weeks) for the desired reactivity. The assay should be sensitive, simple and rapid, such as radioimmunoassays, enzyme immunoassays, cytotoxicity assays, plaque assays, dot immunobinding assays, and the like.
The selected hybridomas would then be serially diluted and cloned into individual antibody-producing cell lines, which clones can then be propagated indefinitely to provide mAbs. The cell lines may be exploited for mAb production in two basic ways. A sample of the hybridoma can be injected (often into the peritoneal cavity) into a histocompatible animal of the type that was used to provide the somatic and myeloma cells for the original fusion. The injected animal develops tumors secreting the specific mAb produced by the fused cell hybrid. The body fluids of the animal, such as serum or ascites fluid, can then be tapped to provide mAbs in high concentration. The individual cell lines could also be cultured in vitro, where the mAbs are naturally secreted into the culture medium from which they can be readily obtained in high concentrations. mAbs produced by either means may be further purified, if desired, using filtration, centrifugation and various chromatographic methods such as HPLC or affinity chromatography.
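The selection logic of HAT medium can be written as a small truth table. This is an illustrative sketch only: the `Cell` type and its two boolean traits are hypothetical simplifications (a functional salvage pathway, and the ability to divide indefinitely), and the hybrid is assumed to inherit each trait from whichever parent has it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_salvage_pathway: bool  # e.g., functional HPRT
    immortal: bool             # can divide indefinitely in culture


def fuse(b_cell: Cell, myeloma: Cell) -> Cell:
    """Model a hybrid as inheriting each trait from either parent (assumption)."""
    return Cell(
        name="hybridoma",
        has_salvage_pathway=b_cell.has_salvage_pathway or myeloma.has_salvage_pathway,
        immortal=b_cell.immortal or myeloma.immortal,
    )


def survives_hat(cell: Cell) -> bool:
    """De novo synthesis is blocked, so the salvage pathway is required;
    long-term survival in culture additionally requires immortality."""
    return cell.has_salvage_pathway and cell.immortal


b_cell = Cell("B-cell", has_salvage_pathway=True, immortal=False)
myeloma = Cell("myeloma", has_salvage_pathway=False, immortal=True)
hybridoma = fuse(b_cell, myeloma)

if __name__ == "__main__":
    for cell in (b_cell, myeloma, hybridoma):
        print(cell.name, survives_hat(cell))
```

Only the fused cell satisfies both conditions, which matches the statement that the hybrids are the only cells able to survive in the selective medium.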
4.8 IMMUNOASSAYS
As noted, it is proposed that native and synthetically-derived peptides and peptide epitopes of the invention will find utility as immunogens, e.g., in connection with vaccine development, or as antigens in immunoassays for the detection of reactive antibodies. Turning first to immunoassays, in their most simple and direct sense, preferred immunoassays of the invention include the various types of enzyme linked immunosorbent assays (ELISAs), as are known to those of skill in the art. However, it will be readily appreciated that the utility of LYST-derived proteins and peptides is not limited to such assays, and that other useful embodiments include RIAs and other non-enzyme linked antibody binding assays and procedures.
In preferred ELISA assays, proteins or peptides incorporating LYST, rLYST, or LYST-derived protein antigen sequences are immobilized onto a selected surface, preferably a surface exhibiting a protein affinity, such as the wells of a polystyrene microtiter plate. After washing to remove incompletely adsorbed material, one would then generally desire to bind or coat a nonspecific protein that is known to be antigenically neutral with regard to the test antisera, such as bovine serum albumin (BSA) or casein, onto the well. This allows for blocking of nonspecific adsorption sites on the immobilizing surface and thus reduces the background caused by nonspecific binding of antisera onto the surface.
After binding of antigenic material to the well, coating with a non-reactive material to reduce background, and washing to remove unbound material, the immobilizing surface is contacted with the antisera or clinical or biological extract to be tested in a manner conducive to immune complex (antigen/antibody) formation. Such conditions preferably include diluting the antisera with diluents such as BSA, bovine gamma globulin (BGG) and phosphate buffered saline (PBS)/Tween™. These added agents also tend to assist in the reduction of nonspecific background. The layered antisera is then allowed to incubate for, e.g., from 2 to 4 hours, at temperatures preferably on the order of about 25°C to about 27°C. Following incubation, the antisera-contacted surface is washed so as to remove non-immunocomplexed material. A preferred washing procedure includes washing with a solution such as PBS/Tween™, or borate buffer. Following formation of specific immunocomplexes between the test sample and the bound antigen, and subsequent washing, the occurrence and the amount of immunocomplex formation may be determined by subjecting the complex to a second antibody having specificity for the first.
Of course, in that the test sample will typically be of human origin, the second antibody will preferably be an antibody having specificity for human antibodies. To provide a detecting means, the second antibody will preferably have an associated detectable label, such as an enzyme label, that will generate a signal, such as color development upon incubating with an appropriate chromogenic substrate. Thus, for example, one will desire to contact and incubate the antisera-bound surface with a urease or peroxidase-conjugated anti-human IgG for a period of time and under conditions that favor the development of immunocomplex formation (e.g., incubation for 2 hours at room temperature in a PBS-containing solution such as PBS-Tween™).
After incubation with the second enzyme-tagged antibody, and subsequent to washing to remove unbound material, the amount of label is quantified by incubation with a chromogenic substrate such as urea and bromocresol purple or 2,2'-azino-di-(3-ethyl-benzthiazoline)-6-sulfonic acid (ABTS) and H2O2, in the case of peroxidase as the enzyme label. Quantitation is then achieved by measuring the degree of color generation, e.g., using a visible spectrum spectrophotometer.
ELISAs may be used in conjunction with the invention. In one such ELISA assay, proteins or peptides incorporating antigenic sequences of the present invention are immobilized onto a selected surface, preferably a surface exhibiting a protein affinity such as the wells of a polystyrene microtiter plate. After washing to remove incompletely adsorbed material, it is desirable to bind or coat the assay plate wells with a nonspecific protein that is known to be antigenically neutral with regard to the test antisera such as bovine serum albumin (BSA), casein or solutions of powdered milk. This allows for blocking of nonspecific adsorption sites on the immobilizing surface and thus reduces the background caused by nonspecific binding of antisera onto the surface.
4.9 IMMUNOPRECIPITATION
The anti-LYST protein antibodies of the present invention are particularly useful for the isolation of LYST protein antigens by immunoprecipitation. Immunoprecipitation involves the separation of the target antigen component from a complex mixture, and is used to discriminate or isolate minute amounts of protein.
In an alternative embodiment the antibodies of the present invention are useful for the close juxtaposition of two antigens. This is particularly useful for increasing the localized concentration of antigens, e.g., enzyme-substrate pairs.
4.10 WESTERN BLOTS
The compositions of the present invention will find great use in immunoblot or Western blot analysis. The anti-LYST antibodies may be used as high-affinity primary reagents for the identification of proteins immobilized onto a solid support matrix, such as nitrocellulose, nylon or combinations thereof. In conjunction with immunoprecipitation, followed by gel electrophoresis, these may be used as a single step reagent for use in detecting antigens against which secondary reagents used in the detection of the antigen cause an adverse background. This is especially useful when the antigens studied are immunoglobulins (precluding the use of immunoglobulins binding bacterial cell wall components), the antigens studied cross-react with the detecting agent, or they migrate at the same relative molecular weight as a cross-reacting signal. Immunologically-based detection methods in conjunction with Western blotting (including enzymatically-, radiolabel-, or fluorescently-tagged secondary antibodies against the toxin moiety) are considered to be of particular use in this regard.
4.11 PHARMACEUTICAL COMPOSITIONS
The pharmaceutical compositions disclosed herein may be orally administered, for example, with an inert diluent or with an assimilable edible carrier, or they may be enclosed in hard or soft shell gelatin capsules, or they may be compressed into tablets, or they may be incorporated directly with the food of the diet. For oral therapeutic administration, the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. Such compositions and preparations should contain at least 0.1% of active compound. The percentage of the compositions and preparations may, of course, be varied and may conveniently be between about 2 to about 60% of the weight of the unit. The amount of active compounds in such therapeutically useful compositions is such that a suitable dosage will be obtained.
The tablets, troches, pills, capsules and the like may also contain the following: a binder, such as gum tragacanth, acacia, cornstarch, or gelatin; excipients, such as dicalcium phosphate; a disintegrating agent, such as corn starch, potato starch, alginic acid and the like; a lubricant, such as magnesium stearate; and a sweetening agent, such as sucrose, lactose or saccharin, may be added, or a flavoring agent, such as peppermint, oil of wintergreen, or cherry flavoring. When the dosage unit form is a capsule, it may contain, in addition to materials of the above type, a liquid carrier. Various other materials may be present as coatings or to otherwise modify the physical form of the dosage unit. For instance, tablets, pills, or capsules may be coated with shellac, sugar or both. A syrup or elixir may contain the active compounds, sucrose as a sweetening agent, methyl and propylparabens as preservatives, a dye and flavoring, such as cherry or orange flavor.
Of course, any material used in preparing any dosage unit form should be pharmaceutically pure and substantially non-toxic in the amounts employed. In addition, the active compounds may be incorporated into sustained-release preparations and formulations.
The active compounds may also be administered parenterally or intraperitoneally. Solutions of the active compounds as free base or pharmacologically acceptable salts can be prepared in water suitably mixed with a surfactant, such as hydroxypropylcellulose. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms.
The pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. In all cases the form must be sterile and must be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms, such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions are prepared by incorporating the active compounds in the required amount in the appropriate solvent with various of the other ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
As used herein, "pharmaceutically acceptable carrier" includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like. The use of such media and agents for pharmaceutical active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredient, its use in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions.
For oral prophylaxis the polypeptide may be incorporated with excipients and used in the form of non-ingestible mouthwashes and dentifrices. A mouthwash may be prepared incorporating the active ingredient in the required amount in an appropriate solvent, such as a sodium borate solution (Dobell's Solution). Alternatively, the active ingredient may be incorporated into an antiseptic wash containing sodium borate, glycerin and potassium bicarbonate. The active ingredient may also be dispersed in dentifrices, including gels, pastes, powders and slurries. The active ingredient may be added in a therapeutically effective amount to a paste dentifrice that may include water, binders, abrasives, flavoring agents, foaming agents, and humectants.
The phrase "pharmaceutically-acceptable" refers to molecular entities and compositions that do not produce an allergic or similar untoward reaction when administered to a human. The preparation of an aqueous composition that contains a protein as an active ingredient is well understood in the art. Typically, such compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection can also be prepared. The preparation can also be emulsified.
The composition can be formulated in a neutral or salt form. Pharmaceutically-acceptable salts include the acid addition salts (formed with the free amino groups of the protein), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, histidine, procaine and the like. Upon formulation, solutions will be administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective. The formulations are easily administered in a variety of dosage forms such as injectable solutions, drug release capsules and the like.
For parenteral administration in an aqueous solution, for example, the solution should be suitably buffered if necessary and the liquid diluent first rendered isotonic with sufficient saline or glucose. These particular aqueous solutions are especially suitable for intravenous, intramuscular, subcutaneous and intraperitoneal administration. In this connection, sterile aqueous media which can be employed will be known to those of skill in the art in light of the present disclosure. For example, one dosage could be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or injected at the proposed site of infusion (see, for example, "Remington's Pharmaceutical Sciences" 15th Edition, pages 1035-1038 and 1570-1580). Some variation in dosage will necessarily occur depending on the condition of the subject being treated.
The person responsible for administration will, in any event, determine the appropriate dose for the individual subject. Moreover, for human administration, preparations should meet sterility, pyrogenicity, general safety and purity standards as required by FDA Office of Biologics standards.
4.12 EPITOPIC CORE SEQUENCES
The present invention is also directed to protein or peptide compositions, free from total cells and other peptides, which comprise a purified protein or peptide which incorporates an epitope that is immunologically cross-reactive with one or more of the antibodies of the present invention. As used herein, the term "incorporating an epitope(s) that is immunologically cross-reactive with one or more anti-LYST protein antibodies" is intended to refer to a peptide or protein antigen which includes a primary, secondary or tertiary structure similar to an epitope located within a LYST polypeptide. The level of similarity will generally be to such a degree that monoclonal or polyclonal antibodies directed against the LYST polypeptide will also bind to, react with, or otherwise recognize, the cross-reactive peptide or protein antigen. Various immunoassay methods may be employed in conjunction with such antibodies, such as, for example, Western blotting, ELISA, RIA, and the like, all of which are known to those of skill in the art.
The identification of LYST-derived epitopes, such as those derived from the LYST gene or LYST-like gene products and/or their functional equivalents, suitable for use in vaccines is a relatively straightforward matter. For example, one may employ the methods of Hopp, as taught in U.S. Patent 4,554,101, incorporated herein by reference, which teaches the identification and preparation of epitopes from amino acid sequences on the basis of hydrophilicity. The methods described in several other papers, and software programs based thereon, can also be used to identify epitopic core sequences (see, for example, Jameson and Wolf, 1988; Wolf et al., 1988; U.S. Patent Number 4,554,101). The amino acid sequence of these "epitopic core sequences" may then be readily incorporated into peptides, either through the application of peptide synthesis or recombinant technology.
Preferred peptides for use in accordance with the present invention will generally be on the order of about 5 to about 25 amino acids in length, and more preferably about 8 to about 20 amino acids in length. It is proposed that shorter antigenic peptide sequences will provide advantages in certain circumstances, for example, in the preparation of vaccines or in immunologic detection assays. Exemplary advantages include the ease of preparation and purification, the relatively low cost and improved reproducibility of production, and advantageous biodistribution.
It is proposed that particular advantages of the present invention may be realized through the preparation of synthetic peptides which include modified and/or extended epitopic/immunogenic core sequences which result in a "universal" epitopic peptide directed to the LYST gene product or LYST-related sequences. It is proposed that these regions represent those which are most likely to promote T-cell or B-cell stimulation in an animal, and, hence, elicit specific antibody production in such an animal.
An epitopic core sequence, as used herein, is a relatively short stretch of amino acids that is "complementary" to, and therefore will bind, antigen binding sites on LYST protein epitope-specific antibodies. Additionally or alternatively, an epitopic core sequence is one that will elicit antibodies that are cross-reactive with antibodies directed against the peptide compositions of the present invention. It will be understood that in the context of the present disclosure, the term "complementary" refers to amino acids or peptides that exhibit an attractive force towards each other. Thus, certain epitope core sequences of the present invention may be operationally defined in terms of their ability to compete with or perhaps displace the binding of the desired protein antigen with the corresponding protein-directed antisera.
In general, the size of the polypeptide antigen is not believed to be particularly crucial, so long as it is at least large enough to carry the identified core sequence or sequences. The smallest useful core sequence expected by the present disclosure would generally be on the order of about 5 amino acids in length, with sequences on the order of 8 or 25 being more preferred. Thus, this size will generally correspond to the smallest peptide antigens prepared in accordance with the invention. However, the size of the antigen may be larger where desired, so long as it contains a basic epitopic core sequence.
The identification of epitopic core sequences is known to those of skill in the art, for example, as described in U.S. Patent 4,554,101, incorporated herein by reference, which teaches the identification and preparation of epitopes from amino acid sequences on the basis of hydrophilicity. Moreover, numerous computer programs are available for use in predicting antigenic portions of proteins (see, e.g., Jameson and Wolf, 1988; Wolf et al., 1988). Computerized peptide sequence analysis programs (e.g., DNAStar® software, DNAStar, Inc., Madison, WI) may also be useful in designing synthetic LYST peptides and peptide analogs in accordance with the present disclosure.
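A hydrophilicity-based scan of the kind referred to above can be sketched as a sliding-window average over per-residue values. This is a minimal illustration, not the procedure of the cited patent or of any named software package: the scale is the commonly cited Hopp-Woods hydrophilicity scale, the six-residue window is an assumption, and the function names and the test peptide are hypothetical.

```python
# Commonly cited Hopp-Woods hydrophilicity values (one-letter amino acid codes).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(sequence: str, window: int = 6) -> list[float]:
    """Mean hydrophilicity of every `window`-residue stretch of `sequence`.

    Raises KeyError on non-standard residues and ValueError on a bad window.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [HOPP_WOODS[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 6) -> tuple[int, str, float]:
    """Return (start index, peptide, mean score) of the top-scoring stretch.

    Ties are resolved in favor of the earliest window.
    """
    seq = sequence.upper()
    profile = hydrophilicity_profile(seq, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, seq[best:best + window], profile[best]


if __name__ == "__main__":
    # Hypothetical test peptide, not a LYST sequence.
    print(most_hydrophilic_window("LLLLLLKKDEEKLLLLLL"))
```

The highest-scoring stretch is only a candidate core sequence; as the text notes, cross-reactivity still has to be confirmed by assay.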
To confirm that a protein or peptide is immunologically cross-reactive with, or a biological functional equivalent of, one or more epitopes of the disclosed peptides is also a straightforward matter. This can be readily determined using specific assays, e.g., of a single proposed epitopic sequence, or using more general screens, e.g., of a pool of randomly generated synthetic peptides or protein fragments. The screening assays may be employed to identify either equivalent antigens or cross-reactive antibodies. In any event, the principle is the same, i.e., based upon competition for binding sites between antibodies and antigens.
Suitable competition assays that may be employed include protocols based upon immunohistochemical assays, ELISAs, RIAs, Western or dot blotting and the like. In any of the competitive assays, one of the binding components, generally the known element, such as the LYST gene product or LYST-derived peptides, or a known antibody, will be labeled with a detectable label and the test components, that generally remain unlabeled, will be tested for their ability to reduce the amount of label that is bound to the corresponding reactive antibody or antigen.
As an exemplary embodiment, to conduct a competition study between a LYST protein and any test antigen, one would first label LYST with a detectable label, such as, e.g., biotin or an enzymatic, radioactive or fluorogenic label, to enable subsequent identification. One would then incubate the labeled antigen with the other, test, antigen to be examined at various ratios (e.g., 1:1, 1:10 and 1:100) and, after mixing, one would then add the mixture to an antibody of the present invention. Preferably, the known antibody would be immobilized, e.g., by attaching to an ELISA plate. The ability of the mixture to bind to the antibody would be determined by detecting the presence of the specifically bound label. This value would then be compared to a control value in which no potentially competing (test) antigen was included in the incubation.
The assay may be any one of a range of immunological assays based upon hybridization, and the reactive antigens would be detected by means of detecting their label, e.g., using streptavidin in the case of biotinylated antigens or by using a chromogenic substrate in connection with an enzymatic label or by simply detecting a radioactive or fluorescent label An antigen that binds to the same antibody as LYST, for example, will be able to effectively compete for binding to and thus will significantly reduce LYST binding, as evidenced by a reduction in the amount of label detected
The reactivity of the labeled antigen, e.g., a LYST composition, in the absence of any test antigen would be the control high value The control low value would be obtained by incubating the labeled antigen with an excess of unlabeled LYST antigen, when competition would occur and reduce binding A significant reduction in labeled antigen reactivity in the presence of a test antigen is indicative of a test antigen that is "cross-reactive", i.e., that has binding affinity for the same antibody. "A significant reduction", in terms of the present application, may be defined as a reproducible (i.e., consistently observed) reduction in binding.
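The comparison against the control high and control low values can also be expressed numerically. The following is a minimal sketch and not part of the original disclosure: the function name, the plate readings, and the choice of percent inhibition as the readout are illustrative assumptions.

```python
def percent_inhibition(signal, control_high, control_low):
    """Percent reduction in bound label, scaled between the two controls.

    control_high: labeled antigen alone (no competing test antigen).
    control_low:  labeled antigen plus an excess of unlabeled LYST antigen.
    """
    window = control_high - control_low
    if window <= 0:
        raise ValueError("control_high must exceed control_low")
    return 100.0 * (control_high - signal) / window


# Hypothetical readings at labeled:test antigen ratios of 1:1, 1:10 and 1:100.
readings = {"1:1": 0.91, "1:10": 0.55, "1:100": 0.19}
inhibition = {
    ratio: round(percent_inhibition(value, control_high=1.00, control_low=0.10), 1)
    for ratio, value in readings.items()
}
print(inhibition)
```

A test antigen whose inhibition rises reproducibly as its ratio to the labeled antigen increases would be a candidate cross-reactive antigen in the sense described above.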
In addition to the peptidyl compounds described herein, the inventors also contemplate that other sterically similar compounds may be formulated to mimic the key portions of the peptide structure. Such compounds, which may be termed peptidomimetics, may be used in the same manner as the peptides of the invention and hence are also functional equivalents. The generation of a structural functional equivalent may be achieved by the techniques of modelling and chemical design known to those of skill in the art. It will be understood that all such sterically similar constructs fall within the scope of the present invention.
Syntheses of epitopic sequences, or peptides which include an antigenic epitope within their sequence, are readily achieved using conventional synthetic techniques such as the solid phase method (e.g., through the use of a commercially-available peptide synthesizer such as an Applied Biosystems Model 430A Peptide Synthesizer). Peptide antigens synthesized in this manner may then be aliquoted in predetermined amounts and stored in conventional manners, such as in aqueous solutions or, even more preferably, in a powder or lyophilized state pending use.
In general, due to the relative stability of peptides, they may be readily stored in aqueous solutions for fairly long periods of time if desired, e.g., up to six months or more, in virtually any aqueous solution without appreciable degradation or loss of antigenic activity. However, where extended aqueous storage is contemplated it will generally be desirable to include agents including buffers such as Tris or phosphate buffers to maintain a pH of about 7.0 to about 7.5. Moreover, it may be desirable to include agents which will inhibit microbial growth, such as sodium azide or Merthiolate. For extended storage in an aqueous state it will be desirable to store the solutions at 4°C, or more preferably, frozen. Of course, where the peptides are stored in a lyophilized or powdered state, they may be stored virtually indefinitely, e.g., in metered aliquots that may be rehydrated with a predetermined amount of water (preferably distilled) or buffer prior to use.
4.13 SITE-SPECIFIC MUTAGENESIS
Site-specific mutagenesis is a technique useful in the preparation of individual peptides, or biologically functional equivalent proteins or peptides, through specific mutagenesis of the underlying DNA. The technique, well-known to those of skill in the art, further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA. Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed. Typically, a primer of about 14 to about 25 nucleotides in length is preferred, with about 5 to about 10 residues on both sides of the junction of the sequence being altered.
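The primer geometry described above (a changed site flanked by about 5 to about 10 unaltered residues on each side) can be sketched in a few lines. This is an illustrative sketch only: the function name and the template sequence are hypothetical, the example covers a single-base substitution, and the oligonucleotide is written in the same sense as the template, so its reverse complement would be needed to anneal to that strand.

```python
def mutagenic_primer(template, pos, new_base, flank=8):
    """Build an oligo carrying one substitution at 0-based `pos`,
    with `flank` unchanged bases on each side of the altered site."""
    if not 5 <= flank <= 10:
        raise ValueError("flank should be about 5 to about 10 residues")
    if pos - flank < 0 or pos + flank >= len(template):
        raise ValueError("not enough template sequence around the site")
    if template[pos] == new_base:
        raise ValueError("new_base matches the template; nothing to mutate")
    return template[pos - flank:pos] + new_base + template[pos + 1:pos + 1 + flank]


# Hypothetical 35-nt template; change T at index 17 to A.
template = "ATGGCTAGCTTGACCGATCGTACGGATCCAAGCTT"
primer = mutagenic_primer(template, pos=17, new_base="A", flank=8)
print(primer, len(primer))
```

With a flank of 8 the oligonucleotide is 17 nucleotides long, which falls inside the preferred range of about 14 to about 25 nucleotides.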
In general, the technique of site-specific mutagenesis is well known in the art, as exemplified by various publications. As will be appreciated, the technique typically employs a phage vector which exists in both a single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage. These phage are readily commercially-available and their use is generally well-known to those skilled in the art. Double-stranded plasmids are also routinely employed in site-directed mutagenesis, which eliminates the step of transferring the gene of interest from a plasmid to a phage.
In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector or melting apart the two strands of a double-stranded vector which includes within its sequence a DNA sequence which encodes the desired peptide. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically. This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation. This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected which include recombinant vectors bearing the mutated sequence arrangement.
The preparation of sequence variants of the selected peptide-encoding DNA segments using site-directed mutagenesis is provided as a means of producing potentially useful species and is not meant to be limiting as there are other ways in which sequence variants of peptides and the DNA sequences encoding them may be obtained. For example, recombinant vectors encoding the desired peptide sequence may be treated with mutagenic agents, such as hydroxylamine, to obtain sequence variants. Specific details regarding these methods and protocols are found in the teachings of Maloy et al., 1994; Segal, 1976; Prokop and Bajpai, 1991; Kuby, 1994; and Maniatis et al., 1982, each incorporated herein by reference, for that purpose.
4.14 BIOLOGICAL FUNCTIONAL EQUIVALENTS
Modification and changes may be made in the structure of the peptides of the present invention and DNA segments which encode them and still obtain a functional molecule that encodes a protein or peptide with desirable characteristics. The following is a discussion based upon changing the amino acids of a protein to create an equivalent, or even an improved, second-generation molecule. The amino acid changes may be achieved by changing the codons of the DNA sequence, according to the codon chart listed in TABLE 1.
For example, certain amino acids may be substituted for other amino acids in a protein structure without appreciable loss of interactive binding capacity with structures such as, for example, antigen-binding regions of antibodies or binding sites on substrate molecules. Since it is the interactive capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid sequence substitutions can be made in a protein sequence, and, of course, its underlying DNA coding sequence, and nevertheless obtain a protein with like properties. It is thus contemplated by the inventors that various changes may be made in the peptide sequences of the disclosed compositions, or corresponding DNA sequences which encode said peptides, without appreciable loss of their biological utility or activity.
In making such changes, the hydropathic index of amino acids may be considered. The importance of the hydropathic amino acid index in conferring interactive biologic function on a protein is generally understood in the art (Kyte and Doolittle, 1982, incorporated herein by reference). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant protein, which in turn defines the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics (Kyte and Doolittle, 1982); these are: isoleucine (+4.5), valine (+4.2), leucine (+3.8), phenylalanine (+2.8), cysteine/cystine (+2.5), methionine (+1.9), alanine (+1.8), glycine (-0.4), threonine (-0.7), serine (-0.8), tryptophan (-0.9), tyrosine (-1.3), proline (-1.6), histidine (-3.2), glutamate (-3.5), glutamine (-3.5), aspartate (-3.5), asparagine (-3.5), lysine (-3.9), and arginine (-4.5).
It is known in the art that certain amino acids may be substituted by other amino acids having a similar hydropathic index or score and still result in a protein with similar biological activity, i.e., still obtain a biological functionally equivalent protein. In making such changes, the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred. It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Patent 4,554,101, incorporated herein by reference, states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein.
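The ±2, ±1 and ±0.5 preference bands for hydropathic index can be checked mechanically. The sketch below encodes the Kyte and Doolittle values listed above; the function name and the three-letter keys are illustrative choices, not part of the disclosure.

```python
# Hydropathic indices (Kyte and Doolittle, 1982) as listed in the text.
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def substitution_tier(original, replacement, scale=HYDROPATHY):
    """Classify a substitution by the difference in index between residues."""
    # Round to one decimal so float noise does not shift a boundary case.
    diff = round(abs(scale[original] - scale[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "not preferred"


for pair in [("Ile", "Val"), ("Lys", "Arg"), ("Leu", "Met"), ("Ala", "Gly")]:
    print(pair, substitution_tier(*pair))
```

Passing a different dictionary as `scale` applies the same bands to another index, such as a hydrophilicity scale.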
As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0), lysine (+3.0), aspartate (+3.0 ± 1), glutamate (+3.0 ± 1), serine (+0.3), asparagine (+0.2), glutamine (+0.2), glycine (0), threonine (-0.4), proline (-0.5 ± 1), alanine (-0.5), histidine (-0.5), cysteine (-1.0), methionine (-1.3), valine (-1.5), leucine (-1.8), isoleucine (-1.8), tyrosine (-2.3), phenylalanine (-2.5), tryptophan (-3.4). It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent, and in particular, an immunologically equivalent protein. In such changes, the substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred. As outlined above, amino acid substitutions are generally therefore based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions which take various of the foregoing characteristics into consideration are well known to those of skill in the art and include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine.
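The hydrophilicity values above can be used to locate the segment of greatest local average hydrophilicity in a sequence. The following is a rough sketch under stated assumptions: the six-residue window, the function name, and the test peptide are illustrative, and the central values are used for the residues listed with a ± 1 range.

```python
# Hydrophilicity values as listed in the text (one-letter residue codes).
# Asp, Glu and Pro are given with "± 1" in the text; central values are used.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (start, segment, mean) for the window with the highest
    average hydrophilicity; the first such window wins on ties."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_mean = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        mean = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if mean > best_mean:
            best_start, best_mean = start, mean
    return best_start, sequence[best_start:best_start + window], best_mean


# Hypothetical peptide: a hydrophobic stretch followed by a charged stretch.
print(most_hydrophilic_window("AAAAAAKKKKKK"))
```

The window length is a tunable assumption; shorter windows emphasize small charged clusters, while longer ones smooth the profile.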
* * * * * * * * * *
5. EXAMPLES
The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventors to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
5.1 EXAMPLE 1 - MAPPING OF THE BG CRITICAL REGION ON MOUSE CHR 13
Three mouse mutations whose molecular basis is unknown, beige (bg), crinkled (cr), and progressive motor neuronopathy (pmn), are clustered within 2 cM on proximal mouse Chr 13. As part of a regional positional cloning effort, a high resolution physical map has been established of a 0.24 cM interval of mouse Chr 13 which corresponds to the bg critical region. 11 yeast artificial chromosomes (YACs) and 2 P1 clones, isolated using bg critical region STS, were characterized by STS-content mapping. This was achieved using existing microsatellite markers and 20 novel sequence tagged sites (STS) which were generated from critical region YAC clone DNA by interspersed repetitive element PCR™ and direct selection. 2400-kb of the region was isolated in YAC and P1 clones. Expressed sequence tags were identified from a bg critical region YAC clone by direct selection, and represent potential candidates for bg and cr.
Positional cloning represents an approach to disease gene identification based solely upon chromosomal location. In the 10 years since its inception, positional cloning has become established as a general, relatively efficient mode of identification of genes causing mammalian Mendelian disorders (Collins, 1995). Recently developed techniques and resources have both disencumbered and codified positional cloning: precise genetic mapping of a locus is followed by physical mapping and cloning of the resultant nonrecombinant interval in overlapping genomic clones (contigs) constructed using vectors which accommodate large DNA inserts. Transcribed sequences are then systematically identified from contig genomic clones and screened for mutations in affected individuals. An additional advantage of positional cloning is that it represents a regional, rather than disease-specific, approach. Thus reagents and resources developed for the purpose of cloning a specific disease gene, such as novel sequence tagged sites (STS), precise genetic maps, and establishment of relationships among clones in a contig, are also useful in positionally cloning other loci mapping within the same genomic region.
The region of proximal mouse Chr 13 adjacent to the extra-toes (Xt) locus is rich in mutant phenotypes, and represents an interval where a regional approach to disease gene identification may be synergistic. Xt is homologous to the human disorder Greig cephalopolysyndactyly; using a positional candidate approach, mutations in a zinc-finger gene (Gli3) were shown to underlie Xt (Vortkamp et al., 1992; Hui and Joyner, 1993). Very close to Xt lies the recessive mutation progressive motor neuronopathy (pmn), a model for Werdnig-Hoffmann spinal muscular atrophy (0 recombinants in 246 meioses; Brunialti et al., 1995). The recessive mutation crinkled (cr) maps approximately 2 cM proximal to Xt (23 recombinants in 1197 meioses; Swank et al., 1991; Lyon et al., 1967). Finally, beige (bg), the homolog of human Chediak-Higashi syndrome, maps between cr and Xt (Lane, 1971; Lyon and Meredith, 1969). bg is particularly amenable to a positional cloning approach for 3 additional reasons:
(1) the existence of numerous bg alleles facilitates candidate gene mutation analysis,
(2) bg is associated with a characteristic cellular phenotype (giant, perinuclear, dysfunctional lysosomes) offering the possibility of screening candidate genes by genetic complementation, and
(3) direct selection can be utilized to identify transcribed sequences which are candidates for bg from YAC clones since all cell types are affected in bg homozygotes.
Positional cloning of bg has been performed as an antecedent to identification of the homologous human gene, which is probably defective in human Chediak-Higashi syndrome. Using backcross mice, bg was previously located to a 0.24 cM interval on Chr 13. The example illustrates the further characterization of the bg critical region with 20 novel sequence tagged sites (STS), and the isolation of overlapping YAC and P1 clones which encompass most of this region of mouse Chr 13.
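The genetic distances quoted above for the loci flanking bg follow from simple recombination arithmetic. The sketch below is illustrative only: it assumes that, over intervals this short, the recombination fraction can be read directly as centimorgans, and the function names and the 95% confidence level are assumptions rather than details taken from the cited studies.

```python
def map_distance_cm(recombinants, meioses):
    """Recombination fraction expressed in centimorgans (short intervals)."""
    return 100.0 * recombinants / meioses


def upper_bound_cm(meioses, confidence=0.95):
    """Upper confidence bound on distance when no recombinants are seen."""
    return 100.0 * (1.0 - (1.0 - confidence) ** (1.0 / meioses))


# cr to Xt: 23 recombinants in 1197 meioses.
cr_xt = map_distance_cm(23, 1197)
# pmn to Xt: 0 recombinants in 246 meioses.
pmn_xt_max = upper_bound_cm(246)
print(round(cr_xt, 2), round(pmn_xt_max, 2))
```

The first value is consistent with the statement that cr maps approximately 2 cM from Xt; the second gives a sense of how close pmn must lie to Xt given that no recombinants were observed.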
5.1.1 MATERIALS AND METHODS
5.1.1.1 YAC MANIPULATION
A mouse genomic DNA library constructed in the vector pYAC4 (Kusumi et al, \99Λ; Research Genetics Inc.) was screened by PCR™ with primers derived from STS flanking bg. False positive PCR™ products were minimized by raising annealing temperatures, and addition of an enhancer of polymerase specificity as necessary (Perfect Match, Stratagene, La Jolla, CA). Veracity of PCR™ products was checked by product digestion with suitable restriction endonucleases, and by inclusion of control yeast DNA in all PCR™ reactions. Individual colonies of yeast clones containing YACs of interest were isolated on plates and frozen in 50% glycerol to prevent occurrence of microdeletions. YAC clones were grown in liquid YPD medium, converted to spheroplasts at exponential growth using Zymolase (ICN Pharmaceuticals, Costa Mesa, CA), and chromosomal DNA purified in agarose. YAC DNA was separated from host yeast chromosomes using preparative pulsed field electrophoresis (PFGE) with low melting point agarose (SeaPlaque™ GTG, FMC Bioproducts, Rockland, ME), and excised with a sterile blade.
5.1.1.2 P1 CLONES
A mouse genomic DNA library constructed in the vector P1 (Pierce et al., 1992; Genome Systems Inc., St. Louis, MO) was screened by PCR™ with primers derived from STS flanking bg. Stabs corresponding to positive clones were streaked on kanamycin plates, and DNA prepared from individual colonies as described (Pierce et al., 1992).
5.1.1.3 PULSED FIELD ELECTROPHORESIS
Preparation of high molecular weight DNA in agarose blocks, restriction enzyme digestion, PFGE, and Southern transfer were performed as previously described (Kingsmore et al., 1989). In brief, mouse splenocytes, lymph node cells, or yeast spheroplasts were suspended in 0.5% low-melting point agarose (InCert®, FMC BioProducts) at 1-2 x 10^7 cells per ml (mammalian cells) or 1-2 x 10^10 cells per ml (yeast). DNA was prepared by incubation of agarose blocks in 500 mM EDTA (pH 9.0), 1% sodium lauroyl sarcosinate, 2% proteinase K at 50°C twice for 24 h. Blocks were then washed, treated with phenylmethylsulfonylfluoride, washed again, and digested with 2-10 units/μg DNA of restriction endonucleases (Boehringer-Mannheim Biochemicals, Indianapolis, IN), if necessary. PFGE was carried out in 1% agarose gels (Fastlane, FMC BioProducts) at 14°C in 1X TBE using a Gene Navigator unit (Pharmacia, Piscataway, NJ). Separation of 50-1500 kb DNA molecules was achieved using pulses ramped from 70-145 sec at 145 V for 46 h. Gels were stained with ethidium bromide to visualize molecular size standards (oligomers of λ phage, and chromosomes of Saccharomyces cerevisiae [FMC BioProducts]). Southern transfer of DNA onto Zeta-probe™ membranes (Bio-Rad Laboratories), and filter hybridizations were performed as previously described (Kingsmore et al., 1989). Assignment of two probes to a common restriction fragment was based on sequential hybridization of a filter and exhibition of identity by double or partial digests.
5.1.1.4 MOLECULAR PROBES
All probes were labeled by the hexanucleotide technique with α-[32P]dCTP as previously described (Kingsmore et al., 1989). Restriction endonuclease fragments representing ends of YAC clones were identified by Southern blot hybridization with pBR322 (which hybridizes efficiently to pYAC4); YAC clone internal restriction endonuclease fragments were identified by hybridization with a mouse B1 repetitive element probe.
5.1.1.5 INTERSPERSED REPETITIVE ELEMENT-POLYMERASE CHAIN REACTION
IRE-PCR™ was performed essentially as described using mouse B1 repetitive element primers and PFGE-purified YAC DNA as template (Hunter et al., 1993; Simmler et al., 1991). The B1 repetitive element-specific primers used were 5'-CCAGGACACCAGGGCTACAGAG-3' (SEQ ID NO:75) (forward primer, derived from the 3'-end of B1) and/or 5'-CCCGAGTGCTGGGATTAAAG-3' (SEQ ID NO:76) (reverse primer, derived from the 5'-end of B1). Inter-B1 PCR™ was performed with the forward primer alone, the reverse primer alone, or both primers together. PCR™ amplification reactions were performed using 40 ng of YAC DNA, 1 μM of each primer, and 200 μM of each dNTP in a 20 μl reaction. Cycling parameters were 95°C for 2 min, followed by 32 cycles of 94°C for 20 sec, 55°C for 30 sec, and 72°C for 2 min. IRE-PCR™ products were isolated either by band excision from low-melting agarose gels, or by TA subcloning (Invitrogen). IRE-PCR™ products were sequenced, screened for the presence of common mouse repetitive element sequences, and nonrepetitive regions of the sequence used to design oligonucleotides suitable for sequence tagged sites (STS).
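As a quick sanity check on the two B1 primers given above, their length, GC content and reverse complements can be computed directly. This is an illustrative sketch only; the helper function names are hypothetical and the calculation is not part of the described protocol.

```python
B1_FORWARD = "CCAGGACACCAGGGCTACAGAG"  # SEQ ID NO:75
B1_REVERSE = "CCCGAGTGCTGGGATTAAAG"    # SEQ ID NO:76

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in [("forward", B1_FORWARD), ("reverse", B1_REVERSE)]:
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

Both primers are 20 to 22 nucleotides long with more than half of their bases G or C, which is compatible with the 55°C annealing step used in the cycling parameters.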
5.1.1.6 DIRECT SELECTION
Direct selection was performed as previously described (Lovett et al., 1991; Lovett, 1994). Briefly, cDNA was generated from mouse spleen by reverse transcription using random- and oligo(dT)-priming, ligated to amplification cassettes, and PCR™ amplified. Preparative PFGE was used to purify YAC 195A8 DNA, which was biotin-labelled, denatured, and hybridized in solution to the denatured cDNA pool. Repetitive elements, cDNA corresponding to rRNA, and yeast genes were blocked to Cot=20. YAC DNA (with annealed cDNAs) was captured on streptavidin-coated beads, washed at high stringency, and encoded cDNAs eluted. Eluted cDNAs were PCR™-amplified, and subjected to a further round of direct selection. Selected cDNAs were reamplified by PCR™, subcloned into λgt10, and individual clones picked into SM buffer in 96-well plates. Direct selection products were amplified from phage-containing supernatants by PCR™ with the following primers:
5'-GTTGTAAAACGACGGCCAGTGGCAAGTTCAGCCTGGTTAAG-3' (SEQ ID NO:77); and
5'-CACAGGAAACAGCTATGACCAGAGTATTTCTTCCAGGGTA-3' (SEQ ID NO:78).
Direct selection amplicons were cycle sequenced with standard M13 forward and reverse primers. Oligonucleotides suitable for STS were designed using direct selection product sequences.
5.1.1.7 STS PCR™
PCR™ amplification reactions were performed using 40 ng of template DNA (YAC clone, P1 clone, S. cerevisiae strain 1380, or C57BL/6J genomic DNA), 1 μM of each primer, and 200 μM of each dNTP in a 20 μl reaction as described (Barbosa et al., 1995). Cycling parameters were 95°C for 2 min, followed by 34 cycles of 94°C for 20 sec, 45-58°C for 30 sec, and 72°C for 20 sec. Amplification products were separated on 3% agarose gels, and visualized by ethidium bromide staining, or by end-labeling one of the primers using [γ-32P]ATP and T4 polynucleotide kinase, and separation of products on 6% denaturing polyacrylamide gels, with autoradiographic visualization. Simple sequence length polymorphism (SSLP) primers were as described (Dietrich et al., 1994; Research Genetics Inc., Huntsville, AL). Novel STS primer sequences, amplicon sizes, and annealing temperatures are summarized in Table 2.
5.1.2 RESULTS AND DISCUSSION
5.1.2.1 ISOLATION OF YACS AND P1s
11 YAC clones and 2 P1 clones were isolated from mouse YAC and P1 libraries by PCR™ using markers genetically mapped within the bg critical region. YAC clone sizes, as determined by PFGE, Southern blotting and hybridization with pBR322, are illustrated in FIG. 1. YAC clones were examined for chimerism, microdeletions, and overlaps by STS content mapping. Previously described SSLP were the first source of STS to be utilized. The genomic region encompassing bg is particularly rich in such SSLP (38 have been localized within a 2 cM interval containing bg; Dietrich et al., 1994). Additional proximal chromosome 13 STS were generated using IRE-PCR™ and direct selection.
5.1.2.2 NOVEL CHR 13 STS DERIVED BY IRE-PCR™
IRE-PCR™ represents a rapid and facile method with which to saturate a genomic region with novel STS for initial characterization of YAC clones and contig development (Hunter et al., 1993; Simmler et al., 1991). IRE-PCR™ was performed using YAC DNA as template and primers derived from ends of the mouse repetitive element B1 which were oriented in opposite directions. IRE-PCR™ products were subcloned, sequenced, and nonrepetitive regions used to design oligonucleotides suitable for sequence tagged sites. 12 novel STS (D13Sfk1-D13Sfk12) were developed by this method (Table 2), and physically assigned to Chr 13 YAC and P1 clones by PCR™ (FIG. 2).
5.1.2.3 NOVEL CHR 13 STS DERIVED BY DIRECT SELECTION
Direct selection was performed with YAC 195A8, a 650-kb YAC which was easily purified from preparative pulsed field gels since it did not comigrate with host yeast chromosomes. 192 candidate cDNA fragments were eluted from YAC 195A8 following two rounds of direct selection with mouse splenocyte cDNA. 56 of these direct selection products were sequenced. Comparison with DNA sequence databases revealed 2 (4%) nidogen (Nid), 32 (57%) novel, 12 (21%) repetitive elements (B1=2, B2=1, LINE1=4, IAP=2, XL30=1, T=1, satellite=1), and 9 (16%) contaminants (rRNA=3, actin=1, Nip2=1, plasmid=4). The presence of Nid cDNA fragments among these products confirmed the efficacy of the selection procedure in enriching for YAC 195A8-encoded genes. Furthermore, of 8 STS corresponding to novel direct selection products, 7 mapped back to YAC 195A8 by PCR™ analysis (D13Sfk13-D13Sfk19; Table 2, FIG. 2). D13Sfk13 and D13Sfk18 also hybridized sufficiently well to Southern blots to permit physical mapping adjacent to Nid on a polymorphic NotI fragment (1100-kb in DBA/2J DNA and 1150-kb in SB/LeJ DNA). D13Sfk13 was also genetically mapped within the bg critical region in 504 backcross mice [C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/Ei)F1] using a TaqI polymorphism.
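The breakdown of the 56 sequenced direct selection products can be re-tallied from the counts given above. The sketch below is a simple arithmetic check; the dictionary layout and variable names are illustrative.

```python
SEQUENCED = 56

# Category counts as reported in the text.
categories = {"nidogen": 2, "novel": 32, "repetitive": 12, "contaminant": 9}
repetitive = {"B1": 2, "B2": 1, "LINE1": 4, "IAP": 2, "XL30": 1, "T": 1,
              "satellite": 1}
contaminants = {"rRNA": 3, "actin": 1, "Nip2": 1, "plasmid": 4}

# Whole-number percentages of the 56 sequenced products.
percentages = {name: round(100 * n / SEQUENCED) for name, n in categories.items()}
accounted_for = sum(categories.values())

print(percentages)
print(accounted_for, "of", SEQUENCED, "products fall into the listed categories")
```

The rounded percentages agree with the figures in the text, and each sub-breakdown adds up to its category total. The four categories as listed sum to 55, so one of the 56 sequenced products is not accounted for in the breakdown as given.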
5.1.2.4 ARRANGEMENT OF PROXIMAL CHR 13 YAC AND PI CLONES IN CONTIGS
YAC and P1 clones were typed for the presence or absence of STS derived from SSLP, IRE-PCR™ amplicons, and direct selection products. STS content mapping enabled examination of clones for chimerism and microdeletions. One YAC clone, 64F5, was chimeric. This YAC, while 580-kb in size (FIG. 1), contained only D13Mit44, and not STS derived from the 5'- or 3'-ends of Nid (FIG. 2). Since the latter two STS are separated by less than 65-kb in mouse genomic DNA (Durkin et al., 1995), and since D13Mit44 is located within the Nid gene, the portion of YAC 64F5 derived from Chr 13 was concluded to be less than 80-kb.
Another YAC clone (84A8) contained an internal deletion which included D13Sfk6 (FIG. 2). Furthermore, the physical size of 84A8 (370-kb) was considerably smaller than expected: the distance between the other genetic markers it encompassed was approximately 600-kb, confirming a substantial genomic deletion within this YAC. Some YAC clones have been reported to be unstable in culture, and become progressively smaller with time (Nehls et al., 1995). YAC 84A8 may exhibit such instability.
STS content mapping also enabled ordering of YAC and P1 clones within the bg critical region and integration of clones into 2 contigs (FIG. 2). Contig 1 comprised 7 YAC and 2 P1 clones, extended from D13Sfk19 to D13Sfk2, and was approximately 1150-kb in length. The orientation of this contig with respect to the centromere was not established. Contig 2 consisted of 2 YAC clones. It extended from D13Mit207 (proximal) to D13Sfk10 (distal), and was approximately 1000-kb in length. Contig 2 spanned the crossover defining the distal border of the bg critical region (FIG. 2). Despite STS content mapping, 2 additional critical region YAC clones remained unlinked with these contigs (165F7 and 148E11). Isolation of YAC end clones will be necessary to definitively evaluate whether overlaps exist between these YACs and contig 1 or 2.
Efforts to identify YAC clones corresponding to one critical region genetic marker (D13Mit114) and the two STS which define the proximal border of the bg critical region (D13Mit172 and D13Mit239) were unsuccessful; furthermore, these STS were not present in any of the Chr 13 YAC/P1 clones identified. These data suggest that a region of the nonrecombinant interval remains unrepresented in the present YAC and P1 clones, or, alternatively, that additional microdeletions exist in the YAC clones. Based upon evaluation of overlaps between YAC and P1 clones, the bg critical region was estimated to be at least 2400-kb in length.
Direct selection products identified from YAC 195A8 using splenocyte cDNA not only allowed STS content mapping of Chr 13 YACs, but also constitute candidate genes for bg and cr.
Both of these mouse mutations appear to result from defects in constitutively expressed genes by virtue of abnormal phenotypes in all organs examined. The large number of bg alleles available enables effective screening of candidate genes by a combination of Southern and northern hybridization and RT-PCR™, using nucleic acid from multiple bg alleles and coisogenic controls.
While such studies are inefficient methods for detection of point mutations, they are highly effective in detection of intragenic deletions, retrotranspositions, and genomic rearrangements, which together account for a large enough proportion of spontaneous mouse mutations to make likely the detection of a mutation in one of the bg alleles. While only one allele of cr exists, it arose in offspring of a mouse treated with nitrogen mustard, and therefore is more likely to be associated with a genomic rearrangement detectable using the same screening techniques.
In summary, approximately 2400-kb of the bg critical region has been physically mapped and isolated in the form of YAC and P1 clones. These studies represent a necessary intermediate step in positional cloning of bg, and may also be of value in positional cloning of cr and pmn.
TABLE 2
Novel Sequence Tagged Sites (STS) Isolated from bg Critical Region YACs by IRE-PCR™ (D13Sfk1-D13Sfk12) or Direct Selection (D13Sfk13-D13Sfk19)
5.2 EXAMPLE 2 - MAPPING OF THE BEIGE LOCUS TO MOUSE CHR 13
This example illustrates the generation of a high resolution genetic map of proximal Chr 13 in the vicinity of bg, and the identification of two genes which are tightly linked to bg. These studies precisely localize bg on Chr 13, and provide a foundation for YAC contig development and efficient screening of candidate genes for bg.
5.2.1 MATERIALS AND METHODS
5.2.1.1 MICE
C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/EiJ)F1 backcross mice were bred and maintained as described (Barbosa et al., 1995). (C57BL/6J-bgJ x PWK)F1 X C57BL/6J-bgJ backcross mice, and (C57BL/6J-bg x PA Fi X C57BL/6J-bgJ backcross mice used have been described (Holcombe et al., 1991).
5.2.1.2 SOUTHERN HYBRIDIZATION
DNA was isolated from mouse organs using standard techniques and digested with restriction endonucleases, and 10 μg samples were subjected to electrophoresis on 0.9% agarose gels. DNA was transferred to Zeta-probe membranes (Bio-Rad Laboratories, Hercules, CA), and filter hybridizations were performed as previously described (Barbosa et al, 1995).
5.2.1.3 NORTHERN BLOT ANALYSIS
20 μg of total RNA prepared from liver, spleen and kidney of C57BL/6J-+/+, C57BL/6J-bgJ, SB/LeJ-bg, and C3H/HeJ-bg2J mice using standard techniques was separated on formaldehyde agarose gels, transferred to Zeta-probe membranes (Bio-Rad Laboratories), and hybridized as previously described (Kingsmore et al., 1994).
5.2.1.4 RT-PCR™ ASSAYS
Total RNA was prepared from liver of C57BL/6J-+/+, C57BL/6J-bgJ, SB/LeJ-bg, and C3H/HeJ-bg2J mice by extraction with phenol/guanidine isothiocyanate (TRIzol®, Gibco BRL, Gaithersburg, MD). The template for quantitative RT-PCR™ assays was 1-10 ng of first-strand cDNA, which had been synthesized from total RNA with an oligo(dT) primer and Moloney murine leukemia virus reverse transcriptase (Stratagene, La Jolla, CA). The nidogen (Nid) primers used for RT-PCR™ correspond to bp 3805-3822 and bp 3938-3955 of the mouse Nid cDNA (Durkin et al., 1988). The Estm9 primers used were:
5'-CAGGTGGAGATGCTGTTC-3' (F1) (SEQ ID NO:59)
5'-GAGATGCCTTCAGGCAGT-3' (R1) (SEQ ID NO:60)
5'-CCGTTAGTGTGTAGTCTC-3' (F2) (SEQ ID NO:61)
5'-CTTGCTCTCACTGTTCTC-3' (R2) (SEQ ID NO:62)
These correspond to the 5' and 3' ends, respectively, of an Estm9 cDNA (Bettenhausen and Gossler, 1995). RT-PCR™ products were amplified from bg, bgJ, bg2J, and +/+ RNA with Nid primers or Estm9 primers F1-R1 or F2-R2. Quantitative RT-PCR™ of aldolase A, which is constitutively expressed, was also performed to ensure that equal amounts of bg, bgJ, bg2J, and +/+ template were used (aldolase A primer 1: 5'-TGGATGGGCTGTCTGAACGC-3' (SEQ ID NO:63); primer 2: 5'-TGCTGGCAGATGCTGGCATA-3' (SEQ ID NO:64)).
PCR™ reactions were performed in a 50 μl volume containing 1-20 ng of cDNA, 1 μM of each primer, 200 μM each dNTP, 10 mM Tris-HCl, pH 8.8, 50 mM KCl, 1.5 mM MgCl2, and 1.25 U AmpliTaq® DNA polymerase (Perkin-Elmer Cetus, Norwalk, CT). Cycling profiles consisted of an initial denaturation (94°C for 2 min) followed by 25 cycles of 94°C for 30 sec, 55-58°C for 30 sec, and 72°C for 1 minute per kb of expected product length. PCR™ products were separated by electrophoresis on agarose gels, and quantified by intensity of ethidium bromide staining.
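The cycling profile described above can be written out as a short sketch. The function names and the tuple layout are illustrative assumptions, not part of the described method; only the temperatures, hold times, cycle count and the 1 minute per kb extension rule are taken from the text.

```python
def cycling_profile(product_kb, anneal_c=55, cycles=25):
    """Build the thermal profile as a list of (step, temp_C, seconds).

    Values follow the text: 94 C for 2 min, then 25 cycles of
    94 C / 30 s, 55-58 C / 30 s, and 72 C for 1 min per kb of product.
    """
    if not 55 <= anneal_c <= 58:
        raise ValueError("annealing temperature outside the 55-58 C range")
    extension_s = round(60 * product_kb)  # 1 minute per kb of expected product
    steps = [("initial denaturation", 94, 120)]
    for _ in range(cycles):
        steps.append(("denature", 94, 30))
        steps.append(("anneal", anneal_c, 30))
        steps.append(("extend", 72, extension_s))
    return steps


def total_seconds(steps):
    """Sum the hold times, ignoring ramp times between steps."""
    return sum(seconds for _, _, seconds in steps)


# Hypothetical 1.5 kb product: 90 s extension per cycle.
profile = cycling_profile(1.5)
print(total_seconds(profile) / 60, "minutes of hold time")
```

Ramp times between temperatures are instrument-dependent and are deliberately left out of the total.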
5.2.1.5 SSLP PCR™
PCR™ amplification reactions were performed using 40 ng of genomic DNA, 1 μM of each primer (Dietrich et al., 1994; Research Genetics, Inc., Huntsville, AL), and 200 μM of each dNTP in a 20 μl reaction as described (Barbosa et al., 1995). Cycling parameters were 95°C for 2 min, followed by 36-38 cycles of 94°C for 20 sec, 58°C for 30 sec, and 72°C for 10 sec. Where possible, amplification products (20 μl) were separated on 3% agarose gels, and visualized by ethidium bromide staining. SSLP with allele sizes differing among strains by less than 8 bp were typed by end-labeling one of the primers using [γ-32P]ATP and T4 polynucleotide kinase, separation of amplification products (4 μl) on 6% denaturing polyacrylamide gels, and visualization by autoradiography. SSLP allele sizes are summarized in FIG. 3A, FIG. 3B, FIG. 3C and FIG. 3D.
5.2.1.6 PULSED FIELD ELECTROPHORESIS
Preparation of high molecular weight DNA in agarose blocks, restriction enzyme digestion, pulsed field gel electrophoresis (PFGE), and Southern transfer were performed as previously described (Kingsmore et al., 1989). In brief, mouse splenocytes or lymph node cells were suspended in 0.5% low-melting point agarose (InCert, FMC BioProducts, Rockland, ME) at 1-2 x 10^7 cells per ml. DNA was prepared by incubation of agarose blocks in 500 mM EDTA (pH 9.0), 1% sodium lauroyl sarcosinate, 2% proteinase K at 50°C twice for 24 h. Blocks were then washed, treated with phenylmethylsulfonylfluoride, washed again, and digested with 2-10 units/μg DNA of restriction endonucleases (Boehringer Mannheim Biochemicals). PFGE was carried out in 1% agarose gels (Fastlane, FMC BioProducts) at 14°C in 1X TBE using a Gene Navigator system (Pharmacia, Piscataway, NJ). Separation of 50-1500 kb DNA molecules was achieved using pulses ramped from 70-145 sec at 145 V for 46 hr; 1000-6000 kb DNA was resolved by pulses of 15-90 min at 50 V for 6 or 10 days. Gels were stained with ethidium bromide to visualize molecular size standards (oligomers of λ phage, and chromosomes of Saccharomyces cerevisiae and Schizosaccharomyces pombe [FMC BioProducts]). Southern transfer of DNA onto Zeta-probe® membranes (Bio-Rad Laboratories) and filter hybridizations were performed as previously described (Kingsmore et al., 1989). Assignment of two probes to a common restriction fragment was based on sequential hybridization of a filter and exhibition of identity by double- or partial-digests.
5.2.1.7 MOLECULAR PROBES
All probes were labeled by the hexanucleotide technique with α-[32P]dCTP as previously described (Kingsmore et al., 1989). The nidogen (Nid) probe used was pN-5 (Jenkins et al., 1991). The glioblastoma oncogene homolog-3 (Gli3) probe was derived from pGH3a (Hui and Joyner, 1993). The probes used for the T cell receptor γ chain (Tcrg) and the mid-gestation embryo cDNA Estm9 have been described previously (Holcombe et al., 1991). Informative CAST/EiJ RFLV sizes are summarized in FIG. 3D; informative PAC and PWK RFLV for Tcrg were as described (Holcombe et al., 1991).
5.2.2 RESULTS
Previous mapping studies, using 3 separate backcrosses segregating for the bg locus (2 intraspecific backcrosses, [(C3H/HeJ x C57BL/6J-bgJ)F1 X C57BL/6J-bgJ] and [(C57BL/6J-Wsh-bgJ x Mus domesticus PAC)F1 X C57BL/6J-bgJ], and an intersubspecific backcross, [(C57BL/6J-Wsh-bgJ x Mus musculus PWK)F1 X C57BL/6J-bgJ]), have shown bg to lie proximal to Tcrg on mouse Chr 13 (Holcombe et al., 1987, 1991). In order to assess candidate genes for linkage to bg, and as a precedent to positional cloning, the inventors have now generated a high-resolution linkage map of proximal mouse Chr 13 using the latter 2 backcrosses and a third, novel backcross.
5.2.2.1 PHENOTYPIC ANALYSIS OF BG BACKCROSS MICE
Three backcrosses segregating for bg were utilized. Phenotypic analysis of 109 (C57BL/6J-Wsh-bgJ x Mus domesticus PAC)F1 X C57BL/6J-bgJ backcross mice and 111 (C57BL/6J-Wsh-bgJ x Mus musculus PWK)F1 X C57BL/6J-bgJ backcross mice has been reported previously (Holcombe et al., 1991). The third backcross was established between C57BL/6J-bgJ mice and Mus castaneus (CAST/EiJ), and 504 [C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/EiJ)F1] progeny were generated. Mus castaneus was chosen as the second parent in the latter intersubspecific backcross due to the increased likelihood of detection of DNA polymorphism in comparison to intraspecific crosses. Mice were phenotyped for the presence or absence of a beige-colored coat. Penetrance of bg in all of the crosses was complete (359 of 726 backcross mice [49%] exhibited a beige-colored coat).
5.2.2.2 IDENTIFICATION OF INFORMATIVE RFLV AND SSLP
Informative RFLV were ascertained by hybridizing gene probes to Southern blots containing genomic DNA from C57BL/6J-bgJ and CAST/EiJ, PAC, or PWK parental mice digested with various restriction endonucleases. Table 3 lists the sizes of unique CAST/EiJ RFLV for Gli3 and Nid. PWK and PAC RFLV for Tcrg have been described previously (Holcombe et al., 1991); CAST/EiJ RFLV for Estm9 have been described previously. Informative SSLP were ascertained by PCR™ of genomic DNA from C57BL/6J-bgJ and CAST/EiJ, PAC, and PWK parental mice. Approximate sizes of SSLP-PCR™ products are listed in Table 3.
5.2.2.3 PRECISE GENETIC MAPPING OF BG ON PROXIMAL MOUSE CHR 13
In total, 111 (C57BL/6J-Wsh-bgJ x Mus domesticus PAC)F1 X C57BL/6J-bgJ backcross mice, 111 (C57BL/6J-Wsh-bgJ x Mus musculus PWK)F1 X C57BL/6J-bgJ backcross mice, and 504 [C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/EiJ)F1] backcross mice were genotyped for a total of 23 SSLPs and 3 RFLVs known to map to proximal mouse Chr 13. At each locus, backcross DNA displayed either the homozygous or heterozygous F1 pattern. Linkage relationships were determined using segregation analysis (Green, 1981), and the best gene order was decided by minimization of crossover events and elimination of double crossover events (Bishop, 1985). Haplotype analysis for each cross is shown in FIG. 3A, FIG. 3B, FIG. 3C, and FIG. 3D.
Upon retyping of previously published genotypes of the PAC and PWK backcrosses (Holcombe et al., 1991), 4 errors were detected. In each case, the coat color had been incorrectly assigned, resulting in the generation of a double crossover within a genetic interval of less than 0.5 cM. Since such events are improbable given positive interference, these animals were excluded from subsequent analysis. Upon exclusion of these animals, no significant differences in gene order or recombination frequencies were found among the three crosses.
The best gene order and recombination frequency (± standard deviation) for the [C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/EiJ)F1] backcross was: centromere - D13Mit158, D13Mit172, D13Mit205, D13Mit206, D13Mit239 - 0.20 ± 0.20 cM - bg, Nid, Estm9, D13Mit44, D13Mit114, D13Mit134, D13Mit207 - 0.20 ± 0.20 cM - Gli3, D13Mit56, D13Mit162, D13Mit174, D13Mit237, D13Mit240, D13Mit305 - 0.20 ± 0.20 cM - D13Mit218, D13Mit219, D13Mit271 - 0.40 ± 0.28 cM - D13Mit3, D13Mit133 - telomere.
The best gene order and recombination frequency (± standard deviation) for the [(C57BL/6J-Wsh-bgJ x Mus domesticus PAC)F1 X C57BL/6J-bgJ] backcross was: centromere - D13Mit79 - 5.4 ± 2.1 cM - D13Mit1 - 0.9 ± 0.9 cM - bg, D13Mit44, D13Mit134, D13Mit174, D13Mit205 - 0.9 ± 0.9 cM - Tcrg, D13Mit218, D13Mit219 - 3.6 ± 1.8 cM - D13Mit3 - telomere.
The best gene order and recombination frequency (± standard deviation) for the [(C57BL/6J-Wsh-bgJ x Mus musculus PWK)F1 X C57BL/6J-bgJ] backcross was: centromere - D13Mit79 - 5.4 ± 2.1 cM - D13Mit1 - 0.9 ± 0.9 cM - bg, D13Mit44, D13Mit134, D13Mit205, D13Mit237 - 0.9 ± 0.9 cM - D13Mit174 - 0.9 ± 0.9 cM - Tcrg, D13Mit218, D13Mit219 - 0.9 ± 0.9 cM - D13Mit3 - telomere.
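The distances and standard deviations quoted for the three crosses are consistent with a simple binomial estimate, in which the recombination fraction is the number of recombinants divided by the number of meioses scored. The following sketch illustrates that calculation. The recombinant counts used in the example (1, 2, 4 and 6) are inferred here from the reported values and are assumptions; they are not stated in the text, and the estimator is offered only as one plausible reading of how the figures were derived.

```python
import math


def recombination_cM(recombinants, meioses):
    """Return (distance, standard deviation) in cM for a backcross interval.

    Uses the binomial estimate r = k/n with SD = sqrt(r(1-r)/n), and treats
    1% recombination as 1 cM, which is reasonable for short intervals.
    """
    r = recombinants / meioses
    sd = math.sqrt(r * (1 - r) / meioses)
    return 100 * r, 100 * sd


# Assumed counts: 1 or 2 recombinants among 504 CAST/EiJ backcross mice,
# and 1, 4 or 6 recombinants among 111 PAC or PWK backcross mice.
for k, n in [(1, 504), (2, 504), (1, 111), (4, 111), (6, 111)]:
    dist, sd = recombination_cM(k, n)
    print(f"{k}/{n}: {dist:.2f} ± {sd:.2f} cM")
```

Under these assumed counts the output reproduces the reported pairs (0.20 ± 0.20, 0.40 ± 0.28, 0.9 ± 0.9, 3.6 ± 1.8 and 5.4 ± 2.1 cM) after rounding.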
A composite linkage map of proximal mouse Chr 13, derived by integration of these 3 crosses, is shown in FIG. 3D. The combined results delimit the region containing bg to a 0.24 ± 0.17 cM interval on Chr 13, flanked proximally by the genetic markers D13Mit172 and D13Mit239, and distally by Gli3, D13Mit56, D13Mit162, D13Mit237, D13Mit240, and D13Mit305. bg cosegregated with 6 genetic markers (Nid, Estm9, D13Mit44, D13Mit114, D13Mit134 and D13Mit207). Backcross mice with recombination events which define the bg nonrecombinant interval were derived from the [C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/EiJ)F1] backcross.
5.2.2.4 EVALUATION OF THE CANDIDACY OF NID AND ESTM9 FOR CAUSALITY IN BG
Given the availability of numerous bg alleles, it was reasoned that northern, Southern, and RT-PCR™ analyses would be effective modalities for initial evaluation of the candidacy of Nid and Estm9 for causality in bg.
Southern blots were generated with DNA from 6 bg alleles (SB/LeJ-bg, C57BL/6J-bgJ, C3H/HeJ-bg2J, DBA/2J-bg8J, C57BL/6J-bg10J, and C57BL/6J-bg11J) and from appropriate +/+ coisogenic controls using 5 restriction endonucleases (EcoRI, HindIII, BamHI, MspI, and TaqI). No restriction fragment length differences were observed between bg alleles and coisogenic controls upon hybridization with Nid or Estm9, excluding a deletion or insertion in these genes from causality in these bg alleles.
Expression of Nid and Estm9 in bg mice was examined by northern blot analysis and quantitative RT-PCR™. Hybridization of northern blots of liver and kidney RNA from +/+, bg, bgJ, and bg2J with probes for Nid and Estm9 yielded signals of similar size and intensity in bg and +/+ RNA. Furthermore, no difference in amplicon size or amount was observed upon quantitative RT-PCR™ using liver or kidney RNA from +/+, bg, bgJ, and bg2J mice and oligonucleotides for Nid or Estm9, indicating expression of Nid and Estm9 to be grossly intact in bg.
5.2.2.5 PHYSICAL MAPPING OF PROXIMAL MOUSE CHR 13 IN THE VICINITY OF BG
Cytogenetic and physical mapping studies have demonstrated mouse mutations induced by gonadal x-irradiation to be frequently associated with genomic rearrangements (typically deletions or translocations). The SB/LeJ-bg allele was discovered among the offspring of a male which had received such treatment. In order to examine SB/LeJ-bg DNA for a genomic rearrangement, physical mapping studies were undertaken by pulsed field gel electrophoresis using high molecular weight DNA and restriction endonucleases which cleave infrequently. PFGE-Southern blots were generated using DNA from DBA/2, C57BL/6J-bgJ, CAST/EiJ and SB/LeJ-bg splenocytes, and probed sequentially with the 3 genes which map in the vicinity of bg (Nid, Estm9, and Gli3). Physical linkage of these genes could not be established, since hybridization with Estm9, Gli3 and Nid gene probes revealed no bands of identical size (Table 4).
No differences were observed in the sizes of bands identified in SB/LeJ-bg and control DNA upon hybridization with Gli3 or Estm9 (Table 4). However, hybridization of the same blots with a Nid gene probe did reveal band size disparities. With 5 restriction endonucleases (NotI, MluI, NruI, and SrfI complete digests, NaeI partial digest, and NotI/MluI double digest), differences were observed between DBA/2 and the other DNAs (C57BL/6J-bgJ, CAST/EiJ and SB/LeJ-bg). In each case, the DBA/2 fragment was 25-50 kb smaller than the band identified in C57BL/6J-bgJ, SB/LeJ-bg, or CAST/EiJ DNA (FIG. 3A, FIG. 3B, FIG. 3C, and FIG. 3D; Table 4). No differences in Nid band sizes were evident among other mouse strains examined (C57BL/6J-bgJ, SB/LeJ-bg, and CAST/EiJ). Fragments generated by other restriction endonucleases, which identify smaller fragments when probed with Nid (BssHII, ClaI, NaeI, SmaI, XhoI), were identical in all strains tested (FIG. 3A, FIG. 3B, FIG. 3C, FIG. 3D; Table 4). Nid fragment size differences were observed using both methylation-sensitive and -insensitive restriction endonucleases.
5.2.3 DISCUSSION
Previous studies have localized bg to proximal Chr 13. Lyon et al. (1969) demonstrated bg to be 0.5 cM proximal to the mutation Xt, which corresponds to the Gli3 gene. Several groups have demonstrated tight linkage between bg and Tcrg (Holcombe et al., 1987, 1991; Justice et al., 1990). Jenkins et al. (1991) found bg to cosegregate with Nid in 123 meiotic events. Precise genetic mapping of bg has been undertaken with respect to these genes and recently identified SSLP markers (Dietrich et al., 1994) as an antecedent to generation of a YAC contig of the genomic region encompassing bg. These results are in agreement with previous studies of genetic marker order on chromosome 13, although the greater number of meioses utilized in the present study permitted separation of loci which cosegregated in previous studies, and enabled localization of bg to a 0.24 cM interval on proximal mouse Chr 13. No statistically significant differences in genetic distances between markers were observed among the present crosses or between them and previous studies. Cosegregation of bg and Nid was observed in 504 meiotic events, suggesting bg to map within a linkage group conserved between proximal mouse Chr 13 and the distal long arm of human Chr 1 (Jenkins et al., 1991). By implication, the homologous human locus, CHS, may be expected to lie on human Chr 1q42.1-1q43, which represent the approximate limits of this conserved linkage group (Jenkins et al., 1991; Mattei et al., 1994). Localization of bg to a 0.24 cM interval will enable the generation of a YAC contig encompassing bg. Those genetic markers which cosegregate with bg will serve as nucleation points for rapid contig assembly.
If it is assumed that a haploid mouse genome is 1500 cM in size and contains 60,000 randomly distributed genes, it would be expected that the 0.24 cM bg critical region should contain approximately 10 genes. In the present report, two genes, Nid and Estm9, were localized within this interval, and thereby represent candidate genes for the bg locus. Nidogen, however, can be excluded from candidacy for bg for functional reasons. While bg mice exhibit a constitutive intracellular defect in lysosomal trafficking, nidogen is a component of basement membranes, a specialized extracellular matrix structure limited to certain tissues (Durkin et al., 1988). The candidacy of Estm9 cannot yet be evaluated on functional grounds. Estm9 is a novel mouse expressed sequence which was recently identified from a day 10.5 p.c. mouse embryo cDNA library (Bettenhausen and Gossler, 1995). Comparison of partial Estm9 cDNA sequences with DNA and peptide databases demonstrates significant sequence similarity only with uncharacterized human ESTs. While the function of Estm9 is unknown, expression analysis reveals it to be constitutively expressed, temporally and spatially, in the mouse (Bettenhausen and Gossler, 1995).
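The estimate of about 10 genes follows from a uniform gene density across the genetic map. As a brief worked sketch (the function name is illustrative; the genome size, gene count and interval are those assumed in the text):

```python
def expected_genes(interval_cM, genome_cM=1500.0, total_genes=60000):
    """Expected gene count in an interval, assuming genes are spread
    uniformly per cM (a simplification; gene density and recombination
    rate both vary along real chromosomes)."""
    genes_per_cM = total_genes / genome_cM  # 40 genes per cM
    return genes_per_cM * interval_cM


print(expected_genes(0.24))  # about 9.6, i.e. roughly 10 genes
```

Finding two genes (Nid and Estm9) in the interval therefore leaves, on this rough estimate, several candidates still unidentified.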
Initial genetic evaluation of the candidacy of Nid and Estm9 for bg by northern and Southern blot hybridization or quantitative RT-PCR™ revealed no differences between several bg alleles and coisogenic controls. These studies do not definitively exclude Nid or Estm9 from candidacy for bg. A more robust method of evaluation for bg candidate genes would be genetic complementation. Cell lines derived from bg mice exhibit pathognomonic phenotypes (Burkhardt et al., 1993; Gow et al., 1993; Baetz et al., 1995), which can be abrogated by genetic complementation (Perou and Kaplan, 1993; Penner and Prieur, 1987; Gow et al., 1993). Studies to examine the ability of Nid or Estm9 to complement bg-associated phenotypes in vitro are being pursued.
Physical mapping studies of the bg critical region were undertaken to evaluate the radiation-induced SB-bg allele for the presence of a gross genomic rearrangement. SB-bg-specific restriction fragment length differences were not observed with Nid, Estm9, or Gli3 gene probes. Furthermore, all critical region SSLP amplicons (D13Mit44, D13Mit114, D13Mit134 and D13Mit207) were present in SB-bg DNA. Together, these data preclude the existence of a gross genomic rearrangement in SB-bg DNA. However, DBA/2-specific pulsed-field electrophoresis RFLPs were observed with Nid using 5 restriction endonucleases. In each case, the DBA/2 fragment identified with Nid was 25-50 kb smaller than the corresponding band identified in control DNA (FIG. 3A, FIG. 3B, FIG. 3C, and FIG. 3D). No differences in band sizes were observed among other strains or upon reprobing of PFGE-Southern blots with Gli3 or Estm9. Since fragment size differences were observed with many rare-cutting restriction endonucleases, including several which are methylation-insensitive, it is unlikely that they are merely interstrain differences in DNA methylation or point mutations. Instead, it is suggested that a genomic rearrangement has occurred in the DBA/2 mouse at a distance of less than 900 kb from Nid (FIG. 3D). The rearrangement may represent a small (25-50 kb) genomic deletion in the DBA/2 mouse. The functional significance of such a putative rearrangement is uncertain. Interestingly, a similar phenomenon was recently described in the vicinity of the human nidogen gene (Goodrich and Holcombe, 1995): upon hybridization to pulsed field gel electrophoresis Southern blots of human genomic DNA digested with SalI, nidogen identified polymorphic band sizes in Caucasian populations. In 2 CHS patients that have been examined to date, homozygosity for one NID allele was observed, suggesting the possibility of linkage of human CHS and NID (Goodrich and Holcombe, 1995). Definitive mapping of human CHS, however, must await identification of the mouse bg gene. On a practical note, the interstrain differences in pulsed field restriction fragment length provide a physical landmark within the bg nonrecombinant interval. Thus bg candidate genes can be easily screened for physical linkage with Nid as a means of determining whether or not they lie within the bg nonrecombinant interval.
In summary, the bg locus, which is the mouse homolog of human CHS, has been localized to a genomic interval corresponding to approximately one four-hundredth of mouse Chr 13. This represents an important intermediate step in the positional cloning of bg, and thereby human CHS.
TABLE 3
Informative Proximal Chr 13 SSLP Allele Sizes
SSLP | Allele Size (bp): C57BL/6J-bgJ | CAST/EiJ | PAC | PWK
TABLE 4
Restriction Fragment Length Polymorphisms Used To Genetically Map Nid And Gli3 In 504 [C57BL/6J-bgJ X (C57BL/6J-bgJ x CAST/EiJ)F1] Mice
TABLE 5
PFGE restriction fragment sizes (in kb) of Nid and Estm9 in SB/LeJ-bg and DBA/2J DNA
*SB/LeJ-bg band sizes were also observed using CAST/EiJ and C57BL/6J genomic DNAs
5.3 EXAMPLE 3 - IDENTIFICATION OF THE HOMOLOGOUS BEIGE AND CHS GENES
As described above, the inventors have localized the bg locus within a 0.24 centimorgan interval on mouse chromosome 13, and isolated contiguous arrays of YACs that cover 2,400 kb of this interval. Candidate cDNAs for bg were isolated from YAC 195A8, which contains 650 kb of the bg non-recombinant interval, using direct cDNA selection with mouse spleen cDNA (FIG. 11). Of 56 candidate cDNA clones analyzed from a direct-selection study, evidence for causality in bg was found in one (see below), and this gene was designated Lyst (lysosomal trafficking regulator). As this clone was 132 nucleotides long, additional Lyst sequences were sought by screening three mouse cDNA libraries and performing polymerase chain reaction (PCR™) amplification of cDNA ends (Kingsmore et al., 1994). Ten overlapping Lyst clones were identified, representing ~7 kb (Genbank accession number L77889). These were physically assigned to mouse chromosome 13 with pulsed field gel electrophoresis (PFGE) Southern blots, confirming that they were all derived from a single gene (mouse genome database accession number MGD-PMEX-14). The Lyst probes identified the same polymorphic PFGE restriction fragments as nidogen (Nid), indicating that Lyst and Nid are clustered within 650 kb. Lyst was also mapped genetically in 504 [C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1] backcross mice by means of three TaqI restriction fragment length polymorphisms (RFLPs). The Lyst RFLPs cosegregated with bg (and Nid), confirming their colocalization on proximal mouse chromosome 13 (MGD accession number MGD-CREX-615).
Evidence for Lyst mutations was found in two bg alleles. A 5-kb genomic deletion that contained the 3' end of Lyst exon β, and exons γ and δ, was identified in bg11J DNA (FIG. 12). The bg11J deletion corresponds to the loss of ~400 internal amino acids of the predicted Lyst peptide. Furthermore, whereas the 5' end of the bg11J deletion occurs within Lyst exon β, the 3' end is intronic. Therefore the truncated Lyst mRNA in bg11J mice is also anticipated to splice incorrectly, terminate prematurely, and lack polyadenylation.
Quantitative reverse transcription (RT)-PCR™ demonstrated a moderate decrease in Lyst mRNA in bg and bgJ liver, and a gross reduction in bg2J (Lyst ΔOD after normalization for β-actin mRNA: +/+, 1.00; bg2J/bg2J, 0.19; bg/bg, 0.28; bgJ/bgJ, 0.40). A commensurate reduction in bg2J transcript abundance was noted by using several primer pairs derived from different regions of the Lyst cDNA. Aberrant Lyst RT-PCR™ products were not observed. The particularly striking (more than fivefold) reduction in Lyst expression evident in bg2J homozygotes suggested the existence of a mutation in bg2J Lyst that results in decreased transcription or mRNA instability. The molecular basis of the decrease in Lyst mRNA in bg2J is not yet known, but it is reminiscent of the leaky ablation of mature message associated with an intronic retrotransposition event (Kingsmore et al., 1994).
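The "more than fivefold" figure can be checked directly from the normalized ΔOD values. In the sketch below, the assignment of the three mutant values to genotypes (0.19 to bg2J, 0.28 to bg, 0.40 to bgJ) is an interpretation of the text and should be treated as an assumption; the arithmetic itself depends only on the numbers.

```python
# Lyst signal relative to wild type after normalization to beta-actin.
# Genotype labels are an assumed reading of the reported values.
RELATIVE_LYST = {
    "+/+": 1.00,
    "bg2J/bg2J": 0.19,
    "bg/bg": 0.28,
    "bgJ/bgJ": 0.40,
}


def fold_reduction(relative, reference="+/+"):
    """Fold reduction of each genotype relative to the reference genotype."""
    ref = relative[reference]
    return {g: ref / v for g, v in relative.items() if g != reference}


for genotype, fold in fold_reduction(RELATIVE_LYST).items():
    print(f"{genotype}: {fold:.2f}-fold lower than +/+")
```

Only the lowest value (0.19) corresponds to a reduction greater than fivefold; the other two alleles show reductions of roughly 2.5-fold and 3.6-fold.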
The predicted open reading frame (ORF) of Lyst was 4,635 nucleotides, encoding a protein of 1,545 amino acids and relative molecular mass 172,500 (Mr ~172.5K) (FIG. 13a). Nucleotides 51-74 are rich in CG nucleotides, a common feature of the 5' region of housekeeping genes. Comparison with DNA databases indicated that Lyst is novel, and resembles only uncharacterized human expressed sequence tags (ESTs). The sequence of a cDNA clone corresponding to one such human EST (Genbank accession number L77889) matched the 5' region of mouse Lyst (nucleotide identity was 76% in the 5' untranslated region (UTR) and 91% in the ORF, and amino-acid identity was 97%; FIG. 13c); another human EST matched the 3' region of the mouse Lyst coding domain (Genbank accession number W26957). On hybridization to PFGE Southern blots of mouse DNA, the human clones identified restriction fragments that were indistinguishable from mouse Lyst; physical mapping of the human clones to the same region of the mouse genome as Lyst indicates that they are indeed homologous to Lyst.
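The relationship between the reported ORF length, protein length and molecular mass is simple arithmetic, sketched here for reference. The helper names are illustrative assumptions; the figures are those given in the text, and the division by 3 counts codons without regard to whether a stop codon is included in the stated ORF length.

```python
def codons_in_orf(orf_nt):
    """Number of codons in an ORF; the length must be a multiple of 3."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return orf_nt // 3


def mean_residue_mass(protein_mass, residues):
    """Average mass per amino acid residue."""
    return protein_mass / residues


residues = codons_in_orf(4635)
print(residues)                                       # codons in the 4,635-nt ORF
print(round(mean_residue_mass(172500, residues), 2))  # average mass per residue
```

The average of roughly 112 per residue is close to the value of about 110 commonly used as a rule of thumb for proteins, so the stated length and mass are mutually consistent.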
It has been suggested that CHS and bg represent homologous disorders, as their clinical features (Blume and Wolff, 1972) and defects in lysosomal transport (Burkhardt et al., 1993) are identical. Homology of bg and CHS is supported by genetic complementation studies; fusion of fibroblasts from bg mice and CHS patients failed to reverse lysosomal abnormalities, in contrast to fusions with normal cells (Perou and Kaplan, 1993). Furthermore, recent genetic linkage studies have shown that CHS maps within a linkage group conserved between human Chromosome 1q43 and the bg region on mouse Chromosome 13. Therefore LYST mutations in CHS patients were sought by sequencing LYST lymphoblast and fibroblast cDNAs corresponding to these ESTs from 10 CHS patients. In one patient, a single-base insertional mutation was found at nucleotides 117-118 of the LYST coding domain, resulting in a frame shift and termination after amino acid 62 (FIG. 13c).
Previous studies showing spontaneous aggregation of membrane-bound concanavalin A (capping) suggest that there is a defect in microtubule dynamics in bg cells (Oliver, Zurier and Berlin, 1975; Oliver and Zurier, 1976). In a search of the SWISSPROT database, using Blitz and BLASTP, a similarity was found between a domain in Lyst and stathmin (oncoprotein 18), a phosphoprotein that may regulate polymerization of microtubules (Belmont and Mitchison, 1996) (27% identity from residues 463 to 536; best expected occurrence by chance, 4.36 x 10^-6). The domain in stathmin that matches Lyst is helical and has heptad repeats that participate in coiled-coil interactions with other proteins (Sobel, 1991; Maucuer et al., 1995). The stathmin-like region of Lyst is also predicted to be helical and to form coiled coils. However, it is the charged residues, rather than the hydrophobic ones, that are conserved between Lyst and stathmin, suggesting that the sequence similarity is not primarily due to conserved secondary structure. Thus this region of Lyst potentially encodes a coiled-coil protein-interaction domain that may regulate microtubule-mediated lysosome transport. Although Lyst is not predicted to have transmembrane helices, the C-terminal tetrapeptide (CYSP; amino acids 1,542-1,545) is strikingly similar to known prenylation sites, which could provide attachment to lysosomal/late endosomal membranes through thioester linkage with the cysteine.
Previous studies of bg leukocytes have shown correction of microtubule function (as assessed by concanavalin A capping) and natural killer activity when treated with inhibitors of protein kinase C (PKC) breakdown (Sato et al., 1990; Ito et al., 1989), suggesting that bg might be regulated by phosphorylation. Lyst contains 25 sites of potential phosphorylation by PKC, 36 by casein kinase II (CKII) (many of which overlap those of PKC), two by cAMP-dependent protein kinase, and one by tyrosine kinase (FIG. 13b). Almost half of the predicted helices outside the stathmin-like region (14 of 30) have a PKC- or CKII-phosphorylation signal at their amino terminus, and eight of them form consecutive helical pairs. Thus Lyst seems to contain helical bundles with clusters of phosphorylation sites at either end. Stathmin also has an N-terminal phosphorylation site and helix motif, and these Lyst domains may have a similar 'signal relaying' function to stathmin (Sobel, 1991; Maucuer et al., 1995). Furthermore, phosphorylation of these positions could provide a control mechanism by causing a conformational shift in the bundles, thereby affecting interactions with other molecules.
Northern analysis and RT-PCR™ indicated that Lyst is ubiquitously transcribed, both temporally and spatially, in mouse and human tissues (FIG. 14). Northern blot analysis also revealed complex alternative splicing of Lyst mRNA, with both constitutive and anatomically restricted Lyst mRNA isoforms. The largest Lyst transcript in human and mouse was 12-14 kb, but this transcript was not constitutively expressed. In mRNA from mouse spleen, human peripheral blood leukocytes, promyelocytic leukaemia HL-60, and several leukaemia lines, the 12-14 kb isoform was either undetectable or barely detectable, but smaller Lyst transcripts were abundant (FIG. 14). Given the significance for bg mice and CHS patients of defects in the lysosomal and late-endosomal compartments of granulocytes, NK cells and cytolytic T lymphocytes (Gallin et al., 1974; Roder and Duwe, 1979; Saxena et al., 1982; Baetz et al., 1995), it is likely that these Lyst mRNAs of ~3 kb and 4 kb represent the transcripts of primary functional significance. Probes derived from the 5' or 3' ends of the Lyst
5.4 EXAMPLE 4 - MUTATION ANALYSIS AND PHYSICAL AND GENETIC MAPPING ESTABLISH HUMAN LYST AS THE CHS GENE
5.4.1 MATERIALS AND METHODS
5.4.1.1 CLONING OF THE HUMAN LYST GENE
Segments of the human LYST sequence were obtained by an anchored, nested PCR™ (5' RACE-PCR™) using liver cDNA as a template (Clontech Laboratories, Palo Alto, CA), by RT-PCR™ using total RNA, and by sequencing of human ESTs similar in sequence to mouse Lyst. For the 5' RACE-PCR™, two nested primers were used that were derived from a human EST (GenBank accession number W26957) and had the following nucleotide sequences:
5'-CCAAGATGAAAGCAGCCGATGGGGAAAACT-3' (SEQ ID NO:65) and
5'-TCAGCCTCTTTCTTGCTCCGTGAAACTGCT-3' (SEQ ID NO:66).
For RT-PCR™ experiments, total RNA was prepared from the promyelocytic HL-60 cell line. Reverse transcription was performed with Expand (Boehringer Mannheim, Meylan, France) with the following primer pairs:
5'-AGTTTATGAGTCCAAATGAT-3' (SEQ ID NO:67) and 5'-GAATGATGAAGTTGCTCTGA-3' (bp 490-2034) (SEQ ID NO:68);
5'-CAGCAGTTCTTCAGATGGA-3' (SEQ ID NO:69) and 5'-ATCTTTCTGTTGTTCCCCTA-3' (bp 1891-3050) (SEQ ID NO:70);
5'-TAGGGGAGCAACAGAAAGAT-3' (SEQ ID NO:71) and 5'-GCTCATAGTAGTATCACTTT-3' (bp 3320-4722) (SEQ ID NO:72).
The primers used to amplify the cDNA between bp 1891 and 3050 were derived from the mouse Lyst sequence. Human primers were designed from the sequence of the PCR™ product (1159 bp) and used to amplify the flanking sequences.
5.4.1.2 DNA SEQUENCING AND SEQUENCE ANALYSIS
PCR™ products were cloned using a TA cloning kit (Invitrogen Corporation, San Diego, California) and both strands were cycle sequenced. The sequences were analyzed with the GCG Package (Devereux et al., 1984), and searches of the National Center for Biotechnology Information database were performed using the BLAST network server (Altschul et al., 1990) (National Library of Medicine, via INTERNET) and the Whitehead Institute Sequence Analysis Programs (MIT, Cambridge, Massachusetts).
5.4.1.3 SOUTHERN AND NORTHERN BLOT ANALYSIS
Preparation of mouse, human and yeast DNA samples, digestion with restriction endonucleases, agarose gel electrophoresis and Southern transfers were performed using standard techniques (Maniatis et al., 1984). The EcoRI monochromosomal somatic cell hybrid blot was obtained from BIOS Laboratories (New Haven, Connecticut). Isolation of poly(A)+ RNA from fibroblast and EBV-transformed B lymphoblast cell lines, formaldehyde agarose gel electrophoresis and Northern blotting were performed according to standard procedures (Maniatis et al., 1984). Membranes were hybridized with various LYST or actin probes labeled with α-32P-dCTP. Mouse genetic mapping analyses were performed as described (Barbosa et al., 1995).
5.4.1.4 SSCP ANALYSIS
Detection of nucleotide changes by SSCP was performed as described by Orita et al. (1989). Briefly, each PCR™ product was mixed with an equal volume of denaturing buffer and heated to 95°C for 3 min, after which the samples were loaded onto 0.8 mm thick, 10% native polyacrylamide gels. Gels were run at ambient temperature at 9 W for 6-10 hours, depending on the size of the PCR™ product. Bands were visualized by silver staining (Beidler et al., 1982).
5.4.1.5 ALLELE-SPECIFIC OLIGONUCLEOTIDE ANALYSIS
PCR™ products spanning the mutation site in patient 371 were transferred to nylon membranes using a slot blot apparatus. Approximately 5 ng of each PCR™ product was treated with a denaturing solution (0.5 M NaOH, 1.5 M NaCl), split in half and loaded in duplicate. Two 17-mer oligonucleotides were synthesized that span the region containing the mutation. One contained the sequence of the normal allele (5'-CGCACATGGCAACCCTT-3') (SEQ ID NO:73), while the other contained the sequence of the mutant allele (5'-GCACATGGGCAACCCTT-3') (SEQ ID NO:74). These were end-labeled with γ32P-dATP using T4 polynucleotide kinase and hybridized to the membranes at 50°C. Hybridization and wash buffers were as described (Church and Gilbert, 1984). Membranes were sequentially washed at 45°C, 55°C and 65°C for 10 min each and exposed to X-ray film.
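The relationship between the two allele-specific 17-mers can be checked computationally. The following Python sketch is an illustrative aid only and is not part of the described method; the helper name `find_single_insertions` is hypothetical. It assumes the two oligonucleotide sequences exactly as printed above and asks which single-base insertion converts the normal sequence into the mutant sequence.

```python
# Illustrative sketch only; the helper name is hypothetical, not from the source.
NORMAL = "CGCACATGGCAACCCTT"  # normal-allele 17-mer (SEQ ID NO:73)
MUTANT = "GCACATGGGCAACCCTT"  # mutant-allele 17-mer (SEQ ID NO:74)


def find_single_insertions(shorter, longer, alphabet="ACGT"):
    """Return every (index, base) whose insertion into `shorter` yields `longer`."""
    hits = []
    for i in range(len(shorter) + 1):
        for base in alphabet:
            if shorter[:i] + base + shorter[i:] == longer:
                hits.append((i, base))
    return hits


# The mutant 17-mer lacks the 5'-most base of the normal 17-mer (both share the
# same 3' end), so it is compared with the normal sequence minus its first base.
core = NORMAL[1:]
hits = find_single_insertions(core, MUTANT)
print(hits)
```

Every hit reported by the sketch is an inserted G, consistent with the G insertion described for patient 371. Because the insertion lies within a run of G residues, the sequence comparison alone cannot say which G of the run is the inserted one.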
5.4.2 RESULTS
5.4.2.1 A QUESTION OF TWO BG GENES
In order to resolve the dilemma created by the existence of two different bg candidate genes (Lyst and BG), the inventors isolated and sequenced additional mouse cDNA and genomic clones corresponding to the 3' end of Lyst. An anchored, nested PCR™ (3'RACE-PCR™) from this region yielded two fragments (1.25 kb and 2 kb). The 1.25 kb clone contained the previously published 3' end of Lyst, while the 2 kb clone contained sequences derived from Lyst (at the 5' end) and from BG (at the 3' end). Reverse transcription and PCR™ (RT-PCR™) confirmed that nucleotides 1-4706 of Lyst also represent the previously undetermined 5' end of the BG open reading frame (FIG. 15c). A full length cDNA was assembled from nucleotides 1-4706 of Lyst, the 2 kb 3'RACE-PCR™ clone and 6824 nucleotides of BG cDNA. This 11,817 bp cDNA sequence (Lyst-I, Genbank accession number U70015) corresponds to the largest mRNA observed in Northern blots (~12 kb) (Goodrich and Holcombe, 1995).
Analysis of a P1 genomic clone (number 8592) containing Lyst and BG revealed that the 11,817 bp Lyst-I cDNA results from splicing of Lyst exon σ (containing nucleotide 4706) to downstream exon τ (FIG. 15b). Incomplete splicing and reading through the intron σ' interposed between exons σ and τ yields the 5893 bp cDNA described by Barbosa et al. (1996) (Lyst-II, FIG. 15b, Genbank accession number L77884). Intron σ' encodes 37 in-frame amino acids followed by a stop codon and a polyadenylation signal. Lyst-II corresponds to a smaller (~4 kb) mRNA observed on Northern blots. Lyst-I and Lyst-II are both present in poly(A)+ RNA from many mouse tissues (FIG. 15b). The putative Lyst-I protein is of relative molecular mass 425,287 (Mr 425K) while that of Lyst-II is predicted to be of Mr 172.5K.
5.4.2.2 SEQUENCE OF HUMAN LYST1 AND LYST2 cDNAs
cDNAs corresponding to LYST1, the human homolog of Lyst1-isoform I (which is the largest mRNA isoform of the bg gene), were obtained by identification of human expressed sequence tags (ESTs) similar in sequence to mouse Lyst1 by database searches (Genbank accession numbers L77889, W26957 and H51623). Intervening cDNA sequences were isolated using RT-PCR™ with primers derived from mouse Lyst1 sequence and adjacent ESTs. The partial LYST1 cDNA sequence (Genbank accession number U70064, 7.1 kb) was assembled by alignment of these clones with mouse Lyst1 cDNA. Human LYST1 has 82% predicted amino acid identity with mouse Lyst1 over 1,990 amino acids. The predicted human LYST1 amino acid sequence contains a 6 amino acid insertion relative to mouse Lyst1 at residue 1,039. Recently, another group has published the sequence of the human LYST1 cDNA (Nagle et al., 1996). The cDNA sequence of the present invention differs at 4 nucleotides and 3 predicted amino acids from that of Nagle et al. (1996). This 13.5 kb cDNA sequence corresponds to the largest mRNA (LYST1-isoform I) observed on northern blots of human tissues (caption in FIG. 2).
These northern blots also demonstrated the existence of a smaller LYST isoform (~4.5 kb, designated LYST1-isoform II) that was similar in size to the smaller mouse Lyst1 mRNA, and that appeared to differ in distribution of expression in human tissues from LYST1-isoform I. Assuming that the genomic derivation of human LYST1-isoform II was the same as mouse Lyst1-isoform II, the sequence of the 3' end of the human LYST1-II isoform was sought by cloning human LYST1 intron F' using PCR™ of human genomic DNA with primers derived from LYST1 exon F and mouse intron F' (caption in FIG. 2). The sequence of the 5' end of human LYST1 intron F' contained 17 codons in frame with LYST1 exon F, followed by a stop codon. By amplification of a LYST1-isoform II cDNA from human peripheral blood RNA by RT-PCR™ with primers from a 5' LYST1 exon and LYST1 intron F', it was demonstrated that this intron was indeed retained in human LYST1-isoform II mRNA. Nucleotides 1-5905 of human LYST1-isoform II cDNA are identical to LYST1-isoform I, and are followed by intron F' sequence (Genbank accession number U84744) (FIG. 2). The predicted intron-encoded amino termini of the mouse Lyst1-isoform II and human LYST1-isoform II peptides shared 65% identity.
The only significant sequence similarity of LYST1-isoform II to known proteins was with the stathmin family. Identity with mouse Lyst1-isoform II in this region (amino acids 376-540) was 92% (and similarity was 99%) (FIG. 5).
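Percent identity figures of the kind quoted above are computed from a pairwise alignment. The following Python sketch shows one common convention (identical residues divided by aligned columns, skipping columns that are gaps in both sequences). It is an assumption for illustration; the source does not state which program settings or denominator were used, and the short peptide strings in the example are invented, not taken from LYST1.

```python
# Illustrative sketch only: one common definition of percent identity.
# The function name and the toy sequences below are hypothetical.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b and a != gap:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


# Toy example with invented peptides (4 of 5 columns identical).
print(percent_identity("ACDEF", "ACDQF"))
```

Similarity, as opposed to identity, additionally counts conservative substitutions and therefore depends on the scoring matrix chosen, which this sketch does not model.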
5.4.2.3 GENETIC AND PHYSICAL MAPPING OF LYST
A 2 kb human LYST probe was assigned to human chromosome 1 by hybridization to human-rodent somatic cell hybrid DNA (FIG. 16). All of the bands that segregated with human DNA hybridized only to somatic cell hybrids containing human chromosome 1 DNA.
In order to precisely map LYST on human chromosome 1, LYST probes were hybridized to YAC clones encompassing the CHS critical region (FIG. 16b and FIG. 16c) (Barrat et al., 1996). Three probes, derived from different segments of the LYST cDNA, each hybridized to five CHS critical region YACs (FIG. 16d), confirming localization to the correct interval.
Genetic mapping in 504 [C57BL/6J-_ / x (C57BL/6J- x CAST/EiJ)F1] backcross mice was used to determine whether LYST was the human homolog of the mouse bg gene. Using one XbaI and two TaqI RFLPs, LYST was shown to cosegregate with bg and Lyst on mouse Chromosome 13.
5.4.2.4 MUTATION ANALYSIS
As an initial screen for LYST mutations in CHS patients, we analyzed northern blots of poly(A)+ RNA from CHS patients. The largest LYST mRNA species (LYST-I, approximately 12 kb) was greatly reduced in abundance or absent in lymphoblastoid mRNA of patients P1 and P3, respectively (FIG. 4a), while the smaller LYST transcript (LYST-II, approximately 4.4 kb) was both present and undiminished in abundance. Rehybridization of this blot with an actin probe confirmed that absence of the larger transcript was not due to uneven gel loading or RNA degradation. Fibroblast poly(A)+ RNA from three other CHS patients (369, 371 and 373) showed a moderate reduction in LYST-I mRNA (51-60% of control by densitometry), while the LYST-II mRNA was essentially unaltered in abundance (103-147% of control).
Single-strand conformation polymorphism (SSCP) analysis was undertaken using cDNA samples derived from lymphoblastoid or fibroblast cell lines from CHS patients. Anomalous bands were detected in PCR™ products from the 5' end of the LYST ORF in two unrelated CHS patients different from those with aberrant northern blot patterns (371 and 373, FIG. 4b). Subsequent sequence analysis identified a C to T transition at nucleotide 148 of the coding domain in patient 373 (FIG. 4c). Four of nine cDNA clones derived from patient 373 contained this mutation. Restriction enzyme digestion confirmed this mutation: TaqI digestion of LYST cDNA (nucleotides 520 to 808) showed loss of this restriction site in patient 373 to be heterozygous. The C to T substitution creates a stop codon at amino acid 50 (R50X).
Patient 371 had previously been shown to have a frame-shift mutation with a G insertion at nucleotide 118 of the coding domain (FIG. 4c) (Barbosa et al., 1996). Each of five cDNA clones isolated from lymphoblasts of patient 371 was found to contain this mutation. Allele-specific oligonucleotide hybridization of cDNA from this patient failed to detect a signal with an oligonucleotide corresponding to the normal allele, suggesting that the patient is either homozygous or hemizygous for this mutation.
Mutations were identified in three other CHS patients. cDNA isolated from EBV-transformed lymphoblasts from patient 372 (deposited at the Coriell Institute as GM03365) contained a homozygous C to T transition at nucleotide 3310 of the coding domain that created a stop codon at amino acid 1104 (R1104X) (Nagle et al., 1996). Patient 370 contained a homozygous C to T transition at nucleotide 3085 of the coding domain that created a stop codon at amino acid 1029 (Q1029X). Patient 369 had a heterozygous frame shift mutation. Nucleotides 3073 and 3074 of the coding domain were deleted in two of five cDNA clones isolated from this patient. The deletion results in a frame shift at codon 1026 and termination at codon 1030.
Lymphoblasts from all of these patients (369, 370, 371, 372, 373, P1 and P3) contain the giant perinuclear lysosomal vesicles that are the hallmark of CHS. Patients 369, 370, and 371 had typical clinical presentations of CHS, with recurrent childhood infections and oculocutaneous albinism. The parents of patients 369 and 370 are known not to have been consanguineous. In contrast, the clinical course of patients 372 and 373 was milder. Lymphoblasts were immortalized from patient 372 at 27 years of age. He had oculocutaneous albinism, recurrent skin infections, and peripheral neuropathy. Patient 373 has not had systemic infections and is alive at age 37. Patient 373 does, however, have hypopigmented hair and irides as well as peripheral neuropathy.
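The correspondence between coding-domain nucleotide positions and the affected codons for the three nonsense mutations (nucleotide 148 and R50X, 3310 and R1104X, 3085 and Q1029X) follows from simple arithmetic. The Python sketch below is illustrative only; the function names are hypothetical. It assumes 1-based numbering from the first base of the start codon and the standard genetic code.

```python
# Illustrative sketch only; function names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}            # standard genetic code
ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]
GLN_CODONS = ["CAA", "CAG"]


def codon_of(nt):
    """Map a 1-based coding nucleotide to (codon number, position in codon 1..3)."""
    return (nt - 1) // 3 + 1, (nt - 1) % 3 + 1


def c_to_t_stop_codons(codons, pos):
    """Codons that become a stop codon when a C at 1-based position `pos` becomes T."""
    result = []
    for codon in codons:
        if codon[pos - 1] == "C":
            mutated = codon[:pos - 1] + "T" + codon[pos:]
            if mutated in STOP_CODONS:
                result.append((codon, mutated))
    return result


for nt in (148, 3310, 3085):
    print(nt, codon_of(nt))

print(c_to_t_stop_codons(ARG_CODONS, 1))   # arginine codons (R50X, R1104X)
print(c_to_t_stop_codons(GLN_CODONS, 1))   # glutamine codons (Q1029X)
```

Each of the three positions is the first base of its codon, and under the standard code a first-position C to T change turns the arginine codon CGA into TGA, and the glutamine codons CAA or CAG into TAA or TAG.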
5.4.2.5 EXPRESSION OF LYST-I AND LYST-II IN HUMAN TISSUES
Analysis of northern blots of mouse mRNA had suggested that the relative abundance of mouse Lyst-I and Lyst-II transcripts differed from tissue to tissue (Barbosa et al., 1996). The relative abundance of LYST mRNA isoforms in human tissues at different developmental stages was examined by sequential hybridization of a poly(A)+ RNA dot blot with several LYST cDNA probes. The quantity of poly(A)+ RNA loaded on the blot was normalized to eight housekeeping genes (phospholipase, ribosomal protein S9, tubulin, a highly basic 23 kD protein, glyceraldehyde-3-phosphate dehydrogenase, hypoxanthine guanine phosphoribosyl transferase, β-actin, and ubiquitin) to allow estimation of the relative abundance of LYST mRNA isoforms in different tissues.
Using a probe that hybridized only to LYST-I transcripts (the largest LYST isoform) on northern blots (Barbosa et al., 1996), LYST-I mRNA was found to be most abundant in thymus (adult and fetal), peripheral blood leukocytes, bone marrow, and several regions of the adult brain. In contrast, no LYST-I mRNA was detected in fetal brain. Negligible LYST-I transcription was also apparent in heart, lung, kidney, or liver at any developmental stage.
A somewhat different pattern of expression was evident upon rehybridization of the blot with a probe derived from the 5' end of the coding domain of LYST, a region that hybridized to both LYST-I and LYST-II mRNAs on northern blots (Barbosa et al., 1996). Consistent with the pattern of LYST-I transcription, this probe detected abundant expression in peripheral blood leukocytes, thymus (adult and fetal), and bone marrow, and negligible expression in skeletal muscle. However, several tissues with abundant LYST-I transcripts exhibited considerably less hybridization signal with the LYST-I + LYST-II probe, including most regions of the adult brain, fetal and adult thymus, and spleen. Furthermore, several tissues with negligible LYST-I transcription exhibited intense hybridization with the LYST-I + LYST-II probe, including adult and fetal heart, kidney, liver, and lung, and adult aorta, thyroid gland, salivary gland, appendix, and fetal brain.
5.4.3 DISCUSSION
As described above, the novel mouse gene, Lyst (Lysosomal trafficking regulator), was identified from a bg critical region YAC and shown to be mutated in two bg alleles. The inventors also identified two human ESTs similar in sequence to mouse Lyst and identified a mutation in one of these ESTs in a CHS patient. Simultaneously, another group published a partial cDNA sequence (BG) that had been isolated from the same YAC (Perou et al., 1996a). This partial cDNA was mutated in two other bg alleles, but was different in sequence from Lyst. The inventors have resolved this bg gene dilemma by demonstrating that Lyst and BG sequences are derived from a single gene with alternatively spliced mRNAs. The unrelated cDNA sequences that had been reported are derived from non-overlapping parts of two Lyst isoforms with different predicted C-terminal regions. The inventors described a 5893 bp cDNA (Lyst) while Perou et al. reported a partial cDNA sequence (BG) without a 5' end (Perou et al., 1996a). By sequencing additional RT-PCR™ products, the inventors have shown that nucleotides 1-4706 of Lyst also represent the previously undetermined 5' region of BG. Alternative splicing at nucleotide 4706, however, results in bg gene isoforms that contain the 3' region of BG or Lyst. Splicing of Lyst exon σ (containing nucleotide 4706) to exon τ results in an mRNA (Lyst-I) that corresponds to the largest band observed on Northern blots and that contains BG sequence at the 3' end. Incomplete splicing at nucleotide 4706 results in the 5893 bp cDNA (Lyst-II) described by Barbosa et al. (1995), which contains intron-derived sequence at the 3' end. Lyst-II corresponds to a smaller mRNA observed on Northern blots. While several other genes generate an alternative C-terminus by incomplete splicing (Myers et al, 1995, Sugimoto et al, 1995, Sygiyama et al, 1996, Zhao and Manlley, 1996, Van De Wetering et al, 1996), the bg gene is unique in that the predicted structures of the two C-termini are quite different. The C-terminus of Lyst-I contains a 'WD'-repeat domain that is similar to the β-subunit of heterotrimeric G proteins and which may assume a propeller-like secondary structure (Lambright et al., 1996). In contrast, Lyst-II has a C-terminal prenylation motif that could provide attachment to the lysosomal membrane. Although the prenylation signal is absent from Lyst-I, it contains a hydrophobic region that is predicted to be membrane associated. The significance of these divergent features is increased by the fact that Lyst is not predicted to have transmembrane helices.
Identification of the human homolog of the bg gene, LYST, provided a second line of evidence that Lyst and BG are derived from a single gene, since the LYST sequence overlaps both Lyst and BG. The LYST cDNA identified corresponds to the mouse Lyst-I isoform. Northern blots of human tissues had suggested that a similar complexity exists in the transcription of LYST, the homologous human gene (Barbosa et al., 1996). We recently identified two human ESTs homologous to mouse Lyst and described a mutation in one of these ESTs in a CHS patient (Barbosa et al., 1996). Subsequently, another group published the cDNA sequence of the largest LYST isoform (LYST-I), and identified mutations in this gene in 2 additional patients with CHS (Nagle et al., 1996). Here we have described the identification of a second isoform of human LYST. This cDNA, designated LYST-II, encodes a protein of 1531 amino acids that is homologous to mouse Lyst-II. Like the latter, human LYST-II mRNA arises through incomplete splicing and retention of a transcribed intron that encodes the C-terminus of the predicted LYST-II protein. The mouse and human LYST-II-specific codons share 65% predicted amino acid identity. The stop codon, however, is not precisely conserved between human and mouse LYST-II. While mouse Lyst-II is predicted to contain a C-terminal prenylation motif (CYSP), translation of human LYST-II is predicted to terminate 22 codons earlier and to lack this motif.
Several of the predicted structural features of mouse Lyst were conserved in human. The most notable of these was a region similar in sequence to stathmin (amino acids 376-540). While mouse and human LYST had an overall amino acid identity of 81%, identity in the stathmin-like domain was 92% (and similarity was 99%). Stathmin is a coiled-coil phosphoprotein thought to regulate microtubule polymerization and to act as a relay for intracellular signal transduction (Sobel, 1991; Belmont and Mitchison, 1996). This region of LYST may encode a coiled-coil protein interaction domain and may regulate microtubule-mediated lysosome trafficking. Intriguingly, a defect in microtubule dynamics has previously been documented in CHS (Oliver et al., 1975) and intact microtubules are required for maintenance of lysosomal morphology and trafficking (Matteoni and Kreis, 1987; Swanson et al., 1987; Swanson et al., 1992; Oka and Weigel, 1983).
Other putative structural features of LYST that are conserved between human and mouse are several pairs of predicted helices with a protein kinase C- or casein kinase II-phosphorylation signal at their N-terminus. These helical bundles have been hypothesized to have a signal transduction function similar to stathmin. The conserved phosphorylation sites have been hypothesized to affect interactions of LYST with other molecules through phosphorylation-dependent conformational shifts in the helical bundles. The conservation of these features between human and mouse lends credence to their biological relevance.
In order to evaluate the candidacy of LYST for CHS, segments of the LYST sequence were mapped in the human genome. The CHS locus was recently assigned to human chromosome 1q42-43 (Goodrich and Holcombe, 1995; Barrat et al., 1996; Fukai et al., 1996), a result that had been expected based on linkage conservation between the mouse chromosome 13 region containing the bg locus and human chromosome 1q42-q43 (Beguez-Cesar, 1943). D1S2680 and D1S163 were previously shown to represent the telomeric and centromeric limits, respectively, of the CHS critical region (Barrat et al., 1996). Human LYST mapped within this CHS critical region. The localization of all LYST PCR™ products to CHS critical region YACs also precluded the possibility that the LYST sequence had been assembled from segments of closely related genes.
Northern blots demonstrated a 12 kb mRNA (corresponding to LYST-I) to be severely reduced in abundance in two CHS patients. A 4.4 kb band (corresponding to LYST-II), however, was present in mRNA from these patients in normal abundance. These results suggest that, at least in some patients, CHS results from loss of the protein encoded by LYST-I rather than LYST-II. This result is surprising since previous Northern blots had suggested that the major LYST mRNA in granular cells was LYST-II, while LYST-I was either undetectable or barely detectable in these cells. Because lysosomal trafficking defects in granular cells account for the clinical features of CHS (Griffiths, 1996), it had been hypothesized that the 4.4 kb LYST-II mRNA represented the transcript of primary functional significance. In this context, it is interesting to note that the bg^ mutation results in the generation of a premature stop codon in Lyst-I that is unlikely to affect Lyst-II mRNA processing (Perou et al., 1996a). These results suggest that defects in LYST-I alone can elicit CHS and that LYST-II expression alone cannot compensate for loss of LYST-I.
Mutations were identified within the coding domain of LYST in five CHS patients, two of which have been reported previously (Barbosa et al., 1996; Nagle et al., 1996). The genetic lesions in three CHS patients (370, 372 and 373) were C to T transitions that resulted in premature termination (Q1029X, R1104X and R50X, respectively) (Nagle et al., 1996). Two other patients had coding domain frame shift mutations that induced premature termination. One of these, patient 371, had a G insertion at nucleotide 118 of the coding domain, leading to premature termination at codon 63 (Barbosa et al., 1996). Allele-specific oligonucleotide analysis indicated that this mutation was either homozygous or that mRNA corresponding to this region is not produced from the other allele (hemizygosity). Patient 369 was heterozygous for a dinucleotide deletion that results in premature termination at codon 1030. Interestingly, all bg and CHS mutations identified to date are predicted to result in the production of either truncated or absent LYST proteins (Barbosa et al., 1996; Nagle et al., 1996). Unlike Fanconi anemia, type C, there does not appear to be a correlation between the length of the truncated LYST proteins (which may or may not be stable) and clinical features or disease severity in CHS patients. However, until the other mutant alleles in patients 369 and 373 are identified, and the exact effects of each mutation at the protein level are characterized, such correlation is imprecise.
Comparison of transcription of LYST-I and LYST-II in human tissues at different developmental stages revealed an overlapping but distinct pattern of expression. A quantitative estimate of the expression of the smaller LYST mRNA isoforms was obtained by subtraction of the relative hybridization intensity obtained with an LYST-I specific probe from that obtained with a probe that hybridizes to all LYST transcripts. LYST-I transcripts predominated in thymus, fetal thymus, spleen, and brain (with the exception of amygdala, occipital lobe, putamen, and pituitary gland). Both LYST-I and LYST-II transcripts were abundant in the latter brain tissues, peripheral blood leukocytes, and bone marrow. Only the smaller LYST isoforms were expressed in several tissues, including heart, fetal heart, aorta, thyroid gland, salivary gland, kidney, liver, fetal liver, appendix, lung, fetal lung, and fetal brain. The developmental pattern of LYST mRNA isoform expression in brain was particularly interesting, since only the smaller LYST isoforms were expressed in fetal brain, whereas the largest isoform (LYST-I) predominated in many regions of the adult brain.
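The subtraction estimate described above can be written out as a short calculation. The Python sketch below is illustrative only: the function name is hypothetical, the intensity values are invented placeholders rather than measured data, and clipping negative differences to zero is an assumption not stated in the source.

```python
# Illustrative sketch only. The intensities below are invented placeholders,
# not measurements from the dot blot described in the text.

def smaller_isoform_estimate(all_transcripts, lyst1_specific):
    """Estimate smaller-isoform signal per tissue as (all-transcript probe) minus
    (LYST-I specific probe), using normalized relative hybridization intensities."""
    estimate = {}
    for tissue, total in all_transcripts.items():
        difference = total - lyst1_specific.get(tissue, 0.0)
        # Assumption: a negative difference is treated as no detectable signal.
        estimate[tissue] = max(difference, 0.0)
    return estimate


all_probe = {"tissue_a": 5.0, "tissue_b": 1.0}     # hypothetical values
lyst1_probe = {"tissue_a": 2.0, "tissue_b": 3.0}   # hypothetical values
print(smaller_isoform_estimate(all_probe, lyst1_probe))
```

The subtraction is meaningful only if both probes are normalized to the same loading controls and have comparable hybridization efficiency, which is why the estimate is best read as semi-quantitative.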
In summary, the inventors have shown that the same gene is mutated in human CHS and bg mice. Without bone marrow transplantation, CHS patients typically die in childhood of infection and malignancy. The existence of an animal model of CHS with a similar genetic lesion will assist efforts to develop novel therapies for this disease.
5.5 EXAMPLE 5 - DNA SEQUENCES OF MOUSE LYST1
5.5.1 cDNA SEQUENCE OF LONG ISOFORM (SEQ ID NO:3)
1 TGAGAGCTCA CGCTGGCCTG GCAGCCTTGG TGAGTCGGGA TTCTCCTGCA
51 CCGGCGGGCG AGAGCGCGCG GCGGACCACA GAGCGGAGGT GAAGCCTTAT
101 GCTGAGACAG TTTTATCTAG TTCATGAACC CAAATTATAT ACAAGCTGAA
151 TGTTACAGAA GTGCTGAAAG ACTGCTCTGT CATGAGCACG GACAGCAACT
201 CATTGGCACG TGAGTTTCTG ATTGATGTCA ACCAGCTTTG CAATGCAGTG
251 GTCCAGAGGG CAGAAGCCAG GGAAGAAGAA GAAGAGGAGA CACACATGGC
301 AACTCTTGGA CAGTACCTTG TCCATGGACG AGGATTTCTG TTACTTACCA
351 AACTAAATTC TATCATTGAT CAGGCCCTGA CATGCAGAGA AGAACTCCTG
401 ACTCTTCTTC TGTCGCTCCT TCCCTTGGTG TGGAAGATAC CTGTCCAGGA
451 ACAGCAGGCA ACAGATTTTA ACCTGCCACT GTCATCTGAT ATAATCCTGA
501 CCAAAGAAAA GAACTCAAGT TTGCAAAAAT CAACTCAGGG AAAATTATAT
551 TTAGAAGGAA GTGCTCCATC TGGTCAGGTT TCTGCAAAAG TAAACCTTTT
601 TCGAAAAATC AGGCGACAGC GTAAAAGTAC CCATCGTTAT TCTGTAAGAG
651 ATGCAAGAAA GACACAGCTC TCCACCTCTG ACTCCGAAGG CAACTCAGAT
701 GAAAAGAGTA CGGTTGTGAG TAAACACAGG AGGCTCCACG CGCTGCCACG
751 GTTCCTGACG CAGTCTCCTA AGGAAGGCCA CCTCGTAGCC AAACCTGACC
801 CCTCTGCCAC CAAAGAACAG GTCCTTTCTG ACACCATGTC TGTGGAAAAC
851 TCCAGAGAAG TCATTCTGAG ACAGGATTCA AATGGTGACA TATTAAGTGA
901 GCCAGCTGCT TTGTCTATTC TCAGTAACAT GAATAATTCT CCTTTTGACT
951 TATGTCATGT TTTGTTATCT CTATTGGAAA AAGTTTGTAA GTTTGACATT
1001 GCTTTGAATC ATAATTCTTC CCTAGCACTC AGTGTAGTAC CCACACTGAC
1051 TGAGTTCCTA GCAGGCTTTG GGGACTGCTG TAACCAGAGT GACACTTTGG
1101 AGGGACAACT GGTTTCTGCA GGTTGGACAG AAGAGCCGGT AGCTTTGGTT
1151 CAACGGATGC TCTTTCGAAC CGTGCTGCAC CTTATGTCAG TAGACGTTAG
1201 CACTGCAGAG GCAATGCCAG AAAGTCTTAG GAAAAATTTG ACTGAATTGC
1251 TTAGGGCAGC TTTAAAAATT AGAGCTTGCT TGGAAAAGCA GCCTGAGCCT
1301 TTCTCCCCGA GACAAAAGAA AACACTACAG GAGGTCCAGG AGGGCTTTGT
1351 ATTTTCCAAG TATCGTCACC GAGCCCTTCT ACTACCTGAG CTTCTGGAAG
1401 GAGTTCTACA GCTCCTCATC TCTTGTCTTC AGAGTGCAGC TTCAAATCCC
1451 TTTTACTTCA GTCAAGCCAT GGATTTAGTT CAAGAATTTA TCCAGCAGCA
1501 AGGATTTAAT CTCTTTGGAA CAGCAGTTCT TCAGATGGAA TGGCTGCTTA
1551 CAAGGGACGG TGTTCCTTCA GAAGCTGCAG AACATTTGAA AGCTCTGATA
1601 AACAGTGTAA TAAAAATAAT GAGTACTGTG AAAAAGGTGA AATCAGAGCA
1651 ACTTCATCAT TCCATGTGCA CAAGGAAAAG ACACCGGCGT TGTGAGTATT
1701 CCCACTTCAT GCAGCACCAC CGCGATCTTT CAGGGCTCCT GGTTTCAGCT
1751 TTTAAAAATC AGCTTTCTAA AAGCCCCTTT GAAGAGACCG CAGAGGGAGA
1801 TGTGCAGTAT CCAGAGCGCT GCTGCTGCAT CGCCGTGTGC GCTCACCAGT
1851 GCTTGCGCTT GCTGCAGCAG GTTTCCCTGA GCACCACGTG TGTCCAGATC
1901 CTATCAGGTG TACACAGTGT TGGAATCTGT TGTTGTATGG ATCCTAAGTC
1951 TGTGATCGCC CCTTTACTGC ATGCTTTTAA GTTGCCAGCA CTGAAAGCTT
2001 TCCAGCAGCA TATACTGAAT GTCCTGAGCA AACTTCTTGT GGATCAGTTA
2051 GGAGGAGCAG AGCTATCACC GAGAATTAAA AAAGCAGCTT GCAACATCTG
2101 TACTGTGGAC TCTGACCAAC TGGCTAAGTT AGGAGAGACA CTGCAAGGCA
2151 CCTTGTGTGG TGCTGGTCCT ACCTCCGGCT TGCCCAGTCC TTCCTACCGA
2201 TTTCAGGGGA TCCTGCCCAG CAGCGGCTCT GAAGACTTGC TGTGGAAGTG
2251 GGATGCATTA GAGGCTTATC AGAGCTTTGT CTTTCAAGAA GACAGATTAC
2301 ATAACATTCA GATTGCAAAT CACATTTGTA ATTTACTCCA GAAAGGCAAT
2351 GTAGTTGTTC AGTGGAAATT GTATAATTAT ATCTTTAATC CTGTGCTCCA
2401 AAGAGGAGTT GAATTAGTAC ATCATTGTCA ACAGCTAAGC ATTCCTTCAG
2451 CTCAGACTCA CATGTGTAGC CAACTGAAAC AGTATTTGCC TCAGGAAGTG
2501 CTTCAGATTT ATTTAAAAAC TCTACCTGTC CTACTTAAAT CCAGGGTAAT
5301 ATAAAGACAT TTTGAGATGT GATGAAATCA GAGACCTTTT TATGACCAAG
5351 AAAGAAGTGG ATGTTGGTCT CTTAATTGAA AGTCTTTCAG TTGTTTATAC
5401 AACTTGCTGT CCTGCTCAGT ACACCATCTA TGAACCAGTG ATTCGACTCA
5451 AGGGCCAAGT GAAAACTCAG CCCTCTCAAA GACCCTTCAG CTCAAAGGAA
5501 GCCCAGAGCA TCTTGCTAGA ACCTTCTCAA CTCAAAGGCC TCCAACCTAC
5551 GGAATGTAAA GCCATCCAGG GCATTCTGCA TGAGATTGGT GGGGCTGGCA
5601 CATTTGTTTT TCTCTTTGCT AGGGTTGTTG AACTTAGTAG CTGTGAAGAA
5651 ACTCAAGCAT TAGCACTGCG GGTTATACTG TCTTTAATTA AGTACAGCCA
5701 ACAGAGAACA CAGGAACTGG AAAATTGTAA TGGACTCTCT ATGATTCACC
5751 AAGTGTTGGT CAAACAGAAA TGCATTGTTG GCTTTCACAT TTTGAAGACC
5801 CTTCTTGAAG GTTGCTGCGG TGAAGAAGTT ATCCACGTCA GTGAGCATGG
5851 AGAGTTCAAG CTGGATGTTG AGTCTCATGC TATAATCCAA GATGTTAAGC
5901 TGCTGCAGGA ACTGTTACTT GACTGGAAGA TATGGAATAA GGCAGAGCAA
5951 GGTGTGTGGG AGACTCTGCT AGCAGCTTTG GAAGTCCTCA TCCGGGTAGA
6001 GCACCACCAG CAGCAGTTTA ATATTAAGCA GTTGCTGAAC GCCCACGTGG
6051 TTCACCACTT CCTACTGACC TGTCAGGTTT TACAGGAACA CAGAGAGGGG
6101 CAGCTTACAT CTATGCCCCG AGAAGTTTGT AGATCATTTG TGAAAATCAT
6151 TGCAGAAGTC CTTGGTTCTC CTCCAGACTT GGAATTATTG ACAGTTATTT
6201 TCAATTTCCT GTTAGCTGTA CACCCTCCTA CTAATACTTA TGTTTGTCAC
6251 AATCCCACAA ACTTCTACTT CTCTTTGCAC ATAGATGGCA AGATCTTTCA
6301 GGAGAAAGTG CAGTCACTCG CGTACCTGAG GCATTCTAGC AGCGGAGGGC
6351 AAGCCTTTCC CAGCCCTGGA TTCCTGGTAA TAAGCCCATC TGCCTTTACT
6401 GCAGCTCCTC CTGAAGGAAC CAGTTCTTCC AATATTGTTC CACAGCGGAT
6451 GGCTGCTCAG ATGGTTCGAT CTAGAAGTCT ACCAGCATTT CCTACTTATT
6501 TACCACTAAT ACGAGCACAA AAACTGGCTG CAAGTTTGGG TTTTAGTGTT
6551 GACAAGTTAC AAAATATTGC AGATGCCAAC CCAGAGAAAC AGAATCTTTT
6601 AGGAAGACCC TACGCACTGA AAACAAGCAA AGAGGAAGCA TTCATCAGCA
6651 GCTGTGAGTC TGCAAAGACT GTTTGTGAAA TGGAGGCTCT TCTTGGAGCC
6701 CACGCCTCTG CCAATGGGGT TTCCAGAGGA TCACCGAGGT TCCCCAGGGC
6751 CAGAGTAGAT CACAAAGATG TGGGAACAGA GCCCAGATCA GATGATGACA
6801 GTCCTGGGGA TGAGTCTTAC CCACGTCGGC CTGACAACCT CAAGGGACTG
6851 GCCTCATTCC AGCGAAGCCA AAGCACTGTC GCAAGCCTTG GGCTGGCGTT
6901 TCCCTCTCAG AATGGATCTG CAGTTGCTAG CAGGTGGCCA AGTCTTGTTG
6951 ATAGGAATGC TGATGACTGG GAGAACTTTA CCTTTTCTCC TGCTTATGAG
7001 GCAAGCTACA ACCGAGCCAC AAGCACCCAC AGTGTCATTG AAGACTGTCT
7051 GATACCTATC TGCTGTGGAT TATATGAACT CTTAAGTGGG GTTCTTCTTG
7101 TCCTGCCTGA TGCTATGCTT GAAGATGTGA TGGACAGGAT TATTCAAGCA
7151 GATATTCTTC TAGTCCTTGT TAACCACCCA TCACCTGCTA TCCAGCAAGG
7201 AGTAATTAAA CTGTTACATG CATACATTAA TAGAGCATCA AAGGAGCAAA
7251 AGGACAAGTT TCTGAAGAAC CGTGGCTTTT CCTTATTAGC CAACCAGTTG
7301 TATCTTCATA GGGGAACTCA GGAGTTGTTG GAGTGCTTTG TTGAAATGTT
7351 CTTTGGTCGA CCGATTGGCC TGGATGAAGA ATTTGATCTG GAGGAAGTGA
7401 AGCACATGGA ACTGTTCCAG AAGTGGTCTG TCATTCCCGT TCTCGGACTA
7451 ATAGAGACCT CTCTCTATGA CAATGTCCTC TTGCACAATG CTCTTTTACT
7501 TCTTCTGCAA GTTTTAAACT CTTGTTCCAA GGTAGCAGAC ATGCTACTGG
7551 ACAATGGTCT ACTCTATGTA TTATGTAATA CAGTAGCAGC CCTGAATGGA
7601 TTAGAAAAGA ACATTCCTGT GAACGAATAC AAATTGCTCG CATGTGATAT
7651 ACAGCAGCTT TTCATAGCAG TTACAATTCA TGCTTGCAGT TCCTCAGGCA
7701 CACAGTATTT TAGAGTGATT GAAGACCTTA TTGTACTTCT TGGATATCTT
7751 CATAATAGCA AAAACAAGAG GACACAAAAT ATGGCTTTGG CCCTGCAGCT
7801 TAGAGTTCTC CAGGCTGCTT TGGAATTTAT AAGGAGCACA GCCAATCATG
7851 ACTCTGAAAG TCCAGTGCAC TCGCCTTCTG CCCACCGCCA TTCAGTGCCT
7901 CCG7ΛAGCGGA GAAGCATTGC TGGTTCTCGC AAATTCCCTC TGGCTCAGAC
7951 AGAGTCTCTG CTGATGAAGA TGCGCTCAGT GGCCAGCGAT GAGCTACACT
8001 CTATGATGCA GAGGAGGATG AGCCAAGAGC ACCCCAGCCA GGCCTCGGAG
5.5.2 cDNA SEQUENCE OF SHORT ISOFORM (SEQ ID NO:5)
1 TGAGAGCTCA CGCTGGCCTG GCAGCCTTGG TGAGTCGGGA TTCTCCTGCA
51 CCGGCGGGCG AGAGCGCGCG GCGGACCACA GAGCGGAGGT GAAGCCTTAT
101 GCTGAGACAG TTTTATCTAG TTCATGAACC CAAATTATAT ACAAGCTGAA
151 TGTTACAGAA GTGCTGAAAG ACTGCTCTGT CATGAGCACG GACAGCAACT
201 CATTGGCACG TGAGTTTCTG ATTGATGTCA ACCAGCTTTG CAATGCAGTG
251 GTCCAGAGGG CAGAAGCCAG GGAAGAAGAA GAAGAGGAGA CACACATGGC
301 AACTCTTGGA CAGTACCTTG TCCATGGACG AGGATTTCTG TTACTTACCA
351 AACTAAATTC TATCATTGAT CAGGCCCTGA CATGCAGAGA AGAACTCCTG
401 ACTCTTCTTC TGTCGCTCCT TCCCTTGGTG TGGAAGATAC CTGTCCAGGA
451 ACAGCAGGCA ACAGATTTTA ACCTGCCACT GTCATCTGAT ATAATCCTGA
501 CCAAAGAAAA GAACTCAAGT TTGCAAAAAT CAACTCAGGG AAAATTATAT
551 TTAGAAGGAA GTGCTCCATC TGGTCAGGTT TCTGCAAAAG TAAACCTTTT
601 TCGAAAAATC AGGCGACAGC GTAAAAGTAC CCATCGTTAT TCTGTAAGAG
651 ATGCAAGAAA GACACAGCTC TCCACCTCTG ACTCCGAAGG CAACTCAGAT
701 GAAAAGAGTA CGGTTGTGAG TAAACACAGG AGGCTCCACG CGCTGCCACG
751 GTTCCTGACG CAGTCTCCTA AGGAAGGCCA CCTCGTAGCC AAACCTGACC
801 CCTCTGCCAC CAAAGAACAG GTCCTTTCTG ACACCATGTC TGTGGAAAAC
851 TCCAGAGAAG TCATTCTGAG ACAGGATTCA AATGGTGACA TATTAAGTGA
901 GCCAGCTGCT TTGTCTATTC TCAGTAACAT GAATAATTCT CCTTTTGACT
951 TATGTCATGT TTTGTTATCT CTATTGGAAA AAGTTTGTAA GTTTGACATT
1001 GCTTTGAATC ATAATTCTTC CCTAGCACTC AGTGTAGTAC CCACACTGAC
1051 TGAGTTCCTA GCAGGCTTTG GGGACTGCTG TAACCAGAGT GACACTTTGG
1101 AGGGACAACT GGTTTCTGCA GGTTGGACAG AAGAGCCGGT AGCTTTGGTT
1151 CAACGGATGC TCTTTCGAAC CGTGCTGCAC CTTATGTCAG TAGACGTTAG
1201 CACTGCAGAG GCAATGCCAG AAAGTCTTAG GAAAAATTTG ACTGAATTGC
1251 TTAGGGCAGC TTTAAAAATT AGAGCTTGCT TGGAAAAGCA GCCTGAGCCT
1301 TTCTCCCCGA GACAAAAGAA AACACTACAG GAGGTCCAGG AGGGCTTTGT
1351 ATTTTCCAAG TATCGTCACC GAGCCCTTCT ACTACCTGAG CTTCTGGAAG
4151 AACTCTGATT TCCAAGCATG CCAGAGAGTA CTGGTGGATC TCTTGGTATC
4201 TTTGATGAGC TCAAGAACGT GTTCAGAAGA CTTAACTCTT CTTTGGAGAA
4251 TATTTCTGGA GAAATCTCCT TGTACAGAAA TTCTTCTCCT TGGTATTCAC
4301 AAAATTGTTG AAAGTGATTT TACTATGAGC CCTTCACAGT GTCTGACCTT
4351 TCCTTTCCTG CATACCCCGA GTTTAAGCAA TGGTGTCTTA TCAC GAAAC
4401 CTCCTGGGAT TCTTAACAGT AAAGCCTTAG GCTTATTGAG AAGAGCACGG
4451 ATTTCCCGAG GCAAGAAAGA GGCTGATAGA GAGAGTTTTC CCTATAGGCT
4501 GCTTTCCTCT TGGCATATAG CCCCAATCCA CCTGCCGTTG CTGGGACAGA
4551 ACTGCTGGCC ACACCTGTCA GAAGGATTTA GTGTTTCTCT TGTGGGTTTA
4601 ATGTGGAATA CATCCAATGA ATCCGAGAGT GCTGCAGAAA GGGGAAAAAG
4651 AGTAAAGAAA AGAAACAAAC CATCAGTTCT GGAAGACAGC AGTTTTGAAG
4701 GAGCAGGTAT GATGGCAGGG TCTGATCTAT AT CTAAGAT TCTTCAAATA
4751 GCTGCTTGCC TGAGTTTTAA GCATATCTGG CAGTATTTT ATGTATTCTT
4801 TAAATGTTAT TCACCTTAAA GATCCTACTT CACTACTGAA TTACCAAAGC
4851 CTGAGTTTTC AAACAGCCTT GAAATCTTCA TTGTCTCTAA ACTTTAGATA
4901 GGGAAGTGGG GATGCTCTGT TTCTGCAACA GCTGTTGAAG TTAGCAGTCC
4951 CATGACTGTG TTAGTGTGGC TTCTGATACT AGATAGTTAT AAATAAAACC
5001 CTATGGCCAT TTTTATTTTA AGTTCTCCTT CTGTGTCTTA CACCAATGGC
5051 CCCTTCTAGT TACTGTCCCT GATCATTTAT ATGTAACAGT CCAAAGTTAG
5101 AACAGAGTTC ATCTGTAACT GAAGAACTGC TGTTAGGATG TACTGAAATT
5151 GAATTTTGTT TTTGTTCTCT TCTTTTTTAA GCAATCAACA GTTTCTTAAG
5201 TCATATAGCA GCTAGAGGAA GTAGTCTTAA AAACTGGCTG TGTATTTTTT
5251 TAACCTGTTA AAAATGGTGG CTAATATTTT TATACCCTAA TAATTGATAA
5301 TGTTCCTCTT TTTTAAAAGT CTGAGCTTTT GGACATGCAC TGTTTATGTT
5351 AGTACATCTT AGCTTAGTTT AACATAAAGT CACATCATAG TAACAAATAG
5401 CTTATCACAC ATATTCCACC TGCCATTGCT GTCACAGATA ATGGGAATAT
5451 AGAGGCAACT CAAGATTTAA GTAGTAAGGT GCCATTGGGA GGGGTAAGCA
5501 GCTAGCTCAC AGCCATAAAC ACTTCTCTCA GCGGAGACAA ACTGTGATTC
5551 AGGGTTTGGC ATCACTTAGC ATGGTTATTT CAAGGTTGTT CACTACCTTA
5601 AA AATGATC ATTTGAGCAG TGCAGCTTTT CTAAGAAGAG TATTAATAAT
5651 ATTATAGATC GTGCCTTTGT AACAATTTTT TTAGTGCAAG GCATCTGTTG
5701 ATGGCATGTG CTCCCTGGGC CATGGTCAGT TGTGTTAGAG TGACCCAATC 5751 CAACAAAAGC AGAACCTTGG TATGGAGTGT GGCTGACGAT GGTCCTTTAG
5801 CACCCTCAGG CCTTGTAGTT TAAAGCATTT AATAACTTTT AAAACACTGG 5851 AGTCTTTAGT GAGGACCTGC CCGGGCGGCC GCCACCGCGG TGG
5.6 EXAMPLE 6 - DEDUCED AMINO ACID SEQUENCES OF MOUSE LYST1 PROTEINS
5.6.1 PEPTIDE SEQUENCE OF LONG ISOFORM (SEQ ID NO:4)
5.6.2 PEPTIDE SEQUENCE OF SHORT ISOFORM (SEQ ID NO:6)
5.7 EXAMPLE 7 - DNA SEQUENCES OF HUMAN LYST1 GENE
5.7.1 cDNA SEQUENCE OF LONG ISOFORM (SEQ ID NO:7)
1 CGCAAGGGCT TCTAAGAAGC CATCCCAATG ACCTTTTGGC TTTGAGAAGA 51 GCAGTCCTCA TACCAGAGTG TTTGGGGTTT TGGCCTCTTT CAGTGTTTAT 2851 CAGTGCTTGC CTCAGGACGT GCTTCAGATT TATGTAAAAA CTCTGCCTAT
2901 CCTGCTTAAA TCCAGGGTAA TAAGAGATTT GTTTTTGAGT TGTAATGGAG
2951 TAAGTCAAAT AATCGAATTA AATTGCTTAA ATGGTATTCG AAGTCATTCT 3001 CTAAAAGCAT TTGAAACTCT GATAATCAGC CTAGGGGAGC AACAGAAAGA 3051 TGCCTCAGTT CCAGATATTG ATGGGATAGA CATTGAACAG AAGGAGTTGT
3101 CCTCTGTACA TGTGGGTACT TCTTTTCATC ATCAGCAAGC TTATTCAGAT
3151 TCTCCTCAGA GTCTCAGCAA ATTTTATGCT GGCCTCAAAG AAGCTTATCC
3201 AAAGAGACGG AAGACTGTTA ACCAAGATGT TCATATCAAC ACAATAAACC
3251 TATTCCTCTG TGTGGCTTTT TTATGCGTAA GTAAAGAAGC AGAGTCTGAC 3301 AGGGAGTCGG CCAATGACTC AGAAGATACT TCTGGCTATG ACAGCACAGC
3351 CAGCGAGCCT TTAAGTCATA TGCTGCCATG TATATCTCTC GAGAGCCTTG
3401 TCTTGCCTTC TCCTGAACAT ATGCACCAAG CAGCAGACAT TTGGTCTATG
3451 TGTCGTTGGA TCTACATGTT GAGTTCAGTG TTCCAGAAAC AGTTTTATAG
3501 GCTTGGTGGT TTCCGAGTAT GCCATAAGTT AATATTTATG ATAATACAGA 3551 AACTGTTCAG AAGTCACAAA GAGGAGCAAG GAAAAAAGGA GGGAGATACA
3601 AGTGTAAATG AAAACCAGGA TTTAAACAGA ATTTCTCAAC CTAAGAGAAC
3651 TATGAAGGAA GATTTATTAT CTTTGGCTAT AAAAAGTGAC CCCATACCAT 3701 CAGAACTAGG TAGTCTAAAA AAGAGTGCTG ACAGTTTAGG TAAATTAGAG 3751 TTACAGCATA TTTCTTCCAT AAATGTGGAA GAAGTTTCAG CTACTGAAGC 3801 CGCTCCCGAG GAAGCAAAGC TATTTACAAG TCAAGAAAGT GAGACCTCAC
3851 TTCAAAGTAT ACGACTTTTG GAAGCCCTTC TGGCCATTTG TCTTCATGGT
3901 GCCAGAACTA GTCAACAGAA GATGGAATTG GAGTTACCTA ATCAGAACTT
3951 GTCTGTGGAA AGTATATTAT TTGAAATGAG GGACCATCTT TCCCAGTCAA
4001 AGGTGATTGA AACACAACTA GCAAAGCCTT TATTTGATGC CCTGCTTCGA 4051 GTTGCCCTCG GGAATTATTC AGCAGATTTT GAACATAATG ATGCTATGAC
4101 TGAGAAGAGT CATCAATCTG CAGAAGAATT GTCATCCCAG CCTGGTGATT
4151 TTTCAGAAGA AGCTGAGGAT TCTCAGTGTT GTAGTTTTAA ACTTTTAGTT
4201 GAAGAAGAAG GTTACGAAGC AGATAGTGAA AGCAATCCTG AAGATGGCGA
4251 AACCCAGGAT GATGGGGTAG ACTTAAAGTC TGAAACAGAA GGTTTCAGTG 4301 CATCAAGCAG TCCAAATGAC TTACTCGAAA ACCTCACTCA AGGGGAAATA
4351 ATTTATCCTG AGATTTGTAT GCTGGAATTA AATTTGCTTT CTGCTAGTAA
4401 AGCCAAACTT GATGTGCTTG CCCATGTATT TGAGAGTTTT TTGAAAATTA
4451 TTAGGCAGAA AGAAAAGAAT GTTTTTCTGC TCATGCAACA GGGAACTGTG
4501 AAAAATCTTT TAGGAGGGTT CTTGAGTATT TTAACACAGG ATGATTCTGA 4551 TTTTCAAGCA TGCCAGAGAG TATTGGTGGA TCTTTTGGTA TCTTTGATGA
4601 GTTCAAGAAC ATGTTCAGAA GAGCTAACCC TTCTTTTGAG AATATTTCTG
4651 GAGAAATCTC CTTGTACAAA AATTCTTCTT CTGGGTATTC TGAAAATTAT
4701 TGAAAGTGAT ACTACTATGA GCCCTTCACA GTATCTAACC TTCCCTTTAC
4751 TGCACGCTCC AAATTTAAGC AACGGTGTTT CATCACAAAA GTATCCTGGG 4801 ATTTTAAACA GTAAGGCCAT GGGTTTATTG AGAAGAGCAC GAGTTTCACG
4851 GAGCAAGAAA GAGGCTGATA GAGAGAGTTT TCCCCATCGG CTGCTTTCAT
4901 CTTGGCACAT AGCCCCAGTC CACCTGCCGT TGCTGGGGCA AAACTGCTGG
4951 CCACACCTAT CAGAAGGTTT CAGTGTTTCC CTGTGGTTTA ATGTGGAGTG
5001 TATCCATGAA GCTGAGAGTA CTACAGAAAA AGGAAAGAAG ATAAAGAAAA 5051 GAAACAAATC ATTAATTTTA CCAGATAGCA GTTTTGATGG TACAGAGAGC
5101 GACAGACCAG AAGGTGCAGA GTACATAAAT CCTGGTGAAA GACTCATAGA
5151 AGAAGGATGT ATTCATATAA TTTCACTGGG ATCCAAAGCG TTGATGATCC
5201 AAGTGTGGGC TGATCCCCAC AATGCCACTC TTATCTTTCG TGTGTGCATG
5251 GATTCAAATG ATGACATGAA AGCTGTTTTA CTAGCACAGG TTGAATCACA 5301 GGAGAATATT TTCCTCCCAA GCAAATGGCA ACATTTAGTA CTCACCTACT
5351 TACAGCAGCC CCAAGGGAAA AGGAGGATTC ATGGGAAAAT CTCCATATGG
5401 GTCTCTGGAC AGAGGAAGCC TGATGTTACT TTGGATTTTA TGCTTCCAAG
5451 AAAAACAAGT TTGTCATCTG ATAGCAATAA AACATTTTGC ATGATTGGCC
5501 ATTGTTTATC ATCCCAAGAA GAGTTTTTGC AGTTGGCTGG AAAATGGGAC 5551 CTGGGAAATT TGCTTCTCTT CAACGGAGCT AAGGTTGGTT CACAAGAGGC CTTTTATCTG TATGCTTGTG GACCCAACCA TACATCTGTA ATGCCATGTA AGTATGGCAA GCCAGTCAAT GACTACTCCA AATATATTAA TAAAGAAATT TTGCGATGTG AACAAATCAG AGAATTTTTT ATGACCAAGA AAGATGTGGA TATTGGTCTC TTAATTGGAG TCTTTCAGTT GTTTATACAA CTTACTGTCC TGCTCCAGTA TACCATCTAT GAACCAGTGA TTAGACTTAA AGGTCAAATG AAAACCCAAC TCTCTCAAAG ACCCTTCAGC TCAAAAGAAG TTCAGAGCAT CTTATTAGAA CCTCATCATC TAAAGAATCT CCAACCTACT GAATATAAAA CTATTCAAGG CATTCTGCAC GAAATTGGTG GAACTGGCAT ATTTGTTTTT CTCTTTGCCA GGGTTGTTGA ACTCAGTAGC TGTGAAGAAA CTCAAGCATT AGCACTGCGA GTTATACTCT CATTAATTAA ATACAACCAA CAAAGAGTAC ATGAATTAGA AAATTGTAAT GGACTTTCTA TGATTCATCA GGTGTTGATC AAACAAAAAT GCATTGTTGG GTTTTACATT TTGAAGACCC TTCTTGAAGG ATGCTGTGGT GAAGATATTA TTTATATGAA TGAGAATGGA GAGTTTAAGT TGGATGTAGA CTCTAATGCT ATAATCCAAG ATGTTAAGCT GTTAGAGGAA CTATTGCTTG ACTGGAAGAT ATGGAGTAAA GCAGAGCAAG GTGTTTGGGA AACTTTGCTA GCAGCTCTAG AAGTCCTCAT CAGAGCAGAT CACCACCAGC AGATGTTTAA TATTAAGCAG TTATTGAAAG CTCAAGTGGT TCATCACTTT CTACTGACTT GTCAGGTTTT GCAGGAATAC AAAGAGGGGC AACTCACACC CATGCCCCGA GAGATGGCAA GATCTTTCAG GAGAAAGTGC GGTCAATCAT GTACCTGAGG CATTCCAGCA GTGGAGGAAG GTCCCTTATG AGCCCTGGAT TTATGGTAAT AAGCCCATCT GGTTTTACTG CTTCACCATA TGAAGGAGAG AATTCCTCTA ATATTATTCC ACAACAGATG GCCGCCCATA TGCTGCGTTC TAGAAGCCTA CCAGCATTCC CTACTTCTTC ACTACTAACG CAATCACAAA AACTGACTGG AAGTTTGGGT TGTAGTATCG ACAGGTTACA AAATATTGCA GATACTTATG TTGCCACCCA ATCAAAGAAA CAAAATTCTT TGGGGAGTTC CGACACACTG AAAAAAGGCA AAGAGGACGC ATTCATCAGT AGCTGTGAGT CTGCAAAAAC TGTTTGTGAA ATGGAAGCTG TCCTCTCAGC CCAGGTCTCT GTCAGTGATG TCCCAAAGGG AGTGCTGGGA TTTCCAGTGG TCAAAGCAGA TCATAAACAG TTGGGAGCAG AACCCAGGTC AGAAGATGAC AGTCCTGGGG ATGAGTCCTG CCCANCGCCG AGCCCTATGC A
5.7.2 cDNA SEQUENCE OF SHORT ISOFORM (SEQ ID NO:9)
1 CGCAAGGGCT TCTAAGAAGC CATCCCAATG ACCTTTTGGC TTTGAGAAGA
51 GCAGTCCTCA TACCAGAGTG TTTGGGGTTT TGGCCTCTTT CAGTGTTTAT 101 TCATTCTTAC GTGGGAAAGT TGTATTCCGA GGTTTCTGTG GTGCATGAAG
151 CTTTTGCCTT CACCATCTGT TCCCGTGTCT TCCTCGGGTG ACATCAGAGT
201 ACAGCAGTAT TTTCCCTTGC CATCTAATGG GGTTTGGGCT GTTTGACTCA
251 ACCGTGTGTG TTCCTCAATG CCAGGGGAAT AATCCTACCC TAAGTCAGCT
301 GAACAGAAGC CAGGATCTAA CTGCAAACAA GAGACCCAGC TTGCTTAACA 351 GCATGGAAGA GAACCAGTTT CCTTGCAGCT ACCTGGGAAG ACGGTTGCTA
401 ATTAGCCTGC AACAAAGAGT TCCTTGCTCA TCTAAAAGAG GCAATCACCG
451 TTCAGGTGAA GCTTTGTTCT AAGAATATTT GTTTCATCTA GTTTATGAGT
501 CCAAATGATA TAGACTGTAA ATGTCACAGC AGTGGTGAAA GACTGCTCGG
551 TCATGAGCAC CGACAGTAAC TCACTGGCAC GTGAATTTCT GACCGATGTC 601 AACCGGCTTT GCAATGCAGT GGTCCAGAGG GTGGAGGCCA GGGAGGAAGA
651 AGAGGAGGAG ACGCACATGG CAACCCTTGG ACAGTACCTT GTCCATGGTC
701 GAGGATTTCT ATTACTTACC AAGCTAAATT CTATAATTGA TCAGGCATTG
751 ACATGTAGAG AAGAACTCCT GACTCTTCTT CTGTCTCTCC TTCCACTGGT
801 ATGGAAGATA CCTGTCCAAG AAGAAAAGGC AACAGATTTT AACCTACCGC 851 TCTCAGCAGA TATAATCCTG ACCAAAGAAA AGAACTCAAG TTCACAAAGA
901 TCCACTCAGG AAAAATTACA TTTAGAAGGA AGTGCCCTGT CTAGTCAGGT
951 TTCTGCAAAA GTAAATGTTT TTCGAAAAAG CAGACGACAG CGTAAAATTA CCCATCGCTA TTCTGTAAGA GATGCAAGAA AGACACAGCT CTCCACCTCA GATTCAGAAG CCAATTCAGA TGAAAAAGGC ATAGCAATGA ATAAGCATAG AAGGCCCCAT CTGCTGCATC ATTTTTTAAC ATCGTTTCCT AAACAAGACC ACCCCAAAGC TAAACTTGAC CGCTTAGCAA CCAAAGAACA GACTCCTCCA GATGCTATGG CTTTGGAAAA TTCCAGAGAG ATTATTCCAA GACAGGGGTC AAACACTGAC ATTTTAAGTG AGCCAGCTGC CTTGTCTGTT ATCAGTAACA TGAACAATTC TCCATTTGAC TTATGTCATG TTTTGTTATC TTTATTAGAA AAAGTTTGTA AGTTTGACGT TACCTTGAAT CATAATTCTC CTTTAGCAGC CAGTGTAGTG CCCACACTAA CTGAATTCCT AGCAGGCTTT GGGGACTGCT GCAGTCTGAG CGACAACTTG GAGAGTCGAG TAGTTTCTGC AGGTTGGACC GAAGAACCGG TGGCTTTGAT TCAAAGGATG CTCTTTCGAA CAGTGTTGCA TCTTCTGTCA GTAGATGTTA GTACTGCAGA GATGATGCCA GAAAATCTTA GGAAAAATTT AACTGAATTG CTTAGAGCAG CTTTAAAAAT TAGAATATGC CTAGAAAAGC AGCCTGACCC TTTTGCACCA AGACAAAAGA AAACACTGCA GGAGGTTCAG GAAGATTTTG TGTTTTCAAA GTATCGTCAT AGAGCCCTTC TTTTACCTGA GCTTTTGGAA GGAGTTCTTC AGATTCTGAT CTGTTGTCTT CAGAGTGCAG CTTCAAATCC CTTCTACTTC AGTCAAGCCA TGGATTTGGT TCAAGAATTC ATTCAGCATC ATGGATTTAA TTTATTTGAA ACAGCAGTTC TTCAAATGGA ATGGCTGGTT TTAAGAGATG GAGTTCCTCC CGAGGCCTCA GAGCATTTGA AAGCCCTAAT AAATAGTGTG ATGAAAATAA TGAGCACTGT CAAAAAAGTG AAATCAGAGC AACTTCATCA TTCGATGTGT ACAAGAAAAA GGCACAGACG ATGTGAATAT TCTCATTTTA TGCATCATCA CCGAGATCTC TCAGGTCTTC TGGTTTCGGC TTTTAAAAAC CAGGTTTCCA AAAACCCATT TGAAGAGACT GCAGATGGAG ATGTTTATTA TCCTGAGCGG TGCTGTTGCA TTGCAGTGTG TGCCCATCAG TGCTTGCGCT TACTGCAGCA GGCTTCCTTG AGCAGCACTT GTGTCCAGAT CCTATCGGGT GTTCATAACA TTGGAATATG CTGTTGTATG GATCCCAAAT CTGTAATCAT TCCTTTGCTC CATGCTTTTA AATTGCCAGC ACTGAAAAAT TTTCAGCAGC ATATATTGAA TATCCTTAAC AAACTTATTT TGGATCAGTT AGGAGGAGCA GAGATATCAC CAAAAATTAA AAAAGCAGCT TGTAATATTT GTACTGTTGA CTCTGACCAA CTAGCCCAAT TAGAAGAGAC ACTGCAGGGA AACTTATGTG ATGCTGAACT CTCCTCAAGT TTATCCAGTC CTTCTTACAG ATTTCAAGGG ATCCTGCCCA GCAGTGGATC TGAAGATTTG TTGTGGAAAT GGGATGCTTT AAAGGCTTAT CAGAACTTTG TTTTTGGAGA AGACAGATTA CATAGTATAC AGATTGCAAA TCACATTTGC AATTTAATCC AGAAAGGCAA TATAGTTGTT CAGTGGAAAT TATATAATTA CATATTTAAT 
CCTGTGCTCC AAAGAGGAGT TGAATTAGCA CATCATTGTC AACACCTAAG CGTTACTTCA GCTCAAAGTC ATGTATGTAG CCATCATAAC CAGTGCTTGC CTCAGGACGT GCTTCAGATT TATGTAAAAA CTCTGCCTAT CCTGCTTAAA TCCAGGGTAA TAAGAGATTT GTTTTTGAGT TGTAATGGAG TAAGTCAAAT AATCGAATTA AATTGCTTAA ATGGTATTCG AAGTCATTCT CTAAAAGCAT TTGAAACTCT GATAATCAGC CTAGGGGAGC AACAGAAAGA TGCCTCAGTT CCAGATATTG ATGGGATAGA CATTGAACAG AAGGAGTTGT CCTCTGTACA TGTGGGTACT TCTTTTCATC ATCAGCAAGC TTATTCAGAT TCTCCTCAGA GTCTCAGCAA ATTTTATGCT GGCCTCAAAG AAGCTTATCC AAAGAGACGG AAGACTGTTA ACCAAGATGT TCATATCAAC ACAATAAACC TATTCCTCTG TGTGGCTTTT TTATGCGTAA GTAAAGAAGC AGAGTCTGAC AGGGAGTCGG CCAATGACTC AGAAGATACT TCTGGCTATG ACAGCACAGC CAGCGAGCCT TTAAGTCATA TGCTGCCATG TATATCTCTC GAGAGCCTTG TCTTGCCTTC TCCTGAACAT ATGCACCAAG CAGCAGACAT TTGGTCTATG TGTCGTTGGA TCTACATGTT GAGTTCAGTG TTCCAGAAAC AGTTTTATAG GCTTGGTGGT TTCCGAGTAT GCCATAAGTT AATATTTATG ATAATACAGA AACTGTTCAG AAGTCACAAA GAGGAGCAAG GAAAAAAGGA GGGAGATACA AGTGTAAATG AAAACCAGGA TTTAAACAGA ATTTCTCAAC CTAAGAGAAC TATGAAGGAA GATTTATTAT CTTTGGCTAT AAAAAGTGAC CCCATACCAT CAGAACTAGG TAGTCTAAAA AAGAGTGCTG ACAGTTTAGG TAAATTAGAG
5.8 EXAMPLE 8 - DEDUCED AMINO ACID SEQUENCES OF HUMAN LYST1 PROTEIN
5.8.1 PEPTIDE SEQUENCE OF LONG ISOFORM (SEQ ID NO:8)
5.8.2 PEPTIDE SEQUENCE OF SHORT ISOFORM (SEQ ID NO:10)
5.9 EXAMPLE 9 - IDENTIFICATION OF A DNA SEGMENT ENCODING LYST2
Lyst2 was identified in a search for human genes similar in sequence to Lyst1 (the CH gene). The mouse Lyst1 cDNA sequence was compared with GenBank sequences, and significant similarity (52%) was noted between residues 3275 to 3413 of Lyst1 (GenBank accession number U70015) and R17955. R17955 is an uncharacterized human expressed sequence tag 292 bp in length. The corresponding partial-length cDNA clone (#32273) was obtained from the IMAGE Consortium. This cDNA clone was derived from a cDNA library of human infant brain and is 1979 bp in length. The clone was designated human LYST2.
5.10 EXAMPLE 10 - DNA SEQUENCE OF THE HUMAN LYST2 GENE
The LYST2 clone was sequenced using standard methodologies. The DNA sequence is given below (SEQ ID NO:11).
1 ATACTTCTGA TGTAAAGGAA CTAATTCCAG AGTTCTACTA CCTACCAGAG 51 ATGTTTGTCA ACAGTAATGG ATATAATCTT GGAGTCAGAG AAGATGAAGT
101 AGTGGTAAAT GATGTTGATC TTCCCCCTTG GGCAAAAAAA CCTGAAGACT
151 TTGTGCGGAT CAACAGGATG GCCCTAGAAA GTGAATTTGT TTCTTGCCAA
201 CTTCATCAGT GGATCGACCT TATATTTGGC TATAAGCAGC GAGGACCAGA
251 AGCAGTTCGT GCTCTGAATG TTTTTCACTA CTTGACTTAT GAAGGCTCTG 301 TGAACCTGGA TAGTATCACT GATCCTGTGC TCAGGGAGGC CATGGAGGCA
351 CAGATACAGA ACTTTGGACA GACGCCATCT CAGTTGCTTA TTGAGCCACA
401 TCCGCCTCGG AACTCTGCCA TGCACCTGTG TTTCCTTCCA CAGAGTCCGC
451 TCATGTTTAA AGATCAGATG CAACAGGATG TGATAATGGT GCTGAAGTTT
501 CCTTCAAATT CTCCAGTAAC CCATGTGGCA GCCAACACTC TGCCCCACTT 551 GACCATCCCC GCAGTGGTGA CAGTGACTTG CAGCCGACTC TTTGCAGTGA
601 ATAGATGGCA CAACACAGTA GGCCTCAGAG GAGCTCCAGG ATACTCCTTG
651 GATCAAGCCC ACCATCTTCC CATTGAAATG GATCCATTAA TAGCCAATAA
701 TTCAGGTGTA AACAAACGGC AGATCACAGA CCTCGTTGAC CAGAGTATAC
751 AAATCAATGC ACATTGTTTT GTGGTAACAG CAGATAATCG CTATATTCTT 801 ATCTGTGGAT TCTGGGATAA GAGCTTCAGA GTTTATACTA CAGAAACAGG
851 GAAATTGACT CAGATTGTAT TTGGCCATTG GGATGTGGTC ACTTGCTTGG 901 CCAGGTCCGA GTCATACATT GGTGGGGACT GCTACATCGT GTCCGGATCT
951 CGAGATGCCA CCCTGCTGCT CTGGTACTGG AGTGGGCGGC ACCATATCAT
1001 AGGAGACAAC CCTAACAGCA GTGACTATCC GGCACCAAGA GCCGTCCTCA
1051 CAGGCCATGA CCATGAAGTT GTCTGTGTTT CTGTCTGTGC AGAACTTGGG 1101 CTTGTTATCA GTGGTGCTAA AGAGGGCCCT TGCCTTGTCC ACACCATCAC
1151 TGGAGATTTG CTGAGAGCCC TTGAAGGACC AGAAAACTGC TTATTCCCAC
1201 GCTTGATATC TGTCTCCAGC GAAGGCCACT GTATCATATA CTATGAACGA
1251 GGGCGATTCA GTAATTTCAG CATTAATGGG AAACTTTTGG CTCAAATGGA
1301 GATCAATGAT TCAACACGGG CCATTCTCCT GAGCAGTGAC GGCCAGAACC 1351 TGGTCACCGG AGGGGACAAT GGGGTAGTAG AGGTCTGGCA GGCCTGTGAC
1401 TTCAAGCAAC TGTACATTTA ACCCTGGATG TGATGCTGGC ATTAGAGCAA
1451 TGGACTTGTC CCATGACCAG AGGACTCTGA TCACTGGCAT GGCTTCTGGT
1501 AGCATTGTAG CTTTTAATAT AGATTTTAAT CGGTGGCATT ATGAGCATCA
1551 GAACAGATAC TGAAGATAAA GGAAGAACCA AAAGCCAAGT TAAAGCTGAG 1601 GGCACAAGTG CTGCATGGAA AGGCAATATC TCTGGTGGAA AAAATTCGTC
1651 TACATCGACC TCCGTTTGTA CATTCCATCA CACCCAGCAA TAGCTGTACA
1701 TTGTAGTCAG CAACCATTTT ACTTTGTGTG TTTTTTCACG ACTGAACACC
1751 AGCTGCTATC AAGCAAGCTT ATATCATGTA AATTATATGA ATTAGGAGAT
1801 GTTTTGGTAA TTATTTCATA TATTGTTGTT TATTGAGAAA AGGTTGTAGG 1851 ATGTGTCACA AGAGACTTTT GACAATTCTG AGGAACCTTG TGTCCAGTTG
1901 TTACAAAGTT TAAGCTTTGA ACCTAACCTG CATCCCATTT CCAGCCTCTT
1951 TTCAAGCTGA GAAAAAAAAA AAAAAAAAA (SEQ ID NO: 11)
This DNA sequence corresponds to the 3' end of the coding domain of human LYST2 and the 3' untranslated region.
5.11 EXAMPLE 11 - AMINO ACID SEQUENCE OF THE HUMAN LYST2 PROTEIN
Translation of the DNA of SEQ ID NO:11 provided the deduced amino acid sequence of the LYST2 protein (SEQ ID NO:12), which is shown below.
1 TSDVKELIPE FYYLPEMFVN SNGYNLGVRE DEVVVNDVDL PPWAKKPEDF
51 VRINRMALES EFVSCQLHQW IDLIFGYKQR GPEAVRALNV FHYLTYEGSV
101 NLDSITDPVL REAMEAQIQN FGQTPSQLLI EPHPPRNSAM HLCFLPQSPL
151 MFKDQMQQDV IMVLKFPSNS PVTHVAANTL PHLTIPAVVT VTCSRLFAVN
201 RWHNTVGLRG APGYSLDQAH HLPIEMDPLI ANNSGVNKRQ ITDLVDQSIQ
251 INAHCFVVTA DNRYILICGF WDKSFRVYTT ETGKLTQIVF GHWDVVTCLA
301 RSESYIGGDC YIVSGSRDAT LLLWYWSGRH HIIGDNPNSS DYPAPRAVLT
351 GHDHEVVCVS VCAELGLVIS GAKEGPCLVH TITGDLLRAL EGPENCLFPR
401 LISVSSEGHC IIYYERGRFS NFSINGKLLA QMEINDSTRA ILLSSDGQNL
451 VTGGDNGVVE VWQACDFKQL YI (SEQ ID NO:12)
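The translation step described in this example can be illustrated with a minimal sketch. It assumes the standard genetic code and reads the first 50 bases of SEQ ID NO:11, as printed in Example 10, in the third reading frame (offset 2). The function name translate and the frame offset are illustrative assumptions and are not part of the specification.

```python
# Minimal sketch: translate a DNA fragment with the standard genetic code.
# The table is built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1 or 2).

    Whitespace is ignored; stop codons are shown as '*', and any codon
    containing a non-ACGT character is shown as 'X'.
    """
    dna = "".join(dna.split()).upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 50 bases of SEQ ID NO:11 as printed in Example 10.
FIRST_50 = "ATACTTCTGA TGTAAAGGAA CTAATTCCAG AGTTCTACTA CCTACCAGAG"

if __name__ == "__main__":
    print(translate(FIRST_50, frame=2))  # TSDVKELIPEFYYLPE
```

The output agrees with the first 16 residues of the deduced peptide, which suggests that the coding frame of the printed cDNA begins at its third base.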
Amino acids 2 to 140 of the predicted human LYST2 protein share only 51.8% amino acid identity with amino acids 3275 to 3413 of mouse and human Lyst1. The C-terminal residues of LYST2 are not similar in sequence to LYST1, but do have a similar predicted secondary structure: this region of LYST1 contains WD repeats and is predicted to assume a propeller-like secondary structure, similar to the beta subunit of heterotrimeric G proteins. The corresponding region of LYST2 also contains WD repeats and is also similar in sequence to the beta subunit of heterotrimeric G proteins (30.4% identity from LYST2 amino acids 285 to 418 to the guanine nucleotide-binding protein beta subunit-like protein P49027). Furthermore, the stop codons of mouse Lyst1 and human LYST2 occur approximately the same distance from the matching region.
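The identity figure quoted above can be related to residue counts with a short sketch. It assumes an ungapped, position-by-position comparison, which the specification does not state; under that assumption the 139 positions spanned by amino acids 2 to 140 would contain 72 identical residues, since 72/139 rounds to 51.8%. The function name percent_identity and the two ten-residue strings are hypothetical illustrations, not data from the alignment.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two equal-length,
    ungapped sequences (an assumed, simplified scoring scheme)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy pair differing at one of ten positions.
toy = percent_identity("TSDVKELIPE", "TSDVKDLIPE")

# Span of the compared region, amino acids 2 to 140 inclusive.
span = 140 - 2 + 1

# Inferred (not stated in the text): 72 matches over 139 positions.
inferred = round(100.0 * 72 / span, 1)

if __name__ == "__main__":
    print(toy, span, inferred)
```

If the original comparison allowed gaps, the number of aligned positions would differ from 139 and the inferred count of 72 would not apply.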
5.12 EXAMPLE 12 - GENETIC MAPPING OF THE LYST2 GENE
By hybridization to Southern blots of human-rodent somatic cell hybrids, LYST2 was shown to map on human Chromosome 13. This is in contrast to LYST1, which maps on human Chromosome 1. Using an MspI restriction fragment length polymorphism, Lyst2 was mapped by cross-hybridization in the mouse. Linkage analysis using DNA from 93 intersubspecific backcross [C57BL/6J-bgJ x (C57BL/6J-bgJ x CAST/EiJ)F1] mice revealed Lyst2 to map to mouse Chromosome 3 between D3Mit21 and D3Mit22. This contrasts with Lyst1, which maps on mouse Chromosome 13. Pulsed field gel electrophoresis blots of mouse DNA hybridized with a Lyst2 probe showed a single band, indicating that Lyst2 is a single genetic locus.
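The backcross linkage analysis described in this example rests on counting recombinant animals between a gene and a flanking marker. The following is a minimal sketch of that arithmetic, assuming the usual backcross estimate (recombinants divided by animals typed, with a binomial standard error). The recombinant count used below is hypothetical; the specification reports only the panel size of 93 mice and the flanking markers.

```python
import math


def recombination_estimate(recombinants, total):
    """Return (distance_cM, standard_error_cM) for a backcross interval.

    distance = 100 * r, with r = recombinants / total
    standard error = 100 * sqrt(r * (1 - r) / total)
    """
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    r = recombinants / total
    se = math.sqrt(r * (1.0 - r) / total)
    return 100.0 * r, 100.0 * se


PANEL_SIZE = 93                # number of backcross mice stated in the text
HYPOTHETICAL_RECOMBINANTS = 3  # illustrative value only, not from the source

if __name__ == "__main__":
    distance, error = recombination_estimate(HYPOTHETICAL_RECOMBINANTS, PANEL_SIZE)
    print(f"{distance:.2f} +/- {error:.2f} cM")
```

With a panel of 93 animals a single recombinant corresponds to roughly 1.1 cM, which sets the approximate resolution of the interval between the two flanking markers.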
5.13 EXAMPLE 13 - EXPRESSION ANALYSIS OF THE LYST2 GENE
Hybridization of northern blots of human and mouse tissues with LYST2 revealed the following pattern of expression: Lyst2 is abundantly expressed in mouse brain, moderately expressed in mouse kidney, and weakly expressed in mouse heart, lung, skeletal muscle, and testis. Lyst2 is not expressed in mouse spleen or liver. The largest (and most prominent) band observed on northern blots was 13 kb in size (very similar to the largest Lyst1 mRNA). Additional transcripts of 6 kb and 5 kb were evident in mouse brain RNA. In selected human tissues, LYST2 was expressed as follows: moderate expression was observed in melanoma cells, and weak expression in HeLa cells, colorectal carcinoma cells, spleen, lymph node, thymus, and appendix. No expression was detected in peripheral blood leucocytes, bone marrow, fetal liver, lung carcinoma, or leukemia cell lines (K562, MOLT4, Raji, HL60).
The major transcript was 13 kb in size in human RNA.
In summary, the LYST2 transcript appears to be similar in size to the largest LYST1 mRNA, but has a very different tissue distribution of expression, being abundantly expressed only in brain. LYST2 appears to be a brain-specific homologue of LYST1, and may function to regulate protein trafficking to the lysosome and late endosome within the brain.
The relative abundance of LYST2 mRNA isoforms in human tissues at different developmental stages was examined by sequential hybridization of a poly(A)+ RNA dot blot with a LYST2 cDNA probe. The quantity of poly(A)+ RNA loaded on the blot was normalized to eight housekeeping genes (phospholipase, ribosomal protein S9, tubulin, a highly basic 23-kDa protein, glyceraldehyde-3-phosphate dehydrogenase, hypoxanthine guanine phosphoribosyl transferase, β-actin, and ubiquitin) to allow estimation of the relative abundance of LYST2 mRNA isoforms in different tissues.
Abundant LYST2 transcripts were detected in all brain regions and in kidney. LYST2 transcripts were detected in those regions at all developmental stages.
5.14 EXAMPLE 14 - IDENTIFICATION OF MOUSE LYST2 cDNA CLONES
A mouse embryo (day 14.5 post-coitum) cDNA library was hybridized with a probe corresponding to human LYST2. Two clones were isolated and sequenced. They contained overlapping sequences that were assembled by alignment with human LYST2 and represent 2543 bp of cDNA sequence.
5.15 EXAMPLE 15 - DNA SEQUENCE OF THE MOUSE LYST2 GENE
1 GCAGCAGGGC GAACCGGACC TCTGTGATGT TTAATTTTCC TGACCAAGCA
51 ACAGTTAAAA AAGTTGTCTA CAGCTTGCCT CGGGTTGGAG TGGGGACCAG
101 CTATGGTTTG CCACAAGCCA GGAGGATATC ACTGGCCACT CCTCGACAGC 151 TGTATAAGTC TTCCAATATG ACTCAGCGCT GGCAAAGAAG GGAAATCTCC AACTTTGAGT ATTTGATGTT TCTCAACACG ATAGCAGGTC GGACGTATAA TGATCTGAAC CAGTATCCTG TGTTTCCATG GGTGTTAACA AACTATGAAT CAGAGGAGTT GGACCTGACT CTCCCAGGAA ACTTCAGGCA TCTGTCAAAG CCAAAAGGTG CTTTGAACCC GAAGAGAGCA GTGTTTTACG CAGAGCGCTA TGAGACATGG GAGGAGGATC AAAGCCCACC CTTCCACTAC AACACACATT ACTCAACGGC GACTTCCCCC CTTTCATGGC TTGTTCGGAT TGAGCCATTC ACAACCTTCT TCCTCAATGC AAATGATGGG AAATTTGACC ATCCAGACCG AACCTTCTCA TCCATTGCAA GGTCATGGAG AACCAGTCAG AGAGATACAT CCGATGTCAA GGAACTAATT CCAGAGTTCT ATTACGTACC AGAGATGTTT GTCAACAGCA ATGGGTACCA TCTTGGAGTG AGGGAGGACG AAGTGGTGGT TAATGATGTG GACCTGCCCC CCTGGGCCAA GAAGCCAGAA GACTTTGTGC GGATCAACAG GATGGCCCTG GAAAGTGAAT TTGTTTCTTG CCAACTCCAT CAATGGATTG ACCTTATATT TGGCTACAAA CAGCGAGGGC CAGAGGCAGT CCGTGCTCTC AATGTTTTCC ACTACTTGAC CTACGAAGGC TCTGTAAACC TGGACAGCAT CACAGACCCT GTGCTCCGGG AGGCCATGGT TGCACAGATA CAGAACTTTG CCCAGACGCC ATCTCAGTTG CTCATTGAGC CGCATCCGCC TAGGACTTCA GCCATGCATC TGTGTTCCCT TCCACAGAGC CCACTCATGT TCAAAGATCA GATGCAGCAG GATGTGATCA TGGTGCTGAA GTTTCCATCC AATTCTCCTG TGACTCATGT GGCTGCCAAC ACCCTGCCCC ACCTGACCAT CCCTGCAGTG GTGACAGTGA CCTGCAGCCG ACTGTTTGCA GTGAACAGAT GGCACAACAC AGTCGGCCTC AGAGGAGCCC CCGGATACTC CTTGGATCAA GCACACCATC TTCCCATTGA GATGGACCCA TTAATCGCAA ATAACTCTGG TGTGAACAAG CGGCAGATCA CAGACCTTGT AGACCAGAGC ATCCAGATCA ATGCCCACTG CTTCGTGGTC ACAGCTGATA ATCGCTACAT CCTCATCTGT GGGTTTTGGG ATAAAAGTTT CAGAGTTTAC TCGACAGAAA CAGGGAAACT
GACACAGATT GTATTTGGCC ACTGGGATGT TGTCACATGC CTGGCCAGGT
CGGAGTCCTA CATTGGTGGA GACTGCTACA TAGTGTCTGG ATCTCGGGAC
GCCACCTTGC TTCTCTGGTA CTGGAGTGGG CGTCACCACA TCATCGGAGA
CAACCCCAAT AGCAGTGACT ATCCTGCGCC CAGAGCTGTC CTCACAGGCC ATGACCATGA AGTTGTCTGT GTCTCCGTCT GTGCAGAACT CGGACTCGTT ATCAGTGGTG CTAAAGAGGG CCCTTGCCTC GTTCATACCA TCACTGGAAA TCTGCTGAAG GCCCTGGAAG GACCAGAAAA CTGCTTATTT CCACGCCTAA TTTCGGTATC CAGTGAAGGC CACTGCATCA TATATTATGA GCGAGGACGG TTTAGCAACT TCAGCATCAA TGGGAAACTT TTGGCTCAAA TGGAGATCAA TGATTCCACT AGGGCTATTC TCCTGAGCAG CGATGGACAG AACCTGGTGA CTGGAGGGGA CAATGGTGTG GTGGAGGTCT GGCAGGCCTG TGACTTTAAG CAGCTGTACA TTTACCCAGG ATGTGATGCT GGCATTAGAG CGATGGATTT ATCCCATGAC CAAAGGACTC TGATCACTGG CATGGCTTCC GGCAGCATTG TACTTTTAAT ATAGATTTTA ATCGGTGGCA TTATGAGCAT CAGAACAGTA CTGAAGAGAA GCAGCAGAAG CCACATTCAA GTGAGAGCAC AAGTGCTTCT GTGGAAAGGC AGTATCTCTG GTGGGACGCT GGTCCACATC GGCCTCTGCT TGTACATCCA TCCCACCCAG CAGTCGCCGA ACATCATAGT CGGGAGCCAT TTCACCCTGT TTTTCCAGGA CTGAACACCA GCTGCTGTCA AGCAAGCTTA TATCATGTAA ATTATCTGAA TTAGGAGCCG TTTTGGTAAT TATTTCATAT ATCGCCGTTT ATTGAGAAAA GGTTGTAGGA AGCCTCACAA GAGACTTTTG ACAATTCTGA GGAACCTTGT GCCCAGTTGT TACAAAGTTT AAGCTTTGAA CCTAACTTGC ATCCCATTTC CAGCCTCGGG CTTCACTCGT GCC (SEQ ID NO: 13)
5.16 EXAMPLE 16 - DEDUCED AMINO ACID SEQUENCE OF MOUSE LYST2 PROTEIN
1 SRANRTSVMF NFPDQATVKK VVYSLPRVGV GTSYGLPQAR RISLATPRQL
51 YKSSNMTQRW QRREISNFEY LMFLNTIAGR TYNDLNQYPV FPWVLTNYES
101 EELDLTLPGN FRHLSKPKGA LNPKRAVFYA ERYETWEEDQ SPPFHYNTHY
Mouse Lyst2 shares 98% amino acid identity with human LYST2.
6. References
The following literature citations, as well as those cited above, are incorporated in pertinent part by reference herein for the reasons cited in the above text.
United States Patent 3,791,932
United States Patent 3,949,064
United States Patent 4,174,384
United States Patent 4,196,265
United States Patent 4,271,147
United States Patent 4,554,101
United States Patent 4,578,770
United States Patent 4,596,792
United States Patent 4,599,230
United States Patent 4,599,231
United States Patent 4,601,903
United States Patent 4,608,251
United States Patent 4,683,195
United States Patent 4,683,202
United States Patent 4,952,496
United States Patent 5,168,050
Allen and Choun, "Large Unilamellar Liposomes with Low Uptake into the Reticuloendothelial System," FEBS Lett., 223:42-46, 1987.
Altschul, Gish, Miller, Myers and Lipman, "Basic local alignment search tool," J. Mol. Biol., 215:403-410, 1990.
Baetz et al., "Loss of cytotoxic T lymphocyte function in Chediak-Higashi syndrome arises from a secretory defect that prevents lytic granule exocytosis," J. Immunol., 154:6122-6131, 1995.
Barbosa, Johnson, Achey, Gutierrez, Wakeland, Zerial and Kingsmore, "The rab protein family: genetic mapping of six Rab genes in the mouse," Genomics, 30:439-444, 1995.
Barbosa, Nguyen, Tchernev, Ashley, Detter, Blaydes, Brandt, Chotai, Hodgman, Solari, Lovett and Kingsmore, "Identification of the homologous beige and Chediak-Higashi syndrome genes," Nature, 382:262-265, 1996.
Barrat et al., "Genetic and physical mapping of the Chediak-Higashi syndrome on chromosome 1q42-43," Am. J. Hum. Genet., 59:625-632, 1996.
Barthold et al., Infect. Immun., 63:2255-2261, 1995.
Bayer and Wilchek, "The use of the avidin-biotin complex as a tool for molecular biology," In: Glick, D., Methods of Biochemical Analysis, John Wiley and Sons, New York, 1980.
Beguez-Cesar, "Neutropenia cronica maligna familiar con granulaciones atipicas de los leucocitos," Bol. Soc. Cubana Pediat., 15:900-922, 1943.
Beidler, Hilliard and Rill, "Ultrasensitive staining of nucleic acids with silver," Anal. Biochem., 126:374-380, 1982.
Belmont and Mitchison, "Identification of a protein that interacts with tubulin dimers and increases the catastrophe rate of microtubules," Cell, 84:623-631, 1996.
Bettenhausen and Gossler, "Efficient isolation of novel mouse genes differentially expressed in early postimplantation embryos," Genomics, 28:436-441, 1995.
Bianco et al., J. Histochem. Cytochem., 38:1549-1563, 1990.
Bishop, "The information content of phase-known matings for ordering genetic loci," Genet. Epidemiol., 2:349-361, 1985.
Bledsoe et al., J. Histochem. Cytochem., 176:7447-7455, 1994.
Blume and Wolff, "The Chediak-Higashi syndrome: studies in four patients and a review of the literature," Medicine Baltimore, 51:247-280, 1972.
Blume et al., "Defective granulocyte regulation in the Chediak-Higashi syndrome," N. Engl. J. Med., 279:1009-1015, 1968.
Bolivar et al., Gene, 2:95, 1977.
Brandt, Elliott and Swank, "Defective lysosomal enzyme secretion in kidneys of Chediak-Higashi (beige) mice," J. Cell Biol., 67:774-788, 1975.
Brunialti et al., "The mouse mutation progressive motor neuronopathy (pmn) maps to chromosome 13," Genomics, 29:131-135, 1995.
Burkhardt, Wiebel, Hester and Argon, "The giant organelles in beige and Chediak-Higashi fibroblasts are derived from late endosomes and mature lysosomes," J. Exp. Med., 178:1845-1856, 1993.
Campbell, "Monoclonal Antibody Technology, Laboratory Techniques in Biochemistry and Molecular Biology," Vol. 13, Burden and Von Knippenberg, Eds., pp. 75-83, Elsevier, Amsterdam, 1984.
Capecchi, "High efficiency transformation by direct microinjection of DNA into cultured mammalian cells," Cell, 22(2):479-488, 1980.
Chang et al., Nature, 375:615, 1978.
Cherry, Vaccine, 10:1033-1038, 1992.
Chopra et al., Biochem. J., 232:277-279, 1985.
Chou and Fasman, "Conformational Parameters for Amino Acids in Helical, β-Sheet, and Random Coil Regions Calculated from Proteins," Biochemistry, 13(2):211-222, 1974b.
Chou and Fasman, "Empirical Predictions of Protein Conformation," Ann. Rev. Biochem., 47:251-276, 1978b.
Chou and Fasman, "Prediction of β-Turns," Biophys. J., 26:367-384, 1979.
Chou and Fasman, "Prediction of Protein Conformation," Biochemistry, 13(2):222-245, 1974a.
Church and Gilbert, "Genomic sequencing," Proc. Natl. Acad. Sci. USA, 81:1991-1995, 1984.
Clapp, "Somatic gene therapy into hematopoietic cells: Current status and future implications," Clin. Perinatol., 20(1):155-168, 1993.
Clarke, Rev. Biochem., 61:355-386, 1995.
Coburn et al., "Diverse Lyme Disease Spirochetes Bind Integrin αIIbβ3 on Human Platelets," Infect. Immun., 62:5559-5567, 1994.
Collins, "Positional cloning moves from perditional to traditional," Nat. Genet., 9:347-350, 1995.
Couvreur et al., "Nanocapsules, a New Lysosomotropic Carrier," FEBS Lett., 84:323-326, 1977.
Couvreur, "Polyalkylcyanoacrylates as Colloidal Drug Carriers," Crit. Rev. Ther. Drug Carrier Syst., 5:1-20, 1988.
Cox et al., J. Virol., 67(9):5664-5667, 1993.
Curiel et al., "Adenovirus enhancement of transferrin-polylysine-mediated gene delivery," Proc. Natl. Acad. Sci. USA, 88(19):8850-8854, 1991.
Day et al., Biochem. J., 248:801-805, 1987.
Deleage and Roux, Protein Engng., 1:289-294, 1987.
Devereux, Haeberli and Smithies, "A comprehensive set of sequence analysis programs for the VAX," Nucleic Acids Res., 12:387-395, 1984.
Dietrich et al., "A genetic map of the mouse with 4,006 simple sequence length polymorphisms," Nat. Genet., 1:220-245, 1994.
Dreher et al., Eur. J. Cell Biol., 53:296-304, 1990.
Durkin et al., "Amino acid sequence and domain structure of entactin: Homology with epidermal growth factor precursor and low density lipoprotein receptor," J. Cell. Biol., 107:2749-2756, 1988.
Durkin et al., "Exon organization of the mouse entactin gene corresponds to the structural domains of the polypeptide and has regional homology to the low-density lipoprotein receptor gene," Genomics, 26:219-228, 1995.
Eglitis and Anderson, "Retroviral vectors for introduction of genes into mammalian cells," Biotechniques, 6(7):608-614, 1988.
Fiers et al., Nature, 273:113, 1978.
Fromm et al., "Expression of genes transferred into monocot and dicot plant cells by electroporation," Proc. Natl. Acad. Sci. USA, 82(17):5824-5828, 1985.
Fukai et al., "Homozygosity mapping of the gene for Chediak-Higashi syndrome to chromosome 1q42-q44 in a segment of conserved synteny that includes the mouse beige locus (bg)," Am. J. Hum. Genet., 59:620-624, 1996.
Fynan et al., "DNA vaccines, protective immunizations by parenteral, mucosal, and gene gun inoculations," Proc. Natl. Acad. Sci. USA, 90(24):11478-11482, 1993.
Gabizon and Papahadjopoulos, "Liposomes formulations with prolonged circulation time in blood and enhanced uptake by tumors," Proc. Natl. Acad. Sci. USA, 85:6949-6953, 1988.
Gallin et al., "Granulocyte function in the Chediak-Higashi syndrome of mice," Blood, 43:201-206, 1974.
Geourjon and Deleage, Protein Engng., 1:157-164, 1994.
Goding, Monoclonal Antibodies: Principles and Practice, pp. 60-74, 2nd Edition, Academic Press, Orlando, FL, 1986.
Goeddel et al., Nature, 281:544, 1979.
Goeddel et al., Nucl. Acids Res., 8:4057, 1980.
Goodrich and Holcombe, "Genetic localization of the gene for Chediak-Higashi syndrome to human chromosome 1q and linkage to nidogen," J. Invest. Med., 43:Suppl 1, 13a, 1995.
Goodrich and Holcombe, "Genetic localization of the gene for Chediak-Higashi syndrome to human chromosome 1q and linkage to nidogen," FASEB J., 43:13a, 1995.
Gow et al., "Cellular expression of the beige mouse mutation and its correction in hybrids with control human fibroblasts in vitro," Cell Dev. Biol., 29:884-891, 1993.
Graham and van der Eb, "Transformation of rat cells by DNA of human adenovirus 5," Virology, 54(2):536-539, 1973.
Green, "Linkage, recombination and mapping," In: Genetics and probabilities in animal breeding experiments, pp. 77-113, Macmillan, New York, 1981.
Gibrat, Garnier and Robson, J. Mol. Biol., 198:425-443, 1987.
Griffiths, "Secretory lysosomes - a special mechanism of regulated secretion in haemopoietic cells," Trends Cell Biol., 6:329-332, 1996.
Harlow and Lane, Antibodies: a Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1988.
Hearing et al., "The fine structure of melanogenesis in coat color mutants of the mouse," J. Ultrastruct. Res., 43:88-106, 1973.
Hedbom and Heinegard, J. Biol. Chem., 264:6898-6905, 1989.
Henry-Michelland et al., "Attachment of Antibiotics to Nanoparticles, Preparation, Drug-Release and Antimicrobial Activity in vitro," Int. J. Pharm., 35:121-127, 1987.
Hess et al., J. Adv. Enzyme Reg., 7:149, 1968.
Hitzeman et al., J. Biol. Chem., 255:2073, 1980.
Ho et al., "Site-Directed Mutagenesis by Overlap Extension Using the Polymerase Chain Reaction," Gene, 11:51-59, 1989.
Holcombe et al., "Linkage of loci associated with two pigment mutations on mouse chromosome 13," Genet. Res., 58:41-50, 1991.
Holcombe et al., "Lysosomal enzyme activities in Chediak-Higashi syndrome: evaluation of lymphoblastoid cell lines and review of the literature," Immunodeficiency, 5:131-140, 1994.
Holcombe et al., "Relationship of the genes for Chediak-Higashi syndrome (beige) and the T-cell receptor chain in mouse and man," Genomics, 1:287-291, 1987.
Holland et al., Biochemistry, 17:4900, 1978.
Hui and Joyner, "A mouse model of Greig cephalo-polysyndactyly syndrome: the extra-toesJ mutation contains an intragenic deletion of the Gli3 gene," Nat. Genet., 3:241-246, 1993.
Hunter et al., "Single-strand conformational polymorphism (SSCP) mapping of the mouse genome: integration of the SSCP, microsatellite, and gene maps of mouse chromosome 1," Genomics, 18:510-519, 1993.
Hunter, "Radioimmunoassay," In: Handbook of Experimental Immunology, D. M. Weir, ed., Blackwell Scientific Publications, Ltd., Oxford, U.K., p. 14.1-14.40, 1978.
Itakura et al., Science, 198:1056, 1977.
Ito et al., Biochem. Biophys. Res. Comm., 160:433, 1989.
Jameson and Wolf, Compu. Appl. Biosci., 4(1):181-6, 1988.
Jenkins, Justice, Gilbert, Chu and Copeland, "Nidogen/entactin (Nid) maps to the proximal end of mouse chromosome 13 linked to beige (bg) and identifies a new region of homology between mouse and human chromosomes," Genomics, 59:401-403, 1991.
Jones, Genetics, 85:12, 1977.
Justice et al., "A molecular genetic linkage map of mouse chromosome 13 anchored by the beige (bg) and satin (sa) loci," Genomics, 6:341-351, 1990.
Katz et al., "Mechanisms of human cell-mediated cytotoxicity. II. Correction of the selective defect in natural killing in the Chediak-Higashi syndrome with inducers of intracellular cyclic GMP," J. Immunol., 129:297-302, 1982.
Keller et al., J. Am. Med. Assoc., 271:1764-1768, 1994.
Kingsmore et al., "A 6000 kilobase segment of chromosome 1 is conserved in human and mouse," EMBO J., 8:4073-4080, 1989.
Kingsmore et al., "Glycine receptor β-subunit gene mutation in spastic mouse associated with LINE-1 element insertion," Nat. Genet., 7:136-142, 1994.
Kingsmore, Barbosa, Nguyen, Ashley, Blaydes, Tchernev, Detter and Lovett, "Physical mapping of the beige critical region on mouse Chromosome 13," Mamm. Genome, 7:773-775, 1996a.
Kingsmore, Barbosa, Tchernev, Detter, Lossie, Seldin and Holcombe, "Positional cloning of the Chediak-Higashi syndrome gene: Genetic mapping of the beige locus on mouse chromosome 13," J. Invest. Med., 44:454-461, 1996b.
Kohler and Milstein, Eur. J. Immunol., 6:511-519, 1976.
Kohler and Milstein, Nature, 256:495-497, 1975.
Kolbert et al., Res. Microbiol., 146:5, 1995.
Kuby, Immunology, 2nd Edition, W. H. Freeman & Company, New York, 1994.
Kusumi et al., "Construction of a large-insert yeast artificial chromosome library of the mouse genome," Mamm. Genome, 4:391-392, 1993.
Kyte and Doolittle, J. Mol. Biol., 157(1):105-132, 1982.
Lambright, Sondek, Bohm, Skiba, Hamm and Sigler, "The 2.0 Å crystal structure of a heterotrimeric G protein," Nature, 379:311-319, 1996.
Lane and Murphy, "Susceptibility to spontaneous pneumonitis in an inbred strain of beige and satin mice," Genetics, 72:451-460, 1972.
Lane, "Xt-bg-cr linkage," Mouse News Lett., 45:29, 1971.
Lovett et al, "Direct selection A method for the isolation of cDNAs encoded by large genomic regions," Proc. Natl. Acad. Sci. USA, 88 9628-9632, 1991 Lovrich et al, Infect. Immun., 63 21 13-2119, 1995
Lutzner, M A , Lowrie, C T , Jordan, H W (1967) Giant granules in leukocytes of the beige mouse J. Hered. 58, 299-300
Lyon and Meredith, "Muted, a new mutant affecting coat colour and otoliths of the mouse, and its position in linkage group XIV," Genet. Res. 14 163-166, 1969 Lyon et al, "Occurrences and linkage relations of the mutant extra-toes in the mouse," Genet.
Res. 9 383-385, 1967 Maeda, Sueishi and Lida, "A case report of Chediak-Higashi syndrome complicated with systemic amyloidosis and olivo-cerebellar degeneration, " Pathol Res. Pract., 185 231-237, 1989 Maloy, et al, Microbial Genetics 2nd Edition Jones and Bartlett Publishers, Boston, MA, 1994 Maniatis et al. , Molecular Cloning: a Laboratory Manual, Cold Spring Harbor Laboratory, Cold
Spring Harbor, NY , 1982 Mattei et al, "Chromosomal localization of murine ryanodine receptor genes RYRl, RYR2, and
RYR3 by in situ hybridization," Genomics 22 202-204, 1 94 Matteoni and Kreis, "Translocation and clustering of endosomes and lysosomes depends on microtubules," J. Cell Biol, 105 1253-1265, 1987 Maucuer, Ca onis, and Sobel, Proc. Natl Acad. Sci USA, 92 3100-3104, 1995 McBride et al, Genomics, 6 219-225, 1990
Meyers, Stevens and Padgett, "A platelet serotonin anomaly in the Chediak-Higashi syndrome,"
Res. Commun. Chem. Pathol. Pharmacol, 1 375-380, 1974 Misra, King, Harding, Muddle and Thomas, "Peripheral neuropathy in the Chediak-Higashi syndrome," Ada Neuropathol Berl, 81 354-358, 1991 Myers, Eng, Ponder and Mulligan, "Characterization of RET photo-oncogene 3' splicing variants and polyadenylation sites a novel C-terminus for RET," Oncogene, 11 2039-2045, 1995 Nagle et ai, "Identification and mutation analysis of the complete gene for Chediak-Higashi syndrome," Nature Genet., 14 307-311, 1996 Nakamura et al, "Enzyme Immunoassays Heterogenous and Homogenous Systems," Chapter 27 , 1987
Nea e et al, J. Biol Chem., 264 8653-8661, 1989 Novak et al, "Correction of symptoms of platelet storage pool deficiency in animal models for
Chediak-Higashi syndrome and Hermansky-Pudlak syndrome," Blood 66.1 196-1201 ,
1985 Novak et al, "Platelet storage pool deficiency in mouse pigment mutations associated with several distinct genetic loci," Blood 63.536-544, 1984.
Oka and Weigel, "Microtubule-depolymerizing agents inhibit asialo-orosomucoid delivery to lysosomes but not its endocytosis or degradation in isolated rat hepatocytes," Biochem.
Biphys. Ada, 763.368-376, 1983 Olάberg et al, FMBO J., 8:2601-2606, 1989 Oliver and Zurier, J. Clin. Invest., 57: 1239-1247, 1976.
Oliver, Zurier and Berlin, "Concanavalin A cap formation on polymorphonuclear leukocytes of normal and beige (Chediak-Higashi) mice," Nature, 253:471-473, 1975 Orita, Suzuki, Sekiya and Hayashi, "Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction," Genomics. 5 874-879, 1989 Penner and Prieur, "Interspecific genetic complementation analysis with fibroblasts from humans and four species of animals with Chediak-Higashi syndrome," Am. J. Med. Genet.
28 455-70, 1987 Perou and Kaplan, "Chediak-Higashi syndrome is not due to a defect in microtubule-based lysosomal mobility," J. Cell. Sci. 106 99-107, 1993 Perou and Kaplan, Somal. CellMolec Genet., 19:459-468, 1993.
Perou et ai, "Identification of the murine beige gene by YAC complementation and positional cloning," Nature Genet., 13 303-308, 1996a. Perou, Justice, Pryor, and Kaplan, "Complementation of the beige mutation in cultured cells by episomally replicating murine yeast artificial chromosomes," Proc. Natl Acad. Sci. USA., 93 5905-5909, 1996b.
Pettit and Berdal, "Chediak-Higashi syndrome Neurologic appearance," Arch. Neurol,
41 :1001-1002, 1984. Pierce et al, "A positive selection vector for cloning high molecular weight DNA by the bacteriophage PI system. Improved cloning efficacy," Proc. Natl Acad. Sci. USA, 89:2056-2060, 1992
Plaas <?t α/., J. Biol. Chem., 265:20634-20640, 1990.
Pringle and Dodd, J. Histochem. Cytochem., 38 1405-1411, 1990.
Prokop and Bajpai, "Recombinant DNA Technology I" Ann. N Y. Acad. Sci., Vol. 646, 1991 YD
Roder and Duwe, Nature, 278 451-453, 1979 Roder, Hahotis, Laing, Kozbor, Rubin, Pross, Boxer, White, Fauci, Mostowski and Matheson, "Further studies of natural killer cell function in Chediak-Higashi patients," Immunology, 46 555-560, 1982 Root, Rosenthal and Balestra, "Abnormal bactericidal, metabolic and lysosomal functions of Chediak-Higashi Syndrome leukocytes," J. Clin Invest., 51 649-665, 1972 Rost and Sander, Proteins, 19 55-72, 1994
Sambrook et ai, Molecular Cloning: A Laboratory Manual, 2nd Edition, Chapter 12 6, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989 Sato et al, , J. Leuk. Biol, 48 377-381, 1990
Saxena, Saxena and Adler, Nature, 295 240-241, 1982 Schaible et al, Proc. Natl. Acad. Sci. USA, 87 3768-3772, 1990 Schaible et α/., Vaccine, 1 1 1049-1054, 1993 Schwan et al, Proc. Natl Acad. Sci. USA, 92 2909-2913, 1995 Segal, Biochemical Calculations, 2nd Edition John Wiley and Sons, New York, 1976
Simmler et al, "Adaptation of the interspersed repetitive sequence polymerase chain reaction to the isolation of mouse DNA probes from somatic cell hybrids on a hamster background," Genomics, 10 770-778, 1991 Sobel, "Stathmin a relay phosphoprotem for multiple signal transduction?," Trends Biochem
Sobel, Trends Biochem. Sci., 16 301-305, 1991 Steere, New Engl. J. Med, 321 586-596, 1989 Steere, Proc. Nat. Acad. Sci. USA., 91 2378-2383, 1994 Stinchcomb et al, Nature, 282 39, 1979 Studier et al, "Use of T7 RNA Polymerase to Direct Expression of Cloned Genes" Methods
Enzymol 185 1990 Sugimoto, Kusakabe and Kai, "Analysis of the in vitro translation product of a novel-type Drosophila melanogaster aldolase mRNA in which two carboxyl-terminal exons remain unspliced," Arch
Biochem. Biophys., 323 361-366, 1995 Swank and Brandt, Am. J. Path., 92 755-769, 1978
Swanson, Bushnell and Silverstein, "Tubular lysosome morphology and distribution within macrophages depend upon the integrity of cytoplasmic microtubules," Proc. Nat. Acad. Sci. USA, 84 1921-1925, 1987 Swanson, Locke, Ansel and Hollenbeck, "Radial movement of lysosomes along microtubules in permealized macrophages," J. Cell Sci. , 103 210-209, 1992 Sygiyama, Nishio, Kishimoto and Akira, "Identification of alternative splicing form of Stat2," FEBS
Lett , 381 191-194, 1996 Takeuchi et al, "Lysosomal elastase and cathepsin G in beige mice Neutrophils of beige
(Chediak-Higashi) mice selectively lack lysosomal elastase and cathepsin G," J. Exp. Med.
163 665-77, 1986 Tang et α/., Nature, 356 152-154, 1992
Targan and Oseas, "The 'lazy' NK cells of Chediak-Higashi syndrome," J. Immunol. 130 2671-2674, 1983
Tschemper et al, Gene, 10 157, 1980
Ulmer et al, "Heterologous Protection Against Influenza by Injection of DNA Encoding a Viral
Protein," Science, 259 1745-1749, 1993 Uπoste et ai, J. Exp. Med, 180 1077-1085, 1994 Van De Wetering, Castrop, Koriinev and Clevers, "Extensive alternative splicing and dual promoter usage generate Tcf-1 protein isoforms with differential transcription control properties," Mol.
Cell B ol, 16 745-752, 1996 Vanderrest and Garrone, FASEB. J., 5 2814-2823, 1991 Vogel and Heinegard, J. Biol Chem., 260 9298-9306, 1985 Vogel and Trotter, Collagen Rel. Res. , 7 105- 1 14, 1987 Vogel et al, Biochem. J, 223 587-597, 1984 Vortkamp et al, "Deletion of GLI3 supports the homology of the human Greig cephalopolysyndactyly syndrome (GCPS) and the mouse mutation extra toes (Xt),"
Mamm. Genome 3 461-463, 1992 Wagner et ai, "Coupling of adenovirus to transferrin-polylysine/DNA complexes greatly enhances receptor-mediated gene delivery and expression of transfected genes," Proc.
Natl Acad. Sci. USA, 89(13) 6099-6103, 1992 Wang et al, J. Exp. Med, 111 699, 1993 Wang et al, J. Immunol, 150 3022, 1993 Whitton et al, J. Virol, 67(1) 348-352, 1993
Willingham, Spicer and Vincent, Expl Cell Res., 136 157-168, 1981 Wilske et al. , Infect. Immun. , 61 2182-2191, 1993 Wilske έ-t α/., J. Clin. Microbiol, 31 340, 1993 \V Wilske et ai, Scand. J. Infect. Dis. Suppl, 77: 108-129, 1991
Windhorst, Zelickson and Good, "A human pigmentary dilution based on a heritable subcellular structural defect - the Chediak-Higashi syndrome," J. Invest. Dermatol, 50 9-18 , 1968 Wo\f et a , Compu. Appl. Biosci., 4(1): 187-91, 1988. Wolff, Dale, Clark, Root, and Kimball, "The Chediak-Higashi syndrome studies of host defenses," Ann. Intern. Med., 76:293-306, 1972 Wong and Neumann, "Electric field mediated gene transfer," Biochem. Biophys. Res. Commun.
107(2):584-587, 1982 Yang and Russel, Proc Natl. Acad. Sci. USA, 87:4144-4148, 1990. Zhao and Manlley, "Complex alternative RNA processing generates an unexpected diversity of poly(A) polymerase forms," Mol. Cell Biol, 16.2378-2386, 1996 Zhao, Boissy, Abdel-Malek, King, Nordlund and Boissy, "On the analysis of the pathophysiology of
Chediak-Higashi syndrome Defects expressed by cultured melanocytes," Lab. Invest.,
71.25-34, 1994
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: University of Florida
(B) STREET: 223 Grinter Hall
(C) CITY: Gainesville
(D) STATE: Florida
(E) COUNTRY: USA
(F) POSTAL CODE (ZIP) : 32611
(ii) TITLE OF INVENTION: Lyst1 and Lyst2 Gene Compositions and Methods of Use
(iii) NUMBER OF SEQUENCES: 78
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/011,146
(B) FILING DATE: 01-FEB-1996
(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US UNKNOWN
(B) FILING DATE: 23-DEC-1996
(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/033,599
(B) FILING DATE: 20-DEC-1996
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3514 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TTTAAAAATT AGAGCTTGCT TGGAAAAGCA GCCTGAGCCT TTCTCCCCGA GACAAAAGAA 60
AACACTACAG GAGGTCCAGG AGGGCTTTGT ATTTTCCAAG TATCGTCACC GAGCCCTTCT 120
ACTACCTGAG CTTCTGGAAG GAGTTCTACA GCTCCTCATC TCTTGTCTTC AGAGTGCAGC 180
TTCAAATCCC TTTTACTTCA GTCAAGCCAT GGATTTAGTT CAAGAATTTA TCCAGCACCA 240
AGGATTTAAT CTCTTTGGAA CAGCAGTTCT TCAGATGGAA TGGCTGCTTA CAAGGGACGG 300
TGTTCCTTCA GAAGCTGCAG AACATTTGAA AGCTCTGATA AACAGTGTAA TAAAAATAAT 360
GAGTACTGTG AAAAAGGTGA AATCAGAGCA ACTTCATCAT TCCATGTGCA CAAGGAAAAG 420
ACACCGGCGT TGTGAGTATT CCCACTTCAT GCAGCACCAC CGCGATCTTT CAGGGCTCCT 480
GGTTTCAGCT TTTAAAAATC AGCTTTCTAA AAGCCCCTTT GAAGAGACCG CAGAGGGAGA 540
TGTGCAGTAT CCAGAGCGCT GCTGCTGCAT CGCCGTGTGC GCTCACCAGT GCTTGCGCTT 600
GCTGCAGCAG GTTTCCCTGA GCACCACGTG TGTCCAGATC CTATCAGGTG TACACAGTGT 660
TGGAATCTGT TGTTGTATGG ATCCTAAGTC TGTGATCGCC CCTTTACTGC ATGCTTTTAA 720
GTTGCCAGCA CTGAAAGCTT TCCAGCAGCA TATACTGAAT GTCCTGAGCA AACTTCTTGT 780
GGATCAGTTA GGAGGAGCAG AGCTATCACC GAGAATTAAA AAAGCAGCTT GCAACATCTG 840
TACTGTGGAC TCTGACCAAC TGGCTAAGTT AGGAGAGACA CTGCAAGGCA CCTTGTGTGG 900
TGCTGGTCCT ACCTCCGGCT TGCCCAGTCC TTCCTACCGA TTTCAGGGGA TCCTGCCCAG 960
CAGCGGCTCT GAAGACTTGC TGTGGAAGTG GGATGCATTA GAGGCTTATC AGAGCTTTGT 1020
CTTTCAAGAA GACAGATTAC ATAACATTCA GATTGCAAAT CACATTTGTA ATTTACTCCA 1080
GAAAGGCAAT GTAGTTGTTC AGTGGAAATT GTATAATTAT ATCTTTAATC CTGTGCTCCA 1140
AAGAGGAGTT GAATTAGTAC ATCATTGTCA ACAGCTAAGC ATTCCTTCAG CTCAGACTCA 1200
CATGTGTAGC CAACTGAAAC AGTATTTGCC TCAGGAAGTG CTTCAGATTT ATTTAAAAAC 1260
TCTACCTGTC CTACTTAAAT CCAGGGTAAT AAGAGATTTG TTTTTAAGTT GTAATGGAGT 1320
AAACCACATA ATTGAACTAA ATTACTTAGA TGGGATTCGA AGTCATTCCC TGAAAGCATT 1380
TGAAACTCTG ATTGTCAGCC TAGGGGAACA ACAGAAAGAT GCTGCAGTTC TAGACGTCGA 1440
TGGGTTAGAC ATCCAACAGG AGTTGCCGTC CTTAAGTGTG GGTCCTTCTC TTCATAAGCA 1500
GCAAGCTTCT TCAGATTCTC CTTGCAGTCT CAGGAAGTTT TATGCCAGCC TCAGAGAGCC 1560
TGATCCAAAA AAACGAAAGA CCATTCACCA GGATGTTCAC ATAAACACCA TAAACCTCTT 1620
CCTCTGTGTG GCTTTTCTAT GTGTCAGTAA AGAAGCAGAC TCTGATAGGG AGTCTGCCAA 1680
TGAGTCAGAA GATACTTCTG GCTATGACAG CCCTCCCAGT GAGCCATTAA GTCACATGCT 1740
ACCATGTCTG TCTCTTGAGG ACGTTGTCTT ACCTTCCCCT GAATGTTTGC ACCATGCAGC 1800
AGACATTTGG TCCATGTGTC GTTGGATCTA CATGTTGAAC TCAGTCTTCC AGAAACAATT 1860
TCACAGGCTT GGTGGTTTCC AAGTGTGCCA TGAATTAATA TTTATGATAA TCCAGAAACT 1920
ATTCAGAAGT CATACAGAGG ATCAAGGAAG AAGGCAGGGA GAAATGAGTA GAAATGAAAA 1980
CCAAGAGCTA ATCAGGATAT CTTACCCCGA GCTGACACTG AAGGGAGATG TATCATCTGC 2040
AACAGCACCA GACCTGGGAT TTCTGAGAAA GAGTGCTGAC AGCGTGCGTG GATTCCAGTC 2100
ACAGCCTGTG CTTCCCACAA GTGCAGAGCA GATTGTGGCT ACTGAATCTG TTCCTGGGGA 2160
ACGAAAGGCA TTTATGAGTC AACAAAGTGA GACTTCTCTC CAGAGCATAC GACTTTTGGA 2220
GTCTCTCCTG GACATTTGTC TTCATAGTGC CAGAGCCTGT CAACAGAAGA TGGAATTGGA 2280
GCTACCGTCT CAGGGCTTGT CTGTGGAAAA TATATTGTGT GAACTGAGGG AACACCTTTC 2340
CCAGTCAAAG GTGGCAGAAA CAGAATTAGC AAAGCCTTTA TTTGATGCCC TGCTTCGAGT 2400
AGCCCTGGGG AATCATTCAG CAGATTTGGG CCCTGGTGAT GCTGTGACTG AGAAGAGTCA 2460
TCCCTCTGAG GAAGAGCTGT TGTCCCAGCC CGGAGATTTT TCAGAAGAAG CTGAGGATTC 2520
TCAGTGTTGT AGTTTGAAAC TTCTGGGTGA GGAAGAAGGC TATGAAGCGG ATAGTGAAAG 2580
CAATCCTGAG GATGTTGACA CCCAAGACGA TGGAGTAGAA TTAAATCCTG AAGCAGAAGG 2640
TTTCAGTGGA TCGATTGTTT CAAACAACTT ACTTGAAAAC CTCACTCACG GGGAAATAAT 2700
ATACCCTGAG ATTTGCATGC TGGGATTAAA TTTGCTTTCT GCTAGCAAAG CTAAACTTGA 2760
TGTGCTTGCT CATGTGTTTG AGAGCTTTCT GAAAATTGTC AGGCAGAAGG AAAAGAACAT 2820
TTCTCTCCTC ATACAACAGG GAACTGTGAA AATCCTTCTA GGCGGGTTCT TGAATATTTT 2880
AACACAAACT AACTCTGATT TCCAAGCATG CCAGAGAGTA CTGGTGGATC TCTTGGTATC 2940
TTTGATGAGC TCAAGAACGT GTTCAGAAGA CTTAACTCTT CTTTGGAGAA TATTTCTGGA 3000
GAAATCTCCT TGTACAGAAA TTCTTCTCCT TGGTATTCAC AAAATTGTTG AAAGTGATTT 3060
TACTATGAGC CCTTCACAGT GTCTGACCTT TCCTTTCCTG CATACCCCGA GTTTAAGCAA 3120
TGGTGTCTTA TCACAGAAAC CTCCTGGGAT TCTTAACAGT AAAGCCTTAG GCTTATTGAG 3180
AAGAGCACGG ATTTCCCGAG GCAAGAAAGA GGCTGATAGA GAGAGTTTTC CCTATAGGCT 3240
GCTTTCCTCT TGGCATATAG CCCCAATCCA CCTGCCGTTG CTGGGACAGA ACTGCTGGCC 3300
ACACCTGTCA GAAGGATTTA GTGTTTCTCT TGTGGGTTTA ATGTGGAATA CATCCAATGA 3360
ATCCGAGAGT GCTGCAGAAA GGGGAAAAGG AGTAAGGAAA AGAACAAACC ATCAGTTCTG 3420
GAAGACAGCA GTTTTGAAGG AGCAGAAGGT GATAGACCAG AAGTTACAGA ATCCATCAAT 3480
CCTGGTGACA GCTCGTGCCG AATTCGGCAA CGAG 3514
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Leu Lys Ile Arg Ala Cys Leu Glu Lys Gln Pro Glu Pro Phe Ser Pro 1 5 10 15
Arg Gln Lys Lys Thr Leu Gln Glu Val Gln Glu Gly Phe Val Phe Ser 20 25 30
Lys Tyr Arg His Arg Ala Leu Leu Leu Pro Glu Leu Leu Glu Gly Val 35 40 45
Leu Gln Leu Leu Ile Ser Cys Leu Gln Ser Ala Ala Ser Asn Pro Phe 50 55 60
Tyr Phe Ser Gln Ala Met Asp Leu Val Gln Glu Phe Ile Gln His Gln 65 70 75 80
Gly Phe Asn Leu Phe Gly Thr Ala Val Leu Gln Met Glu Trp Leu Leu 85 90 95
Thr Arg Asp Gly Val Pro Ser Glu Ala Ala Glu His Leu Lys Ala Leu 100 105 110
Ile Asn Ser Val Ile Lys Ile Met Ser Thr Val Lys Lys Val Lys Ser 115 120 125
Glu Gln Leu His His Ser Met Cys Thr Arg Lys Arg His Arg Arg Cys 130 135 140
Glu Tyr Ser His Phe Met Gln His His Arg Asp Leu Ser Gly Leu Leu 145 150 155 160
Val Ser Ala Phe Lys Asn Gln Leu Ser Lys Ser Pro Phe Glu Glu Thr 165 170 175
Ala Glu Gly Asp Val Gln Tyr Pro Glu Arg Cys Cys Cys Ile Ala Val 180 185 190
Cys Ala His Gln Cys Leu Arg Leu Leu Gln Gln Val Ser Leu Ser Thr 195 200 205
Thr Cys Val Gln Ile Leu Ser Gly Val His Ser Val Gly Ile Cys Cys 210 215 220
Cys Met Asp Pro Lys Ser Val Ile Ala Pro Leu Leu His Ala Phe Lys 225 230 235 240
Leu Pro Ala Leu Lys Ala Phe Gln Gln His Ile Leu Asn Val Leu Ser 245 250 255
Lys Leu Leu Val Asp Gln Leu Gly Gly Ala Glu Leu Ser Pro Arg Ile 260 265 270
Lys Lys Ala Ala Cys Asn Ile Cys Thr Val Asp Ser Asp Gln Leu Ala 275 280 285
Lys Leu Gly Glu Thr Leu Gln Gly Thr Leu Cys Gly Ala Gly Pro Thr 290 295 300
Ser Gly Leu Pro Ser Pro Ser Tyr Arg Phe Gln Gly Ile Leu Pro Ser 305 310 315 320
Ser Gly Ser Glu Asp Leu Leu Trp Lys Trp Asp Ala Leu Glu Ala Tyr 325 330 335
Gln Ser Phe Val Phe Gln Glu Asp Arg Leu His Asn Ile Gln Ile Ala 340 345 350
Asn His Ile Cys Asn Leu Leu Gln Lys Gly Asn Val Val Val Gln Trp 355 360 365
Lys Leu Tyr Asn Tyr Ile Phe Asn Pro Val Leu Gln Arg Gly Val Glu 370 375 380
Leu Val His His Cys Gln Gln Leu Ser Ile Pro Ser Ala Gln Thr His 385 390 395 400
Met Cys Ser Gln Leu Lys Gln Tyr Leu Pro Gln Glu Val Leu Gln Ile 405 410 415
Tyr Leu Lys Thr Leu Pro Val Leu Leu Lys Ser Arg Val Ile Arg Asp 420 425 430
Leu Phe Leu Ser Cys Asn Gly Val Asn His Ile Ile Glu Leu Asn Tyr 435 440 445
Leu Asp Gly Ile Arg Ser His Ser Leu Lys Ala Phe Glu Thr Leu Ile 450 455 460
Val Ser Leu Gly Glu Gln Gln Lys Asp Ala Ala Val Leu Asp Val Asp 465 470 475 480
Gly Leu Asp Ile Gln Gln Glu Leu Pro Ser Leu Ser Val Gly Pro Ser 485 490 495
Leu His Lys Gln Gln Ala Ser Ser Asp Ser Pro Cys Ser Leu Arg Lys 500 505 510
Phe Tyr Ala Ser Leu Arg Glu Pro Asp Pro Lys Lys Arg Lys Thr Ile 515 520 525
His Gln Asp Val His Ile Asn Thr Ile Asn Leu Phe Leu Cys Val Ala 530 535 540
Phe Leu Cys Val Ser Lys Glu Ala Asp Ser Asp Arg Glu Ser Ala Asn 545 550 555 560
Glu Ser Glu Asp Thr Ser Gly Tyr Asp Ser Pro Pro Ser Glu Pro Leu 565 570 575
Ser His Met Leu Pro Cys Leu Ser Leu Glu Asp Val Val Leu Pro Ser 580 585 590
Pro Glu Cys Leu His His Ala Ala Asp Ile Trp Ser Met Cys Arg Trp 595 600 605
Ile Tyr Met Leu Asn Ser Val Phe Gln Lys Gln Phe His Arg Leu Gly 610 615 620
Gly Phe Gln Val Cys His Glu Leu Ile Phe Met Ile Ile Gln Lys Leu 625 630 635 640
Phe Arg Ser His Thr Glu Asp Gln Gly Arg Arg Gln Gly Glu Met Ser 645 650 655
Arg Asn Glu Asn Gln Glu Leu Ile Arg Ile Ser Tyr Pro Glu Leu Thr 660 665 670
Leu Lys Gly Asp Val Ser Ser Ala Thr Ala Pro Asp Leu Gly Phe Leu 675 680 685
Arg Lys Ser Ala Asp Ser Val Arg Gly Phe Gln Ser Gln Pro Val Leu 690 695 700
Pro Thr Ser Ala Glu Gln Ile Val Ala Thr Glu Ser Val Pro Gly Glu 705 710 715 720
Arg Lys Ala Phe Met Ser Gln Gln Ser Glu Thr Ser Leu Gln Ser Ile 725 730 735
Arg Leu Leu Glu Ser Leu Leu Asp Ile Cys Leu His Ser Ala Arg Ala 740 745 750
Cys Gln Gln Lys Met Glu Leu Glu Leu Pro Ser Gln Gly Leu Ser Val 755 760 765
Glu Asn Ile Leu Cys Glu Leu Arg Glu His Leu Ser Gln Ser Lys Val 770 775 780
Ala Glu Thr Glu Leu Ala Lys Pro Leu Phe Asp Ala Leu Leu Arg Val 785 790 795 800
Ala Leu Gly Asn His Ser Ala Asp Leu Gly Pro Gly Asp Ala Val Thr 805 810 815
Glu Lys Ser His Pro Ser Glu Glu Glu Leu Leu Ser Gln Pro Gly Asp 820 825 830
Phe Ser Glu Glu Ala Glu Asp Ser Gln Cys Cys Ser Leu Lys Leu Leu 835 840 845
Gly Glu Glu Glu Gly Tyr Glu Ala Asp Ser Glu Ser Asn Pro Glu Asp 850 855 860
Val Asp Thr Gln Asp Asp Gly Val Glu Leu Asn Pro Glu Ala Glu Gly 865 870 875 880
Phe Ser Gly Ser Ile Val Ser Asn Asn Leu Leu Glu Asn Leu Thr His 885 890 895
Gly Glu Ile Ile Tyr Pro Glu Ile Cys Met Leu Gly Leu Asn Leu Leu 900 905 910
Ser Ala Ser Lys Ala Lys Leu Asp Val Leu Ala His Val Phe Glu Ser 915 920 925
Phe Leu Lys Ile Val Arg Gln Lys Glu Lys Asn Ile Ser Leu Leu Ile 930 935 940
Gln Gln Gly Thr Val Lys Ile Leu Leu Gly Gly Phe Leu Asn Ile Leu 945 950 955 960
Thr Gln Thr Asn Ser Asp Phe Gln Ala Cys Gln Arg Val Leu Val Asp 965 970 975
Leu Leu Val Ser Leu Met Ser Ser Arg Thr Cys Ser Glu Asp Leu Thr 980 985 990
Leu Leu Trp Arg Ile Phe Leu Glu Lys Ser Pro Cys Thr Glu Ile Leu 995 1000 1005
Leu Leu Gly Ile His Lys Ile Val Glu Ser Asp Phe Thr Met Ser Pro 1010 1015 1020
Ser Gln Cys Leu Thr Phe Pro Phe Leu His Thr Pro Ser Leu Ser Asn 1025 1030 1035 1040
Gly Val Leu Ser Gln Lys Pro Pro Gly Ile Leu Asn Ser Lys Ala Leu 1045 1050 1055
Gly Leu Leu Arg Arg Ala Arg Ile Ser Arg Gly Lys Lys Glu Ala Asp 1060 1065 1070
Arg Glu Ser Phe Pro Tyr Arg Leu Leu Ser Ser Trp His Ile Ala Pro 1075 1080 1085
Ile His Leu Pro Leu Leu Gly Gln Asn Cys Trp Pro His Leu Ser Glu 1090 1095 1100
Gly Phe Ser Val Ser Leu Val Gly Leu Met Trp Asn Thr Ser Asn Glu 1105 1110 1115 1120
Ser Glu Ser Ala Ala Glu Arg Gly Lys Arg Val Lys Lys Arg Asn Lys 1125 1130 1135
Pro Ser Val Leu Glu Asp Ser Ser Phe Glu Gly Ala Gly Met Met Ala 1140 1145 1150
Gly Ser Asp Leu Tyr Thr Lys Ile Leu Gln Ile Ala Ala Cys Leu Ser 1155 1160 1165
Phe Lys His Ile Trp Gln Tyr Phe Asn Val Phe Phe Lys Cys Tyr Ser 1170 1175 1180
Pro 1185
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11817 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TGAGAGCTCA CGCTGGCCTG GCAGCCTTGG TGAGTCGGGA TTCTCCTGCA CCGGCGGGCG 60
AGAGCGCGCG GCGGACCACA GAGCGGAGGT GAAGCCTTAT GCTGAGACAG TTTTATCTAG 120
TTCATGAACC CAAATTATAT ACAAGCTGAA TGTTACAGAA GTGCTGAAAG ACTGCTCTGT 180
CATGAGCACG GACAGCAACT CATTGGCACG TGAGTTTCTG ATTGATGTCA ACCAGCTTTG 240
CAATGCAGTG GTCCAGAGGG CAGAAGCCAG GGAAGAAGAA GAAGAGGAGA CACACATGGC 300
AACTCTTGGA CAGTACCTTG TCCATGGACG AGGATTTCTG TTACTTACCA AACTAAATTC 360
TATCATTGAT CAGGCCCTGA CATGCAGAGA AGAACTCCTG ACTCTTCTTC TGTCGCTCCT 420
TCCCTTGGTG TGGAAGATAC CTGTCCAGGA ACAGCAGGCA ACAGATTTTA ACCTGCCACT 480
GTCATCTGAT ATAATCCTGA CCAAAGAAAA GAACTCAAGT TTGCAAAAAT CAACTCAGGG 540
AAAATTATAT TTAGAAGGAA GTGCTCCATC TGGTCAGGTT TCTGCAAAAG TAAACCTTTT 600
TCGAAAAATC AGGCGACAGC GTAAAAGTAC CCATCGTTAT TCTGTAAGAG ATGCAAGAAA 660
GACACAGCTC TCCACCTCTG ACTCCGAAGG CAACTCAGAT GAAAAGAGTA CGGTTGTGAG 720
TAAACACAGG AGGCTCCACG CGCTGCCACG GTTCCTGACG CAGTCTCCTA AGGAAGGCCA 780
CCTCGTAGCC AAACCTGACC CCTCTGCCAC CAAAGAACAG GTCCTTTCTG ACACCATGTC 840
TGTGGAAAAC TCCAGAGAAG TCATTCTGAG ACAGGATTCA AATGGTGACA TATTAAGTGA 900
GCCAGCTGCT TTGTCTATTC TCAGTAACAT GAATAATTCT CCTTTTGACT TATGTCATGT 960
TTTGTTATCT CTATTGGAAA AAGTTTGTAA GTTTGACATT GCTTTGAATC ATAATTCTTC 1020
CCTAGCACTC AGTGTAGTAC CCACACTGAC TGAGTTCCTA GCAGGCTTTG GGGACTGCTG 1080
TAACCAGAGT GACACTTTGG AGGGACAACT GGTTTCTGCA GGTTGGACAG AAGAGCCGGT 1140
AGCTTTGGTT CAACGGATGC TCTTTCGAAC CGTGCTGCAC CTTATGTCAG TAGACGTTAG 1200
CACTGCAGAG GCAATGCCAG AAAGTCTTAG GAAAAATTTG ACTGAATTGC TTAGGGCAGC 1260
TTTAAAAATT AGAGCTTGCT TGGAAAAGCA GCCTGAGCCT TTCTCCCCGA GACAAAAGAA 1320
AACACTACAG GAGGTCCAGG AGGGCTTTGT ATTTTCCAAG TATCGTCACC GAGCCCTTCT 1380
ACTACCTGAG CTTCTGGAAG GAGTTCTACA GCTCCTCATC TCTTGTCTTC AGAGTGCAGC 1440
TTCAAATCCC TTTTACTTCA GTCAAGCCAT GGATTTAGTT CAAGAATTTA TCCAGCACCA 1500
AGGATTTAAT CTCTTTGGAA CAGCAGTTCT TCAGATGGAA TGGCTGCTTA CAAGGGACGG 1560
TGTTCCTTCA GAAGCTGCAG AACATTTGAA AGCTCTGATA AACAGTGTAA TAAAAATAAT 1620
GAGTACTGTG AAAAAGGTGA AATCAGAGCA ACTTCATCAT TCCATGTGCA CAAGGAAAAG 1680
ACACCGGCGT TGTGAGTATT CCCACTTCAT GCAGCACCAC CGCGATCTTT CAGGGCTCCT 1740
GGTTTCAGCT TTTAAAAATC AGCTTTCTAA AAGCCCCTTT GAAGAGACCG CAGAGGGAGA 1800
TGTGCAGTAT CCAGAGCGCT GCTGCTGCAT CGCCGTGTGC GCTCACCAGT GCTTGCGCTT 1860
GCTGCAGCAG GTTTCCCTGA GCACCACGTG TGTCCAGATC CTATCAGGTG TACACAGTGT 1920
TGGAATCTGT TGTTGTATGG ATCCTAAGTC TGTGATCGCC CCTTTACTGC ATGCTTTTAA 1980
GTTGCCAGCA CTGAAAGCTT TCCAGCAGCA TATACTGAAT GTCCTGAGCA AACTTCTTGT 2040
GGATCAGTTA GGAGGAGCAG AGCTATCACC GAGAATTAAA AAAGCAGCTT GCAACATCTG 2100
TACTGTGGAC TCTGACCAAC TGGCTAAGTT AGGAGAGACA CTGCAAGGCA CCTTGTGTGG 2160
TGCTGGTCCT ACCTCCGGCT TGCCCAGTCC TTCCTACCGA TTTCAGGGGA TCCTGCCCAG 2220
CAGCGGCTCT GAAGACTTGC TGTGGAAGTG GGATGCATTA GAGGCTTATC AGAGCTTTGT 2280
CTTTCAAGAA GACAGATTAC ATAACATTCA GATTGCAAAT CACATTTGTA ATTTACTCCA 2340
GAAAGGCAAT GTAGTTGTTC AGTGGAAATT GTATAATTAT ATCTTTAATC CTGTGCTCCA 2400
AAGAGGAGTT GAATTAGTAC ATCATTGTCA ACAGCTAAGC ATTCCTTCAG CTCAGACTCA 2460
CATGTGTAGC CAACTGAAAC AGTATTTGCC TCAGGAAGTG CTTCAGATTT ATTTAAAAAC 2520
TCTACCTGTC CTACTTAAAT CCAGGGTAAT AAGAGATTTG TTTTTAAGTT GTAATGGAGT 2580
AAACCACATA ATTGAACTAA ATTACTTAGA TGGGATTCGA AGTCATTCCC TGAAAGCATT 2640
TGAAACTCTG ATTGTCAGCC TAGGGGAACA ACAGAAAGAT GCTGCAGTTC TAGACGTCGA 2700
TGGGTTAGAC ATCCAACAGG AGTTGCCGTC CTTAAGTGTG GGTCCTTCTC TTCATAAGCA 2760
GCAAGCTTCT TCAGATTCTC CTTGCAGTCT CAGGAAGTTT TATGCCAGCC TCAGAGAGCC 2820
TGATCCAAAA AAACGAAAGA CCATTCACCA GGATGTTCAC ATAAACACCA TAAACCTCTT 2880
CCTCTGTGTG GCTTTTCTAT GTGTCAGTAA AGAAGCAGAC TCTGATAGGG AGTCTGCCAA 2940
TGAGTCAGAA GATACTTCTG GCTATGACAG CCCTCCCAGT GAGCCATTAA GTCACATGCT 3000
ACCATGTCTG TCTCTTGAGG ACGTTGTCTT ACCTTCCCCT GAATGTTTGC ACCATGCAGC 3060
AGACATTTGG TCCATGTGTC GTTGGATCTA CATGTTGAAC TCAGTCTTCC AGAAACAATT 3120
TCACAGGCTT GGTGGTTTCC AAGTGTGCCA TGAATTAATA TTTATGATAA TCCAGAAACT 3180
ATTCAGAAGT CATACAGAGG ATCAAGGAAG AAGGCAGGGA GAAATGAGTA GAAATGAAAA 3240
CCAAGAGCTA ATCAGGATAT CTTACCCCGA GCTGACACTG AAGGGAGATG TATCATCTGC 3300
AACAGCACCA GACCTGGGAT TTCTGAGAAA GAGTGCTGAC AGCGTGCGTG GATTCCAGTC 3360
ACAGCCTGTG CTTCCCACAA GTGCAGAGCA GATTGTGGCT ACTGAATCTG TTCCTGGGGA 3420
ACGAAAGGCA TTTATGAGTC AACAAAGTGA GACTTCTCTC CAGAGCATAC GACTTTTGGA 3480
GTCTCTCCTG GACATTTGTC TTCATAGTGC CAGAGCCTGT CAACAGAAGA TGGAATTGGA 3540
GCTACCGTCT CAGGGCTTGT CTGTGGAAAA TATATTGTGT GAACTGAGGG AACACCTTTC 3600
CCAGTCAAAG GTGGCAGAAA CAGAATTAGC AAAGCCTTTA TTTGATGCCC TGCTTCGAGT 3660
AGCCCTGGGG AATCATTCAG CAGATTTGGG CCCTGGTGAT GCTGTGACTG AGAAGAGTCA 3720
TCCCTCTGAG GAAGAGCTGT TGTCCCAGCC CGGAGATTTT TCAGAAGAAG CTGAGGATTC 3780
TCAGTGTTGT AGTTTGAAAC TTCTGGGTGA GGAAGAAGGC TATGAAGCGG ATAGTGAAAG 3840
CAATCCTGAG GATGTTGACA CCCAAGACGA TGGAGTAGAA TTAAATCCTG AAGCAGAAGG 3900
TTTCAGTGGA TCGATTGTTT CAAACAACTT ACTTGAAAAC CTCACTCACG GGGAAATAAT 3960
ATACCCTGAG ATTTGCATGC TGGGATTAAA TTTGCTTTCT GCTAGCAAAG CTAAACTTGA 4020
TGTGCTTGCT CATGTGTTTG AGAGCTTTCT GAAAATTGTC AGGCAGAAGG AAAAGAACAT 4080
TTCTCTCCTC ATACAACAGG GAACTGTGAA AATCCTTCTA GGCGGGTTCT TGAATATTTT 4140
AACACAAACT AACTCTGATT TCCAAGCATG CCAGAGAGTA CTGGTGGATC TCTTGGTATC 4200
TTTGATGAGC TCAAGAACGT GTTCAGAAGA CTTAACTCTT CTTTGGAGAA TATTTCTGGA 4260
GAAATCTCCT TGTACAGAAA TTCTTCTCCT TGGTATTCAC AAAATTGTTG AAAGTGATTT 4320
TACTATGAGC CCTTCACAGT GTCTGACCTT TCCTTTCCTG CATACCCCGA GTTTAAGCAA 4380
TGGTGTCTTA TCACAGAAAC CTCCTGGGAT TCTTAACAGT AAAGCCTTAG GCTTATTGAG 4440
AAGAGCACGG ATTTCCCGAG GCAAGAAAGA GGCTGATAGA GAGAGTTTTC CCTATAGGCT 4500
GCTTTCCTCT TGGCATATAG CCCCAATCCA CCTGCCGTTG CTGGGACAGA ACTGCTGGCC 4560
ACACCTGTCA GAAGGATTTA GTGTTTCTCT TGTGGGTTTA ATGTGGAATA CATCCAATGA 4620
ATCCGAGAGT GCTGCAGAAA GGGGAAAAAG AGTAAAGAAA AGAAACAAAC CATCAGTTCT 4680
GGAAGACAGC AGTTTTGAAG GAGCAGAAGG TGATAGACCA GAAGTTACAG AATCCATCAA 4740
TCCTGGTGAC AGACTCATAG AAGACGGCTG TATTCATTTG ATTTCACTGG GGTCCAAAGC 4800
ATTGATGATC CAAGTGTGGG CTGATCCCCA CAGTGGCACT TTTATCTTTC GTGTGTGCAT 4860
GGACTCAAAT GATGACACGA AAGCTGTCTC ACTAGCACAG GTGGAATCAC AGGAGAATAT 4920
TTTCTTTCCA AGCAAATGGC AACACTTAGT ACTTACCTAT ATTCAGCATC CTCAAGGGAA 4980
AAAGAATGTC CATGGGGAAA TCTCCATATG GGTCTCTGGG CAGAGGAAGA CTGATGTCAT 5040
CTTGGATTTT GTGCTCCCAA GAAAAACAAG CTTATCATCA GACAGCAATA AAACATTTTG 5100
CATGATTGGT CATTGCTTAA CATCCCAAGA AGAGTCTCTG CAATTAGCTG GAAAATGGGA 5160
CCTGGGGAAC TTGCTCCTCT TCAATGGAGC TAAAATTGGC TCACAAGAGG CCTTTTTCCT 5220
GTATGCTTGT GGACCCAACT ACACATCCAT CATGCCGTGT AAATATGGAC AGCCAGTCAT 5280
TGACTACTCC AAATACATTA ATAAAGACAT TTTGAGATGT GATGAAATCA GAGACCTTTT 5340
TATGACCAAG AAAGAAGTGG ATGTTGGTCT CTTAATTGAA AGTCTTTCAG TTGTTTATAC 5400
AACTTGCTGT CCTGCTCAGT ACACCATCTA TGAACCAGTG ATTCGACTCA AGGGCCAAGT 5460
GAAAACTCAG CCCTCTCAAA GACCCTTCAG CTCAAAGGAA GCCCAGAGCA TCTTGCTAGA 5520
ACCTTCTCAA CTCAAAGGCC TCCAACCTAC GGAATGTAAA GCCATCCAGG GCATTCTGCA 5580
TGAGATTGGT GGGGCTGGCA CATTTGTTTT TCTCTTTGCT AGGGTTGTTG AACTTAGTAG 5640
CTGTGAAGAA ACTCAAGCAT TAGCACTGCG GGTTATACTG TCTTTAATTA AGTACAGCCA 5700
ACAGAGAACA CAGGAACTGG AAAATTGTAA TGGACTCTCT ATGATTCACC AAGTGTTGGT 5760
CAAACAGAAA TGCATTGTTG GCTTTCACAT TTTGAAGACC CTTCTTGAAG GTTGCTGCGG 5820
TGAAGAAGTT ATCCACGTCA GTGAGCATGG AGAGTTCAAG CTGGATGTTG AGTCTCATGC 5880
TATAATCCAA GATGTTAAGC TGCTGCAGGA ACTGTTACTT GACTGGAAGA TATGGAATAA 5940
GGCAGAGCAA GGTGTGTGGG AGACTCTGCT AGCAGCTTTG GAAGTCCTCA TCCGGGTAGA 6000
GCACCACCAG CAGCAGTTTA ATATTAAGCA GTTGCTGAAC GCCCACGTGG TTCACCACTT 6060
CCTACTGACC TGTCAGGTTT TACAGGAACA CAGAGAGGGG CAGCTTACAT CTATGCCCCG 6120
AGAAGTTTGT AGATCATTTG TGAAAATCAT TGCAGAAGTC CTTGGTTCTC CTCCAGACTT 6180
GGAATTATTG ACAGTTATTT TCAATTTCCT GTTAGCTGTA CACCCTCCTA CTAATACTTA 6240
TGTTTGTCAC AATCCCACAA ACTTCTACTT CTCTTTGCAC ATAGATGGCA AGATCTTTCA 6300
GGAGAAAGTG CAGTCACTCG CGTACCTGAG GCATTCTAGC AGCGGAGGGC AAGCCTTTCC 6360
CAGCCCTGGA TTCCTGGTAA TAAGCCCATC TGCCTTTACT GCAGCTCCTC CTGAAGGAAC 6420
CAGTTCTTCC AATATTGTTC CACAGCGGAT GGCTGCTCAG ATGGTTCGAT CTAGAAGTCT 6480
ACCAGCATTT CCTACTTATT TACCACTAAT ACGAGCACAA AAACTGGCTG CAAGTTTGGG 6540
TTTTAGTGTT GACAAGTTAC AAAATATTGC AGATGCCAAC CCAGAGAAAC AGAATCTTTT 6600
AGGAAGACCC TACGCACTGA AAACAAGCAA AGAGGAAGCA TTCATCAGCA GCTGTGAGTC 6660
TGCAAAGACT GTTTGTGAAA TGGAGGCTCT TCTTGGAGCC CACGCCTCTG CCAATGGGGT 6720
TTCCAGAGGA TCACCGAGGT TCCCCAGGGC CAGAGTAGAT CACAAAGATG TGGGAACAGA 6780
GCCCAGATCA GATGATGACA GTCCTGGGGA TGAGTCTTAC CCACGTCGGC CTGACAACCT 6840
CAAGGGACTG GCCTCATTCC AGCGAAGCCA AAGCACTGTC GCAAGCCTTG GGCTGGCGTT 6900
TCCCTCTCAG AATGGATCTG CAGTTGCTAG CAGGTGGCCA AGTCTTGTTG ATAGGAATGC 6960
TGATGACTGG GAGAACTTTA CCTTTTCTCC TGCTTATGAG GCAAGCTACA ACCGAGCCAC 7020
AAGCACCCAC AGTGTCATTG AAGACTGTCT GATACCTATC TGCTGTGGAT TATATGAACT 7080
CTTAAGTGGG GTTCTTCTTG TCCTGCCTGA TGCTATGCTT GAAGATGTGA TGGACAGGAT 7140
TATTCAAGCA GATATTCTTC TAGTCCTTGT TAACCACCCA TCACCTGCTA TCCAGCAAGG 7200
AGTAATTAAA CTGTTACATG CATACATTAA TAGAGCATCA AAGGAGCAAA AGGACAAGTT 7260
TCTGAAGAAC CGTGGCTTTT CCTTATTAGC CAACCAGTTG TATCTTCATA GGGGAACTCA 7320
GGAGTTGTTG GAGTGCTTTG TTGAAATGTT CTTTGGTCGA CCGATTGGCC TGGATGAAGA 7380
ATTTGATCTG GAGGAAGTGA AGCACATGGA ACTGTTCCAG AAGTGGTCTG TCATTCCCGT 7440
TCTCGGACTA ATAGAGACCT CTCTCTATGA CAATGTCCTC TTGCACAATG CTCTTTTACT 7500
TCTTCTGCAA GTTTTAAACT CTTGTTCCAA GGTAGCAGAC ATGCTACTGG ACAATGGTCT 7560
ACTCTATGTA TTATGTAATA CAGTAGCAGC CCTGAATGGA TTAGAAAAGA ACATTCCTGT 7620
GAACGAATAC AAATTGCTCG CATGTGATAT ACAGCAGCTT TTCATAGCAG TTACAATTCA 7680
TGCTTGCAGT TCCTCAGGCA CACAGTATTT TAGAGTGATT GAAGACCTTA TTGTACTTCT 7740
TGGATATCTT CATAATAGCA AAAACAAGAG GACACAAAAT ATGGCTTTGG CCCTGCAGCT 7800
TAGAGTTCTC CAGGCTGCTT TGGAATTTAT AAGGAGCACA GCCAATCATG ACTCTGAAAG 7860
TCCAGTGCAC TCGCCTTCTG CCCACCGCCA TTCAGTGCCT CCGAAGCGGA GAAGCATTGC 7920
TGGTTCTCGC AAATTCCCTC TGGCTCAGAC AGAGTCTCTG CTGATGAAGA TGCGCTCAGT 7980
GGCCAGCGAT GAGCTACACT CTATGATGCA GAGGAGGATG AGCCAAGAGC ACCCCAGCCA 8040
GGCCTCGGAG GCAGAGCTCG CTCAGAGGCT GCAGAGGCTC ACCATCTTAG CTGTGAACAG 8100
GATTATTTAC CAAGAGTTGA ATTCAGATAT TATTGACATT TTGAGAACTC CAGAAAATAC 8160
ATCCCAAAGC AAGACCTCAG TTTCTCAGAC TGAAATTTCT GAAGAAGACA TGCATCATGA 8220
GCAACCTTCT GTATATAATC CATTTCAAAA AGAAATGTTA ACCTATCTGT TGGATGGCTT 8280
CAAAGTGTGT ATTGGTTCAA GTAAAACTAG CGTTTCTAAG CAGCAGTGGA CTAAAATCCT 8340
GGGGTCTTGT AAAGAAACCC TCCGAGACCA GCTTGGAAGA TTGCTAGCGC ATATTTTGTC 8400
TCCAACCCAC ACTGTACAAG AACGGAAGCA GATACTTGAG ATAGTTCATG AACCAGCTCA 8460
CCAGGATATA CTTCGTGACT GTCTTAGCCC CTCCCCACAA CATGGAGCCA AGTTGGTTTT 8520
GTATTTGTCA GAGTTGATAC ATAATCATCA GGATGAGTTA AGTGAAGAAG AAATGGACAC 8580
AGCAGAACTG CTTATGAATG CTCTAAAGTT ATGTGGCCAC AAGTGCATCC CGCCCAGTGC 8640
CCCTTCCAAA CCAGAGCTCA TTAAGATCAT CAGAGAGGAG CAAAAGAAGT ATGAAAGTGA 8700
AGAGAGTGTG AGCAAAGGCT CATGGCAGAA AACGGTGAAC AACAACCAGC AAAGTCTCTT 8760
CCAGAGGCTC GATTTCAAAT CCAAGGATAT ATCTAAAATC GCTGCAGACA TCACCCAGGC 8820
TGTATCACTC TCCCAAGGCA TTGAAAGGAA GAAGGTGATC CAGCACATCA GAGGGATGTA 8880
CAAAGTTGAC CTGAGTGCCA GCAGGCACTG GCAGGAATGC ATCCAGCAGC TGACACATGA 8940
CAGAGCAGTC TGGTATGACC CAATCTACTA TCCAACTTCA TGGCAGTTGG ATCCAACAGA 9000
AGGGCCAAAC CGAGAGAGGA GACGTTTGCA GAGATGCTAT CTAACTATTC CCAATAAGTA 9060
CCTCCTGAGG GACAGACAGA AGTCAGAAGG TGTGCTCAGG CCCCCACTCT CTTACCTTTT 9120
TGAAGATAAA ACTCATTCTT CCTTCTCCTC TACTGTCAAA GACAAAGCTG CAAGTGAATC 9180
CATCAGAGTG AATCGAAGAT GTATCAGTGT TGCACCATCT AGAGAGACAG CTGGGGAATT 9240
GTTGTTAGGT AAATGTGGGA TGTACTTTGT GGAAGACAAT GCCTCTGACG CAGTTGAAAG 9300
CTCGAGCCTC CAAGGGGAGT TAGAGCCGGC ATCATTTTCT TGGACATATG AGGAAATTAA 9360
AGAAGTTCAC AGGCGCTGGT GGCAACTAAG AGATAATGCT GTAGAAATCT TTTTAACAAA 9420
TGGCAGAACA CTCCTATTAG CATTTGACAA TAACAAGGTT CGTGATGACG TGTACCAGAG 9480
CATCCTCACA AATAACCTCC CAAATCTTCT GGAGTACGGC AACATCACCG CTCTGACAAA 9540
CCTGTGGTAT TCTGGACAAA TTACCAATTT TGAATATTTG ACTCATTTAA ACAAGCATGC 9600
GGGCCGGTCC TTCAATGATC TCATGCAGTA CCCGGTGTTC CCCTTCATCC TTTCTGACTA 9660
TGTTAGTGAG ACTCTTGACC TCAATGATCC ATCTATCTAC AGAAACCTAT CTAAGCCTAT 9720
AGCTGTGCAG TATAAAGAAA AAGAAGACCG TTACGTTGAC ACATACAAGT ACTTGGAGGA 9780
GGAGTATCGC AAGGGAGCTC GAGAGGATGA CCCCATGCCT CCTGTGCAAC CCTACCACTA 9840
TGGCTCCCAC TACTCCAACA GCGGCACCGT GCTCCACTTC CTGGTCAGGA TGCCGCCTTT 9900
CACTAAAATG TTTCTAGCCT ATCAAGATCA GAGTTTCGAC ATTCCAGACC GAACATTTCA 9960
TTCTACAAAC ACAACTTGGC GCCTCTCCTC CTTTGAGTCC ATGACTGATG TGAAGGAGCT 10020
GATTCCAGAG TTTTTCTATC TTCCTGAGTT CTTAGTGAAC CGTGAAGGCT TTGACTTCGG 10080
TGTTCGTCAG AATGGAGAGC GGGTTAACCA CGTCAATCTT CCTCCCTGGG CACGCAACGA 10140
TCCTCGGCTG TTCATCCTTA TTCACCGGCA AGCACTAGAG TCTGACCATG TGTCCCAGAA 10200
CATCTGTCAC TGGATCGACT TAGTGTTTGG CTACAAGCAA AAGGGGAAGG CGTCTGTTCA 10260
AGCCATCAAT GTCTTCCACC CTGCTACATA TTTTGGAATG GATGTCTCTG CAGTTGAAGA 10320
TCCAGTGCAG AGACGGGCTT TAGAAACCAT GATAAAAACC TACGGGCAGA CCCCACGTCA 10380
GTTGTTCCAC ACAGCCCATG CCAGCCGACC TGGAGCCAAG CTTAACATCG AAGGAGAGCT 10440
TCCAGCAGCT GTTGGCTTGT TAGTCCAGTT CGCTTTCAGA GAGACCCGAG AACCAGTCAA 10500
GGAAGTCACT CATCCGAGCC CTTTGTCATG GATAAAAGGC TTGAAGTGGG GGGAGTACGT 10560
AGGTTCCCCC AGTGCTCCAG TACCTGTGGT CTGCTTCAGC CAGCCCCATG GAGAAAGATT 10620
TGGTTCCCTG CAGGCACTGC CCACCAGAGC CATCTGTGGT TTATCACGAA ACTTCTGTCT 10680
TCTGATGACC TACAACAAGG AGCAAGGTGT GAGAAGCATG AACAACACCA ATATTCAGTG 10740
GTCTGCTATC CTAAGCTGGG GATATGCTGA CAACATCTTA CGGTTGAAAA GTAAGCAGAG 10800
TGAGCCACCA ATCAACTTCA TTCAGAGTTC ACAGCAGCAC CAGGTAACCA GTTGTGCCTG 10860
GGTGCCTGAC AGTTGTCAGC TCTTCACTGG GAGCAAGTGT GGTGTCATCA CAGCCTATAC 10920
CAACAGGCTC ACCAGCAGCA CGCCCTCAGA AATTGAAATG GAGAGTCAGA TGCATCTCTA 10980
TGGACACACA GAGGAGATCA CCGGCTTATG TGTCTGCAAG CCGTACAGCG TGATGATAAG 11040
CGTGAGCAGA GACGGGACCT GCATAGTATG GGACCTGAAC AGGCTGTGCT ATGTACAAAG 11100
TTTGGCTGGA CACAAAAGCC CTGTGACGGC TGTCTCTGCC AGTGAAACGT CAGGTGACAT 11160
TGCTACTGTG TGTGACTCAG CTGGCGGGGG CAGTGACCTG AGACTCTGGA CCGTGAATGG 11220
GGACCTCGTT GGACATGTCC ACTGCAGAGA GATCATTTGT TCTGTAGCTT TCTCCAACCA 11280
GCCTGAGGGA GTCTCCATCA ACGTCATTGC TGGGGGATTA GAAAATGGCA TTGTAAGGCT 11340
ATGGAGCACA TGGGACTTGA AGCCTGTGAG AGAGATTACA TTTCCCAAAT CAAATAAGCC 11400
CATCATAAGC CTGACATTCT CCTGTGATGG CCACCATTTG TACACTGCCA ACAGTGAGGG 11460
GACAGTGATC GCATGGTGCC GGAAGGACCA GCAGCGTGTG AAGCTGCCCA TGTTCTACTC 11520
TTTCCTCAGC AGCTACGCAG CTGGATGAAG AGAAGGAGTG TCCCCACAGG ACATAAGCAC 11580
CGCTCTGCGA GCCTGGCTCC ACCAACTGCA GAAGCAGATG ACTGAGCAGA TATCCAGGAA 11640
AGACAACACA CGTGCCTCTG TGCGCGCTTC CCCAGCCTCC GTGGGCCTGA GAGTAAAGCC 11700
CTGCCCTCAT TCCATAATGG CGTGGAAGGC TGGGTCTGCA CACACTAGCC AATTAAAGTC 11760
AGAATCTTGA TGCTTTTTCC CAAAAGGTTA GGCTGAATCA AAGATCAGGC TCGTGCC 11817
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3788 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Ser Thr Asp Ser Asn Ser Leu Ala Arg Glu Phe Leu Ile Asp Val 1 5 10 15
Asn Gin Leu Cys Asn Ala Val Val Gin Arg Ala Glu Ala Arg Glu Glu 20 25 30
Glu Glu Glu Glu Thr His Met Ala Thr Leu Gly Gin Tyr Leu Val His 35 40 45
Gly Arg Gly Phe Leu Leu Leu Thr Lys Leu Asn Ser Ile Ile Asp Gin 50 55 60
Ala Leu Thr Cys Arg Glu Glu Leu Leu Thr Leu Leu Leu Ser Leu Leu 65 70 75 80
Pro Leu Val Trp Lys Ile Pro Val Gin Glu Gin Gin Ala Thr Asp Phe 85 90 95
Asn Leu Pro Leu Ser Ser Asp Ile Ile Leu Thr Lys Glu Lys Asn Ser 100 105 110
Ser Leu Gin Lys Ser Thr Gin Gly Lys Leu Tyr Leu Glu Gly Ser Ala 115 120 125
Pro Ser Gly Gin Val Ser Ala Lys Val Asn Leu Phe Arg Lys Ile Arg 130 135 140
Arg Gin Arg Lys Ser Thr His Arg Tyr Ser Val Arg Asp Ala Arg Lys 145 150 155 160
Thr Gin Leu Ser Thr Ser Asp Ser Glu Gly Asn Ser Asp Glu Lys Ser 165 170 175
Thr Val Val Ser Lys His Arg Arg Leu His Ala Leu Pro Arg Phe Leu 180 185 190
Thr Gln Ser Pro Lys Glu Gly His Leu Val Ala Lys Pro Asp Pro Ser 195 200 205
Ala Thr Lys Glu Gin Val Leu Ser Asp Thr Met Ser Val Glu Asn Ser 210 215 220
Arg Glu Val Ile Leu Arg Gin Asp Ser Asn Gly Asp Ile Leu Ser Glu 225 230 235 240
Pro Ala Ala Leu Ser Ile Leu Ser Asn Met Asn Asn Ser Pro Phe Asp 245 250 255
Leu Cys His Val Leu Leu Ser Leu Leu Glu Lys Val Cys Lys Phe Asp 260 265 270
Ile Ala Leu Asn His Asn Ser Ser Leu Ala Leu Ser Val Val Pro Thr 275 280 285
Leu Thr Glu Phe Leu Ala Gly Phe Gly Asp Cys Cys Asn Gin Ser Asp 290 295 300
Thr Leu Glu Gly Gin Leu Val Ser Ala Gly Trp Thr Glu Glu Pro Val 305 310 315 320
Ala Leu Val Gin Arg Met Leu Phe Arg Thr Val Leu His Leu Met Ser 325 330 335
Val Asp Val Ser Thr Ala Glu Ala Met Pro Glu Ser Leu Arg Lys Asn 340 345 350
Leu Thr Glu Leu Leu Arg Ala Ala Leu Lys Ile Arg Ala Cys Leu Glu 355 360 365
Lys Gin Pro Glu Pro Phe Ser Pro Arg Gin Lys Lys Thr Leu Gin Glu 370 375 380
Val Gin Glu Gly Phe Val Phe Ser Lys Tyr Arg His Arg Ala Leu Leu 385 390 395 400
Leu Pro Glu Leu Leu Glu Gly Val Leu Gin Leu Leu Ile Ser Cys Leu 405 410 415
Gin Ser Ala Ala Ser Asn Pro Phe Tyr Phe Ser Gin Ala Met Asp Leu 420 425 430
Val Gin Glu Phe Ile Gin His Gin Gly Phe Asn Leu Phe Gly Thr Ala 435 440 445
Val Leu Gin Met Glu Trp Leu Leu Thr Arg Asp Gly Val Pro Ser Glu 450 455 460
Ala Ala Glu His Leu Lys Ala Leu Ile Asn Ser Val Ile Lys Ile Met 465 470 475 480
Ser Thr Val Lys Lys Val Lys Ser Glu Gln Leu His His Ser Met Cys 485 490 495
Thr Arg Lys Arg His Arg Arg Cys Glu Tyr Ser His Phe Met Gln His 500 505 510
His Arg Asp Leu Ser Gly Leu Leu Val Ser Ala Phe Lys Asn Gin Leu 515 520 525
Ser Lys Ser Pro Phe Glu Glu Thr Ala Glu Gly Asp Val Gin Tyr Pro 530 535 540
Glu Arg Cys Cys Cys Ile Ala Val Cys Ala His Gin Cys Leu Arg Leu 545 550 555 560
Leu Gin Gin Val Ser Leu Ser Thr Thr Cys Val Gin Ile Leu Ser Gly 565 570 575
Val His Ser Val Gly Ile Cys Cys Cys Met Asp Pro Lys Ser Val Ile 580 585 590
Ala Pro Leu Leu His Ala Phe Lys Leu Pro Ala Leu Lys Ala Phe Gin 595 600 605
Gin His Ile Leu Asn Val Leu Ser Lys Leu Leu Val Asp Gin Leu Gly 610 615 620
Gly Ala Glu Leu Ser Pro Arg Ile Lys Lys Ala Ala Cys Asn Ile Cys 625 630 635 640
Thr Val Asp Ser Asp Gin Leu Ala Lys Leu Gly Glu Thr Leu Gin Gly 645 650 655
Thr Leu Cys Gly Ala Gly Pro Thr Ser Gly Leu Pro Ser Pro Ser Tyr 660 665 670
Arg Phe Gln Gly Ile Leu Pro Ser Ser Gly Ser Glu Asp Leu Leu Trp 675 680 685
Lys Trp Asp Ala Leu Glu Ala Tyr Gin Ser Phe Val Phe Gin Glu Asp 690 695 700
Arg Leu His Asn Ile Gin Ile Ala Asn His Ile Cys Asn Leu Leu Gin 705 710 715 720
Lys Gly Asn Val Val Val Gin Trp Lys Leu Tyr Asn Tyr Ile Phe Asn 725 730 735
Pro Val Leu Gin Arg Gly Val Glu Leu Val His His Cys Gin Gin Leu 740 745 750
Ser Ile Pro Ser Ala Gin Thr His Met Cys Ser Gin Leu Lys Gin Tyr 755 760 765
Leu Pro Gin Glu Val Leu Gin Ile Tyr Leu Lys Thr Leu Pro Val Leu 770 775 780
Leu Lys Ser Arg Val Ile Arg Asp Leu Phe Leu Ser Cys Asn Gly Val 785 790 795 800
Asn His Ile Ile Glu Leu Asn Tyr Leu Asp Gly Ile Arg Ser His Ser 805 810 815
Leu Lys Ala Phe Glu Thr Leu Ile Val Ser Leu Gly Glu Gln Gln Lys 820 825 830
Asp Ala Ala Val Leu Asp Val Asp Gly Leu Asp Ile Gln Gln Glu Leu 835 840 845
Pro Ser Leu Ser Val Gly Pro Ser Leu His Lys Gln Gln Ala Ser Ser 850 855 860
Asp Ser Pro Cys Ser Leu Arg Lys Phe Tyr Ala Ser Leu Arg Glu Pro 865 870 875 880
Asp Pro Lys Lys Arg Lys Thr Ile His Gln Asp Val His Ile Asn Thr 885 890 895
Ile Asn Leu Phe Leu Cys Val Ala Phe Leu Cys Val Ser Lys Glu Ala 900 905 910
Asp Ser Asp Arg Glu Ser Ala Asn Glu Ser Glu Asp Thr Ser Gly Tyr 915 920 925
Asp Ser Pro Pro Ser Glu Pro Leu Ser His Met Leu Pro Cys Leu Ser 930 935 940
Leu Glu Asp Val Val Leu Pro Ser Pro Glu Cys Leu His His Ala Ala 945 950 955 960
Asp Ile Trp Ser Met Cys Arg Trp Ile Tyr Met Leu Asn Ser Val Phe 965 970 975
Gin Lys Gin Phe His Arg Leu Gly Gly Phe Gin Val Cys His Glu Leu 980 985 990
Ile Phe Met Ile Ile Gin Lys Leu Phe Arg Ser His Thr Glu Asp Gin 995 1000 1005
Gly Arg Arg Gin Gly Glu Met Ser Arg Asn Glu Asn Gin Glu Leu Ile 1010 1015 1020
Arg Ile Ser Tyr Pro Glu Leu Thr Leu Lys Gly Asp Val Ser Ser Ala 1025 1030 1035 1040
Thr Ala Pro Asp Leu Gly Phe Leu Arg Lys Ser Ala Asp Ser Val Arg 1045 1050 1055
Gly Phe Gin Ser Gin Pro Val Leu Pro Thr Ser Ala Glu Gin Ile Val 1060 1065 1070
Ala Thr Glu Ser Val Pro Gly Glu Arg Lys Ala Phe Met Ser Gin Gin 1075 1080 1085
Ser Glu Thr Ser Leu Gin Ser Ile Arg Leu Leu Glu Ser Leu Leu Asp 1090 1095 1100
Ile Cys Leu His Ser Ala Arg Ala Cys Gin Gin Lys Met Glu Leu Glu 1105 1110 1115 1120
Leu Pro Ser Gin Gly Leu Ser Val Glu Asn Ile Leu Cys Glu Leu Arg 1125 1130 1135
Glu His Leu Ser Gin Ser Lys Val Ala Glu Thr Glu Leu Ala Lys Pro 1140 1145 1150
Leu Phe Asp Ala Leu Leu Arg Val Ala Leu Gly Asn His Ser Ala Asp 1155 1160 1165
Leu Gly Pro Gly Asp Ala Val Thr Glu Lys Ser His Pro Ser Glu Glu 1170 1175 1180
Glu Leu Leu Ser Gln Pro Gly Asp Phe Ser Glu Glu Ala Glu Asp Ser 1185 1190 1195 1200
Gin Cys Cys Ser Leu Lys Leu Leu Gly Glu Glu Glu Gly Tyr Glu Ala 1205 1210 1215
Asp Ser Glu Ser Asn Pro Glu Asp Val Asp Thr Gin Asp Asp Gly Val 1220 1225 1230
Glu Leu Asn Pro Glu Ala Glu Gly Phe Ser Gly Ser Ile Val Ser Asn 1235 1240 1245
Asn Leu Leu Glu Asn Leu Thr His Gly Glu Ile Ile Tyr Pro Glu Ile 1250 1255 1260
Cys Met Leu Gly Leu Asn Leu Leu Ser Ala Ser Lys Ala Lys Leu Asp 1265 1270 1275 1280
Val Leu Ala His Val Phe Glu Ser Phe Leu Lys Ile Val Arg Gin Lys 1285 1290 1295
Glu Lys Asn Ile Ser Leu Leu Ile Gin Gin Gly Thr Val Lys Ile Leu 1300 1305 1310
Leu Gly Gly Phe Leu Asn Ile Leu Thr Gin Thr Asn Ser Asp Phe Gin 1315 1320 1325
Ala Cys Gin Arg Val Leu Val Asp Leu Leu Val Ser Leu Met Ser Ser 1330 1335 1340
Arg Thr Cys Ser Glu Asp Leu Thr Leu Leu Trp Arg Ile Phe Leu Glu 1345 1350 1355 1360
Lys Ser Pro Cys Thr Glu Ile Leu Leu Leu Gly Ile His Lys Ile Val 1365 1370 1375
Glu Ser Asp Phe Thr Met Ser Pro Ser Gin Cys Leu Thr Phe Pro Phe 1380 1385 1390
Leu His Thr Pro Ser Leu Ser Asn Gly Val Leu Ser Gin Lys Pro Pro 1395 1400 1405
Gly Ile Leu Asn Ser Lys Ala Leu Gly Leu Leu Arg Arg Ala Arg Ile 1410 1415 1420
Ser Arg Gly Lys Lys Glu Ala Asp Arg Glu Ser Phe Pro Tyr Arg Leu 1425 1430 1435 1440
Leu Ser Ser Trp His Ile Ala Pro Ile His Leu Pro Leu Leu Gly Gin 1445 1450 1455
Asn Cys Trp Pro His Leu Ser Glu Gly Phe Ser Val Ser Leu Val Gly 1460 1465 1470
Leu Met Trp Asn Thr Ser Asn Glu Ser Glu Ser Ala Ala Glu Arg Gly 1475 1480 1485
Lys Arg Val Lys Lys Arg Asn Lys Pro Ser Val Leu Glu Asp Ser Ser 1490 1495 1500
Phe Glu Gly Ala Glu Gly Asp Arg Pro Glu Val Thr Glu Ser Ile Asn 1505 1510 1515 1520
Pro Gly Asp Arg Leu Ile Glu Asp Gly Cys Ile His Leu Ile Ser Leu 1525 1530 1535
Gly Ser Lys Ala Leu Met Ile Gln Val Trp Ala Asp Pro His Ser Gly 1540 1545 1550
Thr Phe Ile Phe Arg Val Cys Met Asp Ser Asn Asp Asp Thr Lys Ala 1555 1560 1565
Val Ser Leu Ala Gin Val Glu Ser Gin Glu Asn Ile Phe Phe Pro Ser 1570 1575 1580
Lys Trp Gin His Leu Val Leu Thr Tyr Ile Gin His Pro Gin Gly Lys 1585 1590 1595 1600
Lys Asn Val His Gly Glu Ile Ser Ile Trp Val Ser Gly Gin Arg Lys 1605 1610 1615
Thr Asp Val Ile Leu Asp Phe Val Leu Pro Arg Lys Thr Ser Leu Ser 1620 1625 1630
Ser Asp Ser Asn Lys Thr Phe Cys Met Ile Gly His Cys Leu Thr Ser 1635 1640 1645
Gin Glu Glu Ser Leu Gin Leu Ala Gly Lys Trp Asp Leu Gly Asn Leu 1650 1655 1660
Leu Leu Phe Asn Gly Ala Lys Ile Gly Ser Gin Glu Ala Phe Phe Leu 1665 1670 1675 1680
Tyr Ala Cys Gly Pro Asn Tyr Thr Ser Ile Met Pro Cys Lys Tyr Gly 1685 1690 1695
Gin Pro Val Ile Asp Tyr Ser Lys Tyr Ile Asn Lys Asp Ile Leu Arg 1700 1705 1710
Cys Asp Glu Ile Arg Asp Leu Phe Met Thr Lys Lys Glu Val Asp Val 1715 1720 1725
Gly Leu Leu Ile Glu Ser Leu Ser Val Val Tyr Thr Thr Cys Cys Pro 1730 1735 1740
Ala Gin Tyr Thr Ile Tyr Glu Pro Val Ile Arg Leu Lys Gly Gin Val 1745 1750 1755 1760
Lys Thr Gin Pro Ser Gin Arg Pro Phe Ser Ser Lys Glu Ala Gin Ser 1765 1770 1775
Ile Leu Leu Glu Pro Ser Gin Leu Lys Gly Leu Gin Pro Thr Glu Cys 1780 1785 1790
Lys Ala Ile Gin Gly Ile Leu His Glu Ile Gly Gly Ala Gly Thr Phe 1795 1800 1805
Val Phe Leu Phe Ala Arg Val Val Glu Leu Ser Ser Cys Glu Glu Thr 1810 1815 1820
Gin Ala Leu Ala Leu Arg Val Ile Leu Ser Leu Ile Lys Tyr Ser Gin 1825 1830 1835 1840
Gin Arg Thr Gin Glu Leu Glu Asn Cys Asn Gly Leu Ser Met Ile His 1845 1850 1855
Gin Val Leu Val Lys Gin Lys Cys Ile Val Gly Phe His Ile Leu Lys 1860 1865 1870
Thr Leu Leu Glu Gly Cys Cys Gly Glu Glu Val Ile His Val Ser Glu 1875 1880 1885
His Gly Glu Phe Lys Leu Asp Val Glu Ser His Ala Ile Ile Gln Asp 1890 1895 1900
Val Lys Leu Leu Gin Glu Leu Leu Leu Asp Trp Lys Ile Trp Asn Lys 1905 1910 1915 1920
Ala Glu Gin Gly Val Trp Glu Thr Leu Leu Ala Ala Leu Glu Val Leu 1925 1930 1935
Ile Arg Val Glu His His Gin Gin Gin Phe Asn Ile Lys Gin Leu Leu 1940 1945 1950
Asn Ala His Val Val His His Phe Leu Leu Thr Cys Gin Val Leu Gin 1955 1960 1965
Glu His Arg Glu Gly Gin Leu Thr Ser Met Pro Arg Glu Val Cys Arg 1970 1975 1980
Ser Phe Val Lys Ile Ile Ala Glu Val Leu Gly Ser Pro Pro Asp Leu 1985 1990 1995 2000
Glu Leu Leu Thr Val Ile Phe Asn Phe Leu Leu Ala Val His Pro Pro 2005 2010 2015
Thr Asn Thr Tyr Val Cys His Asn Pro Thr Asn Phe Tyr Phe Ser Leu 2020 2025 2030
His Ile Asp Gly Lys Ile Phe Gin Glu Lys Val Gin Ser Leu Ala Tyr 2035 2040 2045
Leu Arg His Ser Ser Ser Gly Gly Gin Ala Phe Pro Ser Pro Gly Phe 2050 2055 2060
Leu Val Ile Ser Pro Ser Ala Phe Thr Ala Ala Pro Pro Glu Gly Thr 2065 2070 2075 2080
Ser Ser Ser Asn Ile Val Pro Gin Arg Met Ala Ala Gin Met Val Arg 2085 2090 2095
Ser Arg Ser Leu Pro Ala Phe Pro Thr Tyr Leu Pro Leu Ile Arg Ala 2100 2105 2110
Gin Lys Leu Ala Ala Ser Leu Gly Phe Ser Val Asp Lys Leu Gin Asn 2115 2120 2125
Ile Ala Asp Ala Asn Pro Glu Lys Gin Asn Leu Leu Gly Arg Pro Tyr 2130 2135 2140
Ala Leu Lys Thr Ser Lys Glu Glu Ala Phe Ile Ser Ser Cys Glu Ser 2145 2150 2155 2160
Ala Lys Thr Val Cys Glu Met Glu Ala Leu Leu Gly Ala His Ala Ser 2165 2170 2175
Ala Asn Gly Val Ser Arg Gly Ser Pro Arg Phe Pro Arg Ala Arg Val 2180 2185 2190
Asp His Lys Asp Val Gly Thr Glu Pro Arg Ser Asp Asp Asp Ser Pro 2195 2200 2205
Gly Asp Glu Ser Tyr Pro Arg Arg Pro Asp Asn Leu Lys Gly Leu Ala 2210 2215 2220
Ser Phe Gin Arg Ser Gin Ser Thr Val Ala Ser Leu Gly Leu Ala Phe 2225 2230 2235 2240
Pro Ser Gln Asn Gly Ser Ala Val Ala Ser Arg Trp Pro Ser Leu Val 2245 2250 2255
Asp Arg Asn Ala Asp Asp Trp Glu Asn Phe Thr Phe Ser Pro Ala Tyr 2260 2265 2270
Glu Ala Ser Tyr Asn Arg Ala Thr Ser Thr His Ser Val Ile Glu Asp 2275 2280 2285
Cys Leu Ile Pro Ile Cys Cys Gly Leu Tyr Glu Leu Leu Ser Gly Val 2290 2295 2300
Leu Leu Val Leu Pro Asp Ala Met Leu Glu Asp Val Met Asp Arg Ile 2305 2310 2315 2320
Ile Gin Ala Asp Ile Leu Leu Val Leu Val Asn His Pro Ser Pro Ala 2325 2330 2335
Ile Gin Gin Gly Val Ile Lys Leu Leu His Ala Tyr Ile Asn Arg Ala 2340 2345 2350
Ser Lys Glu Gin Lys Asp Lys Phe Leu Lys Asn Arg Gly Phe Ser Leu 2355 2360 2365
Leu Ala Asn Gin Leu Tyr Leu His Arg Gly Thr Gin Glu Leu Leu Glu 2370 2375 2380
Cys Phe Val Glu Met Phe Phe Gly Arg Pro Ile Gly Leu Asp Glu Glu 2385 2390 2395 2400
Phe Asp Leu Glu Glu Val Lys His Met Glu Leu Phe Gln Lys Trp Ser 2405 2410 2415
Val Ile Pro Val Leu Gly Leu Ile Glu Thr Ser Leu Tyr Asp Asn Val 2420 2425 2430
Leu Leu His Asn Ala Leu Leu Leu Leu Leu Gin Val Leu Asn Ser Cys 2435 2440 2445
Ser Lys Val Ala Asp Met Leu Leu Asp Asn Gly Leu Leu Tyr Val Leu 2450 2455 2460
Cys Asn Thr Val Ala Ala Leu Asn Gly Leu Glu Lys Asn Ile Pro Val 2465 2470 2475 2480
Asn Glu Tyr Lys Leu Leu Ala Cys Asp Ile Gin Gin Leu Phe Ile Ala 2485 2490 2495
Val Thr Ile His Ala Cys Ser Ser Ser Gly Thr Gin Tyr Phe Arg Val 2500 2505 2510
Ile Glu Asp Leu Ile Val Leu Leu Gly Tyr Leu His Asn Ser Lys Asn 2515 2520 2525
Lys Arg Thr Gin Asn Met Ala Leu Ala Leu Gin Leu Arg Val Leu Gin 2530 2535 2540
Ala Ala Leu Glu Phe Ile Arg Ser Thr Ala Asn His Asp Ser Glu Ser 2545 2550 2555 2560
Pro Val His Ser Pro Ser Ala His Arg His Ser Val Pro Pro Lys Arg 2565 2570 2575
Arg Ser Ile Ala Gly Ser Arg Lys Phe Pro Leu Ala Gln Thr Glu Ser 2580 2585 2590
Leu Leu Met Lys Met Arg Ser Val Ala Ser Asp Glu Leu His Ser Met 2595 2600 2605
Met Gin Arg Arg Met Ser Gin Glu His Pro Ser Gin Ala Ser Glu Ala 2610 2615 2620
Glu Leu Ala Gln Arg Leu Gln Arg Leu Thr Ile Leu Ala Val Asn Arg 2625 2630 2635 2640
Ile Ile Tyr Gln Glu Leu Asn Ser Asp Ile Ile Asp Ile Leu Arg Thr 2645 2650 2655
Pro Glu Asn Thr Ser Gin Ser Lys Thr Ser Val Ser Gin Thr Glu Ile 2660 2665 2670
Ser Glu Glu Asp Met His His Glu Gln Pro Ser Val Tyr Asn Pro Phe 2675 2680 2685
Gln Lys Glu Met Leu Thr Tyr Leu Leu Asp Gly Phe Lys Val Cys Ile 2690 2695 2700
Gly Ser Ser Lys Thr Ser Val Ser Lys Gin Gin Trp Thr Lys Ile Leu 2705 2710 2715 2720
Gly Ser Cys Lys Glu Thr Leu Arg Asp Gin Leu Gly Arg Leu Leu Ala 2725 2730 2735
His Ile Leu Ser Pro Thr His Thr Val Gin Glu Arg Lys Gin Ile Leu 2740 2745 2750
Glu Ile Val His Glu Pro Ala His Gin Asp Ile Leu Arg Asp Cys Leu 2755 2760 2765
Ser Pro Ser Pro Gin His Gly Ala Lys Leu Val Leu Tyr Leu Ser Glu 2770 2775 2780
Leu Ile His Asn His Gin Asp Glu Leu Ser Glu Glu Glu Met Asp Thr 2785 2790 2795 2800
Ala Glu Leu Leu Met Asn Ala Leu Lys Leu Cys Gly His Lys Cys Ile 2805 2810 2815
Pro Pro Ser Ala Pro Ser Lys Pro Glu Leu Ile Lys Ile Ile Arg Glu 2820 2825 2830
Glu Gin Lys Lys Tyr Glu Ser Glu Glu Ser Val Ser Lys Gly Ser Trp 2835 2840 2845
Gin Lys Thr Val Asn Asn Asn Gin Gin Ser Leu Phe Gin Arg Leu Asp 2850 2855 2860
Phe Lys Ser Lys Asp Ile Ser Lys Ile Ala Ala Asp Ile Thr Gin Ala 2865 2870 2875 2880
Val Ser Leu Ser Gin Gly Ile Glu Arg Lys Lys Val Ile Gin His Ile 2885 2890 2895
Arg Gly Met Tyr Lys Val Asp Leu Ser Ala Ser Arg His Trp Gin Glu 2900 2905 2910
Cys Ile Gln Gln Leu Thr His Asp Arg Ala Val Trp Tyr Asp Pro Ile 2915 2920 2925
Tyr Tyr Pro Thr Ser Trp Gln Leu Asp Pro Thr Glu Gly Pro Asn Arg 2930 2935 2940
Glu Arg Arg Arg Leu Gin Arg Cys Tyr Leu Thr Ile Pro Asn Lys Tyr 2945 2950 2955 2960
Leu Leu Arg Asp Arg Gin Lys Ser Glu Gly Val Leu Arg Pro Pro Leu 2965 2970 2975
Ser Tyr Leu Phe Glu Asp Lys Thr His Ser Ser Phe Ser Ser Thr Val 2980 2985 2990
Lys Asp Lys Ala Ala Ser Glu Ser Ile Arg Val Asn Arg Arg Cys Ile 2995 3000 3005
Ser Val Ala Pro Ser Arg Glu Thr Ala Gly Glu Leu Leu Leu Gly Lys 3010 3015 3020
Cys Gly Met Tyr Phe Val Glu Asp Asn Ala Ser Asp Ala Val Glu Ser 3025 3030 3035 3040
Ser Ser Leu Gin Gly Glu Leu Glu Pro Ala Ser Phe Ser Trp Thr Tyr 3045 3050 3055
Glu Glu Ile Lys Glu Val His Arg Arg Trp Trp Gin Leu Arg Asp Asn 3060 3065 3070
Ala Val Glu Ile Phe Leu Thr Asn Gly Arg Thr Leu Leu Leu Ala Phe 3075 3080 3085
Asp Asn Asn Lys Val Arg Asp Asp Val Tyr Gin Ser Ile Leu Thr Asn 3090 3095 3100
Asn Leu Pro Asn Leu Leu Glu Tyr Gly Asn Ile Thr Ala Leu Thr Asn 3105 3110 3115 3120
Leu Trp Tyr Ser Gly Gin Ile Thr Asn Phe Glu Tyr Leu Thr His Leu 3125 3130 3135
Asn Lys His Ala Gly Arg Ser Phe Asn Asp Leu Met Gin Tyr Pro Val 3140 3145 3150
Phe Pro Phe Ile Leu Ser Asp Tyr Val Ser Glu Thr Leu Asp Leu Asn 3155 3160 3165
Asp Pro Ser Ile Tyr Arg Asn Leu Ser Lys Pro Ile Ala Val Gln Tyr 3170 3175 3180
Lys Glu Lys Glu Asp Arg Tyr Val Asp Thr Tyr Lys Tyr Leu Glu Glu 3185 3190 3195 3200
Glu Tyr Arg Lys Gly Ala Arg Glu Asp Asp Pro Met Pro Pro Val Gin 3205 3210 3215
Pro Tyr His Tyr Gly Ser His Tyr Ser Asn Ser Gly Thr Val Leu His 3220 3225 3230
Phe Leu Val Arg Met Pro Pro Phe Thr Lys Met Phe Leu Ala Tyr Gin 3235 3240 3245
Asp Gin Ser Phe Asp Ile Pro Asp Arg Thr Phe His Ser Thr Asn Thr 3250 3255 3260
Thr Trp Arg Leu Ser Ser Phe Glu Ser Met Thr Asp Val Lys Glu Leu 3265 3270 3275 3280
Ile Pro Glu Phe Phe Tyr Leu Pro Glu Phe Leu Val Asn Arg Glu Gly 3285 3290 3295
Phe Asp Phe Gly Val Arg Gin Asn Gly Glu Arg Val Asn His Val Asn 3300 3305 3310
Leu Pro Pro Trp Ala Arg Asn Asp Pro Arg Leu Phe Ile Leu Ile His 3315 3320 3325
Arg Gin Ala Leu Glu Ser Asp His Val Ser Gin Asn Ile Cys His Trp 3330 3335 3340
Ile Asp Leu Val Phe Gly Tyr Lys Gin Lys Gly Lys Ala Ser Val Gin 3345 3350 3355 3360
Ala Ile Asn Val Phe His Pro Ala Thr Tyr Phe Gly Met Asp Val Ser 3365 3370 3375
Ala Val Glu Asp Pro Val Gin Arg Arg Ala Leu Glu Thr Met Ile Lys 3380 3385 3390
Thr Tyr Gly Gin Thr Pro Arg Gin Leu Phe His Thr Ala His Ala Ser 3395 3400 3405
Arg Pro Gly Ala Lys Leu Asn Ile Glu Gly Glu Leu Pro Ala Ala Val 3410 3415 3420
Gly Leu Leu Val Gin Phe Ala Phe Arg Glu Thr Arg Glu Pro Val Lys 3425 3430 3435 3440
Glu Val Thr His Pro Ser Pro Leu Ser Trp Ile Lys Gly Leu Lys Trp 3445 3450 3455
Gly Glu Tyr Val Gly Ser Pro Ser Ala Pro Val Pro Val Val Cys Phe 3460 3465 3470
Ser Gin Pro His Gly Glu Arg Phe Gly Ser Leu Gin Ala Leu Pro Thr 3475 3480 3485
Arg Ala Ile Cys Gly Leu Ser Arg Asn Phe Cys Leu Leu Met Thr Tyr 3490 3495 3500
Asn Lys Glu Gin Gly Val Arg Ser Met Asn Asn Thr Asn Ile Gin Trp 3505 3510 3515 3520
Ser Ala Ile Leu Ser Trp Gly Tyr Ala Asp Asn Ile Leu Arg Leu Lys 3525 3530 3535
Ser Lys Gin Ser Glu Pro Pro Ile Asn Phe Ile Gin Ser Ser Gin Gin 3540 3545 3550
His Gin Val Thr Ser Cys Ala Trp Val Pro Asp Ser Cys Gin Leu Phe 3555 3560 3565
Thr Gly Ser Lys Cys Gly Val Ile Thr Ala Tyr Thr Asn Arg Leu Thr 3570 3575 3580
Ser Ser Thr Pro Ser Glu Ile Glu Met Glu Ser Gin Met His Leu Tyr 3585 3590 3595 3600
Gly His Thr Glu Glu Ile Thr Gly Leu Cys Val Cys Lys Pro Tyr Ser 3605 3610 3615
Val Met Ile Ser Val Ser Arg Asp Gly Thr Cys Ile Val Trp Asp Leu 3620 3625 3630
Asn Arg Leu Cys Tyr Val Gin Ser Leu Ala Gly His Lys Ser Pro Val 3635 3640 3645
Thr Ala Val Ser Ala Ser Glu Thr Ser Gly Asp Ile Ala Thr Val Cys 3650 3655 3660
Asp Ser Ala Gly Gly Gly Ser Asp Leu Arg Leu Trp Thr Val Asn Gly 3665 3670 3675 3680
Asp Leu Val Gly His Val His Cys Arg Glu Ile Ile Cys Ser Val Ala 3685 3690 3695
Phe Ser Asn Gin Pro Glu Gly Val Ser Ile Asn Val Ile Ala Gly Gly 3700 3705 3710
Leu Glu Asn Gly Ile Val Arg Leu Trp Ser Thr Trp Asp Leu Lys Pro 3715 3720 3725
Val Arg Glu Ile Thr Phe Pro Lys Ser Asn Lys Pro Ile Ile Ser Leu 3730 3735 3740
Thr Phe Ser Cys Asp Gly His His Leu Tyr Thr Ala Asn Ser Glu Gly 3745 3750 3755 3760
Thr Val Ile Ala Trp Cys Arg Lys Asp Gin Gin Arg Val Lys Leu Pro 3765 3770 3775
Met Phe Tyr Ser Phe Leu Ser Ser Tyr Ala Ala Gly 3780 3785
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5893 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TGAGAGCTCA CGCTGGCCTG GCAGCCTTGG TGAGTCGGGA TTCTCCTGCA CCGGCGGGCG 60
AGAGCGCGCG GCGGACCACA GAGCGGAGGT GAAGCCTTAT GCTGAGACAG TTTTATCTAG 120
TTCATGAACC CAAATTATAT ACAAGCTGAA TGTTACAGAA GTGCTGAAAG ACTGCTCTGT 180
CATGAGCACG GACAGCAACT CATTGGCACG TGAGTTTCTG ATTGATGTCA ACCAGCTTTG 240
CAATGCAGTG GTCCAGAGGG CAGAAGCCAG GGAAGAAGAA GAAGAGGAGA CACACATGGC 300
AACTCTTGGA CAGTACCTTG TCCATGGACG AGGATTTCTG TTACTTACCA AACTAAATTC 360
TATCATTGAT CAGGCCCTGA CATGCAGAGA AGAACTCCTG ACTCTTCTTC TGTCGCTCCT 420
TCCCTTGGTG TGGAAGATAC CTGTCCAGGA ACAGCAGGCA ACAGATTTTA ACCTGCCACT 480
GTCATCTGAT ATAATCCTGA CCAAAGAAAA GAACTCAAGT TTGCAAAAAT CAACTCAGGG 540
AAAATTATAT TTAGAAGGAA GTGCTCCATC TGGTCAGGTT TCTGCAAAAG TAAACCTTTT 600
TCGAAAAATC AGGCGACAGC GTAAAAGTAC CCATCGTTAT TCTGTAAGAG ATGCAAGAAA 660
GACACAGCTC TCCACCTCTG ACTCCGAAGG CAACTCAGAT GAAAAGAGTA CGGTTGTGAG 720
TAAACACAGG AGGCTCCACG CGCTGCCACG GTTCCTGACG CAGTCTCCTA AGGAAGGCCA 780
CCTCGTAGCC AAACCTGACC CCTCTGCCAC CAAAGAACAG GTCCTTTCTG ACACCATGTC 840
TGTGGAAAAC TCCAGAGAAG TCATTCTGAG ACAGGATTCA AATGGTGACA TATTAAGTGA 900
GCCAGCTGCT TTGTCTATTC TCAGTAACAT GAATAATTCT CCTTTTGACT TATGTCATGT 960
TTTGTTATCT CTATTGGAAA AAGTTTGTAA GTTTGACATT GCTTTGAATC ATAATTCTTC 1020
CCTAGCACTC AGTGTAGTAC CCACACTGAC TGAGTTCCTA GCAGGCTTTG GGGACTGCTG 1080
TAACCAGAGT GACACTTTGG AGGGACAACT GGTTTCTGCA GGTTGGACAG AAGAGCCGGT 1140
AGCTTTGGTT CAACGGATGC TCTTTCGAAC CGTGCTGCAC CTTATGTCAG TAGACGTTAG 1200
CACTGCAGAG GCAATGCCAG AAAGTCTTAG GAAAAATTTG ACTGAATTGC TTAGGGCAGC 1260
TTTAAAAATT AGAGCTTGCT TGGAAAAGCA GCCTGAGCCT TTCTCCCCGA GACAAAAGAA 1320
AACACTACAG GAGGTCCAGG AGGGCTTTGT ATTTTCCAAG TATCGTCACC GAGCCCTTCT 1380
ACTACCTGAG CTTCTGGAAG GAGTTCTACA GCTCCTCATC TCTTGTCTTC AGAGTGCAGC 1440
TTCAAATCCC TTTTACTTCA GTCAAGCCAT GGATTTAGTT CAAGAATTTA TCCAGCACCA 1500
AGGATTTAAT CTCTTTGGAA CAGCAGTTCT TCAGATGGAA TGGCTGCTTA CAAGGGACGG 1560
TGTTCCTTCA GAAGCTGCAG AACATTTGAA AGCTCTGATA AACAGTGTAA TAAAAATAAT 1620
GAGTACTGTG AAAAAGGTGA AATCAGAGCA ACTTCATCAT TCCATGTGCA CAAGGAAAAG 1680
ACACCGGCGT TGTGAGTATT CCCACTTCAT GCAGCACCAC CGCGATCTTT CAGGGCTCCT 1740
GGTTTCAGCT TTTAAAAATC AGCTTTCTAA AAGCCCCTTT GAAGAGACCG CAGAGGGAGA 1800
TGTGCAGTAT CCAGAGCGCT GCTGCTGCAT CGCCGTGTGC GCTCACCAGT GCTTGCGCTT 1860
GCTGCAGCAG GTTTCCCTGA GCACCACGTG TGTCCAGATC CTATCAGGTG TACACAGTGT 1920
TGGAATCTGT TGTTGTATGG ATCCTAAGTC TGTGATCGCC CCTTTACTGC ATGCTTTTAA 1980
GTTGCCAGCA CTGAAAGCTT TCCAGCAGCA TATACTGAAT GTCCTGAGCA AACTTCTTGT 2040
GGATCAGTTA GGAGGAGCAG AGCTATCACC GAGAATTAAA AAAGCAGCTT GCAACATCTG 2100
TACTGTGGAC TCTGACCAAC TGGCTAAGTT AGGAGAGACA CTGCAAGGCA CCTTGTGTGG 2160
TGCTGGTCCT ACCTCCGGCT TGCCCAGTCC TTCCTACCGA TTTCAGGGGA TCCTGCCCAG 2220
CAGCGGCTCT GAAGACTTGC TGTGGAAGTG GGATGCATTA GAGGCTTATC AGAGCTTTGT 2280
CTTTCAAGAA GACAGATTAC ATAACATTCA GATTGCAAAT CACATTTGTA ATTTACTCCA 2340
GAAAGGCAAT GTAGTTGTTC AGTGGAAATT GTATAATTAT ATCTTTAATC CTGTGCTCCA 2400
AAGAGGAGTT GAATTAGTAC ATCATTGTCA ACAGCTAAGC ATTCCTTCAG CTCAGACTCA 2460
CATGTGTAGC CAACTGAAAC AGTATTTGCC TCAGGAAGTG CTTCAGATTT ATTTAAAAAC 2520
TCTACCTGTC CTACTTAAAT CCAGGGTAAT AAGAGATTTG TTTTTAAGTT GTAATGGAGT 2580
AAACCACATA ATTGAACTAA ATTACTTAGA TGGGATTCGA AGTCATTCCC TGAAAGCATT 2640
TGAAACTCTG ATTGTCAGCC TAGGGGAACA ACAGAAAGAT GCTGCAGTTC TAGACGTCGA 2700
TGGGTTAGAC ATCCAACAGG AGTTGCCGTC CTTAAGTGTG GGTCCTTCTC TTCATAAGCA 2760
GCAAGCTTCT TCAGATTCTC CTTGCAGTCT CAGGAAGTTT TATGCCAGCC TCAGAGAGCC 2820
TGATCCAAAA AAACGAAAGA CCATTCACCA GGATGTTCAC ATAAACACCA TAAACCTCTT 2880
CCTCTGTGTG GCTTTTCTAT GTGTCAGTAA AGAAGCAGAC TCTGATAGGG AGTCTGCCAA 2940
TGAGTCAGAA GATACTTCTG GCTATGACAG CCCTCCCAGT GAGCCATTAA GTCACATGCT 3000
ACCATGTCTG TCTCTTGAGG ACGTTGTCTT ACCTTCCCCT GAATGTTTGC ACCATGCAGC 3060
AGACATTTGG TCCATGTGTC GTTGGATCTA CATGTTGAAC TCAGTCTTCC AGAAACAATT 3120
TCACAGGCTT GGTGGTTTCC AAGTGTGCCA TGAATTAATA TTTATGATAA TCCAGAAACT 3180
ATTCAGAAGT CATACAGAGG ATCAAGGAAG AAGGCAGGGA GAAATGAGTA GAAATGAAAA 3240
CCAAGAGCTA ATCAGGATAT CTTACCCCGA GCTGACACTG AAGGGAGATG TATCATCTGC 3300
AACAGCACCA GACCTGGGAT TTCTGAGAAA GAGTGCTGAC AGCGTGCGTG GATTCCAGTC 3360
ACAGCCTGTG CTTCCCACAA GTGCAGAGCA GATTGTGGCT ACTGAATCTG TTCCTGGGGA 3420
ACGAAAGGCA TTTATGAGTC AACAAAGTGA GACTTCTCTC CAGAGCATAC GACTTTTGGA 3480
GTCTCTCCTG GACATTTGTC TTCATAGTGC CAGAGCCTGT CAACAGAAGA TGGAATTGGA 3540
GCTACCGTCT CAGGGCTTGT CTGTGGAAAA TATATTGTGT GAACTGAGGG AACACCTTTC 3600
CCAGTCAAAG GTGGCAGAAA CAGAATTAGC AAAGCCTTTA TTTGATGCCC TGCTTCGAGT 3660
AGCCCTGGGG AATCATTCAG CAGATTTGGG CCCTGGTGAT GCTGTGACTG AGAAGAGTCA 3720
TCCCTCTGAG GAAGAGCTGT TGTCCCAGCC CGGAGATTTT TCAGAAGAAG CTGAGGATTC 3780
TCAGTGTTGT AGTTTGAAAC TTCTGGGTGA GGAAGAAGGC TATGAAGCGG ATAGTGAAAG 3840
CAATCCTGAG GATGTTGACA CCCAAGACGA TGGAGTAGAA TTAAATCCTG AAGCAGAAGG 3900
TTTCAGTGGA TCGATTGTTT CAAACAACTT ACTTGAAAAC CTCACTCACG GGGAAATAAT 3960
ATACCCTGAG ATTTGCATGC TGGGATTAAA TTTGCTTTCT GCTAGCAAAG CTAAACTTGA 4020
TGTGCTTGCT CATGTGTTTG AGAGCTTTCT GAAAATTGTC AGGCAGAAGG AAAAGAACAT 4080
TTCTCTCCTC ATACAACAGG GAACTGTGAA AATCCTTCTA GGCGGGTTCT TGAATATTTT 4140
AACACAAACT AACTCTGATT TCCAAGCATG CCAGAGAGTA CTGGTGGATC TCTTGGTATC 4200
TTTGATGAGC TCAAGAACGT GTTCAGAAGA CTTAACTCTT CTTTGGAGAA TATTTCTGGA 4260
GAAATCTCCT TGTACAGAAA TTCTTCTCCT TGGTATTCAC AAAATTGTTG AAAGTGATTT 4320
TACTATGAGC CCTTCACAGT GTCTGACCTT TCCTTTCCTG CATACCCCGA GTTTAAGCAA 4380
TGGTGTCTTA TCACAGAAAC CTCCTGGGAT TCTTAACAGT AAAGCCTTAG GCTTATTGAG 4440
AAGAGCACGG ATTTCCCGAG GCAAGAAAGA GGCTGATAGA GAGAGTTTTC CCTATAGGCT 4500
GCTTTCCTCT TGGCATATAG CCCCAATCCA CCTGCCGTTG CTGGGACAGA ACTGCTGGCC 4560
ACACCTGTCA GAAGGATTTA GTGTTTCTCT TGTGGGTTTA ATGTGGAATA CATCCAATGA 4620
ATCCGAGAGT GCTGCAGAAA GGGGAAAAAG AGTAAAGAAA AGAAACAAAC CATCAGTTCT 4680
GGAAGACAGC AGTTTTGAAG GAGCAGGTAT GATGGCAGGG TCTGATCTAT ATACTAAGAT 4740
TCTTCAAATA GCTGCTTGCC TGAGTTTTAA GCATATCTGG CAGTATTTTA ATGTATTCTT 4800
TAAATGTTAT TCACCTTAAA GATCCTACTT CACTACTGAA TTACCAAAGC CTGAGTTTTC 4860
AAACAGCCTT GAAATCTTCA TTGTCTCTAA ACTTTAGATA GGGAAGTGGG GATGCTCTGT 4920
TTCTGCAACA GCTGTTGAAG TTAGCAGTCC CATGACTGTG TTAGTGTGGC TTCTGATACT 4980
AGATAGTTAT AAATAAAACC CTATGGCCAT TTTTATTTTA AGTTCTCCTT CTGTGTCTTA 5040
CACCAATGGC CCCTTCTAGT TACTGTCCCT GATCATTTAT ATGTAACAGT CCAAAGTTAG 5100
AACAGAGTTC ATCTGTAACT GAAGAACTGC TGTTAGGATG TACTGAAATT GAATTTTGTT 5160
TTTGTTCTCT TCTTTTTTAA GCAATCAACA GTTTCTTAAG TCATATAGCA GCTAGAGGAA 5220
GTAGTCTTAA AAACTGGCTG TGTATTTTTT TAACCTGTTA AAAATGGTGG CTAATATTTT 5280
TATACCCTAA TAATTGATAA TGTTCCTCTT TTTTAAAAGT CTGAGCTTTT GGACATGCAC 5340
TGTTTATGTT AGTACATCTT AGCTTAGTTT AACATAAAGT CACATCATAG TAACAAATAG 5400
CTTATCACAC ATATTCCACC TGCCATTGCT GTCACAGATA ATGGGAATAT AGAGGCAACT 5460
CAAGATTTAA GTAGTAAGGT GCCATTGGGA GGGGTAAGCA GCTAGCTCAC AGCCATAAAC 5520
ACTTCTCTCA GCGGAGACAA ACTGTGATTC AGGGTTTGGC ATCACTTAGC ATGGTTATTT 5580
CAAGGTTGTT CACTACCTTA AATAATGATC ATTTGAGCAG TGCAGCTTTT CTAAGAAGAG 5640
TATTAATAAT ATTATAGATC GTGCCTTTGT AACAATTTTT TTAGTGCAAG GCATCTGTTG 5700
ATGGCATGTG CTCCCTGGGC CATGGTCAGT TGTGTTAGAG TGACCCAATC CAACAAAAGC 5760
AGAACCTTGG TATGGAGTGT GGCTGACGAT GGTCCTTTAG CACCCTCAGG CCTTGTAGTT 5820
TAAAGCATTT AATAACTTTT AAAACACTGG AGTCTTTAGT GAGGACCTGC CCGGGCGGCC 5880
GCCACCGCGG TGG 5893
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1545 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Met Ser Thr Asp Ser Asn Ser Leu Ala Arg Glu Phe Leu Ile Asp Val 1 5 10 15
Asn Gin Leu Cys Asn Ala Val Val Gin Arg Ala Glu Ala Arg Glu Glu 20 25 30
Glu Glu Glu Glu Thr His Met Ala Thr Leu Gly Gin Tyr Leu Val His 35 40 45
Gly Arg Gly Phe Leu Leu Leu Thr Lys Leu Asn Ser Ile Ile Asp Gin 50 55 60
Ala Leu Thr Cys Arg Glu Glu Leu Leu Thr Leu Leu Leu Ser Leu Leu 65 70 75 80
Pro Leu Val Trp Lys Ile Pro Val Gin Glu Gin Gin Ala Thr Asp Phe 85 90 95
Asn Leu Pro Leu Ser Ser Asp Ile Ile Leu Thr Lys Glu Lys Asn Ser 100 105 110
Ser Leu Gin Lys Ser Thr Gin Gly Lys Leu Tyr Leu Glu Gly Ser Ala 115 120 125
Pro Ser Gly Gin Val Ser Ala Lys Val Asn Leu Phe Arg Lys Ile Arg 130 135 140
Arg Gin Arg Lys Ser Thr His Arg Tyr Ser Val Arg Asp Ala Arg Lys 145 150 155 160
Thr Gin Leu Ser Thr Ser Asp Ser Glu Gly Asn Ser Asp Glu Lys Ser 165 170 175
Thr Val Val Ser Lys His Arg Arg Leu His Ala Leu Pro Arg Phe Leu 180 185 190
Thr Gin Ser Pro Lys Glu Gly His Leu Val Ala Lys Pro Asp Pro Ser 195 200 205
Ala Thr Lys Glu Gin Val Leu Ser Asp Thr Met Ser Val Glu Asn Ser 210 215 220
Arg Glu Val Ile Leu Arg Gin Asp Ser Asn Gly Asp Ile Leu Ser Glu 225 230 235 240
Pro Ala Ala Leu Ser Ile Leu Ser Asn Met Asn Asn Ser Pro Phe Asp 245 250 255
Leu Cys His Val Leu Leu Ser Leu Leu Glu Lys Val Cys Lys Phe Asp 260 265 270
Ile Ala Leu Asn His Asn Ser Ser Leu Ala Leu Ser Val Val Pro Thr 275 280 285
Leu Thr Glu Phe Leu Ala Gly Phe Gly Asp Cys Cys Asn Gin Ser Asp 290 295 300
Thr Leu Glu Gly Gin Leu Val Ser Ala Gly Trp Thr Glu Glu Pro Val 305 310 315 320
Ala Leu Val Gin Arg Met Leu Phe Arg Thr Val Leu His Leu Met Ser 325 330 335
Val Asp Val Ser Thr Ala Glu Ala Met Pro Glu Ser Leu Arg Lys Asn 340 345 350
Leu Thr Glu Leu Leu Arg Ala Ala Leu Lys Ile Arg Ala Cys Leu Glu 355 360 365
Lys Gln Pro Glu Pro Phe Ser Pro Arg Gln Lys Lys Thr Leu Gln Glu 370 375 380
Val Gln Glu Gly Phe Val Phe Ser Lys Tyr Arg His Arg Ala Leu Leu 385 390 395 400
Leu Pro Glu Leu Leu Glu Gly Val Leu Gin Leu Leu Ile Ser Cys Leu 405 410 415
Gin Ser Ala Ala Ser Asn Pro Phe Tyr Phe Ser Gin Ala Met Asp Leu 420 425 430
Val Gin Glu Phe Ile Gin His Gin Gly Phe Asn Leu Phe Gly Thr Ala 435 440 445
Val Leu Gin Met Glu Trp Leu Leu Thr Arg Asp Gly Val Pro Ser Glu 450 455 460
Ala Ala Glu His Leu Lys Ala Leu Ile Asn Ser Val Ile Lys Ile Met 465 470 475 480
Ser Thr Val Lys Lys Val Lys Ser Glu Gin Leu His His Ser Met Cys 485 490 495
Thr Arg Lys Arg His Arg Arg Cys Glu Tyr Ser His Phe Met Gin His 500 505 510
His Arg Asp Leu Ser Gly Leu Leu Val Ser Ala Phe Lys Asn Gin Leu 515 520 525
Ser Lys Ser Pro Phe Glu Glu Thr Ala Glu Gly Asp Val Gin Tyr Pro 530 535 540
Glu Arg Cys Cys Cys Ile Ala Val Cys Ala His Gin Cys Leu Arg Leu 545 550 555 560
Leu Gin Gin Val Ser Leu Ser Thr Thr Cys Val Gin Ile Leu Ser Gly 565 570 575
Val His Ser Val Gly Ile Cys Cys Cys Met Asp Pro Lys Ser Val Ile 580 585 590
Ala Pro Leu Leu His Ala Phe Lys Leu Pro Ala Leu Lys Ala Phe Gln 595 600 605
Gin His Ile Leu Asn Val Leu Ser Lys Leu Leu Val Asp Gin Leu Gly 610 615 620
Gly Ala Glu Leu Ser Pro Arg Ile Lys Lys Ala Ala Cys Asn Ile Cys 625 630 635 640
Thr Val Asp Ser Asp Gin Leu Ala Lys Leu Gly Glu Thr Leu Gin Gly 645 650 655
Thr Leu Cys Gly Ala Gly Pro Thr Ser Gly Leu Pro Ser Pro Ser Tyr 660 665 670
Arg Phe Gin Gly Ile Leu Pro Ser Ser Gly Ser Glu Asp Leu Leu Trp 675 680 685
Lys Trp Asp Ala Leu Glu Ala Tyr Gin Ser Phe Val Phe Gin Glu Asp 690 695 700
Arg Leu His Asn Ile Gin Ile Ala Asn His Ile Cys Asn Leu Leu Gin 705 710 715 720
Lys Gly Asn Val Val Val Gln Trp Lys Leu Tyr Asn Tyr Ile Phe Asn 725 730 735
Pro Val Leu Gln Arg Gly Val Glu Leu Val His His Cys Gln Gln Leu 740 745 750
Ser Ile Pro Ser Ala Gin Thr His Met Cys Ser Gin Leu Lys Gin Tyr 755 760 765
Leu Pro Gin Glu Val Leu Gin Ile Tyr Leu Lys Thr Leu Pro Val Leu 770 775 780
Leu Lys Ser Arg Val Ile Arg Asp Leu Phe Leu Ser Cys Asn Gly Val 785 790 795 800
Asn His Ile Ile Glu Leu Asn Tyr Leu Asp Gly Ile Arg Ser His Ser 805 810 815
Leu Lys Ala Phe Glu Thr Leu Ile Val Ser Leu Gly Glu Gin Gin Lys 820 825 830
Asp Ala Ala Val Leu Asp Val Asp Gly Leu Asp Ile Gin Gin Glu Leu 835 840 845
Pro Ser Leu Ser Val Gly Pro Ser Leu His Lys Gin Gin Ala Ser Ser 850 855 860
Asp Ser Pro Cys Ser Leu Arg Lys Phe Tyr Ala Ser Leu Arg Glu Pro 865 870 875 880
Asp Pro Lys Lys Arg Lys Thr Ile His Gin Asp Val His Ile Asn Thr 885 890 895
Ile Asn Leu Phe Leu Cys Val Ala Phe Leu Cys Val Ser Lys Glu Ala 900 905 910
Asp Ser Asp Arg Glu Ser Ala Asn Glu Ser Glu Asp Thr Ser Gly Tyr 915 920 925
Asp Ser Pro Pro Ser Glu Pro Leu Ser His Met Leu Pro Cys Leu Ser 930 935 940
Leu Glu Asp Val Val Leu Pro Ser Pro Glu Cys Leu His His Ala Ala 945 950 955 960
Asp Ile Trp Ser Met Cys Arg Trp Ile Tyr Met Leu Asn Ser Val Phe 965 970 975
Gin Lys Gin Phe His Arg Leu Gly Gly Phe Gin Val Cys His Glu Leu 980 985 990
Ile Phe Met Ile Ile Gin Lys Leu Phe Arg Ser His Thr Glu Asp Gin 995 1000 1005
Gly Arg Arg Gin Gly Glu Met Ser Arg Asn Glu Asn Gin Glu Leu Ile 1010 1015 1020
Arg Ile Ser Tyr Pro Glu Leu Thr Leu Lys Gly Asp Val Ser Ser Ala 1025 1030 1035 1040
Thr Ala Pro Asp Leu Gly Phe Leu Arg Lys Ser Ala Asp Ser Val Arg 1045 1050 1055
Gly Phe Gin Ser Gin Pro Val Leu Pro Thr Ser Ala Glu Gin Ile Val 1060 1065 1070
Ala Thr Glu Ser Val Pro Gly Glu Arg Lys Ala Phe Met Ser Gin Gin 1075 1080 1085
Ser Glu Thr Ser Leu Gin Ser Ile Arg Leu Leu Glu Ser Leu Leu Asp 1090 1095 1100
Ile Cys Leu His Ser Ala Arg Ala Cys Gin Gin Lys Met Glu Leu Glu 1105 1110 1115 1120
Leu Pro Ser Gin Gly Leu Ser Val Glu Asn Ile Leu Cys Glu Leu Arg 1125 1130 1135
Glu His Leu Ser Gin Ser Lys Val Ala Glu Thr Glu Leu Ala Lys Pro 1140 1145 1150
Leu Phe Asp Ala Leu Leu Arg Val Ala Leu Gly Asn His Ser Ala Asp 1155 1160 1165
Leu Gly Pro Gly Asp Ala Val Thr Glu Lys Ser His Pro Ser Glu Glu 1170 1175 1180
Glu Leu Leu Ser Gin Pro Gly Asp Phe Ser Glu Glu Ala Glu Asp Ser 1185 1190 1195 1200
Gin Cys Cys Ser Leu Lys Leu Leu Gly Glu Glu Glu Gly Tyr Glu Ala 1205 1210 1215
Asp Ser Glu Ser Asn Pro Glu Asp Val Asp Thr Gin Asp Asp Gly Val 1220 1225 1230
Glu Leu Asn Pro Glu Ala Glu Gly Phe Ser Gly Ser Ile Val Ser Asn 1235 1240 1245
Asn Leu Leu Glu Asn Leu Thr His Gly Glu Ile Ile Tyr Pro Glu Ile 1250 1255 1260
Cys Met Leu Gly Leu Asn Leu Leu Ser Ala Ser Lys Ala Lys Leu Asp 1265 1270 1275 1280
Val Leu Ala His Val Phe Glu Ser Phe Leu Lys Ile Val Arg Gin Lys 1285 1290 1295
Glu Lys Asn Ile Ser Leu Leu Ile Gin Gin Gly Thr Val Lys Ile Leu 1300 1305 1310
Leu Gly Gly Phe Leu Asn Ile Leu Thr Gin Thr Asn Ser Asp Phe Gin 1315 1320 1325
Ala Cys Gin Arg Val Leu Val Asp Leu Leu Val Ser Leu Met Ser Ser 1330 1335 1340
Arg Thr Cys Ser Glu Asp Leu Thr Leu Leu Trp Arg Ile Phe Leu Glu 1345 1350 1355 1360
Lys Ser Pro Cys Thr Glu Ile Leu Leu Leu Gly Ile His Lys Ile Val 1365 1370 1375
Glu Ser Asp Phe Thr Met Ser Pro Ser Gin Cys Leu Thr Phe Pro Phe 1380 1385 1390
Leu His Thr Pro Ser Leu Ser Asn Gly Val Leu Ser Gin Lys Pro Pro 1395 1400 1405
Gly Ile Leu Asn Ser Lys Ala Leu Gly Leu Leu Arg Arg Ala Arg Ile 1410 1415 1420
Ser Arg Gly Lys Lys Glu Ala Asp Arg Glu Ser Phe Pro Tyr Arg Leu 1425 1430 1435 1440
Leu Ser Ser Trp His Ile Ala Pro Ile His Leu Pro Leu Leu Gly Gin 1445 1450 1455
Asn Cys Trp Pro His Leu Ser Glu Gly Phe Ser Val Ser Leu Val Gly 1460 1465 1470
Leu Met Trp Asn Thr Ser Asn Glu Ser Glu Ser Ala Ala Glu Arg Gly 1475 1480 1485
Lys Arg Val Lys Lys Arg Asn Lys Pro Ser Val Leu Glu Asp Ser Ser 1490 1495 1500
Phe Glu Gly Ala Gly Met Met Ala Gly Ser Asp Leu Tyr Thr Lys Ile 1505 1510 1515 1520
Leu Gin Ile Ala Ala Cys Leu Ser Phe Lys His Ile Trp Gin Tyr Phe 1525 1530 1535
Asn Val Phe Phe Lys Cys Tyr Ser Pro 1540 1545
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CGCAAGGGCT TCTAAGAAGC CATCCCAATG ACCTTTTGGC TTTGAGAAGA GCAGTCCTCA 60
TACCAGAGTG TTTGGGGTTT TGGCCTCTTT CAGTGTTTAT TCATTCTTAC GTGGGAAAGT 120
TGTATTCCGA GGTTTCTGTG GTGCATGAAG CTTTTGCCTT CACCATCTGT TCCCGTGTCT 180
TCCTCGGGTG ACATCAGAGT ACAGCAGTAT TTTCCCTTGC CATCTAATGG GGTTTGGGCT 240
GTTTGACTCA ACCGTGTGTG TTCCTCAATG CCAGGGGAAT AATCCTACCC TAAGTCAGCT 300
GAACAGAAGC CAGGATCTAA CTGCAAACAA GAGACCCAGC TTGCTTAACA GCATGGAAGA 360
GAACCAGTTT CCTTGCAGCT ACCTGGGAAG ACGGTTGCTA ATTAGCCTGC AACAAAGAGT 420
TCCTTGCTCA TCTAAAAGAG GCAATCACCG TTCAGGTGAA GCTTTGTTCT AAGAATATTT 480
GTTTCATCTA GTTTATGAGT CCAAATGATA TAGACTGTAA ATGTCACAGC AGTGGTGAAA 540
GACTGCTCGG TCATGAGCAC CGACAGTAAC TCACTGGCAC GTGAATTTCT GACCGATGTC 600
AACCGGCTTT GCAATGCAGT GGTCCAGAGG GTGGAGGCCA GGGAGGAAGA AGAGGAGGAG 660
ACGCACATGG CAACCCTTGG ACAGTACCTT GTCCATGGTC GAGGATTTCT ATTACTTACC 720
AAGCTAAATT CTATAATTGA TCAGGCATTG ACATGTAGAG AAGAACTCCT GACTCTTCTT 780
CTGTCTCTCC TTCCACTGGT ATGGAAGATA CCTGTCCAAG AAGAAAAGGC AACAGATTTT 840
AACCTACCGC TCTCAGCAGA TATAATCCTG ACCAAAGAAA AGAACTCAAG TTCACAAAGA 900
TCCACTCAGG AAAAATTACA TTTAGAAGGA AGTGCCCTGT CTAGTCAGGT TTCTGCAAAA 960
GTAAATGTTT TTCGAAAAAG CAGACGACAG CGTAAAATTA CCCATCGCTA TTCTGTAAGA 1020
GATGCAAGAA AGACACAGCT CTCCACCTCA GATTCAGAAG CCAATTCAGA TGAAAAAGGC 1080
ATAGCAATGA ATAAGCATAG AAGGCCCCAT CTGCTGCATC ATTTTTTAAC ATCGTTTCCT 1140
AAACAAGACC ACCCCAAAGC TAAACTTGAC CGCTTAGCAA CCAAAGAACA GACTCCTCCA 1200
GATGCTATGG CTTTGGAAAA TTCCAGAGAG ATTATTCCAA GACAGGGGTC AAACACTGAC 1260
ATTTTAAGTG AGCCAGCTGC CTTGTCTGTT ATCAGTAACA TGAACAATTC TCCATTTGAC 1320
TTATGTCATG TTTTGTTATC TTTATTAGAA AAAGTTTGTA AGTTTGACGT TACCTTGAAT 1380
CATAATTCTC CTTTAGCAGC CAGTGTAGTG CCCACACTAA CTGAATTCCT AGCAGGCTTT 1440
GGGGACTGCT GCAGTCTGAG CGACAACTTG GAGAGTCGAG TAGTTTCTGC AGGTTGGACC 1500
GAAGAACCGG TGGCTTTGAT TCAAAGGATG CTCTTTCGAA CAGTGTTGCA TCTTCTGTCA 1560
GTAGATGTTA GTACTGCAGA GATGATGCCA GAAAATCTTA GGAAAAATTT AACTGAATTG 1620
CTTAGAGCAG CTTTAAAAAT TAGAATATGC CTAGAAAAGC AGCCTGACCC TTTTGCACCA 1680
AGACAAAAGA AAACACTGCA GGAGGTTCAG GAAGATTTTG TGTTTTCAAA GTATCGTCAT 1740
AGAGCCCTTC TTTTACCTGA GCTTTTGGAA GGAGTTCTTC AGATTCTGAT CTGTTGTCTT 1800
CAGAGTGCAG CTTCAAATCC CTTCTACTTC AGTCAAGCCA TGGATTTGGT TCAAGAATTC 1860
ATTCAGCATC ATGGATTTAA TTTATTTGAA ACAGCAGTTC TTCAAATGGA ATGGCTGGTT 1920
TTAAGAGATG GAGTTCCTCC CGAGGCCTCA GAGCATTTGA AAGCCCTAAT AAATAGTGTG 1980
ATGAAAATAA TGAGCACTGT CAAAAAAGTG AAATCAGAGC AACTTCATCA TTCGATGTGT 2040
ACAAGAAAAA GGCACAGACG ATGTGAATAT TCTCATTTTA TGCATCATCA CCGAGATCTC 2100
TCAGGTCTTC TGGTTTCGGC TTTTAAAAAC CAGGTTTCCA AAAACCCATT TGAAGAGACT 2160
GCAGATGGAG ATGTTTATTA TCCTGAGCGG TGCTGTTGCA TTGCAGTGTG TGCCCATCAG 2220
TGCTTGCGCT TACTGCAGCA GGCTTCCTTG AGCAGCACTT GTGTCCAGAT CCTATCGGGT 2280
GTTCATAACA TTGGAATATG CTGTTGTATG GATCCCAAAT CTGTAATCAT TCCTTTGCTC 2340
CATGCTTTTA AATTGCCAGC ACTGAAAAAT TTTCAGCAGC ATATATTGAA TATCCTTAAC 2400
AAACTTATTT TGGATCAGTT AGGAGGAGCA GAGATATCAC CAAAAATTAA AAAAGCAGCT 2460
TGTAATATTT GTACTGTTGA CTCTGACCAA CTAGCCCAAT TAGAAGAGAC ACTGCAGGGA 2520
AACTTATGTG ATGCTGAACT CTCCTCAAGT TTATCCAGTC CTTCTTACAG ATTTCAAGGG 2580
ATCCTGCCCA GCAGTGGATC TGAAGATTTG TTGTGGAAAT GGGATGCTTT AAAGGCTTAT 2640
CAGAACTTTG TTTTTGGAGA AGACAGATTA CATAGTATAC AGATTGCAAA TCACATTTGC 2700
AATTTAATCC AGAAAGGCAA TATAGTTGTT CAGTGGAAAT TATATAATTA CATATTTAAT 2760
CCTGTGCTCC AAAGAGGAGT TGAATTAGCA CATCATTGTC AACACCTAAG CGTTACTTCA 2820
GCTCAAAGTC ATGTATGTAG CCATCATAAC CAGTGCTTGC CTCAGGACGT GCTTCAGATT 2880
TATGTAAAAA CTCTGCCTAT CCTGCTTAAA TCCAGGGTAA TAAGAGATTT GTTTTTGAGT 2940
TGTAATGGAG TAAGTCAAAT AATCGAATTA AATTGCTTAA ATGGTATTCG AAGTCATTCT 3000
CTAAAAGCAT TTGAAACTCT GATAATCAGC CTAGGGGAGC AACAGAAAGA TGCCTCAGTT 3060
CCAGATATTG ATGGGATAGA CATTGAACAG AAGGAGTTGT CCTCTGTACA TGTGGGTACT 3120
TCTTTTCATC ATCAGCAAGC TTATTCAGAT TCTCCTCAGA GTCTCAGCAA ATTTTATGCT 3180
GGCCTCAAAG AAGCTTATCC AAAGAGACGG AAGACTGTTA ACCAAGATGT TCATATCAAC 3240
ACAATAAACC TATTCCTCTG TGTGGCTTTT TTATGCGTAA GTAAAGAAGC AGAGTCTGAC 3300
AGGGAGTCGG CCAATGACTC AGAAGATACT TCTGGCTATG ACAGCACAGC CAGCGAGCCT 3360
TTAAGTCATA TGCTGCCATG TATATCTCTC GAGAGCCTTG TCTTGCCTTC TCCTGAACAT 3420
ATGCACCAAG CAGCAGACAT TTGGTCTATG TGTCGTTGGA TCTACATGTT GAGTTCAGTG 3480
TTCCAGAAAC AGTTTTATAG GCTTGGTGGT TTCCGAGTAT GCCATAAGTT AATATTTATG 3540
ATAATACAGA AACTGTTCAG AAGTCACAAA GAGGAGCAAG GAAAAAAGGA GGGAGATACA 3600
AGTGTAAATG AAAACCAGGA TTTAAACAGA ATTTCTCAAC CTAAGAGAAC TATGAAGGAA 3660
GATTTATTAT CTTTGGCTAT AAAAAGTGAC CCCATACCAT CAGAACTAGG TAGTCTAAAA 3720
AAGAGTGCTG ACAGTTTAGG TAAATTAGAG TTACAGCATA TTTCTTCCAT AAATGTGGAA 3780
GAAGTTTCAG CTACTGAAGC CGCTCCCGAG GAAGCAAAGC TATTTACAAG TCAAGAAAGT 3840
GAGACCTCAC TTCAAAGTAT ACGACTTTTG GAAGCCCTTC TGGCCATTTG TCTTCATGGT 3900
GCCAGAACTA GTCAACAGAA GATGGAATTG GAGTTACCTA ATCAGAACTT GTCTGTGGAA 3960
AGTATATTAT TTGAAATGAG GGACCATCTT TCCCAGTCAA AGGTGATTGA AACACAACTA 4020
GCAAAGCCTT TATTTGATGC CCTGCTTCGA GTTGCCCTCG GGAATTATTC AGCAGATTTT 4080
GAACATAATG ATGCTATGAC TGAGAAGAGT CATCAATCTG CAGAAGAATT GTCATCCCAG 4140
CCTGGTGATT TTTCAGAAGA AGCTGAGGAT TCTCAGTGTT GTAGTTTTAA ACTTTTAGTT 4200
GAAGAAGAAG GTTACGAAGC AGATAGTGAA AGCAATCCTG AAGATGGCGA AACCCAGGAT 4260
GATGGGGTAG ACTTAAAGTC TGAAACAGAA GGTTTCAGTG CATCAAGCAG TCCAAATGAC 4320
TTACTCGAAA ACCTCACTCA AGGGGAAATA ATTTATCCTG AGATTTGTAT GCTGGAATTA 4380
AATTTGCTTT CTGCTAGTAA AGCCAAACTT GATGTGCTTG CCCATGTATT TGAGAGTTTT 4440
TTGAAAATTA TTAGGCAGAA AGAAAAGAAT GTTTTTCTGC TCATGCAACA GGGAACTGTG 4500
AAAAATCTTT TAGGAGGGTT CTTGAGTATT TTAACACAGG ATGATTCTGA TTTTCAAGCA 4560
TGCCAGAGAG TATTGGTGGA TCTTTTGGTA TCTTTGATGA GTTCAAGAAC ATGTTCAGAA 4620
GAGCTAACCC TTCTTTTGAG AATATTTCTG GAGAAATCTC CTTGTACAAA AATTCTTCTT 4680
CTGGGTATTC TGAAAATTAT TGAAAGTGAT ACTACTATGA GCCCTTCACA GTATCTAACC 4740
TTCCCTTTAC TGCACGCTCC AAATTTAAGC AACGGTGTTT CATCACAAAA GTATCCTGGG 4800
ATTTTAAACA GTAAGGCCAT GGGTTTATTG AGAAGAGCAC GAGTTTCACG GAGCAAGAAA 4860
GAGGCTGATA GAGAGAGTTT TCCCCATCGG CTGCTTTCAT CTTGGCACAT AGCCCCAGTC 4920
CACCTGCCGT TGCTGGGGCA AAACTGCTGG CCACACCTAT CAGAAGGTTT CAGTGTTTCC 4980
CTGTGGTTTA ATGTGGAGTG TATCCATGAA GCTGAGAGTA CTACAGAAAA AGGAAAGAAG 5040
ATAAAGAAAA GAAACAAATC ATTAATTTTA CCAGATAGCA GTTTTGATGG TACAGAGAGC 5100
GACAGACCAG AAGGTGCAGA GTACATAAAT CCTGGTGAAA GACTCATAGA AGAAGGATGT 5160
ATTCATATAA TTTCACTGGG ATCCAAAGCG TTGATGATCC AAGTGTGGGC TGATCCCCAC 5220
AATGCCACTC TTATCTTTCG TGTGTGCATG GATTCAAATG ATGACATGAA AGCTGTTTTA 5280
CTAGCACAGG TTGAATCACA GGAGAATATT TTCCTCCCAA GCAAATGGCA ACATTTAGTA 5340
CTCACCTACT TACAGCAGCC CCAAGGGAAA AGGAGGATTC ATGGGAAAAT CTCCATATGG 5400
GTCTCTGGAC AGAGGAAGCC TGATGTTACT TTGGATTTTA TGCTTCCAAG AAAAACAAGT 5460
TTGTCATCTG ATAGCAATAA AACATTTTGC ATGATTGGCC ATTGTTTATC ATCCCAAGAA 5520
GAGTTTTTGC AGTTGGCTGG AAAATGGGAC CTGGGAAATT TGCTTCTCTT CAACGGAGCT 5580
AAGGTTGGTT CACAAGAGGC CTTTTATCTG TATGCTTGTG GACCCAACCA TACATCTGTA 5640
ATGCCATGTA AGTATGGCAA GCCAGTCAAT GACTACTCCA AATATATTAA TAAAGAAATT 5700
TTGCGATGTG AACAAATCAG AGAATTTTTT ATGACCAAGA AAGATGTGGA TATTGGTCTC 5760
TTAATTGGAG TCTTTCAGTT GTTTATACAA CTTACTGTCC TGCTCCAGTA TACCATCTAT 5820
GAACCAGTGA TTAGACTTAA AGGTCAAATG AAAACCCAAC TCTCTCAAAG ACCCTTCAGC 5880
TCAAAAGAAG TTCAGAGCAT CTTATTAGAA CCTCATCATC TAAAGAATCT CCAACCTACT 5940
GAATATAAAA CTATTCAAGG CATTCTGCAC GAAATTGGTG GAACTGGCAT ATTTGTTTTT 6000
CTCTTTGCCA GGGTTGTTGA ACTCAGTAGC TGTGAAGAAA CTCAAGCATT AGCACTGCGA 6060
GTTATACTCT CATTAATTAA ATACAACCAA CAAAGAGTAC ATGAATTAGA AAATTGTAAT 6120
GGACTTTCTA TGATTCATCA GGTGTTGATC AAACAAAAAT GCATTGTTGG GTTTTACATT 6180
TTGAAGACCC TTCTTGAAGG ATGCTGTGGT GAAGATATTA TTTATATGAA TGAGAATGGA 6240
GAGTTTAAGT TGGATGTAGA CTCTAATGCT ATAATCCAAG ATGTTAAGCT GTTAGAGGAA 6300
CTATTGCTTG ACTGGAAGAT ATGGAGTAAA GCAGAGCAAG GTGTTTGGGA AACTTTGCTA 6360
GCAGCTCTAG AAGTCCTCAT CAGAGCAGAT CACCACCAGC AGATGTTTAA TATTAAGCAG 6420
TTATTGAAAG CTCAAGTGGT TCATCACTTT CTACTGACTT GTCAGGTTTT GCAGGAATAC 6480
AAAGAGGGGC AACTCACACC CATGCCCCGA GAGATGGCAA GATCTTTCAG GAGAAAGTGC 6540
GGTCAATCAT GTACCTGAGG CATTCCAGCA GTGGAGGAAG GTCCCTTATG AGCCCTGGAT 6600
TTATGGTAAT AAGCCCATCT GGTTTTACTG CTTCACCATA TGAAGGAGAG AATTCCTCTA 6660
ATATTATTCC ACAACAGATG GCCGCCCATA TGCTGCGTTC TAGAAGCCTA CCAGCATTCC 6720
CTACTTCTTC ACTACTAACG CAATCACAAA AACTGACTGG AAGTTTGGGT TGTAGTATCG 6780
ACAGGTTACA AAATATTGCA GATACTTATG TTGCCACCCA ATCAAAGAAA CAAAATTCTT 6840
TGGGGAGTTC CGACACACTG AAAAAAGGCA AAGAGGACGC ATTCATCAGT AGCTGTGAGT 6900
CTGCAAAAAC TGTTTGTGAA ATGGAAGCTG TCCTCTCAGC CCAGGTCTCT GTCAGTGATG 6960
TCCCAAAGGG AGTGCTGGGA TTTCCAGTGG TCAAAGCAGA TCATAAACAG TTGGGAGCAG 7020
AACCCAGGTC AGAAGATGAC AGTCCTGGGG ATGAGTCCTG CCCACGCCGA GCCCTATGCA 7080
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2001 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Met Ser Thr Asp Ser Asn Ser Leu Ala Arg Glu Phe Leu Thr Asp Val 1 5 10 15
Asn Arg Leu Cys Asn Ala Val Val Gin Arg Val Glu Ala Arg Glu Glu 20 25 30
Glu Glu Glu Glu Thr His Met Ala Thr Leu Gly Gin Tyr Leu Val His 35 40 45
Gly Arg Gly Phe Leu Leu Leu Thr Lys Leu Asn Ser Ile Ile Asp Gin 50 55 60
Ala Leu Thr Cys Arg Glu Glu Leu Leu Thr Leu Leu Leu Ser Leu Leu 65 70 75 80
Pro Leu Val Trp Lys Ile Pro Val Gin Glu Glu Lys Ala Thr Asp Phe 85 90 95
Asn Leu Pro Leu Ser Ala Asp Ile Ile Leu Thr Lys Glu Lys Asn Ser 100 105 110
Ser Ser Gin Arg Ser Thr Gin Glu Lys Leu His Leu Glu Gly Ser Ala 115 120 125
Leu Ser Ser Gin Val Ser Ala Lys Val Asn Val Phe Arg Lys Ser Arg 130 135 140
Arg Gin Arg Lys Ile Thr His Arg Tyr Ser Val Arg Asp Ala Arg Lys 145 150 155 160
Thr Gin Leu Ser Thr Ser Asp Ser Glu Ala Asn Ser Asp Glu Lys Gly 165 170 175
Ile Ala Met Asn Lys His Arg Arg Pro His Leu Leu His His Phe Leu 180 185 190
Thr Ser Phe Pro Lys Gin Asp His Pro Lys Ala Lys Leu Asp Arg Leu 195 200 205
Ala Thr Lys Glu Gin Thr Pro Pro Asp Ala Met Ala Leu Glu Asn Ser 210 215 220
Arg Glu Ile Ile Pro Arg Gln Gly Ser Asn Thr Asp Ile Leu Ser Glu 225 230 235 240
Pro Ala Ala Leu Ser Val Ile Ser Asn Met Asn Asn Ser Pro Phe Asp 245 250 255
Leu Cys His Val Leu Leu Ser Leu Leu Glu Lys Val Cys Lys Phe Asp 260 265 270
Val Thr Leu Asn His Asn Ser Pro Leu Ala Ala Ser Val Val Pro Thr 275 280 285
Leu Thr Glu Phe Leu Ala Gly Phe Gly Asp Cys Cys Ser Leu Ser Asp 290 295 300
Asn Leu Glu Ser Arg Val Val Ser Ala Gly Trp Thr Glu Glu Pro Val 305 310 315 320
Ala Leu Ile Gin Arg Met Leu Phe Arg Thr Val Leu His Leu Leu Ser 325 330 335
Val Asp Val Ser Thr Ala Glu Met Met Pro Glu Asn Leu Arg Lys Asn 340 345 350
Leu Thr Glu Leu Leu Arg Ala Ala Leu Lys Ile Arg Ile Cys Leu Glu 355 360 365
Lys Gin Pro Asp Pro Phe Ala Pro Arg Gin Lys Lys Thr Leu Gin Glu 370 375 380
Val Gin Glu Asp Phe Val Phe Ser Lys Tyr Arg His Arg Ala Leu Leu 385 390 395 400
Leu Pro Glu Leu Leu Glu Gly Val Leu Gln Ile Leu Ile Cys Cys Leu 405 410 415
Gin Ser Ala Ala Ser Asn Pro Phe Tyr Phe Ser Gin Ala Met Asp Leu 420 425 430
Val Gin Glu Phe Ile Gin His His Gly Phe Asn Leu Phe Glu Thr Ala 435 440 445
Val Leu Gin Met Glu Trp Leu Val Leu Arg Asp Gly Val Pro Pro Glu 450 455 460
Ala Ser Glu His Leu Lys Ala Leu Ile Asn Ser Val Met Lys Ile Met 465 470 475 480
Ser Thr Val Lys Lys Val Lys Ser Glu Gin Leu His His Ser Met Cys 485 490 495
Thr Arg Lys Arg His Arg Arg Cys Glu Tyr Ser His Phe Met His His 500 505 510
His Arg Asp Leu Ser Gly Leu Leu Val Ser Ala Phe Lys Asn Gin Val 515 520 525
Ser Lys Asn Pro Phe Glu Glu Thr Ala Asp Gly Asp Val Tyr Tyr Pro 530 535 540
Glu Arg Cys Cys Cys Ile Ala Val Cys Ala His Gin Cys Leu Arg Leu 545 550 555 560
Leu Gln Gln Ala Ser Leu Ser Ser Thr Cys Val Gln Ile Leu Ser Gly 565 570 575
Val His Asn Ile Gly Ile Cys Cys Cys Met Asp Pro Lys Ser Val Ile 580 585 590
Ile Pro Leu Leu His Ala Phe Lys Leu Pro Ala Leu Lys Asn Phe Gin 595 600 605
Gin His Ile Leu Asn Ile Leu Asn Lys Leu Ile Leu Asp Gin Leu Gly 610 615 620
Gly Ala Glu Ile Ser Pro Lys Ile Lys Lys Ala Ala Cys Asn Ile Cys 625 630 635 640
Thr Val Asp Ser Asp Gin Leu Ala Gin Leu Glu Glu Thr Leu Gin Gly 645 650 655
Asn Leu Cys Asp Ala Glu Leu Ser Ser Ser Leu Ser Ser Pro Ser Tyr 660 665 670
Arg Phe Gin Gly Ile Leu Pro Ser Ser Gly Ser Glu Asp Leu Leu Trp 675 680 685
Lys Trp Asp Ala Leu Lys Ala Tyr Gin Asn Phe Val Phe Gly Glu Asp 690 695 700
Arg Leu His Ser Ile Gin Ile Ala Asn His Ile Cys Asn Leu Ile Gin 705 710 715 720
Lys Gly Asn Ile Val Val Gin Trp Lys Leu Tyr Asn Tyr Ile Phe Asn 725 730 735
Pro Val Leu Gin Arg Gly Val Glu Leu Ala His His Cys Gin His Leu 740 745 750
Ser Val Thr Ser Ala Gin Ser His Val Cys Ser His His Asn Gin Cys 755 760 765
Leu Pro Gin Asp Val Leu Gin Ile Tyr Val Lys Thr Leu Pro Ile Leu 770 775 780
Leu Lys Ser Arg Val Ile Arg Asp Leu Phe Leu Ser Cys Asn Gly Val 785 790 795 800
Ser Gin Ile Ile Glu Leu Asn Cys Leu Asn Gly Ile Arg Ser His Ser 805 810 815
Leu Lys Ala Phe Glu Thr Leu Ile Ile Ser Leu Gly Glu Gin Gin Lys 820 825 830
Asp Ala Ser Val Pro Asp Ile Asp Gly Ile Asp Ile Glu Gin Lys Glu 835 840 845
Leu Ser Ser Val His Val Gly Thr Ser Phe His His Gin Gin Ala Tyr 850 855 860
Ser Asp Ser Pro Gin Ser Leu Ser Lys Phe Tyr Ala Gly Leu Lys Glu 865 870 875 880
Ala Tyr Pro Lys Arg Arg Lys Thr Val Asn Gin Asp Val His Ile Asn 885 890 895
Thr Ile Asn Leu Phe Leu Cys Val Ala Phe Leu Cys Val Ser Lys Glu 900 905 910
Ala Glu Ser Asp Arg Glu Ser Ala Asn Asp Ser Glu Asp Thr Ser Gly 915 920 925
Tyr Asp Ser Thr Ala Ser Glu Pro Leu Ser His Met Leu Pro Cys Ile 930 935 940
Ser Leu Glu Ser Leu Val Leu Pro Ser Pro Glu His Met His Gin Ala 945 950 955 960
Ala Asp Ile Trp Ser Met Cys Arg Trp Ile Tyr Met Leu Ser Ser Val 965 970 975
Phe Gin Lys Gin Phe Tyr Arg Leu Gly Gly Phe Arg Val Cys His Lys 980 985 990
Leu Ile Phe Met Ile Ile Gin Lys Leu Phe Arg Ser His Lys Glu Glu 995 1000 1005
Gin Gly Lys Lys Glu Gly Asp Thr Ser Val Asn Glu Asn Gin Asp Leu 1010 1015 1020
Asn Arg Ile Ser Gin Pro Lys Arg Thr Met Lys Glu Asp Leu Leu Ser 1025 1030 1035 1040
Leu Ala Ile Lys Ser Asp Pro Ile Pro Ser Glu Leu Gly Ser Leu Lys 1045 1050 1055
Lys Ser Ala Asp Ser Leu Gly Lys Leu Glu Leu Gin His Ile Ser Ser 1060 1065 1070
Ile Asn Val Glu Glu Val Ser Ala Thr Glu Ala Ala Pro Glu Glu Ala 1075 1080 1085
Lys Leu Phe Thr Ser Gin Glu Ser Glu Thr Ser Leu Gin Ser Ile Arg 1090 1095 1100
Leu Leu Glu Ala Leu Leu Ala Ile Cys Leu His Gly Ala Arg Thr Ser 1105 1110 1115 1120
Gin Gin Lys Met Glu Leu Glu Leu Pro Asn Gin Asn Leu Ser Val Glu 1125 1130 1135
Ser Ile Leu Phe Glu Met Arg Asp His Leu Ser Gin Ser Lys Val Ile 1140 1145 1150
Glu Thr Gin Leu Ala Lys Pro Leu Phe Asp Ala Leu Leu Arg Val Ala 1155 1160 1165
Leu Gly Asn Tyr Ser Ala Asp Phe Glu His Asn Asp Ala Met Thr Glu 1170 1175 1180
Lys Ser His Gin Ser Ala Glu Glu Leu Ser Ser Gin Pro Gly Asp Phe 1185 1190 1195 1200
Ser Glu Glu Ala Glu Asp Ser Gin Cys Cys Ser Phe Lys Leu Leu Val 1205 1210 1215
Glu Glu Glu Gly Tyr Glu Ala Asp Ser Glu Ser Asn Pro Glu Asp Gly 1220 1225 1230
Glu Thr Gin Asp Asp Gly Val Asp Leu Lys Ser Glu Thr Glu Gly Phe 1235 1240 1245
Ser Ala Ser Ser Ser Pro Asn Asp Leu Leu Glu Asn Leu Thr Gin Gly 1250 1255 1260
Glu Ile Ile Tyr Pro Glu Ile Cys Met Leu Glu Leu Asn Leu Leu Ser 1265 1270 1275 1280
Ala Ser Lys Ala Lys Leu Asp Val Leu Ala His Val Phe Glu Ser Phe 1285 1290 1295
Leu Lys Ile Ile Arg Gin Lys Glu Lys Asn Val Phe Leu Leu Met Gin 1300 1305 1310
Gin Gly Thr Val Lys Asn Leu Leu Gly Gly Phe Leu Ser Ile Leu Thr 1315 1320 1325
Gin Asp Asp Ser Asp Phe Gin Ala Cys Gin Arg Val Leu Val Asp Leu 1330 1335 1340
Leu Val Ser Leu Met Ser Ser Arg Thr Cys Ser Glu Glu Leu Thr Leu 1345 1350 1355 1360
Leu Leu Arg Ile Phe Leu Glu Lys Ser Pro Cys Thr Lys Ile Leu Leu 1365 1370 1375
Leu Gly Ile Leu Lys Ile Ile Glu Ser Asp Thr Thr Met Ser Pro Ser 1380 1385 1390
Gin Tyr Leu Thr Phe Pro Leu Leu His Ala Pro Asn Leu Ser Asn Gly 1395 1400 1405
Val Ser Ser Gin Lys Tyr Pro Gly Ile Leu Asn Ser Lys Ala Met Gly 1410 1415 1420
Leu Leu Arg Arg Ala Arg Val Ser Arg Ser Lys Lys Glu Ala Asp Arg 1425 1430 1435 1440
Glu Ser Phe Pro His Arg Leu Leu Ser Ser Trp His Ile Ala Pro Val 1445 1450 1455
His Leu Pro Leu Leu Gly Gin Asn Cys Trp Pro His Leu Ser Glu Gly 1460 1465 1470
Phe Ser Val Ser Leu Trp Phe Asn Val Glu Cys Ile His Glu Ala Glu 1475 1480 1485
Ser Thr Thr Glu Lys Gly Lys Lys Ile Lys Lys Arg Asn Lys Ser Leu 1490 1495 1500
Ile Leu Pro Asp Ser Ser Phe Asp Gly Thr Glu Ser Asp Arg Pro Glu 1505 1510 1515 1520
Gly Ala Glu Tyr Ile Asn Pro Gly Glu Arg Leu Ile Glu Glu Gly Cys 1525 1530 1535
Ile His Ile Ile Ser Leu Gly Ser Lys Ala Leu Met Ile Gin Val Trp 1540 1545 1550
Ala Asp Pro His Asn Ala Thr Leu Ile Phe Arg Val Cys Met Asp Ser 1555 1560 1565
Asn Asp Asp Met Lys Ala Val Leu Leu Ala Gin Val Glu Ser Gin Glu 1570 1575 1580
Asn Ile Phe Leu Pro Ser Lys Trp Gin His Leu Val Leu Thr Tyr Leu 1585 1590 1595 1600
Gln Gln Pro Gln Gly Lys Arg Arg Ile His Gly Lys Ile Ser Ile Trp 1605 1610 1615
Val Ser Gly Gln Arg Lys Pro Asp Val Thr Leu Asp Phe Met Leu Pro 1620 1625 1630
Arg Lys Thr Ser Leu Ser Ser Asp Ser Asn Lys Thr Phe Cys Met Ile 1635 1640 1645
Gly His Cys Leu Ser Ser Gin Glu Glu Phe Leu Gin Leu Ala Gly Lys 1650 1655 1660
Trp Asp Leu Gly Asn Leu Leu Leu Phe Asn Gly Ala Lys Val Gly Ser 1665 1670 1675 1680
Gin Glu Ala Phe Tyr Leu Tyr Ala Cys Gly Pro Asn His Thr Ser Val 1685 1690 1695
Met Pro Cys Lys Tyr Gly Lys Pro Val Asn Asp Tyr Ser Lys Tyr Ile 1700 1705 1710
Asn Lys Glu Ile Leu Arg Cys Glu Gin Ile Arg Glu Phe Phe Met Thr 1715 1720 1725
Lys Lys Asp Val Asp Ile Gly Leu Leu Ile Gly Val Phe Gin Leu Phe 1730 1735 1740
Ile Gin Leu Thr Val Leu Leu Gin Tyr Thr Ile Tyr Glu Pro Val Ile 1745 1750 1755 1760
Arg Leu Lys Gly Gin Met Lys Thr Gin Leu Ser Gin Arg Pro Phe Ser 1765 1770 1775
Ser Lys Glu Val Gin Ser Ile Leu Leu Glu Pro His His Leu Lys Asn 1780 1785 1790
Leu Gin Pro Thr Glu Tyr Lys Thr Ile Gin Gly Ile Leu His Glu Ile 1795 1800 1805
Gly Gly Thr Gly Ile Phe Val Phe Leu Phe Ala Arg Val Val Glu Leu 1810 1815 1820
Ser Ser Cys Glu Glu Thr Gin Ala Leu Ala Leu Arg Val Ile Leu Ser 1825 1830 1835 1840
Leu Ile Lys Tyr Asn Gin Gin Arg Val His Glu Leu Glu Asn Cys Asn 1845 1850 1855
Gly Leu Ser Met Ile His Gln Val Leu Ile Lys Gln Lys Cys Ile Val 1860 1865 1870
Gly Phe Tyr Ile Leu Lys Thr Leu Leu Glu Gly Cys Cys Gly Glu Asp 1875 1880 1885
Ile Ile Tyr Met Asn Glu Asn Gly Glu Phe Lys Leu Asp Val Asp Ser 1890 1895 1900
Asn Ala Ile Ile Gin Asp Val Lys Leu Leu Glu Glu Leu Leu Leu Asp 1905 1910 1915 1920
Trp Lys Ile Trp Ser Lys Ala Glu Gin Gly Val Trp Glu Thr Leu Leu 1925 1930 1935
Ala Ala Leu Glu Val Leu Ile Arg Ala Asp His His Gin Gin Met Phe 1940 1945 1950
Asn Ile Lys Gln Leu Leu Lys Ala Gln Val Val His His Phe Leu Leu 1955 1960 1965
Thr Cys Gln Val Leu Gln Glu Tyr Lys Glu Gly Gln Leu Thr Pro Met 1970 1975 1980
Pro Arg Glu Met Ala Arg Ser Phe Arg Arg Lys Cys Gly Gin Ser Cys 1985 1990 1995 2000
Thr
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5221 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
CGCAAGGGCT TCTAAGAAGC CATCCCAATG ACCTTTTGGC TTTGAGAAGA GCAGTCCTCA 60
TACCAGAGTG TTTGGGGTTT TGGCCTCTTT CAGTGTTTAT TCATTCTTAC GTGGGAAAGT 120
TGTATTCCGA GGTTTCTGTG GTGCATGAAG CTTTTGCCTT CACCATCTGT TCCCGTGTCT 180
TCCTCGGGTG ACATCAGAGT ACAGCAGTAT TTTCCCTTGC CATCTAATGG GGTTTGGGCT 240
GTTTGACTCA ACCGTGTGTG TTCCTCAATG CCAGGGGAAT AATCCTACCC TAAGTCAGCT 300
GAACAGAAGC CAGGATCTAA CTGCAAACAA GAGACCCAGC TTGCTTAACA GCATGGAAGA 360
GAACCAGTTT CCTTGCAGCT ACCTGGGAAG ACGGTTGCTA ATTAGCCTGC AACAAAGAGT 420
TCCTTGCTCA TCTAAAAGAG GCAATCACCG TTCAGGTGAA GCTTTGTTCT AAGAATATTT 480
GTTTCATCTA GTTTATGAGT CCAAATGATA TAGACTGTAA ATGTCACAGC AGTGGTGAAA 540
GACTGCTCGG TCATGAGCAC CGACAGTAAC TCACTGGCAC GTGAATTTCT GACCGATGTC 600
AACCGGCTTT GCAATGCAGT GGTCCAGAGG GTGGAGGCCA GGGAGGAAGA AGAGGAGGAG 660
ACGCACATGG CAACCCTTGG ACAGTACCTT GTCCATGGTC GAGGATTTCT ATTACTTACC 720
AAGCTAAATT CTATAATTGA TCAGGCATTG ACATGTAGAG AAGAACTCCT GACTCTTCTT 780
CTGTCTCTCC TTCCACTGGT ATGGAAGATA CCTGTCCAAG AAGAAAAGGC AACAGATTTT 840
AACCTACCGC TCTCAGCAGA TATAATCCTG ACCAAAGAAA AGAACTCAAG TTCACAAAGA 900
TCCACTCAGG AAAAATTACA TTTAGAAGGA AGTGCCCTGT CTAGTCAGGT TTCTGCAAAA 960
GTAAATGTTT TTCGAAAAAG CAGACGACAG CGTAAAATTA CCCATCGCTA TTCTGTAAGA 1020
GATGCAAGAA AGACACAGCT CTCCACCTCA GATTCAGAAG CCAATTCAGA TGAAAAAGGC 1080
ATAGCAATGA ATAAGCATAG AAGGCCCCAT CTGCTGCATC ATTTTTTAAC ATCGTTTCCT 1140
AAACAAGACC ACCCCAAAGC TAAACTTGAC CGCTTAGCAA CCAAAGAACA GACTCCTCCA 1200
GATGCTATGG CTTTGGAAAA TTCCAGAGAG ATTATTCCAA GACAGGGGTC AAACACTGAC 1260
ATTTTAAGTG AGCCAGCTGC CTTGTCTGTT ATCAGTAACA TGAACAATTC TCCATTTGAC 1320
TTATGTCATG TTTTGTTATC TTTATTAGAA AAAGTTTGTA AGTTTGACGT TACCTTGAAT 1380
CATAATTCTC CTTTAGCAGC CAGTGTAGTG CCCACACTAA CTGAATTCCT AGCAGGCTTT 1440
GGGGACTGCT GCAGTCTGAG CGACAACTTG GAGAGTCGAG TAGTTTCTGC AGGTTGGACC 1500
GAAGAACCGG TGGCTTTGAT TCAAAGGATG CTCTTTCGAA CAGTGTTGCA TCTTCTGTCA 1560
GTAGATGTTA GTACTGCAGA GATGATGCCA GAAAATCTTA GGAAAAATTT AACTGAATTG 1620
CTTAGAGCAG CTTTAAAAAT TAGAATATGC CTAGAAAAGC AGCCTGACCC TTTTGCACCA 1680
AGACAAAAGA AAACACTGCA GGAGGTTCAG GAAGATTTTG TGTTTTCAAA GTATCGTCAT 1740
AGAGCCCTTC TTTTACCTGA GCTTTTGGAA GGAGTTCTTC AGATTCTGAT CTGTTGTCTT 1800
CAGAGTGCAG CTTCAAATCC CTTCTACTTC AGTCAAGCCA TGGATTTGGT TCAAGAATTC 1860
ATTCAGCATC ATGGATTTAA TTTATTTGAA ACAGCAGTTC TTCAAATGGA ATGGCTGGTT 1920
TTAAGAGATG GAGTTCCTCC CGAGGCCTCA GAGCATTTGA AAGCCCTAAT AAATAGTGTG 1980
ATGAAAATAA TGAGCACTGT CAAAAAAGTG AAATCAGAGC AACTTCATCA TTCGATGTGT 2040
ACAAGAAAAA GGCACAGACG ATGTGAATAT TCTCATTTTA TGCATCATCA CCGAGATCTC 2100
TCAGGTCTTC TGGTTTCGGC TTTTAAAAAC CAGGTTTCCA AAAACCCATT TGAAGAGACT 2160
GCAGATGGAG ATGTTTATTA TCCTGAGCGG TGCTGTTGCA TTGCAGTGTG TGCCCATCAG 2220
TGCTTGCGCT TACTGCAGCA GGCTTCCTTG AGCAGCACTT GTGTCCAGAT CCTATCGGGT 2280
GTTCATAACA TTGGAATATG CTGTTGTATG GATCCCAAAT CTGTAATCAT TCCTTTGCTC 2340
CATGCTTTTA AATTGCCAGC ACTGAAAAAT TTTCAGCAGC ATATATTGAA TATCCTTAAC 2400
AAACTTATTT TGGATCAGTT AGGAGGAGCA GAGATATCAC CAAAAATTAA AAAAGCAGCT 2460
TGTAATATTT GTACTGTTGA CTCTGACCAA CTAGCCCAAT TAGAAGAGAC ACTGCAGGGA 2520
AACTTATGTG ATGCTGAACT CTCCTCAAGT TTATCCAGTC CTTCTTACAG ATTTCAAGGG 2580
ATCCTGCCCA GCAGTGGATC TGAAGATTTG TTGTGGAAAT GGGATGCTTT AAAGGCTTAT 2640
CAGAACTTTG TTTTTGGAGA AGACAGATTA CATAGTATAC AGATTGCAAA TCACATTTGC 2700
AATTTAATCC AGAAAGGCAA TATAGTTGTT CAGTGGAAAT TATATAATTA CATATTTAAT 2760
CCTGTGCTCC AAAGAGGAGT TGAATTAGCA CATCATTGTC AACACCTAAG CGTTACTTCA 2820
GCTCAAAGTC ATGTATGTAG CCATCATAAC CAGTGCTTGC CTCAGGACGT GCTTCAGATT 2880
TATGTAAAAA CTCTGCCTAT CCTGCTTAAA TCCAGGGTAA TAAGAGATTT GTTTTTGAGT 2940
TGTAATGGAG TAAGTCAAAT AATCGAATTA AATTGCTTAA ATGGTATTCG AAGTCATTCT 3000
CTAAAAGCAT TTGAAACTCT GATAATCAGC CTAGGGGAGC AACAGAAAGA TGCCTCAGTT 3060
CCAGATATTG ATGGGATAGA CATTGAACAG AAGGAGTTGT CCTCTGTACA TGTGGGTACT 3120
TCTTTTCATC ATCAGCAAGC TTATTCAGAT TCTCCTCAGA GTCTCAGCAA ATTTTATGCT 3180
GGCCTCAAAG AAGCTTATCC AAAGAGACGG AAGACTGTTA ACCAAGATGT TCATATCAAC 3240
ACAATAAACC TATTCCTCTG TGTGGCTTTT TTATGCGTAA GTAAAGAAGC AGAGTCTGAC 3300
AGGGAGTCGG CCAATGACTC AGAAGATACT TCTGGCTATG ACAGCACAGC CAGCGAGCCT 3360
TTAAGTCATA TGCTGCCATG TATATCTCTC GAGAGCCTTG TCTTGCCTTC TCCTGAACAT 3420
ATGCACCAAG CAGCAGACAT TTGGTCTATG TGTCGTTGGA TCTACATGTT GAGTTCAGTG 3480
TTCCAGAAAC AGTTTTATAG GCTTGGTGGT TTCCGAGTAT GCCATAAGTT AATATTTATG 3540
ATAATACAGA AACTGTTCAG AAGTCACAAA GAGGAGCAAG GAAAAAAGGA GGGAGATACA 3600
AGTGTAAATG AAAACCAGGA TTTAAACAGA ATTTCTCAAC CTAAGAGAAC TATGAAGGAA 3660
GATTTATTAT CTTTGGCTAT AAAAAGTGAC CCCATACCAT CAGAACTAGG TAGTCTAAAA 3720
AAGAGTGCTG ACAGTTTAGG TAAATTAGAG TTACAGCATA TTTCTTCCAT AAATGTGGAA 3780
GAAGTTTCAG CTACTGAAGC CGCTCCCGAG GAAGCAAAGC TATTTACAAG TCAAGAAAGT 3840
GAGACCTCAC TTCAAAGTAT ACGACTTTTG GAAGCCCTTC TGGCCATTTG TCTTCATGGT 3900
GCCAGAACTA GTCAACAGAA GATGGAATTG GAGTTACCTA ATCAGAACTT GTCTGTGGAA 3960
AGTATATTAT TTGAAATGAG GGACCATCTT TCCCAGTCAA AGGTGATTGA AACACAACTA 4020
GCAAAGCCTT TATTTGATGC CCTGCTTCGA GTTGCCCTCG GGAATTATTC AGCAGATTTT 4080
GAACATAATG ATGCTATGAC TGAGAAGAGT CATCAATCTG CAGAAGAATT GTCATCCCAG 4140
CCTGGTGATT TTTCAGAAGA AGCTGAGGAT TCTCAGTGTT GTAGTTTTAA ACTTTTAGTT 4200
GAAGAAGAAG GTTACGAAGC AGATAGTGAA AGCAATCCTG AAGATGGCGA AACCCAGGAT 4260
GATGGGGTAG ACTTAAAGTC TGAAACAGAA GGTTTCAGTG CATCAAGCAG TCCAAATGAC 4320
TTACTCGAAA ACCTCACTCA AGGGGAAATA ATTTATCCTG AGATTTGTAT GCTGGAATTA 4380
AATTTGCTTT CTGCTAGTAA AGCCAAACTT GATGTGCTTG CCCATGTATT TGAGAGTTTT 4440
TTGAAAATTA TTAGGCAGAA AGAAAAGAAT GTTTTTCTGC TCATGCAACA GGGAACTGTG 4500
AAAAATCTTT TAGGAGGGTT CTTGAGTATT TTAACACAGG ATGATTCTGA TTTTCAAGCA 4560
TGCCAGAGAG TATTGGTGGA TCTTTTGGTA TCTTTGATGA GTTCAAGAAC ATGTTCAGAA 4620
GAGCTAACCC TTCTTTTGAG AATATTTCTG GAGAAATCTC CTTGTACAAA AATTCTTCTT 4680
CTGGGTATTC TGAAAATTAT TGAAAGTGAT ACTACTATGA GCCCTTCACA GTATCTAACC 4740
TTCCCTTTAC TGCACGCTCC AAATTTAAGC AACGGTGTTT CATCACAAAA GTATCCTGGG 4800
ATTTTAAACA GTAAGGCCAT GGGTTTATTG AGAAGAGCAC GAGTTTCACG GAGCAAGAAA 4860
GAGGCTGATA GAGAGAGTTT TCCCCATCGG CTGCTTTCAT CTTGGCACAT AGCCCCAGTC 4920
CACCTGCCGT TGCTGGGGCA AAACTGCTGG CCACACCTAT CAGAAGGTTT CAGTGTTTCC 4980
CTGTGGTTTA ATGTGGAGTG TATCCATGAA GCTGAGAGTA CTACAGAAAA AGGAAAGAAG 5040
ATAAAGAAAA GAAACAAATC ATTAATTTTA CCAGATAGCA GTTTTGATGG TACAGGTATG 5100
ATGACAGGAT TATCTGATTT GTACACAAAG ATTGTTTTCA GACTATAATT TTCCTTGAGC 5160
CGTAAAAATG TGGTAGTGTT CTTAACACTC TTAACATGTT ATTCACCTTA AAGATCCTAC 5220 5221
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1531 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Ser Thr Asp Ser Asn Ser Leu Ala Arg Glu Phe Leu Thr Asp Val 1 5 10 15
Asn Arg Leu Cys Asn Ala Val Val Gin Arg Val Glu Ala Arg Glu Glu 20 25 30
Glu Glu Glu Glu Thr His Met Ala Thr Leu Gly Gln Tyr Leu Val His 35 40 45
Gly Arg Gly Phe Leu Leu Leu Thr Lys Leu Asn Ser Ile Ile Asp Gin 50 55 60
Ala Leu Thr Cys Arg Glu Glu Leu Leu Thr Leu Leu Leu Ser Leu Leu 65 70 75 80
Pro Leu Val Trp Lys Ile Pro Val Gin Glu Glu Lys Ala Thr Asp Phe 85 90 95
Asn Leu Pro Leu Ser Ala Asp Ile Ile Leu Thr Lys Glu Lys Asn Ser 100 105 110
Ser Ser Gin Arg Ser Thr Gin Glu Lys Leu His Leu Glu Gly Ser Ala 115 120 125
Leu Ser Ser Gin Val Ser Ala Lys Val Asn Val Phe Arg Lys Ser Arg 130 135 140
Arg Gin Arg Lys Ile Thr His Arg Tyr Ser Val Arg Asp Ala Arg Lys 145 150 155 160
Thr Gin Leu Ser Thr Ser Asp Ser Glu Ala Asn Ser Asp Glu Lys Gly 165 170 175
Ile Ala Met Asn Lys His Arg Arg Pro His Leu Leu His His Phe Leu 180 185 190
Thr Ser Phe Pro Lys Gln Asp His Pro Lys Ala Lys Leu Asp Arg Leu 195 200 205
Ala Thr Lys Glu Gln Thr Pro Pro Asp Ala Met Ala Leu Glu Asn Ser 210 215 220
Arg Glu Ile Ile Pro Arg Gln Gly Ser Asn Thr Asp Ile Leu Ser Glu 225 230 235 240
Pro Ala Ala Leu Ser Val Ile Ser Asn Met Asn Asn Ser Pro Phe Asp 245 250 255
Leu Cys His Val Leu Leu Ser Leu Leu Glu Lys Val Cys Lys Phe Asp 260 265 270
Val Thr Leu Asn His Asn Ser Pro Leu Ala Ala Ser Val Val Pro Thr 275 280 285
Leu Thr Glu Phe Leu Ala Gly Phe Gly Asp Cys Cys Ser Leu Ser Asp 290 295 300
Asn Leu Glu Ser Arg Val Val Ser Ala Gly Trp Thr Glu Glu Pro Val 305 310 315 320
Ala Leu Ile Gln Arg Met Leu Phe Arg Thr Val Leu His Leu Leu Ser 325 330 335
Val Asp Val Ser Thr Ala Glu Met Met Pro Glu Asn Leu Arg Lys Asn 340 345 350
Leu Thr Glu Leu Leu Arg Ala Ala Leu Lys Ile Arg Ile Cys Leu Glu 355 360 365
Lys Gln Pro Asp Pro Phe Ala Pro Arg Gln Lys Lys Thr Leu Gln Glu 370 375 380
Val Gln Glu Asp Phe Val Phe Ser Lys Tyr Arg His Arg Ala Leu Leu 385 390 395 400
Leu Pro Glu Leu Leu Glu Gly Val Leu Gln Ile Leu Ile Cys Cys Leu 405 410 415
Gln Ser Ala Ala Ser Asn Pro Phe Tyr Phe Ser Gln Ala Met Asp Leu 420 425 430
Val Gln Glu Phe Ile Gln His His Gly Phe Asn Leu Phe Glu Thr Ala 435 440 445
Val Leu Gln Met Glu Trp Leu Val Leu Arg Asp Gly Val Pro Pro Glu 450 455 460
Ala Ser Glu His Leu Lys Ala Leu Ile Asn Ser Val Met Lys Ile Met 465 470 475 480
Ser Thr Val Lys Lys Val Lys Ser Glu Gln Leu His His Ser Met Cys 485 490 495
Thr Arg Lys Arg His Arg Arg Cys Glu Tyr Ser His Phe Met His His 500 505 510
His Arg Asp Leu Ser Gly Leu Leu Val Ser Ala Phe Lys Asn Gln Val 515 520 525
Ser Lys Asn Pro Phe Glu Glu Thr Ala Asp Gly Asp Val Tyr Tyr Pro 530 535 540
Glu Arg Cys Cys Cys Ile Ala Val Cys Ala His Gln Cys Leu Arg Leu 545 550 555 560
Leu Gln Gln Ala Ser Leu Ser Ser Thr Cys Val Gln Ile Leu Ser Gly 565 570 575
Val His Asn Ile Gly Ile Cys Cys Cys Met Asp Pro Lys Ser Val Ile 580 585 590
Ile Pro Leu Leu His Ala Phe Lys Leu Pro Ala Leu Lys Asn Phe Gln 595 600 605
Gln His Ile Leu Asn Ile Leu Asn Lys Leu Ile Leu Asp Gln Leu Gly 610 615 620
Gly Ala Glu Ile Ser Pro Lys Ile Lys Lys Ala Ala Cys Asn Ile Cys 625 630 635 640
Thr Val Asp Ser Asp Gln Leu Ala Gln Leu Glu Glu Thr Leu Gln Gly 645 650 655
Asn Leu Cys Asp Ala Glu Leu Ser Ser Ser Leu Ser Ser Pro Ser Tyr 660 665 670
Arg Phe Gln Gly Ile Leu Pro Ser Ser Gly Ser Glu Asp Leu Leu Trp 675 680 685
Lys Trp Asp Ala Leu Lys Ala Tyr Gln Asn Phe Val Phe Gly Glu Asp 690 695 700
Arg Leu His Ser Ile Gln Ile Ala Asn His Ile Cys Asn Leu Ile Gln 705 710 715 720
Lys Gly Asn Ile Val Val Gln Trp Lys Leu Tyr Asn Tyr Ile Phe Asn 725 730 735
Pro Val Leu Gln Arg Gly Val Glu Leu Ala His His Cys Gln His Leu 740 745 750
Ser Val Thr Ser Ala Gln Ser His Val Cys Ser His His Asn Gln Cys 755 760 765
Leu Pro Gln Asp Val Leu Gln Ile Tyr Val Lys Thr Leu Pro Ile Leu 770 775 780
Leu Lys Ser Arg Val Ile Arg Asp Leu Phe Leu Ser Cys Asn Gly Val 785 790 795 800
Ser Gln Ile Ile Glu Leu Asn Cys Leu Asn Gly Ile Arg Ser His Ser 805 810 815
Leu Lys Ala Phe Glu Thr Leu Ile Ile Ser Leu Gly Glu Gln Gln Lys 820 825 830
Asp Ala Ser Val Pro Asp Ile Asp Gly Ile Asp Ile Glu Gln Lys Glu 835 840 845
Leu Ser Ser Val His Val Gly Thr Ser Phe His His Gln Gln Ala Tyr 850 855 860
Ser Asp Ser Pro Gln Ser Leu Ser Lys Phe Tyr Ala Gly Leu Lys Glu 865 870 875 880
Ala Tyr Pro Lys Arg Arg Lys Thr Val Asn Gln Asp Val His Ile Asn 885 890 895
Thr Ile Asn Leu Phe Leu Cys Val Ala Phe Leu Cys Val Ser Lys Glu 900 905 910
Ala Glu Ser Asp Arg Glu Ser Ala Asn Asp Ser Glu Asp Thr Ser Gly 915 920 925
Tyr Asp Ser Thr Ala Ser Glu Pro Leu Ser His Met Leu Pro Cys Ile 930 935 940
Ser Leu Glu Ser Leu Val Leu Pro Ser Pro Glu His Met His Gln Ala 945 950 955 960
Ala Asp Ile Trp Ser Met Cys Arg Trp Ile Tyr Met Leu Ser Ser Val 965 970 975
Phe Gln Lys Gln Phe Tyr Arg Leu Gly Gly Phe Arg Val Cys His Lys 980 985 990
Leu Ile Phe Met Ile Ile Gln Lys Leu Phe Arg Ser His Lys Glu Glu 995 1000 1005
Gln Gly Lys Lys Glu Gly Asp Thr Ser Val Asn Glu Asn Gln Asp Leu 1010 1015 1020
Asn Arg Ile Ser Gln Pro Lys Arg Thr Met Lys Glu Asp Leu Leu Ser 1025 1030 1035 1040
Leu Ala Ile Lys Ser Asp Pro Ile Pro Ser Glu Leu Gly Ser Leu Lys 1045 1050 1055
Lys Ser Ala Asp Ser Leu Gly Lys Leu Glu Leu Gln His Ile Ser Ser 1060 1065 1070
Ile Asn Val Glu Glu Val Ser Ala Thr Glu Ala Ala Pro Glu Glu Ala 1075 1080 1085
Lys Leu Phe Thr Ser Gln Glu Ser Glu Thr Ser Leu Gln Ser Ile Arg 1090 1095 1100
Leu Leu Glu Ala Leu Leu Ala Ile Cys Leu His Gly Ala Arg Thr Ser 1105 1110 1115 1120
Gln Gln Lys Met Glu Leu Glu Leu Pro Asn Gln Asn Leu Ser Val Glu 1125 1130 1135
Ser Ile Leu Phe Glu Met Arg Asp His Leu Ser Gln Ser Lys Val Ile 1140 1145 1150
Glu Thr Gln Leu Ala Lys Pro Leu Phe Asp Ala Leu Leu Arg Val Ala 1155 1160 1165
Leu Gly Asn Tyr Ser Ala Asp Phe Glu His Asn Asp Ala Met Thr Glu 1170 1175 1180
Lys Ser His Gln Ser Ala Glu Glu Leu Ser Ser Gln Pro Gly Asp Phe 1185 1190 1195 1200
Ser Glu Glu Ala Glu Asp Ser Gln Cys Cys Ser Phe Lys Leu Leu Val 1205 1210 1215
Glu Glu Glu Gly Tyr Glu Ala Asp Ser Glu Ser Asn Pro Glu Asp Gly 1220 1225 1230
Glu Thr Gln Asp Asp Gly Val Asp Leu Lys Ser Glu Thr Glu Gly Phe 1235 1240 1245
Ser Ala Ser Ser Ser Pro Asn Asp Leu Leu Glu Asn Leu Thr Gln Gly 1250 1255 1260
Glu Ile Ile Tyr Pro Glu Ile Cys Met Leu Glu Leu Asn Leu Leu Ser 1265 1270 1275 1280
Ala Ser Lys Ala Lys Leu Asp Val Leu Ala His Val Phe Glu Ser Phe 1285 1290 1295
Leu Lys Ile Ile Arg Gln Lys Glu Lys Asn Val Phe Leu Leu Met Gln 1300 1305 1310
Gln Gly Thr Val Lys Asn Leu Leu Gly Gly Phe Leu Ser Ile Leu Thr 1315 1320 1325
Gln Asp Asp Ser Asp Phe Gln Ala Cys Gln Arg Val Leu Val Asp Leu 1330 1335 1340
Leu Val Ser Leu Met Ser Ser Arg Thr Cys Ser Glu Glu Leu Thr Leu 1345 1350 1355 1360
Leu Leu Arg Ile Phe Leu Glu Lys Ser Pro Cys Thr Lys Ile Leu Leu 1365 1370 1375
Leu Gly Ile Leu Lys Ile Ile Glu Ser Asp Thr Thr Met Ser Pro Ser 1380 1385 1390
Gln Tyr Leu Thr Phe Pro Leu Leu His Ala Pro Asn Leu Ser Asn Gly 1395 1400 1405
Val Ser Ser Gln Lys Tyr Pro Gly Ile Leu Asn Ser Lys Ala Met Gly 1410 1415 1420
Leu Leu Arg Arg Ala Arg Val Ser Arg Ser Lys Lys Glu Ala Asp Arg 1425 1430 1435 1440
Glu Ser Phe Pro His Arg Leu Leu Ser Ser Trp His Ile Ala Pro Val 1445 1450 1455
His Leu Pro Leu Leu Gly Gln Asn Cys Trp Pro His Leu Ser Glu Gly 1460 1465 1470
Phe Ser Val Ser Leu Trp Phe Asn Val Glu Cys Ile His Glu Ala Glu 1475 1480 1485
Ser Thr Thr Glu Lys Gly Lys Lys Ile Lys Lys Arg Asn Lys Ser Leu 1490 1495 1500
Ile Leu Pro Asp Ser Ser Phe Asp Gly Thr Gly Met Met Thr Gly Leu 1505 1510 1515 1520
Ser Asp Leu Tyr Thr Lys Ile Val Phe Arg Leu 1525 1530
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1979 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ATACTTCTGA TGTAAAGGAA CTAATTCCAG AGTTCTACTA CCTACCAGAG ATGTTTGTCA 60
ACAGTAATGG ATATAATCTT GGAGTCAGAG AAGATGAAGT AGTGGTAAAT GATGTTGATC 120
TTCCCCCTTG GGCAAAAAAA CCTGAAGACT TTGTGCGGAT CAACAGGATG GCCCTAGAAA 180
GTGAATTTGT TTCTTGCCAA CTTCATCAGT GGATCGACCT TATATTTGGC TATAAGCAGC 240
GAGGACCAGA AGCAGTTCGT GCTCTGAATG TTTTTCACTA CTTGACTTAT GAAGGCTCTG 300
TGAACCTGGA TAGTATCACT GATCCTGTGC TCAGGGAGGC CATGGAGGCA CAGATACAGA 360
ACTTTGGACA GACGCCATCT CAGTTGCTTA TTGAGCCACA TCCGCCTCGG AACTCTGCCA 420
TGCACCTGTG TTTCCTTCCA CAGAGTCCGC TCATGTTTAA AGATCAGATG CAACAGGATG 480
TGATAATGGT GCTGAAGTTT CCTTCAAATT CTCCAGTAAC CCATGTGGCA GCCAACACTC 540
TGCCCCACTT GACCATCCCC GCAGTGGTGA CAGTGACTTG CAGCCGACTC TTTGCAGTGA 600
ATAGATGGCA CAACACAGTA GGCCTCAGAG GAGCTCCAGG ATACTCCTTG GATCAAGCCC 660
ACCATCTTCC CATTGAAATG GATCCATTAA TAGCCAATAA TTCAGGTGTA AACAAACGGC 720
AGATCACAGA CCTCGTTGAC CAGAGTATAC AAATCAATGC ACATTGTTTT GTGGTAACAG 780
CAGATAATCG CTATATTCTT ATCTGTGGAT TCTGGGATAA GAGCTTCAGA GTTTATACTA 840
CAGAAACAGG GAAATTGACT CAGATTGTAT TTGGCCATTG GGATGTGGTC ACTTGCTTGG 900
CCAGGTCCGA GTCATACATT GGTGGGGACT GCTACATCGT GTCCGGATCT CGAGATGCCA 960
CCCTGCTGCT CTGGTACTGG AGTGGGCGGC ACCATATCAT AGGAGACAAC CCTAACAGCA 1020
GTGACTATCC GGCACCAAGA GCCGTCCTCA CAGGCCATGA CCATGAAGTT GTCTGTGTTT 1080
CTGTCTGTGC AGAACTTGGG CTTGTTATCA GTGGTGCTAA AGAGGGCCCT TGCCTTGTCC 1140
ACACCATCAC TGGAGATTTG CTGAGAGCCC TTGAAGGACC AGAAAACTGC TTATTCCCAC 1200
GCTTGATATC TGTCTCCAGC GAAGGCCACT GTATCATATA CTATGAACGA GGGCGATTCA 1260
GTAATTTCAG CATTAATGGG AAACTTTTGG CTCAAATGGA GATCAATGAT TCAACACGGG 1320
CCATTCTCCT GAGCAGTGAC GGCCAGAACC TGGTCACCGG AGGGGACAAT GGGGTAGTAG 1380
AGGTCTGGCA GGCCTGTGAC TTCAAGCAAC TGTACATTTA ACCCTGGATG TGATGCTGGC 1440
ATTAGAGCAA TGGACTTGTC CCATGACCAG AGGACTCTGA TCACTGGCAT GGCTTCTGGT 1500
AGCATTGTAG CTTTTAATAT AGATTTTAAT CGGTGGCATT ATGAGCATCA GAACAGATAC 1560
TGAAGATAAA GGAAGAACCA AAAGCCAAGT TAAAGCTGAG GGCACAAGTG CTGCATGGAA 1620
AGGCAATATC TCTGGTGGAA AAAATTCGTC TACATCGACC TCCGTTTGTA CATTCCATCA 1680
CACCCAGCAA TAGCTGTACA TTGTAGTCAG CAACCATTTT ACTTTGTGTG TTTTTTCACG 1740
ACTGAACACC AGCTGCTATC AAGCAAGCTT ATATCATGTA AATTATATGA ATTAGGAGAT 1800
GTTTTGGTAA TTATTTCATA TATTGTTGTT TATTGAGAAA AGGTTGTAGG ATGTGTCACA 1860
AGAGACTTTT GACAATTCTG AGGAACCTTG TGTCCAGTTG TTACAAAGTT TAAGCTTTGA 1920
ACCTAACCTG CATCCCATTT CCAGCCTCTT TTCAAGCTGA GAAAAAAAAA AAAAAAAAA 1979
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 472 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Thr Ser Asp Val Lys Glu Leu Ile Pro Glu Phe Tyr Tyr Leu Pro Glu 1 5 10 15
Met Phe Val Asn Ser Asn Gly Tyr Asn Leu Gly Val Arg Glu Asp Glu 20 25 30
Val Val Val Asn Asp Val Asp Leu Pro Pro Trp Ala Lys Lys Pro Glu 35 40 45
Asp Phe Val Arg Ile Asn Arg Met Ala Leu Glu Ser Glu Phe Val Ser 50 55 60
Cys Gln Leu His Gln Trp Ile Asp Leu Ile Phe Gly Tyr Lys Gln Arg 65 70 75 80
Gly Pro Glu Ala Val Arg Ala Leu Asn Val Phe His Tyr Leu Thr Tyr 85 90 95
Glu Gly Ser Val Asn Leu Asp Ser Ile Thr Asp Pro Val Leu Arg Glu 100 105 110
Ala Met Glu Ala Gln Ile Gln Asn Phe Gly Gln Thr Pro Ser Gln Leu 115 120 125
Leu Ile Glu Pro His Pro Pro Arg Asn Ser Ala Met His Leu Cys Phe 130 135 140
Leu Pro Gln Ser Pro Leu Met Phe Lys Asp Gln Met Gln Gln Asp Val 145 150 155 160
Ile Met Val Leu Lys Phe Pro Ser Asn Ser Pro Val Thr His Val Ala 165 170 175
Ala Asn Thr Leu Pro His Leu Thr Ile Pro Ala Val Val Thr Val Thr 180 185 190
Cys Ser Arg Leu Phe Ala Val Asn Arg Trp His Asn Thr Val Gly Leu 195 200 205
Arg Gly Ala Pro Gly Tyr Ser Leu Asp Gln Ala His His Leu Pro Ile 210 215 220
Glu Met Asp Pro Leu Ile Ala Asn Asn Ser Gly Val Asn Lys Arg Gln 225 230 235 240
Ile Thr Asp Leu Val Asp Gln Ser Ile Gln Ile Asn Ala His Cys Phe 245 250 255
Val Val Thr Ala Asp Asn Arg Tyr Ile Leu Ile Cys Gly Phe Trp Asp 260 265 270
Lys Ser Phe Arg Val Tyr Thr Thr Glu Thr Gly Lys Leu Thr Gln Ile 275 280 285
Val Phe Gly His Trp Asp Val Val Thr Cys Leu Ala Arg Ser Glu Ser 290 295 300
Tyr Ile Gly Gly Asp Cys Tyr Ile Val Ser Gly Ser Arg Asp Ala Thr 305 310 315 320
Leu Leu Leu Trp Tyr Trp Ser Gly Arg His His Ile Ile Gly Asp Asn 325 330 335
Pro Asn Ser Ser Asp Tyr Pro Ala Pro Arg Ala Val Leu Thr Gly His 340 345 350
Asp His Glu Val Val Cys Val Ser Val Cys Ala Glu Leu Gly Leu Val 355 360 365
Ile Ser Gly Ala Lys Glu Gly Pro Cys Leu Val His Thr Ile Thr Gly 370 375 380
Asp Leu Leu Arg Ala Leu Glu Gly Pro Glu Asn Cys Leu Phe Pro Arg 385 390 395 400
Leu Ile Ser Val Ser Ser Glu Gly His Cys Ile Ile Tyr Tyr Glu Arg 405 410 415
Gly Arg Phe Ser Asn Phe Ser Ile Asn Gly Lys Leu Leu Ala Gln Met 420 425 430
Glu Ile Asn Asp Ser Thr Arg Ala Ile Leu Leu Ser Ser Asp Gly Gln 435 440 445
Asn Leu Val Thr Gly Gly Asp Asn Gly Val Val Glu Val Trp Gln Ala 450 455 460
Cys Asp Phe Lys Gln Leu Tyr Ile 465 470
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2543 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GCAGCAGGGC GAACCGGACC TCTGTGATGT TTAATTTTCC TGACCAAGCA ACAGTTAAAA 60
AAGTTGTCTA CAGCTTGCCT CGGGTTGGAG TGGGGACCAG CTATGGTTTG CCACAAGCCA 120
GGAGGATATC ACTGGCCACT CCTCGACAGC TGTATAAGTC TTCCAATATG ACTCAGCGCT 180
GGCAAAGAAG GGAAATCTCC AACTTTGAGT ATTTGATGTT TCTCAACACG ATAGCAGGTC 240
GGACGTATAA TGATCTGAAC CAGTATCCTG TGTTTCCATG GGTGTTAACA AACTATGAAT 300
CAGAGGAGTT GGACCTGACT CTCCCAGGAA ACTTCAGGCA TCTGTCAAAG CCAAAAGGTG 360
CTTTGAACCC GAAGAGAGCA GTGTTTTACG CAGAGCGCTA TGAGACATGG GAGGAGGATC 420
AAAGCCCACC CTTCCACTAC AACACACATT ACTCAACGGC GACTTCCCCC CTTTCATGGC 480
TTGTTCGGAT TGAGCCATTC ACAACCTTCT TCCTCAATGC AAATGATGGG AAATTTGACC 540
ATCCAGACCG AACCTTCTCA TCCATTGCAA GGTCATGGAG AACCAGTCAG AGAGATACAT 600
CCGATGTCAA GGAACTAATT CCAGAGTTCT ATTACGTACC AGAGATGTTT GTCAACAGCA 660
ATGGGTACCA TCTTGGAGTG AGGGAGGACG AAGTGGTGGT TAATGATGTG GACCTGCCCC 720
CCTGGGCCAA GAAGCCAGAA GACTTTGTGC GGATCAACAG GATGGCCCTG GAAAGTGAAT 780
TTGTTTCTTG CCAACTCCAT CAATGGATTG ACCTTATATT TGGCTACAAA CAGCGAGGGC 840
CAGAGGCAGT CCGTGCTCTC AATGTTTTCC ACTACTTGAC CTACGAAGGC TCTGTAAACC 900
TGGACAGCAT CACAGACCCT GTGCTCCGGG AGGCCATGGT TGCACAGATA CAGAACTTTG 960
CCCAGACGCC ATCTCAGTTG CTCATTGAGC CGCATCCGCC TAGGACTTCA GCCATGCATC 1020
TGTGTTCCCT TCCACAGAGC CCACTCATGT TCAAAGATCA GATGCAGCAG GATGTGATCA 1080
TGGTGCTGAA GTTTCCATCC AATTCTCCTG TGACTCATGT GGCTGCCAAC ACCCTGCCCC 1140
ACCTGACCAT CCCTGCAGTG GTGACAGTGA CCTGCAGCCG ACTGTTTGCA GTGAACAGAT 1200
GGCACAACAC AGTCGGCCTC AGAGGAGCCC CCGGATACTC CTTGGATCAA GCACACCATC 1260
TTCCCATTGA GATGGACCCA TTAATCGCAA ATAACTCTGG TGTGAACAAG CGGCAGATCA 1320
CAGACCTTGT AGACCAGAGC ATCCAGATCA ATGCCCACTG CTTCGTGGTC ACAGCTGATA 1380
ATCGCTACAT CCTCATCTGT GGGTTTTGGG ATAAAAGTTT CAGAGTTTAC TCGACAGAAA 1440
CAGGGAAACT GACACAGATT GTATTTGGCC ACTGGGATGT TGTCACATGC CTGGCCAGGT 1500
CGGAGTCCTA CATTGGTGGA GACTGCTACA TAGTGTCTGG ATCTCGGGAC GCCACCTTGC 1560
TTCTCTGGTA CTGGAGTGGG CGTCACCACA TCATCGGAGA CAACCCCAAT AGCAGTGACT 1620
ATCCTGCGCC CAGAGCTGTC CTCACAGGCC ATGACCATGA AGTTGTCTGT GTCTCCGTCT 1680
GTGCAGAACT CGGACTCGTT ATCAGTGGTG CTAAAGAGGG CCCTTGCCTC GTTCATACCA 1740
TCACTGGAAA TCTGCTGAAG GCCCTGGAAG GACCAGAAAA CTGCTTATTT CCACGCCTAA 1800
TTTCGGTATC CAGTGAAGGC CACTGCATCA TATATTATGA GCGAGGACGG TTTAGCAACT 1860
TCAGCATCAA TGGGAAACTT TTGGCTCAAA TGGAGATCAA TGATTCCACT AGGGCTATTC 1920
TCCTGAGCAG CGATGGACAG AACCTGGTGA CTGGAGGGGA CAATGGTGTG GTGGAGGTCT 1980
GGCAGGCCTG TGACTTTAAG CAGCTGTACA TTTACCCAGG ATGTGATGCT GGCATTAGAG 2040
CGATGGATTT ATCCCATGAC CAAAGGACTC TGATCACTGG CATGGCTTCC GGCAGCATTG 2100
TACTTTTAAT ATAGATTTTA ATCGGTGGCA TTATGAGCAT CAGAACAGTA CTGAAGAGAA 2160
GCAGCAGAAG CCACATTCAA GTGAGAGCAC AAGTGCTTCT GTGGAAAGGC AGTATCTCTG 2220
GTGGGACGCT GGTCCACATC GGCCTCTGCT TGTACATCCA TCCCACCCAG CAGTCGCCGA 2280
ACATCATAGT CGGGAGCCAT TTCACCCTGT TTTTCCAGGA CTGAACACCA GCTGCTGTCA 2340
AGCAAGCTTA TATCATGTAA ATTATCTGAA TTAGGAGCCG TTTTGGTAAT TATTTCATAT 2400
ATCGCCGTTT ATTGAGAAAA GGTTGTAGGA AGCCTCACAA GAGACTTTTG ACAATTCTGA 2460
GGAACCTTGT GCCCAGTTGT TACAAAGTTT AAGCTTTGAA CCTAACTTGC ATCCCATTTC 2520
CAGCCTCGGG CTTCACTCGT GCC 2543
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 703 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Ser Arg Ala Asn Arg Thr Ser Val Met Phe Asn Phe Pro Asp Gln Ala 1 5 10 15
Thr Val Lys Lys Val Val Tyr Ser Leu Pro Arg Val Gly Val Gly Thr 20 25 30
Ser Tyr Gly Leu Pro Gln Ala Arg Arg Ile Ser Leu Ala Thr Pro Arg 35 40 45
Gln Leu Tyr Lys Ser Ser Asn Met Thr Gln Arg Trp Gln Arg Arg Glu 50 55 60
Ile Ser Asn Phe Glu Tyr Leu Met Phe Leu Asn Thr Ile Ala Gly Arg 65 70 75 80
Thr Tyr Asn Asp Leu Asn Gln Tyr Pro Val Phe Pro Trp Val Leu Thr 85 90 95
Asn Tyr Glu Ser Glu Glu Leu Asp Leu Thr Leu Pro Gly Asn Phe Arg 100 105 110
His Leu Ser Lys Pro Lys Gly Ala Leu Asn Pro Lys Arg Ala Val Phe 115 120 125
Tyr Ala Glu Arg Tyr Glu Thr Trp Glu Glu Asp Gln Ser Pro Pro Phe 130 135 140
His Tyr Asn Thr His Tyr Ser Thr Ala Thr Ser Pro Leu Ser Trp Leu 145 150 155 160
Val Arg Ile Glu Pro Phe Thr Thr Phe Phe Leu Asn Ala Asn Asp Gly 165 170 175
Lys Phe Asp His Pro Asp Arg Thr Phe Ser Ser Ile Ala Arg Ser Trp 180 185 190
Arg Thr Ser Gln Arg Asp Thr Ser Asp Val Lys Glu Leu Ile Pro Glu 195 200 205
Phe Tyr Tyr Val Pro Glu Met Phe Val Asn Ser Asn Gly Tyr His Leu 210 215 220
Gly Val Arg Glu Asp Glu Val Val Val Asn Asp Val Asp Leu Pro Pro 225 230 235 240
Trp Ala Lys Lys Pro Glu Asp Phe Val Arg Ile Asn Arg Met Ala Leu 245 250 255
Glu Ser Glu Phe Val Ser Cys Gln Leu His Gln Trp Ile Asp Leu Ile 260 265 270
Phe Gly Tyr Lys Gln Arg Gly Pro Glu Ala Val Arg Ala Leu Asn Val 275 280 285
Phe His Tyr Leu Thr Tyr Glu Gly Ser Val Asn Leu Asp Ser Ile Thr 290 295 300
Asp Pro Val Leu Arg Glu Ala Met Val Ala Gln Ile Gln Asn Phe Ala 305 310 315 320
Gln Thr Pro Ser Gln Leu Leu Ile Glu Pro His Pro Pro Arg Thr Ser 325 330 335
Ala Met His Leu Cys Ser Leu Pro Gln Ser Pro Leu Met Phe Lys Asp 340 345 350
Gln Met Gln Gln Asp Val Ile Met Val Leu Lys Phe Pro Ser Asn Ser 355 360 365
Pro Val Thr His Val Ala Ala Asn Thr Leu Pro His Leu Thr Ile Pro 370 375 380
Ala Val Val Thr Val Thr Cys Ser Arg Leu Phe Ala Val Asn Arg Trp 385 390 395 400
His Asn Thr Val Gly Leu Arg Gly Ala Pro Gly Tyr Ser Leu Asp Gln 405 410 415
Ala His His Leu Pro Ile Glu Met Asp Pro Leu Ile Ala Asn Asn Ser 420 425 430
Gly Val Asn Lys Arg Gln Ile Thr Asp Leu Val Asp Gln Ser Ile Gln 435 440 445
Ile Asn Ala His Cys Phe Val Val Thr Ala Asp Asn Arg Tyr Ile Leu 450 455 460
Ile Cys Gly Phe Trp Asp Lys Ser Phe Arg Val Tyr Ser Thr Glu Thr 465 470 475 480
Gly Lys Leu Thr Gln Ile Val Phe Gly His Trp Asp Val Val Thr Cys 485 490 495
Leu Ala Arg Ser Glu Ser Tyr Ile Gly Gly Asp Cys Tyr Ile Val Ser 500 505 510
Gly Ser Arg Asp Ala Thr Leu Leu Leu Trp Tyr Trp Ser Gly Arg His 515 520 525
His Ile Ile Gly Asp Asn Pro Asn Ser Ser Asp Tyr Pro Ala Pro Arg 530 535 540
Ala Val Leu Thr Gly His Asp His Glu Val Val Cys Val Ser Val Cys 545 550 555 560
Ala Glu Leu Gly Leu Val Ile Ser Gly Ala Lys Glu Gly Pro Cys Leu 565 570 575
Val His Thr Ile Thr Gly Asn Leu Leu Lys Ala Leu Glu Gly Pro Glu 580 585 590
Asn Cys Leu Phe Pro Arg Leu Ile Ser Val Ser Ser Glu Gly His Cys 595 600 605
Ile Ile Tyr Tyr Glu Arg Gly Arg Phe Ser Asn Phe Ser Ile Asn Gly 610 615 620
Lys Leu Leu Ala Gln Met Glu Ile Asn Asp Ser Thr Arg Ala Ile Leu 625 630 635 640
Leu Ser Ser Asp Gly Gln Asn Leu Val Thr Gly Gly Asp Asn Gly Val 645 650 655
Val Glu Val Trp Gln Ala Cys Asp Phe Lys Gln Leu Tyr Ile Tyr Pro 660 665 670
Gly Cys Asp Ala Gly Ile Arg Ala Met Asp Leu Ser His Asp Gln Arg 675 680 685
Thr Leu Leu Ile
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: CATTCTTTAT TGACAGTGTT 20
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: GTGAACCCTA CCATATCT 18
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: ATGCCATTCT TTATTGACAG 20
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: CATTGGCACA GGAAACAAC 19
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: AAACCCTGTC TCGAAACAA 19
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: GAATTCCCAA GGACAGGT 18
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: TAGGAGGTGT GGCCTTG 17
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: AGAGACGGCG GACACTTA 18
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: TAAATATGAG GCGGGCAG 18
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: ATACTCTAAG TAAGATACAC 20
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: TGTTTTAACT GTTTGCTAA 19
(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: ACGCAGTGGG CATGCTG 17
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: CACAGGCTGT GACTGGAA 18
(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: AGTAGCCACA GGCCCTA 17
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: GAGATTACCC CAATAGTA 18
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: AGTGGAAGGA GGCTGTC 17
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: CCATGGCGAT GAAGCGG 17
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: ATGATGCAAA GAACCCAG 18
(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: TTCAACTACA TAGTGAATT 19
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: TCCCTAACAC ATCCCTAA 18
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: CCTCATGTTA GGGTAGAG 18
(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: CCGTTAGTGT GTAGTCTC 18
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: CACATTGGCA CAGGAAAC 18
(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: TTACGAATGT GCCTGGTG 18
(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: TCCAAACACA CTAAACCTG 19
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: CATGCCATTC TTTATTGACA 20
(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: CTCAGTAGAC TATACGAG 18
(2) INFORMATION FOR SEQ ID NO: 42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: GAGTTCAAGG TCATCCTC 18
(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: GCCTAAGCCC ATTATCG 17
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: TAAATGCTGC CATAAACTCC 20
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: AGAAGTGACT TCAGGTAATA 20
(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: AGTCTCTCAC ACTTACAC 18
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: GACCCAAGTC AGCTTTC 17
(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: CCAGTGTGTC ACTTAAGC 18
(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: AGGGAGATGT ATCATCTGC 19
(2) INFORMATION FOR SEQ ID NO: 50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: AGGGATCACC ATGCTTTG 18
(2) INFORMATION FOR SEQ ID NO: 51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: ACTTGGTCTT GGGGTCC 17
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: GAAAGCAGTG TAATGAGG 18
(2) INFORMATION FOR SEQ ID NO: 53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: TGCCTCTACA TGGGAGC 17
(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: GCAAGCATTT AGTTAAACG 19
(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: CTTGTTCTTG TATATCTG 18
(2) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: ACAATGAAAT CCTCCACC 18
(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: GTGACTTGAT CCAGACTG 18
(2) INFORMATION FOR SEQ ID NO: 58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: CTTGCTCTCA CTGTTCTC 18
(2) INFORMATION FOR SEQ ID NO: 59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: CAGGTGGAGA TGCTGTTC 18
(2) INFORMATION FOR SEQ ID NO: 60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: GAGATGCCTT CAGGCAGT 18
(2) INFORMATION FOR SEQ ID NO: 61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: CCGTTAGTGT GTAGTCTC 18
(2) INFORMATION FOR SEQ ID NO: 62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: CTTGCTCTCA CTGTTCTC 18
(2) INFORMATION FOR SEQ ID NO: 63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: TGGATGGGCT GTCTGAACGC 20
(2) INFORMATION FOR SEQ ID NO: 64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: TGCTGGCAGA TGCTGGCATA 20
(2) INFORMATION FOR SEQ ID NO: 65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: CCAAGATGAA AGCAGCCGAT GGGGAAAACT 30
(2) INFORMATION FOR SEQ ID NO: 66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: TCAGCCTCTT TCTTGCTCCG TGAAACTGCT 30
(2) INFORMATION FOR SEQ ID NO: 67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: AGTTTATGAG TCCAAATGAT 20
(2) INFORMATION FOR SEQ ID NO: 68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: GAATGATGAA GTTGCTCTGA 20
(2) INFORMATION FOR SEQ ID NO: 69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: CAGCAGTTCT TCAGATGGA 19
(2) INFORMATION FOR SEQ ID NO: 70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: ATCTTTCTGT TGTTCCCCTA 20
(2) INFORMATION FOR SEQ ID NO: 71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: TAGGGGAGCA ACAGAAAGAT 20
(2) INFORMATION FOR SEQ ID NO: 72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: GCTCATAGTA GTATCACTTT 20
(2) INFORMATION FOR SEQ ID NO: 73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: CGCACATGGC AACCCTT 17
(2) INFORMATION FOR SEQ ID NO: 74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: GCACATGGGC AACCCTT 17
(2) INFORMATION FOR SEQ ID NO: 75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: CCAGGACACC AGGGCTACAG AG 22
(2) INFORMATION FOR SEQ ID NO: 76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: CCCGAGTGCT GGGATTAAAG 20
(2) INFORMATION FOR SEQ ID NO: 77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: GTTGTAAAAC GACGGCCAGT GGCAAGTTCA GCCTGGTTAA G 41
(2) INFORMATION FOR SEQ ID NO: 78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: CACAGGAAAC AGCTATGACC AGAGTATTTC TTCCAGGGTA 40
All of the compositions and methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions, methods, and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims. Accordingly, the exclusive rights sought to be patented are as described in the claims below.
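The oligonucleotide entries for SEQ ID NO: 67 to SEQ ID NO: 78 each end with a stated length. As an illustrative consistency check only (it is not part of the disclosure), the following Python sketch counts the nucleotides in each sequence description line and compares the count with the stated number. The names `LISTING`, `check_listing`, and `mismatches` are hypothetical, and the lines are transcribed from the listing above with stray scan marks omitted.

```python
import re

# Sequence description lines transcribed from the listing above
# (SEQ ID NO: 67 to 78); stray quote marks from the scan are omitted.
LISTING = """\
SEQ ID NO: 67: AGTTTATGAG TCCAAATGAT 20
SEQ ID NO: 68: GAATGATGAA GTTGCTCTGA 20
SEQ ID NO: 69: CAGCAGTTCT TCAGATGGA 19
SEQ ID NO: 70: ATCTTTCTGT TGTTCCCCTA 20
SEQ ID NO: 71: TAGGGGAGCA ACAGAAAGAT 20
SEQ ID NO: 72: GCTCATAGTA GTATCACTTT 20
SEQ ID NO: 73: CGCACATGGC AACCCTT 17
SEQ ID NO: 74: GCACATGGGC AACCCTT 17
SEQ ID NO: 75: CCAGGACACC AGGGCTACAG AG 22
SEQ ID NO: 76: CCCGAGTGCT GGGATTAAAG 20
SEQ ID NO: 77: GTTGTAAAAC GACGGCCAGT GGCAAGTTCA GCCTGGTTAA G 41
SEQ ID NO: 78: CACAGGAAAC AGCTATGACC AGAGTATTTC TTCCAGGGTA 40
"""

# Identifier, space-separated nucleotide groups, then the stated length.
ENTRY = re.compile(r"SEQ ID NO: (\d+): ([ACGT ]+?) (\d+)$")


def check_listing(text):
    """Map each SEQ ID number to (sequence, stated_length, counted_length)."""
    results = {}
    for line in text.splitlines():
        match = ENTRY.search(line.strip())
        if match is None:
            continue  # skip lines that are not sequence descriptions
        seq_id = int(match.group(1))
        sequence = match.group(2).replace(" ", "")
        results[seq_id] = (sequence, int(match.group(3)), len(sequence))
    return results


def mismatches(results):
    """Return the SEQ ID numbers whose stated and counted lengths differ."""
    return [sid for sid, (_, stated, counted) in results.items() if stated != counted]


if __name__ == "__main__":
    for sid, (seq, stated, counted) in sorted(check_listing(LISTING).items()):
        flag = "ok" if stated == counted else "MISMATCH"
        print(f"SEQ ID NO: {sid}: stated {stated}, counted {counted} ({flag})")
```

A check of this kind only confirms that the transcription is internally consistent; it says nothing about whether a sequence matches the deposited listing.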

Claims

CLAIMS:
1. A purified mammalian LYST1, Lyst1, LYST2, or Lyst2 protein.
2. The protein according to claim 1, wherein said protein is isolated from a mouse or human.
3. The protein according to claim 1, comprising the amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
4. A purified nucleic acid segment encoding a LYST1, Lyst1, LYST2, or Lyst2 protein.
5. The nucleic acid segment of claim 4, wherein said segment encodes a human LYST1 or LYST2 protein, or a murine Lyst1 or Lyst2 protein.
6. The nucleic acid segment of claim 4, further defined as encoding a protein comprising the amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
7. The nucleic acid segment of claim 4, further defined as comprising the nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13, or the complements thereof, or a sequence which hybridizes to the sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13.
8. The nucleic acid segment of claim 4, further defined as an RNA segment.
9. A DNA segment comprising an isolated LYST1, Lyst1, LYST2, or Lyst2 gene.
10. The DNA segment of claim 9, comprising an isolated LYST1, Lyst1, LYST2, or Lyst2 gene.
11. The DNA segment of claim 10, comprising an isolated human LYST1 or LYST2 gene or an isolated murine Lyst1 or Lyst2 gene.
12. The DNA segment of claim 11, comprising an isolated human LYST1 or LYST2 gene, or murine Lyst1 or Lyst2 gene that encodes a protein or peptide that includes a contiguous amino acid sequence from SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
13. The DNA segment of claim 9, comprising an isolated human LYST1 or LYST2 gene, or murine Lyst1 or Lyst2 gene that includes a contiguous nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13.
14. The DNA segment of claim 9, comprising an isolated human LYST1 or LYST2 gene, or murine Lyst1 or Lyst2 gene that encodes a protein of from about 15 to about 50 amino acids in length.
15. The DNA segment of claim 9, comprising an isolated human LYST1 or LYST2 gene, or murine Lyst1 or Lyst2 gene that encodes a protein of from about 50 to about 150 amino acids in length.
16. The DNA segment of claim 9, comprising an isolated human LYST1 or LYST2 gene, or murine Lyst1 or Lyst2 gene that encodes a protein of about 1185 amino acids in length.
17. The DNA segment of claim 9, defined further as a recombinant vector.
18. The DNA segment of claim 17, defined further as recombinant vector pCH.
19. The DNA segment of claim 9, wherein said DNA is operatively linked to a promoter, said promoter expressing the DNA segment.
20. A recombinant host cell comprising the DNA segment of claim 9.
21. The recombinant host cell of claim 20, defined further as being a prokaryotic cell.
22. The recombinant host cell of claim 21, further defined as a bacterial cell.
23. The recombinant host cell of claim 20, defined further as being a eukaryotic cell.
24. The recombinant host cell of claim 23, further defined as a yeast cell or an animal cell.
25. The recombinant host cell of claim 24, wherein said cell is a mammalian cell.
26. The recombinant host cell of claim 25, wherein said cell is a human cell.
27. The recombinant host cell of claim 20, wherein said DNA segment is introduced into the cell by means of a recombinant vector.
28. The recombinant host cell of claim 20, wherein said host cell expresses the DNA segment to produce a LYST1, Lyst1, LYST2, or Lyst2 protein or peptide.
29. The recombinant host cell of claim 28, wherein said LYST1, Lyst1, LYST2, or Lyst2 protein or peptide comprises a contiguous amino acid sequence from SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12 or SEQ ID NO: 14.
30. A method of using a DNA segment that encodes an isolated human LYST1 or LYST2 protein or murine Lyst1 or Lyst2 protein, comprising the steps of:
(a) preparing a recombinant vector in which a LYST1-, Lyst1-, LYST2-, or Lyst2- encoding DNA segment is positioned under the control of a promoter;
(b) introducing said recombinant vector into a host cell;
(c) culturing said host cell under conditions effective to allow expression of the encoded protein or peptide; and
(d) collecting said expressed protein or peptide.
31. An isolated nucleic acid segment characterized as:
(a) a nucleic acid segment comprising a sequence region that consists of at least 14 contiguous nucleotides that have the same sequence as, or are complementary to, 14 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13, or
(b) a nucleic acid segment of from 14 to about 10,000 nucleotides in length that hybridizes to the nucleic acid segment of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13, or the complements thereof, under standard hybridization conditions.
32. The nucleic acid segment of claim 31, further defined as comprising a sequence region that consists of at least 14 contiguous nucleotides that have the same sequence as, or are complementary to, 14 contiguous nucleotides of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13.
33. The nucleic acid segment of claim 31, further defined as comprising a nucleic acid segment of from 14 to about 10,000 nucleotides in length that hybridizes to the nucleic acid segment of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, or SEQ ID NO: 13, or the complements thereof, under standard hybridization conditions.
34. The nucleic acid segment of claim 33, wherein the segment comprises a sequence region of at least about 20 nucleotides; or wherein the segment is about 20 nucleotides in length.
35. The nucleic acid segment of claim 34, wherein the segment comprises a sequence region of at least about 30 nucleotides, or wherein the segment is about 30 nucleotides in length.
36. The nucleic acid segment of claim 35, wherein the segment comprises a sequence region of at least about 50 nucleotides; or wherein the segment is about 50 nucleotides in length.
37. The nucleic acid segment of claim 36, wherein the segment comprises a sequence region of at least about 100 nucleotides, or wherein the segment is about 100 nucleotides in length.
38. The nucleic acid segment of claim 37, wherein the segment comprises a sequence region of at least about 200 nucleotides, or wherein the segment is about 200 nucleotides in length.
39. The nucleic acid segment of claim 38, wherein the segment comprises a sequence region of at least about 500 nucleotides, or wherein the segment is about 500 nucleotides in length.
40. The nucleic acid segment of claim 39, wherein the segment comprises a sequence region of at least about 1000 nucleotides; or wherein the segment is about 1000 nucleotides in length.
41. The nucleic acid segment of claim 40, wherein the segment comprises a sequence region of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13.
42. The nucleic acid segment of claim 31, wherein the segment is up to 10,000 basepairs in length.
43. The nucleic acid segment of claim 42, wherein the segment is up to 5,000 basepairs in length.
44. The nucleic acid segment of claim 43, wherein the segment is up to 4,000 basepairs in length.
45. The nucleic acid segment of claim 44, wherein the segment is up to 3,000 basepairs in length.
46. The nucleic acid segment of claim 45, wherein the segment is about 3514 basepairs in length.
47. A method for detecting a nucleic acid sequence encoding a LYST1, Lyst1, LYST2 or Lyst2 protein, comprising the steps of:
(a) obtaining sample nucleic acids suspected of encoding a LYST1, Lyst1, LYST2, or Lyst2 protein;
(b) contacting said sample nucleic acids with an isolated nucleic acid segment encoding said protein under conditions effective to allow hybridization of substantially complementary nucleic acids; and
(c) detecting the hybridized complementary nucleic acids thus formed.
48. The method of claim 47, wherein the sample nucleic acids contacted are located within a cell.
49. The method of claim 47, wherein the sample nucleic acids are separated from a cell prior to contact.
50. The method of claim 47, wherein the isolated protein-encoding nucleic acid segment comprises a detectable label and the hybridized complementary nucleic acids are detected by detecting said label.
51. A nucleic acid detection kit comprising, in suitable container means, an isolated LYST1, Lyst1, LYST2 or Lyst2 nucleic acid segment and a detection reagent.
52. The nucleic acid detection kit of claim 51, wherein the detection reagent is a detectable label that is linked to said nucleic acid segment.
53. The nucleic acid detection kit of claim 51, further comprising a restriction enzyme.
54. A peptide composition, free from total cells, comprising a LYST1, Lyst1, LYST2, or Lyst2 protein that includes a contiguous amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
55. The composition of claim 54, comprising a peptide that includes an about 15 to about 50 amino acid long sequence from SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
56. The composition of claim 54, comprising a peptide that includes an about 50 to about 150 amino acid long sequence from SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
57. The composition of claim 54, comprising a peptide that includes an about 150 to about 300 amino acid long sequence from SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, or SEQ ID NO: 14.
58. The composition of claim 54, wherein the protein or peptide is a recombinant protein or peptide.
59. A purified antibody that binds to a LYST1, Lyst1, LYST2, or Lyst2 protein or peptide.
60. The antibody of claim 59, wherein the antibody is linked to a detectable label.
61. The antibody of claim 60, wherein the antibody is linked to a radioactive label, a fluorogenic label, a nuclear magnetic spin resonance label, biotin or an enzyme that generates a colored product upon contact with a chromogenic substrate.
62. The antibody of claim 61, wherein the antibody is linked to an alkaline phosphatase, hydrogen peroxidase or glucose oxidase enzyme.
63. The antibody of claim 59, wherein said antibody is a monoclonal antibody.
64. A method for diagnosing Chediak-Higashi Syndrome, comprising identifying a Lyst1 or LYST1 nucleic acid segment or a Lyst1 or LYST1 protein or peptide present within a clinical sample from a patient suspected of having such a syndrome.
65. A transgenic animal having incorporated into its genome a transgene that encodes a LYST1, Lyst1, LYST2, or Lyst2 protein or peptide.
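Claim 31(a) turns on whether a segment contains at least 14 contiguous nucleotides that are identical or complementary to 14 contiguous nucleotides of a reference sequence. The following Python sketch shows one way such a comparison could be computed; it is an illustration under stated assumptions, not a construction of the claim. It assumes "complementary" means the antiparallel reverse complement, it does not model hybridization under claim 31(b), and because SEQ ID NO: 1 to 13 are not reproduced in this part of the document it uses the 41-nucleotide SEQ ID NO: 77 from the listing purely as a stand-in reference. All function and variable names are hypothetical.

```python
# Watson-Crick pairing used to build the complementary strand.
PAIRS = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Stand-in reference only: SEQ ID NO: 77 from the listing, spaces removed.
STAND_IN_REFERENCE = "GTTGTAAAACGACGGCCAGTGGCAAGTTCAGCCTGGTTAAG"


def reverse_complement(seq):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq))


def kmers(seq, k):
    """Return the set of all contiguous k-nucleotide windows in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_contiguous_run(candidate, reference, k=14):
    """True if candidate has k contiguous nucleotides that are identical to,
    or the reverse complement of, k contiguous nucleotides of reference."""
    targets = kmers(reference, k) | kmers(reverse_complement(reference), k)
    return any(window in targets for window in kmers(candidate, k))


if __name__ == "__main__":
    same_strand = STAND_IN_REFERENCE[5:19]                      # 14 nt, identical
    antisense = reverse_complement(STAND_IN_REFERENCE[10:30])   # 20 nt, complementary
    too_short = STAND_IN_REFERENCE[:13]                         # only 13 nt
    print(shares_contiguous_run(same_strand, STAND_IN_REFERENCE))
    print(shares_contiguous_run(antisense, STAND_IN_REFERENCE))
    print(shares_contiguous_run(too_short, STAND_IN_REFERENCE))
```

Collecting the windows of the reference and of its reverse complement into a set keeps the comparison to a single pass over the candidate, which matters when the reference is a full-length cDNA rather than a short primer.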
EP97904209A 1996-02-01 1997-01-31 Lyst1 and lyst2 gene compositions and methods of use Withdrawn EP0880586A1 (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US1114696P 1996-02-01 1996-02-01
US11146P 1996-02-01
US3359996P 1996-12-20 1996-12-20
US33599 1996-12-20
US3434696P 1996-12-23 1996-12-23
US34346P 1996-12-23
PCT/US1997/001748 WO1997028262A1 (en) 1996-02-01 1997-01-31 Lyst1 and lyst2 gene compositions and methods of use

Publications (1)

Publication Number Publication Date
EP0880586A1 (en) 1998-12-02

Family

ID=27359378

Family Applications (1)

Application Number Title Priority Date Filing Date
EP97904209A Withdrawn EP0880586A1 (en) 1996-02-01 1997-01-31 Lyst1 and lyst2 gene compositions and methods of use

Country Status (4)

Country Link
EP (1) EP0880586A1 (en)
JP (1) JP2002514897A (en)
AU (1) AU718378B2 (en)
WO (1) WO1997028262A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999042586A2 (en) * 1998-02-23 1999-08-26 Rigel Pharmaceuticals, Inc. Exo1 and exo2, exocytotic proteins
JP2002510490A (en) * 1998-04-03 2002-04-09 キュラジェン コーポレイション LYST protein complex and LYST interacting protein
JP2006042802A (en) * 2004-06-28 2006-02-16 Sumitomo Chemical Co Ltd Hypoxanthine guanine phosphoribosyl transferase gene originated from callithrix jacchus and its utilization
RU2728703C2 (en) * 2014-05-02 2020-07-31 Рисерч Инститьют Эт Нэшнуайд Чилдрен`С Хоспитал Compositions and methods for anti-lyst immunomodulation
CN108601863A (en) 2015-12-11 2018-09-28 国家儿童医院研究所 The system and method for patient-specific tissue engineering blood vessel graft for optimization
CN109112211A (en) * 2018-05-15 2019-01-01 广州达瑞生殖技术有限公司 A kind of the primer combination and method of human embryos Chediak-Higashi syndrome LYST detection in Gene Mutation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9728262A1 *

Also Published As

Publication number Publication date
WO1997028262A1 (en) 1997-08-07
JP2002514897A (en) 2002-05-21
AU718378B2 (en) 2000-04-13
AU1856297A (en) 1997-08-22

Similar Documents

Publication Publication Date Title
US5352775A (en) APC gene and nucleic acid probes derived therefrom
EP0785216B1 (en) Chomosome 13-linked breast cancer susceptibility gene BRCA2
US5709999A (en) Linked breast and ovarian cancer susceptibility gene
US20040248256A1 (en) Secreted proteins and polynucleotides encoding them
US20040170994A1 (en) DNA sequences for human tumour suppressor genes
US20020193567A1 (en) Secreted proteins and polynucleotides encoding them
JP2000500985A (en) Chromosome 13 linkage-breast cancer susceptibility gene
EP1260520A2 (en) Chromosome 13-linked breast cancer susceptibility gene
CA2554380C (en) Mecp2e1 gene
ES2284214T3 (en) NEW MOLECULES OF THE FAMILY OF THE PROTEINS TANGO-77 AND USES OF THE SAME.
US5773268A (en) Chromosome 21 gene marker, compositions and methods using same
CA2340616A1 (en) Secreted proteins and polynucleotides encoding them
US5876949A (en) Antibodies specific for fragile X related proteins and method of using the same
JP2002515735A (en) Genes and gene products related to Werner syndrome
JPH10511936A (en) Human somatostatin-like receptor
AU718378B2 (en) LYST1 and LYST2 gene compositions and methods of use
US6312921B1 (en) Secreted proteins and polynucleotides encoding them
US6171857B1 (en) Leucine zipper protein, KARP-1 and methods of regulating DNA dependent protein kinase activity
JP2002506875A (en) Methods and compositions for diagnosis and treatment of chromosome 18p-related disorders
US7339029B2 (en) Sperm-specific cation channel, CatSper2, and uses therefor
US6548258B2 (en) Methods for diagnosing tuberous sclerosis by detecting mutation in the TSC-1 gene
KR20030072205A (en) Human schizophrenia gene
US20030096951A1 (en) Secreted proteins and polynucleotides encoding them
US8729248B2 (en) Sperm-specific cation channel, CATSPER2, and uses therefor
JP3911017B2 (en) Nucleotide sequence and deduced amino acid sequence of oncogene Int6

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19980824

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17Q First examination report despatched

Effective date: 20010917

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20010801