WO1998049314A2 - ANTIGENIC COMPOSITION AND METHOD OF DETECTION FOR HELICOBACTER PYLORI

Publication number
WO1998049314A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
cluster
pylori
antigen
lys
Prior art date
Application number
PCT/US1998/008487
Other languages
French (fr)
Other versions
WO1998049314A8 (en)
WO1998049314A3 (en)
Inventor
Theresa P. Chow
Kirk E. Fry
Moon Y. Lim
C. P. Mcatee
Original Assignee
Genelabs Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Genelabs Technologies, Inc.
Priority to CA002288211A (patent CA2288211A1)
Priority to EP98918806A (patent EP0977864A2)
Priority to AU71660/98A (patent AU7166098A)
Priority to JP54726398A (patent JP2001517091A)
Publication of WO1998049314A2
Publication of WO1998049314A3
Publication of WO1998049314A8

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/205 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Campylobacter (G)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies

Definitions

  • the present invention relates to one or more antigens of H. pylori, to polynucleotide sequences coding for the antigens, and to diagnostic and therapeutic methods employing such antigens and polynucleotides.
  • Mullis, K.B. U.S. Patent No. 4,683,202, issued 28 July 1987.
  • H. pylori is a human gastric pathogen associated with chronic superficial gastritis, peptic ulcer disease, and chronic atrophic gastritis leading to gastric adenocarcinoma, although for most of this century, peptic ulcer disease was thought to be stress-related rather than caused by H. pylori infection.
  • Acceptance of the causal role of H. pylori in peptic ulcer disease and gastric inflammation developed when studies showed that human subjects who ingested H. pylori developed gastritis, a condition that resolved after the infection was eliminated by antibiotic treatment (Warren and Marshall, 1983).
  • H. pylori is a micro-aerophilic, Gram negative, slow-growing, flagellated organism with a spiral or S-shaped morphology which infects the lining of the stomach.
  • H. pylori was originally cultured from gastric biopsy in 1982 and was placed in the Campylobacter genus based upon gross morphology.
  • the new genera Helicobacteracea was proposed and accepted, with H. pylori being its sole member (Blaser, 1996).
  • 13 members of the Helicobacter genus are recognized.
  • urease is one of the most abundant surface proteins produced by the bacteria.
  • This enzyme is a multisubunit urease which functions to hydrolyse urea into carbon dioxide and ammonia (Cover, 1995). The resulting ammonia molecules surround the bacteria, thereby neutralizing the acid in the immediate vicinity of the bacteria.
  • urease is crucial for the survival of H. pylori at acidic pH and for its successful colonization of the gastric environment.
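  • For reference, the overall urease reaction described above (hydrolysis of urea to carbon dioxide and ammonia) and the acid-consuming protonation of the resulting ammonia can be written as follows. This is standard stoichiometry added for illustration; it is not reproduced from the specification.

```latex
\[
\mathrm{(NH_2)_2CO + H_2O \;\longrightarrow\; CO_2 + 2\,NH_3}
\]
\[
\mathrm{NH_3 + H^+ \;\longrightarrow\; NH_4^+}
\]
```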
  • H. pylori is one of the most common chronic bacterial infections in humans. H. pylori infection is found in over 90% of patients with active gastritis, and the presence of H. pylori in the gastric mucosa has been associated with mucosa-associated lymphoid tissue lymphomas (Cover, et al, 1996). In developed countries, about half of the population has been colonized with H. pylori by age 50, and in developing countries, colonization is common even among children. Further, one to two out of ten infected individuals will develop peptic ulcer disease in the course of a lifetime. Current approaches for assessing H.
  • Non-invasive approaches include the urea breath test, UBT (Atherton, et al, 1994) and serological tests which utilize various H. pylori antigens for detecting anti-H. pylori antibodies.
  • the urea breath test relies upon the presence of the urease enzyme from H. pylori to convert isotopic urea to isotopic carbon dioxide (the analyte) and ammonia.
  • the reagents for such an assay should be readily and reproducibly prepared, in addition to being highly selective and specific for H. pylori.
  • the method should be accurate and exhibit high sensitivity, in addition to being simple, convenient and cost-effective.
  • the invention pertains to the discovery and characterization of new, highly immunogenic polypeptide antigens of H. pylori. Also forming part of the invention are 69 heretofore unrecognized immunogenic cluster families. The sequence and location of these cluster families within the H. pylori genome were determined on the basis of the over 250 disclosed DNA replicas of portions of the genome of H. pylori discovered to encode highly immunogenic antigens. Also disclosed are native antigenic proteins recovered from H. pylori using a proteomics methodology. The invention further provides methods employing one, several, many or each of the above-described antigens. Also forming part of the invention is a diagnostic kit and method employing one, several, many or each of the herein described antigenic proteins to detect H. pylori infection, where the assay is effective for detecting active infective status H. pylori.
  • the present invention includes H. pylori genomic polynucleotides encoding one or more of the polypeptide antigens described herein.
  • some aspects of the invention include H. pylori derived RNA and DNA polynucleotides, recombinant H. pylori polynucleotides, a recombinant vector including any of the above polynucleotides, and a host cell transformed with any of these vectors.
  • the polynucleotides encode H. pylori-specific polypeptide antigens.
  • the corresponding coding sequences allow for the production of polypeptides which are useful, for example, as reagents in diagnostic tests and/or as components of vaccines.
  • Preferred polynucleotides are H. pylori antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA fragment identified by SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
  • Polynucleotides encoding these antigens are particularly preferred due to the high sensitivity and specificity exhibited by the resulting antigens.
  • polynucleotides contemplated by the invention are antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA fragment (typically at least about 18 nucleotides in length) spanning one of the following DNA fragment clusters corresponding to SEQ ID NOs:469-547:
  • a H. pylori polynucleotide is one that is capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24.
  • a H. pylori antigen coding polynucleotide is one that is capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95
  • a H. pylori antigen coding polynucleotide is composed of at least 18 contiguous nucleotides spanning a cluster region selected from the group consisting of SEQ ID NOs: 469-547.
  • a H. pylori antigen coding polynucleotide according to the invention is composed of at least 18 contiguous nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:
  • a H. pylori antigen coding polynucleotide is composed of at least 18 contiguous nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
  • the invention includes a H. pylori polypeptide antigen, in substantially purified form, characterized by immunoreactivity with H. pylori positive anti-sera.
  • An antigen in accordance with the invention is, in one embodiment, encoded by a polynucleotide that is typically at least 18 nucleotides in length, having the features described above. More specifically, the antigen is encoded by all or a portion of a polynucleotide sequence at least 18 nucleotides in length and capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of SEQ ID NOs:469-547.
  • Additional antigens according to the invention are encoded by H. pylori antigen coding sequences as described above.
  • an antigen is encoded by a polynucleotide at least 18 nucleotides in length and capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1
  • the invention encompasses a H. pylori antigen comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS: 340-468, where the antigen is in substantially purified form and is characterized by immunoreactivity with H. pylori positive anti-sera.
  • a H. pylori antigen comprises at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 2, 4, 5, 7, 9, 10, 12, 14, 17, 21, 25-28, 36, 37, 39, 44, 48, 55, 59, 61, 69, 249, 250, 252, 254, 256, 258, 260-263, 265-269, 323, 324, and 550-554.
  • a H. pylori antigen comprises at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 555-602.
  • a H. pylori antigen comprises a polypeptide sequence selected from the group consisting of SEQ ID NOs: 555-602.
  • a H. pylori antigen is one comprising at least 6 contiguous amino acids contained within a sequence selected from the group consisting of SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
  • kits for use in screening a biological fluid such as sera for the presence of anti-H. pylori antibodies.
  • the kit includes a substantially purified H. pylori antigen of the type characterized above that is immunoreactive with at least one anti-H. pylori antibody, and a reporter for detecting binding of the antibody to the antigen.
  • the polypeptide antigen may be attached to a solid support, and the kit may further include a non-attached reporter-labelled anti-human antibody, where binding of the anti-H. pylori antibodies to the polypeptide antigen can be detected by binding of the reporter-labelled antibody to the anti-H. pylori antibodies.
  • the kit includes at least two H. pylori antigens having different antibody specificities.
  • the invention includes a method of detecting H. pylori infection in a subject, or detecting the eradication of the bacteria in a previously infected subject. The method involves reacting a biological fluid sample from a subject with a purified H. pylori polypeptide antigen of the type described above, and examining the antigen for the presence of bound antibody.
  • Preferred antigens for use in the method correspond to one of the following polypeptide sequences or a contiguous region contained therein: SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
  • the invention includes a H. pylori vaccine composition containing a H. pylori polypeptide antigen of the type described above.
  • the antigen is characterized by its ability to reduce the level of H. pylori infection in a mammalian model system, such as a mouse or rhesus monkey challenged with the peptide, and then infected with H. pylori.
  • Preferred antigens for use in a vaccine are those which invoke a long-lasting antigenic response, as evidenced by the persistence of antibodies for an extended period of time subsequent to antimicrobial treatment.
  • Representative antigens for use in a vaccine composition are selected from the group consisting of SEQ ID NO:565 (Y139), SEQ ID NO:575 (Y146B), SEQ ID NO:555 (Y175A), SEQ ID NO:44 (A22), SEQ ID NO:569 (Y184A), SEQ ID NO:578 (Z9A), SEQ ID NO:557 (Y261A) and SEQ ID NO:575 (Y146B).
  • Fig. 1 is a computer scanned image of a Western blot of a 2-dimensional (2D) sodium dodecyl sulfate polyacrylamide gel electrophoretic (SDS-PAGE) analysis (high pH conditions) of native H. pylori antigens blotted with Roost pooled sera (H. pylori positive);
  • Fig. 2 is a computer scanned image illustrating a Western blot of a sodium dodecyl sulfate polyacrylamide gel 2-dimensional electrophoretic (SDS-PAGE) analysis (low pH conditions) of native H. pylori antigens blotted with Roost pool;
  • Fig. 3 shows a schematic representation of the amino acid translation of ORF3 of clone Y104-1.asm (299 amino acids). Regions of sequence indicated in the figure were confirmed by amino acid sequence analysis of the Y104-1.asm protein expressed in E. coli strain XL1-Blue, and correspond to
  • Fig. 4 is a schematic representation of the in vivo processing pathway of the 36 kD protein of H. pylori and its relationship to the "spot 15" antigen disclosed herein;
  • Fig. 5 is a reverse phase HPLC peptide profile corresponding to the "spot 15" antigen isolated from H. pylori (ATCC 43504) which illustrates the presence of numerous identifiable protein peaks;
  • Fig. 6 presents the amino acid sequence of the 36-kD protein of H. pylori, where underlining indicates the 28 kD protein region;
  • Fig. 7 is a schematic representation of the proteome methodology employed to identify several native antigens of H. pylori;
  • Fig. 8 is a graphical representation summarizing percent sensitivity of various clones against representative H. pylori-immunopositive sera panels;
  • Fig. 9 is a linear representation of the H. pylori genome, indicating the approximate positions of immunogenic cluster regions forming one aspect of the present invention.
  • Figs. 10-63 are linear maps indicating the relative positions of immunogenic subclones within the clusters (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18),
  • Figs. 64A-D present a summary of the immunogenic clone clusters of the present invention including (i) cluster number, (ii) clones defining the start and end regions of each cluster, coordinates of the cluster consensus region within the H. pylori genome, and expression data;
  • Fig. 65 is a graphical representation of comparative sensitivities of H. pylori recombinant antigens against various sera panels. Sensitivity values were calculated using various gold reference standards as described in Example 6B;
  • Fig. 66 is a graphical representation of comparative specificities of H. pylori recombinant antigens against various sera panels, computed against gold standards as indicated; and
  • Figs. 67A and 67B provide a tabular summary of (i) immunopositive clones forming the basis of the invention, (ii) their corresponding cluster numbers (indicating the relative position of each clone and cluster within the H. pylori genome), and relative (iii) sensitivity and (iv) specificity values.
  • a polypeptide sequence or fragment is "derived" from another polypeptide sequence or fragment when it has the same sequence of amino acid residues as the corresponding region of the fragment from which it is derived.
  • a polynucleotide sequence or fragment is "derived" from another polynucleotide sequence or fragment when it has the same sequence of nucleic acid residues as the corresponding region of the fragment from which it is derived.
  • a first polynucleotide fragment is "selectively-hybridizable" to a second polynucleotide fragment if the first fragment or its complement can form a double-stranded polynucleotide hybrid with the second fragment under selective (stringent) hybridization conditions.
  • the first and second fragments are typically at least 15 nucleotides in length, preferably at least 18-20 nucleotides in length.
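  • The strand relationship in the definition above can be sketched in code. The sketch below is only a sequence-level proxy and an assumption of this edit: it asks whether the first fragment, or its complement, contains an exactly complementary stretch of a chosen length (18 nucleotides by default) to the second fragment. Real hybridization depends on temperature, salt and tolerated mismatches, none of which is modeled here, and the function names are hypothetical.

```python
# Illustrative proxy only: exact complementarity over a window,
# not a model of hybridization under stringent conditions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_window(a: str, b: str, window: int) -> bool:
    """True if any `window`-long substring of `a` also occurs in `b`."""
    a, b = a.upper(), b.upper()
    return any(a[i:i + window] in b for i in range(len(a) - window + 1))


def may_hybridize(first: str, second: str, window: int = 18) -> bool:
    """True if `first` or its complement could pair exactly with `second`.

    `first` pairs with `second` when reverse_complement(first) shares a
    window with `second`; the complement of `first` pairs with `second`
    when `first` itself shares a window with `second`.
    """
    return (shares_window(reverse_complement(first), second, window)
            or shares_window(first, second, window))
```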
  • Selective (stringent) hybridization conditions are defined herein as hybridization at approximately 45 °C in approximately 1.
  • Two or more polynucleotide or polypeptide fragments have at least a given percent "sequence identity" if their nucleotide bases or amino acid residues are identical, respectively, in at least the specified percent of total base or residue positions, when the two or more fragments are aligned such that they correspond to one another using a computer program such as ALIGN.
  • The ALIGN program is found in the FASTA version 1.7 suite of sequence comparison programs (Pearson and Lipman, 1988; Pearson, 1990).
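  • A minimal sketch of the percent identity calculation, assuming the two fragments have already been aligned (for example by ALIGN) and are supplied as equal-length strings with "-" marking gaps. The function name, the gap symbol, and the choice of total aligned positions as the denominator are assumptions of this sketch; it does not reimplement the ALIGN program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned positions at which both sequences carry the same residue.

    Both inputs must already be aligned and of equal length.
    Positions where either sequence has a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)
```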
  • H. pylori polynucleotide refers to a polynucleotide sequence derived from the genome of H. pylori and variants thereof.
  • H. pylori polynucleotides of the type disclosed herein encode H. pylori polypeptide antigens, where the resulting antigen is characterized by immunoreactivity with H. pylori positive anti-sera.
  • a H. pylori polynucleotide of the invention will be at least about 18 or more nucleotides in length (i.e., encoding a peptide antigen of 6 amino acids).
  • the H. pylori polynucleotide will be at least about 24 nucleotides in length (i.e., encoding a peptide antigen of 8 amino acids). In yet another embodiment, the H. pylori polynucleotide will be at least about 30 nucleotides in length. In some instances, the H. pylori polynucleotide will range from about 45 to 75 nucleotides in length, but may of course be longer.
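  • The length figures above follow from the triplet code: each amino acid is encoded by three nucleotides. The short sketch below makes the arithmetic explicit, assuming an in-frame coding fragment with no stop codon; the function name is hypothetical.

```python
def encoded_residues(nucleotides: int) -> int:
    """Number of complete codons (amino acids) in an in-frame fragment."""
    if nucleotides < 0:
        raise ValueError("length must be non-negative")
    return nucleotides // 3


# Lengths mentioned in the text and the peptide lengths they imply.
for nt in (18, 24, 30, 45, 75):
    print(nt, "nt ->", encoded_residues(nt), "amino acids")
```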
  • the polynucleotides of the invention may be obtained from natural or synthetic sources, or, may be prepared recombinantly.
  • the polynucleotide sequence may be a naturally-occurring sequence, or it may be related by mutation, including single or multiple base substitutions, by deletions, by insertions and inversions, to a particular naturally-occurring sequence, provided that the subject polynucleotide is capable of expressing a H. pylori antigen as described herein.
  • the polynucleotide sequence may optionally contain expression control sequences (typically from a heterologous source) positioned adjacent to the coding region.
  • nucleotide sequences described herein are meant to encompass variants possessing essentially the same "sequence identity" as defined above. Nucleotide sequences having essentially the same sequence identity are typically selectively hybridizable to one another under selective (stringent) hybridization conditions. This is to say, a nucleic acid fragment is considered to be selectively hybridizable to a H. pylori polynucleotide if it is capable of specifically hybridizing to the H. pylori polynucleotide sequence or a variant thereof (e.g., a probe that hybridizes to a H. pylori polynucleotide but not to polynucleotides from other members of the Helicobacter family) under stringent hybridization and wash conditions.
  • H. pylori antigenic polypeptide is meant to encompass immunoreactive variants of the polypeptide, or regions, or parts thereof, provided that the variant is immunogenic.
  • a suitable variant is defined as any polypeptide having a sequence that is identical (i.e., shares sequence identity) to that of a H. pylori polypeptide.
  • an antigenic polypeptide that is essentially identical to a H. pylori polypeptide antigen is (i) encoded by a nucleic acid that selectively hybridizes to sequences of H. pylori or its variants or (ii) is encoded by H. pylori or its variants.
  • a sequence comparison may also be employed for the purpose of determining "polypeptide homology", e.g., by using the local alignment program LALIGN.
  • a polypeptide sequence is typically compared against a selected H. pylori amino acid sequence or any of its variants, as defined above, using the LALIGN program with a ktup of 1, default parameters and the default PAM.
  • Any polypeptide with an optimal alignment longer than about 6 to 8 amino acids and greater than 70%, or more preferably 75% to 80% of identically aligned amino acids is considered to be a "homologous polypeptide."
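  • The homology criterion above combines a minimum alignment length with a minimum fraction of identical positions. A minimal sketch, assuming the alignment length and identical-position count have already been taken from an LALIGN-style report; the function name and parameter defaults (the lower bounds of the stated ranges) are assumptions of this sketch.

```python
def is_homologous(alignment_length: int,
                  identical_positions: int,
                  min_length: int = 6,
                  min_identity_pct: float = 70.0) -> bool:
    """Apply the length and identity thresholds to one optimal alignment."""
    if alignment_length <= 0:
        raise ValueError("alignment_length must be positive")
    if not 0 <= identical_positions <= alignment_length:
        raise ValueError("identical_positions out of range")
    identity_pct = 100.0 * identical_positions / alignment_length
    # Both conditions are strict, following "longer than" and "greater than".
    return alignment_length > min_length and identity_pct > min_identity_pct
```

  • Stricter settings, such as min_length=8 with min_identity_pct=75.0, correspond to the more preferred ranges given in the text.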
  • the LALIGN program is found in the FASTA version 1.7 suite of sequence comparison programs (Pearson, et al., 1988; Pearson, 1990; program available from William R. Pearson, Department of Biological Chemistry, Box 440, Jordan Hall, Charlottesville, VA). Sequence variations among antigens will depend upon a number of factors, such as the strain of H. pylori, the location of the gene or gene family encoding the antigen within the H.
  • An immunogenic polypeptide or polypeptide "fragment" is one that is (i) encoded by an open reading frame of a H. pylori polynucleotide, or (ii) displays sequence identity to H. pylori polypeptides as defined above, and is immunoreactive with a H. pylori immunopositive sample, such as sera.
  • the immunogenic fragment will comprise at least about 6 to 8 amino acids, and preferably at least about 10 to 12 contiguous amino acid residues of a particular antigen.
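  • Candidate fragments of a given minimum length can be enumerated with a sliding window over an antigen sequence. The sketch below lists every contiguous window of 6 residues by default; the example sequence in the usage note is hypothetical and is not taken from the Sequence Listing, and enumeration says nothing about which windows are actually immunoreactive.

```python
from typing import List


def contiguous_fragments(sequence: str, length: int = 6) -> List[str]:
    """Return every contiguous window of `length` residues, in order."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length]
            for i in range(len(sequence) - length + 1)]
```

  • For a hypothetical 10-residue sequence such as "MKKISRKEYV", the call contiguous_fragments("MKKISRKEYV") returns five 6-residue windows; a sequence shorter than the window returns an empty list.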
  • By "immunoreactive variant of a H. pylori polypeptide antigen" is meant an amino acid substitution, deletion, and/or addition variant of a particular H. pylori polypeptide antigen sequence disclosed herein, having substantially the same or increased binding affinity to a given antibody as the particular polypeptide antigen, as determined by conventional methods, e.g., a competition assay or a two antibody sandwich assay.
  • a representative H. pylori antigen is composed of at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:340-468, where the antigen is in substantially purified form and is characterized by immunoreactivity with H. pylori positive anti-sera.
  • Cluster sequences correspond to regions within the H. pylori genome encoding highly immunogenic polypeptides. The cluster sequences were determined on the basis of DNA sequence information for the over 250 immunogenic clones described herein. As a result, 69 unique clusters were identified, and antigens in accordance with the invention are those encoded by a contiguous series of nucleotides contained within any of the 69 clusters. Representative antigen coding sequences falling within each of the clusters are summarized in Figs. 64A-D. The cluster regions are referred to herein as clusters 1 to 69, corresponding to SEQ ID NOs: 340-468.
  • each of the cluster regions within the H. pylori genome is illustrated pictorially in Fig. 9.
  • a cluster is defined by various regions within the H. pylori genome rather than by a single sequence.
  • a particular cluster may be defined by a "start" sequence, an "end" sequence, and perhaps an intervening "middle" sequence, and thus may correspond to more than one sequence contained in the Sequence Listing.
  • the descriptor for the cluster sequence will indicate its relationship to a given cluster (e.g., start, middle, end).
  • a defining cluster sequence will code for one or more antigens, and the remainder of the sequence for a particular cluster, if not explicitly provided, can be readily determined based upon the information provided herein, when considered along with the information, e.g., provided in Tomb, et al, 1997.
  • Cluster sequences defined by more than one representative sequence are cluster 1 (SEQ ID NOs:469, 470), cluster 5 (SEQ ID NOs:474, 475), cluster 7 (SEQ ID NO:477, end), cluster 15 (SEQ ID NOs:485, 486), cluster 35 (SEQ ID NOs:506-508), cluster 40 (SEQ ID NOs:513, 514), cluster 41 (SEQ ID NO:515, end), cluster 43 (SEQ ID NOs:517, 518), cluster 47 (SEQ ID NOs:522, 523), cluster 49 (SEQ ID NO:525, end), cluster 58 (SEQ ID NOs:534, 535), and cluster 59 (SEQ ID NOs:536, 537).
  • "substantially purified" and "in substantially purified form" are used in several contexts and typically refer to at least partial purification of a H. pylori polynucleotide or polypeptide away from unrelated or contaminating components (e.g., serum, cells, proteins, non-H. pylori polynucleotides, etc.) by at least one purification or isolation step. Methods and procedures for the isolation or purification of compounds or components of interest are described herein (e.g., SDS-PAGE, affinity purification of fusion proteins, blotting, and recombinant production of H. pylori polypeptides).
  • An antigen is "specifically immunoreactive" with H. pylori positive anti-sera or a biological fluid sample when, under optimal conditions, the antigen binds to antibodies present in the H. pylori infected sample but does not bind to antibodies present in the majority (greater than about 60 to 65%, preferably from about 70% to 80%, even more preferably greater than about 85%) of fluid samples from subjects who are not or have not been infected with H. pylori.
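  • The threshold in this definition amounts to a specificity floor: the share of samples from uninfected subjects in which the antigen shows no antibody binding. A minimal sketch follows, with hypothetical function names and a default floor of 60% taken from the lower end of the stated range; any sample counts passed in are illustrative and are not data from the specification.

```python
def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("counts out of range")
    return 100.0 * part / whole


def meets_specificity_floor(nonreactive_uninfected: int,
                            total_uninfected: int,
                            floor_pct: float = 60.0) -> bool:
    """True if the antigen is non-reactive in more than `floor_pct` percent
    of samples from subjects who are not, or have not been, infected."""
    return percent(nonreactive_uninfected, total_uninfected) > floor_pct
```

  • The same percent helper applied to reactive samples from infected subjects gives the sensitivity figure of the kind summarized in Figs. 8 and 65.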
  • "Specifically immunoreactive" antigens may be immunoreactive with monoclonal or polyclonal antibodies generated against specific H. pylori antigens.
  • By "biological fluid" is meant any fluid derived from the body of a mammal, particularly a human.
  • Representative biological fluids include blood, serum, plasma, urine, faeces, mucus, gastric secretions, dental plaques, or saliva.
  • Immunologically effective amount refers to an amount administered to a mammalian host, either as a single dose or as part of a series, that is effective for treatment or prevention of infection by H. pylori.
  • the amount will vary depending upon the health and physical condition of the subject to be treated, the capacity of the subject's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, and the like. Such an amount will typically fall within a relatively broad range, and can be determined in routine trials.
  • the present invention is based on the identification and isolation of a number of highly immunogenic H. pylori polypeptides, resulting from the screening of over a million individual H. pylori compositions.
  • the antigens of the present invention were either produced recombinantly, or, were separated from a mixture of soluble proteins obtained from pelleted and lysed H. pylori.
  • disparate sequence information corresponding to all of the disclosed immunogenic clones was compiled to provide a collection of heretofore unrecognized antigenic cluster regions contained within the genome of H. pylori.
  • the H. pylori antigens of the invention can be obtained from phage libraries using conventional screening methods described below. Unless otherwise stated, the DNA lambda libraries described herein have been deposited in the Genelabs Technologies, Inc. Culture Collection, 505 Penobscot Drive, Redwood City, CA, 94063, or in the Genelabs Diagnostics PTE LTD Culture Collection, 85 Science Park Drive #04-01, The Cavendish, Singapore Science Park, Singapore 118259.
  • E. coli XL-1 Blue MRF plasmids containing inserts corresponding to the following H. pylori clones were accepted for deposit by the American Type Culture Collection, 12301 Parklawn Drive, Rockville MD 20852 on September 9, 1997 and assigned the following designation numbers: clone dHClS.
  • the antigen-encoding DNA fragments of the invention can be identified by (i) immunoscreening, as described above, and/or (ii) computer analysis of coding sequences using an algorithm (such as "ANTIGEN," Intelligenetics, Mountain View, CA) to identify potential antigenic regions.
  • an antigen-encoding DNA fragment is subcloned, and the subcloned insert is then fragmented by partial DNase digestion to generate random fragments or by specific restriction endonuclease digestion to produce specific subfragments.
  • the resulting DNA fragments are then inserted into an expression vector, such as the lambda gt11 vector, and subjected to immunoscreening in order to provide an epitope map of the cloned insert.
  • DNA fragments of the type described herein can be employed as probes in hybridization experiments to identify overlapping H. pylori sequences, and these in turn can be further used as probes to identify additional sets of contiguous clones.
  • any of the herein-described clone sequences can be used to probe a DNA library, generated in a vector such as lambda gt10 or "LAMBDA ZAP II" (Stratagene, La Jolla, CA). Specific subfragments of known sequence may be isolated using the polymerase chain reaction or after restriction endonuclease cleavage of vectors carrying such sequences. The resulting DNA fragments can be used as radiolabelled probes against any selected library. In particular, the 5' and 3' terminal sequences of the clone inserts are useful as probes to identify additional clones.
  • sequences provided by the 5' end of cloned inserts are useful as sequence specific primers in first-strand DNA synthesis reactions (Maniatis, et al, 1982; Scharf, et al, 1986).
  • specifically primed H. pylori DNA libraries can be prepared by using specific primers derived from one of the cloned DNA sequences described herein as a template.
  • the second-strand of the new DNA is synthesized using RNase H and DNA polymerase I.
  • the above procedures identify or produce DNA molecules corresponding to nucleic acid regions that are 5' adjacent to the known clone insert sequences. These newly isolated sequences can in turn be used to identify further flanking sequences, and so on.
  • the polynucleotides can be cloned and immunoscreened to identify specific sequences encoding H. pylori antigens.
  • DNA libraries were prepared from commercially available strains of H. pylori (American Type Culture Collection, Rockville, MD; ATCC Designation No. 43504, ATCC Designation No. 43526) in either the expression vector lambda gt11 or "ZAPII" (Stratagene, La Jolla, CA; Example 1). Polynucleotide sequences were then selected for the expression of peptides which were immunoreactive with pooled sera obtained from 11 patients identified by endoscopy as H. pylori-positive, herein identified as "Roost pool" sera, or another pool of 4 H. pylori-immunopositive sera samples identified as "SFA001".
  • the samples may be individual samples, i.e. , derived from a single subject, or may be pooled samples, such as those described herein.
  • Recombinant proteins identified by this approach provided candidates for polypeptides that can serve, either singly or in combination, as substrates in diagnostic tests for detecting infection by H. pylori. Further, the corresponding nucleic acid coding sequences serve as useful hybridization probes for the identification of additional H. pylori antigen coding sequences.
  • the H. pylori strains described above were used to generate DNA libraries in lambda vectors (Example 1).
  • Other commonly available strains of H. pylori include, for example, H. pylori samples identified as strain # 29995 and J-170. Alternatively, libraries can be constructed from H. pylori isolated from a sample confirmed as H. pylori-positive. In the method illustrated in Example 1, the libraries were generated from genomic DNA isolated from the pelleted bacteria. Alternatively, centrifugation can be used to pellet bacteria from infected biological specimens such as gastric mucosa.
  • H. pylori DNA libraries were generated using DNase-digested genomic DNA fragments isolated from H. pylori as starting material.
  • the resulting molecules were ligated to Sequence Independent Single Primer Amplification (SISPA; Reyes, et al, 1991) linker primers and expanded in a non-selective manner, and then cloned into a suitable vector, for example, lambda gt11 or ZAPII, for expression and screening of peptide antigens.
  • the libraries disclosed herein have been designated as the "short antigen clone" library, typically designated by a prefix beginning with the letters Y or Z; and the "long antigen clone" library, designated as libraries 1 and 2.
  • the ZAPII libraries 1 and 2 were similarly constructed, with the following exceptions.
  • the ZAPII libraries 1 and 2 were generated from longer H. pylori DNA fragments, i.e., either EcoRI or HindIII-digested genomic DNA which had not undergone Sequence Independent Single Primer Amplification.
  • Library 1 clones (designated herein by upper case letters) were generated by ligating EcoRI-digested H. pylori DNA directly into the EcoRI sites of the lambda "ZAPII" vector.
  • Library 2 clones (lower case designations) were obtained by digesting H. pylori genomic DNA with HindIII, then blunt ended with ⁇ .
  • Lambda gt11 is a particularly useful expression vector for producing H. pylori antigens.
  • the vector contains a unique EcoRI insertion site, located 53 base pairs upstream of the translation termination codon of the β-galactosidase gene.
  • an inserted sequence is expressed as a β-galactosidase fusion protein which contains the N-terminal portion of the β-galactosidase gene product, the heterologous peptide, and optionally the C-terminal region of the β-galactosidase peptide (the C-terminal portion being expressed when the heterologous peptide coding sequence does not contain a translation termination codon).
  • the lambda gt11 vector also produces a temperature-sensitive repressor (cI857) which causes viral lysogeny at permissive temperatures, e.g., 32°C, and leads to viral lysis at elevated temperatures, e.g., 42°C.
  • Advantages of lambda gt11 include: (1) highly efficient recombinant clone generation, (2) ability to select lysogenized host cells on the basis of host-cell growth at permissive, but not non-permissive, temperatures, and (3) production of recombinant fusion protein.
  • Since phage containing a heterologous insert produce an inactive β-galactosidase enzyme, phage with inserts are typically identified using a colorimetric substrate conversion reaction employing β-galactosidase.
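The dependence of the fusion-protein layout on a termination codon within the insert can be illustrated with a short sketch. The function names are hypothetical, and the model is deliberately simplified: it only looks for an in-frame stop codon in the insert and does not check whether the reading frame is preserved across the downstream vector junction.

```python
# Simplified, illustrative model of a lambda gt11-style fusion.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(insert: str, frame: int = 0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = insert.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def fusion_layout(insert: str, frame: int = 0) -> list:
    """List the expected segments of the fusion protein, N- to C-terminal."""
    parts = ["beta-gal N-terminal portion", "heterologous peptide"]
    if first_in_frame_stop(insert, frame) is None:
        # No termination codon in the insert: translation may continue
        # into the downstream beta-galactosidase coding region.
        parts.append("beta-gal C-terminal region")
    return parts
```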
  • E. coli expression vectors are useful for expression of antigens.
  • Alternative microbial hosts suitable for expression include bacilli, such as B. subtilis, and other Enterobacteriaceae, such as Salmonella, Serratia, and various Pseudomonas species.
  • the expression vectors will typically contain control sequences compatible with the host cell.
  • Other known promoter sequences may be present in the expression vector, such as the lactose promoter system, a tryptophan promoter system, or a promoter system from phage lambda.
  • An amino terminal methionine can be provided, if necessary, by insertion of a Met codon in-frame with the antigen.
  • the carboxy terminal extension of an antigen can be removed by conventional mutagenesis procedures.
  • yeast expression systems can be used, such as the Saccharomyces cerevisiae pre-pro-alpha-factor leader region used to direct protein expression from yeast.
  • the antigen coding sequence can be fused in frame to the leader region. This construct is then typically put under the control of a strong transcription promoter, such as the alcohol dehydrogenase I promoter or a glycolytic promoter. The antigen sequence is followed by a translation termination codon which is followed by transcription termination signals.
  • the antigen coding sequences can be fused to a second protein coding sequence, such as β-galactosidase, used to facilitate purification of the fusion protein by affinity chromatography.
  • protease cleavage sites may be inserted to facilitate separation of the fusion protein components.
  • mammalian cells can be used for expression of the antigens of the invention.
  • Vectors useful for expression in mammalian cells are typically characterized by insertion of the antigen coding sequence between a strong viral promoter and a polyadenylation (poly A) signal.
  • the vectors may optionally include selectable marker genes, such as those conferring antibiotic resistance.
  • Suitable host cells include Chinese hamster ovary (CHO) cell lines, HeLa cells, myeloma cells, Jurkat cell lines, and the like.
  • the expression vectors for these cells may include expression control sequences, such as an origin of replication, a promoter, an enhancer, information processing sites, e.g., ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences.
  • the DNA sequences are expressed in hosts after the sequences of interest have been operably linked to (i.e., positioned to enable the functioning of) an expression control sequence.
  • Example 1 describes the preparation of a DNA library for H. pylori (ATCC No. 43504).
  • the library was immunoscreened using H. pylori-positive pooled sera (Example 2).
  • a number of lambda clones were identified which were immunoreactive with anti-H. pylori antibodies present in the pooled sera.
  • Selected immunopositive clones were plaque-purified and their immunoreactivity retested.
  • the immunoreactivity of the clones with normal human sera (control, H. pylori-negative) was also tested.
  • Numerous clones were identified by immunoscreening and further characterized, as described in Examples 3 and 4.
  • Immunoreactive clones further described in Example 3 and referred to herein as the Y and Z families of clones include clone Y-104-1, and the clones summarized in Table 2, corresponding to SEQ ID NOs: 70-230.
  • the DNA inserts of the immunoreactive recombinant lambda clones were PCR amplified, using primers corresponding to lambda arm sequences flanking the EcoRI cloning site of the vectors, and utilizing each immunoreactive clone as template.
  • for gt11 clones, the gt11F (SEQ ID NO:65) and gt11R (SEQ ID NO:66) primers were used.
  • T3 (SEQ ID NO:67)
  • T7 (SEQ ID NO:68)
  • the resulting amplification products were agarose gel purified and eluted from the gel (Ausubel, et al, 1988) to remove primers and other components.
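Amplification of an insert with primers matching the flanking vector arms can be modelled in silico as below. This is a sketch under stated assumptions: exact primer matches only, a linear template, and placeholder primer sequences in any usage, since the actual primer sequences (SEQ ID NOs:65-68) are not reproduced in this passage.

```python
# Illustrative in-silico PCR; exact-match primers on a linear template.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplify(template: str, forward_primer: str, reverse_primer: str):
    """Return the amplicon bounded by the two primers, or None.

    Both primers are given 5'->3'. The forward primer must match the top
    strand; the reverse primer anneals to the top strand, so its reverse
    complement is what appears in the top-strand sequence.
    """
    top = template.upper()
    fwd = forward_primer.upper()
    rev_site = reverse_complement(reverse_primer.upper())
    start = top.find(fwd)
    if start == -1:
        return None
    end = top.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return top[start:end + len(rev_site)]
```

Trimming the primer lengths from either end of the returned amplicon would give the insert sequence itself.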
  • the purified insert DNA was then subjected to direct sequencing. In some cases, the insert DNA was first subcloned into the TA cloning vector (Invitrogen, San Diego, CA) and then sequenced. Clones exhibiting immunoreactivity against
  • the genomic clones were digested by restriction enzyme treatment, and the resulting subfragments were inserted into a suitable expression vector.
  • the resulting subclones, containing the specific digested DNA fragments, were then screened for immunogenicity. Clones identified as immunoreactive towards H. pylori-positive pooled sera were plaque purified. Plasmid DNA containing inserts obtained by recovery from phage was sequenced as described above and also in Example 4.
  • H. pylori Antigens ZAPII Libraries 1 and 2.
  • Library 1 and library 2 clones isolated by immunoscreening with H. pylori immunopositive pooled sera include the following: LIBRARY 1: A3, A22, B2, B9, B17, B23, C1, C3, C7, DH, and LIBRARY 2: a1, a3, a5, b5, b8c7, c2, c5, c13, d5, d6, d11, d7, e6, f3, f8, f11, g2, g9, g11, k4.
  • Clone Gla was isolated by screening with monoclonal antibody 1G6 from Biogenesis, Inc. (Bournemouth, England). Sequence data for these clones is presented in the sequence listing herein.
  • Clone d7 and its Relationship to the 36 kD Protein of H. pylori.
  • Clone d7, another immunoreactive clone of the present invention, is a HindIII clone that was blunt-ended, ligated to A/B linkers, digested with EcoRI, and subcloned into lambda vector ZAPII to produce a beta-galactosidase fusion protein.
  • the nucleotide sequence is presented as SEQ ID NO:11. Polynucleotides and polypeptides derived from this clone represent preferred embodiments of the invention.
  • this clone codes for about 70% of the carboxy end of the 36 kD native protein of H. pylori, while clone Y104 codes for the entire 36 kD protein.
  • portions of the amino acid sequence of the native 36 kD protein of H. pylori have been determined.
  • the 36 kD protein (encoded in part by clone d7) appears to be a precursor to a highly antigenic H. pylori protein referred to herein as the "spot 15" protein (Examples 8 and 9, to be described in more detail below).
  • the native H. pylori translation product, a 34 kD protein (i.e., calculated molecular weight) composed of 299 amino acids, appears to be cleaved in vivo at amino acid position 23. This results in a 31.6 kD cleavage product commonly referred to as "the 36 kD protein" of H. pylori.
  • the differences in molecular weight terminology in referring to this protein arise due to the following: 36-kD is observed from SDS-PAGE; 34-kD is calculated from the corresponding DNA sequence (Fig. 6); and 31 kD is determined from experimental sequence data, as determined starting from residue 23 of SEQ ID NO:60.
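The distinction between a molecular weight calculated from the full translation product and one calculated from the chain beginning at a given residue can be sketched as follows. The table holds standard average residue masses; the function names are hypothetical, and no sequence from the sequence listing is reproduced here, so any input is a placeholder.

```python
# Standard average residue masses in daltons (amino acid minus water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def average_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified linear polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mass_from_residue(sequence: str, first_residue: int) -> float:
    """Mass of the chain that begins at 1-based position first_residue."""
    return average_mass(sequence[first_residue - 1:])
```

Calling `average_mass(seq)` and `mass_from_residue(seq, 23)` on the same precursor sequence would give the two calculated values being contrasted; an SDS-PAGE apparent mass is an experimental observation and is not modelled here.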
  • a post-translational modification of the 36 kD protein, i.e., acetylation at the amino terminus (position 23) and cleavage at the carboxy end, results in the "spot 15" antigen, having a molecular weight of 28 kD.
  • "X"s correspond to positions where deamidations (i.e., of asparagine or glutamine) or point mutations have occurred.
  • Further proposed modifications leading to the minor spot 15 protein are also indicated.
  • the antigenic polypeptide corresponding to spot 15, and sequences coding for this protein, e.g., clones Y-104 and d7, represent one particularly preferred embodiment of the present invention. This is due to the strong antigenicity of the peptide.
  • the spot 15 peptide has been detected in various strains of H. pylori, as indicated in Table 9 and in Example 10.
  • The antigenic sequences described herein were determined based upon the screening of over a million discrete H. pylori antigenic compositions. DNA sequencing was carried out for H. pylori immunopositive clones, and open reading frames coding for antigenic proteins were identified. Nucleic acid sequences coding for the thus-identified antigens were then inserted into expression vectors and expression of the desired antigenic protein was confirmed. As a result of this work, over 250 antigenic clone sequences have been identified and are disclosed herein.
  • the antigenic polynucleotide fragments isolated herein map into 69 clusters which are identified herein as clusters (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53), (54), (55), (56), (57), (58), (59), (60), (61), (62), (63), (64), (65), (66), (67), (68), and (69).
  • The position of these antigenic clusters within the complete H. pylori genome is shown in Fig. 9. As can be seen from Fig. 9, the locations of the clusters within the genome (and clones within the clusters) are highly random, and represent only a small portion (i.e., approximately 2-3%) of the overall nucleotides contained within the entire genome. Prior to this work, a comprehensive guide to the unique and highly antigenic regions within the genome of H. pylori was unknown.
  • For the cluster analysis, all of the immunoclone sequences were combined into a FASTA database as defined by Pearson and Lipman (1988). This database was converted to a BLAST database for high speed searching (Altschul, et al, 1990). The sequence of each immunoclone was then searched against this database using the program BLASTN to define clusters. Clusters are assemblies of clones that share identical sequences with other clones in the group. The sequences in each group or cluster were then combined in separate database files and formatted for entry in the GEL program of IG-Suite (Oxford Molecular). GEL then assembled the sequences and suggested a consensus sequence with ambiguities. A non-ambiguous sequence was determined for each cluster by editing the consensus sequence.
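The grouping step of the cluster analysis can be sketched as a connected-components problem: clones joined by a pairwise sequence hit end up in the same cluster. The sketch below assumes the pairwise hits have already been produced by a similarity search; the function name and the clone identifiers in any usage are hypothetical, and this is not presented as the procedure actually used.

```python
def cluster_clones(clone_ids, hit_pairs):
    """Group clones into clusters by single linkage over pairwise hits.

    clone_ids: iterable of clone names.
    hit_pairs: iterable of (clone_a, clone_b) pairs that share sequence.
    Returns a list of sorted clusters, ordered by their first member.
    """
    clone_ids = list(clone_ids)
    parent = {c: c for c in clone_ids}

    def find(x):
        # Follow parent links to the cluster representative, shortening
        # the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hit_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for c in clone_ids:
        groups.setdefault(find(c), []).append(c)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```

A clone with no hits to any other clone remains as a cluster of one, which matches the single-clone clusters mentioned in the text.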
  • Clusters 1 to 69 correspond to SEQ ID NOs:469-547. Open reading frames for the antigens were then determined from the cluster consensus sequence. Single clone sequences were translated directly to provide antigen sequences. Antigenic regions contained within each of the clusters as determined by translation of cluster open reading frames are provided as SEQ ID NOs:340-468.
  • Figs. 10 to 63 are linear maps indicating the relative positions of immunogenic subclones within the clusters (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (23), (25), (27), (28), (29), (30), (32), (33), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (53), (54), (58), (59), (61), (62), (68), and (69).
  • Clusters not shown on the linear maps are clusters defined by one clone, i.e., clusters (22), (24), (26), (31), (34), (52), (55), (56), (57), (60), (63), (64), (65), (66), (67).
  • Figs. 64A-D present a tabular summary of clusters 1-69, clones contained within each cluster, and coordinates of each cluster consensus region within the H. pylori genome.
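Determining open reading frames from a cluster consensus sequence, as described above, can be sketched as a six-frame translation followed by a search for stop-free stretches. The sketch uses the standard codon assignments; the function names and the minimum-length threshold are assumptions. Because clones are fragments, no start codon is required.

```python
# Standard codon table built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def open_frames(seq: str, min_aa: int = 50) -> list:
    """Return (strand, frame, peptide) for stop-free stretches >= min_aa."""
    results = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            for peptide in translate(s[frame:]).split("*"):
                if len(peptide) >= min_aa:
                    results.append((strand, frame, peptide))
    return results
```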
  • H. pylori possesses a circular genome of 1,667,867 base pairs and 1590 predicted coding sequences.
  • the genome sequence reported on the TIGR Web site was used as a reference for reporting nucleotide positions of the clusters and immunogenic clones of the invention. Based upon the present work, it can be seen that a relatively small number of the predicted open reading frames reported for the H. pylori genome encode the antigenic proteins or cluster regions forming the basis of the present invention.
  • Each cluster defines a continuous DNA sequence that spans, i.e., extends, from the 5' end of the most upstream clone in the cluster to the 3' end of the most downstream clone in the cluster.
  • Where the spanning sequence is incomplete, it has been filled in with sequence from the reported H. pylori genomic sequences (TIGR Web site).
  • the sequence defined by cluster 1 includes the sequence beginning at the 5' end of clone Y92 (SEQ ID NO:222) and ending at the 3' end of clone Y92 (SEQ ID NO:223), including the short (about 730 bases) genomic sequence connecting the two clone sequences.
  • the positions of the individual clones in each cluster are shown in Figs. 10-63, along with corresponding SEQ ID NOs.
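The notion of a cluster's spanning sequence, and the fraction of the genome that the clusters occupy, can be sketched from clone coordinates. The genome length is the figure given in the text; the function names are hypothetical, coordinates are assumed to be 1-based and inclusive on a single strand, and wrap-around across the origin of the circular genome is ignored for simplicity.

```python
GENOME_LENGTH = 1_667_867  # base pairs, as stated in the text


def cluster_span(clone_coords):
    """Span from the most upstream 5' end to the most downstream 3' end.

    clone_coords: list of (start, end) genome coordinates for one cluster.
    """
    return (
        min(start for start, _ in clone_coords),
        max(end for _, end in clone_coords),
    )


def coverage_fraction(spans, genome_length=GENOME_LENGTH):
    """Fraction of the genome covered by spans, counting overlaps once."""
    covered = 0
    last_end = 0
    for start, end in sorted(spans):
        start = max(start, last_end + 1)  # skip bases already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / genome_length
```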
  • the invention includes antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA fragment spanning one of the DNA fragment clusters: (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53), (54), (55), (56), (57), (58), (59), (60), (61), (62), (63), (64), (65), (66), (67), (68), and (69).
  • the antigen-coding regions are the spanning sequences themselves from the clusters.
  • Preferred polynucleotides are H. pylori antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
  • the clusters disclosed herein provide a non-random collection of H. pylori antigen coding sequences and resulting antigens which have been shown to react with H. pylori immunopositive samples and are useful in a variety of diagnostic applications.
  • An overview of the proteome methodology disclosed herein and utilized to separate and identify over twenty H. pylori antigens is provided in Fig. 7 (right-hand side).
  • bacterial proteins are separated by two-dimensional (2D) electrophoresis.
  • Immunoreactive spots (i.e., reactive in Western blots with pooled sera from H. pylori infected patients) are then selected and subsequently characterized by endoproteolytic digestion, chromatography, and mass spectrometry (e.g., matrix-assisted laser desorption ionization time-of-flight mass spectrometry, MALDI-TOF).
  • Box A indicates the use of electrospray mass spectrometry to determine the mass of the intact protein.
  • Box B indicates the use of MALDI-TOF mass spectrometry to evaluate the number and mass of Lys-C peptides.
  • Box C indicates the use of MALDI-MS to evaluate chromatographically separated Lys-C peptides and provide sequence information by post source decay (when peptides are pure).
  • Box D indicates the use of electrospray mass spectrometry to evaluate and sequence peptides through collision induced dissociation.
  • Antigens are generally obtained from whole lysates as follows. Fractionation of H. pylori soluble proteins is carried out by SDS-PAGE, preferably utilizing 2-dimensional electrophoresis (O'Farrell, 1975; O'Farrell, et al, 1977). In performing a typical 2-dimensional electrophoretic separation, an isoelectric focusing gel separation is first carried out (first dimension gel). Isoelectric focusing is carried out, e.g., on an acrylamide/bisacrylamide gel, for an appropriate number of volt-hrs, determined as described in "CURRENT PROTOCOLS IN PROTEIN SCIENCE", Units 10.46, John Wiley and Sons, Inc., New York (1996).
  • the final tube gel pH gradient is then measured by a surface pH electrode.
  • Protein components in a sample mixture undergo a first separation according to pI value, as indicated on the horizontal axes of Figs. 1 and 2.
  • the first dimensional separation can be carried out under either acidic (Fig. 2) or basic (Fig. 1) conditions, depending upon the nature of the proteins to be separated.
  • a second dimensional, size-based polyacrylamide slab gel separation is then carried out, to further separate the proteins on the basis of molecular weight.
  • Suitable molecular weight markers are typically added to the gel, for determining corresponding molecular weights of eluted proteins.
  • Detection is typically carried out by Coomassie blue staining or by silver staining. Silver staining may be preferred, in some cases, since silver staining methods are considerably more sensitive and can be used to detect smaller amounts of protein.
  • Figs. 1 and 2 illustrate H. pylori antigens obtained generally as described above, which have been Western blotted with Roost H. pylori-positive serum pool and a negative serum pool, respectively. The numbers in each figure correspond to spots representing H.
  • immunoreactive spots were excised from the gels, digested, and sequenced.
  • the native H. pylori antigens were further characterized by a combination of sequencing methodologies, including N-terminal sequencing, liquid chromatography-mass spectrometry, and determination of internal sequences by amino-acid specific chemical cleavage, followed by Edman sequencing (Example 9).
  • Internal amino acid sequences can be determined by utilizing a combination of various site-specific cleavage reagents, such as ortho-phthalaldehyde (OPA)/cyanogen bromide (CNBr), hydroxylamine, formic acid, and BNPS-skatole, which cleave as follows: CNBr (cleaves at C-terminus of methionine), BNPS-skatole (cleaves at C-terminus of tryptophan), formic acid (cleaves at Asp-Pro peptide bond), hydroxylamine (cleaves at Asn-Gly bond), and OPA, which distinguishes between secondary and primary amines; and enzymatic reagents such as Endoproteinase Lys-C, Endoproteinase Asp-N, Endoproteinase Glu-C.
  • OPA: ortho-phthalaldehyde
  • CNBr: cyanogen bromide
  • CNBr: cleaves at C-terminus of methionine
  • Endoproteinase Lys-C: cleaves at C-terminus of lysine
  • Endoproteinase Asp-N: cleaves at N-terminus of aspartic acid and cysteic acid
  • Endoproteinase Glu-C: cleaves at C-terminus of glutamic acid
  • Trypsin: cleaves at C-terminus of arginine and lysine
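The cleavage specificities listed above can be expressed as simple rules over a one-letter protein sequence. This is a simplified sketch: the reagent keys and function name are hypothetical, cysteic acid (which has no one-letter code) is not handled for Asp-N, and real-world exceptions such as missed cleavages are ignored.

```python
# Each rule answers: is the bond between residue i and residue i+1 cleaved?
CLEAVAGE_RULES = {
    "CNBr": lambda s, i: s[i] == "M",             # C-terminal side of Met
    "BNPS-skatole": lambda s, i: s[i] == "W",     # C-terminal side of Trp
    "formic acid": lambda s, i: s[i:i + 2] == "DP",    # Asp-Pro bond
    "hydroxylamine": lambda s, i: s[i:i + 2] == "NG",  # Asn-Gly bond
    "Lys-C": lambda s, i: s[i] == "K",            # C-terminal side of Lys
    "Asp-N": lambda s, i: s[i + 1] == "D",        # N-terminal side of Asp
    "Glu-C": lambda s, i: s[i] == "E",            # C-terminal side of Glu
    "trypsin": lambda s, i: s[i] in "KR",         # C-terminal side of Lys/Arg
}


def digest(sequence: str, reagent: str) -> list:
    """Return the fragments from complete cleavage with one reagent."""
    rule = CLEAVAGE_RULES[reagent]
    seq = sequence.upper()
    fragments, start = [], 0
    for i in range(len(seq) - 1):  # every internal peptide bond
        if rule(seq, i):
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments
```

Digesting the same sequence with reagents of different specificity yields overlapping fragment sets, which is what allows internal sequences to be ordered.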
  • spots 9, 11, 12, 13, 15, 16 (major and minor), and 17 represent unique antigens.
  • Spot 9 represents the native H. pylori antigen corresponding to the recombinant protein expressed by ORF2 of clone a1.
  • spot 12 represents the native antigen that corresponds to the antigenic protein encoded by clone a5.
  • spots 9 (major and minor) 10, 12, 13, and
  • Mass spectral profiles of selected Lys-C digested H. pylori proteins corresponding to Western positive spots are provided in Tables 11 and 12.
  • the mass "fingerprints" of the peptide digests were then used as a basis for comparison to proteins predicted from various genomics databases (details are provided in Example 12) to further confirm the identity of selected H. pylori antigens described herein.
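Comparing an observed peptide mass fingerprint against predicted proteins can be sketched as counting observed masses that fall within a tolerance of each candidate's theoretical digest masses. The function names, the 100 ppm default tolerance, and the simple match-count score are assumptions chosen for illustration; they are not taken from Example 12.

```python
def matched_peaks(observed, theoretical, tol_ppm=100.0):
    """Return (observed, theoretical) pairs agreeing within tol_ppm."""
    hits = []
    for obs in observed:
        for theo in theoretical:
            if abs(obs - theo) / theo * 1e6 <= tol_ppm:
                hits.append((obs, theo))
                break  # count each observed peak at most once
    return hits


def rank_candidates(observed, candidates, tol_ppm=100.0):
    """Rank candidate proteins by number of matched peptide masses.

    candidates: dict mapping a protein name to its list of theoretical
    peptide masses (for example, from an in-silico Lys-C digest).
    """
    scores = {
        name: len(matched_peaks(observed, masses, tol_ppm))
        for name, masses in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```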
  • The proteomic approach is useful for analyzing the genetic diversity of H. pylori, and for examining antigen-antibody responses during acute and chronic infection, in particular gastroduodenal pathologies, and in possible autoimmune components of H. pylori-associated disease.
  • 2-D gel electrophoresis and in-situ proteolytic digestion in conjunction with MALDI-TOF MS provide an extremely sensitive technique for the rapid identification of H. pylori antigens and for rapid screening for preferred vaccine and diagnostic candidates.
  • the recombinant antigenic peptides of the present invention can be purified by standard protein purification procedures which may include differential precipitation, molecular sieve chromatography, ion-exchange chromatography, isoelectric focusing, gel electrophoresis and affinity chromatography.
  • H. pylori antigens in accordance with the invention comprise at least 6 contiguous amino acids contained within one of the following cluster antigen sequences: SEQ ID NOS: 340-468.
  • an H. pylori antigen may comprise at least 6 contiguous amino acids contained within a polypeptide sequence selected from one of the following SEQ ID NOS: 2, 4, 5, 7, 9, 10, 12, 14, 17, 21, 25-28, 36, 37, 39, 44, 48, 55, 59, 61, 69, 249, 250, 252, 254, 256, 258, 260-263, 265-269, 323, 324, and 550-554.
  • an H. pylori antigen corresponds to at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 555-602, where these sequences represent illustrative expression proteins corresponding to H. pylori antigens. Even more preferably, an H. pylori antigen is identified by at least 6 contiguous amino acids contained within one of the following sequences: SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
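The "at least 6 contiguous amino acids" criterion can be checked mechanically by comparing fixed-length windows of a candidate peptide against a reference antigen sequence. A minimal sketch follows; the function name is hypothetical and the sequences in any usage are placeholders rather than entries from the sequence listing.

```python
def shares_contiguous_stretch(candidate: str, reference: str,
                              min_len: int = 6) -> bool:
    """True if candidate contains >= min_len contiguous residues that
    also occur contiguously in reference."""
    candidate, reference = candidate.upper(), reference.upper()
    # Any shared stretch longer than min_len contains a shared stretch of
    # exactly min_len, so comparing min_len windows is sufficient.
    windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    return any(
        candidate[i:i + min_len] in windows
        for i in range(len(candidate) - min_len + 1)
    )
```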
  • Polynucleotide sequences encoding the antigens of the present invention have been cloned in the plasmid pGEX (Example 5) or various derivatives thereof (pGEX-del65).
  • the plasmid pGEX (Smith, et al., 1988) and its derivatives express the polypeptide sequences of a cloned insert fused in-frame to the protein glutathione-S-transferase (sj26).
  • In plasmid pGEX-hisB, an amino acid sequence of 6 histidines is introduced at the carboxy terminus of the fusion protein.
  • the various recombinant pGEX plasmids can be transformed into appropriate strains of E. coli and fusion protein production can be induced by the addition of IPTG (isopropyl-thio galactopyranoside) as described in Example 5.
  • Solubilized recombinant fusion protein can then be purified from cell lysates of the induced cultures using Ni²⁺-NTA affinity chromatography (Example 5).
  • Insoluble fusion protein expressed by the plasmids can be purified by means of immobilized metal ion affinity chromatography (Porath, 1992) in buffers containing 6 M urea or 6 M guanidinium isothiocyanate, both of which are useful for the solubilization of proteins.
  • insoluble proteins expressed in pGEX-GLI or derivatives thereof can be purified using combinations of centrifugation to remove soluble proteins followed by solubilization of insoluble proteins and standard chromatographic methodologies, such as ion exchange or size exclusion chromatography, and other such methods are known in the art.
  • the fused protein can be isolated readily by affinity chromatography, or by passing cell lysis material over a solid support having surface-bound anti-β-galactosidase antibody.
  • an expression vector such as the lambda gt11 or pGEX vectors described above, containing H. pylori antigen coding sequences and expression control elements which allow expression of the coding regions in a suitable host.
  • the control elements generally include a promoter, a translation initiation codon, translation and transcription termination sequences, and an insertion site for introducing the insert into the vector.
  • the DNA encoding the desired antigenic polypeptide can be cloned into any number of commercially available vectors to generate expression of the polypeptide in the appropriate host system.
  • These systems include, but are not limited to the following: baculovirus expression (Reilly, et al, 1992; Beames, et al, 1991; Pharmingen, San Diego, CA; Clontech, Palo Alto, CA), vaccinia expression (Earl, et al, 1991; Moss, et al, 1991), expression in bacteria (Ausubel, et al, 1988; Clontech), expression in yeast (Gellissen, et al, 1992; Romanos, et al, 1992; Goeddel, 1990; Guthrie and Fink, 1991), expression in mammalian cells (Clontech; Gibco-BRL, Grand Island, NY), e.g.
  • Expression of large polypeptide antigens is described in Example 5.
  • Several of the long antigen clone sequences were cloned into expression vectors and successfully expressed in E. coli, as described in Example 5 and indicated in Tables 3a and 3b.
  • Expression in yeast systems has the advantage of commercial production.
  • Recombinant protein production in vaccinia and CHO cell lines has the advantage of using mammalian expression systems.
  • vaccinia virus expression has several advantages including the following: (i) a wide host range; (ii) faithful post-transcriptional modification, processing, folding, transport, secretion, and assembly of recombinant proteins; (iii) high level expression of relatively soluble recombinant proteins; and (iv) a large capacity to accommodate foreign DNA.
  • the recombinantly produced H. pylori polypeptide antigens are typically isolated from lysed cells or culture media. Purification can be carried out by methods known in the art including salt fractionation, ion exchange chromatography, and affinity chromatography. Immunoaffinity chromatography can be employed using antibodies generated based on the H. pylori antigens identified by the methods of the present invention.
  • the resulting DNA coding regions can be expressed recombinantly either as fusion proteins or isolated polypeptides.
  • amino acid sequences can be readily chemically synthesized using a commercially available synthesizer (Applied Biosystems, Foster City, CA) or "PIN" technology (Applied Biosystems). Antigens obtained by any of these methods can be used for antibody generation, diagnostic tests and vaccine development.
  • the invention includes specific antibodies directed against the polypeptide antigens of the present invention. Antigens obtained by any of these methods may be directly used for the generation of antibodies or they may be coupled to appropriate carrier molecules. Many such carriers are known in the art and are commercially available (e.g., Pierce, Rockford, IL). Typically, to prepare antibodies, a host animal, such as a rabbit or a goat, is immunized with the purified antigen or fused protein antigen.
  • Hybrid or fused proteins may be generated using a variety of coding sequences derived from other proteins, such as glutathione-S-transferase or β-galactosidase.
  • the host serum or plasma is collected following an appropriate time interval, and this serum is tested for antibodies specific against the antigen.
  • the gamma globulin fraction or the IgG antibodies of immunized animals can be obtained, for example, by use of saturated ammonium sulfate precipitation or DEAE Sephadex chromatography, affinity chromatography, or other techniques known to those skilled in the art for producing polyclonal antibodies.
  • purified antigen or fused antigen protein may be used for producing monoclonal antibodies.
  • the spleen or lymphocytes from an immunized animal are removed and immortalized or used to prepare hybridomas by methods known to those skilled in the art.
  • a human lymphocyte donor is selected.
  • a donor known to be infected with H. pylori may serve as a suitable lymphocyte donor.
  • Lymphocytes can be isolated from a peripheral blood sample.
  • Epstein-Barr virus (EBV) can be used to immortalize human lymphocytes or a suitable fusion partner can be used to produce human-derived hybridomas.
  • Primary in vitro sensitization with viral specific polypeptides can also be used in the generation of human monoclonal antibodies.
  • Antibodies secreted by the immortalized cells are screened to determine the clones that secrete antibodies of the desired specificity, for example, by using the ELISA or Western blot method (Ausubel et al, 1988).
  • H. pylori antigens can then be used in any of a number of standard immunoassay formats to detect the presence of antigen, such as described in Harlow, et al, 1988.
  • One representative assay format is an antigen capture sandwich assay.
  • antibody is immobilized on a solid support.
  • H. pylori infected samples (e.g., feces, dental plaque, gastric biopsies, culture suspension from a biopsy sample)
  • a different antibody directed against H. pylori e.g.
  • H. pylori antigen in a test sample is indicative of the presence of H. pylori antigen in a test sample.
  • the above-described assay is representative of antigen-based assays that rely on antibodies prepared as described above, and is useful for the early detection of H. pylori antigens in a sample suspected of infection by H. pylori.
  • H. pylori antigens are first identified, typically through plaque immunoscreening as described above, and expressed and purified (as previously described). The antigens are then screened rapidly against a large number of suspected H. pylori positive anti-sera using alternative immunoassays, such as, ELISAs or Protein Blot Assays (Western blots) employing the isolated antigen peptide.
  • the antigen polypeptide fusion protein is then isolated as described above, usually by affinity chromatography to the fusion partner such as β-galactosidase or glutathione-S-transferase. Alternatively, the antigen itself is purified using antibodies generated against it (see below).
  • a general ELISA assay format may be employed, such as those described in Harlow, et al. (1988).
  • the purified antigen polypeptide or fusion polypeptide containing the antigen of interest is attached to a solid support, for example, a multiwell polystyrene plate.
  • Biological fluid e.g., sera
  • the sera are washed out of the wells.
  • a labelled reporter antibody is added to each well along with an appropriate substrate: wells containing antibodies bound to the purified antigen polypeptide or fusion polypeptide containing the antigen are detected by a positive signal.
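The positive/negative call in the ELISA format above can be sketched as follows. This is a minimal illustration only: the cutoff rule (mean of negative-control readings plus three standard deviations) is a common convention assumed here, not one stated in the text, and all optical density values and sample names are hypothetical.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_ods, n_sd=3.0):
    """Cutoff = mean of negative-control ODs plus n_sd sample standard deviations.

    The mean + 3 SD rule is an assumed convention, not taken from the source.
    """
    return mean(negative_ods) + n_sd * stdev(negative_ods)


def call_wells(sample_ods, cutoff):
    """Map each sample name to True (signal above cutoff) or False."""
    return {name: od > cutoff for name, od in sample_ods.items()}


# Hypothetical readings, invented for illustration only.
negatives = [0.08, 0.10, 0.12]
samples = {"serum_A": 0.95, "serum_B": 0.11, "serum_C": 0.40}

cutoff = elisa_cutoff(negatives)
calls = call_wells(samples, cutoff)
print(round(cutoff, 3), calls)
```

In practice the cutoff would be set from the negative sera actually run on the same plate.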
  • a typical format for protein blot analysis using one, any, several, or each of the polypeptide antigens of the present invention is presented in Example 6. General protein blotting methods are described by Ausubel, et al. (1988).
  • In Example 6A, the antigenic protein expressed by clone dHA22.8 (A22) was used to screen a number of sera samples (both pooled sera and discrete samples). The high percentage of sera reacting to antigen produced from recombinant clone dHA22.8 indicates that it is a dominant epitope, and that it is a suitable infection marker for H. pylori. The results presented in Example 6A demonstrate that H. pylori-positive anti-sera from several different sources are immunoreactive with this representative polypeptide antigen. Similar results are described for recombinant protein expressed by clone dHC1S.11 (C1).
  • Example 6B Additional experiments carried out in support of the invention, as described in Example 6B, reveal, on the basis of sera paneling data using both single antigens and antigen combinations, that preferred antigens for use in reliably and universally detecting H. pylori infection include but are not limited to the following: A22, C1, Y124A, Y261A, c5, C7, B2, Y104B, and Y128D. These antigens are effective as serological markers for detecting active infection by H. pylori, based upon favorable sensitivity and selectivity features. In Example 8, native proteins from H. pylori are shown to be immunoreactive with anti-H. pylori primary antibodies obtained from "Roost" pooled serum.
  • Protective Antibodies can be identified using, for example, an animal model system (DuBois, et al., 1996). To identify protective antibodies, polyclonal or monoclonal antibodies are generated against the antigens of the present invention, where the antigens may be used as the immune-stimulation component arm in conjunction with cholera toxin (CT). Antibodies thus generated are then used to pre-treat an infectious H. pylori-containing inoculum (e.g., serum) before infection of cell cultures or animals. The ability of a single antibody or mixtures of antibodies to protect the cell culture or animal from infection is evaluated.
  • CT cholera toxin
  • the absence of antigen and/or nucleic acid production serves as a screen.
  • the absence of H. pylori disease symptoms (e.g., elevated carbon dioxide/ammonia levels in a urea breath test (UBT)) is also indicative of the presence of protective antibodies.
  • UBT urea breath test
  • the urea breath test takes advantage of the action of the urease enzyme of H. pylori to decompose ingested 13C- or 14C-labelled urea to labelled carbon dioxide and ammonia; the labelled carbon dioxide is then measured.
  • Animal models for investigating H. pylori infection include: (i) gnotobiotic newborn piglets (easily infected by H. pylori of human origin, but preferred for short term studies), (ii) mice and ferrets, which can be colonized for months (mice) or years (ferrets), and (iii) certain domestic cats, which can carry H. pylori (DuBois, et al., 1996).
  • H. pylori causes ulcers in gnotobiotic piglets, and in mice using "THE SYDNEY STRAIN" of H. pylori, and gastritis in mouse strains SJL, C3H/HZ, DBA, C56BL b, and Balb/C.
  • a rhesus monkey infection model has also been developed (DuBois, et al, 1996).
  • convalescent sera can be screened for the presence of protective antibodies and then these sera used to identify H. pylori antigens that bind with the antibodies.
  • the identified H. pylori antigen is then recombinantly or synthetically produced. The ability of the antigen to generate protective antibodies is tested as above.
  • the antigen or antigens identified as capable of generating protective antibodies can be used as a vaccine to inoculate test animals (to be described in greater detail below).
  • the animals are then challenged with infectious H. pylori. Protection from infection indicates the ability of the animals to generate antibodies that protect them from infection. Further, use of the animal models allows identification of antigens that activate cellular immunity.
  • a protective immune response in response to challenge by a bacterial preparation (e.g., infected serum)
  • a bacterial preparation (e.g., infected serum)
  • Vaccines can be prepared from one or more of the immunogenic polypeptides identified herein.
  • Numerous H. pylori polypeptides of the invention e.g. , spot 15
  • the intensity of color development is representative of the strength (binding affinity) of the antigen-anti-H. pylori antibody interaction.
  • Representative serum paneling results for proteins expressed by two of the clones, dHA22.8 (A22) and dHC1S.11 (C1), indicate that both of these recombinant proteins are highly immunogenic.
  • Protein produced by clone dHA22.8 reacted with antibodies present in both of the H. pylori-positive pooled sera sources, Roost pool and SFA 001, and exhibited no cross reactivity with antibodies present in H. pylori-negative samples. Similar results were observed for protein produced by clone dHC1S.11 (C1).
  • antigenic protein expressed by each of the clones dHA22.8 and dHC1S.11 reacted with anti-H. pylori antibodies in 100% and 95% of the samples, respectively, indicating the ability of the antigens of the present invention to detect H. pylori infection, and to provide components for vaccines against H. pylori.
  • antigens for use in vaccine compositions are described in Example 13.
  • Exemplary antigens are those capable of invoking a long-lasting antigenic response, as evidenced by the persistent presence of antibodies whose titre remains high for an extended period of time subsequent to antimicrobial treatment for H. pylori.
  • particularly preferred antigens for use in vaccine compositions include Y139, Y146B, Y175A, and A22, Y184A, Z9A, Y261Ains and Y146B.
  • H. pylori peptides can be identified as useful in a vaccine for H. pylori as follows.
  • the individual test peptide is formulated in a suitable carrier, e.g., adjuvant, at a concentration suitable for injection, e.g., 5-500 mg/ml.
  • a suitable carrier e.g., adjuvant
  • One suitable test animal for vaccination is the Rhesus monkey, as detailed in DuBois (DuBois, et al., 1996).
  • the animal is vaccinated, e.g., by oral, intramuscular or intravenous administration, in an amount typically between 0.2 and 2.0 mg/kg body weight. After a suitable period, e.g.
  • the animal may be given a booster by the same route and typically in the same amount.
  • the animal is challenged with H. pylori in a known manner, e.g., as described in DuBois (DuBois, et al., 1996).
  • H. pylori e.g., a suspension of approximately 10^8-10^9 CFU of H. pylori, 1 ml
  • the subject is typically monitored by endoscopy, histologic examination, microbiological methods, and/or measurement of H. pylori-specific plasma IgG, as described in DuBois (DuBois, et al., 1996).
  • the level of infection in the vaccinated animal is then compared with that of a control animal to assess the degree of protection.
  • Those peptides which provide a measurable degree of protection against H. pylori infection are suitable for vaccine use, either alone or in combination with other peptide vaccine agents, such as the above noted dHA22.8 (A22), dHC1S.11 (C1) and spot 15 peptides.
  • the selected peptide is formulated according to known vaccine formulations.
  • the peptide is conjugated to a carrier protein, e.g., keyhole limpet hemocyanin or human serum albumin, and/or suspended in a suitable adjuvant, such as Freund's adjuvant.
  • a carrier protein e.g., keyhole limpet hemocyanin or human serum albumin
  • a suitable adjuvant such as Freund's adjuvant.
  • the vaccine is administered by conventional routes, typically IM or IV routes, as above, at peptide levels preferably in the range of 0.2 to 2.0 mg/kg. If necessary, one or more booster injections are given.
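The weight-based dosing range stated above (0.2 to 2.0 mg/kg) can be turned into an absolute amount per animal with simple arithmetic. The sketch below is illustrative; the function name and the 8 kg body weight are hypothetical and not taken from the text.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg=0.2, high_mg_per_kg=2.0):
    """Return the (low, high) peptide amount in mg for one animal.

    The default per-kg bounds are the 0.2 to 2.0 mg/kg range given in the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Hypothetical 8 kg test animal, chosen only for illustration.
low_mg, high_mg = dose_range_mg(8.0)
print(f"{low_mg:.1f} to {high_mg:.1f} mg per dose")
```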
  • the specificity of a putative immunogenic fragment can be assessed by testing sera, other fluids or lymphocytes from the inoculated animal for cross reactivity with other related bacteria.
  • Synthetic Peptides Using the coding sequences of H. pylori polypeptide antigens disclosed herein, synthetic peptides can be generated which correspond to these polypeptides. Synthetic peptides can be commercially synthesized or prepared using standard methods and apparatus in the art (Applied Biosystems, Foster City CA).
  • oligonucleotide sequences encoding peptides can be either synthesized directly by standard methods of oligonucleotide synthesis, or, in the case of large coding sequences, synthesized by a series of cloning steps involving a tandem array of multiple oligonucleotide fragments corresponding to the coding sequence (Crea, 1989; Yoshio, et al, 1989; Eaton, et al, 1988). Oligonucleotide coding sequences can be expressed by standard recombinant procedures (Maniatis, et al, 1982; Ausubel, et al, 1988).
  • antigens herein are their use as diagnostic reagents for the effective and reliable detection of antibodies present in the sera of test subjects infected with H. pylori, to thereby provide an indication of infection in a test subject.
  • Preferred antigens which can be employed either singly or in combination in such a method include, e.g.
  • antigens identified by or derived from SEQ ID NO:44 A22
  • SEQ ID NO:39 C1
  • SEQ ID NO:568 Y124A
  • SEQ ID NO:557 Y261A
  • SEQ ID NO:254 c5
  • SEQ ID NO:21 C7
  • SEQ ID NO:55 B2
  • SEQ ID NO:61 Y104B
  • SEQ ID NO:573 Y128D
  • preferred antigens are defined in terms of their DNA coding sequences, corresponding to or derived from SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), or SEQ ID NO:98 (Y128D).
  • the antigen used for the detection of antibodies present in the sera of test subjects infected with H. pylori is encoded by a DNA fragment spanning one of the DNA fragment clusters: (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53), (54), (55), (56), (57), (58), (59), (61), (62), (62), (63), (64), (65), (66), (67), (68), and (69), or immunoreactive variants thereof.
  • the antigens of the present invention can be used singly, or in combination with each other, in order to detect H. pylori, as illustrated by the results in Example 6B.
  • the antigens of the present invention may also be coupled with diagnostic assays for other infectious agents.
  • test serum is reacted with a solid phase reagent having a surface-bound antigen obtained by the methods of the present invention, e.g., the "spot 15" antigen.
  • a solid phase reagent having a surface-bound antigen obtained by the methods of the present invention e.g., the "spot 15" antigen.
  • Exemplary antigens are A22 and C1, which both show high sensitivity (as indicated in Figs. 65 and 66, and in Example 6B).
  • the reagent is reacted with reporter-labelled anti-human antibody to bind reporter to the reagent in proportion to the amount of bound anti-H. pylori antibody on the solid support.
  • the reagent is again washed to remove unbound labelled antibody, and the amount of reporter associated with the reagent is determined.
  • the reporter is an enzyme which is detected by incubating the solid phase in the presence of a suitable fluorometric or colorimetric substrate, e.g., 5-bromo-4-chloro-3-indolyl-phosphate (BCIP) and nitroblue tetrazolium (NBT) (Sigma, St. Louis, MO).
  • BCIP 5-bromo-4-chloro-3-indolyl-phosphate
  • NBT nitroblue tetrazolium
  • the solid surface reagent in the above assay is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, 96-well plates or filter material.
  • attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group binding to a chemically reactive group on the solid support, e.g. , an activated carboxyl, hydroxyl, or aldehyde group.
  • streptavidin coated plates can be used in conjunction with biotinylated antigen(s).
  • an assay system or kit for carrying out this diagnostic method generally includes a support with surface-bound recombinant antigen (e.g., antigens such as those described in Tables 3a and 3b, or encoded by the representative clones summarized in Table 2) or native H. pylori antigen (such as those identified in Table 6 and in Fig. 1 and Fig. 2, as above), and a reporter-labelled anti-human antibody for detecting surface-bound anti-H. pylori antigen antibody.
  • surface-bound recombinant antigen e.g. , antigens such as those described in
  • homogeneous assay In a second diagnostic configuration, known as a homogeneous assay, antibody binding to a solid support produces some change in the reaction medium which can be directly detected.
  • Known general types of homogeneous assays proposed heretofore include (a) spin-labelled reporters, where antibody binding to the antigen is detected by a change in reporter mobility (broadening of the spin splitting peaks), (b) fluorescent reporters, where binding is detected by a change in fluorescence efficiency or polarization, (c) enzyme reporters, where antibody binding causes enzyme/substrate interactions, and (d) liposome-bound reporters, where binding leads to liposome lysis and release of encapsulated reporter.
  • spin-labelled reporters where antibody binding to the antigen is detected by a change in reporter mobility (broadening of the spin splitting peaks)
  • fluorescent reporters where binding is detected by a change in fluorescence efficiency or polarization
  • enzyme reporters where antibody binding causes enzyme/substrate interactions
  • liposome-bound reporters where binding leads to lip
  • the assay method involves reacting the serum from a test individual with the protein antigen and examining the antigen for the presence of bound antibody.
  • the examining may involve attaching a labelled anti-human antibody to the subject antibody (for example from acute, chronic or convalescent phase) and measuring the amount of reporter bound to the solid support, as in the first method, or may involve observing the effect of antibody binding on a homogeneous assay reagent, as in the second method.
  • an antigen capture assay as previously described in Section D.2.
  • Synthetic oligonucleotide linkers and primers were prepared using commercially available automated oligonucleotide synthesizers. Alternatively, custom designed synthetic oligonucleotides may be purchased from commercial suppliers.
  • H. pylori strains corresponding to ATCC Designation Nos. 43504 (short antigen clone set) and 43526 (Libraries 1 and 2) were used to generate the DNA libraries. H. pylori strain ATCC 43504 was used for isolation of native proteins produced by H. pylori. Escherichia coli strain(s) Y1088, Y1089, and XL1-Blue for libraries 1 and 2 (Stratagene, La Jolla, CA) was the host used for phage infection. E. coli strains XL1-Blue, XLOLR (Stratagene, La Jolla, CA) were used for protein expression of cloned genes.
  • H. pylori (American Type Culture Collection, Rockville, MD; ATCC Designation No. 43504) was streaked on blood agar plates and incubated in a microaerophilic environment at room temperature for 7 days. Cells were harvested by scraping the bacterial cells from 10 plates, followed by washing once with phosphate buffered saline (Dulbecco's, Gibco BRL, Gaithersburg, MD).
  • Genomic DNA was prepared as described in Ausubel et al. 1988, with minor modifications.
  • Cell pellets from 5 plates were resuspended in 510 µl of TE (Ausubel et al., 1988), to which was added 60 µl of 10% SDS and 30 µl of 20 mg/ml of proteinase K.
  • the suspension was mixed, followed by incubation for a period from 4 - 8 hours at 37 °C.
  • To the suspension was added 80 µl of CTAB/NaCl (10% hexadecyltrimethyl ammonium bromide in 0.7 M NaCl), and the resulting solution was then mixed and incubated for 10 minutes at 65 °C.
  • the solution was then extracted with an equal volume of chloroform/isoamyl alcohol and spun in a microcentrifuge for 5 minutes. The separated aqueous phase was transferred to a new tube, and the DNA was precipitated by addition of 0.6 volumes of isopropanol, followed by centrifugation. The DNA pellet was washed with 70% ethanol, dried briefly under vacuum and solubilized in 100 µl of distilled water. The DNA solution was then treated with DNase-free RNase (Boehringer Mannheim, Indianapolis, IN) (Ausubel, et al., 1988; Maniatis, et al., 1982) to selectively degrade any RNA present in the sample. 2. DNase Digestion and DNA Amplification a. Short Antigen Clone Libraries.
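The final reagent concentrations implied by the volumes in the lysis and CTAB steps above can be worked out by simple dilution arithmetic. The sketch below assumes the volumes are additive and ignores the volume of the cell pellet itself; the helper name is illustrative.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Volumes from the text (microlitres); pellet volume is ignored (assumption).
te_ul, sds_ul, prot_k_ul, ctab_ul = 510.0, 60.0, 30.0, 80.0

lysis_total_ul = te_ul + sds_ul + prot_k_ul                        # 600 µl
sds_pct = final_concentration(10.0, sds_ul, lysis_total_ul)        # % SDS
prot_k_mg_ml = final_concentration(20.0, prot_k_ul, lysis_total_ul)

ctab_total_ul = lysis_total_ul + ctab_ul                           # 680 µl
ctab_pct = final_concentration(10.0, ctab_ul, ctab_total_ul)       # % CTAB

print(f"SDS {sds_pct:.2f}%, proteinase K {prot_k_mg_ml:.2f} mg/ml, "
      f"CTAB {ctab_pct:.2f}%")
```

Under these assumptions the lysis mixture works out to roughly 1% SDS and 1 mg/ml proteinase K, and the CTAB step to roughly 1.2% CTAB.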
  • H. pylori genomic DNA as described above was digested with pancreatic DNase I (Boehringer Mannheim) essentially as described in Ausubel, et al. (1988) and Sambrook, et al (1989). Aliquots of the digested DNA were taken at various time points. The DNA digests were resolved by preparative agarose gel electrophoresis. Product bands containing the desired size range of DNA (200-2000 base pairs) were excised from the gel, and recovered using the "GENE CLEAN II" kit (Bio 101 Inc., La Jolla, CA) or the "MERMAID” kit (Bio 101 Inc., La Jolla, CA), according to the manufacturer's instructions.
  • pancreatic DNase I Boehringer Mannheim
  • the recovered DNA fragments were incubated with E. coli Klenow fragment of DNA polymerase (Ausubel, et al, 1988; Sambrook, et al, 1989). The reaction mixture was incubated at room temperature for 30 minutes, followed by extraction with phenol/chloroform.
  • phosphorylated SISPA Sequence-Independent Single Primer Amplification
  • linker AB a double strand linker comprised of SEQ ID NO: 63 and SEQ ID NO: 64, where SEQ ID NO: 64 is in a 3' to 5' orientation relative to SEQ ID NO: 63 as a partially complementary sequence to SEQ ID NO:63
  • the DNA and linker were mixed at a 1:100 ratio.
  • the linker ligated DNAs were then amplified by SISPA (Reyes, et al., 1991).
  • SISPA Reyes, et al., 1991.
  • To 10 mM Tris-Cl buffer, pH 8.3, containing 1.5 mM MgCl2 and 50 mM KCl (Buffer A) was added about 1 µl of the linker-ligated DNA preparation, 2 µM of a primer having the sequence shown as SEQ ID NO:63, 200 µM each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of Amplitaq DNA polymerase (Applied Biosystems Division, Perkin Elmer, Foster City, CA).
  • the reaction mixture was heated to 94°C for 30 seconds for denaturation, allowed to cool to 50°C for 30 seconds for primer annealing, and then heated to 72 °C for 0.5-3 minutes to allow for primer extension by Taq polymerase.
  • the amplification reaction involving successive heating, cooling, and polymerase reaction, was repeated an additional 25-40 times with the aid of a Perkin-Elmer Cetus DNA thermal cycler (Mullis, 1987; Mullis, et al, 1987; Reyes, et al, 1991; Perkin-Elmer Cetus, Norwalk, CT).
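The cycling program described above (30 seconds at 94°C, 30 seconds at 50°C, 0.5 to 3 minutes at 72°C, with the first cycle repeated an additional 25 to 40 times) gives a range of total hold times. The sketch below adds up hold times only; ramp times between temperatures depend on the instrument and are deliberately left out, so real run times would be longer.

```python
def total_hold_seconds(cycles, denature_s, anneal_s, extend_s):
    """Sum of programmed hold times over all cycles (ramping not included)."""
    return cycles * (denature_s + anneal_s + extend_s)


# First cycle plus 25 to 40 additional repeats, per the text.
shortest_s = total_hold_seconds(1 + 25, 30, 30, 30)    # 0.5 min extension
longest_s = total_hold_seconds(1 + 40, 30, 30, 180)    # 3 min extension

print(f"{shortest_s / 60:.0f} to {longest_s / 60:.0f} minutes of hold time")
```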
  • the ligated DNA was packaged by standard procedures using a lambda DNA packaging system (GIGAPAK, Stratagene, La Jolla), and then plated at various dilutions to determine the titer.
  • the titer of the DNA-insert phage libraries and percent recombination were determined by a standard X-gal blue/white assay (Miller, 1994; Maniatis et al, 1982).
  • the titer of the recombinant libraries ranged from 1.5 x 10^4 to 3 x 10^6 PFU/ml.
  • Percent recombination in each library can also be confirmed by selecting a number of random clones and isolating the corresponding phage DNA.
  • Polymerase chain reaction (Mullis, 1987; Mullis, et al, 1987) is then performed using isolated phage DNA as template and lambda DNA sequences, derived from lambda sequences flanking the EcoRI insert site for the DNA molecules, as primers. The presence or absence of insert is then evident from gel analysis of the polymerase chain reaction products.
  • the lambda gt11 and ZAPII phage libraries described in 1.C. above were immunoscreened for the production of antigens recognizable by a pool of sera from 11 patients (designated herein as
  • the fusion proteins expressed by the recombinant lambda phage clones were screened with serum antibodies essentially as described by Ausubel, et al. (1988).
  • Each library was plated at approximately 1.5 to 2 x 10^4 phages per 150 mm plate. Plates were overlaid with nitrocellulose filters overnight. Filters were washed with TBS (10 mM Tris, pH 7.5; 150 mM NaCl), blocked with AIB (TBS buffer with 1% gelatin) and incubated with a primary antibody diluted 100 times in AIB.
  • TBS 10 mM Tris, pH 7.5; 150 mM NaCl
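The volume of phage stock needed to reach the plating density above follows directly from the library titer. The sketch below uses the titer range (1.5 x 10^4 to 3 x 10^6 PFU/ml) and plating density (1.5 to 2 x 10^4 phages per plate) given in the text; the function name is illustrative, and it assumes undiluted stock with one plaque-forming unit per plated phage.

```python
def plating_volume_ul(target_pfu, titer_pfu_per_ml):
    """Volume of undiluted phage stock (µl) that contains target_pfu."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return target_pfu / titer_pfu_per_ml * 1000.0


# Highest-titer library, upper plating density.
vol_high_titer_ul = plating_volume_ul(2e4, 3e6)
# Lowest-titer library, lower plating density.
vol_low_titer_ul = plating_volume_ul(1.5e4, 1.5e4)

print(f"{vol_high_titer_ul:.1f} µl to {vol_low_titer_ul:.0f} µl per plate")
```

The wide spread (a few microlitres up to about 1 ml) suggests that higher-titer libraries would normally be diluted before plating.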
  • AIB TBS buffer with 1 % gelatin
  • SFA001 is a pool of 4 donor sera (plasma packs) showing strong reactive bands on Western blot format with a crude lysate antigen preparation from Helicobacter lysate used in HelicoBlot 2.0.
  • NAA121 is a pool of 4 donor sera (plasma packs) showing no reactive bands on Western blot format with the crude lysate antigen preparation from Helicobacter lysate used in HelicoBlot 2.0.
  • H. pylori cloned families were isolated by PCR amplification of representative clones. Cloned sequences having an .ASM extension were sequenced completely; sequences with a .SEQ extension were partially sequenced.
  • the DNA inserts of the immunoreactive recombinant lambda clones were PCR amplified using primers corresponding to lambda arm sequences flanking the EcoRI cloning site of the vectors. Amplification was carried out by polymerase chain reactions utilizing each immunoreactive clone as template.
  • gt11 F SEQ ID NO:65
  • gt11 R SEQ ID NO:66
  • primers T3 SEQ ID NO:67
  • T7 SEQ ID NO:68
  • the resulting amplified fragments were then agarose gel purified and eluted from the gel (Ausubel, et al, 1988).
  • the PCR products were further purified by "WIZARD PCR PREPS" (Promega, Madison, WI) or "CHROMASPIN" columns (Clontech, Mountain View, CA) to remove primers and other ingredients.
  • the purified insert DNA was then subjected to direct sequencing. In some cases, the insert DNA was first subcloned into the TA cloning vector (Invitrogen, San Diego, CA) and then sequenced.
  • Sequence determination for the DNA inserts was carried out using a Perkin Elmer Applied Biosystems 373A DNA sequencer (Perkin Elmer, Applied Biosystems Division, Foster City, CA) according to the manufacturer's protocol (dideoxy chain terminator sequencing methodology; Sanger, et al., 1977).
  • Clone Y104-1 (SEQ ID NO:60) contains the entire d7 clone, and encodes all of the 36K peptides and all of the spot 15 peptides, as indicated in Fig. 3.
  • Example 5A and 5B The production of expressed antigenic proteins and their subsequent purification is generally described in Examples 5A and 5B.
  • a tabular summary indicating clone name, expression, purification, and panelling data is provided in Table 3b, and a summary of the immunoreactivity of various recombinant antigens is provided in Table 5b.
  • Expression profiles were obtained, and the immunoreactivity of various clones to H. pylori positive pooled sera such as "Roost" was confirmed.
  • amplified products corresponding to a particular ORF were typically cloned into a pGEXhisB vector and expressed in E. coli. The size of the expression product was then determined, followed by confirmation of immunoreactivity (e.g., with Roost pool sera).
  • Example 3 Additional immunopositive clones, as described in Example 2 above, were purified and analyzed for DNA insert size and expressed protein size as in Example 3 above.
  • the ZAPII clones were rescued into plasmids from phagemids (as per protocol in Stratagene, La Jolla, U.S.A.); the resulting insert DNA was excised with EcoRI and run on an agarose gel to determine size.
  • Slightly longer versions of T3 and T7 (GD 60 and GD 61, corresponding to SEQ ID NO:231 and SEQ ID NO:232 respectively) were then used as primers for sequencing.
  • libraries correspond to the "library 1" series (EcoRI-cut), i.e., clones A3, A22, B2, B9, B17, B23, C1, C3, C7, and "library 2" series clones (Hind cut), i.e., clones a1, a3, a5, b5, b8c7, c2, c5, c13, d5, d6, d7, d11, f11, e6, f3, f8, g2, g9, g11, and k4; and also clone Gla from the EcoRI/XbaI cut library. Many clones were determined to be larger than the predicted coding regions of the proteins.
  • the specific antigen coding sequences from H. pylori library 1 and library 2 clones were subcloned.
  • the subclones were typically prepared by fragmenting the corresponding genomic clone by specific restriction endonuclease digestion to produce a specific subfragment or subfragments, which were purified using the "GENE CLEAN II" kit (Bio 101 Inc., La Jolla, CA) or the "MERMAID" kit (Bio 101 Inc., La Jolla, CA).
  • the resulting DNA fragments were then inserted into a suitable expression vector, i.e., the PB/Bluescript (SK) vector (Stratagene, La Jolla, CA).
  • Immunoreactive DNA regions were then sequenced and locations of the open reading frames were determined. Unless otherwise indicated as a β-galactosidase fusion product, for each of the clones in Tables 3a and 3b below, expression of coding regions was determined to be driven by the corresponding H. pylori promoter rather than by the β-galactosidase gene promoter in lambda gt11. LIBRARY 1 CLONES 2. Clone A3
  • the corresponding nucleotide sequence (1878 bp) is presented herein as SEQ ID NO: 16.
  • the open reading frame (ORF) extends from nucleotides 399-1743, and codes for a putative protein containing 448 amino acids (443 amino acids when calculated from the first methionine).
  • the protein sequence corresponding to the translation of open reading frame (ORF) 399-1743 is presented herein as SEQ ID NO: 17.
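The stated ORF coordinates and amino acid count for clone A3 can be cross-checked with a short calculation. The sketch below assumes 1-based, inclusive nucleotide coordinates (an assumption about how the ranges are reported) and simply divides the span into codons.

```python
def orf_codons(start, end):
    """Return (whole codons, leftover bases) for a 1-based inclusive range."""
    length_nt = end - start + 1
    return divmod(length_nt, 3)


# Clone A3 ORF coordinates from the text: nucleotides 399-1743.
codons, leftover = orf_codons(399, 1743)
print(codons, leftover)
```

Under that assumption the range spans 448 whole codons, matching the 448 amino acids stated, with one base left over; the extra base may simply reflect how the end coordinate was reported.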
  • Subclone A22210DMIC was obtained from an original genomic clone having an insert size of about 2.2 kb, to produce a β-galactosidase fusion protein in E. coli.
  • the open reading frame extends from nucleotides 12-599 of SEQ ID NO:43, where bases 12-121 correspond to the β-galactosidase fusion peptide and bases 122-599 code for a unique A22 antigenic sequence.
  • the translation sequence for the corresponding protein is presented as SEQ ID NO:44.
  • B3C19 is a subclone obtained from the 5.0 kb insert in genomic clone B3 as follows.
  • the insert was excised, digested with DNase, followed by treatment with T4 DNA ligase and Klenow enzyme to produce blunt-ends.
  • the blunt-ended short DNA pieces were then ligated to kinased A/B linkers (linker A, SEQ ID NO:63, corresponding to the top strand of AB SISPA linker; linker B, SEQ ID NO:64, corresponding to the bottom strand of AB SISPA linker), PCR amplified with primer A, and the amplified products digested with EcoRI.
  • the digested products were then ligated into EcoRI-digested lambda vector ZAPII.
  • the DNA sequence of B3C19 is presented as SEQ ID NO:51.
  • primers GD77 (5' primer, SEQ ID NO:52) and GD80 (3' primer, SEQ ID NO:53) were designed to walk backward and forward along the sequence of B3C19 using genomic B2 DNA as the template.
  • the sequence of clone B3C19 was extended in both the 5' and 3' directions, resulting in B2 extension clone, B2197780 (SEQ ID NO:54).
  • a computer-generated protein translation of the B2197780 sequence resulted in a corresponding 291 amino acid putative protein, with a predicted transmembrane segment from amino acids 9-25 (predictions obtained using "SOAP" program from "PCGENE").
  • Subclone B9.4C is a 1.2 kb HindIII fragment obtained from original genomic clone B9 (4.5 kb insert) as follows.
  • Subclone B9.4C, corresponding to a 1.2 kb HindIII subfragment of genomic clone B9, produced an immunoreactive protein of 32 kd.
  • the nucleotide sequence of subclone B9.4C is presented as SEQ ID NO:47; the translated protein sequence (i.e., nucleotides 230-931 of SEQ ID NO:47) is presented as SEQ ID NO:48.
  • In SEQ ID NO:47, an AGGA Shine-Dalgarno sequence occurs at nucleotide 218, and the first amino acid of the translated protein sequence begins with a GTG codon for valine at position 230.
  • the predicted molecular weight of the corresponding protein is 25.2 kd, with a pI of 10.35.
  • a potential cleavage site occurs between amino acid positions 225 and 226.
  • clone B9 appears to code for the 50A LI protein of H. pylori.
  • Clone B17 B17CON4 (2006 bases) was subcloned from a genomic clone having an insert size of about
  • the corresponding nucleotide sequence is presented as SEQ ID NO: 24.
  • the open reading frames (ORF) correspond to the following regions of SEQ ID NO:24: ORF1 (nucleotides 500-700); ORF2 (nucleotides 870-1406); ORF3 (nucleotides 1410-2000); and ORF4 (nucleotides 142-705).
  • the corresponding translated protein sequences are presented herein as SEQ ID NO:25 (B17ORF4), SEQ ID NO:26 (B17ORF1), SEQ ID NO:27 (B17ORF2) and SEQ ID NO:28 (B17ORF3). Based upon nested deletion experiments, it appears that the immunoreactive protein corresponds to B17ORF4.
  • B17ORF4, B17ORF2 and B17ORF3 encode putative proteins having predicted sizes of 20.9 kd, 20 kd and 22.2 kd respectively.
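The predicted sizes quoted above can be sanity-checked against the ORF coordinates given for SEQ ID NO:24. The sketch below uses a rule-of-thumb average of 110 Da per amino acid residue, which is an assumption introduced here rather than the method used in the source; it also treats coordinates as 1-based and inclusive and does not subtract a stop codon, so it gives only a rough estimate.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def estimated_kda(start, end):
    """Rough protein mass (kDa) from a 1-based inclusive ORF range."""
    codons = (end - start + 1) // 3
    return codons * AVG_RESIDUE_DA / 1000.0


# ORF coordinates in SEQ ID NO:24, as listed in the text.
orfs = {"ORF4": (142, 705), "ORF2": (870, 1406), "ORF3": (1410, 2000)}
estimates = {name: estimated_kda(*coords) for name, coords in orfs.items()}

for name, kda in estimates.items():
    print(f"{name}: about {kda:.1f} kDa")
```

The rough estimates (about 20.7, 19.7 and 21.7 kDa) fall close to the stated 20.9, 20 and 22.2 kd, which is the level of agreement expected from an average-residue approximation.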
  • the predicted size of the immunoreactive genomic protein is a doublet of 22 kd and 23 kd. 7.
  • Immunoreactivity was determined to reside in a 3.5 kb PstI subclone (one PstI site is derived from vector pBSK) of original genomic clone B23 (5.5 kb).
  • the 3.5 kb subclone was further sequenced, to determine the immunoreactive coding sequence presented in SEQ ID NO: 13.
  • the nucleotide 1078 base pair sequence (SEQ ID NO: 13) was determined to be open all the way (bases
  • the translated protein sequence is presented as SEQ ID NO: 14.
  • C1CON6V2 was subcloned from an original genomic clone having an insert size of about 4.0 kb.
  • the sequence information obtained from exo-mung deletion clones is presented as SEQ ID NO:38.
  • the open reading frame (ORF) extends from nucleotides 868-1926.
  • the subcloning of the antigen coding sequence from the C1 clone into an expression vector, and characteristics of the corresponding protein product are presented in Example 5 below.
  • the translated protein sequence is presented herein as SEQ ID NO:39.
  • C3 (2.9 kb insert size)
  • several different subfragments of the genomic clone were obtained and sequenced.
  • the immunoreactive clone was determined to reside in an EcoRI/Kpn 2.2 kb subclone. Based upon a number of subcloning experiments and subfragments, the C3 DNA sequence is presented as SEQ ID
  • Clone C7.2C is an EcoRI/HindIII subclone obtained from genomic clone C7.
  • the original genomic C7 clone is an EcoRI clone of approximately 3.8 kb that produces an immunoreactive protein having a molecular weight of approximately 30 kd.
  • Subclone C7.2C was obtained as follows. The genomic C7 clone was digested into individual fragments using HindIII. Each DNA restriction fragment was then subcloned into the pBKS vector via either the HindIII site, or alternatively, for end-piece fragments, via the EcoRI/HindIII sites. The corresponding nucleotide sequence of C7.2C (616 base pairs) is presented as SEQ ID NO:20. Subclone C7.2C produces an immunoreactive protein which is the same size as that produced by the genomic clone. The translated protein sequence for bases 1-561 of clone C7.2C is presented herein as SEQ ID NO:21.
  • Clone B8 (SEQ ID NO:549) is a genomic clone with an insert size of 3.5 kb that produces an immunoreactive protein of about 50 kd on SDS Western blot.
  • the DNA sequence of B8 overlaps with that of clone C7 (clone C7.2C) which was a fusion protein with beta-galactosidase.
  • Clone B8 added on 266 additional amino acids 5' (upstream) to C7 and contains the beginning of the gene. However, B8 ends 64 amino acids before the C-terminus of the gene, which is also encoded by C7. Based on the above, the complete sequence of the full gene was compiled (SEQ ID NO:549).
  • the corresponding protein sequence from the compiled gene sequence codes for a putative protein of 48.9 kd, and is presented herein as SEQ ID NO: 550.
  • Clone a1 is a HindIII clone that produces an immunoreactive protein of 28-30 kd on SDS gel.
  • the corresponding DNA sequence encoding an antigenic protein contains 1208 nucleotides (SEQ ID NO:35). There are two open reading frames.
  • ORF1 extends from bases 53-801 of SEQ ID NO:35, and encodes a putative protein containing 249 amino acids (SEQ ID NO:36, translated protein).
  • ORF1 contains a Shine-Dalgarno sequence extending from nucleotides 43-46 ("AGGA" sequence). The predicted molecular weight of the protein is 27.5 kd, which is in agreement with the expected protein size based upon SDS-PAGE (above).
  • the putative protein for ORF1 has a calculated pK of 8.42 and a predicted transmembrane region extending from amino acids 1-19.
  • the second ORF contained in clone a1 extends from nucleotides 880-1206 and ends at one of the HindIII sites, which indicates that clone a1 is a partial clone.
  • the predicted size of the partial protein is 109 amino acids (SEQ ID NO: 37).
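The amino acid counts quoted for the ORFs above can be cross-checked from the nucleotide coordinates alone. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive; the function name `orf_codons` is illustrative and does not come from the source.

```python
def orf_codons(start: int, end: int) -> tuple[int, int]:
    """Return (complete codons, leftover bases) for a 1-based inclusive span."""
    length = end - start + 1
    return divmod(length, 3)

# Second ORF of clone a1, nucleotides 880-1206 (coordinates from the text).
print(orf_codons(880, 1206))  # (109, 0)
```

Under the same assumption, the span 53-801 gives 249 complete codons with two bases left over, which is consistent with the 249 residues stated for ORF1.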
  • Immunoreactive clone a3 (not to be confused with clone A3 from library 1) is a β-galactosidase fusion clone with an insert size of 1975 base pairs. The corresponding nucleotide sequence is presented as SEQ ID NO:248.
  • the clone contains two open reading frames (ORFs): ORF1 (nucleotides 3-608, SEQ ID NO:249) and ORF2 (nucleotides 613-1266, SEQ ID NO:250). Based upon the observed size of the immunoreactive protein, i.e., 30 kd, the expected immunoreactive-expressing open reading frame is ORF1. Expression of a3 protein is described in the following section.
  • Clone a5 is an original HindIII/EcoRI genomic clone (1 kb) that produces a 25 kd immunoreactive protein.
  • the corresponding nucleotide sequence is presented as SEQ ID NO:3.
  • the clone contains two open reading frames (ORFs): ORF1 (nucleotides 3-545, SEQ ID NO:5), which codes for a beta-galactosidase fusion protein, and ORF2 (nucleotides 569-1021, SEQ ID NO:4).
  • Clone b5 contains a double insert with an internal EcoRI site.
  • the individual inserts are about 0.3 and 0.2 kb in length.
  • the combined nucleotide sequence for clone b5 corresponds to SEQ ID NO:6, with the open reading frame (ORF) extending from nucleotides 1 through 414.
  • the corresponding protein sequence translation is presented as SEQ ID NO:7.
  • Clone c2 is a library 2 clone with an insert size of 2077 nucleotides (SEQ ID NO:251).
  • the size of the immunoreactive protein on SDS-gel is 30 kd.
  • the corresponding protein sequence translation is presented as SEQ ID NO:253.
  • the protein has a predicted pI of 9.54, with a potential transmembrane region between amino acids 2 through 24.
  • the predicted molecular size of this c2 protein is 30.2 kd, which agrees with the observed size on SDS gel.
  • Clone c5 is a 650 base pair HindIII/HindIII clone (SEQ ID NO:253). Clone c5 is a partial clone which is not in phase with the β-galactosidase gene, and contains a 29 amino acid stretch extending into pBluescript at the N-terminus.
  • the ORF extends from bases 2-604, with a predicted protein size of 23.5 kd (SEQ ID NO:254).
  • the observed protein size on SDS-gel is 30 kd.
  • the putative protein has a pI of 9.6.
  • Clone c13 is a HindIII/HindIII clone with an insert size of 1742 bp (SEQ ID NO:255). It is a β-galactosidase fusion protein with an observed size of 55 kd on SDS gel. The ORF extends from bases 2 to 1420. The corresponding protein possesses a predicted molecular weight of 54.6 kd (SEQ ID NO:256). A sequence search using the SWISS-PROT database indicates that the sequence possesses homologies with threonyl-tRNA synthetase of various organisms, from bacteria to yeast and human.
  • Clone d5 is an EcoRI clone that produces an immunoreactive protein (β-galactosidase fusion protein) of about 70 kd on SDS-PAGE gel.
  • the size of the cloned insert was determined to be 1795 bases (SEQ ID NO:58), with an ORF extending from nucleotides 1-1704.
  • the open reading frame codes for a putative protein composed of 568 amino acids (SEQ ID NO:59), and having a predicted molecular weight of 62.1 kd.
  • the putative protein has a predicted pK value of 5.1, and the predicted antigenic determinant lies in the N-terminus region of the protein.
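The predicted molecular weights quoted for the putative proteins can be sanity-checked against their lengths. The sketch below is a rough approximation only: it assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in the source, and the name `approx_mass_kd` is hypothetical. A composition-based calculation, as presumably used for the values in the text, will differ slightly.

```python
AVG_RESIDUE_DA = 110.0  # assumed rule-of-thumb average residue mass, not from the source

def approx_mass_kd(n_residues: int) -> float:
    """Estimate protein mass in kd from residue count alone."""
    return n_residues * AVG_RESIDUE_DA / 1000.0

# Clone d5: 568 amino acids, stated predicted molecular weight 62.1 kd.
print(round(approx_mass_kd(568), 1))  # 62.5
```

The estimate falls within about 1% of the stated 62.1 kd, which is as close as a length-only approximation can be expected to come.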
  • Clone d7 is a HindIII clone that was blunt-ended, ligated to A/B linkers, digested with EcoRI, and subcloned into lambda vector ZAPII as described above for Example 4.1.
  • the product is a β-galactosidase fusion protein.
  • the nucleotide sequence is presented as SEQ ID NO:11.
  • the induced protein was determined to have sizes of 40 kd and 35 kd.
  • the corresponding protein sequence translated from SEQ ID NO: 11 is presented as SEQ ID NO: 12.
  • Clone f3 is an EcoRI clone having an insert size of 2274 nucleotides (SEQ ID NO:257).
  • Clone f3 produces a β-galactosidase fusion protein.
  • the ORF extends from nucleotides 1-1788 and codes for a putative protein having a predicted molecular size of 67 kd (excluding the β-galactosidase portion) (SEQ ID NO:258).
  • the calculated pI of the protein is 9.66.
  • Streptococcus pneumoniae. A search against the EMBL DNA database reveals 56% homology with penicillin-binding proteins of Haemophilus influenzae.
  • the size of the protein observed on SDS-gel is 25 kd (major band), with a minor band of around 67 kd. This size discrepancy may be due to cleavage of the protein to produce a smaller fragment.
  • F8CON1 is a HindIII subclone that produces an immunoreactive protein of 33-35 kd on SDS gel.
  • the DNA sequence corresponding to the f8 subclone is presented herein as SEQ ID NO:1.
  • the cloned 1459 base DNA subfragment contains an open reading frame (ORF) from nucleotides 134 - 1042.
  • the putative protein encoded by the f8 subclone contains 303 amino acids, with a predicted molecular weight of 33.9 kd (which is in agreement with the expected protein size).
  • the predicted antigenic determinant is at the 3' end of the putative protein - extending from amino acid 270 to amino acid 275 (nucleotides 941-958).
  • the putative protein has a pK of 9.75.
  • the corresponding protein sequence based upon a translation of SEQ ID NO:1 is presented as SEQ ID NO:2.
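Translating an open reading frame into its protein sequence, as done for the clones above, can be sketched as follows. This is a minimal illustration assuming the standard genetic code and a forward-strand ORF; the function name `translate` and the 18-base test string are made up for the example and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str, start: int = 1) -> str:
    """Translate from a 1-based start position until the first stop codon."""
    seq = dna.upper().replace(" ", "")
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unrecognized codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical input; spaces are only there to show the codon boundaries.
print(translate("ATG AAA AAA ATT ATT TAA"))  # MKKII
```

The `start` argument allows an ORF that begins partway into a clone to be translated without slicing the sequence first.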
  • Clone g2 is an EcoRI/HindIII clone with an insert size of 2474 nucleotides (SEQ ID NO:259).
  • ORF1 extends from nucleotide 2 to 445 and codes for the carboxyl end of a putative protein of 148 amino acids (SEQ ID NO:260).
  • ORF2 extends from bases 461-1156, coding for a putative protein of molecular size 25.4 kd (SEQ ID NO:261).
  • ORF3 extends from bases 1156 to 1776 and codes for a protein having a size of about 22.2 kd (SEQ ID NO:262).
  • ORF4 (nucleotides 1798-2472) codes for the amino terminus of a putative protein of 24.4 kd (SEQ ID NO:263). The predicted pIs of ORF2 and ORF3 are 7.82 and 5.26, respectively. While the ORF2 and ORF3 putative proteins do not contain predicted transmembrane regions, the ORF4 putative protein contains 3 predicted transmembrane regions with 6 predicted transmembrane helices.
  • Clone g9 possesses an insert size of 4292 bases. Two subclones, k4 and g11, are contained within clone g9. Clone g9 produces an immunoreactive protein band of about 85 kd, whilst k4 has immunoreactive bands of 55 kd, 46 kd, 35 kd and 30 kd, and g11 produces a weakly immunoreactive band of 22 kd.
  • the complete sequence of g9 was obtained by walking in the 5' direction from the g11 C-terminus sequence, and walking in the 3' direction from the k4 N-terminus sequence (SEQ ID NO:264).
  • Clone g9 has 5 ORFs.
  • ORF1 encodes a partial protein in the region extending from bases 2-349 (SEQ ID NO:265). The corresponding predicted molecular size of the protein is 12.3 kd.
  • ORF2 extends from bases 495-1403 (SEQ ID NO:266), and the corresponding predicted protein possesses a molecular size of 33.8 kd.
  • ORF2 is followed by an "AGGA" Shine-Dalgarno sequence, positioned in front of the ATG of ORF3 (bases 1418-2209).
  • the predicted molecular size for the putative protein encoded by ORF3 is 29.7 kd (SEQ ID NO:267).
  • ORF4 corresponds to bases 2223-3719.
  • the protein encoded by ORF4 has a predicted molecular size of 56.8 kd (SEQ ID NO:268).
  • ORF5 extends from nucleotides 3133 to 4236, with a predicted protein molecular size of 19.9 kd (SEQ ID NO:269).
  • H. pylori antigen coding regions: amplified products from various clone families, e.g., A3, A22, B2, B9, C1, C5, C7, and Gla, were cloned into pGEX vectors (Pharmacia). The cloned constructs were expressed in E. coli strain (Stratagene, La Jolla, CA).
  • the protein was purified by affinity purification chromatography, employing Ni-NTA+ + resin from Qiagen (Chatsworth, CA).
  • the purified protein was paneled against various sources of H. pylori-infected sera.
  • the serum paneling was carried out twice, with differing results: (i) 75% sensitivity, based upon 35 sera samples, and (ii) sensitivity of about 30%, based upon 183 sera.
  • the serum panel was performed using protein that had undergone further SDS gel electroelution following affinity chromatography purification.
  • Amplified products from clone C1CON6V2 were cloned into expression vector pGEXdl65 using primers BF (SEQ ID NO:40) and BG (SEQ ID NO:41) as outlined in Table 5 below.
  • the expressed protein was confirmed to be immunoreactive; however, it was cleaved beyond the transmembrane region to form a smaller immunoreactive band. Based upon this observation, a new forward primer was designed for the expression of the smaller protein, primer BO (SEQ ID NO:42), as indicated in Table 5.
  • the resulting protein was determined to be immunoreactive, and formed in much higher yields than in the previous construct.
  • Expression primers CA forward primer, SEQ ID NO:29
  • CB reverse primer, SEQ ID NO:30
  • Expression primers CC forward primer, SEQ ID NO:31
  • CD reverse primer, SEQ ID NO:32
  • Amplified products from the genomic A3 DNA were digested with NcoI/BamHI restriction enzymes and subcloned into the NcoI and BamHI sites of expression vector pGEXdl65 polyHis, using primers BP (SEQ ID NO:18) and BQ (SEQ ID NO:19) respectively.
  • Expression of protein was induced at 0.5 mM IPTG (isopropylthiogalactopyranoside). The expressed protein was of the expected size, and was confirmed to be immunoreactive against Roost pooled sera.
  • PCR amplified products from H. pylori genomic DNA produced a DNA fragment of the expected size, which was then subcloned into the NcoI/BamHI sites of expression vector pGEXdel65 using primers GD100 (SEQ ID NO:22) and GD101 (SEQ ID NO:23) respectively.
  • the expressed protein was of the expected size, and was highly insoluble. The protein was readily purifiable by Ni-NTA column chromatography as described above. Immunogenic screening confirmed the expressed protein to be immunoreactive against Roost pooled sera.
  • PCR amplified products were subcloned into expression vector pdell65polyHis using primers BW (SEQ ID NO:33) and BX (SEQ ID NO:34) as outlined in Table 5 below.
  • the expressed protein was confirmed to be immunoreactive. However, the expression product was internally cleaved, with cleavage most likely occurring at the transmembrane region.
  • a pellet from a shake flask was spun and then submitted to differential solubilization. Briefly, the pellet was homogenized in a solution containing PBS/5 mM PMSF, and spun for 60 minutes at 4°C, 30k. Subsequent rounds of homogenization/centrifugation were as follows: (i) 100 mM Tris/2% Triton/2 M urea/5 mM EDTA/0.5 mM DTT (homogenization step)/60 minutes at 4°C, 30k (centrifugation); (ii) PBS pH 7.8/2 M urea/0.5 mM DTT/60 minutes at 4°C, 30k; (iii) PBS pH 7.8/4 M urea/0.5 mM DTT/60 minutes at 4°C, 30k; (iv) PBS pH 8.0/6 M urea/0.5 mM DTT/60 minutes at 4°C, 30k; (v) PBS pH 8.0/6 M urea/2 M guanidine HCl/2 mM BME/60 minutes
  • step (iv) above was then dialyzed into PBS pH 8.0/6M Urea/2mM BME, followed by chromatographic separation using a pre-packed column of Chelating Sepharose
  • 8.0/6M Urea/2mM BME and Nickel IMAC Buffer B contained Buffer A and 250 mM imidazole.
  • Nickel IMAC fractions were pooled and dialyzed overnight into PBS pH 8.0/6M Urea/0.5mM DTT, and the final product was then vialed and stored at -80°C.
  • glutathione-sepharose packed columns (Sigma) were utilized.
  • Protein expressed by clone dHA22.8 (corresponding to clone A22) was isolated and purified as follows. 2000 ml of culture pellet was suspended in 200 ml of Buffer B (48 g urea, 1.2 g NaH2PO4,
  • the resin was washed with 50 ml of Buffer C (48 g urea, 1.2 g NaH2PO4, 0.12 Tris-HCl and 90 ml deionized water, adjusted to pH 6.3 and brought to a total volume of 100 ml), centrifuged, and the supernatant discarded. The resin was then loaded into a disposable column, and the wash step repeated with remaining Buffer C. The protein was then eluted with 50 ml of Buffer III (50 mM sodium phosphate, pH 8.0, 300 mM NaCl, 250 imidazole). Fractions were collected (2 ml) and analyzed on 12% SDS-PAGE gel.
  • the purified protein was then screened with serum antibodies using conventional techniques (Ausubel, et al, 1988).
  • the purified protein was slotted at concentrations of 0.1-20 µg/ml in 0.1 M carbonate buffer, blocked with 5% skim milk in TBS buffer, and reacted with H. pylori-positive and H. pylori-negative sera diluted 100 times.
  • a serum panel consisting of 36 total sera was used as indicated above. Positive sera were from Roost pool #2 or SFA 001 pool; negatives were from donor packs.
  • the strips were washed 3 times with TBS buffer, and incubated with alkaline phosphatase-conjugated anti-human Ig secondary antibodies (Promega Biotech, Madison, WI) diluted 1000 times in 5% skim milk in TBS buffer. The strips were then washed 3 times with TBS wash buffer. Immunoreactive proteins were developed with a substrate, e.g., BCIP (5-bromo-4-chloro-3-indolyl-phosphate) and NBT (nitro blue tetrazolium salt) (Sigma). The development of color indicated the highly antigenic nature of the purified A22 protein (i.e., reactive with H. pylori-positive sera).
  • the optimum concentration of antigen was determined to be 2.0 mg/ml.
  • the dHA22.8 (A22) expression product reacted with anti-H. pylori antibodies present in both pooled sera sources (Roost pool #2 and SFA 001 pool), and with 100% of the individual H. pylori-positive sera samples tested.
  • the antigen exhibited no detectable cross reactivity with normal sera.
  • Protein expressed by clone dHC1S.11 (C1) was isolated and purified according to the following protocol.
  • Protein was extracted from a culture pellet by differential solubilization using a series of homogenization/centrifugation steps.
  • the pellet was (i) homogenized in PBS/1 mM PMSF and centrifuged for 30 minutes at 4°C, 19,000 rpm, followed by separation of the supernatant; (ii) homogenized using 1 M urea/10 mM Tris at pH 8.0/10 mM DTT, followed by centrifugation as in (i) and separation of the supernatant; (iii) homogenized in 4 M urea/10 mM Tris pH 8.0/2 mM BME, followed by centrifugation as in (i) and separation of the supernatant; and finally, (iv) homogenized in 6 M urea/10 mM Tris pH 8.0/2 mM BME, followed by centrifugation as described above.
  • the protein was then separated from the combined supernatants by immobilized metal adsorption chromatography (IMAC) using a 5 ml prepacked column containing chelating sepharose (Pharmacia, Chelating Sepharose Fast Flow) loaded with 2 C.V. (column volumes) of 0.2 M NiCl2.
  • the protein was eluted from the column using a 20 C.V. gradient of Buffer A (4 M urea/10 mM Tris at pH 8.0/0.2mM BME/150 mM NaCl) into 50% Buffer B (Buffer A to which was added 0.5 M imidazole). Fractions were collected (2 ml) and analyzed on 12% SDS-PAGE gel.
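The imidazole concentration reached at any point in the elution gradient above follows from simple proportion. The sketch below assumes the gradient is linear and starts at 0% Buffer B, neither of which is stated explicitly in the text; the function name and parameter names are illustrative.

```python
def imidazole_mM(cv_elapsed: float,
                 total_cv: float = 20.0,
                 final_fraction_b: float = 0.5,
                 buffer_b_imidazole_mM: float = 500.0) -> float:
    """Imidazole concentration (mM) after cv_elapsed column volumes of a
    linear gradient from 0% to final_fraction_b Buffer B (assumed shape)."""
    if not 0.0 <= cv_elapsed <= total_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    fraction_b = final_fraction_b * cv_elapsed / total_cv
    return fraction_b * buffer_b_imidazole_mM

for cv in (0, 10, 20):
    print(cv, imidazole_mM(cv))
# 0 0.0
# 10 125.0
# 20 250.0
```

With the 5 ml column described above, 20 column volumes correspond to 100 ml of gradient, so under these assumptions the imidazole concentration rises by 2.5 mM per ml.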
  • the optimal concentration for immunoscreening was determined essentially as described above. Serum panelling as described above was repeated for the purified C1 protein expressed by clone dHC1S.11.
  • the expressed protein was reactive with both sources of pooled H. pylori-positive sera, and showed no signs of cross reactivity with control sera (H. pylori negative samples, 1 sample of pooled sera and 13 individual samples). Additionally, as can be seen in panels
  • the recombinant protein exhibited immunoreactivity with 95% of the H. pylori-positive samples.
  • Gold standard tests were employed as reference standards to compute the sensitivities and specificities of each individual antigen.
  • Standards used were conventional gold standard tests including histology, CLO test, and the UBT test.
  • UBT is currently the most widely used gold standard for indication of "active infection" by non-invasive methods. Histology, CLO (rapid urease test) and UBT all indicate "active infection", since these tests depend on the presence of bacteria to produce a positive result. Since serology measures both past and present infection, the aim of this study was (i) to identify serological markers that would correlate with UBT results for use as "active infection markers", and (ii) to explore both single and multiple antigen combinations to effectively screen for active infection by H. pylori.
  • POW Panel: based upon this paneling study, the best single antigen selection appeared to be Y128D12S and Y124A, where sensitivity and specificity were about 75-80% for Y128D12S, and both about 80% for Y124A. Another preferred antigen is Y261A, which exhibited a sensitivity of 70% and a specificity of 80%. For this paneling series, preferred clones A22 and C1 were found to be about 95-98% sensitive, and about 60% specific.
  • the 2-4 antigen format was based on the criterion that, for any 2-3 highly sensitive antigens, both or all 3 must be positive in order to increase the specificity.
  • This "2 antigen both positive" criterion was applied to a selected 12 antigen set, where the 12 antigens were selected for a sensitivity of at least 40-50% for consideration.
  • the sensitivities and specificities of the 2 antigen both positive criterion are computed, and the resulting table is examined for good performers. Additionally, 2 antigen combinations with high specificity and lower selectivity are run against the entire two antigen combination matrix to provide a result indicating "2 antigen both positive or 2 other antigen both positive". The final analysis then provides a selection of commonly occurring antigens that provide good results across the board between the USC and POW panels.
  • an antigen combination of C1 and c5 was preferred, and for the POW panel, antigen combinations Y261A and Y124A or antigens C7 and B2 also were preferred.
  • preferred antigens for use in reliably and universally detecting H. pylori infection include but are not limited to the following: A22, C1, Y124A, Y261A, c5, C7, B2, Y104B, and Y128D.
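The "2 antigen both positive" criterion described above can be illustrated with a short sketch. The serum results below are invented toy data, not values from the paneling studies, and the function names are hypothetical; the gold-standard calls stand in for a reference test such as UBT.

```python
def sensitivity_specificity(calls, truth):
    """Return (sensitivity, specificity) of test calls against gold-standard truth."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)

def both_positive(a, b):
    """A serum is called positive only if both antigens react."""
    return [x and y for x, y in zip(a, b)]

# Hypothetical panel of 8 sera: 4 gold-standard positive, 4 negative.
truth     = [True, True, True, True, False, False, False, False]
antigen_a = [True, True, True, True, True,  True,  False, False]
antigen_b = [True, True, True, False, False, False, True,  False]

print(sensitivity_specificity(antigen_a, truth))                            # (1.0, 0.5)
print(sensitivity_specificity(antigen_b, truth))                            # (0.75, 0.75)
print(sensitivity_specificity(both_positive(antigen_a, antigen_b), truth))  # (0.75, 1.0)
```

In this toy panel the combined criterion removes every false positive at the cost of the one serum that antigen B misses, which is the trade-off the text describes: higher specificity in exchange for a sensitivity no better than that of the weaker antigen.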
  • H. pylori ATCC No. 43504
  • Prefractionation of soluble H. pylori lysate supernatant was carried out only when enriching for spot 15 (high pH), for subsequent sequence/structural investigations (e.g., mass spectrometry, peptide map).
  • Fractionation of the spot 15 antigen was carried out in a similar fashion.
  • the sample was passed over Sephacryl S-100 ion exchange resin (Pharmacia Biotechnology, Piscataway, NJ) (to 10 mM phosphate buffer, 50 mM NaCl, pH 8.0), followed by further fractionation by gradient elution over Resource S cation exchange resin (Pharmacia Biotechnology, Piscataway, NJ).
  • the fraction containing spot 15 antigen was analyzed by SDS-PAGE and Western blot, and was similarly confirmed to be highly antigenic in nature, as indicated by Western blot and immunoreactivity with Roost pooled sera.
  • Two-dimensional electrophoresis was performed essentially as described by O'Farrell (1975). Isoelectric focusing was carried out in glass tubes having an inner diameter of 2.0 mm, using 2% ampholines (BDH, Hofer Scientific Instruments, San Francisco, CA) for 9600 volt-hrs. The final tube gel pH gradient as measured by a surface pH electrode is on the enclosed pH gradient form.
  • the following proteins were added as molecular weight markers to the agarose which sealed the tube gel to the slab gel: myosin (220 kD), phosphorylase A (94 kD), catalase (60 kD), actin (43 kD), carbonic anhydrase (29 kD), and lysozyme (14 kD). These standards appear as horizontal lines on the silver-stained (Oakley, et al., 1980) 10% acrylamide slab gels. The silver-stained gel was dried between sheets of cellophane paper with the acid edge to the left.
  • the blot was blocked for 2 hours in 2% bovine serum albumin (BSA) in TTBS (Tween-Tris-Buffered Saline), rinsed in TTBS, incubated for 2 hours in primary antibody from Roost pool or negative serum pool diluted 1:2500 in 1% BSA/TTBS, rinsed in TTBS, and placed in a solution containing secondary antibody (anti-human IgG horseradish peroxidase, diluted 1:5000 in TTBS) for 1 hour.
  • the blot was rinsed with TTBS, treated with ECL (Amersham Corporation, Arlington Heights, IL), and exposed to X-ray film.
  • A computer-generated photograph of an exemplary stained membrane containing antigenic proteins from H. pylori as described above is shown in Figure 2.
  • the "normal" human sera were confirmed to be H. pylori-negative using "HELICOBLOT 2.0" (Genelabs Diagnostics (PTE) Ltd.,
  • the tube gel was sealed to the top of the stacking gel, which was placed on top of a 10% acrylamide slab gel (0.75 mm thick), and SDS slab gel electrophoresis was carried out for 4 hours at 12.5 mA/gel.
  • the slab gels were fixed in a solution of 10% acetic acid/50% methanol overnight.
  • the following proteins were added as molecular weight standards to the agarose which sealed the tube gel to the slab gel: myosin (220 kD), phosphorylase A (94 kD), catalase (60 kD), actin (43 kD), carbonic anhydrase (29 kD) and lysozyme (14 kD). These standards appear as horizontal lines on the silver-stained (Oakley, et al., 1980) 10% acrylamide slab gels. The silver-stained gel was dried between sheets of cellophane paper with the acid edge to the left.
  • a duplicate gel was transblotted onto PVDF paper and Western blotting carried out as described in 8.1.b. above.
  • the PVDF blots from Example 8 above were stained with Coomassie brilliant blue. Spots corresponding to Western positive bands were excised by scalpel and sequenced directly using a Hewlett-Packard G1005A N-terminal sequencer with conventional sequencing techniques (e.g., Miller, 1994; Spiecher, 1989). The sequencing techniques employed gave high repetitive yields (typically ranging from 93-98%), with a detection limit of approximately 100-200 fmol.
  • The following techniques were utilized to obtain internal sequence information for the above-described isolated antigens of H. pylori (Allen, 1981): Lys-C peptide map and sequencing, CNBr peptide map and sequencing, OPA/CNBr peptide map and sequencing, and LC-MS/MS sequencing of Lys-C digests.
  • Cyanogen bromide (CNBr) cleavage was performed on PVDF membranes in 70% formic acid (Crimmins and Mische, 1996). The cyanogen bromide digested peptides were then either repurified by capillary HPLC and sequenced directly, or subjected to ortho-phthalaldehyde (OPA) modification.
  • LC-MS Liquid chromatography-mass spectrometry
  • a Carlo Erba Phoenix 20 CU pump was used to deliver a mixture of methoxy ethanol and isopropanol (1:1, v/v) at a rate of 50 microliters per minute, which was combined with the column eluent in a post column mixing chamber.
  • An in-line flow splitter was used to restrict flow to the mass spectrometer to approximately 10 microliters/minute. Detection was performed immediately following elution from the column at 214 nanometers using an ABI 759 variable wavelength detector. Mass spectrometric detection was achieved following post column solvent addition and flow splitting by a VG BioQ triple quadrupole mass spectrometer with a nano-electrospray ion source.
  • Spectra were recorded in the positive ion mode using electrospray ionization. Calibration of the instrument was performed in the range m/z 500-2000 by using direct injection analysis of myoglobin. Spectra were recorded at 1.5-second intervals and a drying gas of nitrogen was used to aid evaporation of the solvent. The capillary voltage was maintained at approximately 4 kV with a source temperature of 60°C.
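To see why myoglobin is a convenient calibrant for the m/z 500-2000 window, it helps to list which of its protonated charge states fall inside that range. The sketch below uses the usual positive-ion electrospray relation m/z = (M + z·m_H)/z. The myoglobin mass of 16951.5 Da is an assumption (a commonly cited average mass for horse apomyoglobin); the source does not state the species or mass used, and the function names are illustrative.

```python
PROTON_DA = 1.00728      # mass of a proton in daltons
MYOGLOBIN_DA = 16951.5   # assumed average mass of horse apomyoglobin; not from the source

def mz(mass_da: float, charge: int) -> float:
    """m/z of an ion carrying `charge` protons in positive ion mode."""
    return (mass_da + charge * PROTON_DA) / charge

def charges_in_window(mass_da: float, lo: float = 500.0, hi: float = 2000.0,
                      max_charge: int = 60) -> list[int]:
    """Charge states whose m/z falls within [lo, hi]."""
    return [z for z in range(1, max_charge + 1) if lo <= mz(mass_da, z) <= hi]

charges = charges_in_window(MYOGLOBIN_DA)
print(charges[0], charges[-1])  # 9 33
```

Under these assumptions, charge states 9+ through 33+ all fall inside the calibration window, giving many reference peaks spread across the range.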
  • spots 9, 11, 12, 13, 15, 16 (major and minor), and 17 represent unique antigens.
  • Spot 9 represents the native H. pylori antigen corresponding to the recombinant protein expressed by ORF2 of clone a1.
  • spot 12 represents the native antigen that corresponds to the antigenic protein encoded by clone a5.
  • spots 9 (major and minor), 10, 12, 13, and 15 represent unique antigens.
  • the N-terminus of the spot 15 antigen high pH
  • 2-dimensional gel analysis was later shown to be inaccurate.
  • the H. pylori spot 15 antigen as described above was isolated from 3 different sources of H. pylori as described in Examples 8 and 9.
  • the H. pylori samples used were the following: H. pylori ATCC 43504 (Australian type strain); H. pylori strain #26695; and H. pylori J-170 (the latter two obtained from Washington University School of Medicine, St. Louis, MO).
  • the corresponding protein isolated from each source was then analyzed by reverse phase high performance liquid chromatography (HPLC) as described in Example 9.
  • the spot 15 antigen appears to be a processed product of the putative 36 kD protein, as shown in Fig. 4.
  • H. pylori strains tested were the following: Chico (clinical isolate from Oroville
  • J170, A-lc Lithuanian isolate, which contains at least 6 kb of DNA referred to as "X" segment as an insertion near cag region and not present in C-3c
  • Rus-95 isolated from Russian immigrant to the United States
  • #9 Peruvian isolate
  • C-3c Lithuanian isolate
  • ATCC 43504 are highly conserved between various strains of H. pylori.
  • Western blot results further supported the highly antigenic nature of this protein.
  • Gel pieces stained with Coomassie or compatible silver stain as described in Example 9 were transferred to a microcentrifuge tube and rehydrated with 10 microliters of water. The gel pieces were then washed 3 times with 500 microliters 50% acetonitrile/0.05 M Tris-HCl, pH 8.5 for 20 minutes.
  • the supernatants were discarded and the washed pieces were dried for 30 minutes in a Speed- Vac concentrator.
  • Five microliters of a solution containing 0.05 micrograms Lys-C was added to the tube and incubated for 20 hours at 32°C. Following digestion, the gel pieces were extracted three times with 30 microliters 50% acetonitrile/0.1% TFA.
  • the supernatants were transferred to a 0.5 ml microcentrifuge tube and dried.
  • the extracted peptides were redissolved in 4 microliters 4-hydroxy-alpha-cyanocinnamic acid and a 0.8 microliter sample was spotted onto a MALDI sample plate.
  • Antigens for Use in Vaccines Representative antigens described herein were evaluated as vaccine candidates on the basis of Western blot analyses (as described above) of sera obtained from patients prior to and after antimicrobial treatment.
  • the Greenberg panel was obtained from male and female patients in California, also aged 18-70, who were diagnosed as having H. pylori infection, as confirmed in antibody tests. Serum was collected from the patients prior to antimicrobial treatment and 24 months after treatment.
  • Table 13 summarizes the data obtained from the Gasbarrinni panel. The numbers and percentages of patients who exhibited high antibody titre against the indicated antigens at 12 months are indicated therein. Twelve months after treatment, a high percentage of patients continued to exhibit high antibody titre against clones Y139, Y146B, Y175A, and A22.
  • Table 14 summarizes the data obtained from the Greenberg panel. The numbers and percentages of patients who exhibited high antibody titre against the indicated antigens at 24 months are shown. Twenty-four months after treatment, a high percentage of patients continued to exhibit high antibody titre against clones Y184A, Z9A, Y261Ains and Y146B. Since antigens which invoke a long-lasting antigenic response are considered to be good vaccine candidates, and based upon the results provided in the Tables below, the following antigens are considered to be preferred vaccine candidates: Y139, Y146B, Y175A, A22, Y184A, Z9A, and Y261Ains.
  • ADDRESSEE Dehlinger & Associates
  • STREET P.O. Box 60850
  • TELECOMMUNICATION INFORMATION (A) TELEPHONE: 650-324-0880 (B) TELEFAX: 650-324-0960
  • TCTTTATCCC ACAAGCTCAT CTAAAACCAC ACCCGCTAAA AACTAAAATT AACAAAAACT 1260 AAAATCTTTT TTAAGAGCCT ACACGAGCGA GCAAAAAGAA TGACAATCAA TAAAAACGAA 1320
  • Met Lys Lys Ile Ile Leu Ala Cys Leu Met Ala Phe Val Gly Ala Asn
  • Glu Ile Lys Asn Ala Leu Ile Ser Ala Tyr Ala Arg Val Leu Thr Pro
  • Asn Lys Asn Phe Ala Ile Thr Arg Leu Gln Ser Leu Leu Tyr Lys Glu 275 280 285 Leu Lys Asp Tyr Ala Asn Lys Glu Gly Gln Gly Asn Thr Gly Leu 290 295 300
  • AAAGAAGAAA AATTGGCGTG CATGACAATG AAGTCTTTCA AACCTTGTAT TATGAAGCGA 360
  • Glu Lys Ile Val Phe Asp Leu Pro Lys Thr Ile Ile Glu Gln Glu Met
  • Ala Met Ile Glu Asp Arg Val Leu Ala Tyr Leu Leu Asp Lys Asn Leu 145 150 155 160
  • TTTAGAAAAA CCTTTAAAAA AACCACACAA ACACAGCTTT TTAGCCGCTT CAAAAGCGTT 300 AGAAGAGAGC AAACGGCAGG CCTTAAAAGT CGCAAGCACG GACGCTAATG TCATGCTATT 360
  • GGTTCAGACT GCACCTGTTA CTACAGAACC AGCTCCAGAG AAAGAAGAGC CTAAACAAGA 1200 GCCAGCTCCA GTGGTTGAAG AAAAGCCGGC TATTGAAAGC GGGACTATCA TCGCTTCTAT 1260
  • TTTAGTCATT AAAGGGGTAG AAAAAGATAT GATCAAAACC ATCAGTTTTG GTGAAACCAA 1500 ACCCAAATGC GCCCAAAAAA CTAGAGAATG TTACAAAGAA AACAGAAGAG TGGATGTCAA 1560
  • Asp Asn Lys Ser Val Lys Ile Asp Val Arg Phe Ile Ser Ala Thr Asn 210 215 220
  • Lys Ile Gln Ala Phe Asp Trp 305 310
  • GGAAATTCCA AAAGAACCAA ATGGCTCCGG ATTTTTCTAA AGCCGCTTTC GCTTTAACTT 480 CTGGGGATTA CACTAAAACC CCTGTTAAAA CAGAGTTTGG TTATCATATT ATCTATTTGA 540
  • CTGCAGCAGG CAATATTGGT GGTGGAGGTT TTGCGGTTAT CCATTTGGCT AATGGTGAAA 60 ATGTTGCCTT AGATTTTAGA GAAAAAGCCC CCTTGAAAGC CACTAAAAAC ATGTTTTTAG 120
  • AACTTTATAA TCTACCAATC CATCGCATGA CTTTTAAAAT ACTCAAAGAT CCTAGATGAG 1440 AGCTTGAGTT GGATTGACTT TAGTTTATTT TAATTTTTCT TTATTTTGAA ATATCTTGAA 1500
  • GGGATTTAGT CAATAACAGC GTGCTTTTAG TGGAAAATGA GCATAAAGAA AAATTAAAAG 780

Abstract

The present invention pertains to the characterization and isolation of newly discovered polypeptide antigens of H. pylori. Disclosed herein are cluster families of DNA replicas of portions of the genome of H. pylori encoding antigens which are highly immunogenic. Also disclosed are new antigenic proteins recovered from H. pylori. The invention also provides methods employing the above-described antigens.

Description

ANTIGENIC COMPOSITION AND METHOD OF DETECTION FOR HELICOBACTER PYLORI
Field of the Invention
The present invention relates to one or more antigens of H. pylori, to polynucleotide sequences coding for the antigens, and to diagnostic and therapeutic methods employing such antigens and polynucleotides.
References
Allen, G., LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY: SEQUENCING OF PROTEINS AND PEPTIDES. Elsevier Science/North-Holland Biomedical Press, New York (1981).
Altschul, S., et al., J. Mol. Biol. 215:403-410 (1990).
Atherton, J.C., and Spiller, R.A., Gut 35:723-725 (1994).
Ausubel, F.M., et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY. John Wiley and Sons, Inc., Media PA (1988).
Bauer, A.W., et al., Anal. Biochem. 137:134-142 (1984).
Beames, et al., Biotechniques 11:378 (1991).
Blaser, M., Scientific American, p. 104 (1996).
Cover, T., et al., Adv. Int. Med. 41:85-117 (1996).
Cover, T., ASM News 61:21 (1995).
Crea, R., U.S. Patent No. 4,888,286, issued December 19, 1989.
Crimmins, D.L. and Mische, S.M., in CURRENT PROTOCOLS IN PROTEIN SCIENCE, Units 11.5, John Wiley and Sons, Inc., New York (1996).
Dubois, A., et al., Infection and Immunity 64(8):2885-2891 (1996).
Earl, P.L., et al., "Expression of proteins in mammalian cells using vaccinia" in CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (Ausubel, F.M., et al., Eds.), Greene Publishing Associates & Wiley Interscience, New York (1991).
Eaton, M.A.W., et al., U.S. Patent No. 4,719,180, issued Jan. 12, 1988.
Gellissen, G., et al., Antonie Van Leeuwenhoek 62(1-2):79-93 (1992).
Goeddel, D.V., Methods in Enzymology 185 (1990).
Guthrie, C., and Fink, G.R., Methods in Enzymology 194 (1991).
Harlow, E., et al., ANTIBODIES: A LABORATORY MANUAL. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1988).
Haynes, J., et al., Nuc. Acid. Res. 11:687-706 (1983).
Kaufman, R.J., "Selection and coamplification of heterologous genes in mammalian cells," in METHODS IN ENZYMOLOGY, vol. 185, pp. 537-566. Academic Press, Inc., San Diego CA (1991).
Lau, Y.F., et al., Mol. Cell. Biol. 4:1469-1475 (1984).
Maniatis, T., et al., MOLECULAR CLONING: A LABORATORY MANUAL. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1982).
McAtee, C.P., et al., J. Chromatog. B. 685:91-104 (1996).
Miller, C.G., Methods: A Companion to Methods in Enzymology 6:315-333 (1994).
Moss, B., et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (Section IV, Unit 16) (1991).
Mullis, K.B., U.S. Patent No. 4,683,202, issued 28 July 1987.
Mullis, K.B., et al., U.S. Patent No. 4,683,195, issued 28 July 1987.
Nakayama, H., et al., J. Chromatog. A 730:279-287 (1996).
Oakley, B.R., et al., Anal. Biochem. 105:361-363 (1980).
O'Farrell, P.H., J. Biol. Chem. 250:4007-4021 (1975).
O'Farrell, P.H., et al., Cell 12:1133-1142 (1977).
Parsonnet, J., et al., N. Engl. J. Med. 325:1127-1131 (1991).
Porath, J., Protein Exp. and Purif. 3:263 (1992).
Pearson, W.R. and Lipman, D.J., PNAS 85:2444-2448 (1988).
Pearson, W.R., Methods in Enzymology 183:63-98 (1990).
Reilly, P.R., et al., BACULOVIRUS EXPRESSION VECTORS: A LABORATORY MANUAL (1992).
Reyes, G., et al., Molecular and Cellular Probes 5:473-481 (1991).
Romanos, M.A., et al., Yeast 8(6):423-488 (1992).
Sambrook, J., et al., in MOLECULAR CLONING: A LABORATORY MANUAL, Vol. 2, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989).
Sanger, et al., Proc. Natl. Acad. Sci. 74:5463-5467 (1977).
Scharf, S.J., et al., Science 233:1076 (1986).
Smith, D.B., et al., Gene 67:31 (1988).
Spiecher, D.W., in TECHNIQUES IN PROTEIN CHEMISTRY (T. Hugli, Ed.), Academic Press, San Diego, CA, pp. 24-35 (1989).
Tomb, et al., Nature 388:539 (1997).
Warren, J.R., and Marshall, B., Lancet 1:1273-1275 (1983).
Yoshio, T., et al., U.S. Patent No. 4,849,350, issued July 18, 1989.
Background of the Invention
H. pylori is a human gastric pathogen associated with chronic superficial gastritis, peptic ulcer disease, and chronic atrophic gastritis leading to gastric adenocarcinoma, although for most of this century, peptic ulcer disease was thought to be stress-related rather than caused by H. pylori infection. Acceptance of the causal role of H. pylori in peptic ulcer disease and gastric inflammation developed when studies showed that human subjects who ingested H. pylori developed gastritis, a condition that was resolved after the infection was eliminated by antibiotic treatment (Warren and Marshall, 1983).
H. pylori is a micro-aerophilic, Gram-negative, slow-growing, flagellated organism with a spiral or S-shaped morphology which infects the lining of the stomach. H. pylori was originally cultured from gastric biopsy in 1982 and was placed in the Campylobacter genus based upon gross morphology. In 1989, the new genus Helicobacter was proposed and accepted, with H. pylori being its sole member (Blaser, 1996). Currently, 13 members of the Helicobacter genus are recognized.
One unusual feature shared by all members of the genus is the presence of a urease operon which encodes a 540 kD cell surface enzyme, with urease being one of the most abundant surface proteins produced by the bacteria. This enzyme is a multisubunit urease which functions to hydrolyse urea into carbon dioxide and ammonia (Cover, 1995). The resulting ammonia molecules surround the bacteria, thereby neutralizing the acid in their immediate vicinity. Thus, urease is crucial for the survival of H. pylori at acidic pH and for its successful colonization of the gastric environment.
H. pylori infection is one of the most common chronic bacterial infections in humans. H. pylori infection is found in over 90% of patients with active gastritis, and the presence of H. pylori in the gastric mucosa has been associated with mucosa-associated lymphoid tissue lymphomas (Cover, et al., 1996). In developed countries, about half of the population has been colonized with H. pylori by age 50, and in developing countries, colonization is common even among children. Further, one to two out of ten infected individuals will develop peptic ulcer disease in the course of a lifetime.
Current approaches for assessing H. pylori infection typically employ invasive techniques such as collection of gastric biopsy specimens, culture, histology and detection of pre-formed bacterial enzymes. These approaches suffer from numerous drawbacks including low sensitivity, inconvenience, and high cost. Non-invasive approaches include the urea breath test, UBT (Atherton, et al., 1994) and serological tests which utilize various H. pylori antigens for detecting anti-H. pylori antibodies. The urea breath test relies upon the presence of the urease enzyme from H. pylori to convert isotopic urea to isotopic carbon dioxide (the analyte) and ammonia. Although the 13C- (or 14C-) urea breath test is fairly accurate, it suffers from its reliance upon a mass spectrometer for analyzing patient breath samples, a piece of equipment absent from most clinical settings. Moreover, existing serological assays are not yet reliable enough for routine diagnostic use, and are typically utilized in epidemiological studies for retroactively assessing infection.
In view of its importance as a human pathogen, and the persistence of H. pylori infection amongst the world population, a clear need exists for methods and compositions effective for reliably detecting and treating infection by genetically diverse strains of H. pylori. Ideally, the reagents for such an assay should be readily and reproducibly prepared, in addition to being highly selective and specific for H. pylori. Moreover, the method should be accurate and exhibit high sensitivity, in addition to being simple, convenient and cost-effective.
Summary of the Invention
The invention pertains to the discovery and characterization of new, highly immunogenic polypeptide antigens of H. pylori. Also forming part of the invention are 69 heretofore unrecognized immunogenic cluster families. The sequence and location of these cluster families within the H. pylori genome were determined on the basis of the over 250 disclosed DNA replicas of portions of the genome of H. pylori discovered to encode highly immunogenic antigens. Also disclosed are native antigenic proteins recovered from H. pylori using a proteomics methodology. The invention further provides methods employing one, several, many or each of the above-described antigens. Also forming part of the invention are a diagnostic kit and method employing one, several, many or each of the herein described antigenic proteins to detect H. pylori infection, where the assay is effective for detecting active H. pylori infection.
In one aspect, the present invention includes H. pylori genomic polynucleotides encoding one or more of the polypeptide antigens described herein. With respect to the polynucleotides, some aspects of the invention include H. pylori derived RNA and DNA polynucleotides, recombinant H. pylori polynucleotides, a recombinant vector including any of the above polynucleotides, and a host cell transformed with any of these vectors.
These polynucleotides encode H. pylori-specific polypeptide antigens. The corresponding coding sequences allow for the production of polypeptides which are useful, for example, as reagents in diagnostic tests and/or as components of vaccines.
Preferred polynucleotides are H. pylori antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA fragment identified by SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D). Polynucleotides encoding these antigens are particularly preferred due to the high sensitivity and specificity exhibited by the resulting antigens.
Other polynucleotides contemplated by the invention are antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA fragment (typically at least about 18 nucleotides in length) spanning one of the following DNA fragment clusters corresponding to SEQ ID NOs:469-547:
(1). the DNA fragment having the sequence spanning SEQ ID NOS:222 and 223 (cluster 1, corresponding to SEQ ID NOS:469, 470);
(2). the DNA fragment having the sequence spanning SEQ ID NOS:8, 135, 161, 162, 218 (cluster 2, SEQ ID NO:471);
(3). the DNA fragment having the sequence spanning SEQ ID NOS:11, 60, 73 (cluster 3, SEQ ID NO:472);
(4). the DNA fragment having the sequence spanning SEQ ID NOS:75, 127, 20, 169 (cluster 4, SEQ ID NO:473);
(5). the DNA fragment having the sequence spanning SEQ ID NOS:175 and 176 (cluster 5, SEQ ID NOs:474, 475);
(6). the DNA fragment having the sequence spanning SEQ ID NOS:115, 3, and 227 (cluster 6, SEQ ID NO:476);
(7). the DNA fragment having the sequence spanning SEQ ID NO:206 (cluster 7, SEQ ID NO:477);
(8). the DNA fragment having the sequence spanning clones Y291/T7 and Y291/T3 (cluster 8, SEQ ID NO:478);
(9). the DNA fragment having the sequence spanning SEQ ID NO:35 (cluster 9, SEQ ID NO:479);
(10). the DNA fragment having the sequence spanning SEQ ID NOS:160, 100, 198, 159, 134, 197, 99 (cluster 10, SEQ ID NO:480);
(11). the DNA fragment having the sequence spanning SEQ ID NOS:201, 155, 202, and 156 (cluster 11, SEQ ID NO:481);
(12). the DNA fragment having the sequence spanning SEQ ID NOS:1, 114, 192, 130, 113, 191, 129 (cluster 12, SEQ ID NO:482);
(13). the DNA fragment having the sequence spanning SEQ ID NOS:117, 116, 214, 203, 213, and 110 (cluster 13, SEQ ID NO:483);
(14). the DNA fragment having the sequence spanning SEQ ID NOS:158, 95, 94, 157 (cluster 14, SEQ ID NO:484);
(15). the DNA fragment having the sequence spanning SEQ ID NOS:153 and 154 (cluster 15, SEQ ID NOs:485, 486);
(16). the DNA fragment having the sequence spanning SEQ ID NOS:140 and 141 (cluster 16, SEQ ID NO:487);
(17). the DNA fragment having the sequence spanning SEQ ID NOS:181, 182 (cluster 17, SEQ ID NO:488);
(18). the DNA fragment having the sequence spanning SEQ ID NOS:138, 144, 15, and 137 (cluster 18, SEQ ID NO:489);
(19). the DNA fragment having the sequence spanning SEQ ID NOS:51 and 80 (cluster 19, SEQ ID NO:490);
(20). the DNA fragment having the sequence spanning SEQ ID NOS:98 and 125 (cluster 20, SEQ ID NO:491);
(21). the DNA fragment having the sequence spanning SEQ ID NOS:43, 28 (cluster 21, SEQ ID NO:492);
(22). the DNA fragment having the sequence spanning SEQ ID NO:6 (cluster 22, SEQ ID NO:493);
(23). the DNA fragment having the sequence spanning SEQ ID NOS:78, 212, 142, 79 (cluster 23, SEQ ID NO:494);
(24). the DNA fragment corresponding to SEQ ID NO:495;
(25). the DNA fragment having the sequence spanning SEQ ID NO:259 (cluster 25, SEQ ID NO:496);
(26). the DNA fragment having the sequence spanning SEQ ID NO:164 (cluster 26, SEQ ID NO:497);
(27). the DNA fragment having the sequence spanning SEQ ID NO:174 (cluster 27, SEQ ID NO:498);
(28). the DNA fragment having the sequence spanning SEQ ID NOS:119, 83, 85, 136, 84, 120, 24 (cluster 28, SEQ ID NO:499);
(29). the DNA fragment having the sequence spanning SEQ ID NOS:194, 108, 163, and 193 (cluster 29, SEQ ID NO:500);
(30). the DNA fragment having the sequence spanning SEQ ID NO:224 (cluster 30, SEQ ID NO:501);
(31). the DNA fragment having the sequence spanning SEQ ID NOs:150, 173, 199, 183, 230, and 253 (cluster 31, SEQ ID NO:502);
(32). the DNA fragment having the sequence spanning SEQ ID NOS:122, 77, 187 (cluster 32, SEQ ID NO:503);
(33). the DNA fragment having the sequence spanning SEQ ID NOS:88, 184, 89 (cluster 33, SEQ ID NO:504);
(34). the DNA fragment having the sequence spanning SEQ ID NO:91 (cluster 34, SEQ ID NO:505);
(35). the DNA fragment having the sequence spanning SEQ ID NOS:101, 92, 102, and 93 (cluster 35, SEQ ID NOs:506, 507, 508);
(36). the DNA fragment having the sequence spanning SEQ ID NOS:170 and 118 (cluster 36, SEQ ID NO:509);
(37). the DNA fragment having the sequence spanning SEQ ID NOS:205, 104, 87, 204, 103, 126, 86, 264, 271, 151, and 152 (cluster 37, SEQ ID NO:510);
(38). the DNA fragment having the sequence spanning SEQ ID NOS:151, 152, 271 and 264 (cluster 38, SEQ ID NO:511);
(39). the DNA fragment having the sequence spanning SEQ ID NOS:143 and 225 (cluster 39, SEQ ID NO:512);
(40). the DNA fragment having the sequence spanning SEQ ID NOS:165 and 166 (cluster 40, SEQ ID NOs:513, 514);
(41). the DNA fragment having the sequence spanning SEQ ID NO:171 (cluster 41, SEQ ID NO:515);
(42). the DNA fragment having the sequence spanning SEQ ID NOS:209, 216, 217, 208, 215, 38 (cluster 42, SEQ ID NO:516);
(43). the DNA fragment having the sequence spanning SEQ ID NOS:177, 178, 179, 180 (cluster 43, SEQ ID NOs:517, 518);
(44). the DNA fragment having the sequence spanning cluster 44 (SEQ ID NO:519);
(45). the DNA fragment having the sequence spanning SEQ ID NOS:186, 195, 185, 196 (cluster 45, SEQ ID NO:520);
(46). the DNA fragment having the sequence spanning SEQ ID NO:251 (cluster 46, SEQ ID NO:521);
(47). the DNA fragment having the sequence spanning cluster 47 (SEQ ID NOs:522, 523);
(48). the DNA fragment having the sequence spanning SEQ ID NOS:190, 189 (cluster 48, SEQ ID NO:524);
(49). the DNA fragment having the sequence spanning SEQ ID NO:200 (cluster 49, SEQ ID NO:525);
(50). the DNA fragment having the sequence spanning SEQ ID NOS:211, 210 (cluster 50, SEQ ID NO:526);
(51). the DNA fragment having the sequence spanning SEQ ID NOS:168, 81, 132, 82, 121, 133, 167, 228 (cluster 51, SEQ ID NO:527);
(52). the DNA fragment having the sequence spanning SEQ ID NO:146 (cluster 52, SEQ ID NO:528);
(53). the DNA fragment having the sequence spanning cluster 53 (SEQ ID NO:529);
(54). the DNA fragment having the sequence spanning SEQ ID NOS:221 and 220 (cluster 54, SEQ ID NO:530);
(55). the DNA fragment having the sequence spanning SEQ ID NO:74 (cluster 55, SEQ ID NO:531);
(56). the DNA fragment having the sequence spanning SEQ ID NO:72 (cluster 56, SEQ ID NO:532);
(57). the DNA fragment having the sequence spanning SEQ ID NOS:70 and 71 (cluster 57, SEQ ID NO:533);
(58). the DNA fragment having the sequence spanning SEQ ID NOS:96, 97 (cluster 58, SEQ ID NOs:534, 535);
(59). the DNA fragment having the sequence spanning SEQ ID NOS:105 and 106 (cluster 59, SEQ ID NOs:536, 537);
(60). the DNA fragment having the sequence spanning SEQ ID NO:107 (cluster 60, SEQ ID NO:538);
(61). the DNA fragment having the sequence spanning SEQ ID NO:109 (cluster 61, SEQ ID NO:539);
(62). the DNA fragment having the sequence spanning SEQ ID NOS:111 and 112 (cluster 62, SEQ ID NO:540);
(63). the DNA fragment having the sequence spanning cluster 63 (SEQ ID NO:541);
(64). the DNA fragment having the sequence spanning SEQ ID NO:58 (cluster 64, SEQ ID NO:542);
(65). the DNA fragment having the sequence spanning cluster 65 (SEQ ID NO:543);
(66). the DNA fragment having the sequence spanning SEQ ID NO:90 (cluster 66, SEQ ID NO:544);
(67). the DNA fragment having the sequence spanning SEQ ID NO:13 (cluster 67, SEQ ID NO:545);
(68). the DNA fragment having the sequence spanning SEQ ID NO:47 (cluster 68, SEQ ID NO:546);
(69). the DNA fragment having the sequence spanning SEQ ID NO:16 (cluster 69, SEQ ID NO:547).
According to another embodiment, a H. pylori polynucleotide is one that is capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:322, SEQ ID NO:549, where these DNA sequences correspond to cloned and sequenced regions of the H. pylori genome encoding highly immunogenic proteins.
In a preferred embodiment, a H. pylori antigen coding polynucleotide is one that is capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
Alternatively, a H. pylori antigen coding polynucleotide is composed of at least 18 contiguous nucleotides spanning a cluster region selected from the group consisting of SEQ ID NOs: 469-547.
In yet another embodiment, a H. pylori antigen coding polynucleotide according to the invention is composed of at least 18 contiguous nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:322, SEQ ID NO:549.
In yet another embodiment, a H. pylori antigen coding polynucleotide is composed of at least 18 contiguous nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
In another aspect, the invention includes a H. pylori polypeptide antigen, in substantially purified form, characterized by immunoreactivity with H. pylori positive anti-sera. An antigen in accordance with the invention is, in one embodiment, encoded by a polynucleotide that is typically at least 18 nucleotides in length, having the features described above. More specifically, the antigen is encoded by all or a portion of a polynucleotide sequence at least 18 nucleotides in length and capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of SEQ ID NOs:469-547.
Additional antigens according to the invention are encoded by H. pylori antigen coding sequences as described above.
In a preferred embodiment, an antigen is encoded by a polynucleotide at least 18 nucleotides in length and capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
Alternatively, the invention encompasses a H. pylori antigen comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS: 340-468, where the antigen is in substantially purified form and is characterized by immunoreactivity with H. pylori positive anti-sera.
In a particular embodiment, a H. pylori antigen comprises at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 2, 4, 5, 7, 9, 10, 12, 14, 17, 21, 25-28, 36, 37, 39, 44, 48, 55, 59, 61, 69, 249, 250, 252, 254, 256, 258, 260-263, 265-269, 323, 324, and 550-554.
In yet another embodiment, a H. pylori antigen comprises at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 555-602.
Alternatively, a H. pylori antigen comprises a polypeptide sequence selected from the group consisting of SEQ ID NOs:555-602. In one particularly preferred embodiment, a H. pylori antigen is one comprising at least 6 contiguous amino acids contained within a sequence selected from the group consisting of SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
Also forming part of the invention is a diagnostic kit for use in screening a biological fluid such as sera for the presence of anti-H. pylori antibodies. The kit includes a substantially purified H. pylori antigen of the type characterized above that is immunoreactive with at least one anti-H. pylori antibody, and a reporter for detecting binding of the antibody to the antigen. The polypeptide antigen may be attached to a solid support, and the kit may further include a non-attached reporter-labelled anti-human antibody, where binding of the anti-H. pylori antibodies to the polypeptide antigen can be detected by binding of the reporter-labelled antibody to the anti-H. pylori antibodies.
According to one embodiment, the kit includes at least two H. pylori antigens having different antibody specificities.
In a related aspect, the invention includes a method of detecting H. pylori infection in a subject, or detecting the eradication of the bacteria in a previously infected subject. The method involves reacting a biological fluid sample from a subject with a purified H. pylori polypeptide antigen of the type described above, and examining the antigen for the presence of bound antibody.
Preferred antigens for use in the method correspond to one of the following polypeptide sequences or a contiguous region contained therein: SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
In still another aspect, the invention includes a H. pylori vaccine composition containing a H. pylori polypeptide antigen of the type described above. The antigen is characterized by its ability to reduce the level of H. pylori infection in a mammalian model system, such as a mouse or rhesus monkey challenged with the peptide, and then infected with H. pylori.
Preferred antigens for use in a vaccine are those which invoke a long-lasting antigenic response, as evidenced by the persistence of antibodies for an extended period of time subsequent to antimicrobial treatment. Representative antigens for use in a vaccine composition are selected from the group consisting of SEQ ID NO:565 (Y139), SEQ ID NO:575 (Y146B), SEQ ID NO:555 (Y175A), SEQ ID NO:44 (A22), SEQ ID NO:569 (Y184A), SEQ ID NO:578 (Z9A), SEQ ID NO:557 (Y261A) and SEQ ID NO:575 (Y146B).
These and other objects and features of the invention will become more fully apparent when the following detailed description is read in conjunction with the accompanying figures and examples.
Brief Description of the Figures
Fig. 1 is a computer scanned image of a Western blot of a 2-dimensional (2D) sodium dodecyl sulfate polyacrylamide gel electrophoretic (SDS-PAGE) analysis (high pH conditions) of native H. pylori antigens blotted with Roost pooled sera (H. pylori positive);
Fig. 2 is a computer scanned image illustrating a Western blot of a sodium dodecyl sulfate polyacrylamide gel 2-dimensional electrophoretic (SDS-PAGE) analysis (low pH conditions) of native H. pylori antigens blotted with Roost pool;
Fig. 3 shows a schematic representation of the amino acid translation of ORF3 of clone Y104-1.asm (299 amino acids). Regions of sequence indicated in the figure were confirmed by amino acid sequence analysis of the Y104-1.asm protein expressed in E. coli strain XL1Blue, and correspond to (i) the 36 kD protein of H. pylori (indicated by underline), (ii) "spot 15" peptides (indicated by a box), and (iii) peptides encoded by clone d7 (bracketed);
Fig. 4 is a schematic representation of the in vivo processing pathway of the 36 kD protein of H. pylori and its relationship to the "spot 15" antigen disclosed herein;
Fig. 5 is a reverse phase HPLC peptide profile corresponding to the "spot 15" antigen isolated from H. pylori (ATCC 43504) which illustrates the presence of numerous identifiable protein peaks;
Fig. 6 presents the amino acid sequence of the 36-kD protein of H. pylori, where underlining indicates the 28 kD protein region;
Fig. 7 is a schematic representation of the proteome methodology employed to identify several native antigens of H. pylori;
Fig. 8 is a graphical representation summarizing percent sensitivity of various clones against representative H. pylori-immunopositive sera panels;
Fig. 9 is a linear representation of the H. pylori genome, indicating the approximate positions of immunogenic cluster regions forming one aspect of the present invention;
Figs. 10-63 are linear maps indicating the relative positions of immunogenic subclones within the clusters (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (23), (25), (27), (28), (29), (30), (32), (33), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (53), (54), (58), (59), (61), (62), (68), and (69), respectively;
Figs. 64A-D present a summary of the immunogenic clone clusters of the present invention including (i) cluster number, (ii) clones defining the start and end regions of each cluster, (iii) coordinates of the cluster consensus region within the H. pylori genome, and (iv) expression data;
Fig. 65 is a graphical representation of comparative sensitivities of H. pylori recombinant antigens against various sera panels. Sensitivity values were calculated using various gold reference standards as described in Example 6B;
Fig. 66 is a graphical representation of comparative specificities of H. pylori recombinant antigens against various sera panels, computed against gold standards as indicated; and
Figs. 67A and 67B provide a tabular summary of (i) immunopositive clones forming the basis of the invention, (ii) their corresponding cluster numbers (indicating the relative position of each clone and cluster within the H. pylori genome), and relative (iii) sensitivity and (iv) specificity values.
Detailed Description of the Invention
I. Definitions
The following terms, as used herein, have the meanings as indicated:
A polypeptide sequence or fragment is "derived" from another polypeptide sequence or fragment when it has the same sequence of amino acid residues as the corresponding region of the fragment from which it is derived.
A polynucleotide sequence or fragment is "derived" from another polynucleotide sequence or fragment when it has the same sequence of nucleic acid residues as the corresponding region of the fragment from which it is derived.
A first polynucleotide fragment is "selectively-hybridizable" to a second polynucleotide fragment if the first fragment or its complement can form a double-stranded polynucleotide hybrid with the second fragment under selective (stringent) hybridization conditions. The first and second fragments are typically at least 15 nucleotides in length, preferably at least 18-20 nucleotides in length. Selective (stringent) hybridization conditions are defined herein as hybridization at ~45 °C in ~1.1 M salt followed by at least one wash at 37 °C in 0.3 M salt. Such conditions typically allow at most about 25-30% basepair mismatches.
Two or more polynucleotide or polypeptide fragments have at least a given percent "sequence identity" if their nucleotide bases or amino acid residues are identical, respectively, in at least the specified percent of total base or residue position, when the two or more fragments are aligned such that they correspond to one another using a computer program such as ALIGN. (The ALIGN program is found in the FASTA version 1.7 suite of sequence comparison programs, Pearson and Lipman, 1988; Pearson, 1990).
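As a rough illustration of the "sequence identity" definition above, the following Python sketch computes the percentage of identical positions between two fragments that have already been aligned. It is not the ALIGN program; the function names, the use of "-" as a gap character, and the choice to skip columns where both sequences carry a gap are assumptions made for this example only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned positions at which the two residues are identical.

    Both inputs must already be aligned (equal length, gaps written as `gap`).
    Columns in which both sequences have a gap are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no positions to compare")
    return 100.0 * identical / compared


def has_sequence_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the fragments share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned four-base fragments that differ at one position share 75% identity under this sketch, so they would meet a 70% threshold but not an 80% one.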
An "H. pylori polynucleotide" as used herein, refers to a polynucleotide sequence derived from the genome of H. pylori and variants thereof. H. pylori polynucleotides of the type disclosed herein encode H. pylori polypeptide antigens, where the resulting antigen is characterized by immunoreactivity with H. pylori positive anti-sera. Generally, a H. pylori polynucleotide of the invention will be at least about 18 or more nucleotides in length (i.e., encoding a peptide antigen of 6 amino acids). In an alternative embodiment, the H. pylori polynucleotide will be at least about 24 nucleotides in length (i.e., encoding a peptide antigen of 8 amino acids). In yet another embodiment, the H. pylori polynucleotide will be at least about 30 nucleotides in length. In some instances, the H. pylori polynucleotide will range from about 45 to 75 nucleotides in length, but may of course be longer. The polynucleotides of the invention may be obtained from natural or synthetic sources, or may be prepared recombinantly. The polynucleotide sequence may be a naturally-occurring sequence, or it may be related by mutation, including single or multiple base substitutions, by deletions, by insertions and inversions, to a particular naturally-occurring sequence, provided that the subject polynucleotide is capable of expressing a H. pylori antigen as described herein. The polynucleotide sequence may optionally contain expression control sequences (typically from a heterologous source) positioned adjacent to the coding region.
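The length relationships stated above follow from the three-nucleotide codon: 18 nucleotides encode 6 amino acids and 24 nucleotides encode 8. The short Python sketch below shows this arithmetic and, as a further illustration, enumerates every contiguous 18-nucleotide window of a sequence. The function names and the default window size are choices made for this example and do not appear in the specification.

```python
from typing import Iterator

CODON_LENGTH = 3  # nucleotides per encoded amino acid


def encoded_peptide_length(num_nucleotides: int) -> int:
    """Number of complete codons (amino acids) in a coding stretch."""
    return num_nucleotides // CODON_LENGTH


def contiguous_windows(sequence: str, window: int = 18) -> Iterator[str]:
    """Yield each contiguous sub-sequence of `window` nucleotides.

    Yields nothing if the sequence is shorter than `window`.
    """
    for start in range(len(sequence) - window + 1):
        yield sequence[start:start + window]
```

A 20-nucleotide sequence, for instance, contains three distinct 18-nucleotide windows (starting at positions 1, 2 and 3).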
The nucleotide sequences described herein are meant to encompass variants possessing essentially the same "sequence identity" as defined above. Nucleotide sequences having essentially the same sequence identity are typically selectively hybridizable to one another under selective (stringent) hybridization conditions. This is to say, a nucleic acid fragment is considered to be selectively hybridizable to a H. pylori polynucleotide if it is capable of specifically hybridizing to the H. pylori polynucleotide sequence or a variant thereof (e.g., a probe that hybridizes to a H. pylori polynucleotide but not to polynucleotides from other members of the Helicobacter family) under stringent hybridization and wash conditions.
An "H. pylori antigenic polypeptide" is meant to encompass immunoreactive variants of the polypeptide, or regions, or parts thereof, provided that the variant is immunogenic. A suitable variant is defined as any polypeptide having a sequence that is identical (i.e., shares sequence identity) to that of a H. pylori polypeptide. For example, an antigenic polypeptide that is essentially identical to a H. pylori polypeptide antigen is (i) encoded by a nucleic acid that selectively hybridizes to sequences of H. pylori or its variants or (ii) is encoded by H. pylori or its variants.
A sequence comparison may also be employed for the purpose of determining "polypeptide homology", e.g., by using the local alignment program LALIGN. In carrying out such a determination, a polypeptide sequence is typically compared against a selected H. pylori amino acid sequence or any of its variants, as defined above, using the LALIGN program with a ktup of 1, default parameters and the default PAM.
Any polypeptide with an optimal alignment longer than about 6 to 8 amino acids and greater than 70%, or more preferably 75% to 80% of identically aligned amino acids is considered to be a "homologous polypeptide." The LALIGN program is found in the FASTA version 1.7 suite of sequence comparison programs (Pearson, et al., 1988; Pearson, 1990; program available from William R. Pearson, Department of Biological Chemistry, Box 440, Jordan Hall, Charlottesville, VA). Sequence variations among antigens will depend upon a number of factors, such as the strain of H. pylori, the location of the gene or gene family encoding the antigen within the H. pylori genome (i.e., whether the gene is highly conserved or prone to recombination) and the like. An immunogenic polypeptide or polypeptide "fragment" is one that is (i) encoded by an open reading frame of a H. pylori polynucleotide, or (ii) displays sequence identity to H. pylori polypeptides as defined above, and is immunoreactive with a H. pylori immunopositive sample, such as sera.
Typically, the immunogenic fragment will comprise at least about 6 to 8 amino acids, and preferably at least about 10 to 12 contiguous amino acid residues of a particular antigen.
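The "homologous polypeptide" thresholds given above can be expressed as a simple check on the result of a local alignment. The sketch below is illustrative only: it assumes the alignment length and the number of identically aligned residues have already been obtained (for example from LALIGN output), and the default cut-offs of 8 residues and 70% are one reading of "about 6 to 8 amino acids" and "greater than 70%". The function name is hypothetical.

```python
def is_homologous_polypeptide(
    aligned_length: int,
    identical_residues: int,
    min_length: int = 8,          # assumed reading of "about 6 to 8 amino acids"
    min_identity: float = 70.0,   # "greater than 70%"; 75.0 to 80.0 is the preferred range
) -> bool:
    """Apply the length and identity thresholds to one optimal local alignment."""
    if aligned_length <= 0:
        return False
    identity = 100.0 * identical_residues / aligned_length
    # Both conditions are strict: "longer than" and "greater than".
    return aligned_length > min_length and identity > min_identity
```

A stricter screen corresponding to the more preferred range could be obtained by passing min_identity=75.0 or 80.0.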
By immunoreactive variant of a H. pylori polypeptide antigen is meant an amino acid substitution, deletion, and/or addition variant of a particular H. pylori polypeptide antigen sequence disclosed herein, having substantially the same or increased binding affinity to a given antibody as the particular polypeptide antigen, as determined by conventional methods, e.g., a competition assay or a two antibody sandwich assay.
A representative H. pylori antigen is composed of at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:340-468, where the antigen is in substantially purified form and is characterized by immunoreactivity with H. pylori positive anti-sera. "Cluster sequences" correspond to regions within the H. pylori genome encoding highly immunogenic polypeptides. The cluster sequences were determined on the basis of DNA sequence information for the over 250 immunogenic clones described herein. As a result, 69 unique clusters were identified, and antigens in accordance with the invention are those encoded by a contiguous series of nucleotides contained within any of the 69 clusters. Representative antigen coding sequences falling within each of the clusters are summarized in Figs. 64A-D. The cluster regions are referred to herein as clusters 1 to 69, corresponding to SEQ ID NOs: 340-468.
The location of each of the cluster regions within the H. pylori genome is illustrated pictorially in Fig. 9. In some instances, a cluster is defined by various regions within the H. pylori genome rather than by a single sequence. For example, a particular cluster may be defined by a "start" sequence, an "end" sequence, and perhaps an intervening "middle" sequence, and thus may correspond to more than one sequence contained in the Sequence Listing. In such cases, the descriptor for the cluster sequence will indicate its relationship to a given cluster (e.g., start, middle, end). Typically, a defining cluster sequence will code for one or more antigens, and the remainder of the sequence for a particular cluster, if not explicitly provided, can be readily determined based upon the information provided herein, when considered along with the information, e.g., provided in Tomb, et al., 1997. Cluster sequences defined by more than one representative sequence are cluster 1 (SEQ ID NOs:469, 470), cluster 5 (SEQ ID NOs:474, 475), cluster 7 (SEQ ID NO:477, end), cluster 15 (SEQ ID NOs:485, 486), cluster 35 (SEQ ID NOs:506-508), cluster 40 (SEQ ID NOs:513, 514), cluster 41 (SEQ ID NO:515, end), cluster 43 (SEQ ID NOs:517, 518), cluster 47 (SEQ ID NOs:522, 523), cluster 49 (SEQ ID NO:525, end), cluster 58 (SEQ ID NOs:534, 535), and cluster 59 (SEQ ID NOs:536, 537).
"Substantially purified" and "in substantially purified form" are used in several contexts and typically refer to at least partial purification of a H. pylori polynucleotide or polypeptide away from unrelated or contaminating components {e.g. , serum, cells, proteins, non-H. pylori polynucleotides, etc.) by at least one purification or isolation step. Methods and procedures for the isolation or purification of compounds or components of interest are described herein (e.g. , SDS-PAGE, affinity purification of fusion proteins, blotting, and recombinant production of H. pylori polypeptides).
An antigen is "specifically immunoreactive" with H. pylori positive anti-sera or a biological fluid sample when, under optimal conditions, the antigen binds to antibodies present in the H. pylori infected sample but does not bind to antibodies present in the majority (greater than about 60 to 65%, preferably from about 70% to 80%, even more preferably greater than about 85%) of fluid samples from subjects who are not or have not been infected with H. pylori. "Specifically immunoreactive" antigens may be immunoreactive with monoclonal or polyclonal antibodies generated against specific H. pylori antigens.
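The "specifically immunoreactive" criterion above reduces to a proportion computed over a panel of samples from uninfected subjects. The following sketch shows one way such a tally might be made; the function names, the boolean encoding of assay results, and the default 60% cut-off (the lowest of the ranges recited above) are assumptions made for illustration.

```python
def fraction_nonreactive(negative_panel_binding: list[bool]) -> float:
    """Fraction of uninfected-subject samples in which the antigen did NOT bind."""
    if not negative_panel_binding:
        raise ValueError("at least one uninfected sample is required")
    nonreactive = sum(1 for bound in negative_panel_binding if not bound)
    return nonreactive / len(negative_panel_binding)


def is_specifically_immunoreactive(
    binds_infected_sample: bool,
    negative_panel_binding: list[bool],
    min_nonreactive: float = 0.60,  # assumed cut-off; 0.70 to 0.85 would be stricter
) -> bool:
    """True if the antigen binds the infected sample and is non-reactive with
    more than the chosen fraction of uninfected samples."""
    return (
        binds_infected_sample
        and fraction_nonreactive(negative_panel_binding) > min_nonreactive
    )
```

Here each entry of negative_panel_binding records whether binding was observed for one sample from a subject who is not, or has not been, infected.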
By biological fluid is meant any fluid derived from the body of a mammal, particularly a human. Representative biological fluids include blood, serum, plasma, urine, faeces, mucous, gastric secretions, dental plaques, or saliva.
"Immunologically effective amount" refers to an amount administered to a mammalian host, either as a single dose or as part of a series, that is effective for treatment or prevention of infection by H. pylori. The amount will vary depending upon the health and physical condition of the subject to be treated, the capacity of the subject's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, and the like. Such an amount will typically fall within a relatively broad range, and can be determined in routine trials.
II. Isolation of H. pylori Antigenic Sequences
The present invention is based on the identification and isolation of a number of highly immunogenic H. pylori polypeptides, resulting from the screening of over a million individual H. pylori compositions. The antigens of the present invention were either produced recombinantly, or, were separated from a mixture of soluble proteins obtained from pelleted and lysed H. pylori. Moreover, as a result of an intensive computational analysis effort, disparate sequence information corresponding to all of the disclosed immunogenic clones was compiled to provide a collection of heretofore unrecognized antigenic cluster regions contained within the genome of H. pylori.
The preparation and identification of recombinant H. pylori antigens and their classification into clustered families will now be described.
1. Method for Screening Recombinant Sublibraries
The H. pylori antigens of the invention can be obtained from phage libraries using conventional screening methods described below. Unless otherwise stated, the DNA lambda libraries described herein have been deposited in the Genelabs Technologies, Inc. Culture Collection, 505 Penobscot Drive, Redwood City, CA, 94063, or in the Genelabs Diagnostics PTE LTD Culture Collection, 85 Science Park Drive #04-01, The Cavendish, Singapore Science Park, Singapore 118259.
E. coli XL-1 Blue MRF plasmids containing inserts corresponding to the following H. pylori clones were accepted for deposit by the American Type Culture Collection, 12301 Parklawn Drive, Rockville MD 20852 on September 9, 1997 and assigned the following designation numbers: clone dHClS. l l (ATCC 98525), clone dHGla2S.5 (ATCC 98526), clone Y212A (ATCC 98527), clone dHC7.2cd6 (ATCC 96528), clone y175A (ATCC 98529), clone y146B (ATCC 98530), clone dHA22.8 (ATCC 98531), Y184A (ATCC 98532), clone dHB2dl (ATCC 98533), clone y104B (ATCC 98534). The bacteriophage lambda containing H. pylori DNA insert Y-library was accepted for deposit by the American Type Culture Collection on September 3, 1997 and assigned designation number ATCC 209234.
The antigen-encoding DNA fragments of the invention can be identified by (i) immunoscreening, as described above, and/or (ii) computer analysis of coding sequences using an algorithm (such as "ANTIGEN," Intelligenetics, Mountain View, CA) to identify potential antigenic regions. For example, an antigen-encoding DNA fragment is subcloned, and the subcloned insert is then fragmented by partial DNase digestion to generate random fragments or by specific restriction endonuclease digestion to produce specific subfragments. The resulting DNA fragments are then inserted into an expression vector, such as the lambda gt11 vector, and subjected to immunoscreening in order to provide an epitope map of the cloned insert. In addition, DNA fragments of the type described herein can be employed as probes in hybridization experiments to identify overlapping H. pylori sequences, and these in turn can be further used as probes to identify additional sets of contiguous clones.
Any of the herein-described clone sequences can be used to probe a DNA library, generated in a vector such as lambda gt10 or "LAMBDA ZAP II" (Stratagene, La Jolla, CA). Specific subfragments of known sequence may be isolated using the polymerase chain reaction or after restriction endonuclease cleavage of vectors carrying such sequences. The resulting DNA fragments can be used as radiolabelled probes against any selected library. In particular, the 5' and 3' terminal sequences of the clone inserts are useful as probes to identify additional clones.
Further, the sequences provided by the 5' end of cloned inserts are useful as sequence specific primers in first-strand DNA synthesis reactions (Maniatis, et al., 1982; Scharf, et al., 1986). For example, specifically primed H. pylori DNA libraries can be prepared by using specific primers derived from one of the cloned DNA sequences described herein as a template. The second-strand of the new DNA is synthesized using RNase H and DNA polymerase I. The above procedures identify or produce DNA molecules corresponding to nucleic acid regions that are 5' adjacent to the known clone insert sequences. These newly isolated sequences can in turn be used to identify further flanking sequences, and so on. After new H. pylori sequences are isolated, the polynucleotides can be cloned and immunoscreened to identify specific sequences encoding H. pylori antigens.
2. Recombinant Antigens
In order to identify new and highly antigenic polypeptides useful for detection of H. pylori, DNA libraries were prepared from commercially available strains of H. pylori (American Type Culture Collection, Rockville, MD; ATCC Designation No. 43504, ATCC Designation No. 43526) in either the expression vector lambda gt11 or "ZAPII" (Stratagene, La Jolla, CA; Example 1). Polynucleotide sequences were then selected for the expression of peptides which were immunoreactive with pooled sera obtained from 11 patients identified by endoscopy as H. pylori-positive, herein identified as "Roost pool" sera, or another pool of 4 H. pylori-immunopositive sera samples identified as "SFA001". It is also possible to screen with other sources of H. pylori-infected samples, such as those described in Examples 6A and 6B, and referred to in Figs. 65 and 66, or described in Example 13. The samples may be individual samples, i.e., derived from a single subject, or may be pooled samples, such as those described herein.
Recombinant proteins identified by this approach provided candidates for polypeptides that can serve, either singly or in combination, as substrates in diagnostic tests for detecting infection by H. pylori. Further, the corresponding nucleic acid coding sequences serve as useful hybridization probes for the identification of additional H. pylori antigen coding sequences.
The H. pylori strains described above were used to generate DNA libraries in lambda vectors (Example 1). Other commonly available strains of H. pylori include, for example, H. pylori samples identified as strain # 29995 and J-170. Alternatively, libraries can be constructed from H. pylori isolated from a sample confirmed as H. pylori-positive. In the method illustrated in Example 1, the libraries were generated from genomic DNA isolated from the pelleted bacteria. Alternatively, centrifugation can be used to pellet bacteria from infected biological specimens such as gastric mucosa.
In reference to Example 1, H. pylori DNA libraries were generated using DNase-digested genomic DNA fragments isolated from H. pylori as starting material. For preparing the "short antigen clone" library, the resulting molecules were ligated to Sequence Independent Single Primer Amplification (SISPA; Reyes, et al., 1991) linker primers and expanded in a non-selective manner, and then cloned into a suitable vector, for example, lambda gt11 or ZAPII, for expression and screening of peptide antigens. The libraries disclosed herein have been designated as the "short antigen clone" library, typically designated by a prefix beginning with the letters Y or Z; and the "long antigen clone" library, designated as libraries 1 and 2.
The ZAPII libraries 1 and 2 were similarly constructed, with the following exceptions. The ZAPII libraries 1 and 2 were generated from longer H. pylori DNA fragments, i.e., either EcoRI- or HindIII-digested genomic DNA which had not undergone Sequence Independent Single Primer Amplification. Library 1 clones (designated herein by upper case letters) were generated by ligating EcoRI-digested H. pylori DNA directly into the EcoRI sites of the lambda "ZAPII" vector. Library 2 clones (lower case designations) were obtained by digesting H. pylori genomic DNA with HindIII and blunt-ending the digest with E. coli Klenow enzyme or T4 DNA polymerase; the resulting blunt-ended fragments were then ligated to SISPA primers. The linker-ligated DNAs were then treated with EcoRI and ligated into the EcoRI sites of "ZAPII" lambda arms.
Lambda gt11 is a particularly useful expression vector for producing H. pylori antigens. The vector contains a unique EcoRI insertion site, located 53 base pairs upstream of the translation termination codon of the β-galactosidase gene. Thus, an inserted sequence is expressed as a β-galactosidase fusion protein which contains the N-terminal portion of the β-galactosidase gene product, the heterologous peptide, and optionally the C-terminal region of the β-galactosidase peptide (the C-terminal portion being expressed when the heterologous peptide coding sequence does not contain a translation termination codon). The lambda gt11 vector also produces a temperature-sensitive repressor (cI857) which causes viral lysogeny at permissive temperatures, e.g., 32°C, and leads to viral lysis at elevated temperatures, e.g., 42°C. Advantages of lambda gt11 include: (1) highly efficient recombinant clone generation, (2) ability to select lysogenized host cells on the basis of host-cell growth at permissive, but not non-permissive, temperatures, and (3) production of recombinant fusion protein. Further, since phage containing a heterologous insert produces an inactive β-galactosidase enzyme, phage with inserts are typically identified using a colorimetric substrate conversion reaction employing β-galactosidase.
In addition to the lambda gt11 vector, numerous E. coli expression vectors are useful for expression of antigens. Alternative microbial hosts suitable for expression include bacilli, such as B. subtilis, and other Enterobacteriaceae, such as Salmonella, Serratia, and various Pseudomonas species. In such hosts, the expression vectors will typically contain control sequences compatible with the host cell. Other known promoter sequences may be present in the expression vector, such as the lactose promoter system, a tryptophan promoter system, or a promoter system from phage lambda. An amino terminal methionine can be provided, if necessary, by insertion of a Met codon in-frame with the antigen. The carboxy terminal extension of an antigen can be removed by conventional mutagenesis procedures.
Additionally, yeast expression systems can be used, such as the Saccharomyces cerevisiae pre-pro-alpha-factor leader region used to direct protein expression from yeast. The antigen coding sequence can be fused in frame to the leader region. This construct is then typically put under the control of a strong transcription promoter, such as the alcohol dehydrogenase I promoter or a glycolytic promoter. The antigen sequence is followed by a translation termination codon which is followed by transcription termination signals. Alternatively, the antigen coding sequences can be fused to a second protein coding sequence, such as β-galactosidase, used to facilitate purification of the fusion protein by affinity chromatography. For constructs used for expression in yeast, protease cleavage sites may be inserted to facilitate separation of the fusion protein components.
Additionally, mammalian cells can be used for expression of the antigens of the invention. Vectors useful for expression in mammalian cells are typically characterized by insertion of the antigen coding sequence between a strong viral promoter and a polyadenylation (poly A) signal. The vectors may optionally include selectable marker genes, such as those conferring antibiotic resistance. Suitable host cells include Chinese hamster ovary (CHO) cell lines, HeLa cells, myeloma cells, Jurkat cell lines, and the like. The expression vectors for these cells may include expression control sequences, such as an origin of replication, a promoter, an enhancer, information processing sites, e.g., ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences.
The DNA sequences are expressed in hosts after the sequences of interest have been operably linked to (i.e., positioned to enable the functioning of) an expression control sequence.
Experiments carried out in support of the present invention will now be described, and in particular, the generation of various phage libraries and characterization of inserts coding for the H. pylori antigens of the invention.
a. Representative H. pylori Antigens: Short Antigen Clone Library. Example 1 describes the preparation of a DNA library for H. pylori (ATCC No. 43504). The library was immunoscreened using H. pylori-positive pooled sera (Example 2). A number of lambda clones were identified which were immunoreactive with anti-H. pylori antibodies present in the pooled sera. Selected immunopositive clones were plaque-purified and their immunoreactivity retested. The immunoreactivity of the clones with normal human sera (control, H. pylori-negative) was also tested. Numerous clones were identified by immunoscreening and further characterized, as described in Examples 3 and 4. Immunoreactive clones further described in Example 3 and referred to herein as the Y and Z families of clones include clone Y-104-1, and the clones summarized in Table 2, corresponding to SEQ ID NOs:70-230.
To obtain the polypeptide antigens of the invention, and their respective coding sequences, the DNA inserts of the immunoreactive recombinant lambda clones were PCR amplified, using primers corresponding to lambda arm sequences flanking the EcoRI cloning site of the vectors, and utilizing each immunoreactive clone as template. For the lambda gt11 clones, gt11F (SEQ ID NO:65) and gt11R (SEQ ID NO:66) primers were used. For the lambda ZAPII clones, T3 (SEQ ID NO:67) and T7 (SEQ ID NO:68) primers were used.
The resulting amplification products were agarose gel purified and eluted from the gel (Ausubel, et al., 1988) to remove primers and other components. The purified insert DNA was then subjected to direct sequencing. In some cases, the insert DNA was first subcloned into the TA cloning vector (Invitrogen, San Diego, CA) and then sequenced. Clones exhibiting immunoreactivity against H. pylori-positive pooled sera are identified in Example 3, and more specifically in Table 2.
Sequencing was carried out using "DYEDEOXY TERMINATOR CYCLE SEQUENCING" (a modification of the procedure of Sanger, et al., 1977) on a Perkin Elmer Applied Biosystems model 373A DNA sequencing system according to the manufacturer's recommendations (Perkin Elmer Applied Biosystems, Foster City, CA). Sequence data is presented in the accompanying Sequence Listing.
Sequences for the Y and Z families of clones were compared with "GENBANK", EMBL database and dbEST (National Library of Medicine) sequences at both nucleic acid and amino acid levels. Search programs FASTA, BLASTP, BLASTN and BLASTX (Altschul, et al., 1990) indicated that these sequences were unique as both nucleic acid and amino acid sequences.
In instances where the clones were determined to be significantly longer than the predicted coding regions of the proteins, the genomic clones were digested by restriction enzyme treatment, and the resulting subfragments were inserted into a suitable expression vector. The resulting subclones, containing the specific digested DNA fragments, were then screened for immunogenicity. Clones identified as immunoreactive towards H. pylori-positive pooled sera were plaque purified. Plasmid DNA containing inserts obtained by recovery from phage was sequenced as described above and also in Example 4.
b. Representative H. pylori Antigens: ZAPII Libraries 1 and 2. Library 1 and library 2 clones isolated by immunoscreening with H. pylori immunopositive pooled sera include the following: LIBRARY 1: A3, A22, B2, B9, B17, B23, C1, C3, C7, DH, and LIBRARY 2: a1, a3, a5, b5, b8c7, c2, c5, c13, d5, d6, d11, d7, e6, f3, f8, f11, g2, g9, g11, k4. Clone Gla was isolated by screening with monoclonal antibody 1G6 from Biogenesis, Inc. (Bournemouth, England). Sequence data for these clones is presented in the sequence listing herein.
Exemplary procedures for generating and isolating the library 1 and 2 subclones for preparing the antigens of the invention, including features of the derived coding sequences and putative proteins, are provided above and described in detail in Example 3.
c. Clone d7 and its Relationship to the 36 kD Protein of H. pylori. Clone d7, another immunoreactive clone of the present invention, is a HindIII clone that was blunt-ended, ligated to A/B linkers, digested with EcoRI, and subcloned into lambda vector ZAPII to produce a beta-galactosidase fusion protein. The nucleotide sequence is presented as SEQ ID NO:11. Polynucleotides and polypeptides derived from this clone represent preferred embodiments of the invention.
As will be described in further detail below, this clone codes for about 70% of the carboxy-end of the 36 kD native protein of H. pylori, while clone Y104 codes for the entire 36 kD protein. Based upon the results of sequencing experiments carried out in support of the invention (Examples 8 and 9), portions of the amino acid sequence of the native 36 kD protein of H. pylori have been determined. The 36 kD protein (encoded in part by clone d7) appears to be a precursor to a highly antigenic H. pylori protein referred to herein as the "spot 15" protein (Examples 8, 9, to be described in more detail below). This band appears to be Western Blot positive in the majority of H. pylori-positive samples, and is typically absent in normal samples. This antigenic protein thus appears to be highly indicative of infection by H. pylori. The in-vitro processing of the 36 kD protein (i.e., the apparent molecular weight) is illustrated schematically in Fig. 4.
As shown in Fig. 4, the native H. pylori translation product, a 34 kD protein (i.e., calculated molecular weight) composed of 299 amino acids, appears to be cleaved in vivo at amino acid position 23. This results in a 31.6 kD cleavage product commonly referred to as "the 36 kD protein" of H. pylori. The differences in molecular weight terminology in referring to this protein arise due to the following: 36 kD is observed from SDS-PAGE; 34 kD is calculated from the corresponding DNA sequence (Fig. 6); and 31 kD is determined from experimental sequence data, as determined starting from residue 23 of SEQ ID NO:60. A post-translational modification of the 36 kD protein, i.e., acetylation at the amino acid terminus (position 23) and cleavage at the carboxy end, results in the "spot 15" antigen, having a molecular weight of 28 kD. Referring to Fig. 4, "X"s correspond to positions where deamidations (i.e., asparagine or glutamine) or point mutations have occurred. Further proposed modifications leading to the minor spot 15 protein are also indicated. The antigenic polypeptide corresponding to spot 15, and sequences coding for this protein, e.g., clones Y-104 and d7, represent one particularly preferred embodiment of the present invention. This is due to the strong antigenicity of the peptide. Moreover, the spot 15 peptide has been detected in various strains of H. pylori, as indicated in Table 9 and in Example 10.
The results discussed above and in the Examples indicate the isolation and identification of numerous new H. pylori polypeptide antigens, and polynucleotide sequences encoding such antigens.
d. Cluster Analysis and Identification of Antigenic Regions within the H. pylori Genome. The antigenic sequences described herein were determined based upon the screening of over a million discrete H. pylori antigenic compositions. DNA sequencing was carried out for H. pylori immunopositive clones, and open reading frames coding for antigenic proteins were identified. Nucleic acid sequences coding for the thus-identified antigens were then inserted into expression vectors and expression of the desired antigenic protein was confirmed. As a result of this work, over 250 antigenic clone sequences have been identified and are disclosed herein.
In an effort to further correlate the sequence information for the various immunogenic clones, the clones were organized into cluster groups (1-69). The antigenic polynucleotide fragments isolated herein map into 69 clusters which are identified herein as clusters (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53), (54), (55), (56), (57), (58), (59), (60), (61), (62), (63), (64), (65), (66), (67), (68), and (69). The position of these antigenic clusters within the complete H. pylori genome is shown in Fig. 9. As can be seen from Fig. 9, the locations of the clusters within the genome (and clones within the clusters) are highly random, and represent only a small portion (i.e., approximately 2-3%) of the overall nucleotides contained within the entire genome. Prior to this work, a comprehensive guide to the unique and highly antigenic regions within the genome of H. pylori was unknown.
For cluster analysis, all of the immunoclone sequences were combined into a FASTA database as defined by Pearson and Lipman (1988). This database was converted to a BLAST database for high speed searching (Altschul, et al., 1990). The sequence of each immunoclone was then searched against this database using the program BLASTN to define clusters. A cluster is an assembly of clones, each of which shares identical sequence with other clones in the group. The sequences in each group or cluster were then combined in separate database files and formatted for entry in the GEL program of IG-Suite (Oxford Molecular). GEL then assembled the sequences and suggested a consensus sequence with ambiguities. A non-ambiguous sequence was determined for each cluster by editing the consensus sequence. When the cluster contained non-contiguous sequence, the genome sequence of H. pylori (TIGR World Wide Web site) was used merely to guide assembly. Clusters 1 to 69 correspond to SEQ ID NOs:469-547. Open reading frames for the antigens were then determined from the cluster consensus sequence. Single clone sequences were translated directly to provide antigen sequences. Antigenic regions contained within each of the clusters as determined by translation of cluster open reading frames are provided as SEQ ID NOs:340-468. Figs. 10 to 63 are linear maps indicating the relative positions of immunogenic subclones within the clusters (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (23), (25), (27), (28), (29), (30), (32), (33), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (53), (54), (58), (59), (61), (62), (68), and (69). Clusters not shown on the linear maps are clusters defined by one clone, i.e., clusters (22), (24), (26), (31), (34), (52), (55), (56), (57), (60), (63), (64), (65), (66), (67).
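The grouping step described above, in which clones sharing sequence with one another are collected into a cluster, can be sketched as a connected-components computation over pairwise search hits. This is an assumption about how such grouping might be implemented (single-linkage via union-find), not a description of the procedure actually used; the clone names and the hit list in the usage note are hypothetical, and the BLASTN search itself is not performed here.

```python
from collections import defaultdict


def cluster_clones(clones: list[str], hits: list[tuple[str, str]]) -> list[list[str]]:
    """Group clones into clusters given pairwise hits (clone_a, clone_b).

    Assumption: any hit between two clones places them in the same cluster,
    so clusters are the connected components of the hit graph. Every clone
    named in `hits` must also appear in `clones`.
    """
    parent = {clone: clone for clone in clones}

    def find(clone: str) -> str:
        # Walk up to the representative, halving the path as we go.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: dict[str, list[str]] = defaultdict(list)
    for clone in clones:
        groups[find(clone)].append(clone)
    # Clones without any hit remain as single-clone clusters.
    return sorted((sorted(members) for members in groups.values()), key=lambda g: g[0])
```

For instance, with hypothetical clones cloneA to cloneD and hits (cloneA, cloneB) and (cloneB, cloneC), the sketch returns one three-clone cluster and one single-clone cluster, mirroring the distinction drawn above between mapped clusters and clusters defined by one clone.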
Figs. 64A-D present a tabular summary of clusters 1-69, clones contained within each cluster, and coordinates of each cluster consensus region within the H. pylori genome.
During the course of this work, the complete genome sequence of H. pylori was reported (Tomb, et al., 1997), and is incorporated herein by reference. H. pylori possesses a circular genome of 1,667,867 base pairs and 1590 predicted coding sequences. The genome sequence reported on the TIGR Web site was used as a reference for reporting nucleotide positions of the clusters and immunogenic clones of the invention. Based upon the present work, it can be seen that a relatively small number of the predicted open reading frames reported for the H. pylori genome encode the antigenic proteins or cluster regions forming the basis of the present invention. Each cluster defines a continuous DNA sequence that spans, i.e., extends, from the 5' end of the most upstream clone in the cluster to the 3' end of the most downstream clone in the cluster. Where the spanning sequence is incomplete, it has been filled in with sequence from the reported H. pylori genomic sequences (TIGR Web site). For example, and with reference to Fig. 10, the sequence defined by cluster 1 includes the sequence beginning at the 5' end of clone Y92 (SEQ ID NO:222) and ending at the 3' end of clone Y92 (SEQ ID NO:223), including the short (about 730 bases) genomic sequence connecting the two clone sequences. The positions of the individual clones in each cluster are shown in Figs. 10-63, along with corresponding SEQ ID NOs.
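The spanning sequence of a cluster, and the stretches that must be filled in from the reference genome, can be derived from the genome coordinates of the member clones. The sketch below is a simplified illustration: it assumes 1-based inclusive coordinates, ignores clusters that wrap around the origin of the circular genome, and uses hypothetical function names and coordinates.

```python
def cluster_span(clone_coords: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Return (start, end) from the most upstream to the most downstream clone."""
    if not clone_coords:
        raise ValueError("a cluster needs at least one clone")
    starts = [min(s, e) for s, e in clone_coords.values()]
    ends = [max(s, e) for s, e in clone_coords.values()]
    return min(starts), max(ends)


def uncovered_gaps(clone_coords: dict[str, tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the intervals inside the span that no clone covers.

    These are the regions that would be filled in from the reference genome.
    """
    if not clone_coords:
        raise ValueError("a cluster needs at least one clone")
    intervals = sorted((min(s, e), max(s, e)) for s, e in clone_coords.values())
    gaps = []
    covered_to = intervals[0][1]
    for start, end in intervals[1:]:
        if start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    return gaps
```

With two hypothetical clone reads at positions 100-400 and 1130-1500, the span is 100-1500 and the single gap 401-1129 (729 bases) corresponds to the kind of short connecting genomic sequence described for cluster 1.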
The invention includes antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA fragment spanning one of the DNA fragment clusters: (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53), (54), (55), (56), (57), (58), (59), (60), (61), (62), (63), (64), (65), (66), (67), (68), and (69). Preferably the antigen-coding regions are the spanning sequences themselves from the clusters. Preferred polynucleotides are H. pylori antigen-coding DNA fragments, in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
Thus, the clusters disclosed herein provide a non-random collection of H. pylori antigen coding sequences and resulting antigens which have been shown to react with H. pylori immunopositive samples and are useful in a variety of diagnostic applications.
3. Isolation and Mapping of Native Antigens
Experiments carried out in support of the invention have resulted in the concurrent identification of several additional native antigens of H. pylori (Table 6) using a proteomics approach.
An overview of the proteome methodology disclosed herein and utilized to separate and identify over twenty H. pylori antigens is provided in Fig. 7 (right-hand side). Turning now to this figure, bacterial proteins are separated by two-dimensional (2D) electrophoresis. Immunoreactive spots (i.e., reactive in Western blots with pooled sera from H. pylori infected patients) are then selected and subsequently characterized by endoproteolytic digestion, chromatography, and mass spectrometry (e.g., matrix assisted laser desorption time of flight mass spectrometry, MALDI-TOF). Box A indicates the use of electrospray mass spectrometry to determine the mass of the intact protein. Box B indicates the use of MALDI-TOF mass spectrometry to evaluate the number and mass of Lys-C peptides. Box C indicates the use of MALDI-MS to evaluate chromatographically separated Lys-C peptides and provide sequence information by post source decay (when peptides are pure). Box D indicates the use of electrospray mass spectrometry to evaluate and sequence peptides through collision induced dissociation. These procedures can be used as an alternative to, or in parallel with, data obtained from conventional cloning techniques to derive genomic data corresponding to antigens of H. pylori, as shown by experiments carried out in support of the invention. Antigens thus obtained are highly immunogenic, and may be isolated and characterized as illustrated in Examples 7-9 and described as follows. To obtain soluble H. pylori proteins, pelleted H. pylori is typically lysed in a French press at >10,000 PSI, followed by centrifugation. Alternative lysis methods include mechanical douncing or detergent disruption.
Antigens are generally obtained from whole lysates as follows. Fractionation of H. pylori soluble proteins is carried out by SDS-PAGE, preferably utilizing 2-dimensional electrophoresis (O'Farrell, 1975; O'Farrell, et al., 1977). In performing a typical 2-dimensional electrophoretic separation, an isoelectric focusing gel separation is first carried out (first dimension gel). Isoelectric focusing is carried out, e.g., on an acrylamide/bisacrylamide gel, for an appropriate number of volt-hrs, determined as described in "CURRENT PROTOCOLS IN PROTEIN SCIENCE", Units 10.46, John Wiley and Sons, Inc., New York (1996). The final tube gel pH gradient is then measured by a surface pH electrode. Protein components in a sample mixture undergo a first separation according to pI value, as indicated on the horizontal axes of Figs. 1 and 2. The first dimensional separation can be carried out under either acidic (Fig. 2) or basic (Fig. 1) conditions, depending upon the nature of the proteins to be separated.
A second dimensional, size-based polyacrylamide slab gel separation is then carried out, to further separate the proteins on the basis of molecular weight. Suitable molecular weight markers are typically added to the gel, for determining corresponding molecular weights of eluted proteins. Detection is typically carried out by Coomassie blue staining or by silver staining. Silver staining may be preferred, in some cases, since silver staining methods are considerably more sensitive and can be used to detect smaller amounts of protein. Figs. 1 and 2 illustrate H. pylori antigens obtained generally as described above, which have been Western blotted with the Roost H. pylori-positive serum pool and a negative serum pool, respectively. The numbers in each figure correspond to spots representing H. pylori proteins. Features of these proteins (including approximate molecular weights and pI values), as identified by spot number, are summarized in Table 6. The polypeptides were confirmed to be immunoreactive when tested against anti-H. pylori antibodies contained in the Roost pool sera. The identities of the spots indicated numerically on the gels were determined by a variety of protein sequencing techniques as described in Example 9.
To further identify these peptides, immunoreactive spots were excised from the gels, digested, and sequenced. As mentioned above, the native H. pylori antigens were further characterized by a combination of sequencing methodologies, including N-terminal sequencing, liquid chromatography-mass spectrometry, and determination of internal sequences by amino-acid specific chemical cleavage, followed by Edman sequencing (Example 9).
Internal amino acid sequences can be determined by utilizing a combination of various site-specific cleavage reagents, such as ortho-phthalaldehyde (OPA)/cyanogen bromide (CNBr), hydroxylamine, formic acid, and BNPS-skatole, which cleave as follows: CNBr (cleaves at C-terminus of methionine), BNPS-skatole (cleaves at C-terminus of tryptophan), formic acid (cleaves at Asp-Pro peptide bond), hydroxylamine (cleaves at Asn-Gly bond), and OPA, which distinguishes between secondary and primary amines; and enzymatic reagents such as Endoproteinase Lys-C, Endoproteinase Asp-N, Endoproteinase Glu-C, and Trypsin, which cleave respectively as follows: Endoproteinase Lys-C (cleaves at C-terminus of lysine), Endoproteinase Asp-N (cleaves at N-terminus of aspartic acid and cysteic acid), Endoproteinase Glu-C (cleaves at C-terminus of glutamic acid), and Trypsin (cleaves at C-terminus of arginine and lysine).
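The single-residue cleavage specificities listed above can be illustrated with a small in silico digestion. The Python sketch below is a simplification offered for illustration only: the `digest` helper and the example peptide are hypothetical, missed cleavages and neighboring-residue effects are ignored, cysteic acid cleavage by Asp-N is omitted, and the two-residue rules (Asp-Pro for formic acid, Asn-Gly for hydroxylamine) are not modeled.

```python
# Cleavage rules taken from the text: residues recognized, and whether the
# cut falls on the C-terminal ("C") or N-terminal ("N") side of the residue.
RULES = {
    "Lys-C": ("K", "C"),
    "Trypsin": ("KR", "C"),
    "Glu-C": ("E", "C"),
    "CNBr": ("M", "C"),
    "BNPS-skatole": ("W", "C"),
    "Asp-N": ("D", "N"),
}

def digest(sequence, reagent):
    """Split a one-letter amino acid sequence at every site for `reagent`."""
    residues, side = RULES[reagent]
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa not in residues:
            continue
        cut = i + 1 if side == "C" else i
        # Skip cuts at the very ends, which would give empty fragments.
        if start < cut < len(sequence):
            fragments.append(sequence[start:cut])
            start = cut
    fragments.append(sequence[start:])
    return fragments


# Hypothetical peptide, chosen only to exercise the rules.
peptide = "AKGDEKM"
lys_c = digest(peptide, "Lys-C")
asp_n = digest(peptide, "Asp-N")
```

Comparing the fragment sets produced by different reagents on the same sequence shows why a combination of reagents is useful for obtaining overlapping internal sequences.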
The N-terminal sequence corresponding to each spot was determined (Table 6). Utilizing the approaches described above and in Examples 7-9, the spots indicated in the 2-D blots (Figs. 1 and 2) were identified as indicated in Table 6.
Referring to the low pH blot presented as Fig. 2, spots 9, 11, 12, 13, 15, 16 (major and minor), and 17 represent unique antigens. Spot 9 represents the native H. pylori antigen corresponding to the recombinant protein expressed by ORF2 of clone a1, while spot 12 represents the native antigen that corresponds to the antigenic protein encoded by clone a5.
Looking at the high pH Western blot in Fig. 1, spots 9 (major and minor), 10, 12, 13, and 15 (major and minor) represent new antigens. The relationship between basic spot 15 and clones Y104 and d7 is discussed above and also illustrated in Fig. 3.
Mass spectral profiles of selected Lys-C digested H. pylori proteins corresponding to Western positive spots are provided in Tables 11 and 12. The mass "fingerprints" of the peptide digests were then used as a basis for comparison to proteins predicted from various genomics databases (details are provided in Example 12) to further confirm the identity of selected H. pylori antigens described herein.
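The fingerprint comparison can be sketched as follows: each candidate protein is digested in silico with Lys-C, the monoisotopic mass of each predicted peptide is computed, and the candidate whose predicted masses match the most observed masses is ranked first. This is a minimal Python illustration, not the search procedure of Example 12. The helper names, the candidate sequences, the tolerance, and the use of neutral (uncharged) peptide masses are all assumptions made for the example; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def lys_c_peptides(protein):
    """Cut after every lysine (no missed cleavages considered)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa == "K" and i + 1 < len(protein):
            peptides.append(protein[start:i + 1])
            start = i + 1
    peptides.append(protein[start:])
    return peptides

def count_matches(observed, protein, tol=0.5):
    """Number of observed masses within `tol` Da of a predicted peptide."""
    predicted = [peptide_mass(p) for p in lys_c_peptides(protein)]
    return sum(any(abs(m - p) <= tol for p in predicted) for m in observed)

def best_candidate(observed, candidates, tol=0.5):
    """Name of the candidate sequence with the most matching masses."""
    return max(candidates,
               key=lambda name: count_matches(observed, candidates[name], tol))


# Hypothetical candidates and "observed" masses (not data from Tables 11-12).
candidates = {"protA": "GAKSLK", "protB": "MMMKWWW"}
observed = [peptide_mass("GAK"), peptide_mass("SLK")]
hit = best_candidate(observed, candidates)
```

A practical search would also account for protonated ions, missed cleavages, and modifications, which this sketch deliberately leaves out.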
The above-described proteomic approach is useful for analyzing the genetic diversity of H. pylori, and for examining antigen-antibody responses during acute and chronic infection, in connection with particular gastroduodenal pathologies, and in relation to possible autoimmune components of H. pylori-associated disease. Moreover, 2-D gel electrophoresis and in-situ proteolytic digestion in conjunction with MALDI-TOF MS provides an extremely sensitive technique for the rapid identification of H. pylori antigens and for rapid screening for preferred vaccine and diagnostic candidates.
4. Expression and Purification of Antigenic Polypeptides and Preparation of Their Respective Antibodies
a. Expression and Purification of Antigenic Polypeptides. The recombinant antigenic peptides of the present invention can be purified by standard protein purification procedures which may include differential precipitation, molecular sieve chromatography, ion-exchange chromatography, isoelectric focusing, gel electrophoresis and affinity chromatography.
H. pylori antigens in accordance with the invention comprise at least 6 contiguous amino acids contained within one of the following cluster antigen sequences: SEQ ID NOS: 340-468. In a particular instance, a H. pylori antigen may comprise at least 6 contiguous amino acids contained within a polypeptide sequence selected from one of the following SEQ ID NOS: 2, 4, 5, 7, 9, 10, 12, 14, 17, 21, 25-28, 36, 37, 39, 44, 48, 55, 59, 61, 69, 249, 250, 252, 254, 256, 258, 260-263, 265-269, 323, 324, and 550-554.
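The "at least 6 contiguous amino acids" criterion can be read as a substring test: a candidate peptide qualifies if some window of six consecutive residues in it also occurs in the reference sequence. The Python sketch below illustrates that reading only; the `shares_contiguous_run` helper and both sequences are hypothetical and are not taken from the sequence listing.

```python
def shares_contiguous_run(candidate, reference, min_len=6):
    """True if candidate and reference share an identical run of at least
    `min_len` consecutive residues.

    Any shared run longer than min_len contains a shared run of exactly
    min_len, so comparing windows of length min_len is sufficient.
    """
    if len(candidate) < min_len:
        return False
    reference_windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )


# Hypothetical sequences for illustration only.
reference = "MKTAYIAKQRQISFVK"
qualifies = shares_contiguous_run("GGAYIAKQGG", reference)   # shares AYIAKQ
too_short = shares_contiguous_run("GGAYIAKGG", reference)    # only 5 shared
```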
Preferably, a H. pylori antigen corresponds to at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 555-602, where these sequences represent illustrative expression proteins corresponding to H. pylori antigens. Even more preferably, a H. pylori antigen is identified by at least 6 contiguous amino acids contained within one of the following sequences: SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D). Polynucleotide sequences encoding the antigens of the present invention have been cloned in the plasmid pGEX (Example 5) or various derivatives thereof (pGEX-del65). The plasmid pGEX (Smith, et al., 1988) and its derivatives express the polypeptide sequences of a cloned insert fused in-frame to the protein glutathione-S-transferase (sj26). In one vector construction, plasmid pGEX-hisB, an amino acid sequence of 6 histidines is introduced at the carboxy terminus of the fusion protein. The various recombinant pGEX plasmids can be transformed into appropriate strains of E. coli and fusion protein production can be induced by the addition of IPTG (isopropyl-thiogalactopyranoside) as described in Example 5. Solubilized recombinant fusion protein can then be purified from cell lysates of the induced cultures using Ni2+-NTA affinity chromatography (Example 5). Insoluble fusion protein expressed by the plasmids can be purified by means of immobilized metal ion affinity chromatography (Porath, 1992) in buffers containing 6 M urea or 6 M guanidinium isothiocyanate, both of which are useful for the solubilization of proteins.
Alternatively, insoluble proteins expressed in pGEX-GLI or derivatives thereof can be purified using combinations of centrifugation to remove soluble proteins followed by solubilization of insoluble proteins and standard chromatographic methodologies, such as ion exchange or size exclusion chromatography, and other such methods are known in the art.
In the case of β-galactosidase fusion proteins (such as those produced by lambda gt11 clones), the fused protein can be isolated readily by affinity chromatography, or by passing cell lysis material over a solid support having surface-bound anti-β-galactosidase antibody. Also included in the invention is an expression vector, such as the lambda gt11 or pGEX vectors described above, containing H. pylori antigen coding sequences and expression control elements which allow expression of the coding regions in a suitable host. The control elements generally include a promoter, a translation initiation codon, translation and transcription termination sequences, and an insertion site for introducing the insert into the vector. The DNA encoding the desired antigenic polypeptide can be cloned into any number of commercially available vectors to generate expression of the polypeptide in the appropriate host system. These systems include, but are not limited to the following: baculovirus expression (Reilly, et al., 1992; Beames, et al., 1991; Pharmingen, San Diego, CA; Clontech, Palo Alto, CA), vaccinia expression (Earl, et al., 1991; Moss, et al., 1991), expression in bacteria (Ausubel, et al., 1988; Clontech), expression in yeast (Gellissen, et al., 1992; Romanos, et al., 1992; Goeddel, 1990; Guthrie and Fink, 1991), expression in mammalian cells (Clontech; Gibco-BRL, Grand Island, NY), e.g., Chinese hamster ovary (CHO) cell lines (Haynes, et al., 1983; Lau, et al., 1984; Kaufman, 1990). These recombinant polypeptide antigens can be expressed directly or as fusion proteins. A number of features can be engineered into the expression vectors, such as leader sequences which promote the secretion of the expressed sequences into culture medium.
Expression of large polypeptide antigens is described in Example 5. Several of the long antigen clone sequences were cloned into expression vectors and successfully expressed in E. coli, as described in Example 5 and indicated in Tables 3a and 3b. Expression in yeast systems has the advantage of commercial production. Recombinant protein production by vaccinia and CHO cell lines has the advantage of being carried out in mammalian expression systems. Further, vaccinia virus expression has several advantages including the following: (i) a wide host range; (ii) faithful post-transcriptional modification, processing, folding, transport, secretion, and assembly of recombinant proteins; (iii) high level expression of relatively soluble recombinant proteins; and (iv) a large capacity to accommodate foreign DNA.
The recombinantly expressed H. pylori polypeptide antigens are typically isolated from lysed cells or culture media. Purification can be carried out by methods known in the art including salt fractionation, ion exchange chromatography, and affinity chromatography. Immunoaffinity chromatography can be employed using antibodies generated based on the H. pylori antigens identified by the methods of the present invention.
The resulting DNA coding regions can be expressed recombinantly either as fusion proteins or isolated polypeptides. In addition, amino acid sequences can be readily chemically synthesized using a commercially available synthesizer (Applied Biosystems, Foster City, CA) or "PIN" technology (Applied Biosystems). Antigens obtained by any of these methods can be used for antibody generation, diagnostic tests and vaccine development.
Exemplary amino acid sequences corresponding to expressed antigenic proteins of H. pylori are provided herein as SEQ ID NOs: 555-602.
b. Antibody Production. In another aspect, the invention includes specific antibodies directed against the polypeptide antigens of the present invention. Antigens obtained by any of these methods may be directly used for the generation of antibodies or they may be coupled to appropriate carrier molecules. Many such carriers are known in the art and are commercially available (e.g., Pierce, Rockford, IL). Typically, to prepare antibodies, a host animal, such as a rabbit or a goat, is immunized with the purified antigen or fused protein antigen. Hybrid or fused proteins may be generated using a variety of coding sequences derived from other proteins, such as glutathione-S-transferase or β-galactosidase. The host serum or plasma is collected following an appropriate time interval, and this serum is tested for antibodies specific against the antigen. These techniques are equally applicable to all immunogenic sequences described herein.
The gamma globulin fraction or the IgG antibodies of immunized animals can be obtained, for example, by use of saturated ammonium sulfate precipitation or DEAE Sephadex chromatography, affinity chromatography, or other techniques known to those skilled in the art for producing polyclonal antibodies. Alternatively, purified antigen or fused antigen protein may be used for producing monoclonal antibodies. Here the spleen or lymphocytes from an immunized animal are removed and immortalized or used to prepare hybridomas by methods known to those skilled in the art. To produce a human-derived hybridoma, a human lymphocyte donor is selected. A donor known to be infected with H. pylori may serve as a suitable lymphocyte donor. Lymphocytes can be isolated from a peripheral blood sample. Epstein-Barr virus (EBV) can be used to immortalize human lymphocytes or a suitable fusion partner can be used to produce human-derived hybridomas. Primary in vitro sensitization with viral specific polypeptides can also be used in the generation of human monoclonal antibodies. Antibodies secreted by the immortalized cells are screened to determine the clones that secrete antibodies of the desired specificity, for example, by using the ELISA or Western blot method (Ausubel, et al., 1988).
Purified polyclonal or monoclonal antibodies directed against H. pylori antigens can then be used in any of a number of standard immunoassay formats to detect the presence of antigen, such as described in Harlow, et al., 1988. One representative assay format is an antigen capture sandwich assay. In a typical sandwich assay, antibody is immobilized on a solid support. H. pylori infected samples (e.g., feces, dental plaque, gastric biopsies, culture suspension from a biopsy sample) are then allowed to react with the immobilized antibodies, followed by incubation with a different antibody directed against H. pylori (e.g., whole lysate) and subsequent reaction with a secondary antibody carrying a reporter label. Detection of label in a testing region is indicative of the presence of H. pylori antigen in a test sample. The above-described assay is representative of antigen-based assays that rely on antibodies prepared as described above, and is useful for the early detection of H. pylori antigens in a sample suspected of infection by H. pylori.
5. ELISA and Protein Blot Screening
H. pylori antigens are first identified, typically through plaque immunoscreening as described above, and expressed and purified (as previously described). The antigens are then screened rapidly against a large number of suspected H. pylori positive anti-sera using alternative immunoassays, such as ELISAs or Protein Blot Assays (Western blots) employing the isolated antigen peptide. The antigen polypeptide fusion protein is then isolated as described above, usually by affinity chromatography to the fusion partner such as β-galactosidase or glutathione-S-transferase. Alternatively, the antigen itself is purified using antibodies generated against it (see below).
A general ELISA assay format may be employed, such as those described in Harlow, et al. (1988). The purified antigen polypeptide, or fusion polypeptide containing the antigen of interest, is attached to a solid support, for example, a multiwell polystyrene plate. Biological fluids (e.g., sera) to be tested are diluted and added to the wells. After a period of time sufficient for the binding of antibodies to the immobilized antigens, the sera are washed out of the wells. A labelled reporter antibody is added to each well along with an appropriate substrate: wells containing antibodies bound to the purified antigen polypeptide or fusion polypeptide containing the antigen are detected by a positive signal. A typical format for protein blot analysis using one, any, several, or each of the polypeptide antigens of the present invention is presented in Example 6. General protein blotting methods are described by Ausubel, et al. (1988).
In Example 6A, the antigenic protein expressed by clone dHA22.8 (A22) was used to screen a number of sera samples (both pooled sera and discrete samples). The high percentage of sera reacting to antigen produced from recombinant clone dHA22.8 indicates that it is a dominant epitope, and that it is a suitable infection marker for H. pylori. The results presented in Example 6A demonstrate that H. pylori-positive anti-sera from several different sources are immunoreactive with this representative polypeptide antigen. Similar results are described for recombinant protein expressed by clone dHC1S.11 (C1). Additional experiments carried out in support of the invention as described in Example 6B reveal, on the basis of sera paneling data using both single antigens and antigen combinations, that preferred antigens for use in reliably and universally detecting H. pylori infection include but are not limited to the following: A22, C1, Y124A, Y261A, c5, C7, B2, Y104B, and Y128D. These antigens are effective as serological markers for detecting active infection by H. pylori, based upon favorable sensitivity and selectivity features. In Example 8, native proteins from H. pylori are shown to be immunoreactive with anti-H. pylori primary antibodies obtained from "Roost" pooled serum.
The results presented above demonstrate that the polypeptide antigens of the present invention can, by these methods, be rapidly screened against panels of suspected H. pylori infected serum samples for the detection of H. pylori.
A. Protective Antibodies, Vaccines and the Generation of Protective Immunity
a. Protective Antibodies. Protective antibodies can be identified using, for example, an animal model system (DuBois, et al., 1996). To identify protective antibodies, polyclonal or monoclonal antibodies are generated against the antigens of the present invention, where the antigens may be used as the immune-stimulation component arm in conjunction with cholera toxin (CT). Antibodies thus generated are then used to pre-treat an infectious H. pylori-containing inoculum (e.g., serum) before infection of cell cultures or animals. The ability of a single antibody or mixtures of antibodies to protect the cell culture or animal from infection is evaluated. For example, in cell culture and animals, the absence of antigen and/or nucleic acid production serves as a screen. Further, in animals, the absence of H. pylori disease symptoms, e.g., elevated carbon dioxide/ammonia levels in a urea breath test (UBT), is also indicative of the presence of protective antibodies. The urea breath test takes advantage of the action of the urease enzyme of H. pylori to decompose ingested 13C- or 14C-labelled urea to isotopically labelled carbon dioxide and ammonia; the labelled carbon dioxide is then measured.
Animal models for investigating H. pylori infection include: (i) gnotobiotic newborn piglets (easily infected by H. pylori of human origin, but preferred for short term studies), (ii) mice and ferrets, which can be colonized for months (mice) or years (ferrets), and (iii) certain domestic cats, which can carry H. pylori (DuBois, et al., 1996). H. pylori causes ulcers in gnotobiotic piglets, and in mice using "THE SYDNEY STRAIN" of H. pylori, and gastritis in mouse strains SJL, C3H/HZ, DBA, C56BL b, and Balb/C. A rhesus monkey infection model has also been developed (DuBois, et al., 1996).
Alternatively, convalescent sera can be screened for the presence of protective antibodies and then these sera used to identify H. pylori antigens that bind with the antibodies. The identified H. pylori antigen is then recombinantly or synthetically produced. The ability of the antigen to generate protective antibodies is tested as above.
b. Vaccines. After initial screening, the antigen or antigens identified as capable of generating protective antibodies, either singly or in combination, can be used as a vaccine to inoculate test animals (to be described in greater detail below). The animals are then challenged with infectious H. pylori. Protection from infection indicates the ability of the animals to generate antibodies that protect them from infection. Further, use of the animal models allows identification of antigens that activate cellular immunity.
In animal model studies, a protective immune response in response to challenge by a bacterial preparation (e.g., infected serum) (i) protects the animal from infection or (ii) prevents manifestation of disease.
Vaccines can be prepared from one or more of the immunogenic polypeptides identified herein. Numerous H. pylori polypeptides of the invention (e.g., spot 15) have been shown to be extremely antigenic, i.e., they react strongly with antibodies present in sera pooled from a number of confirmed H. pylori-positive donors. In a typical screening method, the intensity of color development is representative of the strength (binding affinity) of the antigen-anti-H. pylori antibody interaction.
Representative serum paneling results (Example 6) for proteins expressed by two of the clones, dHA22.8 (A22) and dHC1S.11 (C1), indicate that both of these recombinant proteins are highly immunogenic. Protein produced by clone dHA22.8 reacted with antibodies present in both of the H. pylori-positive pooled sera sources, Roost pool and SFA 001, and exhibited no cross reactivity with antibodies present in H. pylori-negative samples. Similar results were observed for protein produced by clone dHC1S.11 (C1).
Looking now at antigen reactivity with individual H. pylori-positive serum samples, antigenic protein expressed by each of the clones dHA22.8 and dHC1S.11 reacted with anti-H. pylori antibodies in 100% and 95% of the samples, respectively, indicating the ability of the antigens of the present invention to detect H. pylori infection, and to provide components for vaccines against H. pylori.
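The percentages quoted above are sensitivity figures: the fraction of known-positive sera that react with the antigen. The complementary measure, specificity, is the fraction of known-negative sera that do not react. The short Python sketch below shows the arithmetic only; the function names and the sample counts are hypothetical and are not the panel sizes used in Example 6.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of infected samples that the antigen detects."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of uninfected samples that the antigen leaves unreactive."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical panel: 20 positive sera, of which 19 react;
# 10 negative sera, none of which react.
sens = sensitivity(true_positives=19, false_negatives=1)
spec = specificity(true_negatives=10, false_positives=0)
```

With these made-up counts, 19 reactive sera out of 20 positives corresponds to 95% sensitivity, and the absence of reactivity among the negatives corresponds to 100% specificity.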
Preferred antigens for use in vaccine compositions are described in Example 13. Exemplary antigens are those capable of invoking a long-lasting antigenic response, as evidenced by the persistent presence of antibodies whose titre remains high for an extended period of time subsequent to antimicrobial treatment for H. pylori. On this basis, particularly preferred antigens for use in vaccine compositions include Y139, Y146B, Y175A, and A22, Y184A, Z9A, Y261Ains and Y146B.
Other peptides, among the unique H. pylori peptides disclosed herein, can be identified as useful in a vaccine for H. pylori as follows. The individual test peptide is formulated in a suitable carrier, e.g., adjuvant, at a concentration suitable for injection, e.g., 5-500 mg/ml. One suitable test animal for vaccination is the Rhesus monkey, as detailed in DuBois (DuBois, et al., 1996). The animal is vaccinated, e.g., by oral, intramuscular or intravenous injection, in an amount typically between 0.2 to 2.0 mg/kg body weight. After a suitable period, e.g., 2 weeks, the animal may be given a booster by the same route and typically in the same amount. Two to three weeks later, the animal is challenged with H. pylori in a known manner, e.g., as described in DuBois (DuBois, et al., 1996). As an example, the inoculated animal is challenged with H. pylori (e.g., a suspension of approximately 10^8-10^9 CFU of H. pylori, 1 ml) to test the potential vaccine effect of the specific immunogenic fragment. Following the challenge with H. pylori, the subject is typically monitored by endoscopy, histologic examination, microbiological methods, and/or measurement of H. pylori-specific plasma IgG, as described in DuBois (DuBois, et al., 1996).
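The dosing figures above translate into simple arithmetic: the dose in milligrams is the per-kilogram amount multiplied by body weight, and the injection volume is the dose divided by the formulation concentration. The Python sketch below works through that calculation. The 0.2 to 2.0 mg/kg range and the 5 mg/ml concentration come from the text; the helper names and the 5 kg body weight are hypothetical.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.2, high_mg_per_kg=2.0):
    """Lowest and highest dose (mg) for an animal of the given weight."""
    return low_mg_per_kg * weight_kg, high_mg_per_kg * weight_kg

def injection_volume_ml(dose_mg, concentration_mg_per_ml):
    """Volume (ml) of formulation that delivers the requested dose."""
    return dose_mg / concentration_mg_per_ml


# Hypothetical 5 kg animal, formulation at the low end of the stated range.
low_dose, high_dose = dose_range_mg(5.0)
volume = injection_volume_ml(low_dose, concentration_mg_per_ml=5.0)
```

For this assumed animal the dose range is about 1 to 10 mg, and the lowest dose corresponds to about 0.2 ml of a 5 mg/ml formulation.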
The level of infection in the vaccinated animal is then compared with that of a control animal to assess the degree of protection. Those peptides which provide a measurable degree of protection against H. pylori infection are suitable for vaccine use, either alone or in combination with other peptide vaccine agents, such as the above noted dHA22.8 (A22), dHC1S.11 (C1) and spot 15 peptides.
The selected peptide is formulated according to known vaccine formulations. Typically, the peptide is conjugated to a carrier protein, e.g., keyhole limpet hemocyanin or human serum albumin, and/or suspended in a suitable adjuvant, such as Freund's adjuvant. The vaccine is administered by conventional routes, typically IM or IV routes, as above, at peptide levels preferably in the range of 0.2 to 2.0 mg/kg. If necessary, one or more booster injections are given.
The specificity of a putative immunogenic fragment can be assessed by testing sera, other fluids or lymphocytes from the inoculated animal for cross reactivity with other related bacteria.
B. Synthetic Peptides
Using the coding sequences of H. pylori polypeptide antigens disclosed herein, synthetic peptides can be generated which correspond to these polypeptides. Synthetic peptides can be commercially synthesized or prepared using standard methods and apparatus in the art (Applied Biosystems, Foster City, CA).
Alternatively, oligonucleotide sequences encoding peptides can be either synthesized directly by standard methods of oligonucleotide synthesis, or, in the case of large coding sequences, synthesized by a series of cloning steps involving a tandem array of multiple oligonucleotide fragments corresponding to the coding sequence (Crea, 1989; Yoshio, et al, 1989; Eaton, et al, 1988). Oligonucleotide coding sequences can be expressed by standard recombinant procedures (Maniatis, et al, 1982; Ausubel, et al, 1988).
C. Immunoassays for H. pylori
One utility for one, several, many, or each of the antigens herein is their use as diagnostic reagents for the effective and reliable detection of antibodies present in the sera of test subjects infected with H. pylori, to thereby provide an indication of infection in a test subject. Preferred antigens which can be employed either singly or in combination in such a method include, e.g., antigens identified by or derived from SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), or SEQ ID NO:573 (Y128D). Alternatively, preferred antigens are defined in terms of their DNA coding sequences, corresponding to or derived from SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), or SEQ ID NO:98 (Y128D).
Alternatively, the antigen used for the detection of antibodies present in the sera of test subjects infected with H. pylori is encoded by a DNA fragment spanning one of the DNA fragment clusters: (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23), (24), (25), (26), (27), (28), (29), (30), (31), (32), (33), (34), (35), (36), (37), (38), (39), (40), (41), (42), (43), (44), (45), (46), (47), (48), (49), (50), (51), (52), (53), (54), (55), (56), (57), (58), (59), (60), (61), (62), (63), (64), (65), (66), (67), (68), and (69), or immunoreactive variants thereof. Preferably the antigen-coding regions are the spanning sequences themselves from the clusters.
The antigens of the present invention can be used singly, or in combination with each other, in order to detect H. pylori, as illustrated by the results in Example 6B. The antigens of the present invention may also be coupled with diagnostic assays for other infectious agents.
In one diagnostic configuration, test serum is reacted with a solid phase reagent having a surface-bound antigen obtained by the methods of the present invention, e.g., the "spot 15" antigen. Exemplary antigens are A22 and C1, which both show high sensitivity (as indicated in Figs. 65 and 66, and in Example 6B). After binding anti-H. pylori antibody to the reagent and removing unbound serum components by washing, the reagent is reacted with reporter-labelled anti-human antibody to bind reporter to the reagent in proportion to the amount of bound anti-H. pylori antibody on the solid support. The reagent is again washed to remove unbound labelled antibody, and the amount of reporter associated with the reagent is determined. Typically, the reporter is an enzyme which is detected by incubating the solid phase in the presence of a suitable fluorometric or colorimetric substrate, e.g., 5-bromo-4-chloro-3-indolyl-phosphate (BCIP) and nitroblue tetrazolium (NBT) (Sigma, St. Louis, MO).
The solid surface reagent in the above assay is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, 96-well plates or filter material. These attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group binding to a chemically reactive group on the solid support, e.g., an activated carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin coated plates can be used in conjunction with biotinylated antigen(s).
Also forming part of the invention is an assay system or kit for carrying out this diagnostic method. The kit generally includes a support with surface-bound recombinant antigen (e.g., antigens such as those described in Tables 3a and 3b, or encoded by the representative clones summarized in Table 2) or native H. pylori antigen (such as those identified in Table 6 and in Fig. 1 and Fig. 2, as above), and a reporter-labelled anti-human antibody for detecting surface-bound anti-H. pylori antigen antibody.
In a second diagnostic configuration, known as a homogeneous assay, antibody binding to a solid support produces some change in the reaction medium which can be directly detected. Known general types of homogeneous assays proposed heretofore include (a) spin-labelled reporters, where antibody binding to the antigen is detected by a change in reported mobility (broadening of the spin splitting peaks), (b) fluorescent reporters, where binding is detected by a change in fluorescence efficiency or polarization, (c) enzyme reporters, where antibody binding causes enzyme/substrate interactions, and (d) liposome-bound reporters, where binding leads to liposome lysis and release of encapsulated reporter. The adaptation of these methods to the protein antigen of the present invention follows conventional methods for preparing homogeneous assay reagents.
In each of the assays described above, the assay method involves reacting the serum from a test individual with the protein antigen and examining the antigen for the presence of bound antibody. The examining may involve attaching a labelled anti-human antibody to the subject antibody (for example from acute, chronic or convalescent phase) and measuring the amount of reporter bound to the solid support, as in the first method, or may involve observing the effect of antibody binding on a homogeneous assay reagent, as in the second method. Also contemplated is an antigen capture assay as previously described in Section D.2.
The following examples illustrate, but in no way are intended to limit the scope of the present invention.
Materials and Methods
1. General Procedures
Synthetic oligonucleotide linkers and primers were prepared using commercially available automated oligonucleotide synthesizers. Alternatively, custom designed synthetic oligonucleotides may be purchased from commercial suppliers.
Standard molecular biology and cloning techniques were performed essentially as previously described in Ausubel, et al, 1988; Sambrook, et al, 1989; and Maniatis, et al, 1982.
Common manipulations relevant to employing antisera and/or antibodies for screening and detection of immunoreactive protein antigens were performed essentially as described in Harlow, et al. (1988). Antibody screening of H. pylori genomic libraries was carried out by plaque immunoblot assay.
Similarly, Western blot assays were performed either as described by their manufacturer (Abbott, N. Chicago, IL; Genelabs Diagnostics, Singapore) or using standard techniques known in the art (Harlow, et al, 1988).
2. Bacterial Strains
Commercially available H. pylori strains corresponding to ATCC Designation Nos. 43504 (short antigen clone set) and 43526 (Libraries 1 and 2) were used to generate the DNA libraries. H. pylori strain ATCC 43504 was used for isolation of native proteins produced by H. pylori. Escherichia coli strains Y1088, Y1089, and XL1-Blue for libraries 1 and 2 (Stratagene, La Jolla, CA) were the hosts used for phage infection. E. coli strains XL1-Blue and XLOLR (Stratagene, La Jolla, CA) were used for protein expression of cloned genes.
Example 1
Construction of H. pylori Lambda gt11 and ZAPII DNA Libraries
1. Isolation of Genomic DNA
H. pylori (American Type Culture Collection, Rockville, MD; ATCC Designation No. 43504) was streaked on blood agar plates and incubated in a microaerophilic environment at room temperature for 7 days. Cells were harvested by scraping the bacterial cells from 10 plates, followed by washing once with phosphate buffered saline (Dulbecco's, Gibco BRL, Gaithersburg, MD).
Genomic DNA was prepared as described in Ausubel et al. 1988, with minor modifications. Cell pellets from 5 plates were resuspended in 510 μl of TE (Ausubel et al, 1988), to which was added 60 μl of 10% SDS and 30 μl of 20 mg/ml proteinase K. The suspension was mixed, followed by incubation for 4-8 hours at 37 °C. To the suspension was added 80 μl of CTAB/NaCl (10% hexadecyltrimethyl ammonium bromide in 0.7 M NaCl), and the resulting solution was then mixed and incubated for 10 minutes at 65 °C. The solution was then extracted with an equal volume of chloroform/isoamyl alcohol and spun in a microcentrifuge for 5 minutes. The separated aqueous phase was transferred to a new tube, and the DNA was precipitated by addition of 0.6 volumes of isopropanol, followed by centrifugation. The DNA pellet was washed with 70% ethanol, dried briefly under vacuum and solubilized in 100 μl of distilled water. The DNA solution was then treated with DNase-free RNase (Boehringer Mannheim, Indianapolis, IN) (Ausubel, et al, 1988; Maniatis, et al., 1982) to selectively degrade any RNA present in the sample.
2. DNase Digestion and DNA Amplification
a. Short Antigen Clone Libraries. H. pylori genomic DNA prepared as described above was digested with pancreatic DNase I (Boehringer Mannheim) essentially as described in Ausubel, et al. (1988) and Sambrook, et al (1989). Aliquots of the digested DNA were taken at various time points. The DNA digests were resolved by preparative agarose gel electrophoresis. Product bands containing the desired size range of DNA (200-2000 base pairs) were excised from the gel, and recovered using the "GENE CLEAN II" kit (Bio 101 Inc., La Jolla, CA) or the "MERMAID" kit (Bio 101 Inc., La Jolla, CA), according to the manufacturer's instructions.
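As a quick check on the lysis mixture used in the genomic DNA preparation above (510 μl TE, 60 μl of 10% SDS, 30 μl of 20 mg/ml proteinase K), the final working concentrations follow from the usual dilution rule. The short Python sketch below is illustrative only; the function and variable names are hypothetical and are not part of the described procedure.

```python
# Illustrative dilution arithmetic for the lysis mixture; the volumes and stock
# concentrations are the ones stated in the text, the names are hypothetical.

def final_concentration(stock, added_ul, total_ul):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * added_ul / total_ul

te_ul, sds_ul, protk_ul = 510, 60, 30
total_ul = te_ul + sds_ul + protk_ul  # total volume before CTAB/NaCl is added

sds_percent = final_concentration(10, sds_ul, total_ul)    # stock is 10 % SDS
protk_mg_ml = final_concentration(20, protk_ul, total_ul)  # stock is 20 mg/ml

print(f"total volume: {total_ul} ul")
print(f"final SDS: {sds_percent} %")
print(f"final proteinase K: {protk_mg_ml} mg/ml")
```

Under these stated volumes the mixture works out to 600 μl containing 1% SDS and 1 mg/ml proteinase K.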
To generate blunt-ends, the recovered DNA fragments were incubated with E. coli Klenow fragment of DNA polymerase (Ausubel, et al, 1988; Sambrook, et al, 1989). The reaction mixture was incubated at room temperature for 30 minutes, followed by extraction with phenol/chloroform.
The resulting molecules were ligated to Sequence Independent Single Primer Amplification (SISPA; Reyes, et al, 1991) linker primers according to the method of Reyes (Reyes, et al, 1991).
To the blunt-ended DNA were added the following: phosphorylated SISPA (Sequence-Independent Single Primer Amplification) linker AB, a double strand linker comprised of SEQ ID NO:63 and SEQ ID NO:64, where SEQ ID NO:64 is in a 3' to 5' orientation relative to SEQ ID NO:63 as a partially complementary sequence to SEQ ID NO:63, 2 μl 10 x ligation buffer (0.66 M Tris-Cl pH = 7.6, 50 mM MgCl2, 50 mM DTT, 10 mM ATP) and 1 μl T4 DNA ligase (0.3 to 0.6 Weiss Units). Typically, the DNA and linker were mixed at a 1:100 ratio. The reaction was incubated at 14 °C overnight. The reaction was then incubated at 70 °C for 3 minutes to inactivate the ligase. Unligated linkers were removed by gel filtration using a Chromaspin column (Clontech, Mountain View, CA) or a Sephadex G spin column (Pharmacia, Piscataway, NJ) according to the manufacturer's instructions.
The linker-ligated DNAs were then amplified by SISPA (Reyes, et al, 1991). To 100 μl of 10 mM Tris-Cl buffer, pH 8.3, containing 1.5 mM MgCl2 and 50 mM KCl (Buffer A) was added about 1 μl of the linker-ligated DNA preparation, 2 μM of a primer having the sequence shown as SEQ ID NO:63, 200 μM each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of Amplitaq DNA polymerase (Applied Biosystems Division, Perkin Elmer, Foster City, CA). The reaction mixture was heated to 94 °C for 30 seconds for denaturation, allowed to cool to 50 °C for 30 seconds for primer annealing, and then heated to 72 °C for 0.5-3 minutes to allow for primer extension by Taq polymerase. The amplification reaction, involving successive heating, cooling, and polymerase reaction, was repeated an additional 25-40 times with the aid of a Perkin-Elmer Cetus DNA thermal cycler (Mullis, 1987; Mullis, et al, 1987; Reyes, et al, 1991; Perkin-Elmer Cetus, Norwalk, CT).
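The cycling conditions stated above (94 °C for 30 seconds, 50 °C for 30 seconds, 72 °C for 0.5-3 minutes, one initial cycle plus 25-40 repeats) can be written down as a simple program description. The following Python sketch is only a way of tabulating those stated values; the class and function names are hypothetical, and the computed times cover hold steps only, ignoring the ramping time of any particular thermal cycler.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int

def sispa_program(extension_seconds, extra_cycles):
    """Steps for one initial cycle plus `extra_cycles` repeats.

    Temperatures and the 30 s denature/anneal times come from the text;
    the extension time is chosen within the stated 0.5-3 minute range.
    """
    cycle = [
        Step("denature", 94, 30),
        Step("anneal", 50, 30),
        Step("extend", 72, extension_seconds),
    ]
    return cycle * (1 + extra_cycles)

def hold_time_minutes(program):
    """Sum of hold times only; instrument ramp time is not modelled."""
    return sum(step.seconds for step in program) / 60

shortest = hold_time_minutes(sispa_program(extension_seconds=30, extra_cycles=25))
longest = hold_time_minutes(sispa_program(extension_seconds=180, extra_cycles=40))
print(f"hold time ranges from {shortest} to {longest} minutes")
```

With the stated ranges, the total hold time spans roughly 39 minutes (26 cycles, 30 second extension) to 164 minutes (41 cycles, 3 minute extension).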
After the amplification reactions, the solution was extracted with phenol/chloroform and precipitated with two volumes of ethanol.
3. Cloning of the DNA into Lambda Vectors
a. Short Antigen Clone Libraries. The linkers used in the ligation to the DNA contained an EcoRI site which allowed for direct insertion of the amplified DNAs into the lambda vectors (gt11 or ZAPII, Stratagene, La Jolla, CA). The lambda vectors as purchased from the manufacturer had been digested with EcoRI and treated with alkaline phosphatase to remove the terminal 5' phosphate and prevent self-ligation of the vector. The amplified DNAs from 1.B. were digested with EcoRI and short nucleotides were removed by gel filtration. The digested DNA preparations were then ligated into lambda gt11 or "ZAPII" using T4 DNA ligase (Boehringer Mannheim, Indianapolis, IN). The conditions of the ligation reactions were as follows: 1 μl vector DNA (Stratagene (La Jolla, CA) 0.5 mg/ml); 0.5 or 3 μl of the PCR amplified insert DNA; 0.5 μl 10 x ligation buffer (0.5 M Tris-HCl, pH = 7.8; 0.1 M MgCl2; 0.2 M DTT; 10 mM ATP; 0.5 mg/ml bovine serum albumin (BSA)), 0.5 μl T4 DNA ligase (0.3 to 0.6 Weiss units) and distilled water to a final reaction volume of 5 μl. The ligation reactions were incubated at 14 °C overnight (12-18 hours). The ligated DNA was packaged by standard procedures using a lambda DNA packaging system (GIGAPAK, Stratagene, La Jolla, CA), and then plated at various dilutions to determine the titer. The titer of the DNA-insert phage libraries and percent recombination were determined by a standard X-gal blue/white assay (Miller, 1994; Maniatis et al, 1982). Typically, the titer of the recombinant libraries ranged from 1.5 x 10^4 to 3 x 10^6 PFU/ml.
Percent recombination in each library can also be confirmed by selecting a number of random clones and isolating the corresponding phage DNA. Polymerase chain reaction (Mullis, 1987; Mullis, et al, 1987) is then performed using isolated phage DNA as template and lambda DNA sequences, derived from lambda sequences flanking the EcoRI insert site for the DNA molecules, as primers. The presence or absence of insert is then evident from gel analysis of the polymerase chain reaction products.
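The titer and blue/white calculations referred to above reduce to simple arithmetic. The Python sketch below illustrates them under stated assumptions: the plaque counts, dilution factor and plated volume are hypothetical example values (no such counts are given in the text), white plaques are assumed to be the insert-containing recombinants, and the function names are illustrative.

```python
def titer_pfu_per_ml(plaques, dilution_factor, plated_ul):
    """PFU/ml of the undiluted stock from one countable plate.

    `dilution_factor` is the fold dilution of the stock (e.g. 1000 for 10^-3)
    and `plated_ul` is the volume of that dilution plated, in microlitres.
    """
    return plaques * dilution_factor * 1000 / plated_ul

def percent_recombinant(white, blue):
    """Share of plaques scored white (assumed insert-containing) on X-gal."""
    return 100.0 * white / (white + blue)

def stock_volume_ul(target_pfu, titer):
    """Volume of undiluted stock (microlitres) that contains `target_pfu` phage."""
    return 1000.0 * target_pfu / titer

# Hypothetical counts, for illustration only.
titer = titer_pfu_per_ml(plaques=150, dilution_factor=1000, plated_ul=100)
recombinant = percent_recombinant(white=135, blue=15)
per_plate = stock_volume_ul(target_pfu=2e4, titer=titer)

print(f"titer: {titer:.2e} PFU/ml")
print(f"recombinants: {recombinant:.0f} %")
print(f"stock needed per plate: {per_plate:.1f} ul")
```

The last helper shows how a measured titer translates into the volume of stock needed to reach a chosen plating density per plate.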
b. Lambda ZAPII Libraries (Libraries 1 and 2). The lambda ZAPII libraries (generated from H. pylori sample ATCC No. 43526) were similarly prepared with the following exceptions. The ZAPII libraries were generated from genomic DNA which had not undergone Sequence Independent Single Primer Amplification. Library 1 clones (designated herein by upper case letters) were generated by ligating EcoRI-digested H. pylori DNA directly into the EcoRI sites of the lambda "ZAPII" vector. Library 2 clones (lower case designations) were obtained by digesting H. pylori genomic DNA with HindIII, blunt-ending the HindIII fragments, and ligating the resulting molecules to SISPA primers AB as described above. The linker-ligated DNAs were then treated with EcoRI and ligated into the EcoRI sites of ZAPII lambda arms.
Unless otherwise indicated, the DNA-insert gt11 and ZAPII phage libraries generated from H. pylori samples ATCC Designation No. 43504 and ATCC No. 43526 have been deposited in the Genelabs Technologies, Inc. Culture Collection, 505 Penobscot Drive, Redwood City, CA, 94063 or in the Genelabs Diagnostics PTE LTD Culture Collection, 85 Science Park Drive #04-01, The Cavendish, Singapore Science Park, Singapore 118259.
Example 2
Immunoscreening of Recombinant Libraries
The lambda gt11 and ZAPII phage libraries described in 1.C. above were immunoscreened for the production of antigens recognizable by a pool of sera from 11 patients (designated herein as "Roost pool" sera) identified as H. pylori-positive using endoscopy, or by another pool of 4 H. pylori immunopositive sera samples identified as SFA001. The lambda short antigen clone library and ZAPII library 2 were immunoscreened against Roost pool sera; the ZAPII library 1 was immunoscreened against sera pool SFA001. The lambda gt11 libraries were plated for plaque formation using E. coli Y1089 bacterial plating strain, while the lambda ZAPII libraries were plated for plaque formation using E. coli XL1-Blue bacterial plating strain.
The fusion proteins expressed by the recombinant lambda phage clones were screened with serum antibodies essentially as described by Ausubel, et al. (1988).
Each library was plated at approximately 1.5 to 2 x 10^4 phages per 150 mm plate. Plates were overlaid with nitrocellulose filters overnight. Filters were washed with TBS (10 mM Tris, pH 7.5; 150 mM NaCl), blocked with AIB (TBS buffer with 1% gelatin) and incubated with a primary antibody diluted 100 times in AIB.
After washing with TBS, filters were incubated with a second antibody consisting of goat anti-human IgG conjugated to alkaline phosphatase at a dilution of 1:1000. Reactive plaques were developed with a substrate (for example, BCIP, 5-bromo-4-chloro-3-indolyl-phosphate), with NBT (nitro blue tetrazolium salt; Sigma Chemical Co., St. Louis, MO). Positive areas from the primary screening were replated and immunoscreened until pure plaques were obtained. The results of the screening are presented in Tables 1a and 1b.
Table 1a
Roost Pooled Sera
[Table 1a appears as an image in the original publication.]
NA = not available
The above sera have all been tested with HelicoBlot 2.0 (Genelabs Diagnostics Western blot product) and were found to be positively infected samples using the criteria of the product.
SFA001 is a pool of 4 donor sera (plasma packs) showing strong reactive bands on Western blot format with a crude lysate antigen preparation from Helicobacter lysate used in HelicoBlot 2.0. NAA121 is a pool of 4 donor sera (plasma packs) showing no reactive bands on Western blot format with the crude lysate antigen preparation from Helicobacter lysate used in HelicoBlot 2.0.
Table 1b
The lambda gt11 and ZAPII libraries
[Table 1b appears as an image in the original publication.]
Example 3
Characterization of Immunoreactive Lambda gt11, Z and Lambda ZAPII Y Library Clones
1. PCR Amplification, Purification, and Sequence Determination for DNA Inserts of Immunoreactive Recombinant Lambda Clones (Short Antigen Clone Set, H. pylori Cloned Families Y and Z)
The specific antigen coding sequences from H. pylori cloned families were isolated by PCR amplification of representative clones. Cloned sequences having an .ASM extension were sequenced completely; sequences with a .SEQ extension were partially sequenced.
The DNA inserts of the immunoreactive recombinant lambda clones were PCR amplified using primers corresponding to lambda arm sequences flanking the EcoRI cloning site of the vectors. Amplification was carried out by polymerase chain reactions utilizing each immunoreactive clone as template. For the lambda gt11 clones, gt11F (SEQ ID NO:65) and gt11R (SEQ ID NO:66) primers were used. For the lambda ZAPII clones, primers T3 (SEQ ID NO:67) and T7 (SEQ ID NO:68) were used.
The resulting amplified fragments were then agarose gel purified and eluted from the gel (Ausubel, et al, 1988). The PCR products were further purified by "WIZARD PCR PREPS" (Promega, Madison, WI) or "CHROMASPIN" columns (Clontech, Mountain View, CA) to remove primers and other ingredients. The purified insert DNA was then subjected to direct sequencing. In some cases, the insert DNA was first subcloned into the TA cloning vector (Invitrogen, San Diego, CA) and then sequenced.
Sequence determination for the DNA inserts was carried out using a Perkin Elmer Applied Biosystems 373A DNA sequencer (Perkin Elmer, Applied Biosystems Division, Foster City, CA) according to the manufacturer's protocol (dideoxy chain terminator sequencing methodology; Sanger, et al, 1977).
Sequence data is presented in the accompanying Sequence Listing. Table 2 below presents a partial summary of recombinant H. pylori nucleotide sequences encoding immunogenic proteins, i.e., proteins shown to be reactive with H. pylori-positive sera, corresponding to cloned families Y and Z.
The EcoRI site of the linkers has typically been deleted for the corresponding sequences presented in the figures.
Table 2
[Table 2 appears as a series of images in the original publication.]
Clone Y104-1 (SEQ ID NO:60) contains the entire d7 clone, and encodes all of the 36K peptides and all of the spot 15 peptides, as indicated in Fig. 3.
The production of expressed antigenic proteins and their subsequent purification is generally described in Examples 5A and 5B. A tabular summary indicating clone name, expression, purification, and panelling data is provided in Table 3b, and a summary of the immunoreactivity of various recombinant antigens is provided in Table 5b. Expression profiles were obtained, and the immunoreactivity of various clones to H. pylori positive pooled sera such as "Roost" was confirmed.
Briefly, amplified products corresponding to a particular ORF were typically cloned into a pGEXhisB vector and expressed in E. coli. The size of the expression product was then determined, followed by confirmation of immunoreactivity (e.g., with Roost pool sera).
2. Sequence Comparison
Sequences were compared with "GENBANK" (versions 1998, 1996), EMBL database and dbEST (National Library of Medicine) sequences at both nucleic acid and amino acid levels. Search programs FASTA, BLASTP, BLASTN and BLASTX (Altschul, et al, 1990) indicated that most of the antigen-coding polynucleotide sequences disclosed herein were not greater than 95% identical to any sequence contained in the above-mentioned public databases prior to the publication of the genome of H. pylori (Tomb, et al, 1997).
Example 4
Characterization of Immunoreactive Lambda Library 1 and Library 2 Clones
1. Subcloning, Purification, Identification and Sequence Determination for H. pylori Antigens
Additional immunopositive clones, as described in Example 2 above, were purified and analyzed for DNA insert size and expressed protein size as in Example 3 above. The ZAPII clones were rescued into plasmids from phagemids (as per protocol in Stratagene, La Jolla, U.S.A.); the resulting insert DNA was excised with EcoRI and run on an agarose gel to determine size. Slightly longer versions of T3 and T7 (GD 60 and GD 61, corresponding to SEQ ID NO:231 and SEQ ID NO:232 respectively) were then used as primers for sequencing. These libraries correspond to the "library 1" series (EcoRI-cut), i.e., clones A3, A22, B2, B9, B17, B23, C1, C3, C7, and "library 2" series clones (HindIII-cut), i.e., clones a1, a3, a5, b5, b8c7, c2, c5, c13, d5, d6, d7, d11, f11, e6, f3, f8, g2, g9, g11, and k4; and also clone G1a from the EcoRI/XbaI cut library. Many clones were determined to be larger than the predicted coding regions of the proteins.
The specific antigen coding sequences from H. pylori library 1 and library 2 clones were subcloned. The subclones were typically prepared by fragmenting the corresponding genomic clone by specific restriction endonuclease digestion to produce a specific subfragment or subfragments, which were purified using the "GENE CLEAN II" kit (Bio 101 Inc., La Jolla, CA) or the "MERMAID" kit (Bio 101 Inc., La Jolla, CA). The resulting DNA fragments were then inserted into a suitable expression vector, i.e., the pBluescript (SK) vector (Stratagene, La Jolla, CA). Expression clones were induced with 0.5 mM IPTG (isopropylthio-beta-D-galactoside) for 4 hours at 37 °C, and whole bacterial cell lysate was run on an SDS gel. Immunoreactivity of the expressed proteins was confirmed using Roost, SFA001 and NAA001 pooled sera. Alternatively, original genomic clones were subjected to nested deletion using the "Erase-a-Base" kit (Promega, Wisconsin, U.S.A.), and the resulting smaller nested clones were tested for immunoreactivity to locate the position of the coding antigen.
Immunoreactive DNA regions were then sequenced and locations of the open reading frames were determined. Unless otherwise indicated as a β-galactosidase fusion product, for each of the clones in Tables 3a and 3b below, expression of coding regions was determined to be driven by the corresponding H. pylori promoter rather than by the β-galactosidase gene promoter in lambda gt11.
LIBRARY 1 CLONES
2. Clone A3
Subclone A3CON3, originating from a genomic clone having a size of approximately 6.0 kb, was obtained from nested deletion clones. The corresponding nucleotide sequence (1878 bp) is presented herein as SEQ ID NO: 16. The open reading frame (ORF) extends from nucleotides 399-1743, and codes for a putative protein containing 448 amino acids (443 amino acids when calculated from the first methionine). The protein sequence corresponding to the translation of open reading frame (ORF) 399-1743 is presented herein as SEQ ID NO: 17.
3. Clone A22
Subclone A22210DMIC was obtained from an original genomic clone having an insert size of about 2.2 kb, to produce a β-galactosidase fusion protein in E. coli. As indicated in Tables 3a and 3b below, the open reading frame (ORF) extends from nucleotides 12-599 of SEQ ID NO:43, where bases 12-121 correspond to the β-galactosidase fusion peptide and bases 122-599 code for a unique A22 antigenic sequence. The translation sequence for the corresponding protein is presented as SEQ ID NO:44.
4. Clone B2
B3C19 is a subclone obtained from the 5.0 kb insert in genomic clone B3 as follows. The insert was excised, digested with DNase, followed by treatment with T4 DNA ligase and Klenow enzyme to produce blunt-ends. The blunt-ended short DNA pieces were then ligated to kinased A/B linkers (linker A, SEQ ID NO:63, corresponding to the top strand of AB SISPA linker; linker B, SEQ ID NO:64, corresponding to the bottom strand of AB SISPA linker), PCR amplified with primer A, and the amplified products digested with EcoRI. The digested products were then ligated into EcoRI-digested lambda vector ZAPII. The DNA sequence of B3C19 is presented as SEQ ID NO:51.
Based upon the sequence of B3C19, primers GD77 (5' primer, SEQ ID NO:52) and GD80 (3' primer, SEQ ID NO:53) were designed to walk back and forward along the sequence of B3C19 using genomic B2 DNA as the template. The sequence of clone B3C19 was extended in both the 5' and 3' directions, resulting in B2 extension clone, B2197780 (SEQ ID NO:54). A computer-generated protein translation of the B2197780 sequence resulted in a corresponding 291 amino acid putative protein, with a predicted transmembrane segment from amino acids 9-25 (predictions obtained using the "SOAP" program from "PCGENE"). Based upon the results of the analysis, a Shine-Dalgarno AGGA sequence was determined to be situated 8 b.p. in front of an ATG codon. This ATG codon at nucleotide position 278 appears to be the first "met" amino acid of the gene. The translated protein sequence for clone B2 is presented as SEQ ID NO:55. The predicted molecular weight of the full size protein is 31,682. There are 2 potential cleavage sites: (i) between amino acids 91 and 92; and (ii) between amino acids 25 and 26 ("PCGENE" - Prediction of prokaryotic secretory signal sequence). The pI of the protein is predicted as 8.3.
5. Clone B9
Subclone B9.4C is a 1.2 kb HindIII fragment obtained from original genomic clone B9 (4.5 kb insert) as follows.
Induction of genomic clone B9 produced an immunoreactive protein with sizes of 32 kd and 14 kd in SFA001 pooled sera and 14 kd in Roost pooled sera. B9 was digested with HindIII to form several HindIII subfragments, which were each subcloned into the pBKS vector. Protein production was induced in the resulting subclones, and the sizes of the corresponding proteins were determined.
Subclone B9.4C, corresponding to a 1.2 kb HindIII subfragment of genomic clone B9, produced an immunoreactive protein of 32 kd. The nucleotide sequence of subclone B9.4C is presented as SEQ ID NO:47; the translated protein sequence (i.e., nucleotides 230-931 of SEQ ID NO:47) is presented as SEQ ID NO:48. Referring to SEQ ID NO:47, an AGGA Shine-Dalgarno sequence occurs at nucleotide 218, and the first amino acid of the translated protein sequence begins with a GTG codon for valine at position 230. Based upon PCGENE calculations (CHARGPRO program) as described above, the predicted molecular weight of the corresponding protein is 25.2 kd, with a pI of 10.35. A potential cleavage site occurs between amino acid positions 225 and 226.
Based upon a FASTA sequence identity analysis of both nucleic acids and proteins, clone B9 appears to code for the 50S L1 protein of H. pylori.
6. Clone B17
B17CON4 (2006 bases) was subcloned from a genomic clone having an insert size of about 4.0 kb. The corresponding nucleotide sequence is presented as SEQ ID NO:24. The open reading frames (ORF) correspond to the following regions of SEQ ID NO:24: ORF1 (nucleotides 500-700); ORF2 (nucleotides 870-1406); ORF3 (nucleotides 1410-2000); and ORF4 (nucleotides 142-705). The corresponding translated protein sequences are presented herein as SEQ ID NO:25 (B17ORF4), SEQ ID NO:26 (B17ORF1), SEQ ID NO:27 (B17ORF2) and SEQ ID NO:28 (B17ORF3). Based upon nested deletion experiments, it appears that the immunoreactive protein corresponds to B17ORF4.
On the basis of computer-generated protein translation analysis as previously described, B17ORF4, B17ORF2 and B17ORF3 encode putative proteins having predicted sizes of 20.9 kd, 20 kd and 22.2 kd respectively. The predicted size of the immunoreactive genomic protein is a doublet of 22 kd and 23 kd.
7. Clone B23
Immunoreactivity was determined to reside in a 3.5 kb PstI subclone (one PstI site is derived from vector pBSK) of original genomic clone B23 (5.5 kb). The 3.5 kb subclone was further sequenced, to determine the immunoreactive coding sequence presented in SEQ ID NO:13. The 1078 base pair nucleotide sequence (SEQ ID NO:13) was determined to be open all the way (bases 3-1076). The translated protein sequence is presented as SEQ ID NO:14.
8. Clone C1
C1CON6V2 was subcloned from an original genomic clone having an insert size of about 4.0 kb. The sequence information obtained from exo-mung deletion clones is presented as SEQ ID NO:38. The open reading frame (ORF) extends from nucleotides 868-1926. The subcloning of the antigen coding sequence from the C1 clone into an expression vector, and characteristics of the corresponding protein product are presented in Example 5 below. The translated protein sequence is presented herein as SEQ ID NO:39.
9. Clone C3
To determine the immunoreactive coding region of original genomic clone C3 (2.9 kb insert size), several different subfragments of the genomic clone were obtained and sequenced. The immunoreactive clone was determined to reside in an EcoRI/KpnI 2.2 kb subclone. Based upon a number of subcloning experiments and subfragments, the C3 DNA sequence is presented as SEQ ID NO:15.
10. Clone C7
Clone C7.2C is an EcoRI/HindIII subclone obtained from genomic clone C7. The original genomic C7 clone is an EcoRI clone of approximately 3.8 kb size that produces an immunoreactive protein having a molecular weight of approximately 30 kd.
Subclone C7.2C was obtained as follows. The genomic C7 clone was digested into individual fragments using HindIII. Each DNA restriction fragment was then subcloned into the pBKS vector via either the HindIII site, or alternatively, for end-piece fragments, via the EcoRI/HindIII sites. The corresponding nucleotide sequence of C7.2C (616 base pairs) is presented as SEQ ID NO:20. Subclone C7.2C produces an immunoreactive protein which is the same size as that produced by the genomic clone. The translated protein sequence for bases 1-561 of clone C7.2C is presented herein as SEQ ID NO:21.
11. Clone B8
Clone B8 (SEQ ID NO:549) is a genomic clone with an insert size of 3.5 kb that produces an immunoreactive protein of about 50 kd on SDS Western blot. The DNA sequence of B8 overlaps with that of clone C7 (clone C7.2C), which was a fusion protein with beta-galactosidase. Clone B8 added on 266 additional amino acids 5' (upstream) to C7 and contains the beginning of the gene. However, B8 ends 64 amino acids before the C-terminus of the gene, which is also encoded by C7. Based on the above, the complete sequence of the full gene was compiled (SEQ ID NO:549). The corresponding protein sequence from the compiled gene sequence codes for a putative protein of 48.9 kd, and is presented herein as SEQ ID NO:550.
LIBRARY 2 CLONES AND G1A
12. Clone a1
Clone a1 is a HindIII clone that produces an immunoreactive protein of 28-30 kd on SDS gel. The corresponding DNA sequence encoding an antigenic protein contains 1208 nucleotides (SEQ ID NO:35). There are two open reading frames.
ORF1 extends from bases 53-801 of SEQ ID NO:35, and encodes a putative protein containing 249 amino acids (SEQ ID NO:36, translated protein). ORF1 contains a Shine-Dalgarno sequence extending from nucleotides 43-46 ("AGGA" sequence). The predicted molecular weight of the protein is 27.5 kd, which is in agreement with the expected protein size based upon SDS-PAGE (above). The putative protein for ORF1 has a calculated pK of 8.42 and a predicted transmembrane region extending from amino acids 1-19.
The second ORF contained in clone a1 extends from nucleotides 880-1206 and ends at one of the HindIII sites, which indicates that clone a1 is a partial clone. The predicted size of the partial protein is 109 amino acids (SEQ ID NO:37).
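The relationship between the ORF coordinates and the protein lengths quoted for this clone can be checked with simple arithmetic. The Python sketch below is illustrative only: it treats the stated coordinates as 1-based and inclusive (an assumption), counts complete codons, and estimates molecular weight with a rule-of-thumb average of 110 Da per residue, which is an assumption and not the PCGENE calculation used in the text. The function name is hypothetical.

```python
def orf_codons(start, end):
    """Complete codons and leftover nucleotides for an ORF given as
    1-based, inclusive nucleotide coordinates (assumed convention)."""
    length = end - start + 1
    return length // 3, length % 3

AVG_RESIDUE_DA = 110  # rule-of-thumb average residue mass; an assumption

# ORF coordinates as stated for the two open reading frames of this clone.
orfs = {"ORF1": (53, 801), "ORF2": (880, 1206)}

for name, (start, end) in orfs.items():
    codons, leftover = orf_codons(start, end)
    approx_kd = codons * AVG_RESIDUE_DA / 1000
    print(f"{name}: {codons} codons, {leftover} nt left over, ~{approx_kd:.1f} kd")
```

Under these assumptions the coordinates give 249 and 109 codons, matching the stated amino acid counts, and the rough estimate for the first ORF (about 27.4 kd) is close to the predicted 27.5 kd. Leftover nucleotides are reported rather than interpreted.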
13. Clone a3
Immunoreactive clone a3 (not to be confused with clone A3 from library 1) is a β-galactosidase fusion clone with an insert size of 1975 base pairs. The corresponding nucleotide sequence corresponds to SEQ ID NO:248. The clone contains two open reading frames (ORFs): ORF1 (nucleotides 3-608, SEQ ID NO:249) and ORF2 (nucleotides 613-1266, SEQ ID NO:250). Based upon the observed size of the immunoreactive protein, i.e., 30 kd, the expected immunoreactive-expressing open reading frame is ORF1. Expression of a3 protein is described in the following section.
14. Clone a5
Clone a5 is an original HindIII/EcoRI genomic clone (1 kb) that produces a 25 kd immunoreactive protein. The corresponding nucleotide sequence is presented as SEQ ID NO:3. The clone contains two open reading frames (ORFs): ORF1 (nucleotides 3-545, SEQ ID NO:5), which codes for a beta-galactosidase fusion protein, and ORF2 (nucleotides 569-1021, SEQ ID NO:4). A Shine-Dalgarno sequence occurs at nucleotides 559-562 of ORF2. Based upon the observed size of the immunoreactive protein, i.e., 28 kd, the expected immunoreactive-expressing open reading frame is ORF1.
15. Clone b5
Clone b5 contains a double insert with an internal EcoRI site. The individual inserts are about 0.3 and 0.2 kb in length. The combined nucleotide sequence for clone b5 corresponds to SEQ ID NO:6, with the open reading frame (ORF) extending from nucleotides 1 through 414. The corresponding protein sequence translation is presented as SEQ ID NO:7.
16. Clone c2
Clone c2 is a library 2 clone with an insert size of 2077 nucleotides (SEQ ID NO:251). The size of the immunoreactive protein on SDS-gel is 30 kd. There is one ORF extending from nucleotides 3-877, with a possible initial start codon at nucleotide 45. The corresponding protein sequence translation is presented as SEQ ID NO:253. The protein has a predicted pI of 9.54, with a potential transmembrane region between amino acids 2 through 24. The predicted molecular size of this c2 protein is 30.2 kd, which agrees with the observed size on SDS gel.
17. Clone c5
Clone c5 is a 650 base pair Hindllll Hindlll clone (SΕQ ID NO:253). Clone c5 is a partial clone which is not in phase with the /3-galactosidase gene, and contains a 29 amino acid stretch extending into pBluescript at the N-terminus. The ORF extends from bases 2-604, with a predicted protein size of 23.5kd (SΕQ ID NO:254). The observed protein size on SDS-gel is 30kd. The putative protein has a pi of 9.6.
18. Clone c!3 Clone cl3 is a Hindlll/ Hindlll clone with an insert size of 1742 b.p. (SΕQ ID NO:255). It is a /3-galactosidase fusion protein with an observed size of 55kd on SDS gel. The ORF extends from bases 2 to 1420. The corresponding protein possesses a predicted molecular weight of 54.6kd (SΕQ ID NO:256). A sequence search using SWISS pro database indicates that the sequence possesses homologies with threonyl-tRNA synthetase of various organisms, from bacteria to yeast and human. 19. Clone d5
Clone d5 is an EcoRI clone that produces an immunoreactive protein (β-galactosidase fusion protein) of about 70 kd on SDS-PAGE gel. The size of the cloned insert was determined to be 1795 bases (SEQ ID NO:58), with an ORF extending from nucleotides 1-1704. The open reading frame codes for a putative protein composed of 568 amino acids (SEQ ID NO:59) and having a predicted molecular weight of 62.1 kd. The putative protein has a predicted pK value of 5.1, and the predicted antigenic determinant lies in the N-terminal region of the protein.
A computer-based analysis was carried out using the "PROSITE" program on "PCGENE". Based upon this study, several signature sites were identified within the protein, including a cAMP- and cGMP-dependent protein kinase phosphorylation site and an ATP-GTP-binding motif A (P-loop).
20. Clone d7
Clone d7 is a HindIII clone that was blunt-ended, ligated to A/B linkers, digested with EcoRI, and subcloned into lambda vector ZAPII as described above for Example 4.1. The product is a β-galactosidase fusion protein. The nucleotide sequence is presented as SEQ ID NO:11.
The induced protein was determined to have sizes of 40 kd and 35 kd. The corresponding protein sequence translated from SEQ ID NO: 11 is presented as SEQ ID NO: 12.
21. Clone f3
Clone f3 is an EcoRI clone having an insert size of 2274 nucleotides (SEQ ID NO:257). Clone f3 produces a β-galactosidase fusion protein. The ORF extends from nucleotides 1-1788 and codes for a putative protein having a predicted molecular size of 67 kd (excluding the β-galactosidase portion) (SEQ ID NO:258). The calculated pI of the protein is 9.66. A sequence search against both the SWISS-PROT database and the DNA database reveals a 30% homology to penicillin binding proteins of Haemophilus influenzae, E. coli, Bacillus subtilis and Streptococcus pneumoniae. A search against the EMBL DNA database reveals a 56% homology with penicillin binding proteins of Haemophilus influenzae.
The size of the protein observed on SDS-gel is 25kd (major band), with a minor band of around 67kd. This size discrepancy may possibly be due to cleavage of the protein to produce a smaller fragment.
22. Clone f8
F8CON1 is a HindIII subclone that produces an immunoreactive protein of 33-35 kd on SDS gel. The DNA sequence corresponding to the f8 subclone is presented herein as SEQ ID NO:1. The cloned 1459 base DNA subfragment contains an open reading frame (ORF) from nucleotides 134-1042.
The putative protein encoded by the f8 subclone contains 303 amino acids, with a predicted molecular weight of 33.9 kd (which is in agreement with the expected protein size). The predicted antigenic determinant is at the 3' end of the putative protein, extending from amino acid 270 to amino acid 275 (nucleotides 941-958). The putative protein has a pK of 9.75. The corresponding protein sequence based upon a translation of SEQ ID NO:1 is presented as SEQ ID NO:2.
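The relationship between the amino acid positions and the nucleotide coordinates quoted for the f8 ORF can be checked with a short calculation. The following is a minimal sketch, assuming 1-based inclusive coordinates and an ORF that is read in frame from its first nucleotide; the function names are illustrative only and do not come from the original disclosure.

```python
def orf_codon_count(orf_start, orf_end):
    """Number of codons in an ORF given 1-based inclusive nucleotide coordinates."""
    length = orf_end - orf_start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


def residue_span_to_nucleotides(orf_start, first_residue, last_residue):
    """Map a 1-based residue range to the 1-based inclusive nucleotide range
    that encodes it, assuming the ORF is in frame from orf_start."""
    nt_start = orf_start + 3 * (first_residue - 1)
    nt_end = orf_start + 3 * last_residue - 1
    return nt_start, nt_end


# Values taken from the f8 description: ORF at nucleotides 134-1042,
# predicted antigenic determinant at amino acids 270-275.
print(orf_codon_count(134, 1042))
print(residue_span_to_nucleotides(134, 270, 275))
```

Under these assumptions the ORF length corresponds to the 303 residues stated above, and residues 270-275 map onto nucleotides 941-958.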
W. Clone g2
Clone g2 is an EcoRI/HindIII clone with an insert size of 2474 nucleotides (SEQ ID NO:259).
The size of the immunoreactive protein observed on SDS gel was 25 kd. The clone contains 4 ORFs. The first and last ORFs encode partial proteins, i.e., the complete ORF was not contained in the clone. ORF1 extends from nucleotide 2 to 445 and codes for the carboxyl end of a putative protein of 148 amino acids (SEQ ID NO:260). ORF2 extends from bases 461-1156, coding for a putative protein of molecular size 25.4 kd (SEQ ID NO:261). ORF3 extends from bases 1156 to 1776 and codes for a protein having a size of about 22.2 kd (SEQ ID NO:262). ORF4 (nucleotides 1798-2472) codes for the amino terminus of a putative protein of 24.4 kd (SEQ ID NO:263). The predicted pIs of ORF2 and ORF3 are 7.82 and 5.26, respectively. While the putative proteins encoded by ORF2 and ORF3 do not contain predicted transmembrane regions, the putative protein encoded by ORF4 contains 3 predicted transmembrane regions with 6 predicted transmembrane helices.
23. Clone g9
Clone g9 possesses an insert size of 4292 bases. Two subclones, k4 and g11, are contained within clone g9. Clone g9 produces an immunoreactive protein band of about 85 kd, whilst k4 has immunoreactive bands of 55 kd, 46 kd, 35 kd and 30 kd, and g11 produces a weakly immunoreactive band of 22 kd.
The complete sequence of g9 was obtained by walking in the 5' direction from the g11 C-terminus sequence, and walking in the 3' direction from the k4 N-terminus sequence (SEQ ID NO:264). Clone g9 has 5 ORFs. ORF1 encodes a partial protein in the region extending from bases 2-349 (SEQ ID NO:265). The corresponding predicted molecular size of the protein is 12.3 kd. ORF2 extends from bases 495-1403 (SEQ ID NO:266), and the corresponding predicted protein possesses a molecular size of 33.8 kd. ORF2 is followed by an "AGGA" Shine-Dalgarno sequence, positioned in front of the ATG of ORF3 (bases 1418-2209). The predicted molecular size for the putative protein encoded by ORF3 is 29.7 kd (SEQ ID NO:267). ORF4 corresponds to bases 2223-3719. The protein encoded by ORF4 has a predicted molecular size of 56.8 kd (SEQ ID NO:268). ORF5 extends from nucleotides 3133 to 4236, with a predicted protein molecular size of 19.9 kd (SEQ ID NO:269).
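As a rough plausibility check on predicted sizes of this kind, the codon count of an ORF can be converted to an approximate molecular size. The sketch below assumes an average residue mass of about 0.110 kd, a common rule of thumb that is not stated in the disclosure; the predicted sizes reported here were presumably computed from the actual amino acid compositions, so only approximate agreement should be expected. All names in the code are illustrative.

```python
# Rule-of-thumb average mass per amino acid residue, in kd (an assumption,
# not a value taken from the disclosure).
AVG_RESIDUE_KD = 0.110


def orf_codon_count(start, end):
    """Whole codons contained in a 1-based inclusive nucleotide range."""
    return (end - start + 1) // 3


def estimated_size_kd(start, end):
    """Approximate protein size from ORF coordinates using the average residue mass."""
    return orf_codon_count(start, end) * AVG_RESIDUE_KD


# Coordinates and reported predicted sizes quoted for clone g9 (ORF2 to ORF4).
G9_ORFS = {
    "ORF2": (495, 1403, 33.8),
    "ORF3": (1418, 2209, 29.7),
    "ORF4": (2223, 3719, 56.8),
}

for name, (start, end, reported_kd) in G9_ORFS.items():
    codons = orf_codon_count(start, end)
    estimate = estimated_size_kd(start, end)
    print(f"{name}: {codons} codons, ~{estimate:.1f} kd estimated, {reported_kd} kd reported")
```

The estimate counts every codon in the stated range, including any stop codon, which is one reason it can differ slightly from a composition-based prediction.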
24. Clone Gla
An immunoreactive 2.8 kb HindIII fragment of the corresponding genomic EcoRI clone, having an insert size of 6.4 kb, was treated with exo-mung nuclease to generate nested deletion clones for sequencing. The DNA sequence of a 1610 base pair region was determined (SEQ ID NO:8). Two open reading frames (ORFs) were determined: ORF1, nucleotides 32-964; and ORF2, nucleotides 1034-1570. The corresponding translated protein sequences are presented herein as SEQ ID NO:9 and SEQ ID NO:10, respectively.
25. Details of the characterization of additional library 2 clones, e6, d11, f11, and d6, are provided in the Tables below.
Table 3a
Summary of Library 1 and Library 2 Clones
Figure imgf000059_0001
Figure imgf000060_0001
n.d. = not determined. Underlined numbers for protein size are the observed protein sizes on SDS gel. Predicted protein size does not include the β-galactosidase fusion peptide.
Table 3b
Figure imgf000061_0001
Figure imgf000062_0001
n.d. = not determined. Underlined numbers for protein size are the observed protein sizes on SDS gel. Where there are more than 2 ORFs, * indicates the expressed ORF that is immunoreactive.
Table 4
Figure imgf000062_0002
Figure imgf000063_0001
Figure imgf000064_0001
Example 5A
Expression of H. pylori antigen coding regions
Amplified products from various clone families, e.g., A3, A22, B2, B9, C1, C5, C7, and Gla, were cloned into pGEX vectors (Pharmacia). The cloned constructs were expressed in E. coli strain (Stratagene, La Jolla, CA).
1. Clone B2
Expression primers GD96 (forward primer, SEQ ID NO:56) and GD97 (reverse primer, SEQ ID NO:57) were designed to introduce NcoI and BamHI sites for subcloning the B2 fragment into expression vectors pGEXdel65 and pGEXGLI. The expression product lacked the predicted transmembrane region.
Based upon the insolubility of the expressed proteins, only the expression product from pGEXdel65 was further purified. The protein was purified by affinity chromatography, employing Ni-NTA resin from Qiagen (Chatsworth, CA). The purified protein was paneled against various sources of H. pylori-infected sera. The serum paneling was carried out twice, with differing results: (i) 75% sensitivity, based upon 35 sera samples, and (ii) sensitivity of about 30%, based upon 183 sera. The serum panel was performed using protein that had undergone further SDS gel electroelution following affinity chromatography purification.
2. Clone C1
Amplified products from clone C1CON6V2 were cloned into expression vector pGEXdl65 using primers BF (SEQ ID NO:40) and BG (SEQ ID NO:41) as outlined in Table 5 below. The expressed protein was confirmed to be immunoreactive; however, it was cleaved beyond the transmembrane region to form a smaller immunoreactive band. Based upon this observation, a new forward primer was designed for the expression of the smaller protein, primer BO (SEQ ID NO:42), as indicated in Table 5.
The resulting protein was determined to be immunoreactive, and formed in much higher yields than in the previous construct.
3. Clone B17
Expression primers CA (forward primer, SEQ ID NO:29) and CB (reverse primer, SEQ ID NO:30) were designed to introduce NcoI and BamHI sites for subcloning the B17 ORF4 fragment into expression vector GEXdelG5.
Expression primers CC (forward primer, SEQ ID NO:31) and CD (reverse primer, SEQ ID NO:32) were designed to introduce NcoI and BamHI sites for subcloning the B17 ORF2 fragment into the expression vector.
4. Clone A3
Amplified products from the genomic A3 DNA were digested with NcoI/BamHI restriction enzymes and subcloned into the NcoI and BamHI sites of expression vector pGEXdl65 polyHis, using primers BP (SEQ ID NO:18) and BQ (SEQ ID NO:19), respectively. Expression of protein was induced at 0.5 mM IPTG (isopropylthiogalactopyranoside). The expressed protein was of the expected size, and was confirmed to be immunoreactive against Roost pooled sera.
5. Clone C7
PCR amplified products from H. pylori genomic DNA produced a DNA fragment of the expected size, which was then subcloned into the NcoI/BamHI sites of expression vector pGEXdel65 using primers GD100 (SEQ ID NO:22) and GD101 (SEQ ID NO:23), respectively. The expressed protein was of the expected size, and was highly insoluble. The protein was readily purifiable by Ni-NTA column chromatography as described above. Immunogenic screening confirmed the expressed protein to be immunoreactive against Roost pooled sera.
6. Clone Gla
PCR amplified products were subcloned into expression vector pdell65polyHis using primers BW (SEQ ID NO:33) and BX (SEQ ID NO:34) as outlined in Table 5 below. The expressed protein was confirmed to be immunoreactive. However, the expression product was internally cleaved, with cleavage most likely occurring at the transmembrane region.
Expression of additional cloned constructs was similarly carried out.
Table 5a
Primers Used for PCR Expression of DNA Sequences Encoding Antigens of H. pylori
Figure imgf000066_0001
Figure imgf000067_0001
Table 5b
Immunoreactivity of Expressed Recombinant Antigens of H. pylori to H. pylori Infected Sera
Figure imgf000067_0002
Figure imgf000068_0001
All protein expression products were determined to be immunoreactive against Roost pooled sera, except for dHB9.4.2.
Example 5B
Purification of Illustrative Antigenic Proteins Expressed by H. pylori Clones Y175A, Y212A, and Y146B
Clones Y104B, Y175A, Y212A and Y146B were expressed as E. coli recombinant proteins according to the protocol described above. Primer sequences for the E. coli expression clones are presented in the Sequence Listing. The recombinant proteins were purified and their immunogenicity was confirmed according to the general procedure described below.
A pellet from a shake flask was spun and then submitted to differential solubilization. Briefly, the pellet was homogenized in a solution containing PBS/5 mM PMSF, and spun for 60 minutes at 4°C, 30k. Subsequent rounds of homogenization/centrifugation were as follows: (i) 100 mM Tris/2% Triton/2 M urea/5 mM EDTA/0.5 mM DTT (homogenization step)/60 minutes at 4°C, 30k (centrifugation); (ii) PBS pH 7.8/2 M urea/0.5 mM DTT/60 minutes at 4°C, 30k; (iii) PBS pH 7.8/4 M urea/0.5 mM DTT/60 minutes at 4°C, 30k; (iv) PBS pH 8.0/6 M urea/0.5 mM DTT/60 minutes at 4°C, 30k; (v) PBS pH 8.0/6 M urea/2 M guanidine HCl/2 mM BME/60 minutes at 4°C, 30k.
The supernatant from step (iv) above was then dialyzed into PBS pH 8.0/6 M urea/2 mM BME, followed by chromatographic separation using a pre-packed column of Chelating Sepharose (Pharmacia) loaded with five column volumes of 0.2 M NiCl2. A 10 column volume (C.V.) gradient of Buffer A into 100% of Buffer B was utilized, where Nickel IMAC Buffer A contained PBS pH 8.0/6 M urea/2 mM BME and Nickel IMAC Buffer B contained Buffer A and 250 mM imidazole.
The appropriate Nickel IMAC fractions were pooled and dialyzed overnight into PBS pH 8.0/6M Urea/0.5mM DTT, and the final product was then vialed and stored at -80°C.
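For planning which fractions to pool, the imidazole concentration at a given point in the elution gradient can be estimated. The following is a minimal sketch that assumes the gradient is linear (the text states only that a gradient was used) and ignores system dead volume; the function name and parameters are illustrative.

```python
def imidazole_mM(cv_elapsed, gradient_cv=10.0, final_fraction_b=1.0, buffer_b_mM=250.0):
    """Imidazole concentration (mM) after cv_elapsed column volumes of a
    linear gradient from 0% Buffer B to final_fraction_b of Buffer B.

    Defaults follow the values quoted above: a 10 C.V. gradient into 100%
    Buffer B, where Buffer B is Buffer A plus 250 mM imidazole.
    """
    progress = min(max(cv_elapsed / gradient_cv, 0.0), 1.0)  # clamp to the gradient
    return progress * final_fraction_b * buffer_b_mM


for cv in (0, 2.5, 5, 10):
    print(f"{cv} C.V.: {imidazole_mM(cv):.1f} mM imidazole")
```

The same function can describe other gradients by changing the parameters, for example a gradient that stops at 50% of a Buffer B containing 0.5 M imidazole.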
Alternatively, glutathione-sepharose packed columns (Sigma) were utilized.
Fractions were typically analyzed by 1) Pierce Coomassie protein assay, 2) SDS-PAGE (on 12% gel), 3) Western blot using H. pylori Roost Pool #3, 4) GLT H. pylori negative pool and 5) anti-E. coli polyclonal antibodies. The results of these analyses confirmed the immunogenicity of the expressed proteins.
Example 6A
Optimization of Antigen Concentration and Small Scale Serum Paneling for Purified Antigen from Clones dHA22.8 (A22) and dHC1S.11 (C1)
1. Clone dHA22.8
Protein expressed by clone dHA22.8 (corresponding to clone A22) was isolated and purified as follows. 2000 ml of culture pellet was suspended in 200 ml of Buffer B (48 g urea, 1.2 g NaH2PO4, 0.12 g Tris-HCl, and 90 ml deionized water, adjusted to pH 8.0 and brought to a total volume of 100 ml), and sonicated for 10 minutes to effect cell lysis. The sonicated mixture was then rocked at room temperature for 1 hour, and spun at 15,000 rpm for 15 minutes at 10°C to remove the cell debris. To the supernatant was then added 1 ml of Ni-NTA resin (Qiagen GmbH, Hilden, Germany), which was then mixed at room temperature for 1-2 hours. To pellet the resin, the mixture was centrifuged at 5,000 rpm for 15 minutes at 4°C and the supernatant discarded. The resin was washed with 50 ml of Buffer C (48 g urea, 1.2 g NaH2PO4, 0.12 g Tris-HCl and 90 ml deionized water, adjusted to pH 6.3 and brought to a total volume of 100 ml), centrifuged, and the supernatant discarded. The resin was then loaded into a disposable column, and the wash step repeated with the remaining Buffer C. The protein was then eluted with 50 ml of Buffer III (50 mM sodium phosphate, pH 8.0, 300 mM NaCl, 250 mM imidazole). Fractions were collected (2 ml) and analyzed on 12% SDS-PAGE gel.
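The gram quantities given for Buffers B and C can be converted to molar concentrations to make the recipe easier to compare with other denaturing buffers. The sketch below assumes anhydrous NaH2PO4 and uses standard molar masses; the disclosure does not state which hydrate was weighed, so the phosphate figure in particular is an assumption. Names in the code are illustrative.

```python
# Molar masses in g/mol. The anhydrous form of NaH2PO4 is assumed here;
# the text does not say which hydrate was used.
MOLAR_MASS = {
    "urea": 60.06,
    "NaH2PO4": 119.98,
    "Tris-HCl": 157.60,
}


def molarity(grams, molar_mass, volume_ml):
    """Molar concentration (mol/L) of a solute dissolved to a final volume."""
    return grams / molar_mass / (volume_ml / 1000.0)


# Quantities per 100 ml final volume, as given for Buffer B and Buffer C.
recipe_g = {"urea": 48.0, "NaH2PO4": 1.2, "Tris-HCl": 0.12}

for solute, grams in recipe_g.items():
    conc = molarity(grams, MOLAR_MASS[solute], 100.0)
    print(f"{solute}: {conc:.4f} M")
```

Under these assumptions the recipe works out to roughly 8 M urea, 0.1 M sodium phosphate, and a little under 0.01 M Tris-HCl.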
The purified protein was then screened with serum antibodies using conventional techniques (Ausubel, et al., 1988). The purified protein was slotted at concentrations of 0.1-20 μg/ml in 0.1 M carbonate buffer, blocked with 5% skim milk in TBS buffer, and reacted with H. pylori-positive and H. pylori-negative sera diluted 100 times. A serum panel consisting of 36 total sera was used as indicated above. Positive sera were from Roost pool #2 or SFA 001 pool; negatives were from donor packs.
Following a 1 hour incubation period, the strips were washed 3 times with TBS buffer, and incubated with alkaline phosphatase-conjugated anti-human Ig secondary antibodies (Promega Biotech, Madison, WI) diluted 1000 times in 5% skim milk in TBS buffer. The strips were then washed 3 times with TBS wash buffer. Immunoreactive proteins were developed with a substrate, e.g., BCIP (5-bromo-4-chloro-3-indolyl-phosphate) and NBT (nitro blue tetrazolium salt) (Sigma). The development of color indicated the highly antigenic nature of the purified A22 protein (i.e., reactive with H. pylori-positive sera).
The optimum concentration of antigen was determined to be 2.0 mg/ml. The dHA22.8 (A22) expression product reacted with anti-H. pylori antibodies present in both pooled sera sources (Roost pool #2 and SFA 001 pool), and with 100% of the individual H. pylori-positive sera samples tested. The antigen exhibited no detectable cross reactivity with normal sera.
2. Clone dHC1S.11 (C1)
Protein expressed by clone dHC1S.11 (C1) was isolated and purified according to the following protocol.
Protein was extracted from a culture pellet by differential solubilization using a series of homogenization/centrifugation steps. The pellet was (i) homogenized in PBS/1 mM PMSF and centrifuged for 30 minutes at 4°C, 19,000 rpm, followed by separation of the supernatant; (ii) homogenized using 1 M urea/10 mM Tris at pH 8.0/10 mM DTT, followed by centrifugation as in (i) and separation of the supernatant; (iii) homogenized in 4 M urea/10 mM Tris pH 8.0/2 mM BME, followed by centrifugation as in (i) and separation of the supernatant; and finally, (iv) homogenized with 6 M urea/10 mM Tris pH 8.0/2 mM BME, followed by centrifugation as described above. To the combined supernatants was added 150 mM NaCl. The protein was then separated from the combined supernatants by immobilized metal adsorption chromatography (IMAC) using a 5 ml prepacked column containing chelating sepharose (Pharmacia, Chelating Sepharose Fast Flow) loaded with 2 C.V. (column volumes) of 0.2 M NiCl2. The protein was eluted from the column using a 20 C.V. gradient of Buffer A (4 M urea/10 mM Tris at pH 8.0/0.2 mM BME/150 mM NaCl) into 50% Buffer B (Buffer A to which was added 0.5 M imidazole). Fractions were collected (2 ml) and analyzed on 12% SDS-PAGE gel.
The optimal concentration for immunoscreening was determined essentially as described above. Serum paneling as described above was repeated for the purified C1 protein expressed by clone dHC1S.11.
As in the above example, the expressed protein was reactive with both sources of pooled H. pylori-positive sera, and showed no signs of cross reactivity with control sera (H. pylori negative samples, 1 sample of pooled sera and 13 individual samples). Additionally, as can be seen in panels 20-35 and A-C, the recombinant protein exhibited immunoreactivity with 95% of the H. pylori-positive samples.
In summary, for this small sized panel study, A22 and C1 both exhibit high sensitivity and specificity (> 90-95%).
Table 5c
Clinical Data for Sera Used for Serum Paneling of Clones dHA22.8 and dHC1S.11
Figure imgf000071_0001
Figure imgf000072_0001
ND = not available.
Example 6B
Further Sera Paneling with Additional Sera Panels for Purified Antigens: Selection of Preferred Antigens
Additional sera paneling studies were conducted with other sera panels (USC for University of Southern California, and POW for Prince of Wales Hospital) to further examine the sensitivity and specificity of various H. pylori antigens.
Gold standard tests were employed as reference standards to compute the sensitivities and specificities of each individual antigen. Standards used were conventional gold standard tests including histology, CLO test, and the UBT test. UBT is currently the most widely used gold standard for indication of "active infection" by non-invasive methods. Histology, CLO (rapid urease test) and UBT all indicate "active infection", since these tests depend on the presence of bacteria to produce a positive result. Since serology measures both past and present infection, the aim of this study was to (i) identify serological markers that would correlate with UBT results for use as "active infection markers", and (ii) explore both single and multiple antigen combinations to effectively screen for active infection by H. pylori.
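The sensitivity and specificity of an antigen against a reference standard follow the usual definitions: sensitivity is the fraction of reference-positive sera that the antigen detects, and specificity is the fraction of reference-negative sera that the antigen leaves unreactive. A minimal sketch of the calculation is given below; the example data are hypothetical and are not taken from any of the panels described here.

```python
def sensitivity_specificity(antigen_results, reference_results):
    """Return (sensitivity, specificity) of a serological result against a
    reference ("gold standard") result. Both inputs are sequences of booleans,
    one entry per serum, where True means positive."""
    tp = fp = tn = fn = 0
    for antigen_pos, reference_pos in zip(antigen_results, reference_results):
        if reference_pos and antigen_pos:
            tp += 1
        elif reference_pos and not antigen_pos:
            fn += 1
        elif not reference_pos and antigen_pos:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical six-serum panel: reference (e.g., UBT) result and antigen result.
reference = [True, True, True, False, False, False]
antigen = [True, True, False, True, False, False]
print(sensitivity_specificity(antigen, reference))
```

When only reference-positive sera are tested, as in a positives-only panel, the specificity is undefined and the function returns None for it.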
A. Use of UBT as a Gold Standard in Single Antigen Screening
Since different antigens perform differently in different panels (this may reflect geographical differences due to different strains of H. pylori and previous exposure of populations to cross-reacting antigens from other bacteria endemic to the region), the use of single antigen markers for the local population was explored using UBT as a single reference standard.
1. USC Panel. Based upon this panel, the best single antigen selection appeared to be antigen Y261A or Y124A. Both of these antigens provided an overall balance of sensitivity and specificity (> 80% for both sensitivity and specificity for Y261A, and 100% sensitivity with 70% specificity for Y124A). Other preferred antigens, A22 and C1, were 95-100% sensitive and 67-77% specific, respectively, in this panel.
2. POW Panel. Based upon this paneling study, the best single antigen selection appeared to be Y128D12S and Y124A, where sensitivity and specificity were about 75-80% for Y128D12S, and both about 80% for Y124A. Another preferred antigen is Y261A, which exhibited a sensitivity of 70% and a specificity of 80%. For this paneling series, preferred clones A22 and C1 were found to be about 95-98% sensitive, and about 60% specific.
B. Using Other Gold Standards
1. Roost Panel. Only positive sera were tested, and antigens A22 and C1 exhibited 90% sensitivity. Y124A exhibited about 80% sensitivity, and Y261A possessed a sensitivity of approximately 87%.
2. UNSW Panel. In this paneling series, A22 and C1 demonstrated greater than 90-100% specificity. Y124A demonstrated 80% sensitivity and 96% specificity, and Y261A demonstrated 80% sensitivity and 100% specificity.
It was observed that the sensitivity and specificity values do not change significantly whether UBT is used alone or in combination with other gold standards in the USC and POW panels. Thus, the somewhat lower specificity for A22 and C1 may be a reflection of previous exposure to other cross-reacting antigens.
C. 2-4 Antigen Combination Tests
In view of the high sensitivity of A22 and C1 and their somewhat lower specificity for certain panels (POW, USC), a multiple antigen format was explored. The 2-4 antigen format was based on the criterion that, for any 2-3 highly sensitive antigens, both or all 3 must be positive in order to increase the specificity.
This "2 antigen both positive" criterion was applied to a selected 12 antigen set, where the 12 antigens were selected for a sensitivity of at least 40-50% for consideration.
The sensitivities and specificities under the 2 antigen both positive criterion are computed, and the resulting table is examined for good performers. Additionally, 2 antigen combinations with high specificity and lower selectivity are run against the entire two antigen combination matrix to provide a result indicating "2 antigen both positive or 2 other antigen both positive". The final analysis then provides a selection of commonly occurring antigens that give good results across both the USC and POW panels.
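The combination criteria described above can be expressed as a simple rule: a serum is called positive if every antigen in at least one designated group is positive. The following sketch builds the two antigen matrix and evaluates an "either pair" rule. The four-serum data set is hypothetical and was chosen only to show how requiring two positives can raise specificity; the antigen labels are reused from the text purely for readability.

```python
from itertools import combinations


def call_positive(serum, rule):
    """rule is a list of antigen groups; the serum is called positive when
    all antigens in at least one group are positive."""
    return any(all(serum[antigen] for antigen in group) for group in rule)


def performance(sera, reference, rule):
    """Return (sensitivity, specificity) of a combination rule against a reference."""
    calls = [call_positive(serum, rule) for serum in sera]
    tp = sum(c and r for c, r in zip(calls, reference))
    tn = sum((not c) and (not r) for c, r in zip(calls, reference))
    n_pos = sum(reference)
    n_neg = len(reference) - n_pos
    return (tp / n_pos if n_pos else None, tn / n_neg if n_neg else None)


def pair_matrix(sera, reference, antigens):
    """Performance of the 'both positive' criterion for every antigen pair."""
    return {pair: performance(sera, reference, [pair])
            for pair in combinations(antigens, 2)}


# Hypothetical panel: per-antigen reactivity for four sera and their reference status.
SERA = [
    {"A22": True, "C1": True, "Y124A": True},
    {"A22": True, "C1": True, "Y124A": False},
    {"A22": True, "C1": False, "Y124A": False},
    {"A22": False, "C1": False, "Y124A": False},
]
REFERENCE = [True, True, False, False]

print(performance(SERA, REFERENCE, [("A22",)]))
for pair, result in pair_matrix(SERA, REFERENCE, ["A22", "C1", "Y124A"]).items():
    print(pair, result)
print(performance(SERA, REFERENCE, [("A22", "Y124A"), ("A22", "C1")]))
```

In this toy data set the single antigen gives a false positive that the both-positive rule removes, while the either-pair rule recovers sensitivity lost by a stricter pair.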
D. Results of the 2-4 Antigen Combination Test
For the USC panel, there are 15 antigenic combinations that provide around 95% sensitivity and 100% specificity.
For the POW panel, 10 different antigen combinations were shown to provide greater than 70% sensitivity and around 80-90% specificity. Based upon these findings, exemplary combination selections for both panels are as follows:
(a) C1 and Y124A both positive (sensitivity of 76.2% and 100% and specificity of 89.1% and 95.2% for the POW and USC panels, respectively), and
(b) C1 and Y124A and A22 all positive (sensitivity of 71.4% and 94.7% and specificity of 91.3% and 95.2% for the POW and USC panels, respectively).
For the USC panel, an antigen combination of C1 and c5 was preferred, and for the POW panel, antigen combinations Y261A and Y124A, or C7 and B2, also were preferred.
Thus, based upon these sera paneling data, preferred antigens for use in reliably and universally detecting H. pylori infection include but are not limited to the following: A22, C1, Y124A, Y261A, c5, C7, B2, Y104B, and Y128D.
Table 5d
Hp Recombinant Antigen Sensitivities
Figure imgf000074_0001
Figure imgf000075_0001
Table 5e
Hp Recombinant Antigen Specificities
Figure imgf000075_0002
Figure imgf000076_0001
A graphical summary of sensitivity and specificity data for representative purified H. pylori antigens in different sera panelling studies is presented in Figs. 65 and 66.
Example 7
Purification of 36 kD and "Spot 15" Antigens Produced by H. pylori
Cultures of H. pylori (ATCC No. 43504) were grown under appropriate conditions and the cells harvested into phosphate buffered saline. This was followed by repeated centrifugation to remove cell debris and other contaminants. The resulting pellet was then lysed in a French press at > 10,000 PSI, followed by centrifugation at 10,000 x g for 25 minutes.
Prefractionation of soluble H. pylori lysate supernatant was carried out only when enriching for spot 15 (high pH), for subsequent sequence/structural investigations (e.g. , mass spectrometry, peptide map).
Fractionation of the 36 kD protein was achieved by dialysis to 10 mM Tris buffer, at pH 8.0, 50 mM NaCl, followed by gradient elution over anion exchange resin, Resource Q (Pharmacia Biotechnology, Piscataway, NJ). SDS-PAGE/Western blot analysis indicated a Western-positive doublet at 30 kD and a positive 36 kD band. Both fractions were determined to be highly immunoreactive when tested against Roost pooled sera.
Fractionation of the spot 15 antigen was carried out in a similar fashion. The sample was passed over Sephacryl S-100 ion exchange resin (Pharmacia Biotechnology, Piscataway, NJ) (to 10 mM phosphate buffer, 50 mM NaCl, pH 8.0), followed by further fractionation by gradient elution over Resource S cation exchange resin (Pharmacia Biotechnology, Piscataway, NJ). The fraction containing spot 15 antigen was analyzed by SDS-PAGE and Western blot, and was similarly confirmed to be highly antigenic in nature, as indicated by Western blot and immunoreactivity with Roost pooled sera.
Example 8
Characterization of 36 kD and Spot 15 H. pylori Antigens
Following fractionation of the antigenic proteins as described in Example 7 above, the isolated fractions were further characterized by two-dimensional electrophoresis, carried out according to the methods of O'Farrell (1975, 1977).
1. 2-Dimensional Electrophoresis Under Low pH Conditions
Two-dimensional electrophoresis was performed essentially as described by O'Farrell (1975). Isoelectric focusing was carried out in glass tubes having an inner diameter of 2.0 mm, using 2% ampholines (BDH, Hofer Scientific Instruments, San Francisco, CA) for 9600 volt-hrs. The final tube gel pH gradient as measured by a surface pH electrode is on the enclosed pH gradient form.
Following equilibration for 10 minutes in Buffer O (10% glycerol, 50 mM DTT, 2.3% SDS, and 62.5 mM Tris buffer, pH 6.8), the tube gel was sealed to the top of the stacking gel, which was placed on top of a 10% acrylamide slab gel (0.75 mm thick). SDS slab gel electrophoresis was carried out for 4 hours at 12.5 mA/gel. The slab gels were then fixed in a solution of 10% acetic acid/50% methanol overnight. The following proteins were added as molecular weight markers to the agarose which sealed the tube gel to the slab gel: myosin (220 kD), phosphorylase A (94 kD), catalase (60 kD), actin (43 kD), carbonic anhydrase (29 kD), and lysozyme (14 kD). These standards appear as horizontal lines on the silver-stained (Oakley, et al., 1980) 10% acrylamide slab gels. The silver-stained gel was dried between sheets of cellophane paper with the acid edge to the left.
2. Western Blot
Following slab gel electrophoresis, a duplicate gel was transferred to transfer buffer (12.5 mM Tris, pH 8.8, 86 mM glycine, 10% methanol) and transblotted onto PVDF paper overnight at 200 mA (approximately 50 volts/gel). The blot was blocked for 2 hours in 2% bovine serum albumin (BSA) in TTBS (Tween-Tris-Buffered Saline), rinsed in TTBS, incubated for 2 hours in primary antibody from Roost pool or negative serum pool diluted 1:2500 in 1% BSA/TTBS, rinsed in TTBS, and placed in a solution containing secondary antibody (anti-human IgG horseradish peroxidase, diluted 1:5000 in TTBS) for 1 hour. The blot was rinsed with TTBS, treated with ECL (Amersham Corporation, Arlington Heights, IL), and exposed to X-ray film.
A computer-generated photograph of an exemplary stained membrane containing antigenic proteins from H. pylori as described above is shown in Figure 2. The "normal" human sera was confirmed to be H. pylori-negative using "HELICOBLOT 2.0" (Genelabs Diagnostics (PTE) Ltd., Singapore). The identities of the spots indicated numerically on the gel were determined by protein sequencing (described in Example 9 below).
3. 2-Dimensional Electrophoresis Under High pH Conditions
Two-dimensional electrophoresis adapted for the resolution of basic proteins was performed according to the method of O'Farrell (O'Farrell, et al., 1977).
Nonequilibrium pH gradient electrophoresis using 1.5% pH 3.5-10 and 0.25% pH 9-11 ampholines (Pharmacia Biotechnology, Piscataway, NJ) was carried out at 140 volts for 12 hours.
Purified tropomyosin, with a lower spot molecular weight of 33 kD and a pI of 5.2, and purified lysozyme, with a molecular weight of 14 kD and a pI of 10.5-11 (Merck Sharp & Dohme, Philadelphia, PA), were added to the samples as internal pI markers.
After equilibration for 10 minutes in buffer (10% glycerol, 50 mM DTT, 2.3% SDS and 62.5 mM Tris, pH 6.8), the tube gel was sealed to the top of the stacking gel, which was placed on top of a 10% acrylamide slab gel (0.75 mm thick), and SDS slab gel electrophoresis was carried out for 4 hours at 12.5 mA/gel. The slab gels were fixed in a solution of 10% acetic acid/50% methanol overnight. The following proteins were added as molecular weight standards to the agarose which sealed the tube gel to the slab gel: myosin (220 kD), phosphorylase A (94 kD), catalase (60 kD), actin (43 kD), carbonic anhydrase (29 kD) and lysozyme (14 kD). These standards appear as horizontal lines on the silver-stained (Oakley, et al., 1980) 10% acrylamide slab gels. The silver-stained gel was dried between sheets of cellophane paper with the acid edge to the left.
A duplicate gel was transblotted onto PVDF paper and Western blotting carried out as described in 8.1.b. above.
A computer-generated photograph of an exemplary Western blotted membrane containing antigenic proteins from H. pylori as described above is shown in Figure 1. The identities of the spots indicated numerically on the gel were determined by protein sequencing (as described in Example 9 below).
Example 9
Isolation and Protein Sequence Determination for Western-Positive Spots
1. N-terminal Sequencing
The PVDF blots from Example 8 above were stained with Coomassie brilliant blue. Spots corresponding to Western-positive bands were excised by scalpel and sequenced directly using a Hewlett-Packard G1005A N-terminal sequencer with conventional sequencing techniques (e.g., Miller, 1994; Spiecher, 1989). The sequencing techniques employed gave high repetitive yields (typically ranging from 93-98%), with a detection limit of approximately 100-200 fmol.
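The repetitive yield and detection limit quoted above together bound how many residues can usefully be read from a given amount of protein. The sketch below uses a deliberately simplified model in which the signal falls by the repetitive yield at each cycle; it ignores initial coupling yield, background, and lag, and the 10 pmol starting amount is a hypothetical value chosen for illustration.

```python
def remaining_signal(initial_fmol, repetitive_yield, cycles):
    """Signal (fmol) remaining after a number of Edman cycles, assuming a
    constant fractional repetitive yield per cycle."""
    return initial_fmol * repetitive_yield ** cycles


def readable_cycles(initial_fmol, repetitive_yield, detection_limit_fmol):
    """Largest number of cycles after which the modeled signal is still at
    or above the detection limit."""
    cycles = 0
    signal = initial_fmol
    while signal * repetitive_yield >= detection_limit_fmol:
        signal *= repetitive_yield
        cycles += 1
    return cycles


# Hypothetical 10 pmol (10,000 fmol) sample, 200 fmol detection limit,
# evaluated across the 93-98% repetitive yield range quoted in the text.
for yield_fraction in (0.93, 0.95, 0.98):
    print(yield_fraction, readable_cycles(10_000, yield_fraction, 200))
```

The model shows how strongly read length depends on repetitive yield, which is why the quoted 93-98% range matters in practice; real read lengths are typically shorter than this idealized bound.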
2. Internal Sequencing
Four different methods were utilized to obtain internal sequence information for the above-described isolated antigens of H. pylori (Allen, 1981): Lys-C peptide map and sequencing, CNBr peptide map and sequencing, OPA/CNBr peptide map and sequencing, and LC-MS/MS sequencing of Lys-C digests.
a. Cyanogen Bromide Cleavage. Cyanogen bromide (CNBr) cleavage was performed on PVDF membranes in 70% formic acid (Crimmins and Mische, 1996). The cyanogen bromide digested peptides were then either repurified by capillary HPLC and sequenced directly, or subjected to ortho-phthalaldehyde (OPA) modification.
b. Ortho-phthalaldehyde (OPA) Modification. Digested peptides were subjected to OPA modification according to the method of Bauer (Bauer, et al., 1984). This reagent modifies primary, but not secondary, amines, thereby allowing the identification of sequences having proline as their N-terminal residue and the silencing of all other sequences as determined by Edman-type N-terminal sequence analysis.
c. Liquid Chromatography/Mass Spectrometry. Internal sequence information was also obtained from liquid chromatography/mass spectrometric based analysis. Copper stained gel slices (Nakayama, et al., 1996) and PVDF-transferred proteins were first pre-treated by extraction, preliminary purification using Sep-Pak solid phase extraction (Millipore Corporation, Bedford, MA), and protease (Lys-C) digestion.
Liquid chromatography-mass spectrometry (LC-MS) was performed on digested proteins to assign internal sequence residues by peptide mass and ion trapping techniques (McAtee, et al., 1996). The digests were chromatographed on a Vydac C18 reverse phase microbore column (150 mm x 1 mm) using an ABI Model 410 B dual syringe pumping system (Applied Biosystems Division, Perkin Elmer, Foster City, CA). The flow rate was maintained at 50 microliters/minute and elution was carried out using a linear gradient from 0.1% aqueous TFA to 0.1% TFA in acetonitrile. A Carlo Erba Phoenix 20 CU pump was used to deliver a mixture of methoxyethanol and isopropanol (1:1, v/v) at a rate of 50 microliters per minute, which was combined with the column eluent in a post column mixing chamber. An in-line flow splitter was used to restrict flow to the mass spectrometer to approximately 10 microliters/minute. Detection was performed immediately following elution from the column at 214 nanometers using an ABI 759 variable wavelength detector. Mass spectrometric detection was achieved following post column solvent addition and flow splitting by a VG BioQ triple quadrupole mass spectrometer with a nano-electrospray ion source. Spectra were recorded in the positive ion mode using electrospray ionization. Calibration of the instrument was performed in the range m/z 500-2000 by using direct injection analysis of myoglobin. Spectra were recorded at 1.5 second intervals and a drying gas of nitrogen was used to aid evaporation of the solvent. The capillary voltage was maintained at approximately 4 kV with a source temperature of 60°C.
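Myoglobin is a convenient calibrant for this range because electrospray ionization in positive ion mode produces a series of multiply protonated ions whose m/z values are spread across the window. The sketch below lists the charge states expected between m/z 500 and 2000. It assumes horse heart myoglobin with an average mass of about 16951.5 Da; the disclosure does not specify the species or the mass used, so that value is an assumption, as are the function names.

```python
PROTON_MASS = 1.00728  # Da

# Average mass of horse heart apomyoglobin (assumed calibrant; the text
# says only "myoglobin").
MYOGLOBIN_MASS = 16951.5  # Da


def mz(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a given neutral mass and charge state."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_states_in_window(neutral_mass, mz_min, mz_max, max_charge=100):
    """Charge states whose m/z falls inside the window [mz_min, mz_max]."""
    return [z for z in range(1, max_charge + 1)
            if mz_min <= mz(neutral_mass, z) <= mz_max]


states = charge_states_in_window(MYOGLOBIN_MASS, 500.0, 2000.0)
for z in states:
    print(f"z = {z:2d}  m/z = {mz(MYOGLOBIN_MASS, z):8.2f}")
```

Under the stated assumption, charge states from 9+ to 33+ fall within the m/z 500-2000 window, although the ions actually observed depend on solvent and source conditions.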
3. Sequencing Results
Utilizing the approaches described above, the spots indicated in the Western Blots (Figs. 1 and 2) were identified. The identities of the spots (where known) are presented in Table 6 below.
Table 6
H. pylori Antigens Reactive With Sera From H. pylori Infected Humans
Figure imgf000080_0001
Figure imgf000081_0001
Figure imgf000082_0001
Referring to the low pH blot presented as Fig. 2, spots 9, 11, 12, 13, 15, 16 (major and minor), and 17 represent unique antigens. Spot 9 represents the native H. pylori antigen corresponding to the recombinant protein expressed by ORF2 of clone al, while spot 12 represents the native antigen that corresponds to the antigenic protein encoded by clone a5.
Looking at the high pH Western blot in Fig. 1, spots 9 (major and minor), 10, 12, 13, and 15 (major and minor) represent unique antigens. As will become apparent from the results below, it was later determined that the N-terminus of the spot 15 antigen (high pH) was blocked; thus, the N-terminal sequence determined for spot 15 by 2-dimensional gel analysis was later shown to be inaccurate.
Example 10
Sequence Analysis of the 36 kD Antigenic
Protein and Spot 15 Antigen Isolated from
Various Sources of H. pylori
The H. pylori spot 15 antigen as described above was isolated from 3 different sources of H. pylori as described in Examples 8 and 9. The H. pylori samples used were the following: H. pylori, ATCC 43504 (Australian type strain); H. pylori, strain # 26695, and H. pylori, J-170, (both obtained from Washington University School of Medicine, St. Louis, MO). The corresponding protein isolated from each source was then analyzed by reverse phase high performance liquid chromatography (HPLC) as described in Example 9.
Minor differences in the native spot 15 proteins from the various H. pylori strains were observed, particularly in the early-eluting peak regions, and the spot 15 antigens derived from the TIGR and ATCC H. pylori sources appear to be closely related. The differences included minor amino acid changes, deamidations, and perhaps a side chain modification, as suggested by a high molecular weight (e.g., lipopolysaccharide or glycosylation) modification in the TIGR-derived sample. The J-170 derived spot 15 protein appears to contain additional high molecular weight modifications, as indicated by the differences in molecular weight.
However, all peaks exhibited strong antigenicity when screened against Roost pooled sera, indicating that any minor differences in amino acid sequence among the native proteins do not adversely affect the high immunoreactivity of the recovered spot 15 antigen.
Based upon the characterization data described above, the spot 15 antigen appears to be a processed product of the putative 36 kD protein, as shown in Fig. 4.
A MALDI-TOF MS comparison of peptides generated from an in situ Lys-C digest of "Spot 15" from the above-described strains of H. pylori was carried out. In examining the total digest, the data indicated that strain 26695 and strain 43504 are more conserved in their primary sequence, and possibly in their post-translational modification profiles, than are strains 43504 and J-170, or strains 26695 and J-170. These data, along with those presented above, suggest that although immunoreactivity to Spot 15 is clearly present in all strains examined, there are differences in amino acid content and protein modifications between Spot 15 in the various strains. The Lys-C digests were also examined by reversed-phase HPLC (RP-HPLC). Not all of the peptides from the total digest were observed by RP-HPLC; however, this was attributed to the presence of extremely hydrophilic and/or hydrophobic peptides that are not detectable within the operating parameters of the reverse phase column. Based upon an examination of RP-HPLC elution profiles, the peptide mass fingerprints appeared to be more conserved between strains than was indicated by the MALDI-TOF MS results. Of the two approaches, MALDI-TOF and other forms of MS (electrospray, fast atom bombardment, etc.) would be expected to provide the more accurate structural determination.
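One way to put a number on how "conserved" two Lys-C peptide mass lists are is to count peaks that agree within a mass tolerance. The sketch below is illustrative only and is not the comparison procedure actually used; the scoring rule, the 1 Da tolerance, and the three peak lists are hypothetical placeholders rather than measured values.

```python
def count_matched(masses_a, masses_b, tol):
    """Count masses in A that have at least one partner in B within tol (Da)."""
    return sum(any(abs(a - b) <= tol for b in masses_b) for a in masses_a)


def peak_list_similarity(masses_a, masses_b, tol=1.0):
    """Symmetric score in [0, 1]: matched peaks divided by total peaks."""
    total = len(masses_a) + len(masses_b)
    if total == 0:
        return 0.0
    matched = count_matched(masses_a, masses_b, tol) + count_matched(masses_b, masses_a, tol)
    return matched / total


# Hypothetical peak lists (Da) for three strains; NOT data from Tables 7 or 8.
strain_x = [1000.5, 1500.8, 2200.1]
strain_y = [1000.9, 1501.0, 3000.0]
strain_z = [1200.0, 2200.3, 3100.0]

score_xy = peak_list_similarity(strain_x, strain_y)
score_xz = peak_list_similarity(strain_x, strain_z)
print(f"x vs y: {score_xy:.2f}, x vs z: {score_xz:.2f}")
```

A higher score for one pair of strains than for another would correspond to the qualitative statement that the first pair is more conserved.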
Table 7
Comparison of Molecular Masses of Spot 15 Lys-C Digests by MALDI-TOF MS
Figure imgf000083_0001
Figure imgf000084_0001
Table 8
Comparison of Molecular Masses of Spot 15 Lys-C Digests Chromatographed on C18 Reverse-Phase HPLC
Figure imgf000084_0002
Figure imgf000085_0001
Example 11
Antigenic Reactivity Between Various Strains of H. pylori Separated by 2-D Electrophoresis
Various strains of H. pylori were examined to determine the extent to which the proteins identified in ATCC strain 43504, as described in Examples 7 and 8 and shown in Fig. 1, are conserved among other strains of H. pylori.
Exemplary H. pylori strains tested were the following: Chico (clinical isolate from Oroville Community Hospital), TIGR 26695, 96-212 (Alaskan isolate, from Aleut), 4655-1 (Gambian child), J-170, A-lc (Lithuanian isolate, which contains at least 6 kb of DNA, referred to as the "X" segment, as an insertion near the cag region and not present in C-3c), Rus-95 (isolated from a Russian immigrant to the United States), #9 (Peruvian isolate), and C-3c (Lithuanian isolate).
These various strains of H. pylori were purified, fractionated by 2D electrophoresis, and Western blotted as described for strain ATCC No. 43504 in Examples 7 and 8, to determine whether the immunoreactive spots identified for strain 43504 in Figs. 1 and 2 were conserved between strains.
The results are summarized in Tables 9 and 10 below.
Table 9
Comparison of Helicobacter Strains by 2D-PAGE (pH 8-13)
Figure imgf000086_0001
Table 10
Comparison of Helicobacter Strains by 2D-PAGE (pH 4-8)
Figure imgf000086_0002
Figure imgf000087_0001
As can be seen from the above, the immunogenic proteins isolated from H. pylori strain ATCC 43504 are highly conserved between various strains of H. pylori. One of the preferred antigens of the invention, the spot 15 antigen (high pH), was observed in 100% of the representative strains examined. Western blot results further supported the highly antigenic nature of this protein.
Example 12
MALDI-TOF Mass Spectrometry Based Mapping of Antigens of H. pylori
Gel pieces stained with Coomassie or compatible silver stain as described in Example 9 were transferred to a microcentrifuge tube and rehydrated with 10 microliters of water. The gel pieces were then washed 3 times with 500 microliters 50% acetonitrile/0.05 M Tris-HCl, pH 8.5 for 20 minutes.
The supernatants were discarded and the washed pieces were dried for 30 minutes in a Speed-Vac concentrator. Five microliters of a solution containing 0.05 micrograms Lys-C was added to the tube and incubated for 20 hours at 32°C. Following digestion, the gel pieces were extracted three times with 30 microliters 50% acetonitrile/0.1% TFA. The supernatants were transferred to a 0.5 ml microcentrifuge tube and dried. The extracted peptides were redissolved in 4 microliters 4-hydroxy-alpha-cyanocinnamic acid and a 0.8 microliter sample was spotted onto a MALDI sample plate. MALDI mass spectrometric analysis was then performed on a PerSeptive BioSystems Voyager DE mass spectrometer. Peptide mass peaks were then compared to data within the PIR, NRDB, GenBank, EMBL, TIGR H. pylori, and Swiss-Prot databases using the MS-FIT program (UCSF).
The results of matching mass spectral profile peaks of particular Lys-C digested proteins with database information are summarized in Tables 11 and 12 below.
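The database comparison described above rests on predicting, for each candidate protein, the masses of the peptides that a Lys-C digest would produce. The sketch below illustrates that idea under simplifying assumptions: cleavage after every lysine with no missed cleavages, unmodified residues, and monoisotopic singly protonated masses. It is not the MS-FIT algorithm, and the function names are hypothetical. The test fragment is the first 26 residues of SEQ ID NO: 2 written in one-letter code.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # charge carrier for the singly protonated ion


def lys_c_digest(sequence):
    """Split a one-letter protein sequence after every Lys (K)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "K":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without Lys
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


# First 26 residues of SEQ ID NO: 2 (Met Lys Lys Ile Ile Leu ... Trp Tyr Gly Lys).
FRAGMENT = "MKKIILACLMAFVGANLSAEPKWYGK"
peptides = lys_c_digest(FRAGMENT)
for p in peptides:
    print(f"{p:<22s} {mh_plus(p):10.4f}")
```

Predicted masses of this kind can then be matched against observed peaks within an instrument-dependent tolerance; very short peptides such as single residues typically fall below the useful mass range.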
Table 11 MALDI-TOF Peptide Mass Mapping, pH 4-8 proteins
Figure imgf000088_0001
Table 12 MALDI-TOF Peptide Mass Mapping, pH 8-13 Proteins
Figure imgf000089_0001
Example 13
Antigens for Use in Vaccines
Representative antigens described herein were evaluated as vaccine candidates on the basis of Western blot analyses (as described above) of sera obtained from patients prior to and after antimicrobial treatment.
Briefly, prior to treatment and twelve months after antimicrobial treatment, the titres of antibodies reactive against the cloned antigens were determined. Antigens corresponding to antibodies whose titre remained high for an extended period of time subsequent to treatment were determined to be good vaccine candidates. Two serum panels, the Gasbarrini panel and the Greenberg panel, were used for these analyses. Sera from patients were obtained prior to treatment and 12 or 24 months after treatment. Patients who tested positive by the UBT after treatment were not included in this analysis, due to active infection by H. pylori. The Gasbarrini panel was obtained from male and female patients in Italy, aged 18-70, who were diagnosed with H. pylori infection by endoscopy and UBT. Prior to antimicrobial treatment and 12 months after treatment, serum was collected from the patients.
The Greenberg panel was obtained from male and female patients in California, also aged 18-70, who were diagnosed as having H. pylori infection, as confirmed by antibody tests. Prior to antimicrobial treatment and 24 months after treatment, serum was collected from the patients.
Table 13 summarizes the data obtained from the Gasbarrini panel. The numbers and percentages of patients who exhibited high antibody titre against the indicated antigens at 12 months are indicated therein. Twelve months after treatment, a high percentage of patients continued to exhibit high antibody titre against clones Y139, Y146B, Y175A, and A22.
Table 14 summarizes the data obtained from the Greenberg panel. The numbers and percentages of patients who exhibited high antibody titre against the indicated antigens at 24 months are shown. Twenty-four months after treatment, a high percentage of patients continued to exhibit high antibody titre against clones Y184A, Z9A, Y261Ains, and Y146B. Since antigens which evoke a long-lasting antibody response are considered to be good vaccine candidates, and based upon the results provided in the Tables below, the following antigens are considered to be preferred vaccine candidates: Y139, Y146B, Y175A, A22, Y184A, Z9A, and Y261Ains.
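The selection criterion described above, namely the proportion of patients whose antibody titre against an antigen remains high after treatment, can be expressed as a short calculation. This is a sketch under stated assumptions: the text gives no numerical cutoff, so the 50% threshold is arbitrary, and the antigen labels and patient counts below are hypothetical placeholders, not values from Table 13 or Table 14.

```python
def percent_persistent(counts):
    """Map antigen -> percentage of patients still showing high titre.

    counts maps an antigen label to (patients with high titre at follow-up,
    total patients tested); antigens with no patients tested are skipped.
    """
    return {
        antigen: 100.0 * still_high / total
        for antigen, (still_high, total) in counts.items()
        if total > 0
    }


def select_candidates(counts, cutoff_percent):
    """Return antigens at or above the cutoff, most persistent first."""
    pct = percent_persistent(counts)
    kept = [antigen for antigen, p in pct.items() if p >= cutoff_percent]
    return sorted(kept, key=lambda antigen: pct[antigen], reverse=True)


# Hypothetical counts for illustration only.
example_counts = {
    "antigen_1": (8, 10),
    "antigen_2": (3, 10),
    "antigen_3": (9, 10),
}
selected = select_candidates(example_counts, cutoff_percent=50.0)
print(selected)
```

In practice the cutoff would be chosen with reference to the panel size and the follow-up interval (12 or 24 months).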
Table 13 Gasbarrini Panel
Figure imgf000090_0001
Figure imgf000091_0001
Table 14 Greenberg Panel
Figure imgf000091_0002
While the invention has been described with reference to specific methods and embodiments, it will be appreciated that various modifications and changes may be made without departing from the invention.
SEQUENCE LISTING
(1) GENERAL INFORMATION
(i) APPLICANT: Genelabs Technologies, Inc.
(ii) TITLE OF THE INVENTION: Antigenic Composition and Method of Detection for Helicobacter pylori
(iii) NUMBER OF SEQUENCES: 602
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Dehlinger & Associates
(B) STREET: P.O. Box 60850
(C) CITY: Palo Alto
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94306
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: not yet assigned
(B) FILING DATE: 25-APR-1998
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 60/045,107
(B) FILING DATE: 25-APR-1997
(A) APPLICATION NUMBER: 60/061,958
(B) FILING DATE: 14-OCT-1997
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Evans, Susan T
(B) REGISTRATION NUMBER: 38,443
(C) REFERENCE/DOCKET NUMBER: 4600-0126.41
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 650-324-0880
(B) TELEFAX: 650-324-0960
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1459 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: f8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
AGCTTTCAAG GGGGATAAAA TATAGATTAA GGCTAGTGAG CTAAAGGCCA CTTTTATTGA 60
TACGGATAAA GTTTATGTGC TTCTAAGAAT CACTAAGAAG CATGTCGCTT TAATGAACGA 120
GTAAGGATTA ATAATGAAAA AGATTATTCT TGCATGCCTT ATGGCTTTTG TGGGTGCTAA 180
TTTAAGCGCG GAGCCTAAGT GGTATGGCAA GGCCTATAAC AAAACAAACA CCCAAAAAGG 240
CTATCTTTAT GGGAGTGGTT CAGCCACTTC TAAAGAAGCT TCTAAACAAA AAGCGTTAGC 300
GGATTTAGTG GCGTCTATTA GCGTGGTGGT TAATTCCCAA ATCCATATCC AAAAAAGTCG 360
TGTGGATAAT AAGTTAAAAT CCAGTGATTC GCAAACGATT AACTTAAAGA CCGATGACTT 420
GGAATTGAAT AATGTAGAAA TTGTCAATCA AGAAGCGCAA AAAGGGATCT ACTACACCAG 480
AGTAAGGATC AATCAAAACT TGTTTTTGCA GGGTTTAAGG GATAAGTATA ACGCTCTTTA 540
TGGGCAGTTT TCCACCTTAA TGCCTAAGGT TTGTAAAGGG GTTTTTTTAC AGCAATCCAA 600
GAGCATGGGG GATTTATTGG CTAAAGCGAT GCCTATAGAA AGGATTTTAA AAGCGTATTC 660
TGTTCCGGTG GGTTCGTTAG AAAATTATGA AAAAATCTAT TATCAAAACG CTTTCAAACC 720
TAAAGTGCAA ATCACTTTTG ATAACAACAG CGATACAGAG ATTAAAAACG CTCTCATAAG 780
CGCTTATGCC AGAGTGCTAA CCCCTAGCGA TGAAGAAAAA CTCTATCAAA TCAAAAATGA 840
AGTTTTCACA GACAGCGCTA ATGGCACCAC ACGCATTAGA GTGGTCGTTA GCGCGAGCGA 900
TTGTCAAGGC ACGCCCGTAC TGAATAGAAG CCTTGAAGTG GATGAAAAGA ATAAGAATTT 960
TGCTATCACG CGCTTGCAAT CTTTGCTTTA TAAAGAACTG AAAGATTATG CCAATAAAGA 1020
AGGGCAAGGC AATACGGGGT TATAAGCAGG ATATTAATCT TGCTTGCACT TTTGGTTATT 1080
GATTCAGTCC GGTAGCGTTA TCATTTTGTT TGGCGTGGCG TTTTCGCACC CATTAAAATA 1140
ACTCTCTAAA GGTTTGTGGC TTTTGTTCTC GCCACAAATA AACTAAATAT CATTCAATAC 1200
TCTTTATCCC ACAAGCTCAT CTAAAACCAC ACCCGCTAAA AACTAAAATT AACAAAAACT 1260
AAAATCTTTT TTAAGAGCCT ACACGAGCGA GCAAAAAGAA TGACAATCAA TAAAAACGAA 1320
TTTTACAACA ATTTTAACAA CTTGGGTGCT CTCACAATCT ATTACGCTTT GCATGGATTG 1380
CATAAAGAAC CTCTCTTAAT ACAATCTTTA TTTTTTTAAA ACCCTGATTT TAGCGCTCAT 1440
TAAATCGTGG TTTAAAGCT 1459
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 303 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: f8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Lys Lys Ile Ile Leu Ala Cys Leu Met Ala Phe Val Gly Ala Asn
1 5 10 15
Leu Ser Ala Glu Pro Lys Trp Tyr Gly Lys Ala Tyr Asn Lys Thr Asn
20 25 30
Thr Gln Lys Gly Tyr Leu Tyr Gly Ser Gly Ser Ala Thr Ser Lys Glu
35 40 45
Ala Ser Lys Gln Lys Ala Leu Ala Asp Leu Val Ala Ser Ile Ser Val
50 55 60
Val Val Asn Ser Gln Ile His Ile Gln Lys Ser Arg Val Asp Asn Lys
65 70 75 80
Leu Lys Ser Ser Asp Ser Gln Thr Ile Asn Leu Lys Thr Asp Asp Leu
85 90 95
Glu Leu Asn Asn Val Glu Ile Val Asn Gln Glu Ala Gln Lys Gly Ile
100 105 110
Tyr Tyr Thr Arg Val Arg Ile Asn Gln Asn Leu Phe Leu Gln Gly Leu
115 120 125
Arg Asp Lys Tyr Asn Ala Leu Tyr Gly Gln Phe Ser Thr Leu Met Pro
130 135 140
Lys Val Cys Lys Gly Val Phe Leu Gln Gln Ser Lys Ser Met Gly Asp
145 150 155 160
Leu Leu Ala Lys Ala Met Pro Ile Glu Arg Ile Leu Lys Ala Tyr Ser
165 170 175
Val Pro Val Gly Ser Leu Glu Asn Tyr Glu Lys Ile Tyr Tyr Gln Asn
180 185 190
Ala Phe Lys Pro Lys Val Gln Ile Thr Phe Asp Asn Asn Ser Asp Thr
195 200 205
Glu Ile Lys Asn Ala Leu Ile Ser Ala Tyr Ala Arg Val Leu Thr Pro
210 215 220
Ser Asp Glu Glu Lys Leu Tyr Gln Ile Lys Asn Glu Val Phe Thr Asp
225 230 235 240
Ser Ala Asn Gly Thr Thr Arg Ile Arg Val Val Val Ser Ala Ser Asp
245 250 255
Cys Gln Gly Thr Pro Val Leu Asn Arg Ser Leu Glu Val Asp Glu Lys
260 265 270
Asn Lys Asn Phe Ala Ile Thr Arg Leu Gln Ser Leu Leu Tyr Lys Glu
275 280 285
Leu Lys Asp Tyr Ala Asn Lys Glu Gly Gln Gly Asn Thr Gly Leu
290 295 300
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1023 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: a5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
AGCTTTTAAA AGAAAGGGTT GAGGGGCAGT TGTTTTTAGA AAATAAAGCC AGGCTCTATA 60
ATGAAGAGTT GAAAGAAAAA TTGATTGAAA ATTTAGATGA AAAGATTGTT TTTGATTTGC 120
CTAAAACGAT CATAGAGCAA GAAATGGATT TGTTGTTCAG GAACGCTCTT TATTCCATGC 180
AAGCTGAGGA AGTCAAATCC TTACAAGAAA GTCAAGAAAA AGCCAAAGAA AAGCGTGAGA 240
GCTTTAGGAA CGATGCCACA AAAAGCGTGA AAATCACTTT TATCATTGAC GCTTTAGCGA 300
AAAGAAGAAA AATTGGCGTG CATGACAATG AAGTCTTTCA AACCTTGTAT TATGAAGCGA 360
TGATGACAGG GCGAAACCCA GAAAGTCTCA TTGAACAATA CCGCAAAAAT AACATGTTAG 420
CGGCGGTGAA AATGGCGATG ATTGAAGATA GGGTGTTAGC TTATTTGTTG GATAAAAACC 480
TGCCTAAAGA GCAACAAGAA ATTTTGGAAA AAATGAGGCC CAACGCTCAA AAAATTCAAG 540
CGGGTTAAAC GGCTAAAAAG GAGAGATGAT GGGATACATT CCTTATGTAA TAGAGAATAC 600
CGATCGTGGG GAGCGCAGCT ATGATATTTA CTCGCGCCTT TTAAAGGATC GCATTGTTTT 660
ATTGAGCGGT GAGATTAACG ATAGCGTGGC GTCTTCTATC GTGGCCCAAC TCTTGTTTTT 720
GGAAGCTGAA GACCCTGAAA AAGACATTGG CTTGTATATC AATTCTCCCG GTGGGGTGAT 780
AACAAGCGGT CTTAGCATCT ATGATACCAT GAATTTTATC CGCCCTGATG TTTCCACGAT 840
TTGCATCGGT CAAGCGGCTT CTATGGGGGC GTTTTTACTG AGCTGTGGGG CTAAGGGCAA 900
GCGCTTTTCA CTACCCCATT CAAGGATTAT GATCCACCAG CCTTTAGGGG GGGCTCAAGG 960
GCAAGCGAGC GATATTGAAA TCATTTCTAA CGAGATCCTT AGGCTTAAGG GTTTGATGAA 1020
TTC 1023
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 151 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: a5ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Gly Tyr Ile Pro Tyr Val Ile Glu Asn Thr Asp Arg Gly Glu Arg
1 5 10 15
Ser Tyr Asp Ile Tyr Ser Arg Leu Leu Lys Asp Arg Ile Val Leu Leu
20 25 30
Ser Gly Glu Ile Asn Asp Ser Val Ala Ser Ser Ile Val Ala Gln Leu
35 40 45
Leu Phe Leu Glu Ala Glu Asp Pro Glu Lys Asp Ile Gly Leu Tyr Ile
50 55 60
Asn Ser Pro Gly Gly Val Ile Thr Ser Gly Leu Ser Ile Tyr Asp Thr
65 70 75 80
Met Asn Phe Ile Arg Pro Asp Val Ser Thr Ile Cys Ile Gly Gln Ala
85 90 95
Ala Ser Met Gly Ala Phe Leu Leu Ser Cys Gly Ala Lys Gly Lys Arg
100 105 110
Phe Ser Leu Pro His Ser Arg Ile Met Ile His Gln Pro Leu Gly Gly
115 120 125
Ala Gln Gly Gln Ala Ser Asp Ile Glu Ile Ile Ser Asn Glu Ile Leu
130 135 140
Arg Leu Lys Gly Leu Met Asn
145 150
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 181 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: a5ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
Leu Leu Lys Glu Arg Val Glu Gly Gln Leu Phe Leu Glu Asn Lys Ala
1 5 10 15
Arg Leu Tyr Asn Glu Glu Leu Lys Glu Lys Leu Ile Glu Asn Leu Asp
20 25 30
Glu Lys Ile Val Phe Asp Leu Pro Lys Thr Ile Ile Glu Gln Glu Met
35 40 45
Asp Leu Leu Phe Arg Asn Ala Leu Tyr Ser Met Gln Ala Glu Glu Val
50 55 60
Lys Ser Leu Gln Glu Ser Gln Glu Lys Ala Lys Glu Lys Arg Glu Ser
65 70 75 80
Phe Arg Asn Asp Ala Thr Lys Ser Val Lys Ile Thr Phe Ile Ile Asp
85 90 95
Ala Leu Ala Lys Arg Arg Lys Ile Gly Val His Asp Asn Glu Val Phe
100 105 110
Gln Thr Leu Tyr Tyr Glu Ala Met Met Thr Gly Arg Asn Pro Glu Ser
115 120 125
Leu Ile Glu Gln Tyr Arg Lys Asn Asn Met Leu Ala Ala Val Lys Met
130 135 140
Ala Met Ile Glu Asp Arg Val Leu Ala Tyr Leu Leu Asp Lys Asn Leu
145 150 155 160
Pro Lys Glu Gln Gln Glu Ile Leu Glu Lys Met Arg Pro Asn Ala Gln
165 170 175
Lys Ile Gln Ala Gly
180
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: b5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
AGCTTTGTAG CTGAAAAACT CAAAAACGAT TATGAAAACA AAATGAAAGT TTTGGATAGC 60
GAACAAAGAA GCCGGATCGA ACGCATCGTG TATTTGCAGA TTTTAGACAA CGCATGGCGA 120
GAGCACCTTT ATACGATGGA TAATCTCAAA ACTGGTATCA ATTTAAGAGG CTACAACCAA 180
AAAGACCCCC TTGTAGAATA CAAAAAAGAG AGTTACAACC TTTTCTTAGA ATTCATTGAA 240
GACATCAAAA TGGAAGCGAT CAAAACCTTT TCTAAGATCC AGTTTGAAAA TGAGCAAGAT 300
TCTAGCGATG CGGAGCGTTA TTTGGATAAC TTTAGCGAAG AAAGAGAGTA TGAGAGCGTA 360
ACTTACCGCC ATGAAGAAGC CTTAGACGAA GATTTGAATG TGGCCATGAA AGCT 414
(2) INFORMATION FOR SEQ ID NO : 7 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: b5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
Ser Phe Val Ala Glu Lys Leu Lys Asn Asp Tyr Glu Asn Lys Met Lys
1 5 10 15
Val Leu Asp Ser Glu Gln Arg Ser Arg Ile Glu Arg Ile Val Tyr Leu
20 25 30
Gln Ile Leu Asp Asn Ala Trp Arg Glu His Leu Tyr Thr Met Asp Asn
35 40 45
Leu Lys Thr Gly Ile Asn Leu Arg Gly Tyr Asn Gln Lys Asp Pro Leu
50 55 60
Val Glu Tyr Lys Lys Glu Ser Tyr Asn Leu Phe Leu Glu Phe Ile Glu
65 70 75 80
Asp Ile Lys Met Glu Ala Ile Lys Thr Phe Ser Lys Ile Gln Phe Glu
85 90 95
Asn Glu Gln Asp Ser Ser Asp Ala Glu Arg Tyr Leu Asp Asn Phe Ser
100 105 110
Glu Glu Arg Glu Tyr Glu Ser Val Thr Tyr Arg His Glu Glu Ala Leu
115 120 125
Asp Glu Asp Leu Asn Val Ala Met Lys Ala
130 135
(2) INFORMATION FOR SEQ ID NO : 8 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1610 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Gla
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
AAGCTTTGAT TTAGTCATCA CGGATATTAA CATGCCCCAT ATGGACGGCT TGGAATTTTT 60
ACGCCTTTTA GAAGGCAAGT ATGAATCCAT TGTGATTACC GGTAATGCGA CCTTGAATAA 120
AGCCATTGAT TCCATTCGTT TAGGCGTGAA AGACTTTTTC CAAAAGCCTT TTAAACCAGA 180
ATTGCTTTTA GAATCCATCT ATCGCACCAA AAAAGTTTTA GAATTTCAAA AAAAACACCC 240
TTTAGAAAAA CCTTTAAAAA AACCACACAA ACACAGCTTT TTAGCCGCTT CAAAAGCGTT 300
AGAAGAGAGC AAACGGCAGG CCTTAAAAGT CGCAAGCACG GACGCTAATG TCATGCTATT 360
AGGCGAAAGC GGGGTGGGTA AGGAAGTTTT TGCTCATTTC ATCCACCAGC ATTCTCAGCG 420
ATCCAAGCAC CCTTTTATAG CGATCAACAT GTCCGCAATC CCAGAGCATA TATTAGAAAG 480
CGAGCTTTTT GGGTATCAAA AAGGGGCGTT CACGGACGCC ACAGCTCCTA AAATGGGGCT 540
TTTTGAGAGC GCTAATAAAG GCACGATCTT TTTAGATGAA ATCGCTGAAA TGCCCCTTCA 600
ATTGCAAAGC AAACTTTTAA GAGTGGTTCA AGAAAAAGAA ATCACGCGCC TTGGGGATAA 660
TAAGAGCGTT AAAATTGATG TTCGTTTCAT TTCCGCTACC AACGCCAACA TGAAAGAAAA 720
AATCGCTGCG AAAGAATTTG GCAAAACGGT GTTTAATTTG AATCTGATCA CTCTAAACAG 780
CAAATATATC CGCAGACTTA CCGTGAATGG CTCTAACCAG ATGCCTCGTT TTTCTACGGA 840
TGGGAGAAAT ATCATGTATA TCAAAAAGAC ACCACAAGAA TACGCCATGG GGCTTATTTT 900
GCTAGACTAT AATCAGAGTT TTTTATTCCC TTTAAAGAAT GTGAAAATAC AAGCCTTTGA 960
TTGGTAAGGT TAAATTAAGC GAATGTGAGT TAATATTTAC ACTTATTAAA ATTTTATTCT 1020
TGGAGAATTT ATAATGAAGA GATCTTCTGC ATTTAGTTTC TTGGTAGCTT TTTTATTGGT 1080
AGCTGGCTGT AGTCATAAAA TGGATAATAA GACTGTGGCT GGCGATGTGA GCGCTAAAAC 1140
GGTTCAGACT GCACCTGTTA CTACAGAACC AGCTCCAGAG AAAGAAGAGC CTAAACAAGA 1200
GCCAGCTCCA GTGGTTGAAG AAAAGCCGGC TATTGAAAGC GGGACTATCA TCGCTTCTAT 1260
TTATTTTGAT TTTGACAAGT ATGAGATCAA AGAATCCGAT CAAGAGACTT TAGATGAGAT 1320
CGTGCAAAAA GCTAAAGAAA ACCACATGCA AGTGCTTTTG GAAGCCAATA CCGATGAATT 1380
TGGCTCTAGC GAATACAACC AAGCGCTTGG CGTTAAAAGG ACTTTGAGCG TGAAAAACGC 1440
TTTAGTCATT AAAGGGGTAG AAAAAGATAT GATCAAAACC ATCAGTTTTG GTGAAACCAA 1500
ACCCAAATGC GCCCAAAAAA CTAGAGAATG TTACAAAGAA AACAGAAGAG TGGATGTCAA 1560
ATTAGTGAAG TAATTTTAGG ATGAAAAGGT TTTTTTTTAT CCCTTTTATC 1610
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 311 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: GlaORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
Met Pro His Met Asp Gly Leu Glu Phe Leu Arg Leu Leu Glu Gly Lys
1 5 10 15
Tyr Glu Ser Ile Val Ile Thr Gly Asn Ala Thr Leu Asn Lys Ala Ile
20 25 30
Asp Ser Ile Arg Leu Gly Val Lys Asp Phe Phe Gln Lys Pro Phe Lys
35 40 45
Pro Glu Leu Leu Leu Glu Ser Ile Tyr Arg Thr Lys Lys Val Leu Glu
50 55 60
Phe Gln Lys Lys His Pro Leu Glu Lys Pro Leu Lys Lys Pro His Lys
65 70 75 80
His Ser Phe Leu Ala Ala Ser Lys Ala Leu Glu Glu Ser Lys Arg Gln
85 90 95
Ala Leu Lys Val Ala Ser Thr Asp Ala Asn Val Met Leu Leu Gly Glu
100 105 110
Ser Gly Val Gly Lys Glu Val Phe Ala His Phe Ile His Gln His Ser
115 120 125
Gln Arg Ser Lys His Pro Phe Ile Ala Ile Asn Met Ser Ala Ile Pro
130 135 140
Glu His Ile Leu Glu Ser Glu Leu Phe Gly Tyr Gln Lys Gly Ala Phe
145 150 155 160
Thr Asp Ala Thr Ala Pro Lys Met Gly Leu Phe Glu Ser Ala Asn Lys
165 170 175
Gly Thr Ile Phe Leu Asp Glu Ile Ala Glu Met Pro Leu Gln Leu Gln
180 185 190
Ser Lys Leu Leu Arg Val Val Gln Glu Lys Glu Ile Thr Arg Leu Gly
195 200 205
Asp Asn Lys Ser Val Lys Ile Asp Val Arg Phe Ile Ser Ala Thr Asn
210 215 220
Ala Asn Met Lys Glu Lys Ile Ala Ala Lys Glu Phe Gly Lys Thr Val
225 230 235 240
Phe Asn Leu Asn Leu Ile Thr Leu Asn Ser Lys Tyr Ile Arg Arg Leu
245 250 255
Thr Val Asn Gly Ser Asn Gln Met Pro Arg Phe Ser Thr Asp Gly Arg
260 265 270
Asn Ile Met Tyr Ile Lys Lys Thr Pro Gln Glu Tyr Ala Met Gly Leu
275 280 285
Ile Leu Leu Asp Tyr Asn Gln Ser Phe Leu Phe Pro Leu Lys Asn Val
290 295 300
Lys Ile Gln Ala Phe Asp Trp
305 310
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: GlaORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Lys Arg Ser Ser Ala Phe Ser Phe Leu Val Ala Phe Leu Leu Val
1 5 10 15
Ala Gly Cys Ser His Lys Met Asp Asn Lys Thr Val Ala Gly Asp Val
20 25 30
Ser Ala Lys Thr Val Gln Thr Ala Pro Val Thr Thr Glu Pro Ala Pro
35 40 45
Glu Lys Glu Glu Pro Lys Gln Glu Pro Ala Pro Val Val Glu Glu Lys
50 55 60
Pro Ala Ile Glu Ser Gly Thr Ile Ile Ala Ser Ile Tyr Phe Asp Phe
65 70 75 80
Asp Lys Tyr Glu Ile Lys Glu Ser Asp Gln Glu Thr Leu Asp Glu Ile
85 90 95
Val Gln Lys Ala Lys Glu Asn His Met Gln Val Leu Leu Glu Ala Asn
100 105 110
Thr Asp Glu Phe Gly Ser Ser Glu Tyr Asn Gln Ala Leu Gly Val Lys
115 120 125
Arg Thr Leu Ser Val Lys Asn Ala Leu Val Ile Lys Gly Val Glu Lys
130 135 140
Asp Met Ile Lys Thr Ile Ser Phe Gly Glu Thr Lys Pro Lys Cys Ala
145 150 155 160
Gln Lys Thr Arg Glu Cys Tyr Lys Glu Asn Arg Arg Val Asp Val Lys
165 170 175
Leu Val Lys
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: D7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGCTTAAAGA GAAAGAAAAA GAAGCCTTGA TTGAGCAAGC TATCCGCACC GCACTTGTAG 60
AAAATGAGGC TAAGGCAGAA AAGCTCGATC AGACTCCAGA ATTTAAAGCG ATGATGGAAG 120
CGGTTAAAAA ACAGGCTTTA GTGGAATTTT GGGCTAAAAA ACAGGCTGAA GAAGTGAAAA 180
AAGTCCAAAT CCCAGAAAAA GAAATGCAAG ATTTTTACAA CGCTAATAAA GATCAGCTTT 240
TTGTCAAGCA AGAAGCCCAT GCTAGGCATA TTTTAGTGAA AACCGAAGAT GAGGCTAAAC 300
GGATTATTTC TGAGATTGAC AAACAGCCAA AGGCTAAAAA AGAAGCCAAA TTCATTGAGT 360
TAGCCAATCG GGATACGATT GATCCTAACA GCAAGAACGC GCAAAATGGC GGTGATTTGG 420
GGAAATTCCA AAAGAACCAA ATGGCTCCGG ATTTTTCTAA AGCCGCTTTC GCTTTAACTT 480
CTGGGGATTA CACTAAAACC CCTGTTAAAA CAGAGTTTGG TTATCATATT ATCTATTTGA 540
TTTCTAAAGA TAGCCCTGTA ACTTATACTT ATGAGCAAGC TAAACCTACC ATTAAGGGGA 600
TGTTACAAGA AAAGCT 616
(2) INFORMATION FOR SEQ ID NO : 12 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 204 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: D7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Leu Lys Glu Lys Glu Lys Glu Ala Leu Ile Glu Gln Ala Ile Arg Thr
1 5 10 15
Ala Leu Val Glu Asn Glu Ala Lys Ala Glu Lys Leu Asp Gln Thr Pro
20 25 30
Glu Phe Lys Ala Met Met Glu Ala Val Lys Lys Gln Ala Leu Val Glu
35 40 45
Phe Trp Ala Lys Lys Gln Ala Glu Glu Val Lys Lys Val Gln Ile Pro
50 55 60
Glu Lys Glu Met Gln Asp Phe Tyr Asn Ala Asn Lys Asp Gln Leu Phe
65 70 75 80
Val Lys Gln Glu Ala His Ala Arg His Ile Leu Val Lys Thr Glu Asp
85 90 95
Glu Ala Lys Arg Ile Ile Ser Glu Ile Asp Lys Gln Pro Lys Ala Lys
100 105 110
Lys Glu Ala Lys Phe Ile Glu Leu Ala Asn Arg Asp Thr Ile Asp Pro
115 120 125
Asn Ser Lys Asn Ala Gln Asn Gly Gly Asp Leu Gly Lys Phe Gln Lys
130 135 140
Asn Gln Met Ala Pro Asp Phe Ser Lys Ala Ala Phe Ala Leu Thr Ser
145 150 155 160
Gly Asp Tyr Thr Lys Thr Pro Val Lys Thr Glu Phe Gly Tyr His Ile
165 170 175
Ile Tyr Leu Ile Ser Lys Asp Ser Pro Val Thr Tyr Thr Tyr Glu Gln
180 185 190
Ala Lys Pro Thr Ile Lys Gly Met Leu Gln Glu Lys
195 200
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1078 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B23
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTGCAGCAGG CAATATTGGT GGTGGAGGTT TTGCGGTTAT CCATTTGGCT AATGGTGAAA 60
ATGTTGCCTT AGATTTTAGA GAAAAAGCCC CCTTGAAAGC CACTAAAAAC ATGTTTTTAG 120
ACAAGCAAGG CAATGTAGTC CCTAAACTCA GTGAAGATGG CTATTTGGCG GCTGGAGTTC 180
CTGGAACGGT GGCGGGCATG GAAGCGATGT TGAAAAAATA CGGCACTAAA AAACTATCGC 240
AACTCATTGA TCCTGCCATT AAATTGGCTG AAAATGGTTA TGTGATTTCA CAAAGGACAA 300
GCAGAAACCC TAAAAGAAGC AAGGGGAGCG GTTTTTTAAA ATACACTTCT AGCAAAAAAG 360
TATTTTTTTT AAAAAAGGAC ACCTTGATTA TCAAGAAGGG GATTTGTTTG TCCCAAAAAG 420
ATTTAGCCCA GACTTTGAAT CAAATCAAAA CGCTAGGCGC TAAAGGCTTT TATCAAGGGC 480
AAGTCGCTGA GCTTATCGAG AAAGACATGA AAAAAAATGG AGGGATTATC ACTAAAGAAG 540
ATTTAGCCAG TTACAATGTG AAATGGCGCA AACCCGTGGT AGGGAGTTAT CGTGGGTATA 600
AGATCATTTC TATGTCGCCA CCAAGTTCAG GAGGCACGCA TTTGATCCAG ATTTTAAATG 660
TCATGGAGAA TGCGGATTTA AGCACCCTTG GGTATGGGGC TTCTAAGAAT ATCCATATCG 720
CTGCCGAAGC GATGCGTCAA GCTTATGCGG ACAGATCGGT TTATATGGGA GACGCTGATT 780
TTGCTTCGGT GCCGGTGGAT AAATTGATTA ATAAAGCGTA TGCCAAAAAG ATTTTTGACA 840
CTATCCAGCC AGATACGGTT ACGCCAAGCT CTCAAATCAA ACCAGGAATG GGGCAGTTGC 900
ATGAGGGGAG CAACACCACG CATTATTCTG TAGCGGACAG GTGGGGGAAT GCAGTCAGCG 960
TTACTTACAC CATTAACGCT TCTTATGGAA GCGCTGCTAG TATTGATGGC GCAGGATTTT 1020
TATTGAACAA TGAAATGGAT GATTTTTCCA TAAAGCCTGG GAATCCTAAT CTCTATGG 1078
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 358 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B23
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Ala Ala Gly Asn Ile Gly Gly Gly Gly Phe Ala Val Ile His Leu Ala
1 5 10 15
Asn Gly Glu Asn Val Ala Leu Asp Phe Arg Glu Lys Ala Pro Leu Lys
20 25 30
Ala Thr Lys Asn Met Phe Leu Asp Lys Gln Gly Asn Val Val Pro Lys
35 40 45
Leu Ser Glu Asp Gly Tyr Leu Ala Ala Gly Val Pro Gly Thr Val Ala
50 55 60
Gly Met Glu Ala Met Leu Lys Lys Tyr Gly Thr Lys Lys Leu Ser Gln
65 70 75 80
Leu Ile Asp Pro Ala Ile Lys Leu Ala Glu Asn Gly Tyr Val Ile Ser
85 90 95
Gln Arg Thr Ser Arg Asn Pro Lys Arg Ser Lys Gly Ser Gly Phe Leu
100 105 110
Lys Tyr Thr Ser Ser Lys Lys Val Phe Phe Leu Lys Lys Asp Thr Leu
115 120 125
Ile Ile Lys Lys Gly Ile Cys Leu Ser Gln Lys Asp Leu Ala Gln Thr
130 135 140
Leu Asn Gln Ile Lys Thr Leu Gly Ala Lys Gly Phe Tyr Gln Gly Gln
145 150 155 160
Val Ala Glu Leu Ile Glu Lys Asp Met Lys Lys Asn Gly Gly Ile Ile
165 170 175
Thr Lys Glu Asp Leu Ala Ser Tyr Asn Val Lys Trp Arg Lys Pro Val
180 185 190
Val Gly Ser Tyr Arg Gly Tyr Lys Ile Ile Ser Met Ser Pro Pro Ser
195 200 205
Ser Gly Gly Thr His Leu Ile Gln Ile Leu Asn Val Met Glu Asn Ala
210 215 220
Asp Leu Ser Thr Leu Gly Tyr Gly Ala Ser Lys Asn Ile His Ile Ala
225 230 235 240
Ala Glu Ala Met Arg Gln Ala Tyr Ala Asp Arg Ser Val Tyr Met Gly
245 250 255
Asp Ala Asp Phe Ala Ser Val Pro Val Asp Lys Leu Ile Asn Lys Ala
260 265 270
Tyr Ala Lys Lys Ile Phe Asp Thr Ile Gln Pro Asp Thr Val Thr Pro
275 280 285
Ser Ser Gln Ile Lys Pro Gly Met Gly Gln Leu His Glu Gly Ser Asn
290 295 300
Thr Thr His Tyr Ser Val Ala Asp Arg Trp Gly Asn Ala Val Ser Val
305 310 315 320
Thr Tyr Thr Ile Asn Ala Ser Tyr Gly Ser Ala Ala Ser Ile Asp Gly
325 330 335
Ala Gly Phe Leu Leu Asn Asn Glu Met Asp Asp Phe Ser Ile Lys Pro
340 345 350
Gly Asn Pro Asn Leu Tyr
355
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: C3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GAATTCAATA AATGCAACCA AAAATGCTAG GTTGCACCAC TATTTGAACA AATGGCATGA 60
GTGTATAAAA GACAAGGATT TTAGAGATAT TGATAATGAC ATCAAACAAC TTTTAATACT 120
CCTTAGCAGA AATCCCAAAT ATCTTCAGTG TCTGGGATGA ATGCTACCAA TTCATGGTAT 180
CATATCCCCA TACATTCGTA TCTAGCGCAG GAAGTGCACA AAGTTACGCC TTCGGAGATA 240
TGATGTGTGA GACCTGTAGG GAATGCGTTG GGAGCTCAAA CTCTGTAAAA TCCCTATGAT 300
TAGGGACACA GAGTGAGAAC CAAATTCTCC CTACGGGCAA CATCAGCCTA GGAAGCCCAA 360
TCGTCTTTAG CGGTTGGGCA CTTCACTAAG TCAGCATCAA GTTTTAGGGG CTACTTGTGT 420
TGGTCTTGCT GTTAAGCATC TTGGGATTGA ATTGATGGAA TTTGATGTTA CCATCATAGA 480
TGAGACAGGC AGGGCCACAG CACCAGAAAT CTTGATTCCT GCACTTCGCA CTAAAAAACT 540
GATCTTAATA GGCGATCACA ACCAGCTCCC ACCTAGCATT GATAGGTACC TCCTAGAACA 600
TTAGAGAGCG ATGATATTCA AAACTTGGAT GCCATTGATC GCCAATTATT GGAAGAGAGT 660
TTTTTGGAAA ATCTCTATAA GTATATTCCA GAGAGTAATA AGGCCATGCT TAATGAGTAA 720
TTTAGAATGC CTGCTTCTAT TGGATCGCTA GTTAGTCAGC TTTTTTATAA AGAGAAACTT 780
AAGAATGGAG TGATCAAAAA TACCTCGCAA TTTTACGATC CTAAGAATAT TATCCGTTGG 840
ATTAATGTTG AAGGGGAGCA TAAACTAGAA AAAACAAGTA GCTATAACAA AAATCAAGTT 900
CAAAAAATCA TAGAGCTTTT AGAGCAAATC AATCGCATTC TTAATCAAAG AAAAATCAAA 960
AAAACCATAG GAATTATCAC ACCTTATAAT GCCCAAAAAA GATGCTTGCG ATCAGAAGTG 1020
GAAAAATACG GCTTCAAGAA TTTTGATGAG CTCAAAATAG ACACTGTGGA TGCCTTTCAA 1080
GGCGAGAAGG CAGATATTAT TATTTATTCC ACCGTGAAAA CTTATGGTAA TCTTTCTTTC 1140
TTGATAGATT CTAAACGCTT GAATGTGCTA TTTCTAGGGC AAAAGAAAAT CTCATTTTTG 1200
TGGGCAAAAA GTCTTTCTTT GAGAATTTGC GAAGCGATGA GAAGAATATC TTTAGCGCTA 1260
TTTTGCAAGT CTGTAGATAG GTAATCTTTT CCAAAGATAA TCATTAGGCA TTATCCGCTT 1320
CAAAACGCTC CTAAATTGCA AACTTATTTT TTTTGAATGC TTTACTTTAT GGTGAGCCAT 1380
AACTTTATAA TCTACCAATC CATCGCATGA CTTTTAAAAT ACTCAAAGAT CCTAGATGAG 1440
AGCTTGAGTT GGATTGACTT TAGTTTATTT TAATTTTTCT TTATTTTGAA ATATCTTGAA 1500
ATGCTTAGCT CAATCAACAT TTAACAAAAA AGCCAAAACA TTTTTTAAGA AGAAAAAACC 1560
CTAAAACCCA ATATCAGTTT GATTGCTAAA ATAAAAGCTA CCAAAGTCTT TGGGCGTGTG 1620
GTGCGATTCT TTCTCTATAA CGGCGTCTTT AACGCAAGCG ACACGCAGAG CGTCAAAGCG 1680
AGTTTCAACG CTAGCACAAC CGCCGAGTAA AGACGCTCCA ATAAACATAC TATTTCTAAT 1740
GGTTTTCATT TTATATCCTT TTGTTTTAAA ATTTTTAATA ACTCAAATAC TTTAATCATG 1800
TATTTATGGA TAGTTAAGAT TTATTATGAA AAAAAAGTAA ATGGAACTCA AAACAATCAA 1860
AATAACGCTC TAAATCCCAA CCAGAAACGC TACCCTTTGT AAATCCTTAA TAATTTTTGC 1920
TATAATAAAG CCCTAACTGA AATTTATCAT TTTATTTTAG TTAGGCTCCT TGAATTAGAA 1980
TTATAGTAGA CTTGTTATAC CTTGTTCTAA ATATTGTGGT ATACTAACAA TGTTCAAAGA 2040
CATGAATTGA TTACTCAAGT GTGTAGCGAT TTTTATCAGT CTTTGATACC AATAAGATAC 2100
CGATAGGTAT GAAACTAGGT ATAGTAAGGA GAAACAATGA CTAACGAAAC CATTAACCAA 2160
CAACCACAAA CCGAAGCGGC TTTTAACCCG CAGCAATTTA TCAACAATCT TCAAGTGGCT 2220
TTTCTTAAAG TTGATAACGC TGTCGCTTCA TACGATCCTG ATCAAAAACC AATCGTTGAT 2280
AAGAATGATA GGGATAACAG GCAAGCTTTT TGATGGATCT TCGCAATTAA GGGAAGAATA 2340
CTCCAATAAA GCGATCCAAA ATCCTACCCA AAAAGAATCA GTATTTTTCA GACTTTATCA 2400
ATGAGAGCAA TGATTTAATC AACAAAGACA ATCTCATTGA TATAGGTTCT TCCATAAAAA 2460
GCTTTCAGAA ATTTGGGACT CAGCGTTACC GAATTTTCAC AAGTTGGGTG TCCCATCAAA 2520
ACGATCCGTC TAAAATCAAC ACCCGATCGA TCCGAAATTT TATGGAAAAT ATCATACAAC 2580
CCCCTATCCC TGATGACAAA GAAAAAGCAG AGTTTTTGAA ATCTGCCAAA CAATCTTTTG 2640
CAGGAATCAT TATAGGGAAT CAAATCCGAA CGGATCAAAA GTTCATGGGC GTGTTTGATG 2700
AATTC 2705
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1878 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: A3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
AGCTTGTATC ACAACCAGGT TTTAAGCGGG TTTGCCGGGA GCACGGCGGA CGCTTTTAGC 60
TTGTTTGATA TGTTTGAACG CATTTTAGAG AGCAAAAAAG GGGATTTGTT TAAAAGCGTG 120
GTGGATTTCA GCAAAGAATG GCGCAAGGAC AAGTATTTAC GCCGACTAGA AGCGATGATG 180
ATCGTTTTAA GTTTGGATCT CATTTTCATT TTGAGCGGCA CGGGCGATGT TTTAGAAGCT 240
GAAGACAATA AAATCGCTGC TATTGGGAGT GGGGGGAATT ACGCCTTAGG CGCGGCTAGG 300
GCTTTAGATC ATTTCGCTCA TTTACAGCCT AGAAAACTTG TAGAATAGTC CTTAAAAATC 360
GCAGGGGATC TTTGCATTTA CACCAACACA AATATTAAAA TTTTGGAGCT TTAATGTCTA 420
AATTGAATAT GACCTCACGA GAAATTGTCG CTTATTTAGA TGAATACATC ATTGGGCAAA 480
AGGAAGCTAA AAAGTCTATC GCTATCGCTT TTAGGAATCG TTACAGGCGT TTGCAACTGG 540
AAAAATCCTT ACAAGAAGAA ATCACGCCTA AAAACATTTT AATGATTGGT TCTACTGGCG 600
TGGGTAAGAC TGAAATCGCA AGACGAATAG CAAAAATCAT GGAACTCCCC TTTGTGAAAG 660
TGGAAGCGAG CAAATACACA GAAGTGGGTT TTGTGGGGCG CGATGTGGAG TCTATGGTAA 720
GGGATTTAGT CAATAACAGC GTGCTTTTAG TGGAAAATGA GCATAAAGAA AAATTAAAAG 780
ACAAGATTGA AGAAGCGGTT ATAGAAAAAA TCGCTAAAAA ACTCCTACCC CCCTTGCCTA 840
ATGGCGTGAG CGAAGAAAAA AAACAAGAAT ACGCTAACAG CCTTTTAAAA ATGCAACAAA 900
GGATCGCGCA AGGCGAGTTG GATAGTAAAG AAATTGAAAT TGAAGTGCGT AAAAAAAGCA 960
TAGAGATTGA TTCTAATGTG CCGCCTGAAA TTTTAAGGGT TCAAGAAAAT GTGATTAAGT 1020
TTTTCCATAA AGAACAGGAT AAAGTCAAAA AAACTTTAAG CGTTAAAGAG GCTAAAGAAG 1080
CCCTAAAAGC AGAAATCAGC GACACGCTTT TAGACAGCGA AGCCATTAAA ATGGAAGGTT 1140
TGAAGCGCGC GGAAAGTTCA GGGGTGATTT TTATTGATGA AATTGATAAG ATCGCTGTAA 1200
GCTCTAAAGA AGGAAGCCGT CAAGATCCCA GTAAAGAGGG GGTTCAAAGG GATTTGTTGC 1260
CGATTGTAGA GGGGAGCGTG GTGAATACGA AGTATGGTTC TATTAAAACA GAGCATATTT 1320
TATTCATTGC AGCAGGGGCG TTTCATCTTT CTAAACCAAG CGATTTGATC CCTGAATTGC 1380
AGGGGCGTTT CCCTTTAAGG GTGGAGTTAG AAAATTTAAC CGAAGAAATC ATGTATATGA 1440
TTTTAACCCA AACTAAGACC TCTATCATCA AGCAATACCA AGCCCTTTTA AAAGTGGAGG 1500
GCGTAGAAAT TGCGTTTGAA GACGATGCGA TCAAAGAGTT AGCCAAACTT TCTTATAACG 1560
CCAATCAAAA AACCGAACAT ATAGGCGCTA GAAGGTTGCA CACCACCATT GAAAAAGTGC 1620
TAGAAGACAT TACTTTTGAA CCGGAGGATT ATTCGGGGCA AAATATTACT ACCACTAAAG 1680
AATTGGTTCA ATCCAACCTA GAGGATTTAG TGGCTGATGA AAATTTGGTG AAGTATATTT 1740
TATGATGAAA ACTAAGGCGG GCTTTGTATC TCTCATGGGC AAACCAAACG CTGGAAAAAG 1800
CACTCTTTTA AACACTTTAT TCGCCCTATA GTGAGTCGTA TTACAATTCA CTGGCCGTCG 1860
TTTTACAACG TCGTGACT 1878
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 448 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: A3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Asn Phe Gly Ala Leu Met Ser Lys Leu Asn Met Thr Ser Arg Glu Ile
1 5 10 15
Val Ala Tyr Leu Asp Glu Tyr Ile Ile Gly Gln Lys Glu Ala Lys Lys
20 25 30
Ser Ile Ala Ile Ala Phe Arg Asn Arg Tyr Arg Arg Leu Gln Leu Glu
35 40 45
Lys Ser Leu Gln Glu Glu Ile Thr Pro Lys Asn Ile Leu Met Ile Gly
50 55 60
Ser Thr Gly Val Gly Lys Thr Glu Ile Ala Arg Arg Ile Ala Lys Ile
65 70 75 80
Met Glu Leu Pro Phe Val Lys Val Glu Ala Ser Lys Tyr Thr Glu Val
85 90 95
Gly Phe Val Gly Arg Asp Val Glu Ser Met Val Arg Asp Leu Val Asn
100 105 110
Asn Ser Val Leu Leu Val Glu Asn Glu His Lys Glu Lys Leu Lys Asp
115 120 125
Lys Ile Glu Glu Ala Val Ile Glu Lys Ile Ala Lys Lys Leu Leu Pro
130 135 140
Pro Leu Pro Asn Gly Val Ser Glu Glu Lys Lys Gln Glu Tyr Ala Asn
145 150 155 160
Ser Leu Leu Lys Met Gln Gln Arg Ile Ala Gln Gly Glu Leu Asp Ser
165 170 175
Lys Glu Ile Glu Ile Glu Val Arg Lys Lys Ser Ile Glu Ile Asp Ser
180 185 190
Asn Val Pro Pro Glu Ile Leu Arg Val Gln Glu Asn Val Ile Lys Phe
195 200 205
Phe His Lys Glu Gln Asp Lys Val Lys Lys Thr Leu Ser Val Lys Glu
210 215 220
Ala Lys Glu Ala Leu Lys Ala Glu Ile Ser Asp Thr Leu Leu Asp Ser
225 230 235 240
Glu Ala Ile Lys Met Glu Gly Leu Lys Arg Ala Glu Ser Ser Gly Val
245 250 255
Ile Phe Ile Asp Glu Ile Asp Lys Ile Ala Val Ser Ser Lys Glu Gly
260 265 270
Ser Arg Gln Asp Pro Ser Lys Glu Gly Val Gln Arg Asp Leu Leu Pro
275 280 285
Ile Val Glu Gly Ser Val Val Asn Thr Lys Tyr Gly Ser Ile Lys Thr
290 295 300
Glu His Ile Leu Phe Ile Ala Ala Gly Ala Phe His Leu Ser Lys Pro
305 310 315 320
Ser Asp Leu Ile Pro Glu Leu Gln Gly Arg Phe Pro Leu Arg Val Glu
325 330 335
Leu Glu Asn Leu Thr Glu Glu Ile Met Tyr Met Ile Leu Thr Gln Thr
340 345 350
Lys Thr Ser Ile Ile Lys Gln Tyr Gln Ala Leu Leu Lys Val Glu Gly
355 360 365
Val Glu Ile Ala Phe Glu Asp Asp Ala Ile Lys Glu Leu Ala Lys Leu
370 375 380
Ser Tyr Asn Ala Asn Gln Lys Thr Glu His Ile Gly Ala Arg Arg Leu
385 390 395 400
His Thr Thr Ile Glu Lys Val Leu Glu Asp Ile Thr Phe Glu Pro Glu
405 410 415
Asp Tyr Ser Gly Gln Asn Ile Thr Thr Thr Lys Glu Leu Val Gln Ser
420 425 430
Asn Leu Glu Asp Leu Val Ala Asp Glu Asn Leu Val Lys Tyr Ile Leu
435 440 445
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer BP (NcoI)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GGCCATGGGA TCTAAATTGA ATATGACC 28
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer BQ (BamHI)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GCGGATCCTA AAATATACTT CACCAA 26
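As an illustration only, the relationship between primers BP and BQ (SEQ ID NOS: 18 and 19) and clone A3 (SEQ ID NO: 16) can be checked with a short script. The sketch below is not part of the listing; the helper names (clean, reverse_complement, longest_3prime_match) are hypothetical, the two template excerpts are copied from the rows of SEQ ID NO: 16 ending at positions 420-480 and 1740-1800, and the NcoI and BamHI recognition sequences (CCATGG and GGATCC) are the standard ones rather than values stated in the listing.

```python
# Standard recognition sequences for the two restriction enzymes named in the
# primer annotations (an assumption: the listing itself does not spell them out).
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

PRIMER_BP = "GGCCATGGGA TCTAAATTGA ATATGACC"   # SEQ ID NO: 18
PRIMER_BQ = "GCGGATCCTA AAATATACTT CACCAA"     # SEQ ID NO: 19

# Excerpts of SEQ ID NO: 16 around the two expected annealing regions.
FWD_REGION = "TTTTGGAGCT TTAATGTCTA AATTGAATAT GACCTCACGA"
REV_REGION = "AAATTTGGTG AAGTATATTT TATGATGAAA ACTAAGGCGG"


def clean(seq):
    """Drop spaces and any position numbers from a listing-style row."""
    return "".join(ch for ch in seq.upper() if ch in "ACGT")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return clean(seq).translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_3prime_match(primer, template):
    """Return the longest 3' suffix of the primer found in the template."""
    primer, template = clean(primer), clean(template)
    for start in range(len(primer)):
        if primer[start:] in template:
            return primer[start:]
    return ""


fwd_match = longest_3prime_match(PRIMER_BP, FWD_REGION)
# The reverse primer anneals to the opposite strand, so compare against the
# reverse complement of the template excerpt.
rev_match = longest_3prime_match(PRIMER_BQ, reverse_complement(REV_REGION))

print("BP carries NcoI site:", SITES["NcoI"] in clean(PRIMER_BP))
print("BQ carries BamHI site:", SITES["BamHI"] in clean(PRIMER_BQ))
print("BP template-specific part:", fwd_match)
print("BQ template-specific part:", rev_match)
```

In this sketch the 5' portion of each primer that does not match the template is simply whatever precedes the longest matching 3' suffix; it makes no claim about how the primers were originally designed.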
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: C7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GAATTCTATG ACAATAAAGA AGGGGCGGTA GTCATTAGCG TAGAAAAAGA CTCCCCGGCT 60
AAAAAAGCAG GGATTTTGGT GTGGGATTTG ATCACCGAAG TCAATGGCAA AAAGGTTAAA 120
AACACGAACG AATTGAGAAA TCTAATCGGC TCTATGCTAC CCAATCAAAG GGTAACCTTA 180
AAGGTCATTA GAGACAAAAA AGAACGCGCC TTCACCCTCA CACTTGCTGA AAGGAAAAAC 240
CCTAACAAAA AAGAAACCAT TTCTGCTCAA AACGGCGCGC AAGGCCAATT GAACGGGCTT 300
CAAGTAGAAG ATTTAACCCA AAAAACCAAA AGGTCTATGC GTTTGAGCGA TGATGTTCAA 360
GGGGTTTTAG TCTCTCAAGT GAATGAAAAT TCCCCAGCAG AGCAAGCCGG ATTTAGGCAA 420
GGTAACATTA TCACACAAAT TGAAGAGGTT GAAGTTAAAA GCGTTGCGGA TTTTAACCAT 480
GCTTTAGAAA AGTATAAAGG CAAACCCACA CGATTCTTAG TTTTAGATTT GAATCAAGGT 540
TATAGGATCA TTTTGGTGAA ATGATAGAGG TGGGTTGTTA GTCGCATGTC TTTGATTAGA 600
GTGAATGGGG AAGCTT 616
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: C7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
Glu Phe Tyr Asp Asn Lys Glu Gly Ala Val Val Ile Ser Val Glu Lys
1 5 10 15
Asp Ser Pro Ala Lys Lys Ala Gly Ile Leu Val Trp Asp Leu Ile Thr
20 25 30
Glu Val Asn Gly Lys Lys Val Lys Asn Thr Asn Glu Leu Arg Asn Leu
35 40 45
Ile Gly Ser Met Leu Pro Asn Gln Arg Val Thr Leu Lys Val Ile Arg
50 55 60
Asp Lys Lys Glu Arg Ala Phe Thr Leu Thr Leu Ala Glu Arg Lys Asn
65 70 75 80
Pro Asn Lys Lys Glu Thr Ile Ser Ala Gln Asn Gly Ala Gln Gly Gln
85 90 95
Leu Asn Gly Leu Gln Val Glu Asp Leu Thr Gln Lys Thr Lys Arg Ser
100 105 110
Met Arg Leu Ser Asp Asp Val Gln Gly Val Leu Val Ser Gln Val Asn
115 120 125
Glu Asn Ser Pro Ala Glu Gln Ala Gly Phe Arg Gln Gly Asn Ile Ile
130 135 140
Thr Gln Ile Glu Glu Val Glu Val Lys Ser Val Ala Asp Phe Asn His
145 150 155 160
Ala Leu Glu Lys Tyr Lys Gly Lys Pro Thr Arg Phe Leu Val Leu Asp
165 170 175
Leu Asn Gln Gly Tyr Arg Ile Ile Leu Val Lys
180 185
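For readers who want to check the correspondence between a nucleotide record and its deduced protein, the following sketch translates the first 60 bases of SEQ ID NO: 20 in the first reading frame and prints three-letter residue codes, which can be compared with the opening residues of SEQ ID NO: 21. It is an illustrative sketch, not part of the listing: the function name translate is hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter to three-letter residue codes ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(seq, frame=0):
    """Translate a listing-style DNA row into three-letter residue codes."""
    seq = "".join(ch for ch in seq.upper() if ch in "ACGT")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# First row of SEQ ID NO: 20 (positions 1-60).
ROW_1 = "GAATTCTATG ACAATAAAGA AGGGGCGGTA GTCATTAGCG TAGAAAAAGA CTCCCCGGCT"
print(translate(ROW_1))
```

The same function can be applied with frame=1 or frame=2 to inspect the other forward reading frames.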
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: GD100 (forward NcoI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
GGCCATGGAC AATAAAGAAG GGGCGG 26
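Each sequence row in this listing ends with a running base count, so a row can be checked mechanically against the declared LENGTH. The sketch below is illustrative only; parse_row and row_is_consistent are hypothetical helper names, and the row format (blocks of bases followed by one integer) is inferred from the rows shown here.

```python
import re


def parse_row(row):
    """Split a listing row into (bases, running_count).

    Raises ValueError when the row is not blocks of A/C/G/T followed by
    a single integer, which usually signals an extraction error.
    """
    match = re.fullmatch(r"\s*([ACGT ]+?)\s+(\d+)\s*", row.upper())
    if match is None:
        raise ValueError(f"unrecognised sequence row: {row!r}")
    return match.group(1).replace(" ", ""), int(match.group(2))


def row_is_consistent(row, previous_count=0):
    """True when the trailing count equals previous_count plus the bases on the row."""
    bases, count = parse_row(row)
    return previous_count + len(bases) == count


# SEQ ID NO: 22, declared LENGTH: 26 base pairs.
ROW = "GGCCATGGAC AATAAAGAAG GGGCGG 26"
print(parse_row(ROW))
print(row_is_consistent(ROW))
```

For multi-row records, previous_count is the trailing number of the preceding row.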
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: GD101 (reverse BamHI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GCGGATCCTT TCACCAAAAT GATCCTATAA C 31
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2006 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B17
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CAATTTCAGA AAGGTTCATA ACGATTTATT CCTTATATTG TTTTAAAATA TGCCACAAAA 60
CATAATCTTA CTAAGCGTTT TTAGCAAAAA ATTATTGTTA TAATAACATA ATTATTAACA 120
AACTTTAAAG GCTTTGTGCG TATGGGATTG AAAGCGGATT CTTGGATTAA AAAAATGAGT 180
TTAGAGCATG GCATGATTAG CCCTTTTTGC GAAAAGCAAG TCGGTAAGAA TGTGATCAGC 240
TATGGTTTGA GCAGTTACGG GTATGATATT AGAGTGGGGA GTGAGTTCAT GCTCTTTGAT 300
AACAAAAACG CTTTAATTGA CCCTAAAAAC TTTGACCCTA ACAACGCGAC TAAAATTGAT 360
GCGAGTAAAG AAGGGTATTT TATCTTGCCC GCTAACGCGT TCGCCCTAGC TCATACGATA 420
GAGTATTTTA AAATGCCTAA AGACACCTTA GCGATTTGTT TAGGCAAAAG CACTTACGCC 480
AGGTGTGGGA TCATTGTGAA TGTTACGCCT TTTGAGCCAG AATTTGAAGG GTATATCACG 540
ATTGAAATTT CTAACACCAC CAACTTACCG GCTAAAGTCT ATGCCAATGA GGGGATCGCG 600
CAAGTGGTGT TTTTACAAGG CGATGAAATG TGCGAGCAAA GCTATAAAGA CAGAGGCGGT 660
AAGTATCAAG GGCAAGTGGG CATCACTTTG CCTAAAATTT TAAAGTGATT TTGAAGAAAG 720
ATAAAAACAC TTATCCTTGT GAGCGTTGTT TATTTTTTAA TACAAACTAT ACCGCAGTTT 780
CTCAAGAAAT ATATTATAAT ACGAAAGTTC AGTTTGATTG AGTTACACAC TCTTTGAGAA 840
CAAGACGCTA AATTATTTAG GAAATTACCA TGCTAAGATT CGTTAGTAAA ACGATTTGCT 900
TGTCTTTAAT CGGCTTGTTC AACCCTTTAG AAGCCTTTCA AAAACACCAA AAAGACGGCT 960
TTTTTATAGA AGCCGGGTTT GAAACCGGGT TATTAGAAGG CGTGCAAAAT AAAGAGCAAA 1020
CCATAACCAC CCAAAAAATC CAAAAAAACC CCCTAACCCA CCCACAAATT AAAGAACAGC 1080
CTAAAGAACA AAACAAAAGC GATACAGCCA CCCCACAAAG TGCTTACGGA AAATACTACA 1140
TACCCCAAAG CACCATTTTA AAAAACGCAA CGGCTTTATT CACCACGGAT AATATAGAAA 1200
AAAATGGCTT AACTTTTTAT TCTCAAAACC CTGTGTATGC GAATATGGTT AATGGGAGCG 1260
TAACCATACA AAACTTTCTG CCCTACAATT TAAACAATAT TGAGCTGAGC TATACAGACG 1320
CTCAAGGCAA GGTAGTCAAT TTAGGCGTGA TAGAAACTAT CCCTAAAGAT TCTCAAATCA 1380
TTCTGCCTGC AAGCTTGTTT AATGATTAGA ATTTGAACAA GCTGATAGCT TTAATTACCA 1440
ACAACTTCAA GCTACTGCCA CACAATTTTC TGATGCTAAC ACGCAAAGTT TGTTTGAAAA 1500
GCTCAGCCAA ATCACGACCC AATGTAACGA TGAGTTATGA AAACGCCGAT ACCAACAATT 1560
TTAAAGGTAA TTGCAATGAT TGTGTGTCAG ATTTCACCCC ACAAACCGCA GAAGAATTGA 1620
CCAATTTAAT GCTAGATATG ATTGCGGTGT TTGACTCTAA ATCGTGGGAA GAAGCCATTT 1680
TAAACGCTCC TTTCCAATTT TCTAACAGCC CATCAGAGTG CGGCTCTGAC TTTCCTAAAT 1740
GCGTGAATCC TTTCAATAAC GGGCGTGTCG CTCCCATCTA TGAAAAATAC GTGCTAACCC 1800
CACAATCCGT TATAGATGCG TTTAGAAGAG CAATCAATCT TGAAGTGAAC ATCATGAAAT 1860
CAGGGTTTTT AGGGCTAGGG TATGAACTTG ATGATAATGA TGGCAATCTA GGGATAGCCG 1920
CTTCTGCATT AAATCCCGAA AAATTGTTTG GTAAAACTTT GAACAAAGTT GATATTGTGG 1980
AATTAAGAGA CATTATCCAT GAATTC 2006
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 188 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B17ORF4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Met Gly Leu Lys Ala Asp Ser Trp Ile Lys Lys Met Ser Leu Glu His
1 5 10 15
Gly Met Ile Ser Pro Phe Cys Glu Lys Gln Val Gly Lys Asn Val Ile
20 25 30
Ser Tyr Gly Leu Ser Ser Tyr Gly Tyr Asp Ile Arg Val Gly Ser Glu
35 40 45
Phe Met Leu Phe Asp Asn Lys Asn Ala Leu Ile Asp Pro Lys Asn Phe
50 55 60
Asp Pro Asn Asn Ala Thr Lys Ile Asp Ala Ser Lys Glu Gly Tyr Phe
65 70 75 80
Ile Leu Pro Ala Asn Ala Phe Ala Leu Ala His Thr Ile Glu Tyr Phe
85 90 95
Lys Met Pro Lys Asp Thr Leu Ala Ile Cys Leu Gly Lys Ser Thr Tyr
100 105 110
Ala Arg Cys Gly Ile Ile Val Asn Val Thr Pro Phe Glu Pro Glu Phe
115 120 125
Glu Gly Tyr Ile Thr Ile Glu Ile Ser Asn Thr Thr Asn Leu Pro Ala
130 135 140
Lys Val Tyr Ala Asn Glu Gly Ile Ala Gln Val Val Phe Leu Gln Gly
145 150 155 160
Asp Glu Met Cys Glu Gln Ser Tyr Lys Asp Arg Gly Gly Lys Tyr Gln
165 170 175
Gly Gln Val Gly Ile Thr Leu Pro Lys Ile Leu Lys
180 185
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 67 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B17ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Met Leu Arg Leu Leu Ser Gln Asn Leu Lys Gly Ile Ser Arg Leu Lys
1 5 10 15
Phe Leu Thr Pro Pro Thr Tyr Arg Leu Lys Ser Met Pro Met Arg Gly
20 25 30
Ser Arg Lys Trp Cys Phe Tyr Lys Ala Met Lys Cys Ala Ser Lys Ala
35 40 45
Ile Lys Thr Glu Ala Val Ser Ile Lys Gly Lys Trp Ala Ser Leu Cys
50 55 60
Leu Lys Phe
65
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B17ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
Met Leu Arg Phe Val Ser Lys Thr Ile Cys Leu Ser Leu Ile Gly Leu
1 5 10 15
Phe Asn Pro Leu Glu Ala Phe Gln Lys His Gln Lys Asp Gly Phe Phe
20 25 30
Ile Glu Ala Gly Phe Glu Thr Gly Leu Leu Glu Gly Val Gln Asn Lys
35 40 45
Glu Gln Thr Ile Thr Thr Gln Lys Ile Gln Lys Asn Pro Leu Thr His
50 55 60
Pro Gln Ile Lys Glu Gln Pro Lys Glu Gln Asn Lys Ser Asp Thr Ala
65 70 75 80
Thr Pro Gln Ser Ala Tyr Gly Lys Tyr Tyr Ile Pro Gln Ser Thr Ile
85 90 95
Leu Lys Asn Ala Thr Ala Leu Phe Thr Thr Asp Asn Ile Glu Lys Asn
100 105 110
Gly Leu Thr Phe Tyr Ser Gln Asn Pro Val Tyr Ala Asn Met Val Asn
115 120 125
Gly Ser Val Thr Ile Gln Asn Phe Leu Pro Tyr Asn Leu Asn Asn Ile
130 135 140
Glu Leu Ser Tyr Thr Asp Ala Gln Gly Lys Val Val Asn Leu Gly Val
145 150 155 160
Ile Glu Thr Ile Pro Lys Asp Ser Gln Ile Ile Leu Pro Ala Ser Leu
165 170 175
Phe Asn Asp
(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B17ORF3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
Asn Leu Asn Lys Leu Ile Ala Leu Ile Thr Asn Asn Phe Lys Leu Leu
1 5 10 15
Pro His Asn Phe Leu Met Leu Thr Arg Lys Val Cys Leu Lys Ser Ser
20 25 30
Ala Lys Ser Arg Pro Asn Val Thr Met Ser Tyr Glu Asn Ala Asp Thr
35 40 45
Asn Asn Phe Lys Gly Asn Cys Asn Asp Cys Val Ser Asp Phe Thr Pro
50 55 60
Gln Thr Ala Glu Glu Leu Thr Asn Leu Met Leu Asp Met Ile Ala Val
65 70 75 80
Phe Asp Ser Lys Ser Trp Glu Glu Ala Ile Leu Asn Ala Pro Phe Gln
85 90 95
Phe Ser Asn Ser Pro Ser Glu Cys Gly Ser Asp Phe Pro Lys Cys Val
100 105 110
Asn Pro Phe Asn Asn Gly Arg Val Ala Pro Ile Tyr Glu Lys Tyr Val
115 120 125
Leu Thr Pro Gln Ser Val Ile Asp Ala Phe Arg Arg Ala Ile Asn Leu
130 135 140
Glu Val Asn Ile Met Lys Ser Gly Phe Leu Gly Leu Gly Tyr Glu Leu
145 150 155 160
Asp Asp Asn Asp Gly Asn Leu Gly Ile Ala Ala Ser Ala Leu Asn Pro
165 170 175
Glu Lys Leu Phe Gly Lys Thr Leu Asn Lys Val Asp Ile Val Glu Leu
180 185 190
Arg Asp Ile Ile His Glu Phe
195
(2) INFORMATION FOR SEQ ID NO: 29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer CA (forward NcoI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
CGCCATGGGA TTGAAAGCGG ATTC 24
(2) INFORMATION FOR SEQ ID NO: 30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer CB (reverse BamHI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
GCGGATCCCT TTAAAATTTT AGGCAAAGTG 30
(2) INFORMATION FOR SEQ ID NO: 31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer CC (forward NcoI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CGCCATGGGA AACCCTTTAG AAGCCTTTC 29
(2) INFORMATION FOR SEQ ID NO: 32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer CD (reverse BamHI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GCGGATCCAT CATTAAACAA GCTTGC 26
(2) INFORMATION FOR SEQ ID NO: 33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer BW (forward NcoI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
CGCCATGGGA AAGAGATCTT CTGCATTTAG 30
(2) INFORMATION FOR SEQ ID NO: 34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: BX (reverse BamHI primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
GCGGATCCCT TCACTAATTT GACATCCA 28
(2) INFORMATION FOR SEQ ID NO: 35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1208 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: a1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
AGCTTATTCC TTTTATTGTA AGGATTTAGG CTATTGAACT TTAGGAGTTT TAATGATATT 60
AAGAGCGAGT GTGTTGAGCG CGTTACTTCT TGTAAGCTTA GGGGCAGCCC CTAAACATTC 120
AGTTTCAGCT AATGACAAAC GGATGCAGGA TAATTTAGTG AGCGTGATTG AAAAACAGAC 180
CAATAAAAAG GTGCGTATTT TAGAAATCAA ACCTTTAAAA TCCAGCCAGG ATTTAAAAAT 240
GGTCGTTATT GAAGATCCGG ACACTAAATA CAATATCCCG CTTGTGGTGA GTAAGGATGG 300
CAATTTAATC ATAGGGCTTA GCAGCATATT CTTTAGCTAT AAAAGCGATG ATGTGCGATT 360
AGTTGCGGAA ACCAATCAGA ATGTTCAAGC TGCGTTAACG CTACCCAGCA AAAGCAGCGC 420
GAAAGTTGAA GCGCTAGTTT GTGTGAATGA GAATATACCG GCTGATTATG CGATAGAGTT 480
GCCCTCTACT AACGCTGAAA ATAAGGATAA AATCCTTTAT ATTGTCTCTG ATCCCATGTG 540
CCCGCATTGC CAAAAAGAGC TCACTAAACT TAGGGATCAC TTAAAAGAAA ACACCGTGAG 600
AATGGTTGTA GTGGGGTGGC TTGGAGTCAA TTCGGCTAAA AAAGCGGCTT TAATCCAAGA 660
AGAAATGGCG AAAGCTAGGG CTAGGGGAGC GAGCGTGGAA GATAAAATCT CTATTCTTGA 720
AAAGATTTAT TCCACCCAAT ACGATATTAA CGCTCAAAAA GAGCCTGAAG ATTTACAGCA 780
CTTAAGTGGA AAATACCACT AAAAAGATTT TTGAATCTGG CGTGATTAAT GGGTGTGCCT 840
TTCTTGTACC ATTATAAGGC ATGATATAAG GTTACTCTCA TGAAAAAACC CTACAGGAAG 900
ATTTCTGATT ATGCGATCGT GGGTGGTTTG AGCGCGTTAG TGATGGTGAG CATTGTGGGG 960
TGTAAGAGCA ATGCCGATGA CAAACCAAAA GAGCAAAGCT CTTTAAGTCA AAGCGTTCAA 1020
AAAGGTGCGT TTGTGATTTT AGAAGAGCAA AAGGATAAAT CTTACAAGGT TGTTGAAGAA 1080
TACCCTAGCT CAAAAACCCA CATCATAGTG CGCGATTTGC AAGGCAATGA ACGAGTGTTA 1140
AGCAATGAAG AGATTCAAAA GCTCATCAAA GAAGAAGAAG CCAAAATTGA TAACGGCACG 1200
AGCAAGCT 1208
(2) INFORMATION FOR SEQ ID NO: 36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 249 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: a1ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Met Ile Leu Arg Ala Ser Val Leu Ser Ala Leu Leu Leu Val Ser Leu
1 5 10 15
Gly Ala Ala Pro Lys His Ser Val Ser Ala Asn Asp Lys Arg Met Gln
20 25 30
Asp Asn Leu Val Ser Val Ile Glu Lys Gln Thr Asn Lys Lys Val Arg
35 40 45
Ile Leu Glu Ile Lys Pro Leu Lys Ser Ser Gln Asp Leu Lys Met Val
50 55 60
Val Ile Glu Asp Pro Asp Thr Lys Tyr Asn Ile Pro Leu Val Val Ser
65 70 75 80
Lys Asp Gly Asn Leu Ile Ile Gly Leu Ser Ser Ile Phe Phe Ser Tyr
85 90 95
Lys Ser Asp Asp Val Arg Leu Val Ala Glu Thr Asn Gln Asn Val Gln
100 105 110
Ala Ala Leu Thr Leu Pro Ser Lys Ser Ser Ala Lys Val Glu Ala Leu
115 120 125
Val Cys Val Asn Glu Asn Ile Pro Ala Asp Tyr Ala Ile Glu Leu Pro
130 135 140
Ser Thr Asn Ala Glu Asn Lys Asp Lys Ile Leu Tyr Ile Val Ser Asp
145 150 155 160
Pro Met Cys Pro His Cys Gln Lys Glu Leu Thr Lys Leu Arg Asp His
165 170 175
Leu Lys Glu Asn Thr Val Arg Met Val Val Val Gly Trp Leu Gly Val
180 185 190
Asn Ser Ala Lys Lys Ala Ala Leu Ile Gln Glu Glu Met Ala Lys Ala
195 200 205
Arg Ala Arg Gly Ala Ser Val Glu Asp Lys Ile Ser Ile Leu Glu Lys
210 215 220
Ile Tyr Ser Thr Gln Tyr Asp Ile Asn Ala Gln Lys Glu Pro Glu Asp
225 230 235 240
Leu Gln His Leu Ser Gly Lys Tyr His
245
(2) INFORMATION FOR SEQ ID NO: 37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 109 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: a1ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly
1 5 10 15
Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala
20 25 30
Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys
35 40 45
Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val
50 55 60
Val Glu Glu Tyr Pro Ser Ser Lys Thr His Ile Ile Val Arg Asp Leu
65 70 75 80
Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile
85 90 95
Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys
100 105
(2) INFORMATION FOR SEQ ID NO: 38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2534 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: C1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
CTGCAGGAAT TCAAGGAGCG AAATACCGCA AGGCTTTTAG CGTGGAAGAA ATGATTCCTA 60
GCATGGGTCA GGGGGCTTTA GGGGTAGAAA TGCTCAAAAA CCACAAGCAT TTTATTACGC 120
TCCAAAAACT CAACGACGAG AAAAGCGCGT TTTGCTGCCA TTTAGAAAGG GAGTTTGTTA 180
AGGGGCTTAA TGGGGGGTGT CAGATCCCTA TAGGCGTGCA TGCGAGTTTA ATGGGCGATA 240
GGGTTAAAAT CCAGGCGGTT TTAGGCTTGC CTAACGGGAA AGAAGTCATC ACTAAAGAAA 300
AGCAAGGGGA TAAAACTAAA GCGTTTGATT TAGTTCAAGA GCTTTTAGAA GAATTTTTGC 360
AAAGCGGGGC CAAAGAGATT TTAGAAAAGG CGCAGTTGTT TTAATGCGTT TGTTTATCGC 420
GCTAGTTTTG TTTTGGTGGT GGTTAAGTTT GAGCGCTAAG GAAGCGGATT TCATCTCTGA 480
TTTAGAATAC GGGATGGCTC TTTATAAAAA CCCTAGGGGT GTTGCGTGTG CGAAATGCCA 540
TGGCATTAAA GGCGAACAAC AAGAAATCAC CTTTTATTAT GAAAAAGGCG AAAAAAAAAT 600
CCTCTACGCC CCTAAAATCA ACCATTTGGA TTTTAAAACC TTTAAAGATG CCTTGAGTTT 660
AGGCAAAGGC ATGATGCCTA AATACAATCT CAATTTAGAA GAAATCCAAG CGATTTATCT 720
TTACATCACC TCTTTAGAGC ATAAAGAAGA GCGTAAGGAT TCTCCTAAGC CTTAATCAAA 780
GCGCTTGATT TATGTTAAAA TGGAGCGTTG CATTTTTGTT TTGATTAAAG AAGGGTTCTA 840
AAAATCAGAA TTTAAAAGAA GGTAAAAATG AGTGTCAAAA TTTTAAAAAT ATTAGTTTGT 900
GGGTTATTTT TTTTGAATGC CCATTTATGG GGGAAACAAG ACAATAGTTT TTTGGGGGTT 960
GCTGAAAGAG CCTATAAAAG CGGGAATTAT TCTAAAGCCA CATCTTATTT TAAAAAAGCA 1020
TGCAACGATG GGGTGAGTGA AGGTTGCACG CAATTAGGAA TCATTTATGA AAACGGGCAA 1080
GGCACTAGAA TAGATTATAA AAAAGCCCTA GAATATTATA AAACCGCATG CCAGGCTGAC 1140
GATAGGGAAG GGTGTTTTGG TTTAGGGGGG CTTTATGATG AGGGGTTAGG CACGACTCAA 1200
AATTATCAAG AGGCCATTGA CGCTTATGCT AAGGCGTGCG TTTTAAAACA CCCTGAGAGT 1260
TGCTACAATT TAGGCATTAT TTATGACCGA AAAATCAAGG GCAATGCCGA TCAAGCGGTT 1320
ACCTACTATC AAAAAAGCTG TAATTTTGAT ATGGCTAAGG GGTGTTATGT TTTGGGCGTG 1380
GCTTATGAAA AAGGCTTTTT AGAAGTCAAA CAAAGCAACC ATAAAGCCGT CATCTATTAT 1440
TTGAAAGCAT GCCGATTGGA TGATGGGCAG GCTTGCCGCG CGTTAGGGAG TTTGTTTGAA 1500
AATGGCGATG CAGGGCTTGA TGAAGATTTT GAAGTGGCGT TTGATTACTT GCAAAAAGCT 1560
TGCGGGTTAA ACAATTCTGG TGGTTGCGCG AGTTTAGGCT CTATGTATAT GTTAGGCAGG 1620
TATGTCAAAA AAGATCCCCA AAAGGCTTTT AATTATTTCA AACAAGCATG CGATATGGGA 1680
AGCGCGGTGA GTTGCTCTAG GATGGGCTTT ATGTATTCCC AAGGGGACGC TGTTCCAAAA 1740
GATTTGAGGA AAGCCCTTGA TAATTATGAA AGGGGTTGCG ATATGGGCGA TGAAGTGGGT 1800
TGCTTCGCTC TAGCGGGGAT GTATTACAAC ATGAAAGATA AAGAAAACGC CATAATGATT 1860
TATGACAAGG GCTGTAAGCT AGGCATGAAA CAAGCATGCG AAAACCTCAC TAAACTTAGG 1920
GGGTATTGAA AATTTAACCA ACCCCCTAAA TTATCATTGC GCTTGACTCA AAACTTTTCA 1980
AAGATTTGGC TCTGTTTTAA GAGCTAAAGC AGAAACCCAC CCTATTAATT TTTTAATCTT 2040
TGGTGTTTTT AGGGCTTTGT CTATTTTCAA AAAGAAAACT TTTTGAATGT TTTTTGCGGT 2100
TGTTTGGTTG TATTTGTAGT GTATTTTTAT GGTGTAAATT TTTGTTAGGT TAGCTTGAAG 2160
TGGGTTTTAG GTTTAAAAGT CCTATAAAAA ATGTTTTAGC GTGTTTTTGC GCTATGGATA 2220
GATATGCGTT TGGTTGTGTT TTTCCCAATG GCTTTAATTT ATGGCTTTTG CGTGGTTATT 2280
ATT TAAGCA CGCTATAAAC ACGAATTACA CGATAACAGA GCGGTATACG CACGCTATAA 2340
AAAGACTTGA TAAAAATAAC GAAAAATAGT TAAATTTCAA GCGTTCTTTT AAAAATTGTT 2400
GTTAGGTGAG ACAGATAAAA ACGCTTTTAG TTTAAAGATA GAGTTTTAGG GGTTTTTTGT 2460
GTTGGTTTAG TTATTCTTTA TTTTTTTAAA AAATGGGATT TTTAAAACTC ATAAAGAGAT 2520
AGGGGGTATT TTGA 2534
(2) INFORMATION FOR SEQ ID NO: 39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 353 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: C1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
Met Ser Val Lys Ile Leu Lys Ile Leu Val Cys Gly Leu Phe Phe Leu
1 5 10 15
Asn Ala His Leu Trp Gly Lys Gln Asp Asn Ser Phe Leu Gly Val Ala
20 25 30
Glu Arg Ala Tyr Lys Ser Gly Asn Tyr Ser Lys Ala Thr Ser Tyr Phe
35 40 45
Lys Lys Ala Cys Asn Asp Gly Val Ser Glu Gly Cys Thr Gln Leu Gly
50 55 60
Ile Ile Tyr Glu Asn Gly Gln Gly Thr Arg Ile Asp Tyr Lys Lys Ala
65 70 75 80
Leu Glu Tyr Tyr Lys Thr Ala Cys Gln Ala Asp Asp Arg Glu Gly Cys
85 90 95
Phe Gly Leu Gly Gly Leu Tyr Asp Glu Gly Leu Gly Thr Thr Gln Asn
100 105 110
Tyr Gln Glu Ala Ile Asp Ala Tyr Ala Lys Ala Cys Val Leu Lys His
115 120 125
Pro Glu Ser Cys Tyr Asn Leu Gly Ile Ile Tyr Asp Arg Lys Ile Lys
130 135 140
Gly Asn Ala Asp Gln Ala Val Thr Tyr Tyr Gln Lys Ser Cys Asn Phe
145 150 155 160
Asp Met Ala Lys Gly Cys Tyr Val Leu Gly Val Ala Tyr Glu Lys Gly
165 170 175
Phe Leu Glu Val Lys Gln Ser Asn His Lys Ala Val Ile Tyr Tyr Leu
180 185 190
Lys Ala Cys Arg Leu Asp Asp Gly Gln Ala Cys Arg Ala Leu Gly Ser
195 200 205
Leu Phe Glu Asn Gly Asp Ala Gly Leu Asp Glu Asp Phe Glu Val Ala
210 215 220
Phe Asp Tyr Leu Gln Lys Ala Cys Gly Leu Asn Asn Ser Gly Gly Cys
225 230 235 240
Ala Ser Leu Gly Ser Met Tyr Met Leu Gly Arg Tyr Val Lys Lys Asp
245 250 255
Pro Gln Lys Ala Phe Asn Tyr Phe Lys Gln Ala Cys Asp Met Gly Ser
260 265 270
Ala Val Ser Cys Ser Arg Met Gly Phe Met Tyr Ser Gln Gly Asp Ala
275 280 285
Val Pro Lys Asp Leu Arg Lys Ala Leu Asp Asn Tyr Glu Arg Gly Cys
290 295 300
Asp Met Gly Asp Glu Val Gly Cys Phe Ala Leu Ala Gly Met Tyr Tyr
305 310 315 320
Asn Met Lys Asp Lys Glu Asn Ala Ile Met Ile Tyr Asp Lys Gly Cys
325 330 335
Lys Leu Gly Met Lys Gln Ala Cys Glu Asn Leu Thr Lys Leu Arg Gly
340 345 350
Tyr
(2) INFORMATION FOR SEQ ID NO: 40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer BF
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GCCATGGGAA GTGTCAAAAT TTTAAAAATA 30
(2) INFORMATION FOR SEQ ID NO: 41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer BG
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
CCGGATCCAT ACCCCCTAAG TTTAGTGAG 29
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer BO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GGCCATGGGG GTTGCTGAAA GAGCC 25
(2) INFORMATION FOR SEQ ID NO: 43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 887 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: A22
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
AATTCTTTCT TGCAAGATGT GCCTTATTGG ATGTTGCAAA ATCGCAGTGA GTATATCACG 60
CAAGGGGTGG ATAGCTCGCA CATTGTGGAT GGTAAGAAAA CTGAAGAGAT AGAAAAAATC 120
GCTACCAAAA GAGCGACAAT AAGAGTGGCG CAAAATATTG TGCATAAACT CAAAGAGGCT 180
TACCTTTCTA AATCCAACCG CATCAAGCAA AAGATCACTA ATGAAATGTT TATCCAAATG 240
ACACAGCCCA TTTATGACAG CTTGATGAAT GTGGATCGTT TAGGGATTTA TATCAATCCT 300
AACAATGAGG AAGTGTTTGC GTTAGTGCGC GCGCGTGGTT TTGATAAGGA CGCTTTGAGC 360
GAAGGGTTGC ATAAAATGGC ATTAGACAAT CAAGCGGTGA GTATCCTTGT GGCTAAAGTG 420
GAAGAAATCT TTAAAGATTC TGTCAATTAC GGAGATGTTA AAGTCCCTAT AGCCATGTAG 480
GATTAGAACA ACAAGCGTTC CTCACTATCG TCTGTTCTTT TGGGGGTGGG GGTGATGGAA 540
TTAGGAGTTG AGTAGTAGGG GATTTTATCC ACGATTTCTT TTACGCAAGC CTTTGGGGAC 600
ATCAAACTTT CTTTTTAAAG AAGGTTACAA TCGCTAAAAT ATTAGGCATG AAATACGAAT 660
ACACAGGTGC ACTCACAACG CCTCCTGTCG CTCGTTTGCT AATAGGCGTG TTATCGTCTC 720
TCCCAAACCA GATCACGCTT TGCAAGGTGG GGGTAAAGCC AATGAACCAA GCGTCAATAT 780
TGTTGTTAGA AGTCCCGGTT TTACCGGCGA TTTCTAAACC TTTAGTGCGA GCCAAACGCC 840
CTGTGCCGTT TTCTACCGCA TTTATCAGCA CTGAAAGGGT TAAAAAA 887
(2) INFORMATION FOR SEQ ID NO: 44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 159 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: A22
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
Asn Ser Phe Leu Gln Asp Val Pro Tyr Trp Met Leu Gln Asn Arg Ser
1 5 10 15
Glu Tyr Ile Thr Gln Gly Val Asp Ser Ser His Ile Val Asp Gly Lys
20 25 30
Lys Thr Glu Glu Ile Glu Lys Ile Ala Thr Lys Arg Ala Thr Ile Arg
35 40 45
Val Ala Gln Asn Ile Val His Lys Leu Lys Glu Ala Tyr Leu Ser Lys
50 55 60
Ser Asn Arg Ile Lys Gln Lys Ile Thr Asn Glu Met Phe Ile Gln Met
65 70 75 80
Thr Gln Pro Ile Tyr Asp Ser Leu Met Asn Val Asp Arg Leu Gly Ile
85 90 95
Tyr Ile Asn Pro Asn Asn Glu Glu Val Phe Ala Leu Val Arg Ala Arg
100 105 110
Gly Phe Asp Lys Asp Ala Leu Ser Glu Gly Leu His Lys Met Ala Leu
115 120 125
Asp Asn Gln Ala Val Ser Ile Leu Val Ala Lys Val Glu Glu Ile Phe
130 135 140
Lys Asp Ser Val Asn Tyr Gly Asp Val Lys Val Pro Ile Ala Met
145 150 155
(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer BU (NcoI)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
CGCCATGGGA AATTCTTTCT TGCAAGATGT 30
(2) INFORMATION FOR SEQ ID NO: 46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer BV (BamHI)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
GCGGATCCCA TGGCTATAGG GACTTTAACA 30
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 984 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
AAGCTTCTGG GGTTGAAAAA GGTTCTGACA ACCCGCTCAA AAATAAGATC GCAAAGCTCA 60
CCCACAAGCA AGTGGAAGAG ATTGCGCAAT TGAAAATGGA AGATTTAAAC ACAAGCACCA 120
TGGAAGCGGC CAAAAAAATC GTTATGGGCA GCGCTAGGAG CATGGGCGTA GAAGTTGTGG 180
ATTGATTGGG TTTTGTTGGA ATTGAAAGAA ATTTTTAAGG ATTAGAATCG TGGCAAAAAA 240
AGTATTTAAA AGATTGGAAA AACTTTTTTC TAAAATTCAA AACGATAAAG CGTATGGCGT 300
AGAGCAGGGC GTAGAGGTGG TTAAATCCCT CGCTTCAGCC AAATTTGATG AAACCGTGGA 360
AGTAGCGTTA AGGCTAGGGG TTGATCCAAG GCATGCGGAT CAAATGGTGC GCGGTGCGGT 420
GGTGCTTCCT CATGGAACAG GGAAAAAAGT AAGAGTGGCC GTTTTTGCAA AAGACATCAA 480
GCAAGATGAA GCCAAGAACG CTGGGGCTGA TGTCGTTGGC GGAGATGATT TGGCTGAAGA 540
AATCAAAAAT GGTCGCATTG ATTTTGACAT GGTGATTGCA ACGCCTGATA TGATGGCGGT 600
TGTCGGTAAA GTGGGTAGGA TTTTAGGCCC TAAGGGTTTG ATGCCAAACC CTAAAACCGG 660
AACCGTTACG ATGGATATTG CTAAAGCGGT TACTAACGCT AAAAGCGGTC AAGTGAATTT 720
CAGGGTGGAT AAAAAGGGCA ATGTTCATCC CCCTATTGGT AAGGCGAGTT TTCCTGAAGA 780
AAAAATCAAA GAAAACATGC TTGAGTTGGT TAAAACGATC AACCGCCTAA AACCCAGTAG 840
TGCGAAAGGC AAGTATATTA GAAACGCCGC TCTTTCGCTC ACCATGTCGC CTTCAGTGAG 900
TTTGGACGCG CAAGAATTGA TGGATGTTAA ATAGCGTTAG GAGTTTTTAA TCTTAGGCTG 960
AAGATCGTAA GAGCTAAAAA GCTT 984
(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 234 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
Val Ala Lys Lys Val Phe Lys Arg Leu Glu Lys Leu Phe Ser Lys Ile
1 5 10 15
Gln Asn Asp Lys Ala Tyr Gly Val Glu Gln Gly Val Glu Val Val Lys
20 25 30
Ser Leu Ala Ser Ala Lys Phe Asp Glu Thr Val Glu Val Ala Leu Arg
35 40 45
Leu Gly Val Asp Pro Arg His Ala Asp Gln Met Val Arg Gly Ala Val
50 55 60
Val Leu Pro His Gly Thr Gly Lys Lys Val Arg Val Ala Val Phe Ala
65 70 75 80
Lys Asp Ile Lys Gln Asp Glu Ala Lys Asn Ala Gly Ala Asp Val Val
85 90 95
Gly Gly Asp Asp Leu Ala Glu Glu Ile Lys Asn Gly Arg Ile Asp Phe
100 105 110
Asp Met Val Ile Ala Thr Pro Asp Met Met Ala Val Val Gly Lys Val
115 120 125
Gly Arg Ile Leu Gly Pro Lys Gly Leu Met Pro Asn Pro Lys Thr Gly
130 135 140
Thr Val Thr Met Asp Ile Ala Lys Ala Val Thr Asn Ala Lys Ser Gly
145 150 155 160
Gln Val Asn Phe Arg Val Asp Lys Lys Gly Asn Val His Pro Pro Ile
165 170 175
Gly Lys Ala Ser Phe Pro Glu Glu Lys Ile Lys Glu Asn Met Leu Glu
180 185 190
Leu Val Lys Thr Ile Asn Arg Leu Lys Pro Ser Ser Ala Lys Gly Lys
195 200 205
Tyr Ile Arg Asn Ala Ala Leu Ser Leu Thr Met Ser Pro Ser Val Ser
210 215 220
Leu Asp Ala Gln Glu Leu Met Asp Val Lys
225 230
(2) INFORMATION FOR SEQ ID NO: 49:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: GD72 (forward primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GGCCATGGTG GCAAAAAAAG TATTTAAAAG A 31
(2) INFORMATION FOR SEQ ID NO: 50:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: GD73 (reverse primer)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CCGGATCCTT TAACATCCAT CAATTCTTG 29
(2) INFORMATION FOR SEQ ID NO: 51: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 400 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: B2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
ACAAAGAGCA AGATTTCACT CAAGCTAAGA AATATTTTGA GAAAGCGTGC GATTTGAAAG 60
AAAATAGCGG GTGTTTTAAT TTAGGGGTGC TTTATTATCA AGGGCAAGGG GTGGAAAAGA 120
ACTTGAAAAA AGCCGCCTCC TTTTACGCTA AAGCTTGCGA TTTGAATTAC AGCAATGGGT 180 GTCATTTGCT AGGGAATTTA TATTACAGTG GGCAAGGCGT CTCCCCACAC ACCAATAAAG 240
CCTTACAATA CTACTCTAAA GCGTGCGATT TGAAATACTC TGAAGGGTGC GCGAGCTTAG 300
GGGGGATTTA TCATGATGGT GGAAAATGGT ACACTAGGGA TTTTAAAAAC AGCGGTGGAA 360
TATTTCACTA AAGCGTGCGA TTTAATCGAT GGCGATGGTT 400
(2) INFORMATION FOR SEQ ID NO: 52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: GD77 (for 5' extension)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
TTCTTTCAAA TCGCACGCTT TC 22
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: GD80 (for 3' extension)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
GCGAGCTTAG GGGGGATTTA TC 22
(2) INFORMATION FOR SEQ ID NO: 54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1204 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B2 (extension clone)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
AAGCTTTTAA ACCCCTTTTT AATGCCCCTT TTAAAGGGTG AAATTTTGCC CTCCTGGGGG 60
GATTAGCGCA GATAACATGC GTTCTTATTT AAATTTAGAA AATGTTTTGT GCGTGGGGGG 120 GAGCTGGCTT ACCCCTAAAG ATTTGATNCC AAACNAAGAG TGGGATAAAA TCACAGAAAT 180
TTGTAAGAGA GCGCTAACTT TAAGATACCC AAAAGATCAT GGCATTTTGT ATTTGCTTAA 240
TAACACTATA AAAAAANTTT TAATTAGGAG ATACATCATG TTAGAAAATG TCCAAAAATC 300 CCTTTTTAGG GTTTTGTGCT TGGGAGCGTT GTGTTTAGGG GGGCTAATGG CAGAGCCAGA 360
CCCTAAAGAG CTTGTGGGTT TGGGCGCGAA GAGTTACAAA GAGCAAGATT TCACTCAAGC 420
TAAGAAATAT TTTGAGAAAG CGTGCGATTT GAAAGAAAAT AGCGGGTGTT TTAATTTAGG 480
GGTGCTTTAT TATCAAGGGC AAGGGGTGGA AAAGAACTTG AAAAAAGCCG CCTCCTTTTA 540 CGCTAAAGCT TGCGATTTGA ATTACAGCAA TGGGTGTCAT TTGCTAGGGA ATTTATATTA 600
CAGTGGGCAA GGCGTCTCCC CACACACCAA TAAAGCCTTA CAATACTACT CTAAAGCGTG 660
CGATTTGAAA TACTCTGAAG GGTGCGCGAG CTTAGGGGGG ATTTATCATG ATGGTGGAAA 720
ATGGTACACT AGGGATTTTA AAAAAGCGGT GGAATATTTC ACTAAAGCGT GCGATTTAAA 780
CGATGGCGAT GGTTGCACGA TATTAGGGAG CTTGCTTGAT GCAGGCAGAG GCACGCCTAA 840 GGATTTGAAA AAGGCGCTCG CTTCGTTTGA TAAAGCTTGC GACTTAAAAG ACAGCCCGGG 900
GTGTTTTAAC GCAGGGAATA TGTATCATCA TGGCGATGGC GTGGCGAAGA ATTTTAAAGA 960
GGCTCTCGAT CGTTATTCTA AAGCATGCGA GATGCAAAAC GGCGGAGGGT GTTTCAATTT 1020
AGGGGCTATG CAATACAATG GCGAAGGTGC AACAAGGAAT GAAAAGCAAG CCATAGAAAA 1080
CTTTAAAAAA GGCTGTAAAT TGGGCGCTAA AGGGGCATGC GATATTCTCA AGCAGGTCAA 1140 AATCAAAGTT TAGTTTGGAT TAAGGTTGAN CAAGCGGTTT AAAAAGCGGC TTTTNAGGGT 1200
GTTA 1204
(2) INFORMATION FOR SEQ ID NO: 55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 291 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: B2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
Met Leu Glu Asn Val Gln Lys Ser Leu Phe Arg Val Leu Cys Leu Gly
1 5 10 15
Ala Leu Cys Leu Gly Gly Leu Met Ala Glu Pro Asp Pro Lys Glu Leu
20 25 30
Val Gly Leu Gly Ala Lys Ser Tyr Lys Glu Gln Asp Phe Thr Gln Ala
35 40 45
Lys Lys Tyr Phe Glu Lys Ala Cys Asp Leu Lys Glu Asn Ser Gly Cys
50 55 60
Phe Asn Leu Gly Val Leu Tyr Tyr Gln Gly Gln Gly Val Glu Lys Asn
65 70 75 80
Leu Lys Lys Ala Ala Ser Phe Tyr Ala Lys Ala Cys Asp Leu Asn Tyr
85 90 95
Ser Asn Gly Cys His Leu Leu Gly Asn Leu Tyr Tyr Ser Gly Gln Gly
100 105 110
Val Ser Pro His Thr Asn Lys Ala Leu Gln Tyr Tyr Ser Lys Ala Cys
115 120 125
Asp Leu Lys Tyr Ser Glu Gly Cys Ala Ser Leu Gly Gly Ile Tyr His
130 135 140
Asp Gly Gly Lys Trp Tyr Thr Arg Asp Phe Lys Lys Ala Val Glu Tyr
145 150 155 160
Phe Thr Lys Ala Cys Asp Leu Asn Asp Gly Asp Gly Cys Thr Ile Leu
165 170 175
Gly Ser Leu Tyr Asp Ala Gly Arg Gly Thr Pro Lys Asp Leu Lys Lys
180 185 190
Ala Leu Ala Ser Phe Asp Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly
195 200 205
Cys Phe Asn Ala Gly Asn Met Tyr His His Gly Asp Gly Val Ala Lys
210 215 220
Asn Phe Lys Glu Ala Leu Asp Arg Tyr Ser Lys Ala Cys Glu Met Gln
225 230 235 240
Asn Gly Gly Gly Cys Phe Asn Leu Gly Ala Met Gln Tyr Asn Gly Glu
245 250 255
Gly Ala Thr Arg Asn Glu Lys Gln Ala Ile Glu Asn Phe Lys Lys Gly
260 265 270
Cys Lys Leu Gly Ala Lys Gly Ala Cys Asp Ile Leu Lys Gln Val Lys
275 280 285
Ile Lys Val
290
(2) INFORMATION FOR SEQ ID NO: 56:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: GD96 (forward primer, NcoI cloning site)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
GGCCATGGAG CCAGACCCTA AAGAGC 26
(2) INFORMATION FOR SEQ ID NO: 57:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 32 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: GD97 (reverse primer, BamHI cloning site)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
GCGGATCCAA CTTTGATTTT GACCTGCTTG AG 32
(2) INFORMATION FOR SEQ ID NO: 58:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1795 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: d5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
GAATTCGCGC AAAAAGCGAA TTTGAATCTG GCTGATGTGA TTAAAACCCT CTTTAATTTA 60 GGGCTTATGG TAACTAAAAA CGACTTTTTG GATAAGGATA GTATAGAAAT TTTAGCCGAA 120
GAGTTCCATT TAGAAATTTC TGTTCAAAAC ACTTTAGAAG AATTTGAAGT GGAAGAAGTG 180
CTAGAGGGGG TGAAAAAAGA GCGCCCGCCT GTGGTTACTA TCATGGGGCA TGTTGATCAT 240
GGTAAAACTT CGCTATTGGA TAAAATCCGT GATAAAAGAG TCGCTCACAC GGAAGCTGGG 300
GGGATCACTC AGCACATTGG CGCTTACATG GTAGAAAAGA ATGATAAGTG GGTGTCTTTC 360 ATTGACACCC CAGGGCATGA AGCCTTTAGC CAGATGCGTA ATCGTGGGGC TCAAGTTACA 420
GATATTGCAG TGATTGTGAT AGCGGCTGAT GATGGCGTGA AGCAACAGAC TATTGAAGCG 480
TTAGAGCATG CAAAGGCCGC TAATGTGCCT GTGATTTTTG CGATGAATAA AATGGATAAG 540
CCTAATGTGA ATCCGGACAA ACTCAAAGCC GAATGCGCTG AGCTTGGCTA TAACCCTGTG 600
GATTGGGGCG GAGAGCATGA GTTTATCCCT GTTTCGGCTA AAACGGGCGA TGGCATTGAC 660 AATTTATTAG ACACCATTCT TATCCAAGCG GATATTATGG AATTGAAAGC CATAGAAGAG 720
GGCAGCGCTA GAGCGGTTGT TTTAGAAGGA AGCGTGGAAA AAGGGCGTGG GGCAGTGGCC 780
ACTGTGATTG TCCAAAGCGG GACTTTGAGC GTGGGGGATA GTTTTTTTGT CGAAACCGCG 840
TTTGGTAAAG TAAGAACGAT GACTGATGAT CAAGGCAAGA GCATTCAAAA TTTAAAACCC 900
TCTATGGTGG CTCTCATCAC AGGCTTGAGC GAAGTGCCTC CTGCGGGATC TGTTTTAATA 960 GGGGTAGAAA ACGATTCTAT CGCGCGCTTG CAAGCTCAAA AGAGGGCGAC TTATTTGCGC 1020
CAAAAAGCGT TGAGTAAAAG CACTAAAGTG TCTTTTGATG AGCTTTCAGA AATGGTCGTT 1080
AATAAGGAAT TGAAAAACAT TCCTGTAGTC ATTAAAGCGG ACACGCAAGG AAGCTTAGAA 1140
GCCATTAAAA ACAGCCTGTT GGAGCTTAAT AACGAAGAAG TGGCGATTCA AGTGATCCAC 1200
TCAGGGGTGG GGGGCATTAC TGAGAATGAT TTAAGCCTAG TCTCTAGCAG TGAGCATGCC 1260 GTGATTTTAG GCTTTAATAT CCGCCCCACC GGTAATGTGA AAAATAAGGC TAAAGAATAC 1320
AATGTGAGCA TTAAAACTTA CACGGTGATT TATGCCTTGA TTGAGGAAAT GCGATCGCTG 1380
TTATTAGGCT TGATGAGTCC TATTATTGAA GAAGAGCATA CTGGGCAAGC GGAAGTGAGA 1440 GAAACCTTTA ATATCCCTAA AGTTGGCATG ATAGCCGGGT GTGTGGTGAG CGATGGGGTG 1500
ATCGCTCGTG GCATTAAGGC GCGTTTGATT AGAGATGGCG TGGTGGTTCA TACCGGTGAA 1560
ATCCTTTCTT TGAAACGCTT TAAAAATGAT GTGAAAGAAG TTTCTAAGGG CTATGAGTGT 1620
GGGATCATGC TAGACAATTA TAACGAAATT AAAGTGGGCG ATGTGTTTGA AACCTATAAA 1680 GAAATCCATA AAAAAAGAAC CCTTTAATGA ACGCTCATAA AGAACGCTTA GAATCCAATC 1740
TTTTAGAATT ACTACAAGAG GCTTTAGCGA GTTTGAACGA CAGTGAGTTG AATTC 1795
(2) INFORMATION FOR SEQ ID NO: 59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 568 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: d5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
Glu Phe Ala Gln Lys Ala Asn Leu Asn Leu Ala Asp Val Ile Lys Thr
1 5 10 15
Leu Phe Asn Leu Gly Leu Met Val Thr Lys Asn Asp Phe Leu Asp Lys
20 25 30
Asp Ser Ile Glu Ile Leu Ala Glu Glu Phe His Leu Glu Ile Ser Val
35 40 45
Gln Asn Thr Leu Glu Glu Phe Glu Val Glu Glu Val Leu Glu Gly Val
50 55 60
Lys Lys Glu Arg Pro Pro Val Val Thr Ile Met Gly His Val Asp His
65 70 75 80
Gly Lys Thr Ser Leu Leu Asp Lys Ile Arg Asp Lys Arg Val Ala His
85 90 95
Thr Glu Ala Gly Gly Ile Thr Gln His Ile Gly Ala Tyr Met Val Glu
100 105 110
Lys Asn Asp Lys Trp Val Ser Phe Ile Asp Thr Pro Gly His Glu Ala
115 120 125
Phe Ser Gln Met Arg Asn Arg Gly Ala Gln Val Thr Asp Ile Ala Val
130 135 140
Ile Val Ile Ala Ala Asp Asp Gly Val Lys Gln Gln Thr Ile Glu Ala
145 150 155 160
Leu Glu His Ala Lys Ala Ala Asn Val Pro Val Ile Phe Ala Met Asn
165 170 175
Lys Met Asp Lys Pro Asn Val Asn Pro Asp Lys Leu Lys Ala Glu Cys
180 185 190
Ala Glu Leu Gly Tyr Asn Pro Val Asp Trp Gly Gly Glu His Glu Phe
195 200 205
Ile Pro Val Ser Ala Lys Thr Gly Asp Gly Ile Asp Asn Leu Leu Asp
210 215 220
Thr Ile Leu Ile Gln Ala Asp Ile Met Glu Leu Lys Ala Ile Glu Glu
225 230 235 240
Gly Ser Ala Arg Ala Val Val Leu Glu Gly Ser Val Glu Lys Gly Arg
245 250 255
Gly Ala Val Ala Thr Val Ile Val Gln Ser Gly Thr Leu Ser Val Gly
260 265 270
Asp Ser Phe Phe Val Glu Thr Ala Phe Gly Lys Val Arg Thr Met Thr
275 280 285
Asp Asp Gln Gly Lys Ser Ile Gln Asn Leu Lys Pro Ser Met Val Ala
290 295 300
Leu Ile Thr Gly Leu Ser Glu Val Pro Pro Ala Gly Ser Val Leu Ile
305 310 315 320
Gly Val Glu Asn Asp Ser Ile Ala Arg Leu Gln Ala Gln Lys Arg Ala
325 330 335
Thr Tyr Leu Arg Gln Lys Ala Leu Ser Lys Ser Thr Lys Val Ser Phe
340 345 350
Asp Glu Leu Ser Glu Met Val Val Asn Lys Glu Leu Lys Asn Ile Pro
355 360 365
Val Val Ile Lys Ala Asp Thr Gln Gly Ser Leu Glu Ala Ile Lys Asn
370 375 380
Ser Leu Leu Glu Leu Asn Asn Glu Glu Val Ala Ile Gln Val Ile His
385 390 395 400
Ser Gly Val Gly Gly Ile Thr Glu Asn Asp Leu Ser Leu Val Ser Ser
405 410 415
Ser Glu His Ala Val Ile Leu Gly Phe Asn Ile Arg Pro Thr Gly Asn
420 425 430
Val Lys Asn Lys Ala Lys Glu Tyr Asn Val Ser Ile Lys Thr Tyr Thr
435 440 445
Val Ile Tyr Ala Leu Ile Glu Glu Met Arg Ser Leu Leu Leu Gly Leu
450 455 460
Met Ser Pro Ile Ile Glu Glu Glu His Thr Gly Gln Ala Glu Val Arg
465 470 475 480
Glu Thr Phe Asn Ile Pro Lys Val Gly Met Ile Ala Gly Cys Val Val
485 490 495
Ser Asp Gly Val Ile Ala Arg Gly Ile Lys Ala Arg Leu Ile Arg Asp
500 505 510
Gly Val Val Val His Thr Gly Glu Ile Leu Ser Leu Lys Arg Phe Lys
515 520 525
Asn Asp Val Lys Glu Val Ser Lys Gly Tyr Glu Cys Gly Ile Met Leu
530 535 540
Asp Asn Tyr Asn Glu Ile Lys Val Gly Asp Val Phe Glu Thr Tyr Lys
545 550 555 560
Glu Ile His Lys Lys Arg Thr Leu
565
(2) INFORMATION FOR SEQ ID NO: 60:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1117 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y104-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
GAATTCGCGG CCCGCTCGGG CGAACATTTT TACCCACTTA CCCACTTTGA ACGGAATGAA 60 TAACTTTAAG CATTTCTTAC CTCTCAATGT GAGTTTTCTG CAGTCATGAT AGCTGATTTT 120
GTTTTAAATT TGCTATAATG TAAATTTAAT GATGAAAATT AGTTTAGAGT GGAGAACACA 180
CAATGAAAAA AAATATCTTA AATTTAGCGT TAGTGGGCGC GTTGAGTGCG TCGTTTTTGA 240
TGGCTAAGCC GGCTCATAAC GCAGATAACG CTACGCATAA CACCAAAAAA ACGACTGATT 300
CTTCACCCGG CGTGTTAGCG ACAGTGGATG GCAGACCTAT CACTAAAAGC GATTTTGATA 360 TGATTAAGCA ACGAAATCCT AATTTTGATT TTGACAAGCT TAAAGAGAAA GAAAAAGAAG 420
CCTTGATTGA GCAAGCTATC CGCACCGCAC TTGTAGAAAA TGAGGCTAAG GCAGAAAAGC 480
TCGATCAGAC TCCAGAATTT AAAGCGATGA TGGAAGCGGT TAAAAAACAG GCTTTAGTGG 540
AATTTTGGGC TAAAAAACAG GCTGAAGAAG TGAAAAAAGT CCAAATCCCA GAAAAAGAAA 600
TGCAAGATTT TTACAACGCT AATAAAGATC AGCTTTTTGT CAAGCAAGAA GCCCATGCTA 660 GGCATATTTT AGTGAAAACC GAAGATGAGG CTAAACGGAT TATTTCTGAG ATTGACAAAC 720
AGCCAAAGGC TAAAAAAGAA GCCAAATTCA TTGAGTTAGC CAATCGGGAT ACGATTGATC 780
CTAACAGCAA GAACGCGCAA AATGGCGGTG ATTTGGGGAA ATTCCAAAAG AACCAAATGG 840
CTCCGGATTT TTCTAAAGCC GCTTTCGCTT TAACTTCTGG GGATTACACT AAAACCCCTG 900
TTAAAACAGA GTTTGGTTAT CATATTATCT ATTTGATTTC TAAAGATAGC CCTGTAACTT 960 ATACTTATGA GCAAGCTAAA CCTACCATTA AGGGGATGTT ACAAGAAAAG CTTTTCCAAG 1020
AACGCATGAA TCAACGCATT GAGGAATTAA GGAAGCACGC TAAAATTGTT CTCAACAAGT 1080
AGATGAGGTG TTATCATGTT CGAGCGGCCG CGAATTC 1117
(2) INFORMATION FOR SEQ ID NO: 61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 299 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y104-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
Met Lys Lys Asn Ile Leu Asn Leu Ala Leu Val Gly Ala Leu Ser Ala
1 5 10 15
Ser Phe Leu Met Ala Lys Pro Ala His Asn Ala Asp Asn Ala Thr His
20 25 30
Asn Thr Lys Lys Thr Thr Asp Ser Ser Pro Gly Val Leu Ala Thr Val
35 40 45
Asp Gly Arg Pro Ile Thr Lys Ser Asp Phe Asp Met Ile Lys Gln Arg
50 55 60
Asn Pro Asn Phe Asp Phe Asp Lys Leu Lys Glu Lys Glu Lys Glu Ala
65 70 75 80
Leu Ile Glu Gln Ala Ile Arg Thr Ala Leu Val Glu Asn Glu Ala Lys
85 90 95
Ala Glu Lys Leu Asp Gln Thr Pro Glu Phe Lys Ala Met Met Glu Ala
100 105 110
Val Lys Lys Gln Ala Leu Val Glu Phe Trp Ala Lys Lys Gln Ala Glu
115 120 125
Glu Val Lys Lys Val Gln Ile Pro Glu Lys Glu Met Gln Asp Phe Tyr
130 135 140
Asn Ala Asn Lys Asp Gln Leu Phe Val Lys Gln Glu Ala His Ala Arg
145 150 155 160
His Ile Leu Val Lys Thr Glu Asp Glu Ala Lys Arg Ile Ile Ser Glu
165 170 175
Ile Asp Lys Gln Pro Lys Ala Lys Lys Glu Ala Lys Phe Ile Glu Leu
180 185 190
Ala Asn Arg Asp Thr Ile Asp Pro Asn Ser Lys Asn Ala Gln Asn Gly
195 200 205
Gly Asp Leu Gly Lys Phe Gln Lys Asn Gln Met Ala Pro Asp Phe Ser
210 215 220
Lys Ala Ala Phe Ala Leu Thr Ser Gly Asp Tyr Thr Lys Thr Pro Val
225 230 235 240
Lys Thr Glu Phe Gly Tyr His Ile Ile Tyr Leu Ile Ser Lys Asp Ser
245 250 255
Pro Val Thr Tyr Thr Tyr Glu Gln Ala Lys Pro Thr Ile Lys Gly Met
260 265 270
Leu Gln Glu Lys Leu Phe Gln Glu Arg Met Asn Gln Arg Ile Glu Glu
275 280 285
Leu Arg Lys His Ala Lys Ile Val Leu Asn Lys
290 295
(2) INFORMATION FOR SEQ ID NO: 62:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: primer CO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
CGCCATGGAT AATAAGACTG TGGCTGGC 28
(2) INFORMATION FOR SEQ ID NO: 63:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: top strand of AB SISPA linker
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
GGAATTCGCG GCCGCTCG 18
(2) INFORMATION FOR SEQ ID NO: 64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: bottom strand of AB SISPA linker
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
CGAGCGGCCG CGAATTCCTT 20
(2) INFORMATION FOR SEQ ID NO: 65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: gt11.F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
CACATGGCTG AATATCGACG 20
(2) INFORMATION FOR SEQ ID NO: 66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: gt11R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
GGCAGACATG GCCTGCCCGG 20
(2) INFORMATION FOR SEQ ID NO: 67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
AATTAACCCT CACTAAAGGG 20
(2) INFORMATION FOR SEQ ID NO: 68:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
GTAATACGAC TCACTATAGG GC 22
(2) INFORMATION FOR SEQ ID NO: 69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y104A (N-term of Y104-1)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
Met Lys Lys Asn Ile Leu Asn Leu Ala Leu Val Gly Ala Leu Ser Ala
1 5 10 15
Ser Phe Leu Met Ala Lys Pro Ala His Asn Ala Asp Asn Ala Thr His
20 25 30
Asn Thr Lys Lys Thr Thr Asp Ser Ser Pro Gly Val Leu Ala Thr Val
35 40 45
Asp Gly Arg Pro Ile Thr Lys Ser Asp Phe Asp Met Ile Lys Gln Arg
50 55 60
Asn Pro Asn Phe Asp Phe Asp Lys Leu Lys Glu Lys Glu Lys Glu Ala
65 70 75 80
Leu Ile Glu Gln Ala Ile Arg Thr Ala Leu Val Glu Asn Glu Ala Lys
85 90 95
Ala Glu Lys Leu Asp Gln
100
(2) INFORMATION FOR SEQ ID NO: 70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y103-1/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
TGGGGGTGAT TATCATGCTG CTTATGGGGA ATAAGGAAGA ATCTAAAGAA AACGCTTCTA 60
AAAACACCCA AGAAGTTCAA GCTAATCCTA TGGCGAACAA GAATCAAGAG GCTAAAGAAG 120 GCTCTAATAT CCAGCAATAT TTGGTGCTTG GGCCTTTGTA TGCGATTGAT GCGCCTTTTG 180
CGGTGAATTT AGTCTCTCAA AATGGCAGAC GCTACCTTAA GGCTTCTATT TCGTTAGAAT 240
TGAGTAA 247
(2) INFORMATION FOR SEQ ID NO: 71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 218 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y103-1/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: TCAAAAAATA CGAGACAACA TGCGTTTTTC TTCGTTAAAA GAGCTTAAAA ATCAGATCCA 60
ACAAGACATC TTAAGAGCCA AAGAGATTTT GAGATAATTT GTGTTAAAAT GACTCTCAAA 120
AACCTTAAAA ACGGAAAAAT TTGATGCGAT TGAGTAACGC TGACTTAGAA CGATTAAAAA 180
GCATGGCGAA TGCGCTTCGC TTTTTGTGCG CAGACATG 218
(2) INFORMATION FOR SEQ ID NO: 72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 822 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y103-2.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
CAAAGAAAAT TTAAAGCTCA TTAGTCTAAT CACACCTAAA ATCTCTAATT TAGAGATTTA 60
CTTACGCAAC GCACTAGATT ATTGCCTAAC TCAAATTAAA GGGAATGAGT GGGTGTTTGA 120 TGAAGTTTCT TTAATCCCCT TAATAGAAGA ATTGAAAGAC AAGAAAAAAG AAATCACGCA 180
TTCTTTGGTC TTGTCTAAAA TGTCTTTAGA AGCGGTGATT AAGCTTATCT TTTTTTACAA 240
ATTAGAGGGG GTAGCATTAG ATTTGAGAGC CTATAGTCTT AAGGCTTATT ACAAAGATAA 300
TAAGGATACT TTGCTTATTA AGGGTAGAAA ACAACATCTT TCTAATTATG CTAAAGCCTA 360
TATTGCTTTA AACTTACTAT GGACAATTAG AAATCGTGCG TATCATTGGG AAAATTTACT 420 AAAGCTAAGG GCAAACAACC GCCCACGCAT TACAACACGC TTTATCAGAG AATTAGAAAA 480
GCCTACAAGT AAAAGTTTTA ACTTTGGTAT CATGCCTAAC AAAATTGTTT CATTTTTAGA 540
TGATTTAATC AAAAGTGTTG GAAACAAAGA CTTGGAAAAA CTAAGTAGTC TATAAGCTAT 600
AAGAAAGTGG GCTTCGGTCA GCCCTACGAA TGTAGGCGTA TTATAGCTAA ACAATCATTA 660
TAAGTCAAAC CAAAACCAAC ACAAAATTTG CTAAACTACA ATCAAATCAA TTTAGGGAGA 720 ATAAAAATGT CATTTGCTCC TATGTTATTA GCTACAATCA ATAACTCTAT TGGCAATAAA 780
GATAAGCATG TGAGTTTAGA GTATCTTATA GGGCTTTTTA TG 822
(2) INFORMATION FOR SEQ ID NO: 73: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1082 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y104-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
GGCGAACATT TTTACCCACT TACCCACTTT GAACGGAATG AATAACTTTA AGCATTTCTT 60
ACCTCTCAAT GTGAGTTTTC TGCAGTCATG ATAGCTGATT TTGTTTTAAA TTTGCTATAA 120
TGTAAATTTA ATGATGAAAA TTAGTTTAGA GTGGAGAACA CACAATGAAA AAAAATATCT 180
TAAATTTAGC GTTAGTGGGC GCGTTGAGTG CGTCGTTTTT GATGGCTAAG CCGGCTCATA 240 ACGCAGATAA CGCTACGCAT AACACCAAAA AAACGACTGA TTCTTCACCC GGCGTGTTAG 300
CGACAGTGGA TGGCAGACCT ATCACTAAAA GCGATTTTGA TATGATTAAG CAACGAAATC 360
CTAATTTTGA TTTTGACAAG CTTAAAGAGA AAGAAAAAGA AGCCTTGATT GAGCAAGCTA 420
TCCGCACCGC ACTTGTAGAA AATGAGGCTA AGGCAGAAAA GCTCGATCAG ACTCCAGAAT 480
TTAAAGCGAT GATGGAAGCG GTTAAAAAAC AGGCTTTAGT GGAATTTTGG GCTAAAAAAC 540 AGGCTGAAGA AGTGAAAAAA GTCCAAATCC CAGAAAAAGA AATGCAAGAT TTTTACAACG 600
CTAATAAAGA TCAGCTTTTT GTCAAGCAAG AAGCCCATGC TAGGCATATT TTAGTGAAAA 660
CCGAAGATGA GGCTAAACGG ATTATTTCTG AGATTGACAA ACAGCCAAAG GCTAAAAAAG 720
AAGCCAAATT CATTGAGTTA GCCAATCGGG ATACGATTGA TCCTAACAGC AAGAACGCGC 780
AAAATGGCGG TGATTTGGGG AAATTCCAAA AGAACCAAAT GGCTCCGGAT TTTTCTAAAG 840 CCGCTTTCGC TTTAACTTCT GGGGATTACA CTAAAACCCC TGTTAAAACA GAGTTTGGTT 900
ATCATATTAT CTATTTGATT TCTAAAGATA GCCCTGTAAC TTATACTTAT GAGCAAGCTA 960
AACCTACCAT TAAGGGGATG TTACAAGAAA AGCTTTTCCA AGAACGCATG AATCAACGCA 1020 TTGAGGAATT AAGGAAGCAC GCTAAAATTG TTCTCAACAA GTAGATGAGG TGTTATCATG 1080 TT 1082
(2) INFORMATION FOR SEQ ID NO: 74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 735 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y105.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GTTAACTCCC ATGCGTTAGC CGTGTAAGCG AAATCTTTAT AGCTCAAAGC GCCCACAGGG 60
CATACCGCAA TGCATTCCCC GCAATCATAG CAAGGCACGC TGCCCACAAA AGAAATAATG 120
CCTTTTTGCT TGCGACTCCA CACGCTAAAA GCGTCTTTGG ACATGCTGTC TTTAAATTTA 180
TCCGGAGCAT GCAAGTCGGC TTTAGTGGCT TTGAGGTTGT TTTCGCCCAC ATTGTCCTTG 240
CAAGTGGTTA CGCACCTTTC GCACATGATG CACAAATTAG GGTCATACAA GGCTTTTGCC 300
CAAAAATCCA GCGCTTTAAA ATCATCAGCC ACCGCATAAG GTTGGTGCTC CACGCCGGTT 360
AAATGCGTCA TGTCTTGCAA TTCGCACTCC CCGCTCTTAT CGCACACGCC ACACTCTAAG 420
GGGTGGTTGA CATCATAAGT TTGCATGATG CTTTTTCTTT CATCCATGAG CGTGGGGGTG 480
TTAGTGAGAA TGGTGGCGTT ATTTTTGGCT TTCGTGTTGC AGCTATAAAT GCGTTTGCCA 540
TCCATTTCAA CCATGCACAT TTTGCATGCG ACTGTGGGCG AGCAACCGCT TAAATAGCAA 600
ATAGTAGGGA TGTAGATCCC AGCACTCCTA GCAGCCTCTA AAACGCTTTG TCCCTCTTGG 660
CATTCAATCA TTTTGCCATT GATATTCATT GTGATCATGA AATTCCTTGA GAGTGGGATA 720
AAATAGGGAT ATAAG 735
(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 284 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y107-1.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
CAAATCTTGC AAGCCCACGC CCAGGTAACC TCTTTCAATC TTACCGGTTT TGATGAGTTG 60 GGTTACAATA TCTTTAACCA TGTTAGAAGG GATGGCAAAG CCAATGCCGT GGTTGCCCCC 120
AGTTTTAGAG ATGATAGCGG TATTAATCCC CACTAACCCT CCACGGCTAT CAATTAAAGC 180
GCCGCCGGAA TTTCCAGGAT TAATAGAGGC GTCTGTTTGA ATGAAATTCT CATAGCTGTT 240
GATCCCAATC CCGCTTTTAT TGAGCGCTGA AACAATGCCT TGAG 284
(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 308 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y107-1.T7
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 21, 84, 85
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
GAATTCTATC TCATCGGTGT NTTCTTCTAA AAGCTCTCTG GCTTGCTCAT CAATGGCGTT 60
TTCTTTTTTA ACGCTTTTTT CCANNATCCC TTCTATCAAT TCACACAATA ATTCTCTAGG 120
GGCTTTTAAT TCTAATAGTT TTGAATGGAT AAAGTCATTG GCAATCTTAT GGCTTATATG 180
GTTTATATGG GTTAGTTTGA GTCTCATGAT TTTTCTCGCA TATTTTATTT ATTTTTAGAG 240 TGTATTTTAT CGTTTTTTAG ATAAAACGCT TTTTAAAGAA TTAATTTAAT GGCGCACCAA 300
ACTATAGG 308
(2) INFORMATION FOR SEQ ID NO: 77: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 780 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y108-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
CAATCGCTCG CATAAGCATT GGGCCTCTCC AAATGAGACT CTGCCCCTCA TCATACAAAA 60
GCCCCATGCT CATCACAGAA ACGCCAAAAG CTTTTAAAGG AATGAGTTTT TTACCGCTAG 120
GATCCATGAT CACATCAGCG TTTTGTAAGC CCATCATTCT AGGGATATTA GGGCCATACA 180
CATCAGCGCC TAGTAACCCC ACTTTTTGGT TTAAATTCGC TAAAGCGATG CTTAAATTCA 240 CGCTGGTGGT GCTTTTACCC ACACCGCCCT TACCTGAGCT TATCATCACT ACATGCTTGA 300
TGTTTTTAGC CAAATTTTTA GTAGTGGGCT TTGGAGCTTG CGGCTTGGCG GTTTATATCA 360
ATTCAAGCTT TCACGCTATT TCTGCACGCT TCAGAGATAT TTTCCCTTAA ATCGCGCTCG 420
TTTCTTCAGA GCTTGAGGGG ATTTCTATTA AAAGCCCTAA TTGATTGTCA TGCAAAGCGA 480
TGCTTTTAAC AAAACCAAAG CTGACAATAT CCTTTTCAAA ATTAGGGTAA ATGATCGTTT 540 TTAACGCGTT TAAGACATCT TCTTGGGTGA GCATTTTAAT CCTTAAAATG ATAGTATTTG 600
AATTTTCTTT ATAGCATGTT TTGTTTAATT TTGCATTATA AATTAAATTT GAGTTTTCAA 660
AAAGGATATG GCATGAAATT TTATAAGCGT GTTTTGAAAT TGCACCATTT ACCGAATTTG 720
GGTAGAAACT CGCCTATGGA GTTGCTTTTA AATTCAAGTT TTGAAAAACA TGGAGGACTC 780
(2) INFORMATION FOR SEQ ID NO: 78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y109-1/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
AGGTTCGTAT CGCTACAGGC GCGTTAATCA CGGCTTCTGG GGATATTAGC TTGACTTTTA 60
AACAAGTGGA TGGCGTGAAT GATGTAACTT TAGAGAGCGT AAAAGTTTCT AGTTCAGCAG 120
GCACGGGGAT CGGTGTGTTA GCGGAAGTGA TTAACAAAAA ATTCTAACCG AACAGGGGTT 180
AAAAGCTTAT GCGAGCGTTA T 201
(2) INFORMATION FOR SEQ ID NO: 79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 229 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y109-1/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
CCACTGGCTG ATTTGACATT AGCGTTAAAA TTCCCAGTAA CATCACGCAA ATTCACCGTG 60 GTTTCTGCCA CTTGAGATTC CCCAAAACCA ATCGCTGTGA AACCTAAATG TTGCGAATCA 120 GGAGCCGAAA CGACATTGAT GCTTTTAGCG TCTAAGCGTG TGAGAGAAAG CCTCCCATAG 180
TTAGTAGAAC CTTTTGTTAA ATCCTGACCG CCATTGACCA TCGTTAAAG 229
(2) INFORMATION FOR SEQ ID NO: 80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y111.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
GCTCGAAATC TTGAACGAAT GCCTAAATTT AGTCATCGCT CACCCCAAAA ATAACGCTTT 60
AGAATGGCTG ATTGAATGGG TTAAGGGTCA TTATTTACCT AATAATGCTA TAAACCATTC 120
GCCAATAGGC ACAAAAAATT AAAAACAGAG AAAACATGAT AACGATGGAT GCGATTCAAT 180 GGCCTAAGAA ATGGATTTTA GGAGAGACTG AT GTTTCGT GTCTAATGAA GTCATTGTCA 240
AAGGTTTGGA TTTTAAAAAA GTGGTGCAGC ATTTAATGGC GTATTTGTGC TAATAAGTGA 300
GGGTTAAATG AAAATAGGAT GGATTGGACT TGGGGCTATG GGGACTCCTA TGGCGACTCG 360
CGAGCCGGCC GCGAATTCGC GGCCGCTCGC TATCAATAGT GAAGTCCTGC AAACCTCAAA 420
ATTGGGCTAT CGCACTTTGA AATTTTTCCC GGCAGATATT GCGGGGGCGT CAAGCTTTTA 480 AACGCTTTTA ATGCCCCTTT TAAACGGGTG AAATTTTGCC CTCCTGGGGG GATTAGCGCA 540
GATAACATGC GTTCTTATTT AAATTTAGAA AATGTTTTGT GCGTGGGGGG GAGCTGGCTT 600
ACCCCTAAAG ATTTGATNCC AAACNAAGAG TGGGATAAAA TCACAGAAAT TTGTAAGAGA 660
GCGCTAACTT TAAGATACCC AAAAGATCAT GGCATTTTGT ATTTGCTTAA TAACACTATA 720
AAAAAANTTT TAATTAGGAG ATACATCATG TTAGAAAATG TCCAAAAATC CCTTTTTAGG 780 GTTTTGTGCT TGGGAGCGTT GTGTTTAGGG GGGCTAATGG CAGAGCCAGA CCCTAAAGAG 840
CTTGTGGGTT TGGGCGCGAA GAGTTACAAA GAGCAAGATT TCACTCAAGC TAAGAAATAT 900
TTTGAGAAAG CGTGCGATTT GAAAGAAAAT AGCGGGTGTT TTAATTTAGG GGTGCTTTAT 960
TATCAAGGGC AAGGGGTGGA AAAGAACTTG AAAAAAGCCG CCTCCTTTTA CGCTAAAGCT 1020
TGCGATTTGA ATTACAGCAA TGGGTGTCAT TTGCTAGGGA ATTTATATTA CAGTGGGCAA 1080 GGCGTGTCCC AAAACACCAA TAAAGCCTTA CAATACTACT CTAAAGCGTG CGATTTGAAA 1140
TACGCTGAAG GGTGCGCGAG CTTAGGGGGG ATTTATCATG ATGGTAAAAT GGTAACTAGG 1200
GATTTTAAAA AAGCGGTGGA ATATTTCACT AAAGCGTGCG ATTTAAACGA TGGCGATGGT 1260
TGCACGATAT TAGGGAGCTT GTATGATGCA GGCAGAGGCA CGCCTAAGGA TTTGAAAAAG 1320
GCGCTCGCTT CGTATGATAA AGCTTGCGAC TTAAAAGACA GCCCGGGGTG TTTTAACGCA 1380 GGGAATATGT ATCATCATGG CGATGGCGTG GCGAAGAATT TTAAAGAGGC TCTCGATCGT 1440
TATTCTAAAG CATGCGAGAT GCAAAACGGC GGAGGGTGTT TCAATTTAGG GGCTATGCAA 1500
TACAATGGCG AAGGTGCAAC AAGGAATGAA AAGCAAGCCA TAGAAAACTT TAAAAAAGGC 1560
TGTA 1564
(2) INFORMATION FOR SEQ ID NO: 81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y114-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
GCCAAAGAAT GCGTCTCGCC CATAACAAGA AGCGTTAAGT ATCATCAGCA AAGCGCTGAG 60
ATCAGAGCCT TGCAATTGCA AAGTTACAAA ATGGCGAAAA TGGCGCTAGG CAATAATCTC 120 AAGCTCGTTA AAGACAAAAA GCCAGCCGTC ATCTTGGATT TAGATGAAAC CGTTTTGAAC 180
ACTTTTGATT ATGCGGGCTA T 201
(2) INFORMATION FOR SEQ ID NO: 82: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 221 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y114-2.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
TCCGCCTAAG AGATCAAATT TTTTGTCATG TTAGCAAAAA TCGTTCTTAT GCGATGGATA 60 AATTTTATTT TTTATTCTGC CATGCTTTTA TAGGCTCATC TTCCCATGTG CCATAGAGAG 120
AATTAGGTAA AATGATCCAC TCTGTGCCGA ATTTTTGAGC GTTTTGCAAG ACTTTAGCTC 180
GTTGTTCTTG GCTGTTTTTA GCGTCTTTAG CAAAAAGTGC A 221
(2) INFORMATION FOR SEQ ID NO: 83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y115-1.T3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
TTTTTATTTT CTTTATTCAC TTTATTCATG GATTTTAAAG CTTTTCAACT TTGATGAGTT 60
TCGTGCCGTA TTCTACCGGT TGAGCGTCTC CCACTTCANN AGAAACCACC TTGCAAGGGT 120
ATTCCACTTC NATTTCATTC ATGATTTTCA TCGCTTCTAC AATGCCCACG ATTTGCCCTT 180 TTTTAAGCGT ATCGCCCGCT T 201
(2) INFORMATION FOR SEQ ID NO: 84:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 271 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y115-1.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: CTCGCATAGC TGATCACATT CTTACCGACT TGCTTTTCGC AAAAAGGGCT AATCATGCCA 60
TGCTCTAAAC TCATTTTTTT AATCCAAGAA TCCGCTTTCA ATCCCATACG CACAAAGCCT 120
TTAAAGTTTG TTAATAATTA TGTTATTATA ACAATAATTT TTTGCTAAAA ACGCTTAGTA 180
AGATTATGTT TTGTGGCATA TTTTACAACA ATATAAGGAA TAAATCGTTA TGAACCTTTC 240
TGAAATTGAA GAGTTGATCA AAGAATTTAA A 271
(2) INFORMATION FOR SEQ ID NO: 85:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 510 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y116-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
CCACGATTTG CCCTTTTTTA AGCGTATCGC CCGCTTTGAC ATAAGGCTCA GCCCCAGGCG 60 AGGGCGCATG ATAAAAAGTG CCTACCATAG GCGAAAGCAC GAAATCTTCT TTTTTATCCA 120
CAATAGGGGT GCATACCATA GGCACAGGGG CTTGAGTGCT TGGCATGCTC GCTTCTACCA 180
TGATGGGGGC TTGGATGGAG GCTTGAGAAT GAGCGGGATT TAGTGCATTT TTTTTCGCAT 240 AAGCGGATTC TTTATCCAAA ACCAACTCAA AATGCTCATG CTTTAATTTC AAATGCCCCA 300
AATCAGAAGC TTTAAATTCT TTGATCAACT CTTCAATTTC AGAAAGGTTC ATAACGATTT 360
ATTCCTTATA TTGTTTTAAA ATATGCCACA AAACATAATC TTACTAAGCG TTTTTAGCAA 420
AAAATTATTG TTATAATAAC ATAATTATTA ACAAACTTTA AAGGCTTTGT GCGTATGGGA 480 TTGAAAGCGG ATTCTTGGAT TAAGGAATTC 510
(2) INFORMATION FOR SEQ ID NO:86:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 195 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y117-2.T3
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 1, 2, 11, 23, 25, 27, 85, 144, 154, 159, 160, 194
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
NNAGGGTATT NATAAGAGGT CANTNCNGGG AACGCTAACA CTTCCAATCG GTTGCAACGC 60
GGTGGTATCA ACCAGGATGG AGAANTCCAC GCTGTGCAAT AACTTCGAGA CGACTTCTTG 120
TGGGACTGGA GCTTGCGCCA AAANAGACGG CTTNTATCNN GCAGACGGCT ATTTGGAATC 180
CAGAACGCGA TATNG 195
(2) INFORMATION FOR SEQ ID NO: 87:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 236 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y117-2.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:
CGCATACATG TGCTGTAACC TGTATGGAAG CTTCTTTTTC AAGGATAGCG TTTCTTTCTT 60
GCTCCATCCA TTGTTTTACC GGTATGGATC TAACCTTTCT GTGTTCAAGA ATTTCTATAC 120
TACTGGCTTT CGTATTGGCT TCATCGCCGA CCCACACGCC GTCTTTATTT TGACTCACCA 180
CTGCGCCATA AACCGTGTAA GCGTATNCTG GCAATAGCTG TNTGCTGTTG AGATCT 236
(2) INFORMATION FOR SEQ ID NO: 88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 226 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y119-1.T3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
GCAACTATGA GTAATGTTNG AAGCNNNTCA CNGACAATAN ACTGCANCAT CAAATCAAAA 60
ACACTCTAGA GGCTTCTAAT GACTGCATCA AAAATCCCAA CNCTGAAGAA GAAATGATCA 120
AGTGTTTCCA TTTACTCANA AATCACTACT TGAAACACAG CTTCCTGACC CAACNNCNCG 180
TTCAGGTGGC GCTCGATTGT TTGNNNAACG CTCNCACCGA TGACTA 226
(2) INFORMATION FOR SEQ ID NO: 89: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 243 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y119-1.T7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:
GAATTCCTTC AAGCACTCGT TTCGTTCTTC ATCGGTTTTA GCGTTTTTCA AACAATCTAA 60
CGCTTGCTGT TTCAATCTTT CTATAGCTTC TTTAAANAAG CCTTTCAAGC ATTCATTTTT 120
CTCAGCCTCT GTTTTGGCGT NTTTGATNCN ATCCTTATAC TCTTGAAGCT CTTTTTGAAG 180
CTCTAATTCC TTACGGAATT TCTCTCTAAT GTCAGGGTCA TTTATGAGTT TTAGGCACTC 240
GTT 243
(2) INFORMATION FOR SEQ ID NO: 90: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 501 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y120.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:
TATGAGTGGG GGGCTTATTT GAAAAGAACG GGATTAGGCG AGCATGAAAT GGCGTTTGCA 60
GGCTGGATGG CATACATAGC GGATCCGGAT AATTTCTTAT ACACCTTATG GAGCAAGCAA 120
GCCGCCTCAG CCATACCCAC TCAAAATGGT TCCTTTTATA AGAGCGACGC CTTTTCCGAT 180
CTGCTCATAA AGGCTAAACG GGTTTCGGAT CAAAAAGAGA GGGAAGCCCT TTATTTAAAG 240
GCTCAAGAAA TTATCCATAA AGATGCACCC TATGTGCCTT TAGCCTACCC TTATTCAGTG 300
GTGCCGCACT TGTCTAAAGT CAAAGGCTAT AAAACGACTG GAGTGAGCGT GAATCGCTTC 360
TTTAAGGTGT ATTTAGAAAA ATAAAAGGGG TTGCATGCTG AGTTTTATCA TTAAGCGTAT 420
TTTGTGGGCG ATCCCCACGC TGTTTGGAGT GAGCATTATC GTGTTTATGA TGGTGCATTT 480
AGTGCCAGGA GATCCGGCGT T 501
(2) INFORMATION FOR SEQ ID NO: 91:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1055 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y122.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:
GGAAAAGAGG TGATAGCAGT ATGCCAGGGA TTAAGGTTAG AGAAGGCGAT GCGTTTGATG 60
AAGCTTACAG GAGATTCAAA AAGCAAACCG ATCGCAATTT GGTGGTAACA GAATGCCGTG 120
CAAGGAGATT CTTTGAATCT AAAACTGAAA AGCGCAAAAA GCAAAAAATC AGTGCTAAAA 180
AGAAAGTCTT AAAGCGTCTT TATATGTTAA GGCGTTATGA ATCAAGACTA TAAGAACTTG 240
AAAAAATTTA AAAATTAAGG ATTATTGAAT AATGCAATTC ACAGGGAAAA ATGTTCTCAT 300
TACCGGAGCT TCTAAAGGCA TTGGGGCTGA AATCGCTAAA ACTCTCGCTT CTATGGGGCT 360
GAAAGTTTGG ATCAATTACC GCAGTAATGC TGAAGTGGCT GACGCTTTGA AAAATGAGCT 420
TGAAGAAAAA GGCTGTAAGG CAGCTGTCAT TAAATTTGAT GCGGCTTCTG AGAGCGATTT 480
TATTGAAGCG ATACAAACCA TTGTCCAAAG CGATGGGGGG TTGTCTTAAC TTGGTGAATA 540
ACGCCGGTGT GGTGCGCGAT AAATTAGCGA TCAAAATGAA AACAGAAGAC TTTCACCATG 600
TCATAGACAA TAACCTCACT TCAGCCTTTA TAGGTTGCAG AGAGGCTTTA AAGGTGATGA 660
GCAAGAGTCG TTTTGGGAGC GTGGTCAATG TCGCTTCTAT CATTGGTGAA AGAGGCAATA 720
TGGGGCAGAC AAACTACTCA GCGAGTAAGG GAGGAATGAT TGCGATGAGC AAGTCCTTTG 780
CTTATGAGGG AGCTTTAAGG AATATTCGTT TCAACTCTGT AACACCAGGC TTTATAGAAA 840
CCGACATGAA CGCCAATTTG AAAGACGAAC TCAAAGCGGA TTATGTTAAA AACATTCCTT 900
TAAACAGACT AGGGTCTGCT AAGGAAGTGG CAGAAGCGGT AGCGTTTCTT TTGAGTGATC 960
ACTCTAGTTA CATCACTGGA GAGACTCTCA AAGTCAATGG CGGGCTTTAT ATGTAGTCCT 1020
AAACAAAGGG TTCTTTTAGC GATAAAAGTT TGTAC 1055
(2) INFORMATION FOR SEQ ID NO: 92:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 261 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y123-1.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
AGAGGGACTC TTAAAGAACG CCCTGATGAA ATCGCAACAC TCTTTAAACT CCTAATCAAA 60
GATGAAATCT CTTCAGACAG CGCGAAAGGT TAAATAAAGG TTAAAAATGG CAACCAAGCT 120
TACCCCCAAA CAAAAGGCTC AATTAAACGA ACTCTCCATG AGTGAAAAAA TCGCTATTTT 180
ACTCATTCAA GTGGGCGAAG ACACCACAGG CGAGATTTTA AGGCATTTAG ACATTGACTC 240
TATTACAGAG ATTTCTAAGC A 261
(2) INFORMATION FOR SEQ ID NO: 93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 331 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y123-1.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:
CTTATTGCAT TTGCAATTTC ACTAAATGGC TTGAGAGCTC ATCGGTTTTT TTCAATAAGC 60
AATCAATCAA ATCGTTTTCT ATCGCTTTTT TTTCCAAAGG CTCTTCTAGA TTAGGCGTTT 120
CTAAAGACGC GCCATTAGGA TTAGTTTTAG GGGGTAAGTT TGCCATGCTC TTAAATTCGT 180
ATTTTTGAAT GTCATGCTTA TTCAAATGGT CTTTTTGGAT CAAATTTTTG CGGCTATTCA 240
ATGACATCTT CCTCTTCACC GGTTTGGATC ACGCCTTTTT CTTGCAAGTT CTGCACGATT 300
TCAATGATTT TCCTTTGAGC CACATCCACA T 331
(2) INFORMATION FOR SEQ ID NO: 94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 215 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y124-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:
TAGAGGGCAT GCAATTTGAT AGAGGCTACC TCTCCCCTTA TTTTGTAACG AACGCTGAGA 60
AAATGACCGC TCAATTGGAT AACGCTTACA TCCTTTTAAC GGACAAAAAA ATCTCTAGCA 120
TGAAAGACAT CCTCCCGCTA TTGGAAAAAA CCATGAAAGA GGGCAAACCG CTTTTAATCA 180
TCGCTGAAGA CATTGAGGGC GAAGCTTTAA CGACT 215
(2) INFORMATION FOR SEQ ID NO: 95: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y124-2.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:
CACCGCCAGA GAGTTTAGCC AATCTTTCTT GCAATTTTTC TTTGTCATAA TCGCTTGTCG 60
TGCTTGCAAT TTGGGTTTTG ATTTGCGCAA CTCTGTCTTT AACATCATCG CTATGGCCTT 120
TGCCATCTAC GATCGTGGTG TTGTCTTTGT CAATCACAAT CCTTCCGGCT TTGCCTAAAA 180
ACTCCACTTC AGCGTTTTCT AAAGTCAAGC CCAATTCTTC GCTAATAACT TGACCGCCGG 240
TTAAAATAGC 250
(2) INFORMATION FOR SEQ ID NO:96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 276 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y126-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:
TGTCCGGCCC AATGAATGGC TTAATGAGCG ATGGATAAAA ACCAATATCA TCACCCCCAT 60
AGAGCAAGCC AAACGGCTTT TAATGAAAGG ATAGTCATGT TAAAAACGAA TCAAAAAAAT 120
GTGCATGCGT TTGAAATTGA AAAGCAAGAG CCTAAAGCGG TCATAGGATT TTTAGAAAAA 180
AACCATGCCC TTTTGCAGTA TTTTCTTATT ATATTTAAAT ATGATATTGA ACCAGAATTC 240
AAAGCCATTT TGCACAAACA CCAGCTTTTG TTTTTG 276
(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 287 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y126-2.T7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:
TCTTTAATCC CTGCTTCGTC TAAAAGCATG CAATAAGTCA AAGCACTCCC ATCCATGATA 60
GGGATTTCTT CGTTATCCAC AGAGATCTTA AGATTGTCAA TGCCATACGC ATGGACAGCT 120
GAAAGCAAAT GCTCAATTGT AGAAATCCTA GCGTTATTCT TACCCAACAC GGTTGCCATT 180
TTGGTATCCA CAATGTTTTC AGGTTTTAAG GGGAGCTTCA CGCCCAAATC AGAGCGGTAA 240
AAAACAATGC CTTGATTTTC TTCTAAAGGC TCTAAAACAA GCTTCAC 287
(2) INFORMATION FOR SEQ ID NO: 98: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 933 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y128.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:
TGAGCTTGAA AGAATTAGAG CTTTTAGAAA AAGTGTTTTT AGGGGTTTTA GAGGACTTGA 60
GTGAGAAATA AAATAAATAA ACATTAAGTA AGGCTTATCA ATATTTGATT ACAATTATAA 120
AGGGTTACAT TTTTTTAATA GGAGATATAC CATGCTAGGA AGCGTTAAAA AAACCCTTTT 180
TGGGGTCTTG TGTTTGGGCA CATTGTGTTT GAGATGGTTA ATGGCAAGCC AGACGCTAAA 240
GAGCTTGTGA ATTTAGGCAT AGAGAGCGCA AAGAAGCAAG ATTTCGCTCA AGCTAAAACG 300
CATTTTGAAA AAGCTTGTGA GTTAAAAAAT GGCTTTGGAT GTGTTTTTTT AGGGGCGTTC 360
TATGAAGAAG GGAAAGGAGT GGGAAAAGAC TTGAAAAAAG CCATCCAATT TTACACTAAA 420
GGTTGTGAAT TAAATGATGG TTATGGGTGC AACCTGCTAG GAAATTTATA CTATAACGGA 480
CAAGCGTGTC TAAAGACGCC AAAAAAGCCT CACAATACTA CTCTAAAGCT TGCGACTTAA 540
ACCATGCTGA AGGGTGTATG GTATTAGGAA GCTTACACCA TTATGGCGTA GGCACGCCTA 600
AGGATTTAAG AAAGGCTCTT GATTTGTATG AAAAAGCTTG CGATTTAAAA GACAGCCCAG 660
GGTGTATTAA TGCAGGATAC ATATATAGTG TAACAAAGAA TTTTAAGGAG GCTATCGTTC 720
GTTATTCTAA AGCATGCGAG TTGAACGATG GTAGGGGGTG TTATAATTTA GGGGTTATGC 780
AATACAACGC TCAAGGCACA ACAAAGGATG AAAAACAAGC GGTAGAAAAC TTTAAAAAAG 840
GTTGCAAATC AGGCGTTAAA GAAGCATGCG ACGCTCTCAA GGAATTGAAA ATAGAACTTT 900
AGTTTCAATA AAGTTAAGCT AAACGCCATG TTT 933
(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 249 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y12/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
GTGTTTTTGT GGGGCGATGA TCAAATCGCT CTCTCTAAAC TCGTGTTTGA TTTCCAAAAA 60
GAGCATAAAG ATCACTTTGT GTTGAAAGCG GGCTTGTTTG ATAAAGAAAG CGTTAGCGTA 120
GCTCATGTGG AAGCGGTTTC AAAACTCCCA AGCAAAGAAG AGCTTATGGG AATGTTGCTT 180
TCTGTTTGGA CGGCTCCGGT GCGTTATTTT GTGACCGGTT TAGACAATTT GCGTAAAGCG 240
AAAGAAGAA 249
(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 308 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y12/t7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
TTCAAGCCTT TGATCAAAAA TTTTGAGTTA TTTAATGCAT TAATTCAAGT AAGCATTGCA 60
AAAGGATCTT TAAACTTTAG AACTATCCAT GATAAAAATA CCAAAAACCC ACCTGCAATA 120
CTAACCACTT TAGGCCACAC GCTCACAAAC AAAATCATGG AGCAAAATCA TCTGCCGCAC 180
TAGATCAGAG AAAGGCGCAA AAGGCCTTTC TGTTTTTAAG TCTTACTTGA CTTCAACCTT 240
AGCACCTACT TCTTCAAGTT TCTTCTTGAT GGTTTCAGCT TCTTCTTTAT TCACGCCCTC 300
TTTAAGCA 308
(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 271 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y130-1.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TGAAGAAATG GATGAAGAAG AAGATGAATT GAACAAACTG GGCGATTTGA GAAAAAAAGT 60
AGAAGATCAA TTAGGGCTTA ATGCAACCTT TAGCGAAGAA GAAGTAAGAT ACGAAATTAT 120
ATTAGAAAAG ATTAGAGGGA CTCTTAAAGA ACGCCCTGAT GAAATCGCAA CACTCTTTAA 180
ACTCCTAATC AAAGATGAAA TCTCTTCAGA CAGCGCGAAA GGTTAAATAA AGGTTAAAAA 240
TGGCAACCAA GCTTACCCCC AAACAAAAGG C 271
(2) INFORMATION FOR SEQ ID NO:102:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 191 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y130-1.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
GCACTTGGGG CGATATTTCG CCTAAATTCG CCATTCTAAT AGAGATTTCG GCTTTCATTT 60
CATCAGGGAA ATAGCTCAAA GTCTCAGCCG CATTAGGGGC TTCCATGTGG GCTAAAATCA 120
AGGCAATGGT TTGAGGGTGT TCGTTAATGA TGAAATCAGC GAGTTGTGGG GGCTTGATTT 180
TGCCTAAATA A 191
(2) INFORMATION FOR SEQ ID NO:103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 274 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y131-1.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:
CATGTTCTTT ATGGGTAATG GCTACATGAA TAGCGATTGG GGTATGATGG GGGGCTATCC 60
AGCGGCCAGT GGCTATAGGT TTGAAGCGCA CAACACCGAC TTGGAAAACA GGATTAAAAA 120
TAACGCCAGC TTGCCTTTGG GGGGCGATTT TAACCCAACG GATAGAGATT ATGAAAAGCA 180
TATTTCTCAT GCGTCTCAAG TCAAAAGGGA TAAGCAATGC ATCACCACTG AGAACTGCTT 240
TGACAATTAT GATTTGTATT TGAATTACAT CAAA 274
(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 283 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y131-1.T7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:
TGCTCGCAAA TCCATCCTGT ATTTAGAACC ATAGGTGAAT ACGCCAAGCT CATCTTCTTT 60
CATAGTCCAG CTCTTTGGCA AGTTCCAAAA TGTTTTAAAA TCGCTTAAAA ACTTAGGCGA 120
AAGATCAAAG CTAGTCGCAT ACATGTGCTT AACCTGTTTG GAAGCCTCTT TTTCAAGGAT 180
AGCGTTTCTT TCTTGCTCCA TCCATTGTTT TACCGGTATG GATCTAGCCT TTCCGTTTTC 240
AAGAATTTCT TTTCTTCTGG CTTTCGTTTT GGCTTCATCG CCT 283
(2) INFORMATION FOR SEQ ID NO: 105: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 237 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y132-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
AAACTCAATT TAAAGAACAA CTACCATTGT TTTTAGTTAT AATGGCTAAC ACCATTAAGA 60
TCAAAACAAA NGTTATTCAA TGAAGGTATT ATCTTATTTG AAAAATTTTT ATCTTTTTTN 120
AGCGATGGGA GCNATTATGC ANCCGAGTGA AAACATGGGG GCAANNCACC AAAAAACCGA 180
TGAAAGAGTG ATCTACTTGG CTGGGGGGTG CTTTTGGGGG CTAGAGGCGT ATATGGA 237
(2) INFORMATION FOR SEQ ID NO:106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 261 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y132-2.T7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
CTTAAGGNCA TCGTNAAACA NATGCCCTAA ATGCGNCTTA CCAATACGGC TCAACACTTC 60
AATGCGTTTC CTATTAAGGC TCTCATCGTC TGTCGTATTT CACCACGTCT TTATTGATGG 120
GCTTAGAAAA GCTTGGCCAT CCGCAACCGG AGTCGTATTT ATCCGCAGAA AAAAATAACG 180
GGTCGCCTGT GGTAATATCC ACANAAANGC CCTCTNCTTC TTTGTGGNGT TACTCGTGGG 240
CNAAGGGTTT CTCAATGTGT T 261
(2) INFORMATION FOR SEQ ID NO: 107: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 408 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y133.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:
TACCATTTTA GAATGATTGG TTTGTATGGT GTTGGTAGTT TTTGGCTTAT TTTGAGTGAA 60
AGGATTTAAT CAAGATGTTT GTGGTTTTTA TAGAAGGTTT TGGTTTAGCG ATTTCTTTGT 120
GCGCGGCTGT GGGAGCGCAA TCCTTGTTCA TTGTAGAGCG AGGTATGGCT AGGAATTATG 180
TGTTTTTGAT TTGCGCTCTG TGTTTTATGT GCGATATTGT GTTAATGAGC ATGGGCGTGT 240
TTGGCGTGGG GGCTTATTTC GCTAAAAACC TTTATTTGAG TTTGTTTTTG AATTTATTCG 300
GGGCGGTTTT TACCGGATTT TATGCTTTTT TAGCTTTAAA AACCCTTTTT CAAACCTTTA 360
AAAAGAAGCA AGTCCAAACC CCTAAAAAAT TATCCTTAAA AAAGACCT 408
(2) INFORMATION FOR SEQ ID NO: 108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 543 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y135.asm (xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:
AAATCGTCGT AAAAACGGGC GAATTTGTCA AAAAAGGGCA GTTGATTGGG TATAGTGGTA 60
ATACAGGAAT GAGTACAGGA CCGCATTTGC ATTATGAAGT GCGCTTCTTA GATCAACCCA 120
TAAACCCCAT GAGTTTCACC AAATGGAACA TGAAAAATTT TGAAGAAGTT CTCAATAAAG 180
AAAGGAGCAT CAGATGGCAA TCTTTGATAA CAATAATAAA TCGGCTAATG CAAAAACAGG 240
ACCAGCGACT ATCATCGCTC AAGGCACAAA AATAAAGGGT GAGCTTCAAC TTGGATTACC 300
ATTTGCACAT AGATGGCGAA TTAGAAGGGG TGGTGCATTC TAAAAGCACA GTGGTGATCG 360
GGCAAACCGG CTCGGTAGTG GGTGAGATTT TTGCTAATAA ATTAGTGGTC AATGGCAAGT 420
TCACTGGCAC GGTGGAAGCG GAAGTGGTAG AAATCATGCC TTTAGGGCGC CTTGATGGTA 480
AGATCTCTAG CCAAGAGCTT GTGGTGGAAA GGAAGGGGAT TTTGATTGGG GAAACTCGCC 540
CTA 543
(2) INFORMATION FOR SEQ ID NO: 109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 371 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y136-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GAATTCACTT GGGAAACAAC CAACAAAGCT GAGGAGATTA AAGCGTTCTT TAAGCGTTGA 60
AAAACTATTT TAAGGGTAAT ATTTGGTAAA ATTTTTGTAT AATCAACAAT TCACAAGGAG 120
TTTAAATTGA AACAAAGAAC GCTGTCTATT ATTAAGCCTG ATGCACTTAA GAACAAAGTG 180
GTAGGCAAAA TCATTGATCG CTTTGAGAGT AACGGCTTGG AAGTGGTTGC TATGAAACGC 240
TTGCATTTGA GCGTTAAAGA CTCTGAAAAC TTTTATGCGA TCCACAGAGA GAGACCCTTT 300
TTTACGATCT AATAGAATTT ATGGTCAGTG GTCCGGTAGT GGTTTTGGTT TTAGAGGGCG 360
AAGATGCGGT G 371
(2) INFORMATION FOR SEQ ID NO: 110:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 686 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y139.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
TTAGATCTAA AATGGTGTAT GCAGAATGGA GTTTGAGAAT GTTTGTTGGT TTTAGAACGC 60
ATACAGTTTC TTTTTTAAGA TTGGTAGCCA TTTGCATTAT GTTTGATCTT ATTAAGGCAG 120
AGGAGTAACA ATGGGATACG CAAGCAAATT AGCCTTGAAG ATTTGTTTGG CAAGTTTATG 180
TTTATTTAGC GCTCTTGGTG CAGAACACCT TGAACAAAAA AGGAATTTTA TTTATAAAGG 240
GGAGGAAGCC TATAATAATA AGGAATATGA GCGGGCGGCT TCTTTTTATA AGAGCGCGAT 300
TAAAAATGGC GAGCCGCTTG CTTATGTTCT TTTAGGAATC ATGTATGAAA ATGGTAGGGG 360
TGTGCCTAAA GATTACAAGA AAGCGGCTGA ATATTTTCAA AAAGCGGTTG ATAACGATAT 420
ACCTAGAGGG TATAACAATT TAGGCGTGAT GTATAAAGAG GGTAGGGGCG TTCCTAAAGA 480
TGAAAAGAAA GCCGTGGAGT ATTTTAGAAT AGCTACAGAG AAAGGCTATA CTAACGCTTA 540
TATCAACTTA GGCATCATGT ATATGGAGGG TAGGGGCGTT CCAAGCAACT ATGTGAAAGC 600
GACAGAGTGT TTTAGAAAAG CGATGCATAA GGGTAATGTT AGAAGCTTAT ATCCCTTTTT 660
AGGGGATATT TATTATAGTG GAATGA 686
(2) INFORMATION FOR SEQ ID NO: 111: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 234 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y13/T3 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
TAGCCCTTTA TAACAACAAT AACCGCATGG ATACTTGTGT GGTGCGAAAT ACTGATGACA 60
TTAAAGCATG CGGTATGGCT ATCGGCAATC AAAGCATGGT GAACAACCCT GACAATTACA 120
AGTATCTTAT CGGTAAAGCA TGGAAAAATA TAGGCATCAG TAAAACGGCT AACGGCTCTA 180
AAATTTCGGT GTATTATTTA GGCAACTCTA CGCCTACTGA GAATGGTGGC AATA 234
(2) INFORMATION FOR SEQ ID NO: 112: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 294 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y13/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
GATTGACTAA ACGAGAATTT AAAATCATCG CATTGCTCAA GCTCAAGGTT TGTAAGCTGC 60
TGGTTTTATG CTCTAAACTG GCTATGTTGT TTAAAGTGGT AGTGGCCGCA TTCAATTGCT 120
TGGTGATTTC ACCGGTGCTT GTGTTATCAA TCATTTGTCT AGCGTAACCC GCATCATGGC 180
TATCAATCAA TAAGGTTTGC AAGAGATCCC TACCTTTNGC ACCTGAATGA GTATAAAGCG 240
TGTCAATATC TTTAGAGCGG TTAGCCAATT CAAACACGCT CTCAATAGTG CCAA 294
(2) INFORMATION FOR SEQ ID NO: 113:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 216 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y140-1.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
AGTGGTAAAG TATCTATTAN CGNGGGCTCT ATTACTAATT CTAGGCAACG CTTGGACTAT 60
GACTTCACCC TAAGCCTTAC CAACAGGAAA ACGGGTGAAG AGGTATGGAG CGATGTTAAG 120
CCTATTGTGA AGAACGCTAG CAATAAGCGT ATGTTTNAAA TTTATATTTG AAAGGATGAG 180
AGATGACAAA TCAAGTTTAA AAAAATTTAA GGGATG 216
(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y140-1.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
AACTCTTCGN GCACTTTATC CACAATTTGC TNATCCAAGC CCACTAAAAC ANAAACCCTA 60
TCTTGNACCG ACATAGCGGG CAAGCATTTT AGAAGCGATC AATTCCTTAT CCACTAATTG 120
AGAAATTTTT TCAGTATCAG TGCCGCTTAT GGACCTTTTA CCAGAAGCGT CTACCGTTCT 180
GGTTTTTTCA TTTTCCAAAT CTTTTTGTAA AGTGGATTTT AAATTCGCCG CTAAATTAAC 240
(2) INFORMATION FOR SEQ ID NO: 115: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 752 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y141.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
ACGAACTCGC TAAAATCGTG CTAGCTAATG AAGAGGATGC GACCTTAAAG CTTTTAAAAG 60
AAAGGGTTGA GGGGCAGTTG TTTTTAGAAA ATAAAGCCAG GCTCTATAAT GAAGAGTTGA 120
AAGAAAAATT GATTGAAAAT TTAGATGAAA AGATTGTTTT TGATTTGCCT AAAACGATCA 180
TAGAGCAAGA AATGGATTTG TTGTTCAGGA ACGCTCTTTA TTCCATGCAA GCTGAGGAAG 240
TCAAATCCTT ACAAGAAAGT CAAGAAAAAG CCAAAGAAAA GCGTGAGAGC TTTAGGAACG 300
ATGCCACAAA AAGCGTGAAA ATCACTTTTA TCATTGACGC TTTAGCGAAA AGAAGAAAAA 360
TTGGCGTGCA TGACAATGAA GTCTTTCAAA CCTTGTATTA TGAAGCGATG ATGACAGGGC 420
RAAACCCAGA AAGTCTCATT GAACAATACC GCAAAAATAA CATGTTAGCG GCGGTGAAAA 480
TGGCGATGAT TGAAGATAGG GTGTTAGCTT ATTTGTTGGA TAAAAACCTG CCTAAAGAGC 540
AACAAGAAAT TTTGGAAAAA ATGAGGCCCA ACGCTCAAAA AATTCAAGCG GGTTAAACGG 600
CTAAAAAGGA GAGATGATGG GATACATTCC TTATGTAATA GAGAATACCG ATCGTGGGGA 660
GCGCAGCTAT GATATTTACT CGCGCCTTTT AAAGGATCGC ATTGTTTTAT TGAGCGGTGA 720
GATTAACGAT AGCGTGGCGT CTTCTATCGT GG 752
(2) INFORMATION FOR SEQ ID NO: 116: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 238 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y143-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
TTCAAAAAGC GGTTGATAAC GATATACCTA GAGGGTATAA CAATTTAGGC GTGATGTATA 60
AAGAGGGTAG GGGCGTTCCT AAAGATGAAA AGAAAGCCGT GGAGTATTTT AGAATAGCTA 120
CAGAGAAAGG CTATACTAAC GCTTATATCA ACTTAGGCAT CATGTATATG GAGGGTAGGG 180
GCGTTCCAAG CAACTATGTT GAAAGCGACA GAATGTTTTA GAAAAGCGAT GCATAAGG 238
(2) INFORMATION FOR SEQ ID NO: 117:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 290 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y143-2.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:
ATGAAAAATG CAATATCTTT TTATCGCTTC AAATCAATTG TTCAAAAAAC ATGCCACCTA 60
GCATAAAACA AAAAATCCTT GAATCAAGTC CAAGTTTGAG AATTATCGGC TTGAAGTGTT 120
CTTTTTCTTA CAATTTTTAT CAATGTCAAA ATCGCATGCT TTTTGCATGT ATTCTTCAGC 180
CTTTTTCTTA TCTTTTTCCA CGCCTAACCC ATACTGATAA GACTCTGCTA ACCCTTCATA 240
AGCTCTAGAA GAGCTCATAT CAGCCGCCAT TTTATAATAG ACAATCGCCT 290
(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 613 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE: (B) CLONE: Y144.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
CCCGCTGCCG ATGGTTCCGG TAATGTAAAT GTTTTTAGCC CTAAAATTAG CGTTAATGTA 60
GCCCAAATAA GGCGCGCTAG AATGGAGCAA CTGCCCTAAA TTAGTCTGTC TAAACATCTG 120
GCAAGTGTTA TCTCCAACCG CACATGGCTC TTTATACCCA TCGCCTCCAA ACCACACCAC 180
GCTATTAGTG TTCGTTTGCC CTTCTTTGAT CGCCCCCACC ACGAGATTGC CTTGAGAGAA 240
ATGATAATTA GCCGTGTTGG AAAAATTATT GATAATACCC AATAAACTCT TCCAATTTGC 300
TTGAGTTAAA GTGATACCAT ATTTTTCAAA GATACCAAAA AGAGTTTCAA AACTATCCGG 360
ACTATTGAGA TTGAGATTGT CCAAAGCCCC TGAAGATAAA AGATTGGCTA TTTTAGGCAA 420
AAATTCCTTA CCCAAGCTGG CAATAACGCT TAAATCCTGC TTAGTGATAG CCTGATTGTA 480
AATGTGCAAG GCTTGTAAGG GTTGGTTTTG CGCGTTATAA GTTCCTGGTA TCTCATTGTT 540
TTGATTAAAA CCTTTGATGT TGCTCGTCAA ATAATAAGTG CCAGCCTTAT CGCTTGTATA 600
ATTATAGGAA TTC 613
(2) INFORMATION FOR SEQ ID NO: 119: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y145-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
CTCGTCTTGA ATGGTTTGGA TCGCTCTTAA AGCGATCTCG CCTCTATTAG ACGATCTAAA 60
TGCGTGAAAT CTCTTTTTAT TCTACCNTTT TATCTTCTTT ATTCACTTTA TTCATGGATA 120
TTAAAGCTTT TCAACTTTGA TGACTTNCGT GCCGTATTCT ACCGGTTGAC CGTCTCCCAC 180
TTCAGTCNGG AAACCACCTT GCAAGGGTAT TCCACTTCA 219
(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 868 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y146-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:
GGATCGCTCT TAAAGCGATC TCGCCTCTAT TAGCGATCAA AATGCGTGAA AGCTCTTTTT 60
TTTCTACCTT TTTATTTTCT TTATTCACTT TATTCATGGA TTTTAAAGCT TTTCAACTTT 120
GATGAGTTTC GTGCCGTATT CTACCGGTTG AGCGTCTCCC ACTTCAACAG AAACCACCTT 180
GCAAGGGTAT TCCACTTCAA TTTCATTCAT GATTTTCATC GCTTCTACAA TGCCCACGAT 240
TTGCCCTTTT TTAAGCGTAT CGCCCGCTTT GACATAAGGC TCAGCCCCAG GCGAGGGCGC 300
ATGATAAAAA GTGCCTACCA TAGGCGAAAG CACGAAATCT TCTTTTTTAT CCACAATAGG 360
GGTGCATACC ATAGCACAGG GGCTTGAGTG CTTGGCATGC TCGCTTCTAC CATGATGGGG 420
GCTTGGATGG AGGCTTGAGA ATGAGCGGGA TTTAGTGCAT TTTTTTTCGC ATAAGCGGAT 480
TCTTTATCCA AAACCAACTC AAAATGCTCA TGCTTTAATT TCAAATGCCC CAAATCAGAA 540
GCTTTAAATT CTTTGATCAA CTCTTCAATT TCAGAAAGGT TCATAACGAT TTATTCCTTA 600
TATTGTTTTA AAATATGCCA CAAAACATAA TCTTACTAAG CGTTTTTAGC AAAAAATTAT 660
TGTTATAATA ACATAATTAT TAACAAACTT TAAAGGCTTT GTGCGTATGG GATTGAAAGC 720
GGATTCTTGG ATTAAAAAAA TGAGTTTAGA GCATGGCATG ATTAGCCCTT TTTGCGAAAA 780
GCAAGTCGGT AAGAATGTGA TCAGCTATGG TTTGAGCAGT TACGGGTATG ATATTAGAGT 840
GGGGAGTGAG TTCATGCTCT TTGATAAC 868
(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1024 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y147.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:
CACACGCTAG AGTTTGCCTT ATCAATCGTA TAAGGTTTTG CAAACGCCAG ACTAACGCCC 60
AAAAGGGTGA ACATTAACGC TTTTTTCATT GTTTCCTCGC TTCATTTTGA ATTAAAAACG 120
CCAACTATAC AACAAATTGG TTAATGGTAA AAATACTCCT AACCATCTGT TTTAAGGTAT 180
AATACAAAAA ATCCCCACTA GGTAGGTGTT GTTATGATAA AAAAGACCTT TGCATCACTT 240
TTATTAGGAT TGAGTTTGGT GAGTGTTTTA AATGCCAAAG AATGCGTCTC GCCCATAACA 300
AGAAGCGTTA AGTATCATCA GCAAAGCGCT GAGATCAGAG CCTTGCAATT GCAAAGTTAC 360
AAAATGGCGA AAATGGCGCT AGGCAATAAT CTCAAGCTCG TTAAAGACAA AAAGCCAGCC 420
GTCATCTTGG ATTTAGATGA AACCGTTTTG AACACTTTTG ATTATGCGGG CTATTTGATC 480
AAAAATTGCA TCAAATACAC CCCAGAAACT TGGGATAAAT TTGAAAAAGA AGGCTCTCTC 540
ACGCTCGTTC CTGGAGTGCT AGACTTTTTA GAATACGCTA ATTCTAAGGG CGTTAAGATT 600
TTTTACATTT CTAACCGCAC GCAAAAAAAT AAGGCATTCA CCTTAAAAAC GCTCAATAGT 660
TTTAAACTCC CCCAAGTGAG TGAAGAATCC GTTTTATTAA AAGAAAAAGC AGCTAAAGCC 720
GTTAGGCGAG AATTAGTCGC TAAGGATATG CGATTGTTTT ACAAGTGGGC GACACTTTGC 780
ATGATTTTGA TGCACTTTTT GCTAAAGACG CTAAAAACAG CCAAGAACAA CGAGCTAAAG 840
TCTTGCAAAA CGCTCAAAAA TTCGGCACAG AGTGGATCAT TTTACCTAAT TCTCTCTATG 900
GCACATGGGA AGATGAGCCT ATAAAAGCAT GGCAGAATAA AAAATAAAAT TTATCCATCG 960
CATAAGAACG ATTTTTGCTA ACATGACAAA AAATTTGATC TCTTAGGCGG AGCTATGGAT 1020
TTTG 1024
(2) INFORMATION FOR SEQ ID NO:122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y150-2.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
TGTTTTTAGC CAAATTTTTA GTAGTGGGCT TTGGAGCTTG CGGCTTTGGC GGGGTTTTAA 60
TATCCAAATT CAAAGCTTTC ACGCCTATTT TCTGCACCGC TTCAGAGATA TTTTCCCTTA 120
AAATCGCGCT CGTTTCTTCA GAGCTTGAGG GGATTTCTAT TAAAAGCCCT AATTGATTGT 180
CATGCAAAGC GATGCTTTTA ACAAAACCAA AGCTGACAAT ATCCTTTTCA AAATTAGGGT 240
AAATGATCGT TTTTAACGCG TTTAAGACAA CTTCTTGGGT GAGCATTTTA ATCCTTAAAA 300
TGATAGTATT TGAATTTTCT TTATAGCATG TTTTGTTTAA TTTTGCATTG TAAATTAAAT 360
TTGAGTTTTC AA 372
(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 291 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y151-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
AGTTTTTGAT TTTATCAAAT GATTCTAAAG GGTATTAAAT GCGCTTCCAA TAACGCTTTT 60
ATAACGCTTC AAAACTATAA CACTAATTCA TTTTAAATAA TAATTAGTTA ATGAACGCTT 120
CTGTTAATCT TAGTAAATCA AAACATTGCT ACAATTACAT CCAACCTTGA TTTCGTTATG 180
TCTTCAAGGA AAAACACTTT AAGAATAGGA GAATGAGATG AAACTCACCC CAAAAGAGTT 240
AGACAAGTTG ATGCTCCACT ACGCTGGAGA ATTGGCTAAA AAACGCAAAG A 291
(2) INFORMATION FOR SEQ ID NO:124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y151-2.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
CAATGTCTAA GCGTTTGCCG AAAGTTTTTT CTCTGTCAAA GTCTAAGCAT CTATTCACTT 60
CAAAGAAATG GAAGTGTGAG CCGATTTGAA CCGGTCTGTC GCCAACATTT TTAACTTTCA 120
CGCTAACGGC TTTTTTGCCT TCGTTGATAG TGATGTCTTC ATTTTTTAAG AACAACTCAC 180
CAGGAACTAA TTTACCATTG GCCTCAATAG GGGTATGCAC GGTTACGAGT TTAGTCCCAT 240
CAGGAAACAT CGCTTCAATA CCCACTTCAT GGATCATGCT TGCCACGCCA TCCATCACAT 300
CATCCGGTTT TAAAAGAGTG CGC 323
(2) INFORMATION FOR SEQ ID NO: 125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y152-2.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
ACGCTAAAGA GCTTGTGAAT TTAGGCATAG AGAGCGCAAA GAAGCAAGAT TTCGCTCAAG 60
CTAAAACGCA TTTTGAAAAA GCTTGTGAGT TAAAAAATGG CTTTGGATGT GTTTTTTTAG 120
GGGCGTTCTA TGAAGAAGGG AAAGGAGTGG GAAAAGACTT GAAAAAAGCC ATCCAATTTT 180
ACACTAAAGG TTGTGAATTA AATGATGGTT ATGGGTGCAA CCTGCTAGGA AATTTATACT 240
ATAACGGACA AGCGTGTCTA AAGACGCCAA AAAAGCCTCA CAATACTACT CTAAAGCTTG 300
CGACTTAAAC CATGCTGAAG GGTGTATGGT ATTAGGAAGC TTACACCATT ATGGCGTAGG 360
CACGCCTAAG GATTTAAGAA AGGCTCTTGA TTTGTATGAA AAAGCTTGCG ATTTAAAAGA 420
CAGCCCAGGG TGTATTAATG CAGGATACAT ATATAGTGTA ACAAAGAATT TTAAGGAGGC 480
TATCGTTCGT TATTCTAAAG CATGCGAGTT GAACGATGGT AGGGGGTGTT ATAATTTAGG 540
GGTTATGCAA TACAACGCTC AAGGCACAAC AAAGGATGAA AAACAAGCGG TAGAAAGGAA 600
TTC 603
(2) INFORMATION FOR SEQ ID NO: 126: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1031 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y153.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
TGTGTGTATT TTGACATAAC CATTCTCCTT TCTTTATTTC TCATCAACCA ACAGAACTGT 60
GCGCACATCA GGCAATTTGC TCAAATCCAT CCTGTATTTA GAACCATAGG TGAATACGCC 120
AAGCTCATCT TCTTTCATAG TCCAGCTCTT TGGCAAGTTC CAAAATGTTT TAAAATCGCT 180
TAAAAACTTA GGCGAAAGAT CAAAGCTAGT CGCATACATG TGCTTAACCT GTTTGGAAGC 240
CTCTTTTTCA AGGATAGCGT TTCTTTCTTG CTCCATCCAT TGTTTTACCG GTATGGATCT 300
AGCCTTTCTG TTTTCAAGAA TTTCTTTTCT TCTGGCTTTC GTTTTGGCTT CATCGCCGAC 360
CCACACGCCG TCTTTATTTT GACTCACCAC TGCGCCATAA ACCTTGTAAG CGTATTCTGG 420
CAATAGCTGT TTGCTGTTGA GATCTTCTAA AATCGCATTC AAATCCCTTT CAATCGGATC 480
GCCAAATCCA GGACCACCTT TGATGTAATT CAAATACAAA TCATAATTGT CAAAGCAGTT 540
CTCAGTGGTG ATGCATTGCT TATCCCTTTT GACTTGAGAC GCATGAGAAA TATGCTTTTC 600
ATAATCTCTA TCCGTTGGGT TAAAATCGCC CCCCAAAGGC AAGCTGGGGT TATTTTTAAT 660
CCTGTTTTCC AAGTCGGTGT TGTGCGCTTC AAACCTATAG CCAATGGCCG CTGGATAGCC 720
CCCCATCATA CCCCAATCGC TATTCATGTA GCGATTACCC ATAGAGAACA TGGTCCAATC 780
ATGCACACCC CACACCATTC TTAAGGTTTC AAACCCGTTA CCGCCTCGAT ATTTCCCATA 840
CCCACCGGTA TTGGCTTTGA CATTCCTGCC CAAATAAAGA AGAGGCTCTG CCATTTCCCA 900
AATTTCAACA TCGCCCATAT CGCCTTCTGG ATTCCAAATA GCCGCTGCGT GATTTAAGCC 960
GTCTTTTATC GCGCAAGCTC CAGTCCCACA AGAACTCGTC TCAAAGCTAT TCACCGCATG 1020
GATTTCTCCA T 1031
(2) INFORMATION FOR SEQ ID NO: 127: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y154-2.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
CTGGGGGCAA CCACGGCATT GGCTTTGCCA TCCCTTCTAA CATGGTTAAA GATATTGTAA 60
CCCAACTCAT CAAAACCGGT AAGATTGAAA GAGGTTACTT GGGCGTGGGC TTGCAAGATT 120
TGAGCGGCGA TTTGCAAAAT TCTTATGACA ATAAAGAAGG GGCGGTAGTC ATTAGCGTAG 180
AAAAAGACTC CCCGGCTAAA AAAGCAGGGA TTTTGGTGTT GGGATTTGAT CACCGAANTC 240
AATGGCAAAA AGTTAAAAAC ACGAA 265
(2) INFORMATION FOR SEQ ID NO: 128:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 301 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y154-2.T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
TCTAAACTTT CTAAAGAGAG TTTAAAAGCT TCCCCATTCA CTCTAATCAA AGACATGCGA 60
CTAACAACCC ACCTCTATCA TTTCACCAAA ATGATCCTAT AACCTTGATT CAAATCTAAA 120
ACTAAGAATC GTTTGGGTTT GCCTTTATAC TTTTCTAAAG CATGGTTAAA ATCCGCAACG 180
CTTTTAACTT CAACCTCTTC AATTTTTGTG ATAATGTTAC CTTGCCTAAA TCCGGCTTGC 240
TCTGCTGGGG AATTTTCATT CACTTGAGAG ACTAAAACCC CTTGAACATC ATCGCTCAAA 300
C 301
(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 258 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: Y160-1.T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
TCTTTAATTA AGCTAAAAAT GAGCGAATTT GTGGTACTAA TGCTTGCAGT GTGAAATTTG 60
TAGGGCTAGA ACTTGTAAAA TGTGAAAAAA AAGACGCATA AAATGCTAAA ATCCGCAAAA 120
ACACTGTGTG TTTAGATTAG AAGTTTAAAA TTAGGAACGG AAGTATCTGT TTTAAATTTT 180
GAATAGGGAG TTTCTATCAC TATGGTATTG AAAACAAAAT TAAAAATTAT AAGCTCGGTG 240
ATTTTGAACA CTTTATTG 258
(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 321 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y160-1.T7 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
CCCAATCAGG AGCGCCTTTA GTCGCTTCTT TGTAAGCCTT ATTGCTTTTA CTGATACCTG 60
ATTTTGGGGC ATGACTACAA CCTACGATCA CCATCGCTGC TATCACACTC ATCCCTAAAA 120
TTTTTTTAAC TTGATTTTTC ATCTCTCATC CTTTCAAATA TAAATTTAAA ACATACGCTT 180
ATTGCTAGCG TTCTTCACAA TAGGCTTAAC ATCGCTCCAT ACCTCTTCAC CCGTTTTCCT 240
GTTGGTAAGG CTTAGGGTGA AATCATAGTC CAAGCGCTGC CTAGAACTAC TAATAGAAGC 300
TGCGATACTA GATACTTTAC C 321
(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 610 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y163.asm (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:
ACATCGTCAA TAATTCTCAA AACGCTTTAA CGCTAGCCAA CAACGCTAAC ATCAGCAATT 60
CAACAGGCTA TCGAATGAGC TATGGCGGGA ATATTGATCA AGCGCGCTCT ACCCAACTGT 120
TAAACAACAC CACAAACACT TTGGCTAAAG TTACCGCTCT AAACAACAAG CTTAAAGCTA 180
ACCCATGGCT TGGGAATTTC GCTGCCGGTA ACAGCTCTCA AGTGAATGCG TTTAACGGGT 240
TTATCACTAA AATCGGTTAT AAGCAATTCT TTGGGGAAAA CAAGAATGTG GGCTTACGCT 300
ACTACGGCTT CTTCAGCTAT AATGGTGCGG GCGTGGGTAA TGGCCCTACT TACAATCAAG 360
TCAATCTGCT CACTTATGGG GTGGGGACTG ATGTGCTTTA CAATGTGTTT AGCCGCTCTT 420
TTGGTAGCCG AAGTCTTAAT GCGGGCTTCT TTGGGGGGAT CCAACTCGCA GGGGACACTT 480
ACATCAGCAC GCTAAGAAAC AGCCCTCAGC TTGCGAATAG ACCCACAGCG ACGAAATTCC 540
AATTCTTGTT TGATGTGGGC TTACGCATGA ACTTTGGTAT CTTGAAAAAA GACTTGAAAA 600
GCCACGAGCG 610
(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y16/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
GTTTGGTGAG TGTTTTAAAT GCCAAAGAAT GCGTCTCGCC CATAACAAGA AGCGTTAAGT 60
ATCATCAGCA AAGCGCTGAG ATCAGAGCCT TGCAATTGCA AAGTTACAAA ATGGCGAAAA 120
TGGCGCTAGA CAATAATCTC AAGCTCGTTA AAGACAAAAA GCCAGCCGTC ATCTTGGATT 180
TAGATGAAAC CGTTTTGAAC ACTTTTGATT ATGCGGGCTA TTTGATCAAA AATTGCATCA 240
(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y16/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
TTGCATTTTA AATCTTCAAA CCCTACAAAA TCCATAGCTC CGCCTAAGAG ATCAAATTTT 60
TTGTCATGTT AGCAAAAATC GTTCTTATGC GATGGATAAA TTTTATTTTT TATTCTGCCA 120
TGCTTTTATA GGCTCATCTT CCCATGTGCC ATAGAGAGAA TTAGGTAAAA TGATCCACTC 180
TGTGCCGAAT TTTTGAGCGT TTTGCAAGAC TTTAGCTCGT TGTTCTTGGC TGTTTTTAGC 240
GTCTTTAGCA AAAAGTGCAT CAAAATCATG CAAAGTGTTC GCCCACTTGT TAAAACAATC 300
GCATAATCCT 310
(2) INFORMATION FOR SEQ ID NO: 134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 767 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y173.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:
TGCAATACTA ACCACTTTAG GCCACACGCT CACAAACAAA ATCATGGAGC AAAATCATCT 60
GCCGCACTAG ATCAGAGAAA GGCGCAAAAG GCCTTTCTGT TTTTAAGTCT TACTTGACTT 120
CAACCTTAGC ACCTACTTCT TCAAGTTTCT TCTTGATGGT TTCAGCTTCT TCTTTATTCA 180
CGCCCTCTTT AAGCACATGA GGGGTTTTTT CGGTAGCGTC TTTAGCTTCT TTCAGGCCAA 240
GTCCAGTGAT TTCACGAACC ACTTTAATCA CCTTAATTTT TTCAGCACCG CTATCGGCTA 300
AAATCACATT AAATTCGGTT TTTTCTTCGC TCTCAGCCGC TGCACCGCCA GCTACAGCCG 360
CACCCGCTAC GACCGTTGGA GTCGCGCTCA CGCCAAATTT TTCCTCAAAC ATTTTAACCA 420
ATTCAGCAAG CTCTAAAACG CTCAATGAAC CAATATACTC TAACACTTCT TCTTTTGAAA 480
TTGCCATAAT CCAATCCTTC AAATTTTTTT AATTAAAGCC ATCATAGGCT CTTAGTTTTC 540
TTCTTTCGCT TTACGCAAAT TGTCTAAACC GGTCACAAAA TAACGCACCG GAGCCGTCCA 600
AACAGAAAGC AACATTCCCA TAAGCTCTTC TTTGCTTGGG AGTTTTGAAA CCGCTTCCAC 660
ATGAGCTACG CTAACGCTTT CTTTATCAAA CAAGCCCGCT TTCAACACAA AGTGATCTTT 720
ATGCTCTTTT TGGAAATCAA ACACGAGTTT AGAGAGAGCG ATTTGAT 767
(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 770 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y175.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
TATTAAAATT TTATTCTTGG AGAATTTATA ATGAAGAGAT CTTCTGCATT TAGTTTCTTG 60
GTAGCTTTTT TATTGGTAGC TGGCTGTAGT CATAAAATGG ATAATAAGAC TGTGGCTGGC 120
GATGTGAGCG CTAAAACGGT TCAGACTGCA CCTGTTACTA CAGAACCAGC TCCAGAGAAA 180
GAAGAGCCTA AACAAGAGCC AGCTCCAGTG GTTGAAGAAA AGCCGGCTAT TGAAAGCGGG 240
ACTATCATCG CTTCTATTTA TTTTGATTTT GACAAGTATG AGATCAAAGA ATCCGATCAA 300
GAGACTTTAG ATGAGATCGT GCAAAAAGCT AAAGAAAACC ACATGCAAGT GCTTTTGGAA 360
GCCAATACCG ATGAATTTGG CTCTAGCGAA TACAACCAAG CGCTTGGCGT TAAAAGGACT 420
TTGAGCGTGA AAAACGCTTT AGTCATTAAA GGGGTAGAAA AAGATATGAT CAAAACCATC 480
AGTTTTGGTG AAACCAAACC CAAATGCGCC CAAAAAACTA GAGAATGTTA CAAAGAAAAC 540
AGAAGAGTGG ATGTCAAATT AGTGAAGTAA TTTTAGGATG AAAAGGTTTT TTTTTATCCC 600
TTTTATCGCT CCCTTTTTTC TCAATGGGGA GCCTTCAGCG TTTGATTTAC AAAGCGGAGC 660
CACCAAAAAA GAACTCAAGC AGTTGCAAGT CAATAGTAAG AATTTTTCCA ATATTTTGAC 720
CAAAATCCAT TCGCAAGTAG AGGCTAACAC TCAAGCTCAA GAGGGTTTGA 770
(2) INFORMATION FOR SEQ ID NO: 136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 425 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y176-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:
ATGCCATGCT CTAAACTCAT TTTTTTAATC CAAGAATCCG CTTTCAATCC CATACGCACA 60
AAGCCTTTAA AGTTTGTTAA TAATTATGTT ATTATAACAA TAATTTTTTG CTAAAAACGC 120
TTAGTAAGAT TATGTTTTGT GGCATATTTT AAAACAATAT AAGGAATAAA TCGTTATGAA 180
CCTTTCTGAA ATTGAAGAGT TGATCAAAGA ATTTAAAGCT TCTGATTTGG GGCATTTGAA 240
ATTAAAGCAT GAGCATTTTG AGTTGGTTTT GGATAAAGAA TCCGCTTATG CGAAAAAAAA 300
TGCACTAAAT CCCGCTCATT CTCAAGCCTC CATCCAAGCC CCCATCATGG TAGAAGCGAG 360
CATGCCAAGC ACTCAAGCCC CTGTGCCTAT GGTATGCACC CCTATTGTGG ATAAAAAAGG 420
AATTC 425
(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 285 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: antigenic region of structure meld for Y178-1/T3 and Y194-2/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:
GAATTCATCA AACACGCCCA TGAACTTTTG ATCCGTTCGG ATTTGATTCC CTATAATGAT 60
TCCTGCAAAA GATTGTTTGG CAGATTTCAA AAACTCTGCT TTTTCTTTGT CATCAGGGAT 120
AGGGGGTTGT ATGATATTTT CCATAAAATT TCGGATCGAT CGGGTGTTGA TTTTAGACGG 180
ATCGTTTTGA TGGGACACCC AACTTGTGAA AATTCGGTAA CGCTGAGTCC CAAATTTCTG 240
AAAGCTTTTT ATGGAAGAAC CTATATCAAT GAGATTGTCT TTTGT 285
(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 267 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y178-1/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
TTAATCATGT ATTTATGGAT AGTTAAGATT TATTATGAAA AAAAAGTAAA TGGAACTCAA 60
AACAATCAAA ATAACGCTCT AAATCCCAAC CAGAAACGCT ACCCTTTGTA AATCCTTAAT 120
AATTTTTGCT ATAATAAAGC CCTAACTGAA ATTTATCATT TTATTTTAGT TAGGCTCCTT 180
GAATTAGAAT TATAGTAGAC TTGTTATACC TTGTTCTAAA TATTGTGGTA TACTAACAAT 240
GTTCAAAGAC ATGAATTGAT TACTCAG 267
(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 810 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y180.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
GAATTCTTCA CGAATATACT TAAGCTTATT ATCAAAACCC GCACGAGCGG CCGCGATTCC 60
TTCACCACAA GCTCTTGGCT AGAGATCTTA CCATCAAGGC GCCCTAAAGG CATGATTTCT 120
ACCACTTCCG CTTCCACCGT GCCAGTGAAC TTGCCATTGA CCACTAATTT ATTAGCAAAA 180
ATCTCACCCA CTACCGAGCC GGTTTGCCCG ATCACCACTG TGCTTTTAGA ATGCACCACC 240
CCTTCTAATT CGCCATCTAT GTGCAAATGG TAATCCAAGT GAAGCTCACC CTTTATTTTT 300
GTGCCTTGAG CGATGATAGT CGCTGGTCCT GTTTTTGCAT TAGCCGATTT ATTATTGTTA 360
TCAAAGATTG CCATCTGATG CTCCTTTCTT TATTGAAAAC TTCTTCAAAA TTTTTCATGT 420
TCCATTTGGT GAAACTCATG GGGTTTATGG GTTGATCTAA GAAGCGCACT TCATAATGCA 480
AATGCGGTCC TGTACTCATT CCTGTATTAC CACTATACCC AATCAACTGC CCTTTTTTGA 540
CAAATTCGCC CGTTTTTACG ACGATTTTAT TCAAATGGGC GTAGTAGGTT TTAAAACCAA 600
AAGGGTGGAA AACTTTAATC AAATTCCCAT ACCCCCCATT CCACCCTTTG CTCGCTAACC 660
CCACTACCCC GCTCGCGCTC GCATACACAG GGGTGTTAAT AGCGGTGCTT AAATCAAGCC 720
CGGTATGGTT GTGCAACACA TGCAAAATAG GGTGGATTCT TTTATTAAAG GCGGCTGAAA 780
CGCGCCGATA GGATTCTAGC GGGTAGTCAT 810
(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 228 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y184-1/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
TAGGGCGTTT TAGCGTGGGT TTAGAGATTG AAAAAGAATA TTGTGAGTTG TCTAAAAAGC 60
GTATTTTGGA GAGTTTGTCG TTAGTGTGAG CGTTTTAAAA ACCTTTGAGG GTTAAAATAG 120
TGTACAATAC TAAAGATTTT AAAACTCAAA AAGGATTGAT AATGAATTTA TTTGAAAAAA 180
TGACTGACCA ATTGCATGAG GCTTTAGACA GCGCGCTCGC TTTAGCTT 228
(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y184-1/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
CTGCCTTTTC TTAAAGATTC TAAAGTTTTT TGCAATTCCT TAGTGTCTAA ATAAGGCTTT 60
AAAACGCTTT CAAAAAGGCT CATGTTCGCC AAAAGATACA CATCGGTAGC GATGAAAGAA 120
TCGCCCGTTT TAGCCATCAA GCCTTGAGCG TTTTCTAAAC TTTGGATTAG AGCTTGGTTT 180
AATTGGATAT TTTGCTTGCT GATTTGTGAA ACTTTAGCGA GCTTATTCAA CTCGCTTTGA 240
ACGCTAATTT TAAAGCTTCA ATATCCACAG GCATTTTTTG TTAGGCTT 288
(2) INFORMATION FOR SEQ ID NO: 142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y187-2.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
ATGGCGTGAA TGATGTAACT TTAGAGAGCG TAAAAGTTTC TAGTTCAGCA GGCACGGGGA 60
TCGGTGTGTT AGCGGAAGTG ATTAACAAAA ATTCTAACCG AACAGGGGTT AAAGCTTATG 120
CGAGCGTTAT CACCACGAGC GATGTGGCGG TCCAATCAGG AAGTTTGAGT AATTTAACTT 180
TAAATGGGAT CCATTTGGGT AATATCGCAG ATATTAAGAA AAATGACTCA GACGGAAGGT 240
TAGTCGCAGC GATCAATGCG GTTACTTCAG AAACCGGCGT GGAAGCTTAT ACGGATCAAA 300
AAGGGCGCTT GAATTTGCGC AGTATAGATG GTCGTGGGAT TGAAATCAAA ACCGATAGCG 360
TCAGTAATGG GCCTAGTGCT TTAACGATGG TCAATGGCGG TCAGGATTTA ACAAAAGGTT 420
CTACTAACTA TGGGAGGCTT TCTCTCACAC GCTTAGACGC TAAAAGCATC AGGAATTC 478
(2) INFORMATION FOR SEQ ID NO:143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 762 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y190.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
AATTGAGCAT TTTGGCTTGC GCGCTAGCGT TAGCGAGCAT GCCTTGAGCG AAGCTAGCGT 60
CTGTGAAAGG GTTGAAAGGC TTGCCATTAT TACCGCCTAC TGGGGTTGAT TGTTCATGCT 120
CGTTAATGAC GCTCGTTTGA TTGACCAGCT CTTGCGCGTC CGTGATCATC TTTTGGATCG 180
CGCTGATTTC TTCTGAAAAA GCGCCGCATA TTTTCCCAGT AGTAGTAGAG AATTTTGGGG 240
CATTAGCCTC ACTACTATTA TTAGCATGGA AATACGGGCA TGCCGTGTTG ATGGTGTTAA 300
TGAGCGTGCT CGCTTGCGCT AAGAGCGCTT GAGCGCTATT AGGCACACCG TCTAATTGGT 360
TAGTGATTTC GGTGTAGGAG ACACGTTGGG TGTTACCTAG TGCGGTTGAA TCAACGACTT 420
TTGAACTGAT CGTGGTGGTT ACGCTTTTGC CGTCTATGGT TTGGATTTTA GTCGTGGTTC 480
CGCCATGTTG GTTTTTTACA CCTGTGGCTT TTTCCGAGCA GTTATCATTC CCTTCCCCTG 540
AGCATGTGTA AGTATAGCTT ACATTCACCG TTCCGTTGTT TTCTTTGAGC GCGGGTAAGC 600
CTTTTTTTAA AGCCGTTTGG AGGATCTGAT AGGCTTCGTT AATCTTTTTA AAATTTTCAA 660
TGCTCATAGG GCCGTAGTAT CCAGGCGTAT ACCCGTTCAA AGAGCAAGTG ATGGAAGTGG 720
ATCGATACCC TGGCTCGTTG TTGAAGATGG TGGTTGAAGA GG 762
(2) INFORMATION FOR SEQ ID NO: 144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 239 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y194-2/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
CAAGTGTGTA GCGATTTTTA TCAGTCTTTG ATACCAATAA GATACCGATA GGTATGAAAC 60
TAGGTATAGT AAGGAGAAAC AATGACTAAC GAAACCATTA ACCAACAACC ACAAACCGAA 120
GCGGCTTTTA ACCCGCAGCA ATTTATCAAC AATCTTCAAG TGGCTTTTCT TAAAGTTTGA 180
TAACGCTGTC GCTTCATACG ATCCTGATCA AAAACCAATC GTTGATAAGA ATGACAGGA 239
(2) INFORMATION FOR SEQ ID NO: 145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 529 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y195-1.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
GAATTCTTTC TTGCAAGATG TGCCTTATTG GATGTTGCAA AATCGCAGTG AGTATATCAC 60
GCAAGGGGTG GATAGCTCGC ACATTGTGGA TGGTAAGAAA ACTGAAGAGA TAGAAAAAAT 120
CGCTACCAAA AGAGCGACAA TAAGAGTGGC GCAAAATATT GTGCATAAAC TCAAAGAGGC 180
TTACCTTTCT AAATCCAACC GCATCAAGCA AAAGATCACT AATGAAATGT TTATCCAAAT 240
GACACAGCCC ATTTATGACA GCTTGATGAA TGTGGATCGT TTAGGGATTT ATATCAATCC 300
TAACAATGAG GAAGTGTTTG CGTTAGTGCG CGCGCGTGGT TTTGATAAGG ACGCTTTGAG 360
CGAAGGGTTG CATAAAATGG CATTAGACAA TCAAGCGGTG AGTATCCTTG TGGCTAAAGT 420
GGAAGAAATC TTTAAAGATT CTGTCAATTA CGGAGATGTT AAAGTCCCTA TAGCCATGTA 480
GGATTAGAAC AAC AGCGCT CCTCACTATC GTCTGTTCTT TTGGGGGTG 529
(2) INFORMATION FOR SEQ ID NO: 146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 424 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y196-1.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
TAATTTTAAG ACAATCATTC TTAATTTGCT ATTAAGTGTT TCAAAAAAGC ATCAAATTGA 60
CATCTTTTAA GCAAAATGTT ATAATGCTCG CTTAATTTTA TTTGGGGTGT TAGCTCAGTT 120
GGGAGAGCAC AACGCTGGCA GCGTTGAGGT CAGGGGTTCG AACCCCCTAC ACTCCACCAT 180
TATCACGCTT ATCCTCTTTC TTTATGGATC AAACCCATTT TCTTGCTAGT ATTACCCATA 240
ATTAATTATA ATCTTAGGAA TTGTTTTTCA TCAAGGGAGT TTGAGCGTGA GCAGTTTGTT 300
TAAAATGCGC ATTCTGAGTT TTAAAAAGAA TAAGCGGGCG GTGTTTTCGC TCTATCTTTT 360
TATCGCTTTA TTAGCGCTTT CTCTTTTAGC CCCCTTGTGG GTCAATGATC GCCCCTTATT 420
CATC 424
(2) INFORMATION FOR SEQ ID NO: 147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y19/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
CATCTCTTTA TTTTGCTTGT CTAGCTGTTG GATGGCCTGT TGGTTAGCGT GAATCTCATT 60
CCTTAAATCC TCTAAAGTTT GCGATTGCTG CTTTAATGTG TTAGCTTGCA CTTCTTGCAA 120
GGCTTTTAAG GCTCGCAAGG ATTCTTCTTG GGAAAGGATA GCGTTATTGA NATCTTTAAT 180
CTTATTAGCC TGCCCCTCAT AAACGCTTCT CAAACCCTCT TGAGCTTGAG TGTTAGCCTC 240
TACTTGCGAA TGGATTTTGG TCCAAATATT GGAAAAATTC TTACTATTGA CTTGCAACTG 300
CTTGAGTTCT TTCTTGGTGG CTCC 324
(2) INFORMATION FOR SEQ ID NO: 148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y1/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
GAATTCGCTC AAAAGCATAA AACAGACAGC GTGATTCAAG GCAAAGTGGT GAGCATTAAG 60
GATTTTGGCG TTTTCATTAA TGCTGATGGC ATTGATGTGC TGATCAAAAA TGAAGATTTG 120
AACCCCTTGA AAAAAGATGA AATCAAAATA GGCCAAGAAA TCACATGCGT GGTGGTTGCA 180
ATTGAAAAAT CTAACAACAA GGTGCGTGCT TCTGTGCATA GGTTAGAGCG CAAAAAAGAA 240
AAAGAAGAAT TGCAAGCTTT TAACACGAGC GATGATAAAA TGACTTTAGG GGATATTCTT 300
(2) INFORMATION FOR SEQ ID NO: 149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 238 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y1/t7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
TTCTTCTAGG TTGATGTCAT GGTTTTTTGC GTCTGTTTGA GAGAGGCCTT CATTGGTTTG 60
AGAGATCGCA TGTGTTTGAG AGATCGCATG AGAAGTAGCG GGCATTTCTT TTTTAACGCT 120
TAAAAACAAA TAGGCTAAAA CGCCAAGAGC GACAGCTAGG GAAATAAAAG CAACAACACT 180
CTTAAATCTC ATGCCACTCT CATGCTTTTA AAATCACTCT TTAGAGTTTT TCTTTAAG 238
(2) INFORMATION FOR SEQ ID NO: 150:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 500 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y200-1.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:
TACGCCTGAA GCGAAAAAAC TATTAGAAGA AGCTAAAGAG AGCGTCAAAG CTTATAAAGA 60
CTGCCTCTCC CAAGCTAGAA ATGAAACTGA AAGGAGAGCT TGCGAGAAAT TACTCACCCC 120
TGAAGCAAGG AAACTACTAG AGCAAGAAGT TAAAAAGAGC GTTAAAGCTT ATAAAGACTG 180
CGTATCAAGA GCTAGGAATG AAAAAGAGAA AAAAGAATGC GAGAAATTAC TCACCCCTGA 240
AGCGAGGAAA CTTTTAGAAC AACAAGCGCT AGATTGTTTG AAAAACGCTA AAACCGAATC 300
TGATAAAAAA AGGTGTGTCA AAGATCTCCC TAAAGACTTG CAGAAAAAGG TTTTAGCCAA 360
AGAGAGCGTT AAGGCTTATT TGGACTGCGT ATCAAGAGCT AGGAATGGCG TTAAAAATCG 420
CTATTTATGA GCGCTTGAGC AATTTAGTCG CTCCCATGAA AGCTTTAAGG GACGCTTTCG 480
CTCAAAAAGC TAAGGAATTC 500
(2) INFORMATION FOR SEQ ID NO: 151:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 232 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y203/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
TCAAAGAATA TGAAAAAACC TACGTTCGTG TTTATTCTGA ATCAGCGTGT TCTCCAGAGC 60
TTGGTTTTAG CGTGACCGGC GTGATCATGC GTGGTGTTGT GGCTACGCAA AAACCTGTGA 120
TTCCGGTTGA AAAAGAGCAT GGCGCTACGC CCCCAAAAGA ACCAAAATAG GCGTTAGAAA 180
ATTCTATCGG CATAAAAAAT GGGTGGATGC AGATGTGTGG CCAAATGGAA AA 232
(2) INFORMATION FOR SEQ ID NO: 152:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 231 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y203/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
TCTCATAACC AATCGGATCT CTTTCCTTAA ACTCTAGTTT TTTGAGACCA TTATAATGCC 60
CTGTTTTTTC TGTCCTGGCT AGGATTTCAT CTCTAGCTTG TTTTAAAGTT TTGCCGTTTT 120
TCAATAAATT TGCCATTTTG AACTCCTTTA TTTAATTTCT TTCAAGTGGA ACAATCGGTG 180
TTTGTCTAGT CTTGTCGCAA AGCCTTTGGG TATCACGAAA GTGGTCGCAT C 231
(2) INFORMATION FOR SEQ ID NO:153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 263 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y208/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:
AGCCTAAGAT TTTCACGCTT TTAGAGTTGT TGCGCCTTTT CTTAAACCAC AGAAAAACCA 60
TTATTATAAG ACGCACGATT TTTGAATTGG AAAAGGCTAA GGCTAGAGCG CATATTTTAG 120
AGGGCTATTT GATCGCACTA GACAATATTG ATGGGAATCG TGCGACTCAT TAAAACAAGC 180
CCAAGCCCAG AAAGCGGCTA AAAACGCCTT AATGGAGCGT TTCACTTTGA GCGAAATCCA 240
AAGCAAAGCC ATTTAGAAAA TGC 263
(2) INFORMATION FOR SEQ ID NO: 154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 269 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y208/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:
TAGGGTTGCC ATGATCTTTT CATCCGGAGC GAGCGAGATT AAATTCACAA TGGCTTTACC 60
CATAGCGATC CGGCTCGCTT CTGGGATTTT ATAGACTTTC AAATGATACA ATTGCCCCTT 120
ATTGGTGATA AAGAGCAAAA TATCATGCGT GTTAGCCACA AAAAAGTTTT CAATGAAATC 180
GTCTTCATAA GTGCTGCCTG AAAGCTTGCC CTTACCGCCA CGATTTTGCT TTTCATAAGC 240
TTTAAACCCA CTCTTTTCAC ATAGCCTTT 269
(2) INFORMATION FOR SEQ ID NO: 155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 324 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y20/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:
CTAAAAAATC CATGGTTAAA TTGCCCTTTC GTTTTAAAGA TAAACATTGT AGCATTTTTA 60
GATTTAAGAA TGCTTTTTAT ATATTATATA AAAATATCCC CTTTTAACCC CCTATTGATA 120
CCAACCCCTT TTTGACCTAA TTCTCATTAA ATAGATTTTT ATGATAAAAT CTAAACTTTA 180
TCAAGCCATT AGCCGGTGTT CTTTCTCATT TTTGTAAATT TTTAAAAATT TTCATACTCC 240
TGTTTACTTT TTCATTATCA TTTATGCTAT AATTATGGGA CAACTTAAAC CAACACAAAG 300
GAGATACTAT GTTTATCAAA AGAC 324
(2) INFORMATION FOR SEQ ID NO: 156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 317 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y20/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:
CATAGCTTTC AAAGTGATCT TTTTCACTTC ATCAAACACC CCCCACGCCC TCGCTAAAGA 60
CTCAATCGTC TCAGCATTGA CTAGCGTGGA GTCAAAATCA AAAACGGCTA GTTTTTGCAC 120
TCAATGACCC TAAAATTAAG ATTTCCTGCT TTTAGCGATC CCTTTGATAT ACTGATCAGC 180
CAAATACAAG CCATGGTTTT CATTACCAAT CAACTCAATT TTATCCAAAA TATCCTTGAA 240
AAGCACTTCT TCTTCATGCT GTTCAGACAC ATACCATTGC AAGAAATTGA AAGTCGCATG 300
ATCTTTGCCT TTTATGG 317
(2) INFORMATION FOR SEQ ID NO: 157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y212/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:
GCAAAAAAGT AGGCGGTAAA GAAGAAATCA CCCAAGTGGC GACTATTTCT GCAAACTCCG 60
ATCACAATAT CGGGAAACTC ATCGCTGACG CTATGGAAAA AGTGGGCAAA GACGGCGTGA 120
TCACCGTTGA AGAGGCTCAG GGCATTGAAG ATGAATTAGA TGTCGTAGAG GGCATGCAAT 180
TTGATAGAGG CTACCTCTCC CCTTATTTTG TAACGAACGC TGAGAAA 227
(2) INFORMATION FOR SEQ ID NO: 158:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y212/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:
CAATGCCTTC TTCAACCGCC GCTTTAGTCG CGCTCAACGC GTCATCCACC CGGTCTTTTT 60
TCTCTTTCAT TTCCACTTCA CTCGCAGCGC CCACTTTAAT CACAGCCACA CCGCCAGAGA 120
GTTTAGCCAA TCTTTCTTGC AATTTTTCTT TGTCATAATC GCTTGTCGTG CTTGCAATTT 180
GGGTTTTGAT TTGCGCAACT CTGTCTTTAA CATCATCGCT ATGGCCTTTG CCATCTACGA 240
TCGTGGTGTT GTCTTTGTCA ATCACAATCC TTCCGGCTTT GCTAAAAA 288
(2) INFORMATION FOR SEQ ID NO: 159:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 220 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y220/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:
TGAAAGCGGG CTTGTTTGAT AAAGAAAGCG TTAGCGTAGC TCATGTGGAA GCGGTTTCAA 60
AACTCCCAAG CAAAGAAGAG CTTATGGGAA TGTTGCTCTC TGTTTGGACG GCTCCGGTGC 120
GTTATTTTGT GACCGGTTTA GACAATTTGC GTAAAGCGAA AGAAGAAAAC TAAGAGCCTA 180
TGATGGCTTT AATTAAAAAA ATTTGAAGGA TTGGATTATG 220
(2) INFORMATION FOR SEQ ID NO: 160:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 261 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y220/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:
CGGTTTTTTA GGGGAATTTT TTTTGACATA TTTCAAGCCT TTGATCAAAA ATTTTGAGTT 60
ATTTAATGCA TTAACTCAAG TAAGCATTGC AAAAGGATCT TTAAACTTTA GAACTATCCA 120
TGATAAAAAT ACCAAAAACC CACCTGCAAT ACTAACCACT TTAGGCCACA CGCTCACAAA 180
CAAAATCATG GAGCAAAACC ATCTGCCGCA CTAGATCAGA GAAAGGCGCA AAAGGCCTTT 240
CTGTTTTTAA GTCTTACTTG A 261
(2) INFORMATION FOR SEQ ID NO: 161:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 257 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: Y221/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:
CTTCTGCATT TAGTTTCTTG GTAGCTTTTT TATTGGTAGC TGGCTGTAGT CATAAAATGG 60
ATAATAAGAC TGTGGCTGGC GATGTGAGCG CTAAAACGGT TCAGACTGCA CCTGTTACTA 120
CAGAACCAGC TCCAGAGAAA GAAGAGCCTA AACAAGAGCC AGCTCCAGTG GTTGAAGAAA 180
AGCCGGCTAT TGAAAGCGGG ACTATCGTCG CTTCTATTTA TTTTGATTTT GACAAGTATG 240
AGATCAAAGA ATCCGAT 257
(2) INFORMATION FOR SEQ ID NO:162:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 242 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y221/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:
CCATTGAGAA AAAAGGGAGC GATAAAAGGG ATAAAAAAAA ACCTTTTCAT CCTANAATTA 60
CTTCACTAAT TTGACATCCA CTCTTCTGTT TTCTTTGTAA CATTCTCTAG TTTTTTGGGC 120
GCATTTGGGT TTGGTTTCAC CAAAACTGAT GGTTTTGATC ATATCTTTTT CTACCCCTTT 180
AATGACTAAA GCGTTTTTCA CGCTCAAAGT CCTTTTAACG CCAAGCGCTT GGTTGTATTC 240
GC 242
(2) INFORMATION FOR SEQ ID NO: 163:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 839 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y224.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:
AACCTACTAC GCCCATTTGA ATAAAATCGT CGTAAAAACG GGCGAATTTG TCAAAAAAGG 60
GCAGTTGATT GGGTATAGTG GTAATACAGG AATGAGTACA GGACCGCATT TGCATTATGA 120
AGTGCGCTTC TTAGATCAAC CCATAAACCC CATGAGTTTC ACCAAATGGA ACATGAAAAA 180
TTTTGAAGAA GTTTTCAATA AAGAAAGGAG CATCAGATGG CAATCTTTGA TAACAATAAT 240
AAATCGGCTA ATGCAAAAAC AGGACCAGCG ACTATCATCG CTCAAGGCAC AAAAATAAAG 300
GGTGAGCTTC ACTTGGATTA CCATTTGCCA CATAGATGGC GAATTAGAAG GGTGGTGCAT 360
TCTAAAAGCA CAGTGGTGAT CGGGCAAACC GGCTCGGTAG TGGGTGAGAT TTTTGCTAAT 420
AAATTAGTGG TCAATGGCAA GTTCACTGGC ACGGTGGAAG CGGAAGTGGT AGAAATCATG 480
CCTTTAGGGC GCCTTGATGG TAAGATCTCT AGCCAAGAGC TTGTGGTGGA AAGGAAGGGG 540
ATTTTGATTG GGGAAACTCG CCCTAAGAAC ATTCAAGGGG GGGCGTTGTT AATCAATGAG 600
CAAGAAAAGA AAATTGAAAA TAAATAGGGA ATGATCCAAT CTAGTCTTTA TAGAGCCTTA 660
AACAAAGGCT TTGATTATCA AATACTCGCT TGTAAGGATT TTAAAGAGTC TGAGCTCGCT 720
AAAGAAGTCA TAAGCTATTT TAAGCCCAAT ACCAAAGCTA TTCTTTTCCC GGAGTTTAGG 780
GCTAAAAAAA ATGATGATTT GCGTTCGTTT TTTGAAGAAT TTTTACAGCT TTTAGGGGG 839
(2) INFORMATION FOR SEQ ID NO: 164:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 389 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y237.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:
GCGACGATAA GGGGTTAGGG TAAAGCATAT TTTTGATCTA CAAAAAATCA GCCAAGAAGA 60
TAAAAGAAGT GGTTAAATCA TAGTTGTTTA ATTAAGCACG CTACAATACG AAAATTTTAA 120
AAGAAAGGAA AAGCATGAGC GAACAACGAA AAGAATCTTT ACAAAATAAC CCTAATTTGA 180
GTAAAAAAGA CGTCAAAATC GTGGAAAAGA TTTTGAGCAA GAACGACATT AAAGCCGCTG 240
AAATGAAAGA GCGCTATCTC AAAGAAGGGC TGTATGTGTT AAATTTCATG AGCTCTCCTG 300
GCAGCGGTAA AACCACGATG CTAGAAAATC TAGCGGATTT TAAAGACTTT AAGTTTTGCG 360
TGGTAGAGGG CGATTTGCAA ACCAACAGA 389
(2) INFORMATION FOR SEQ ID NO: 165:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 199 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y238/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:
GACTGCCTTT AGTGGCTCCA AGCAAAGAAA CGATCAAACT TTTAGAAAAA ACTTTACAAC 60
AATATGAGGT AATTGCATGA ATGGTTCCAA TCACATGAAA AATAAAACCC TAGTGATCAG 120
CGGCGCGACT AGAGGGATTG GCAAGGCGAT ATTGTATCGC TTCGCTCAAA GCGGCGTGAA 180
TATCGCTTCA CTTACAATA 199
(2) INFORMATION FOR SEQ ID NO: 166:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 223 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y238/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:
CAAGAAAAAG CACCAACAAA ACAGCCCCAC ACAAAGGGGC GAAGAAATAG CCCAACAAGC 60
ATTTTTTAAT GGATTGCAAC TCTTTGTTTT GGTGGATGTG GATAATGTTT TGCAAGAAAT 120
ATCTTTATTT AAAAGTGGTC CCGCCATCTA CAACGATCGT TTGCCCTGTA AGCCAACCGC 180
TTTGGGTTTC AGCGCATAAA AAATAAGCCG CTCCGGCTAG ATC 223
(2) INFORMATION FOR SEQ ID NO: 167:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 198 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y23/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:
GGCTAAAATA AAATCGCCTA ACTTATCGTT ACGGATCACA AAAACTTTTT GAGAGTTTTC 60
TTTATCTTTG CATTTTAAAT CTTCAAACCC TACAAAATCC ATAGCTCCGC CTAAGAGATC 120
AAATTTTTTG TCATGTTAGC AAAAATCGTT CTTATGCGAT GGATAAATTT TATTTTTTAT 180
TCTGCCATGC TTTTATAG 198
(2) INFORMATION FOR SEQ ID NO: 168:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 320 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y23/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:
CATCAATTTT GCCATCAAAA CTATCAAACA CGCCTCTTGT TTCATTGAAT TTGAAGTGTT 60
TGACCTCAAA CCACACGCTA GAGTTTGCCT TATCAATCGT ATAAGGTTTT GCAAACGCCA 120
GACTAACGCC CAAAAGGGTG AACATTAACG CTTTTTTCAT TGTTTCCTCG CTTCATTTTG 180
AATTAAAAAC GCCAACTATA CAACAAATTG GTTAATGGTA AAAATACTCC TAACCATCTG 240
TTTTAAGGTA TAATACAAAA AATCCCCATT AGGTAGGTGT TGTTATGATA AAAAAGACTT 300
TGCATCCAGT TTTATTAGGA 320
(2) INFORMATION FOR SEQ ID NO: 169:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 678 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: antigenic region of structure meld for Y241.asm and Y261/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:
GGATTTGCAA ATTCTTGAGG ACAATAAAGA AGGGCGTAGT CATTAGGTAG AAAAAGACTC 60
CCCGGCTAAA AAAGCAGGGA TTTTGGTGTG GGATTTGATC ACCGAAGTCA ATGGCAAAAA 120
GGTTAAAAAC ACGAACGAAT TGAGAAATCT AATCGGCTCT ATGCTACCCA ATCAAAGGGT 180
AACCTTAAAG GTCATTAGAG ACAAAAAAGA ACGCGCCTTC ACCCTCACAC TTGCTGAAAG 240
GAAAAACCCT AACAAAAAAG AAACCATTTC TGCCCAAAAC GGCGCGCAAG GCCAATTGAA 300
CGGGCTTCAA GTAGAAGATT TAACCCAAAA AACCAAAAGG TCTATGCGTT TGAGCGATGA 360
TGTTCAAGGG GTTTTAGTCT CTCAAGTGAA TGAAAATTCC CCAGCAGAGC AAGCCGGATT 420
TAGGCAAGGT AACATTATCA CAAAAATTGA AGAGGTTGAA GTTAAAAGCG TTGCGGATTT 480
TAACCATGCT TTAGAAAAGT ATAAAGGCAA ACCCAAACGA TTCTTAGTTT TAGATTTGAA 540
TCAAGGTTAT AGGATCATTT TGGTGAAATG ATAGAGGTGG GTTGTTAGTC GCATGTCTTT 600
GATTAGAGTG AATGGGGAAG CTTTTAAACT CTCTTTAGAA AGTTTAGAAG AAGACCCTTT 660
TGAAACTAAA GAAACGCT 678
(2) INFORMATION FOR SEQ ID NO: 170:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 489 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y247.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
GAATTCCTAT AATTATACAA GCGATAAGGC TGGCACTTAT TATTTGACGA GCAACATCAA 60
AGGTTTTAAT CAAAACAATG AGATACCAGG AACTTATAAC GCGCAAAACC AACCCTTACA 120
AGCCTTGCAC ATTTACAATC AGGCTATCAC TAAGCAGGAT TTAAGCGTTA TTGCCAGCTT 180
GGGTAAGGAA TTTTTGCCTA AAATAGCCAA TCTTTTATCT TCAGGGGCTT TGGACAATCT 240
CAATCTCAAT AGTCCGGATA GTTTTGAAAC TCTTTTTGGT ATCTTTGAAA AATATGGTAT 300
CACTTTAACT CAAGCAAATT GGAAGAGTTT ATTGGGTATT ATCAATAATT TTTCCAACAC 360
GGCTAATTAT CATTTCTCTC AAGGCAATCT CGTGGTGGGG GCGATCAAAG AAGGGCAAAC 420
GAACACTAAT AGCGTGGTGT GGTTTGGAGA TAACACTTGC CAGATGTTTA GCGAGCGGGC 480
CGCGAATTC 489
(2) INFORMATION FOR SEQ ID NO: 171:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 216 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y250/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
AATACGCGGC CGCTCGTATT TTAACAACAA GCGTTTGTTT TATAGTTTAT TTATCATTAA 60
TTAGGCTATG TTAATCAAAG TTTGATAGTT AAAAATCTCT AACNCCCCTT AATTTTCTAA 120
AAAGCGCTTT ACTTAGCGCA CTTCTTGATA TCAATATTCC AGTCAGCACC GGATTTAATT 180
TGGTATTTTT GCGCGAAATC ATCCCTATTC AGCGCC 216
(2) INFORMATION FOR SEQ ID NO: 172:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y261/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GCGTTACTCA AGGCATTGTT TCAGCGCTCA ATAAAAGCGG GATTGGGATC AACAGCTATG 60
AGAATTTCAT TCAAACAGAC GCCTCTATTA ATCCTGGAAA TTCCGGCGGC GCTTTAATTG 120
ATAGCCGTGG AGGGTTAGTG GGGATTAATA CCGCTATCAT CTCTAAAACT GGGGGCAACC 180
ACGGCATTGG CTTTGCCATC CCTTCTAACA TGGTTAAAGA TATTGTAACC CAACTCATCA 240
AAACCGGTTA GATTGAAAGA AGTTACTTGG GCG 273
(2) INFORMATION FOR SEQ ID NO: 173:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 217 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y264/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
TGGCCCTTTA ACGCTCACTT CTTCAGCCAC GCTAAATTCA TTCTTAGAAA CATCTTTCCA 60
ATCAATGAAA TAAGTCGTTT GAGTGTTTTC ATTCTTGTTT CTCTTTTTCA TTCCTAGCTC 120
TTGATACGCA GTCCAAATAA GCCTTAACGC TCTTCTTAAC TTCTTGCTCT AAGAGTTTCC 180
TCGCTGCAGG GGTGAGTAAT TTTTCGCAAG CTCTCCT 217
(2) INFORMATION FOR SEQ ID NO: 174:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y268/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
TGCCTTCTAG TATCAAAGTC GTTGCCCCGC AAGCTAAAGG CCCATAAACC ACATAAGTGT 60
GCCCTGTGAT CCAACCAATA TCAGCGGTGC ACCAAAAATT ATCGTTATCT CTAATATCAA 120
AAACCCACTC CATCGTCATT TGCGCCCATA ACAAATACCC TGCACTGCTG TGTTGAACGC 180
CTTTAGCTTT CCGGTTGATC CGCTTGTATA GAGCAAGAAT AAAGGATCTT CAGAGTCCAT 240
CATTTCA 247
(2) INFORMATION FOR SEQ ID NO: 175:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 222 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y269/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
GGTTGTTCGC TGAAACGCAT GTTGATCCTA AAAACGCCCT AAGCGATGGG GCGAACATGC 60
TAAAACCTAG CGAGCTAGAA CACTTAGTAA CCGACATGTT AAAAATCCAA AATTTATTTT 120
AAAGGAATTT CATGCGAATC ATAGAAGGGA AATTGCAATT GCAAGGGAAT GAAAGAGTCG 180
CTATTTTAAC ATCGCGCTTC AATCATATCA TCACAGACAG AT 222
(2) INFORMATION FOR SEQ ID NO: 176:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 241 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y269/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
ACAATTCAAC CACAGCCCCC CTGGCTCGAG TTCGTGTCGC CATTTTAACC CTTGAGAGTT 60
TGGCACAAGC TCAACAATTC AATGAGGGTG CTCATCGCTT CAAAGCCCTT ATTGCCGGCT 120
TTACTGCCCG CTCTTTCAAT CGCTTGTTCA ATATTGTCTG TGGTCAGCAC GCCAAAGCTT 180
ACCGGCATGC TGTATTTGAG CATCGCATGG GCAATGCCCT TAGTCGCTTC CGCGCTCACA 240
T 241
(2) INFORMATION FOR SEQ ID NO: 177:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y26/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:
TCATTTTGAG CCATGGCTTA GCATGGAGCT TATGCCTTAT TGTGAAACCC GTTTGTATGG 60
CTTTAGAGTC ATGCTCAATT ACTTGATTTA TCAAGAAATG TTTGGGAATT TCATCCCTAT 120
TGATGGATTT TTAGAACAAA CTC 143
(2) INFORMATION FOR SEQ ID NO: 178:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 233 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y26/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
TGCACCAAAT CGTAAGTGGC TTCGGTGTGG CTCAGGGATT TAGCGCTGAT TTGGAGCGTG 60
TTAGGATTAA GCCATTCTAA GCTTGCGCCT AAATTTTGCA ACAATAACGC CATCGCTTTT 120
ATATCCACCA CTTGGGGCGA GGATTTGATT TTGACTTCCT GTTGGCTTAA AAGCGTACCG 180
GCTAAAATGG GGAGTGCGGA GTTTTTCGCC CCTGAAATTT CTACCCCCCC TTT 233
(2) INFORMATION FOR SEQ ID NO: 179:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 284 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: Y270/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
GAATTCAACC ACCCCCTTTT TGTCGCCTAC GCTTATAACG CTGGGCCTGG GTTTTTAAGG 60
AGGTGGTTAG AAAGCTCCAA ACGATTTAAA GAAAAAAATC ATTTTGAGCC ATGGCTTAGC 120
ATGGAGCTTA TGCCTTATAG TGAAACCCGT TTGTATGGCT TTAGAGTCAT GCTCAATTAC 180
TTGATTTATC AAGAAATTTT TGGGAATTTC ATCCCTATTG ATGGATTTTT AGAACAAACT 240 CTTAACTCAA AGGACAAACC ATGATTAAAA AATGCCTTTT TCCT 284
(2) INFORMATION FOR SEQ ID NO: 180:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 259 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y270/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
CTTTTATTAG GAATATTTTG CTTTAAGGGG GGGCGAGTAA GCCTCCAAAC AAGCCCCATT 60
AATAACACAG GCAACAAGCA TACAAAGAGG ATCAAATACA TGTAAGCGTA ATAAGTCAAA 120
TTGCCCACGT TGATAAAATT TATAAGCGTT TTTTATAATA AGCGTTACTC GCTTCAATAT 180
AGCCTTCCAC GCTCCCGCAA TCGTATCGTT TGCCCTTGAA CTGATAAGCG ATGATGCGCT 240
TTCTTTTGGC TTGAGTGCC 259
(2) INFORMATION FOR SEQ ID NO: 181:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 332 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y28/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
CCCTTTTATT TTTGATTGTT AATGGGGTTA TTAGGGTTGT TTTGTTGGGC TTCTTGGAAC 60
GCTTTCATCT CGGCTAAAAT ATTGACCATC GCCATTAAAG CCATGTTGTA ACCAAGCGGG 120
CCAAATCCCA TGATCACGCC TCCACAAGCC GCTCCGGTGT AAGAATTTTT CCTGAATTCT 180
TCTCTAGCTT GAATGTTAGT GAGATGCACT TCAATAACGG GTTTGCCCGC TAGCATGATC 240
GCATCCGCAA TCGCAATAGA ATTGTTGCGA AAACGCTCCA GGGTTAATGA TAATCCCTTC 300
ATAATCGCTG CCCACGCTCT CTTGGATCTT GT 332
(2) INFORMATION FOR SEQ ID NO: 182:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 331 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y28/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
GGCTTTTTGA TCTGGCTCTT TTTTTTATTA GTTATTGTAA AGATTTTTTG GAGTGGGATC 60
AAGCAAAATT CTTTAATATC GCTCTTTATA TTAACGACAC TCGCTTTTTA CCTCATTTTT 120
GGCATCGGGT TTGACCCCTT TGATTTCTTC ATTACGGGAA GTTTTTTTGT AGGAATAATC 180
ATGATGGCTG TTTTTTTAAA AAAGGATAAA AGCGCTTTTT AGCATCAAAG GGTTTGATGT 240
TAGTCAAGCG GTANTTTCTT GACAATGCGT TCTTAGGGGA TTTAAGGGGT AAGATAAAAA 300
CTTATGCTAT AATGCGGTTT CATTCCATAT T 331
(2) INFORMATION FOR SEQ ID NO:183:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 501 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y29.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:183:
TAGAGCAAGA AGTTAAGAAG AGCGTTAAGG CTTATTTGGA CTGCGTATCA AGAGCTAGGA 60
ATGAAAAAGA GAAACAAGAA TGCGAGAAAT TACTCACCCC TGAAGCGAGG AAATTTTTAG 120
CGAAGCAAGC ATTAAGTTGT TTGGAAAAAG CTAGAAATGA AGAAGAAAGA AAAGCATGTC 180
TTAAAAATCT CCCTAAAGAC TTACAGAAAA ATGTTTTAGC TAAAGAGAGC CTTAAAGCTT 240
ATAAGGACTG CCTCTGATCC CATTTTAACG CTTGCAAGTA TTCTAGTCGT GGGTAAGAAG 300
GTTTTTGCTC CACTAATCTT GTGTATTTAG CGATATTTTC TTGAGTGGTT TCAATGGAGT 360
TTTCTAATTC CTTGATTTTT TCAGGGAGTT TAGTGATAGC GTTATCCAAT CTCTTTAAAA 420
ACCCATCAAA TTTGATCTCA GCGCAGAAAT TATAAGAGCT AAACAGTTGG TTAGTATCGT 480
TTTTATAAAC CATATTGCTA G 501
(2) INFORMATION FOR SEQ ID NO: 184:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 548 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y30.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
TCTTCAATGA CGATCCCAAT AGAACCTTAT ACAACTATTT GAATATTGCA GAAATTGAGG 60
ACAAAAACCC ATTGAGAGCC TTTTATGAGT GTATTAGTAA TGGTGGCAAC TATGAAGAAT 120
GTTTGAAGCT TATCAAAGAC AAAAAACTTC AAGATCAAAT GAAAAAGACT CTAGAGGCTT 180
ATAATGACTG CATCAAAGAT GCCAAAACTG AAGAAGAAAG GATCAAGTGT TTAGATTTAA 240
TCAAAGATGA AAACTTGAAA AAAAGCTTAC TGAACCAACA AAAAGTTCAG GTGGCGCTAG 300
ATTGTTTGAA AAACGCTAAA ACCGATGAAG AACGAAACGA GTGCCTAAAA CTCATAAATG 360
ACCCTGATAT TAGAGAGAAA TTCCGTAAGG AATTAGAGCT TCAAAAAGAG CTTCAAGAGT 420
ATAAGGATTG TATCAAAAAC GCCAAAACAG AGGCTGAGAA AAATGAATGC TTGAAAGGCT 480
TGTCTAAAGA AGCTATAGAA AGATTGAAAC AGCAAGCGCT AGATTGTTTG AAAAACGCTA 540
AAACCGAC 548
(2) INFORMATION FOR SEQ ID NO: 185:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 335 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y33/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:185:
CAAAGATTTT AGATTCTTTC AAACGCCTAA TCCTGTCTTG TATCTCTTCG TAATATTTTT 60
CAGCATCAGC GTTAGACTTC TCGTTTTTTT CCCAACGACT GCCCTTGTTG TAAGATTTAA 120
TCATGTCTTT TAGATTGTCA TGGTAGCGTG TTTTCCAATA GAGCAACTCT TTTAAAGCCA 180
CTTCAGAAGC AAACGCATCG TCTTTAATGA GCAATTCCCC CATCACATTA CGCAAAAAGG 240
GGCTATCATT ATGCCCATAG CTTTTTAGAA CGCTAGGGAT ATAAGAATGG TACACACCCA 300
GCGCTCGGAT CGGAAAAATT GATTTTATAA ACCCC 335
(2) INFORMATION FOR SEQ ID NO:186:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 349 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y33/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
AGCGTGATAA AGCACTTCTT TATTGTTTTT GAACGCTGTT TGGAAATGCT GGTGATCTTT 60
TAACCCTAAT TTTTGATAAA ACCCTAAGCC TAATGCGTTC AACCCTTGCG TTTTATCTTC 120
ATTGTTGCAT GCGATTTTAA CCGGAGGCCC AAAAAATTCC TCTTTGGTTT CTTGGGTTAA 180
AAGGGGAAAA TCTTTAATGA GCAAACACAA ACGCCTAGGG GTGTAAAAAA TCTCTACATT 240
TCCCACTTCT AAAGTGCGTT TTTGAAAAAG AGCATGGAGT TTTTTAGGCA TTTCTTTATA 300
TTCATTCAAT AACGCTTGTG CGGGCAATTC TTCAACTAAA ATCTCTACT 349
(2) INFORMATION FOR SEQ ID NO:187:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 492 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y37.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:187:
CTTTGGAGCT TGCGGCTTTG GCGGGGTTTT AATATCCAAA TTCAAAGCTT TCACGCCTAT 60
TTTCTGCACC GCTTCAGAGA TATTTTCCCT TAAAATCGCG CTCGTTTCTT CAGAGCTTGA 120
GGGGATTTCT ATTAAAAGCC CTAATTGATT GTCATGCAAA GCGATGCTTT TAACAAAACC 180
AAAGCTGACA ATATCCTTTT CAAAATTAGG GTAAATGATC GTTTTTAACG CGTTTAAGAC 240
ATCTTCTTGG GTGAGCATTT TAATCCTTAA AATGATAGTA TTTGAATTTT CTTTATAGCA 300
TGTTTTGTTT AATTTTGCAT TATAAATTAA ATTTGAGTTT TCAAAAAGGA TATGGCATGA 360
AATTTTATAA GCGTGTTTTG AAATTGCACC ATTTTAGGAA TTTGGGTAGA AACTCGCCTA 420
TGGAGTTGCT TTTAAATTCA AGTTTTGAAA AACATGGAGG ATTGGTGGTT TTAGTGGGGG 480
AAAATAATGT CG 492
(2) INFORMATION FOR SEQ ID NO: 188:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y3/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:188:
GGTTTGAAGC GCACAACACC GACTTGGAAA ACAGGATTAA AAATAACGCC AGCTTGCCTT 60
TGGGGGGCGA TTTTAACCCA ACGGATAGAG ATTATGAAAA GCATATTTCT CATGCGTCTC 120
AAGTCAAAAG GGATAAGCAA TGCATCACCA CTGAGAACTG CTTTGACAAT TATGATTTGT 180
ATTTGAATTA CATCAAAGGT GGTCCTGGAT TTGGCGATCC GATTGAAAGG GATTTTGAAT 240
GCGATTTTAG AAGAT 255
(2) INFORMATION FOR SEQ ID NO: 189:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 357 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y40/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:
GATGATCTCT TTTAATTCTA GCCAATCCAC TTGGTCTAGT TTTTCCAAAC ATTCATCATC 60
GGCTATCAAA ATCCATTCTG GATCTTCTTT AAGGCACGCA GATAGCTCTT GATAGTGGTT 120
AAAATTTTCC ATAGGCAATT CTAATTTCTT AGAGACGCTC TCAAGCAGCT TTGTGATCAT 180
GGGGTTTTGG TTGAATAGAA TCATTTTCAT TTTAGGGACT CCTTATTTGA TAAGAAACGC 240
TTCTTTAAGC CTATCCATTT TATCGCTTTT TTTAAATCTT TTATTATAAT CAAAATTCCA 300
CTTGGTTGGT TGTTTTTTAT CATAGAGTGT AATTTAAAAT AAGGATCATT TGATGTT 357
(2) INFORMATION FOR SEQ ID NO: 190:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 332 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y40/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:190:
ACACCACTGA TGCACTCAAG GTTCAAATGC GTAGGGTCTT TGCCTTTTAT GTGGGGTATA 60
ATTACCACTT CTAAAAGGGC TTTTAAAACC CAACGCAACT CCCTAACATC TTTTGGTAAT 120
AGCTCTTGGC TTTGAGGATC GTTTCTGTGT CATCGGTTTT TAAAAGCACT TCGTAATTGT 180
TTTCAAAAGC GTTTTTGCTC CAATTCGCTG AGCCTAAAAA CACGATCTTA TCATCAATGA 240
TCGCCACTTT TTGGTGCATG ATGCCGTAAT AATTCCCGTT TTTAGCCTTA AGCCCTTTCA 300
ATAAGCACAC TTTCGTGTTA GGGTATTTGT CT 332
(2) INFORMATION FOR SEQ ID NO: 191:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 299 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y42/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:
GCTCAAGCGA GATGGCGACT TATCAAAATG TGAATGATGC CACTAAAAAT ACGACTGCAA 60
GCATTAATAG CACGGATTTA TTGCTAACCG CTAACGCGAT GTTAGATTCC ATGTTTAGCG 120
ACCCTAATTT TGAGCAACTC AAGGGCAAGC ATTTGATTGA AGTTTCAGAT GTGATTAACG 180
ACACCACGCA ACCCAATTTG GACATGAATC TTTTGACGAC TGAAATTGCG CGGCAGTTGC 240
GGTTGCGATC TAATGGGAGG TTCAATATCA CAAGGGCGAA CGGAAGGAAT TGGCATTGC 299
(2) INFORMATION FOR SEQ ID NO: 192:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y42/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:192:
AGGCGTTTCA TTAAAACCCT ACTTTTTAAC CATGCCCAAC TCTTCGCGCA CTTTATCCAC 60
AATTTGCTTA TCCAAGCCCA CTAAAACAAA AACCCTATCT TTACCGACAT AGCGGGCAAG 120
CATTTTAGAA GCGATCAATT CCTTATCCAC TAATTGAGAA ATTTTTTCAG TATCAGTGCC 180
GCTTATGGAC CTTTTACCAG AAGCGTCTAC CGTTCTGGTT TTTTCATTTT CCAAATCTTT 240
TTGTAAAGTG GATTTTAAAT TCGCCGCTAA ATTAGCCCTA GCCTTCGCTG TAGCCTGGTT 300
AGTAGAATAA TCCACATCAT TATTGGTGAT CAAATCT 337
(2) INFORMATION FOR SEQ ID NO:193:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 269 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y43/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:193:
CCGGGCTTGA TTTAAGCACC GCTATTAACA CCCCTGTGTA TGCGAGCGCG AGCGGGGTAG 60
TGGGGTTAGC GAGCAAAGGG TGGAATGGGG GGTATGGGAA TTTGATTAAA GTTTTCCACC 120
CTTTTGGTTT TAAAACCTAC TACGCCCATT TGAATAAAAT CGTCGTAAAA ACGGGCGAAT 180
TTGTCAAAAA AGGGCAGTTG ATTGGGTATA GTGGTAATAC AGGAATGAGT ACAGGACCGC 240
ATTTGCATTA TGAAGTGCGC TTCTTAGAT 269
(2) INFORMATION FOR SEQ ID NO: 194:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 311 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y43/t7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:194:
TCTAATTCGC GGCCCGCTCG TATAAAGACT AGATTGGATC ATTCCCTATT TATTTTCAAT 60
TTTCTTTTCT TGCTCATTGA TTAACAACGC CCCCCCTTGA ATGTTCTTAG GGCGAGTTTC 120
CCCAATCAAA ATCCCCTTCC TTTCCACCAC AAGCTCTTGG CTAGAGATCT TACCATCAAG 180
GCGCCCTAAA GGCATGATTT CTACCACTTC CGCTTCCACC GTGCCAGTGA ACTTGCCATT 240
GACCACTAAT TTATTAGCAA AAATCTCACC CACTACCGAG CCGGTTTGCC CGATCACCAC 300
TGTGCTTTTA G 311
(2) INFORMATION FOR SEQ ID NO: 195:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y46/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:
TACACTCAGT TTAGTTTATT TCTTAATACA AAAGGTAGGC GTTTTGAAAC ATTTAACCCC 60
ACTCACTCAC ACCATCTTTA AAGCCTTATG GCTAGGCACA GCCTTAAGCG CATCTTTAAG 120
TTTAGCCGCA GCAGAAAGCC CCACTAAAAC AGAGCCTAAG CCCGCTAAAG GGGTTAAAAA 180
CAAACCCAAA TCGCCCGTTA CTAAAGTCAT GATGACCAAT TGCGACAATA TTAAAGATTT 240
TAACGCTAAG CAAAAAGAAG TCTTAAAAGC CGCTTATCAA TT 282
(2) INFORMATION FOR SEQ ID NO:196:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 294 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y46/t7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:
GGGGCATGGC GTTGCCGATA GGGTCTAAAT CCAGGTTGCT ATTAGCACTT TTTTGCAATT 60
CTTGGTCATT ACTGGACTGC GAATCAAAGA TTTTAGATTC TTTCAAACGC CTAATCCTGT 120
CTTGTATCTC TTCGTAATAT TTTTCAGCAT CAGCGTTAGA CTTCTCGTTT TTTTCCCAAC 180
GACTGCCCTT GTTGTAAGAT TTAATCATGT CTTTTAGATT GTCATGGTAG CGTGTTTTCC 240
AATAGAGCAA CTCTTTTAAA GCCACTTCAG AAGCAAACGC ATCGTCTTTA ATGA 294
(2) INFORMATION FOR SEQ ID NO: 197:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y47/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:
TGTGGGGCGA TGATCAAATC GCTCTCTCTA AACTCGTGTT TGATTTCCAA AAAGAGCATA 60
AAGATCACTT TGTGTTGAAA GCGGGCTTGT TTGATAAAGA AAGCGTTAGC GTAGCTCATG 120
TGGAAGCGGT TTCAAAACTC CCAAGCAAAG AAGAGCTTAT GGGAATGTTG CTTTCTGTTT 180
GGACGGCTCC GGTGCGTTAT TTTGTGACCG GTTTAGACAA TTTGCGTTAA GCGAAAGAAG 240
AAAACTAAGA G 251
(2) INFORMATION FOR SEQ ID NO: 198:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 290 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y47/t7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:198:
CAGAGAAAGG CGCAAAAGGC CTTTCTGTTT TTAAGTCTTA CTTGACTTCA ACCTTAGCAC 60
CTACTTCTTC AAGTTTCTTC TTGATGGTTT CAGCTTCTTC TTTATTCACG CCCTCTTTAA 120
GCACATGAGG GGTTTTTTCG GTAGCGTCTT TAGCTTCTTT CAGGCCAAGT CCAGTGATTT 180
CACGAACCAC TTTAATCACC TTAATTTTTT CAGCACCGCT ATCGGCTAAA ATCACATTAA 240
ATTCGGTTTT TTCTTCGCTC TCAGCCGCTG CACCGCCAGC TACAGCCGCA 290
(2) INFORMATION FOR SEQ ID NO: 199:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 491 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y48.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:
TGAAAAACGC TAAAACCGAA TCTGAGAAAA AAAGGTGTGT CAAAGATCTC CCTAAAGACT 60
TGCAGAAAAA GGTTTTAGCT AAAGAGAGCG TTAAGGCTTA TTTGGATTGC GTATCTCAAG 120
CCAAAACTGA AGCTGAAAGA AAAGAGTGTG AGAAATTACT TACGCCTGAA GCGAAAAAAC 180
TATTAGAAGA AGCTAAAGAG AGCGTCAAAG CTTATAAAGA CTGCCTCTCC CAAGCTAGAA 240
ATGAAACTGA AAGGAGAGCT TGCGAAAAAT TACTCACCCC TGAAGCGAGG AAACTCTTAG 300
AGCAAGAAGT TAAGAAGAGC GTTAAGGCTT ATTTGGACTG CGTATCAAGA GCTAGGAATG 360
AAAAAGAGAA ACAAGAATGC GAGAAATTAC TCACCCCTGA AGCGAGGAAA TTTTTAGCGA 420
AGCAAGCATT AAGTTGTTTG GAAAAAGCTA GAAATGAAGA AGAAAGAAAA GCATGTCTTA 480
AAAATCTCCC T 491
(2) INFORMATION FOR SEQ ID NO: 200:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 332 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: antigenic region of structure meld for Y49T7 and Y69/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:200:
GAATTCCTCA AAGACCAAAT AGGTATATTG ATCAAAAAAA TCTGCTTATC TTGATTCTTT 60
AGTTAAAATT ATCCGAGCCG ATGGAGTAAG CTACCCCATC GTTTTAGAGC TTGGGCGATC 120
CACTAAAAAC ACCCAAGAAA GAAATAAGAA TTTTTAGCTC AAAACCCCTA CCGCGCTTCT 180
ATGACAACCG GTTTAGTCAT TTTTTTGCTG ACTTCATCAG CATGCATTAA GGCTTCTTGC 240
TTGCTCTTAT AAGGGCCTAT GAGATAGCGT TTAGTAGCCC CCCTATCTTC AATCTTATGG 300
GGGAATTGGT TAAACGCTTG CAAAAAGGCT TT 332
(2) INFORMATION FOR SEQ ID NO: 201:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 232 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y53/t3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:201:
AATACTTTTT CTAAAAAATC CATGGTTAAA TTGCCCTTTC GTTTTAAAGA TAAACATTGT 60
AGCATTTTTA GATTTAAGAA TGCTTTTTAT ATATTATATA AAAATATCCC CTTTTAACCC 120
CCTATTGATA CCAACCCCTT TTGGACCTAA TTCTCATTAA ATAGATTTTT ATGATAAAAT 180
CTAAACTTTA TCAAGCCATT AGCCGGTGTT CTTTCTCATT TTTGTAAATT TT 232
(2) INFORMATION FOR SEQ ID NO: 202:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 321 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y53/t7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:202:
GTGATCTTTT TCACTTCATC AAACACCCCC CACGCCCTCG CTAAAGACTC AATCGTCTCA 60
GCATTGACTA GCGTGGAGTC GAAATCAAAA ACGGCTAGTT TTTGCACTCA ATGACCCTAA 120
AATTAAGATT TCCTGCTTTT AGCGATCCCT TTGATATACT GATCAGCCAA ATACAAGCCA 180
TGGTTTTCAT TACCAATCAA CTCAATTTTA TCCAAAATAT CCTTGAAAAG CACTTCTTCT 240
TCATGCTGTT CAGACACATA CCATTGCAAG AAATTGAAAG TCGCATGATC TTTGCCTTTT 300
ATGGCGTGAT CAACGATATT G 321
(2) INFORMATION FOR SEQ ID NO: 203:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 608 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y55.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:203:
TGAGCGGGCG GCTTCTTTTT ATAAGAGCGC GATTAAAAAT GGCGAGCCGC TTGCTTATGT 60
TCTTTTAGGA ATCATGTATG AAAATGGTAG GGGTGTGCCT AAAGATTACA AGAAAGCGGC 120
TGAATATTTT CAAAAAGCGG TTGATAACGA TATACCTAGA GGGTATAACA ATTTAGGCGT 180
GATGTATAAA GAGGGTAGGG GCGTTCCTAA AGATGAAAAG AAAGCCGTGG AGTATTTTAG 240
AATAGCTACA GAGAAAGGCT ATACTAACGC TTATATCAAC TTAGGCATCA TGTATATGGA 300
GGGTAGGGGC GTTCCAAGCA ACTATGTGAA AGCGACAGAG TGTTTTAGAA AAGCGATGCA 360
TAAGGGTAAT GTAGAAGCTT ATATCCTTTT AGGGGATATT TATTATAGTG GGAATGATCA 420
ATTGGGTATT GAGCCAGACA AAGATAAGGC GATTGTCTAT TATAAAATGG CGGCTGATAT 480
GAGCTCTTCT AGAGCTTATG AAGGGTTAGC AGAGTCTTAT CAGTATGGGT TAGGCGTGGA 540
AAAAGATAAG AAAAAGGCTG AAGAATACAT GCAAAAAGCA TGCGATTTTG ACATTGATGA 600
AAATTGTA 608
(2) INFORMATION FOR SEQ ID NO: 204:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y56/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:204:
GTCCTGGATT TGGCGATCCG ATTGAAAGGG ATTTGAATGC GATTTTAGAA GATCTCAACA 60
GCAAACAGCT ATTGCCAGAA TACGCTTACA AGGTTTATGG CGCAGTGGTG AGTCAAAATA 120
AAGACGGCGT GTGGGTCGGC GATGAAGCCA AAACGAAAGC CAGAAGAAAA GAAATTCTTG 180
AAAACAGAAA GCTAGATCCA TACCGGTAAA ACAATGGAT 219
(2) INFORMATION FOR SEQ ID NO: 205:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 291 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y56/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
GGTAGATTTT GTCATCAAAA TCTACCTTAT CTTGCAACAC CTTCAAATAC ATTTGGAATC 60
TTTCATGATC TTTAGGCATG CTAAGCATTT TTAAGACAGT GTTCCAATCC AAGTTCCCCT 120
CTACCAAATT TTTAATTTGT TCTTGTGTGT ATTTTGACAT AACCATTCTC CTTTCTTTAT 180
TTCTCATCAA CCAACAGAAC TGTGCGCACA TCAGGCAATT TGCTCAAATC CATCCTGTAT 240
TTAGAACCAT AGGTGAATAC GCCAAGCTCA TCTTCTTTCA TATCCACTCT T 291
(2) INFORMATION FOR SEQ ID NO: 206:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y60/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:206:
GGCATTTATT GGCGGACACT TACGCTCTAG CTGCAACGAC TGGGAATGTG TTTATCCTAA 60
AAGCCGATTA TAATATCCAC AAATGGGGAC TTACTTTAAC CTGGCTCTCG CGTTTTGTAA 120
CTAACATGTT TTATGAAGGT TATTCTATCT ACTATCCGCA ATACGGCTTG ATGAAAATCC 180
ATAAACCCGG GTTATGGCGT GCATAATGTC TTTATCAACT GGACCCCCAC TTCTAAAAAA 240
TGGCAGGGTT TAAGGATTTC AGCCG 265
(2) INFORMATION FOR SEQ ID NO: 207:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 295 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y60/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:
GAATTCAGCA TGGTGCGCTT TAAGGGCGAT AAAATAAAAG CCCAATCTGT CTATGAAAAC 60
ACCATTTATC TCAAACCACA AGATATTGCT AACATCGTGC TATGGATTTA TGAACAACCC 120
TTGCATGTCA ATATCAACCG CATAGAAATC ATGCCCATAA GCCAAACTTT CGCTCCCCTA 180
CCCACCCATA AAAACCCTTA AGGGGTTTTA AGAATAAATT TAAACTTTAA TAACCCTTTA 240
AGTTTTAAAA AAAGGGGATT TTGGTTGTTT AAACTCATCA TCTTAAGAGA ATAGG 295
(2) INFORMATION FOR SEQ ID NO:208:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 266 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y61/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:
CCCTAACGCG CGGCAAGCCT GCCCATCATC CAATCGGCAT GCTTTCAAAT AATAGATGAC 60
GGCTTTATGG TTGCTTTGTT TGACTTCTAA AAAGCCTTTT TCATAAGCCA CGCCCAAAAC 120
ATAACACCCC TTAGCCATAT CAAAATTACA GCTTTTTTGA TAGTAGGTAA CCGCTTGATC 180
GGCATTGCCC TTGATTTTTC GGTCATAAAT AATGCCTAAA TTGTTAGCAA CTCTCAGGGT 240
GTTTTTAAAA CGCACGCCTT AGCATA 266
(2) INFORMATION FOR SEQ ID NO: 209:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 244 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y61/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209:
AAAGAAGGGT TCTAAAAATC AGAATTTAAA AGAAGGTAAA AATGAGTGTC AAAATTTTAA 60
AAATATTAGT TTGTGGGTTA TTTTTTTTGA ATGCCCATTT ATGGGGGAAA CAAGACAATA 120
GTTTTTTGGG GGTTGCTGAA AGAGCCTATA AAAGCGGGAA TTATTCTAAA GCCACATCTT 180
ATTTTAAAAA AGCATGCAAC GATGGGGTGA GTGAAGGTTG CACGCAATTA GGAATCATTT 240
ATGA 244
(2) INFORMATION FOR SEQ ID NO: 210:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 286 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y73/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:
CCTCCAACGC CCGTAAATCG TAATTGCGCT TCCATTATTG TTTTCCTTGT GCTTTTTCAA 60
TGATTTCTTG ATACGCTTCG CAATATTCTT TCCTGTCCGT GTCATGCTTT AAAACGCCTG 120
TAGGGAATTT ATCCACCCTT TCTTCAGGGC TCATGGCTTC AAATTGGCGT TTGCTCACCA 180
ATCGGCTTTC CATCCATTTT AGCATTTGAG ACGCTTCGCC CATTTTATTC TTGCGCCCTA 240
AATTGATATG GCAATTACTA TGGACATCAA AGAAGCTAAA GCCCTT 286
(2) INFORMATION FOR SEQ ID NO: 211:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 286 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y73/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:
TGAAAGGGCT ATGCAAAGAA AGGTGCATTT CTTGGGGCAA GCCAATGGGC GCACGATTTC 60
GCCTAAACAA ATCATCGCAA AATTGAAGGA GCTTTAAAAT GGCGTTTAAT TATGATGAAT 120
ATTTGCGTGT GGATAAAATA CCCACTTTGT GGTGTTGGGG CTGTGGCGAT GGCGTGATTT 180
TGAAATCCAT TATCCGCACG ATTGACGCTT TAGGCTGGAA AATGGATGAT GTGTGCTTGG 240
TGAGTGGGAT TGGTTGCAGC GGGCGCATGA GCTCCTTATG TGAATT 286
(2) INFORMATION FOR SEQ ID NO: 212:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 551 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y81.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:
GAATTCCTTA CCACTTCGGA TAAAATCGGT CAGGTTCGTA TCGCTACAGG CGCGTTAATC 60
ACGGCTTCTG GGGATATTAG CTTGACTTTT AAACAAGTGG ATGGCGTGAA TGATGTAACT 120
TTAGAGAGCG TAAAAGTTTC TAGTTCAGCA GGCACGGGGA TCGGTGTGTT AGCGGAAGTG 180
ATTAACAAAA ATTCTAACCG AACAGGGGTT AAAGCTTATG CGAGCGTTAT CACCACGAGC 240
GATGTGGCGG TCCAATCAGG AAGTTTGAGT AATTTAACTT TAAATGGGAT CCATTTGGGT 300
AATATCGCAG ATATTAAGAA AAATGACTCA GACGGAAGGT TAGTCGCAGC GATCAATGCG 360
GTTACTTCAG AAACCGGCGT GGAAGCTTAT ACGGATCAAA AAGGGCGCTT GAATTTGCGC 420
AGTATAGATG GTCGTGGGAT TGAAATCAAA ACCGATAGCG TCAGTAATGG GCCTAGTGCT 480
TTAACGATGG TCAATGGCGG TCAGGATTTA ACAAAAGGTT CTACTAACTA TGGGAGGCTT 540
TCTCTCACAC G 551
(2) INFORMATION FOR SEQ ID NO: 213:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 229 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y82/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:
TGGTGTATGC AGAATGGAGT TTGAGAATGT TTGTTGGTTT TAGAACGCAT ACAGTTTCTT 60
TTTTAAGATT GGTAGCCATT GGCATTATGT TTGATCTTAT TAAGGCAGAG GAGTAACAAT 120
GGGATACGCA AGCAAATTAG CCTTGAAGAT TTGTTTGGCA AGTTTATGTT TATTTAGCGC 180
TCTTGGTGCA GAACACCTTG AACAAAAAAG GAATTTTATT TATAAAGGG 229
(2) INFORMATION FOR SEQ ID NO: 214:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y82/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214:
CCCACTATAA TAAATATCCC CTAAAAGGAT ATAAGCTTCT ACATTACCCT TATGCATCGC 60
TTTTCTAAAA CACTCTGTCG CTTTCACATA GTTGCTTGGA ACGCCCCTAC CCTCCATATA 120
CATGATGCCT AAGTTGATAT AAGCGTTAGT ATAGCCTTTC TCTGTAGCTA TTCTAAAATA 180
CTCCACGGCT TTCTTTTCAT CTTTAGGAAC GCCCCTACCC TCTTTATACA TCACGCCTAA 240
ATTGTTATAC CCTCTAGGTA TATCGTTATC AACCGCTTTT TGAAAATATT CAGCCGCTTT 300
CTTGTAATCT TTAGGCACAC CCCTAC 326
(2) INFORMATION FOR SEQ ID NO:215:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 270 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y84/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:215:
CAACCCCTTT CATAATTATC AAGGGCTTTC CTCAAATCTT TTGGAACAGC GTCCCCTTGG 60
GAATACATAA AGCCCATCCT AGAGCAACTC ACCGCGCTTC CCATATCGCA TGCTTGTTTG 120
AAATAATTAA AAGCCTTTTG GGGATCTTTT TTGACATACC TGCCTAGCAT ATACATAGAG 180
CCTAAACTCG CGCAACCACC AGAATTGTTT AACCCGCAAG CTTTTTGCAA GTTAATCAAA 240
CGCCACTTCA AAATCTTCAT CAAGCCCTGC 270
(2) INFORMATION FOR SEQ ID NO: 216:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 278 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y84/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:216:
AAGGGTTCTA AAAATCAGAA TTTAAAAGAA GGTAAAAATG AGTGTCAAAA TTTTAAAAAT 60
ATTAGTTTGT GGGTTATTTT TTTTGAATGC CCATTTATGG GGGAAACAAG ACAATAGTTT 120
TTTGGGGGTT GCTGAAAGAG CCTATAAAAG CGGGAATTAT TCTAAAGCCA CATCTTATTT 180
TAAAAAAGCA TGCAACGATG GGGTGAGTGA AGGTTGCACG CAATTAGGAA TCATTTATGA 240
AAACGGGCAA GGCACTAGAA TAGATTATAA AAAAGCCC 278
(2) INFORMATION FOR SEQ ID NO: 217:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 659 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y87.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:217:
CCTTGAGTTT AGGCAAAGGC ATGATGCCTA AATACAATCT CAATTTAGAA GAAATCCAAG 60
CGATTTATCT TTACATCACC TCTTTAGAGC ATAAAGAAGA GCGTAAGGAT TCTCCTAAGC 120
CTTAATCAAA GCGCTTGATT TATGTTAAAA TGGAGCGTTG CATTTTTGTT TTGATTAAAG 180
AAGGGTTCTA AAAATCAGAA TTTAAAAGAA GGTAAAAATG AGTGTCAAAA TTTTAAAAAT 240
ATTAGTTTGT GGGTTATTTT TTTTGAATGC CCATTTATGG GGGAAACAAG ACAATAGTTT 300
TTTGGGGGTT GCTGAAAGAG CCTATAAAAG CGGGAATTAT TCTAAAGCCA CATCTTATTT 360
TAAAAAAGCA TGCAACGATG GGGTGAGTGA AGGTTGCACG CAATTAGGAA TCATTTATGA 420
AAACGGGCAA GGCACTAGAA TAGATTATAA AAAAGCCCTA GAATATTATA AAACCGCATG 480
CCATCTGGCG ATAAAGTCGT GTTGCTTGTG TTGGTGCCAG CGTATTGCAT CGCTTGGTAA 540
AACAAGTCGT TAAAATCCGC ACGAGATTTT TTAAACCCGG TGGTATTGAC ATTGGCGATA 600
TTGTTTGAAG TGGTGTCAAT ATGCGTTTGT TGGGCGAGCA TCCCTGAAGT GGCGCTATA 659
(2) INFORMATION FOR SEQ ID NO: 218:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 259 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y89/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218:
GATCAAAACC ATCAGTTTTG GTGAAACCAA ACCCAAATGC GCCCAAAAAA CTAGAGAATG 60
TTACAAAGAA AACAGAAGAG TGGATGTCAA ATTAGTGAAG TAATTCTAGG ATGAAAAGGT 120
TTTTTTTTAT CCCTTTTATC GCTCCCTTTT TTCTCAATGG GGAGCCTTCA GCGTTTGATT 180
TACAAAGCGG AGCCACCAAA AAAGAACTCA AGCAGTTGCA AGTCAATAGT AAGAATTTTT 240
CCAATATTTT GACCAAAAT 259
(2) INFORMATION FOR SEQ ID NO: 219:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y89/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:219:
TCATTCAACA CTTTAGAAGT CAAACTTTTA ATCAAGTCAT TCTTTTGGGT GTTATACAGA 60
GAGATTCCAG CGCTTCTTTC AGGGTTTTTC TTGTAAGTTT CAATGACTTG ATTGAGTTGG 120
TTGATTTGAT TGAAAATCAA AAACAAATCC ACCGCATCCG TTTCATGCTC TTCAACCGCC 180
CTCAAAGGAT TTAAAAACAA TAACAACACC AATACCCACC ACAATAATAA ACGCATGCAA 240
AACCCCACTA AAACAAGATA GAATCTTTCT CCTAATTTCT AG 282
(2) INFORMATION FOR SEQ ID NO: 220:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 253 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y91/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:
TGAAGCCGTA ATCGTTTCTA TGCGTTTGTT CATGAGTTTC CTAAAAAATG GGTTAAAATA 60
GCCTTATTAT AACTTAAATC AAGGAAAGAT TAATGACCCC TGAACTAAAC CTCAAATCCT 120
TAGGCGCTAA AACGCCCTAT ATTTTTGAAT ACAACAGCCA GCTATTAGAA GCTTTCCCTA 180
ACCCAAACCC CAATTTAGAC CCCTTAATCA CGCTAGAGTG CAAGGAATTT ACAAGCCTTT 240
GCCCCATCAC TTC 253
(2) INFORMATION FOR SEQ ID NO: 221:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y91/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:
GATGATGAGG GAAATGTTGT TGCTAGAGGT GAGGAGGCGG TAAATTATGA GGGGTAGCCC 60
CCCCTGTGGG GTCTCCTTTT TTAATCTCTT TTCTTGCTCT TTTCATTCTC TCTCTAATTT 120
GTCCGTTTAA TTTGGAACGG CTTTAATCTT AAAACCCTAT CTTTGCACGT TATTTCGCAT 180
TCAAAAGGCG TTTTTCTTTA AATTCCTGGT ATTCTTTGAT CGCATAATTC ACAAAGGGTT 240
TGATCGCAAT CCCGCCCCTA GAGACAAAAT CCCCATACAC TTCCAAATAC TTTGGTTCTA 300
GCAATTGGAT TAAATCCAGT AATATCGTA 329
(2) INFORMATION FOR SEQ ID NO:222:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y92/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:222:
CTCCTTAAGA GAGAGCGCTA AATTAGAATT AGAAGAGCAT GCCAATAACC CTTTTGTGAA 60
AGAGATTTGC TCTTTTATTT TAGAGAGTTC TAGGGGCGTG GCGTATAAGT CAAGCGAATA 120
TTCTAGCGAA GAAAAACAAG AGGAATAACA TGAACGAAAC GCTTTATTGC AGTTTTTGCA 180
AAAAACCAGA ATCCAGAGAT CCCAAAAAAC GCCGCATTAT TTTTGCGAGC AACCT 235
(2) INFORMATION FOR SEQ ID NO:223:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 313 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Y92/T7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223:
CAAAAATCAA ATCCACTTCA TCCATTTTGA AAAGTTGCTG GTATTGCTTG ATGAGAGCGT 60
TTTTAGGTTT TTGCAAAATA TCCACCATCG CTTCTAGACT GATACTATCT AGCGTGCTTA 120
AAACCGGCAA ACGGCCGATA AGCTCAGGGA TAAGCCCATA AGTTACCAAA TCATGGGTTT 180
GGACTAAATG CAAAATCGCT TCTTGCTCTT TTTTGCTCAT CTTTTCTTGA GTGAAACCCA 240
ACACATTTTG CGTGGTGCGT TTTTTAATGA TTTCCGCTAA CCCATCAAAC GCTCCAGCGC 300
AAATGAATAA AAT 313
(2) INFORMATION FOR SEQ ID NO:224:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 968 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z14.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:224:
GAATTCCCTG GTGATGACAC TCCTATCGTA GCGGGTTCAG CTTTAAGAGC TTTAGAGGAA 60
GCAAAGGCTG GTAATGTGGG TGAATGGGGT GAAAAAGTGC TTAAGCTCAT GGCTGAAGTG 120
GATGCCTATA TCCCTACTCC AGAAAGAGAC ACTGAAAAAA CTTTCTTGAT GCCGGTTGAA 180
GATGTGTTCT CTATTGCGGG TAGAGGGACT GTGGTTACAG GTAGGATTGA AAGAGGTGTG 240
GTGAAAGTAG GCGATGAAGT GGAAATCGTT GGTATCAGAG CTACACAAAA AACGACTGTA 300
ACCGGTGTGG AAATGTTTAG AAAAGAGCTA GAAAAAGGTG AGGCCGGCGA TAATGTGGGC 360
GTGCTTTTGA GAGGAACTAA AAAAGAAGAA GTAGAACGCG GTATGGTTCT ATGCAAACCA 420
GGTTCTATCA CTCCGCACAA GAAATTTGAG GGAGAAATTT ATGTCCTTTC TAAAGAAGAA 480
GGCGGGAGAC ACACTCCATT CTTCACCAAT TACCGCCCGC AATTCTATGT TGCGCACGAC 540
TGATGTGACT GGCTCTATCA CCTTCCTGAA GCGTAGAAAT GGTTATGCCT GGCGATAATG 600
TGAAAATCAC TGTAGAGTTG ATTAGCCCTG TTGCGTTAGA GTGGGAACTA AATTTGCGAT 660
TCGTGAAGGC GGTAGGACCG TTGGTGCTGG TGTTGTGAGC AACATTATTG AATAATATTA 720
GCAAAAAGAG TTACCATAAA GGGTCATTAT GAAAGTTAAA ATAGGGTTGA AGTGTTCTGA 780
TTGTGAAGAT ATCAATTACA GCACAACCAA GAACGCTAAA ACTAACACTG AAAAACTGGA 840
GCTTAAGAAG TTCTGCCCAA GGGAAAACAA ACACACTCTT CATAAAGAAA TCAAATTGAA 900
GAGCTAGTTC TTTCTTTTGT GTTGTGATTG AAAAGGAGGG GAGGTTAGGT CAGTAGCTCC 960
AATGGTAG 968
(2) INFORMATION FOR SEQ ID NO: 225:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 280 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z21.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:
GAACAATCAA CCCCAGTAGG CGGTAATAAT GGCAAGCCTT TCAACCCTTT CACAGACGCT 60
AGCTTCGCTC AAGGCATGCT CGCTAACGCT AGCGCGCAAG CCAAAATGCT CAATTTAGCC 120
CATCAGGTGG GGCAAGCCAT TAACCCTGAC AATCTTACCG GGAGTTTTAA AAATTTTGTT 180
ACAGGCTTTT TAGCCACATG CAACAACCCT TCAACAGCTG GCACTAGTGG CACACAAGGT 240
TCAGCTCTTG GCACGGTTAC CACTCAAACT TTCGCTTCCG 280
(2) INFORMATION FOR SEQ ID NO: 226:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 194 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z24.11F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226:
TGCGAGCGTT ATCACCACGA GCGATGTGGC GGTCCAATCA GGAAGTTATG GGATCCATTT 60
GGGTAATATC GCAGATATTA AGAAAAATGA CTCAGCGATC AATGCGGTTA CTTCAGAAAC 120
CGGCGTGGAA GCTTATACGA ATTTGCGCAG TATAGATGGT CGTGGGATTG AAATCAAAAC 180
CGATAAGTGC TTTA 194
(2) INFORMATION FOR SEQ ID NO:227:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1186 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z25.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:
GAATTCCCTT TGACTTTCCC TAGCGGATAC CACGCAGAGC ATTTAGCCGG CAAAGAAGCC 60
CTTTTTAAAG TGAAATTGCG CCAGATTCAA GCGCGTGAAG TGTTAGAAAT CAATGACGAA 120
CTCGCTAAAA TCGTGCTAGC TAATGAAGAG GATGCGACCT TAAAGCTTTT AAAAGAAAGG 180
GTTGAGGGGC AGTTGTTTTT AGAAAATAAA GCCAGGCTCT ATAATGAAGA GTTGAAAGAA 240
AAATTGATTG AAAATTTAGA TGAAAAGATT GTTTTTGATT TGCCTAAAAC GATCATAGAG 300
CAAGAAATGG ATTTGTTGTT CAGGAACGCT CTTTATTCYA TGCAAGCTGA GGAAGTCAAA 360
TCCTTACAAG AAAGTCAAGA AAAAGCCAAA GAAAAGCGTG AGAGCTTTAG GAACGATGCC 420
ACAAAAAGCG TGAAAATCAC TTTTATCATT GACGCTTTAG CGAAAAGAAG AAAAATTGGC 480
GTGCATGACA ATGAAGTCTT TCAAACCTTG TATTATGAAG CGATGATGAC AGGGCRAAAC 540
CCAGAAAGTC TCATTGAACA ATACCGCAAA AATAACATGT TAGCGGCGGT GAAAATGGCG 600
ATGATTGAAG ATAGGGTGTT AGCTTATTTG TTGGATAAAA ACCTGCCTAA AGAGCAACAA 660
GAAATTTTGG AAAAAATGAG GCCCAACGCT CAAAAAATTC AAGCGGGTTA AACGGCTAAA 720
AAGGAGAGAT GATGGGATAC ATTCCTTATG TAATAGAGAA TACCGATCGT GGGGAGCGCA 780
GCTATGATAT TTACTCGCGC CTTTTAAAGG ATCGCATTGT TTTATTGAGC GGTGAGATTA 840
ACGATAGCGT GGCGTCTTCT ATCGTGGCCC AACTCTTGTT TTTGGAAGCT GAAGACCCTG 900
AAAAAGACAT TGGCTTGTAT ATCAATTCTC CCGGTGGGGT GATAACAAGC GGTCTTAGCA 960
TCTATGATAC CATGAATTTT ATCCGCCCTG ATGTTTCCAC GATTTGCATC GGTCAAGCGG 1020
CTTCTATGGG GGCGTTTTTA CTGAGCTGTG GGGCTAAGGG CAAGCGCTTT TCACTACCCC 1080
ATTCAAGGAT TATGATCCAC CAGCCTTTAG GGGGGGCTCA AGGGCAAGCG AGCGATATTG 1140
AAATCATTTC TAACGAGATC CTTAGGCTTA AGGGTTTGAT GAATTC 1186
(2) INFORMATION FOR SEQ ID NO: 228:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1564 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z3.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:
GAATTCTTTT TTATTCATGG GGTTTTGTAA GGGGGCTTGG ATTTTGGCTT CCAATACGAC 60
AGGCTTGGTT ACGCCATGAA GAGTCAAATC CCCATGGATT TTACCATCTT CGTATTTGGT 120
CATTTTAAAG CTCCCTTTGG GGTATTTCAT GGCATCAAAA AACTCTGCTG TTTTTAGGTG 180
GTCGTCTCTT TTTTTGTTCC TAGTGTTAAT GCTTTTAATA TCAGTTTTGC CTTCAAAAAC 240
ATTGAGAGCT TTGGTATTAG GATCGGCATC AATTTTGCCA TCAAAACTAT CAAACACGCC 300
TCTTGTTTCA TTGAATTTGA AGTGTTTGAC CTCAAACCAC ACGCTAGAGT TTGCCTTATC 360
AATCGTATAA GGTTTTGCAA ACGCCAGACT AACGCCCAAA AGGGTGAACA TTAACGCTTT 420
TTTCATTGTT TCCTCGCTTC ATTTTGAATT AAAAACGCCA ACTATACAAC AAGTTGGTTA 480
ATGGTAAAAA TACTCCTAAC CATCTGTTTT AAGGTATAAT ACAAAAAATC CCCATTAGGT 540
AGGTGTTGTT ATGATAAAAA AGACCTTTGC ATCAGTTTTA TTAGGATTGA GTTTGGTGAG 600
TTGTTTTAAA TGCCAAAGAA TGCGTCTCGC CCCATAACAA GAAGCGTTAA GTATCATCAG 660
CAAAGCGCTG AGATCAGAGC CTTGCAATTG CAAAGTTACA AAATGGCGAA AATGGCGCTA 720
GACAATAATC TCAAGCTCGT TAAAGACAAA AAGCCAGCCG TCATCTTGGA TTTAGATGAA 780
ACCGTTTTGA ACACTTTTGA TTATGCGGGC TATTTGATCA AAAATTGCAT CAAATACACC 840
CCAGAAACTT GGGATAAATT TGAAAAAGAA GGCTCTCTCA CGCTCGTTCC TGGAGTGCTA 900
GACTTTTTAG AATACGCTAA TTCTAAGGGC GTTAAGATTT TTTACATTTC TAACCGCACG 960
CAAAAAAATA AGGCATTCAC CTTAAAAACG CTCAAAAGTT TTAAACTCCC CCAAGTGAGT 1020
GAAGAATCCG TTTTATTAAA AGAAAAAGGC AAGCCTAAAG CCGTTAGGCG AGAATTAGTC 1080
GCTAAGGATT ATGCGATTGT TTTACAAGTG GGCGACACTT TGCATGATTT TGATGCACTT 1140
TTTGCTAAAG ACGCTAAAAA CAGCCAAGAA CAACGAGCTA AAGTCTTGCA AAACGCTCAA 1200
AAATTCGGCA CAGAGTGGAT CATTTTACCT AATTCTCTCT ATGGCACATG GGAAGATGAG 1260
CCTATAAAAG CATGGCAGAA TAAAAAATAA AATTTATCCA TCGCATAAGA ACGATTTTTG 1320
CTAACATGGC AAAAAATTTG ATCTCTTAGG CGGAGCTATG GATTTTGTAG GGTTTGAAGA 1380
TTTAAAATGC AAAGATAAAG AAAACTCTCA AAAAGTTTTT GTGATCCGTA ACGATAAGTT 1440
AGGCGATTTT ATTTTAGCCA TACCCGCTTT AATCGCTCTA AAGCAAGCTT TTTTAGAAAA 1500
AGGCAAGGAA GTGTATTTGG GCGTGGTTGT GCCTAGCTAT ACCACCCCAA TCGCTTTAGA 1560
ATTC 1564
(2) INFORMATION FOR SEQ ID NO: 229:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 518 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z41.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:229:
CAATGTCTAA GCGTTTGCCG AAAGTTTTTT CTCTGTCAAA GTCTAAGCAT CTATTCACTT 60
CAAAGAAATG GAAGTGTGAG CCGATTTGAA CCGGTCTGTC GCCAACATTT TTAACTTTCA 120
CGCTAACGGC TTTTTTGCCT TCGTTGATAG TGATGTCTTC ATTTTTTAAG AACAACTCAC 180
CAGGAACTAA TTTACCATTG GCCTCAATAG GGGTATGCAC GGTTACGAGT TTAGTCCCAT 240
CAGGAAACAT CGCTTCAATA CCCACTTCAT GGATCATGCT TGCCACGCCA TCCATCACAT 300
CATCCGGTTT TAAAAGAGTG CGCCCTTCTT GCATCAATTC AGCCGCAGTC TTTTTACCAG 360
CTCTCGCTTC TTCCATAATA TGGGCACTAA TCAAAGCTAC CGCTTCTACA TAGTTAAGCT 420
TAATGCCTTT TTCTTTGCGT TTTTTAGCCA ATTCTCCAGC GTAGTGGAGC ATCAACTTGT 480
CTAACTCTTT TGGGGTGAGT TTCATCTCAT TCTCCTAT 518
(2) INFORMATION FOR SEQ ID NO: 230:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 523 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: Z9.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:
TGAAGCGAGA AAACTACTAG AACAAGAAGT TAAAAAGAGC GTCAAGGCTT ATTTGGATTG 60
CGTATCAAAA GCTAGGAATG AAAAAGAGAA ACAAGAATGC GAGAAATTAC TCACGCCTGA 120
AGCGAGGAAA TTTTTAGAGA AAGAACTCCA ACAAAAAGAT AAAGCGATAA AAGATTGCTT 180
GAAAAACGCC GATCCTAACG ACAGAGCGGC TATCATGAAG TGTTTGGATG GTTTGAGCGA 240
TGAAGAGAAG CTCAAATACC TGCAAGAAGC TAGAGAAAAG GCTGTTGCGG ATTGTTTGGC 300
TATGGCTAAA ACCGATGAAG AAAAAAGGAA ATGCCAAAAC CTTTATAGCG ATTTGATCCA 360
AGAAATCCAA AATAAAAGGA CACAAAGCAA ACAAAATCAA TTGAGTAAAA CAGAAAGATT 420
GCATCAAGCA AGCGAGTGTT TGGATAACTT AGATGACCCT ACCGATCAAC AAGTCATAGA 480
GCAATGTTTA GAGGGCTTGA GCGATAGTGA AAGGGCGCTA ATT 523
(2) INFORMATION FOR SEQ ID NO: 231:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer GD60
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:
GCAATTAACC CTCACTAAAG GGAAC 25
(2) INFORMATION FOR SEQ ID NO: 232:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer GD61
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:
GTAATACGAC TCACTATAGG GCGAATTG 28
(2) INFORMATION FOR SEQ ID NO: 233:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer CM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:233:
CGCCATGGGA CTTAAAGAGA AAGAAAAAGA AGCC 34
(2) INFORMATION FOR SEQ ID NO: 234:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer CN
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:
GCGGATCCCT TTTCTTGTAA CATCCC 26
(2) INFORMATION FOR SEQ ID NO: 235:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DK
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:
CGCCATGGGC AAGGCCTATA ACAAAACA 28
(2) INFORMATION FOR SEQ ID NO: 236:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DP
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:
CGCCATGGGC TATCTTTATG GGAGT 25
(2) INFORMATION FOR SEQ ID NO: 237:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DL
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:237:
GCGGATCCTA ACCCCGTATT GCCTTGCC 28
(2) INFORMATION FOR SEQ ID NO: 238:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DG
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:
CGCCATGGCC CTTTTAAAAG AAAGGGTTGA G 31
(2) INFORMATION FOR SEQ ID NO: 239:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DH
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:
GCGGATCCAC CCGCTTGAAT TTTTTGAGC 29
(2) INFORMATION FOR SEQ ID NO: 240:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DI
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:
CGCCATGGTA GCTGAAAAAC TCAAAAACG 29
(2) INFORMATION FOR SEQ ID NO: 241:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DJ
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:
GCGGATCCAG CTTTCATGGC CACATTC 27
(2) INFORMATION FOR SEQ ID NO: 242:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DQ
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:
CGCCATGGGA CCTAAACATT CAGTTTCAGC 30
(2) INFORMATION FOR SEQ ID NO: 243:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer DR
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243:
GCGGATCCGT GGTATTTTCC ACTTAAG 27
(2) INFORMATION FOR SEQ ID NO: 244:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer GB
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:
CGCCATGGGA CATATGTATG GTAAGGTG 28
(2) INFORMATION FOR SEQ ID NO: 245:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer GC
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:
GCGGATCCAA GCTCCATCAA TTTTGCGA 28
(2) INFORMATION FOR SEQ ID NO: 246:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer FZ
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:246:
GGGCCCGTGG TGCAAGCCTG AAACATG 27
(2) INFORMATION FOR SEQ ID NO: 247:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer GA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:247:
CATGTTTCAG GCTTGCACCA CGGGCCC 27
(2) INFORMATION FOR SEQ ID NO: 248:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1975 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: a3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:
GGCATATGTA TGGTAAGGTG CAAATCCCAG ATTTAGAAGA AGTGCAAAAG ACGACTATTA 60
ATCGTAAGGA ATTTGTGGGC GATAAAAAAG ACTACAAACC TTATGGGGTC GCACAAAATG 120
ACCCGGCTGT TTTAAACCCT TTCTTTAAAG GTTATCGCTA CCATGTTTCA GGCTTGCACC 180
ATGGGCCCAT TGGCTTTCCT ACTGAAGACG CTAAAATCGG TGGGGATTTG ATTGACAGAT 240
TATTTAATAA GATTGAATCC AAGCAAGACA TTATCAACGA AAATGAGGAA ATGGATTTAG 300
AGGGCGCTGA AATCGTTATC ATCGCTTATG GTTCGGTTTC TCTAGCGGTT AAAGAAGCCT 360
TGAAAGATTA CCATAAAGAA AGCAAGCAAA AAGTCGGCTT TTTCAGGCCC AAAACCTTAT 420
GGCCAAGCCC GGCTAAACGC TTGAAAGAAA TAGGGGATAA ATACGAAAAA ATCCTTGTGA 480
TTGAATTGAA TAAAGGGCAG TATTTAGAAG AAATTGAAAG GGCTATGCAA AGAAAGGTGC 540
ATTTCTTGGG GCAAGCCAAT GGGCGCACGA TTTCGCCTAA ACAAATCATC GCAAAATTGA 600
AGGAGCTTTA AAATGGCGTT TAATTATGAT GAATATTTGC GTGTGGATAA AATACCCACT 660
TTGTGGTGTT GGGGCTGTGG CGATGGCGTG ATTTTGAAAT CCATTATCCG CACAATTGAC 720
GCTTTAGGCT GGAAAATGGA TGATGTGTGC TTGGTGAGTG GGATTGGTTG CAGCGGGCGC 780
ATGAGCTCGT ATGTGAATTG CAACACCGTT CACACCACGC ATGGTAGGGC TGTAGCGTAT 840
GCGACAGGGA TTAAAATGGC TAACCCTAGT AAGCATGTGA TCGTGGTTTC TGGTGATGGC 900
GATGGCTTTG CTATTGGAGG CAATCACACC ATGCATGCAT GCAGAAGAAA CATTGATTTG 960
AATTTTATTT TAGTGAATAA TTTCATTTAT GGTTTGACCA ACTCCCAAAC TTCGCCCACC 1020
ACGCCTAATG GCATGTGGAC GGTTACGGCT CAATGGGGGA ATATTGATAA CCAATTTGAC 1080
CCATGCGCTT TAACCACCGC TGCAGGGGCG AGCTTTGTGG CTAGAGAGAG CGTTTTAGAC 1140
CCTCAAAAAT TAGAAAAAGT GCTTAAAGAA GGTTTCTCGC ACAAGGGCTT TAGCTTCTTT 1200
GATGTCCATA GTAATTGCCA TATCAATTTA GGGCGCAAGA ATAAAATGGC GAAGCGTCTC 1260
AAATGCTAAA ATGGATGGAA AGCCGATTGG TGAGCAAACG CCAATTTGAA GCCATGAGCC 1320
CTGAAGAAAG GGTGGATAAA TTCCCTACAG GCGTTTTAAA GCATGACACG GACAGGAAAG 1380
AATATTGCGA AGCGTATCAA GAAATCATTG AAAAAGCACA AGGAAAACAA TAATGGAAGC 1440
GCAATTACGA TTTACGGGCG TTGGAGGGCA AGGCGTGCTG TTAGCGGGGG AGATTTTAGC 1500
TGAGGCTAAG ATCGTGAGTG GGGGCTATGG CACTAAGACT TCCACTTACA CTTCGCAAGT 1560
GCGTGGAGGG CCCACTAAGG TGGATATTTT GCTAGACAGA AATGAAATTA TTTTCCCTTA 1620
TGCTAAAGAG GGCGAGATTG ATTTCATGCT TTCAGTCGCT CAAATCAGCT ACAACCAGTT 1680
TAAAAGCGAT ATTAAACAAG GCGGTATCGT TGTCATTGAT CCCAATCTAG TAACCCCCAC 1740
TAAAGAAGAT GAAGAAAAGT ATCAGCTTTA TAAAATCCCT ATCATTAGTA TCGCTAAAGA 1800
TGAAGTGGGT AACATCATCA CGCAATCTGT GGTAGCGTTA GCCATTACCG TGGAGCTTAC 1860
TAAATGTGTA GAAGAAAATA TCGTGCTAGA CACCATGCTT AAAAAAGTCC CTGCAAAAGT 1920
CGCTGACACC AACAAAAAAG CCTTTGAAAT TGGCAAAAAA CATGCTTTAG AAGCT 1975
(2) INFORMATION FOR SEQ ID NO: 249:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 202 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: a3ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:
His Met Tyr Gly Lys Val Gln Ile Pro Asp Leu Glu Glu Val Gln Lys
1 5 10 15
Thr Thr Ile Asn Arg Lys Glu Phe Val Gly Asp Lys Lys Asp Tyr Lys
20 25 30
Pro Tyr Gly Val Ala Gln Asn Asp Pro Ala Val Leu Asn Pro Phe Phe
35 40 45
Lys Gly Tyr Arg Tyr His Val Ser Gly Leu His His Gly Pro Ile Gly
50 55 60
Phe Pro Thr Glu Asp Ala Lys Ile Gly Gly Asp Leu Ile Asp Arg Leu
65 70 75 80
Phe Asn Lys Ile Glu Ser Lys Gln Asp Ile Ile Asn Glu Asn Glu Glu
85 90 95
Met Asp Leu Glu Gly Ala Glu Ile Val Ile Ile Ala Tyr Gly Ser Val
100 105 110
Ser Leu Ala Val Lys Glu Ala Leu Lys Asp Tyr His Lys Glu Ser Lys
115 120 125
Gln Lys Val Gly Phe Phe Arg Pro Lys Thr Leu Trp Pro Ser Pro Ala
130 135 140
Lys Arg Leu Lys Glu Ile Gly Asp Lys Tyr Glu Lys Ile Leu Val Ile
145 150 155 160
Glu Leu Asn Lys Gly Gln Tyr Leu Glu Glu Ile Glu Arg Ala Met Gln
165 170 175
Arg Lys Val His Phe Leu Gly Gln Ala Asn Gly Arg Thr Ile Ser Pro
180 185 190
Lys Gln Ile Ile Ala Lys Leu Lys Glu Leu
195 200
(2) INFORMATION FOR SEQ ID NO: 250:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 218 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: a3ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:
Met Ala Phe Asn Tyr Asp Glu Tyr Leu Arg Val Asp Lys Ile Pro Thr
1 5 10 15
Leu Trp Cys Trp Gly Cys Gly Asp Gly Val Ile Leu Lys Ser Ile Ile
20 25 30
Arg Thr Ile Asp Ala Leu Gly Trp Lys Met Asp Asp Val Cys Leu Val
35 40 45
Ser Gly Ile Gly Cys Ser Gly Arg Met Ser Ser Tyr Val Asn Cys Asn
50 55 60
Thr Val His Thr Thr His Gly Arg Ala Val Ala Tyr Ala Thr Gly Ile
65 70 75 80
Lys Met Ala Asn Pro Ser Lys His Val Ile Val Val Ser Gly Asp Gly
85 90 95
Asp Gly Phe Ala Ile Gly Gly Asn His Thr Met His Ala Cys Arg Arg
100 105 110
Asn Ile Asp Leu Asn Phe Ile Leu Val Asn Asn Phe Ile Tyr Gly Leu
115 120 125
Thr Asn Ser Gln Thr Ser Pro Thr Thr Pro Asn Gly Met Trp Thr Val
130 135 140
Thr Ala Gln Trp Gly Asn Ile Asp Asn Gln Phe Asp Pro Cys Ala Leu
145 150 155 160
Thr Thr Ala Ala Gly Ala Ser Phe Val Ala Arg Glu Ser Val Leu Asp
165 170 175
Pro Gln Lys Leu Glu Lys Val Leu Lys Glu Gly Phe Ser His Lys Gly
180 185 190
Phe Ser Phe Phe Asp Val His Ser Asn Cys His Ile Asn Leu Gly Arg
195 200 205
Lys Asn Lys Met Ala Lys Arg Leu Lys Cys
210 215
(2) INFORMATION FOR SEQ ID NO: 251:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2077 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: c2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:
TTTTTATCAT TTTTGAATTA ATTTTTTTTA ATAAAGGGTT TTTTATGAAT ATATTCAAGC 60
GTGTTATTAG TGTAGGGGTA ATTGTTTTAG GTTTGTTTAA CCTTTTAGAC GCCAAACACC 120
ACAAAGAAAA AAAAGAAGAC CACAAAATCA CTCGTGAGCT TAAAGTGGGC GCTAACCCCG 180
TTCCGCATGC ACAAATCTTG CAATCAGTCG TGGACGATTT GAAAGAGAAA GGGATCAAAT 240
TAGTGATCGT ATCTTTTACC GATTATGTGT TGCCTAATTT AGCGCTCAAT GACGGCTCTT 300
TAGACGCGAA TTACTTCCAG CACCGCCCTT ATTTGGATCG GTTTAATTTG GACAGAAAAA 360
TGCACCTTGT TGGTTTGGCC AATATCCATG TGGAGCCTTT AAGATTTTAT TCTCAAAAAA 420
TCACGGACAT TAAAAACCTT AAAAAAGGCT CAGTGATTGC TGTGCCAAAT GATCCGGCCA 480
ATCAAGGCAG GGCGTTGATT TTACTCCATA AACAAGGCCT TATCGCTCTC AAAGATCCAA 540
GCAATCTATA CGCTACGGAG TTTGATATTG TCAAAAACCC TTACAACATC AAAATCAAGC 600
CTTTAGAAGC CGCGCTATTG CCTAAAGTTT TAGGGGATGT GGATGGAGCT ATTGTAACAG 660
GGAATTATGC CTTGCAAGCA AAACTCACCG GAGCTTTATT TTCAGAAGAC AAGGACTCGC 720
CTTATGCCAA TCTAATAGCC GCTCGTGAGG ATAATGCGCA AGATGAAGCC ATAAAAGCGC 780
TGATTGAAGT TTTGCAGAGT GAAAAGACCA GGAAATTTAT TTTGGATACC TATAAGGGGG 840
CGATTATCCC GGCTTTTTAA GCCATTTGGA TAACATTCAT CTTTAAGATG AATGGGAGTG 900
CTTTAGGCGC TAATAAAAAG CTCTTTTCTC ATTGAGCATT AAAAACGCTA AAAGTATTTT 960
TAAAATAAAA CTATAAAAGA GCTTGCGCTA ATCAAGCGAC AATTTCTTAA AAAGCGTTAT 1020
TTCTTAAGGG GGGAAATGGC GCAAAATAAC CCCCTATCCC TTTAAGAAAA TAATGATAAA 1080
ATCCAAAACG ATGAAGCCAA ATAACTAAAA CCCCATTTTT TAAAAATATT AAGCAAGACT 1140
TTAACCATCA TGCGATTAGA ATAAAGCCAA AAACACAGCA AGTCTAAATC TCATTAAAGT 1200
TACGCTTTTA GAATTTGCTC CAATAAAGAC CAGTTTTAAA CCACCCGTTA GACTGGGTTA 1260
TAGAGCGGTG GAATGAGTTG GAAACAAGTT GGCTAATGGT CTTTAGCGTT TAAAAGGTTG 1320
CGTTTTTGAA AAGGTTTAAA ATTTGATTAA AGATAGCCAA GCTCATAGAG TTTATTGTTC 1380
ATTTTCATTA ACAAGCCCCC CAGTTTTGAC CCCCTTTCCC CATGTTTTTA CTAAAATAGT 1440
GATAGCGTAT TTGGGTTTTT CATAAGGCAA GAATGCGGTA ATCCACGCAT GGGATCGATG 1500
GAAATATTCC ATATCCTTTT CTTTCATGCG ATTGACGATG TTTTGAGCGA TTTCCACGAC 1560
TTGCGCGGTG CCGGTTTTAC ACGCTAAAGT AACCTTAGAA CCCCTTGTGG AATGATAAGC 1620
GGTGCCGTCT TTATGGTTAC ACACTTCATA CATGCCAACG CGCAAGGCTT GGAGCTTCTT 1680
TTTTTGAAAG CTATTTAAGG GATCTTTGAG CGGTTGTTGG TTGTTGATAG CAAAATGAGG 1740
CGTTGCCAGT TTGCCCGTAG CAATGAGTCC CCGTGTAGGC TAAAACCTGC AAGGGCGTGG 1800
CTAAAAAAGA GCCTTGCCCA ATAGCGGTAA TGAGCGTGTC CCCAACGCGC CAATTTTGAT 1860
TGAAGCGTTT GAGTTTCCAC AAATTATCCG GCACAATCCC CACAAATTCA TTCGGCAAAT 1920
CAACGCCCGT TTTTTCCCCA AAGCCCACTT CCCTTAAGGT TTTAGAGAGT TTTTCTATAG 1980
AAATTTCAAG CCCAAATTTA TAAAAATACA CATCCACAGA CTCCCTAATG GCTTTATACA 2040
AATTAGAATT GCCATGCCCT GTTTTTTTCC AGTCTCT 2077
(2) INFORMATION FOR SEQ ID NO: 252:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 271 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: C2ORF
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:
Met Asn Ile Phe Lys Arg Val Ile Ser Val Gly Val Ile Val Leu Gly
1 5 10 15
Leu Phe Asn Leu Leu Asp Ala Lys His His Lys Glu Lys Lys Glu Asp
20 25 30
His Lys Ile Thr Arg Glu Leu Lys Val Gly Ala Asn Pro Val Pro His
35 40 45
Ala Gln Ile Leu Gln Ser Val Val Asp Asp Leu Lys Glu Lys Gly Ile
50 55 60
Lys Leu Val Ile Val Ser Phe Thr Asp Tyr Val Leu Pro Asn Leu Ala
65 70 75 80
Leu Asn Asp Gly Ser Leu Asp Ala Asn Tyr Phe Gln His Arg Pro Tyr
85 90 95
Leu Asp Arg Phe Asn Leu Asp Arg Lys Met His Leu Val Gly Leu Ala
100 105 110
Asn Ile His Val Glu Pro Leu Arg Phe Tyr Ser Gln Lys Ile Thr Asp
115 120 125
Ile Lys Asn Leu Lys Lys Gly Ser Val Ile Ala Val Pro Asn Asp Pro
130 135 140
Ala Asn Gln Gly Arg Ala Leu Ile Leu Leu His Lys Gln Gly Leu Ile
145 150 155 160
Ala Leu Lys Asp Pro Ser Asn Leu Tyr Ala Thr Glu Phe Asp Ile Val
165 170 175
Lys Asn Pro Tyr Asn Ile Lys Ile Lys Pro Leu Glu Ala Ala Leu Leu
180 185 190
Pro Lys Val Leu Gly Asp Val Asp Gly Ala Ile Val Thr Gly Asn Tyr
195 200 205
Ala Leu Gln Ala Lys Leu Thr Gly Ala Leu Phe Ser Glu Asp Lys Asp
210 215 220
Ser Pro Tyr Ala Asn Leu Ile Ala Ala Arg Glu Asp Asn Ala Gln Asp
225 230 235 240
Glu Ala Ile Lys Ala Leu Ile Glu Val Leu Gln Ser Glu Lys Thr Arg
245 250 255
Lys Phe Ile Leu Asp Thr Tyr Lys Gly Ala Ile Ile Pro Ala Phe
260 265 270
(2) INFORMATION FOR SEQ ID NO: 253:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 650 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: c5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253:
AGCTTATAAA GACTGCGTAT CAAGAGCTAG GAATGAAAAA GAGAAAAAAG AATGCGAGAA 60
ATTACTCACC CCTGAAGCGA GGAAACTTTT AGAACAACAA GCGCTAGATT GTTTGAAAAA 120
CGCTAAAACC GAATCTGATA AAAAAAGGTG TGTCAAAGAT CTCCCTAAAG ACTTGCAGAA 180
AAAGGTTTTA GCCAAAGAGA GCGTTAAGGC TTATTTGGAC TGCGTATCAA GAGCTAGGAA 240
TGAAAAAGAG AAAAAAGAAT GCGAGAAATT ACTCACCCCT GAAGCGAGGA AACTTTTAGA 300
AGAAGCTAAA GAGAGCCTTA AGGCTTATAA AGACTGCCTC TCTCAAGCTA GAAATGAAAC 360
TGAAAGGAGA GCTTGCGAAA AATTACTCAC CCCTGAAGCG AGGAAACTCT TAGAGCAAGA 420
AGTTAAGAAG AGCGTTAAGG CTTATTTGGA CTGCGTATCA AGAGCTAGGA ATGAAAAAGA 480
GAAACAAGAA TGCGAGAAAT TACTCACCCC TGAAGCGAGG AAATTTTTAG CGAAGCAAGC 540
ATTAAGTTGT TTGGAAAAAG CTAGAAATGA AGAAGAAAGA AAAGCATGTC TTAGAAAATC 600
TCCCTAAAGA CTTACAGAAA AATGTTTTAG CTAAAGAGAG CCTTAAAGCT 650
(2) INFORMATION FOR SEQ ID NO: 254:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 201 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: C5ORF
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:
Ala Tyr Lys Asp Cys Val Ser Arg Ala Arg Asn Glu Lys Glu Lys Lys
1 5 10 15
Glu Cys Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Leu Leu Glu Gln
20 25 30
Gln Ala Leu Asp Cys Leu Lys Asn Ala Lys Thr Glu Ser Asp Lys Lys
35 40 45
Arg Cys Val Lys Asp Leu Pro Lys Asp Leu Gln Lys Lys Val Leu Ala
50 55 60
Lys Glu Ser Val Lys Ala Tyr Leu Asp Cys Val Ser Arg Ala Arg Asn
65 70 75 80
Glu Lys Glu Lys Lys Glu Cys Glu Lys Leu Leu Thr Pro Glu Ala Arg
85 90 95
Lys Leu Leu Glu Glu Ala Lys Glu Ser Leu Lys Ala Tyr Lys Asp Cys
100 105 110
Leu Ser Gln Ala Arg Asn Glu Thr Glu Arg Arg Ala Cys Glu Lys Leu
115 120 125
Leu Thr Pro Glu Ala Arg Lys Leu Leu Glu Gln Glu Val Lys Lys Ser
130 135 140
Val Lys Ala Tyr Leu Asp Cys Val Ser Arg Ala Arg Asn Glu Lys Glu
145 150 155 160
Lys Gln Glu Cys Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Phe Leu
165 170 175
Ala Lys Gln Ala Leu Ser Cys Leu Glu Lys Ala Arg Asn Glu Glu Glu
180 185 190
Arg Lys Ala Cys Leu Arg Lys Ser Pro
195 200
(2) INFORMATION FOR SEQ ID NO: 255:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1742 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: c13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:
AGCTTTGGAG CGTTTTAAGG GCGATGAATT AAAGCATGCG GTGATGAGTA AAATCAGTGG 60
CGATAAATTT GGCGTGTACC AACAAGGCGA GTTTGAAGAT TTGTGTAAGG GGCCTCACCT 120
CCCAAACACC CGTTTTTTAA ACCATTTCAA GCTCACTAAA CTGGCTGGGG CTTATTTGGG 180
TGGCGATGAA AACAATGAAA TGCTCATTAG AATCTATGGC ATCGCTTTTG CCACTAAAGA 240
AGGCTTAAAA GACTATCTTT TCCAAATAGA AGAAGCGAAA AAACGAGATC ACAGAAAGCT 300
AGGCGTGGAG CTAGGGCTTT TTAGTTTTGA TGATGAGATA GGGGCAGGCT TACCTTTATG 360
GCTGCCTAAA GGGGCAAGGC TTAGGAAACG CATTGAAGAT TTATTGAGCA AAGCGTTACT 420
TTTAAGAGGC TATGAGCCGG TTAAAGGCCC TGAGATTTTA AAGAGCGATG TGTGGAAAAT 480
CAGCGGGCAT TATGACAACT ATAAAGAAAA CATGTATTTC ACCACGATTG ATGAGCAACA 540
ATACGGCATA AAGCCTATGA ACTGCGTGGG GCATATTAAA GTCTATCAAA GCGCTTTACA 600
CAGCTACAGA GATTTGCCCT TAAGATTTTA TGAATACGGC GTGGTGCATC GGCATGAAAA 660
AAGCGGCGTG TTGCATGGGC TTTTAAGGGT TAGGGAATTT ACCCAAGATG ATGCGCATAT 720
TTTTTGCTCT TTTGAACAGA TCCAAAGCGA AGTGAGCGCG ATTTTAGATT TTACGCATAA 780
GATCATGCAA GCGTTTGATT TTAGCTATGA AATGGAATTA TCCACAAGGC CGGCTAAATC 840
CATAGGCGAT GATAAAGTGT GGGAAAAGGC CACTAACGCT TTAAAAGAAG CTCTAAAAGA 900
GCACCGCATT GATTATAAGA TTGATGAAGG GGGAGGGGCT TTCTATGGGC CTAAGATTGA 960
CATTAAAATC ACTGACGCTT TAAAGCGTAA ATGGCAGTGC GGCACGATTC AAGTGGATAT 1020
GAATTTGCCT GAACGCTTCA AGCTCGCTTT CATTAATGAG CATAATCACG CTGAACAGCC 1080
AGTGATGATC CACAGAGCGA TTTTAGGCTC GTTTGAAAGG TTTATTGCGA TTTTGAGCGA 1140
ACATTTTGGG GGGAATTTCC CTTTCTTTGT CGCGCCCACT CAAATCGCTC TCATCCCTAT 1200
TAATGAAGAG CATCATGTTT TTGCTTTGAA ATTAAAAGAG GAACTAAAAA AGCGCGATAT 1260
TTTTGTAGAA GTGCTGGATA AAAACGACAG CTTGAATAAA AAAGTGCGAT TAGCCGAAAA 1320
GCAAAAAATC CCTATGATTT TAGTGTTAGG GAATGAAGAA GTGGAGACCG AAATTTATCC 1380
ATTAGAGACA GAGAAAAACA AGCTCAATAT AAAATGCCCT TAAAGGAGTT TTTAAAACAT 1440
GGTTGAATCT AAGATGCAAG AGGTTAGTTT TTGAGTAGAA ACGAAGTGTT GTTAAACGGA 1500
GACATTAATT TTAAAGAAGT GCGTTGCGTG GGTAATAATG GCGAAGTGTA TGGGATCATT 1560
TCTTCCAAAG AAGCACTCAA TATCGCTCAA AATTTAGGTT TGGATTTGGT TTTGATTTCA 1620
GCGAGCGCGA AACCTCCCGT GTGTAAGGTG ATGGATTATA ATAAATTCCG CTACCAAAAT 1680
GAAAAGAAAA TCAAGGAAGC CAAGAAAAAG CAAAAGCAAA TTGAAATCAA AGAGATCAAG 1740
CT 1742
(2) INFORMATION FOR SEQ ID NO:256:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 473 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: C13ORF
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:
Ala Leu Glu Arg Phe Lys Gly Asp Glu Leu Lys His Ala Val Met Ser
1 5 10 15
Lys Ile Ser Gly Asp Lys Phe Gly Val Tyr Gln Gln Gly Glu Phe Glu
20 25 30
Asp Leu Cys Lys Gly Pro His Leu Pro Asn Thr Arg Phe Leu Asn His
35 40 45
Phe Lys Leu Thr Lys Leu Ala Gly Ala Tyr Leu Gly Gly Asp Glu Asn
50 55 60
Asn Glu Met Leu Ile Arg Ile Tyr Gly Ile Ala Phe Ala Thr Lys Glu
65 70 75 80
Gly Leu Lys Asp Tyr Leu Phe Gln Ile Glu Glu Ala Lys Lys Arg Asp
85 90 95
His Arg Lys Leu Gly Val Glu Leu Gly Leu Phe Ser Phe Asp Asp Glu
100 105 110
Ile Gly Ala Gly Leu Pro Leu Trp Leu Pro Lys Gly Ala Arg Leu Arg
115 120 125
Lys Arg Ile Glu Asp Leu Leu Ser Lys Ala Leu Leu Leu Arg Gly Tyr
130 135 140
Glu Pro Val Lys Gly Pro Glu Ile Leu Lys Ser Asp Val Trp Lys Ile
145 150 155 160
Ser Gly His Tyr Asp Asn Tyr Lys Glu Asn Met Tyr Phe Thr Thr Ile
165 170 175
Asp Glu Gln Gln Tyr Gly Ile Lys Pro Met Asn Cys Val Gly His Ile
180 185 190
Lys Val Tyr Gln Ser Ala Leu His Ser Tyr Arg Asp Leu Pro Leu Arg
195 200 205
Phe Tyr Glu Tyr Gly Val Val His Arg His Glu Lys Ser Gly Val Leu
210 215 220
His Gly Leu Leu Arg Val Arg Glu Phe Thr Gln Asp Asp Ala His Ile
225 230 235 240
Phe Cys Ser Phe Glu Gln Ile Gln Ser Glu Val Ser Ala Ile Leu Asp
245 250 255
Phe Thr His Lys Ile Met Gln Ala Phe Asp Phe Ser Tyr Glu Met Glu
260 265 270
Leu Ser Thr Arg Pro Ala Lys Ser Ile Gly Asp Asp Lys Val Trp Glu
275 280 285
Lys Ala Thr Asn Ala Leu Lys Glu Ala Leu Lys Glu His Arg Ile Asp
290 295 300
Tyr Lys Ile Asp Glu Gly Gly Gly Ala Phe Tyr Gly Pro Lys Ile Asp
305 310 315 320
Ile Lys Ile Thr Asp Ala Leu Lys Arg Lys Trp Gln Cys Gly Thr Ile
325 330 335
Gln Val Asp Met Asn Leu Pro Glu Arg Phe Lys Leu Ala Phe Ile Asn
340 345 350
Glu His Asn His Ala Glu Gln Pro Val Met Ile His Arg Ala Ile Leu
355 360 365
Gly Ser Phe Glu Arg Phe Ile Ala Ile Leu Ser Glu His Phe Gly Gly
370 375 380
Asn Phe Pro Phe Phe Val Ala Pro Thr Gln Ile Ala Leu Ile Pro Ile
385 390 395 400
Asn Glu Glu His His Val Phe Ala Leu Lys Leu Lys Glu Glu Leu Lys
405 410 415
Lys Arg Asp Ile Phe Val Glu Val Leu Asp Lys Asn Asp Ser Leu Asn
420 425 430
Lys Lys Val Arg Leu Ala Glu Lys Gln Lys Ile Pro Met Ile Leu Val
435 440 445
Leu Gly Asn Glu Glu Val Glu Thr Glu Ile Tyr Pro Leu Glu Thr Glu
450 455 460
Lys Asn Lys Leu Asn Ile Lys Cys Pro
465 470
(2) INFORMATION FOR SEQ ID NO:257:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2274 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: f3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:257:
GAATTCCGTT TTTATGCGCG TTTTGAAGAA ATCCCCCCAC GATTTATTGA AAGCCTTTTA 60
GCGGTAGAAG ACACCCTCTT TTTTGAACAT GGGGGGATCA ATTTAGACGC TATCATGCGC 120
GCTATGATTA AAAACGCTAA AAGCGGTCGT TACACTGAAG GGGGCAGCAC TCTAACCCAA 180
CAACTCGTTA AAAACATGGT GCTCACGCGA GAAAAAACAT TAACCAGAAA ACTCAAAGAA 240
GCTATCATTT CTTTACGCAT TGAAAAAGTC TTAAGCAAAG AAGAAATTTT AGAGCGTTAT 300
TTGAACCAAA CTTTTTTTGG GCATGGGTAT TATGGCGTGA AAACCGCAAG TTTAGGGTAT 360
TTTAAAAAAC CCCTTGACAA ACTCACGCTT AAAGAAATCA CCATGTTAGT CGCCTTGCCT 420
AGGGCTCCGA GTTTTTATGA TCCTACCAAA AATTTAGAAT TTTCACTCTC TAGAGCTAAT 480
GATATTTTAA GGCGGTTGTA TTCTTTAGGC TGGATTTCTT CTAACGAGCT CAAAGGCGCT 540
CTCAATGAAG TGCCAATCGT CTATAACCAA ACTTCCACGC AAAATATCGC TCCCTATGTC 600
GTGGATGAAG TGTTGAAGCA ATTGGATCAA TTAGACGGGT TAAAAACTCA AGGCTATACC 660
ATAAAGCTCA CGATAGATTT GGATTACCAA CGCTTAGCGT TAGAGTCCTT GCGTTTTGGG 720
CATCAAAAAA TCTTAGAAAA AATCGCTAAA GAAAAGCCAA AAACTAACGC CTCTGATGAA 780
GATGAAGACA ACTTGAACGC CAGCATGATA GTTACAGACA CGAGCACCGG TAAGATTTTA 840
GCTTTAGTGG GGGGGATTGA TTATAAAAAA AGCGCTTTCA ATCGCGCCAC GCAAGCCAAA 900
CGGCAGTTTG GGAGCGCGAT AAAGCCTTTT GTGTATCAAA TCGCTTTTGA TAATGGCTAT 960
TCCACCACTT CCAAAATCCC TGATACCGCG CGAAACTTTG AAAATGGCAA TTATAGTAAA 1020
AACAGCGAAC AAAACCACGC ATGGCACCCC AGCAATTATT CTCGCAAGTT TTTAGGGCTT 1080
GTAACCTTGC AAGAAGCCTT AAGCCATTCG TTAAATCTAG CCACGATCAA TTTAAGCGAT 1140
CAGCTTGGCT TTGAAAAAAT TTATCAATCT TTAAGCGATA TGGGGTTTAA AAACCTCCCT 1200
AAGGACTTGT CTATCGTGTT AGGGAGCTTT GCTATCTCAC CCATTGATGC GGCTGAAAAG 1260
TATTCTTTAT TTTCTAATTA CGGCACCATG CTCAAACCCA TGCTCATTGA AAGCATCACT 1320
AACCAACAAA ACGATGTCAA AACTTTCACG CCTATGGAAA CCAAAAAGAT CACCTCCAAA 1380
GAACAGGCTT TTTTAACCCT TTCAGTGCTG ATAAATGCGG TAGAAAACGG CACAGGGCGT 1440
TTGGCTCGCA CTAAAGGTTT AGAAATCGCC GGTAAAACCG GGACTTCTAA CAACAATATT 1500
GACGCTTGGT TCATTGGCTT TACCCCCACC TTGCAAAGCG TGATCTGGTT TGGGAGAGAC 1560
GATAACACGC CTATTAGCAA AGGAGCGACA GGAGGCGTTG TGAGTGCACC TGTGTATTCG 1620
TATTTCATGC GTAATATTTT AGCGATTGAA CCTTCTTTAA AAAGAAAGTT TGATGTCCCC 1680
AAAGGCTTGC GTAAAGAAAT CGTGGATAAA ATCCCCTACT ACTCAACTCC TAATTCCATC 1740
ACCCCCACCC CCAAAAGAAC AGACGATAGT GAGGAACGCT TGTTGTTCTA ATCCTACATG 1800
GCTATAGGGA CTTTAACATC TCCGTAATTG ACAGAATCTT TAAAGATTTG TTCCACTTTA 1860
GCCACAAGGA TACTCACCGT TTGATTGTCT AATGCCATTT TATGCAACCC TTCGCTCAAA 1920
GCGTCCTTAT CAAAACCACG CGCGCGCACT AACGCAAACA CTTCCTCATT GTTAGGATTG 1980
ATATAAATCC CTAAACGATC CACATTCATC AAGCTGTCAT AAATGGGCTG TGTCATTTGG 2040
ATAAACATTT CATTAGTGAT CTTTTGCTTG ATGCGGTTGG ATTTAGAAAG GTAAGCCTCT 2100
TTGAGTTTAT GCACAATATT TTGCGCCACT CTTATTGTCG CTCTTTTGGT AGCGATTTTT 2160
TCTATCTCTT CAGTTTTCTT ACCATCCACA ATGTGCGAGC TATCCACCCC TTGCGTGATA 2220
TACTCACTGC GATTTTGCAA CATCCAATAA GGCACATCTT GCAAGAAAGA ATTC 2274
(2) INFORMATION FOR SEQ ID NO: 258:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 596 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: f3ORF
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258:
Glu Phe Arg Phe Tyr Ala Arg Phe Glu Glu Ile Pro Pro Arg Phe Ile
1 5 10 15
Glu Ser Leu Leu Ala Val Glu Asp Thr Leu Phe Phe Glu His Gly Gly
20 25 30
Ile Asn Leu Asp Ala Ile Met Arg Ala Met Ile Lys Asn Ala Lys Ser
35 40 45
Gly Arg Tyr Thr Glu Gly Gly Ser Thr Leu Thr Gln Gln Leu Val Lys
50 55 60
Asn Met Val Leu Thr Arg Glu Lys Thr Leu Thr Arg Lys Leu Lys Glu
65 70 75 80
Ala Ile Ile Ser Leu Arg Ile Glu Lys Val Leu Ser Lys Glu Glu Ile
85 90 95
Leu Glu Arg Tyr Leu Asn Gln Thr Phe Phe Gly His Gly Tyr Tyr Gly
100 105 110
Val Lys Thr Ala Ser Leu Gly Tyr Phe Lys Lys Pro Leu Asp Lys Leu
115 120 125
Thr Leu Lys Glu Ile Thr Met Leu Val Ala Leu Pro Arg Ala Pro Ser
130 135 140
Phe Tyr Asp Pro Thr Lys Asn Leu Glu Phe Ser Leu Ser Arg Ala Asn
145 150 155 160
Asp Ile Leu Arg Arg Leu Tyr Ser Leu Gly Trp Ile Ser Ser Asn Glu
165 170 175
Leu Lys Gly Ala Leu Asn Glu Val Pro Ile Val Tyr Asn Gln Thr Ser
180 185 190
Thr Gln Asn Ile Ala Pro Tyr Val Val Asp Glu Val Leu Lys Gln Leu
195 200 205
Asp Gln Leu Asp Gly Leu Lys Thr Gln Gly Tyr Thr Ile Lys Leu Thr
210 215 220
Ile Asp Leu Asp Tyr Gln Arg Leu Ala Leu Glu Ser Leu Arg Phe Gly
225 230 235 240
His Gln Lys Ile Leu Glu Lys Ile Ala Lys Glu Lys Pro Lys Thr Asn
245 250 255
Ala Ser Asp Glu Asp Glu Asp Asn Leu Asn Ala Ser Met Ile Val Thr
260 265 270
Asp Thr Ser Thr Gly Lys Ile Leu Ala Leu Val Gly Gly Ile Asp Tyr
275 280 285
Lys Lys Ser Ala Phe Asn Arg Ala Thr Gln Ala Lys Arg Gln Phe Gly
290 295 300
Ser Ala Ile Lys Pro Phe Val Tyr Gln Ile Ala Phe Asp Asn Gly Tyr
305 310 315 320
Ser Thr Thr Ser Lys Ile Pro Asp Thr Ala Arg Asn Phe Glu Asn Gly
325 330 335
Asn Tyr Ser Lys Asn Ser Glu Gln Asn His Ala Trp His Pro Ser Asn
340 345 350
Tyr Ser Arg Lys Phe Leu Gly Leu Val Thr Leu Gln Glu Ala Leu Ser
355 360 365
His Ser Leu Asn Leu Ala Thr Ile Asn Leu Ser Asp Gln Leu Gly Phe
370 375 380
Glu Lys Ile Tyr Gln Ser Leu Ser Asp Met Gly Phe Lys Asn Leu Pro
385 390 395 400
Lys Asp Leu Ser Ile Val Leu Gly Ser Phe Ala Ile Ser Pro Ile Asp
405 410 415
Ala Ala Glu Lys Tyr Ser Leu Phe Ser Asn Tyr Gly Thr Met Leu Lys
420 425 430
Pro Met Leu Ile Glu Ser Ile Thr Asn Gln Gln Asn Asp Val Lys Thr
435 440 445
Phe Thr Pro Met Glu Thr Lys Lys Ile Thr Ser Lys Glu Gln Ala Phe
450 455 460
Leu Thr Leu Ser Val Leu Ile Asn Ala Val Glu Asn Gly Thr Gly Arg
465 470 475 480
Leu Ala Arg Thr Lys Gly Leu Glu Ile Ala Gly Lys Thr Gly Thr Ser
485 490 495
Asn Asn Asn Ile Asp Ala Trp Phe Ile Gly Phe Thr Pro Thr Leu Gln
500 505 510
Ser Val Ile Trp Phe Gly Arg Asp Asp Asn Thr Pro Ile Ser Lys Gly
515 520 525
Ala Thr Gly Gly Val Val Ser Ala Pro Val Tyr Ser Tyr Phe Met Arg
530 535 540
Asn Ile Leu Ala Ile Glu Pro Ser Leu Lys Arg Lys Phe Asp Val Pro
545 550 555 560
Lys Gly Leu Arg Lys Glu Ile Val Asp Lys Ile Pro Tyr Tyr Ser Thr
565 570 575
Pro Asn Ser Ile Thr Pro Thr Pro Lys Arg Thr Asp Asp Ser Glu Glu
580 585 590
Arg Leu Leu Phe
595
(2) INFORMATION FOR SEQ ID NO: 259:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: g2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:259:
GAATTCATCA GGGATCAATG ATGGCGCGAG CATTATCATT TTATGCAGCG CTAAAAAAGC 60
GCAAAAATTA GGGTTAAAAG CCATGGCTAC TATCAGGGGG TTTGGTTTGG GTGGTTGCAG 120
TCCGGATATA ATGGGTATAT GCCCTATTAT TGCGATTAAA AACAATCTTA AAAATGTCAA 180
AATGAATCTC AATGACATCA ATCTTTTTGA ACTCAATGAA GCCTTTGCCG CGCAAAGTCT 240
AGCCGTGTTA AAAGAGCTTG AATTAAACCC CAATATAGTG AATGTGAATG GAGGCGCGAT 300
AGCGATTGGC CACCCTATTG GTGCGAGCGG CGCTAGGATC TTAGTAACTT TATTGCATGA 360
AATGAAAAAA AGCGGTCATG GCGTGGGTTG CGCGTCATTG TGCGTGGGTG GCGGACAAGG 420
CCTTTCTGTG GTATTGGAAC AAAAATAAGG AGAAAATGAG ATGAACAAGG TTATAACTAA 480
TTTAAACAAA GCATTGAGCG GGTTAAAAAA CGGGGACACT ATTTTAGTGG GCGGTTTTGG 540
GCTGTGCGGG ATACCCGAAT ACGCCATTGA TTACATTTAT AAGAAAGGCA TTAAGGATTT 600
GATTGTCGTG AGTAATAATT GCGGCGTTGA TGATTTTGGG CTTGGCATTC TTTTAGAAAA 660
AAAACAGATT AAAAAGATTA TCGCTTCCTA TGTGGGAGAA AATAAGATTT TTGAATCGCA 720
AATGCTGAAC GGAGAAATTG AATTCGTTTT GACACCGCAA GGCACGCTGG CTGAAAACTT 780
GCGCGCTGGA GGGGCTGGGA TACCCGCTTA CTACACCCCA ACCGGTGTTG GGACTTTGAT 840
CGCTCAAGGC AAGGAATCAC GGGAATTTAA CGGCAAAGAG TATATTTTAG AAAGAGCGAT 900
CACAGGCGAT TACGGGCTTA TCAAAGCCTA TAAAAGCGAC ACTCTTGGGA ATTTGGTGTT 960
TAGAAAAACG GCTAGAAATT TCAACCCCTT GTGTGCGATG GCATCAAAAA TATGCGTCGC 1020
TGAAGTGGAA GAAATTGTCC CGGCTGGGGA ATTAGACCCA GATGAAATAC ACTTGCCAGG 1080
AATCTATGTG CAACACATCT ATAAAGGCGA GAAATTTGAA AAACGGATAG AAAAAATCAC 1140
AACAAGGAGC GCGAAATGAG AGAGGCTATC ATTAAAAGAG CGGCAAAGGA ATTGAAAGAG 1200
GGCATGTATG TGAATTTAGG GATAGGCTTG CCCACGCTTG TGGCTAATGA AGTGAGCGGG 1260
ATGAATATCG TTTTCCAGAG CGAGAACGGG CTGTTAGGGA TTGGCGCTTA CCCTTTAGAG 1320
GGGGGCGTTG ATGCGGATCT CATCAATGCA GGAAAGGAAA CCGTAACCGT GGTGCCGGGC 1380
GCTTCGTTTT TCAACAGCGC GGATTCGTTT GCAATGATTC GTGGGGGGCA TATTGATCTA 1440
GCGATTTTAG GAGGGATGGA AGTTTCACAA AATGGGGATT TGGCTAATTG GATGATCCCT 1500
AAAAAGCTCA TAAAAGGCAT GGGAGGGGCT ATGGATTTGG TGCATGGCGC TAAAAAAGTG 1560
ATTGTCATCA TGGAGCATTG CAACAAATAC GGGGAGTCTA AAGTGAAAAA AGAATGCTCA 1620
TTGCCCTTAA CAGGAAAGGG CGTGGTGCAT CAATTGATAA CGGATTTAGC GGTGTTTGAA 1680
TTTTCCAATA ACGCCATGAA ATTAGTGGAA TTGCAAGAGG GTGTCAGCCT TGATCAAGTG 1740
AGAGAAAAAA CAGAAGCCGA ATTTGAAGTG CACCTATAGC TTATAAAAGG GGTGTTTATG 1800
TTTTTATTAA GGCATTTGAC TTCAGCGTGC GTGTTTTTAG CGTCTAAATG TTTGCCGGAC 1860
TCCTTTGTCT TGGTCGCTCT TTTATCGTTT ATCGTGTTTG TTCTTGTTTA TGGTTTGACA 1920
GGGCAAGACG CTTTTTCTGT CATTTCTAGT TGGGGGAATG GCGCTTGGAC GCTTTTAGGT 1980
TTTTCTATGC AAATGGCTCT TATTTTGGTG CTAGGTCAGG CTTTGGCTAG CGCTAAATTA 2040
GTCCAAAAAC TTTTAAAATA TCTAGCGTCT TTACCTAAAG GGTATTACAC GGCTTTATGG 2100
TTGGTTACTT TTTTATCGTT AATCGCTAAT TGGATCAACT GGGGTTTTGG CTTGGTGATC 2160
AGCGCAATTT TTGCAAAAGA GATCGCCAAA AATGTTAAAG GGGTGGATTA CAGGCTGCTT 2220
ATTGCTAGCG CTTATTCGGG TTTTGTCATC TGGCATGGGG GTTTATCAGG CTCTATCCCT 2280
TTAAGCGTTG CCACCCAAAA TGAAAATTTA TCCAAAATAA GTGCTGGGGT GATTGAAAAA 2340
GCTATTCCTA TCAGTCAGAC GATTTTTTCT GCCTATAATT TAATCATTAT AGGGATCATT 2400
CTTGTAGGGT TACCCTTTTT AATGGCAATA ATCCACCCTA AAAAAGAAGA AATCGTTGAG 2460
ATTGACGCCA AGCT 2474
(2) INFORMATION FOR SEQ ID NO: 260:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 148 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g2ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:260:
Asn Ser Ser Gly Ile Asn Asp Gly Ala Ser Ile Ile Ile Leu Cys Ser
1 5 10 15
Ala Lys Lys Ala Gln Lys Leu Gly Leu Lys Ala Met Ala Thr Ile Arg
20 25 30
Gly Phe Gly Leu Gly Gly Cys Ser Pro Asp Ile Met Gly Ile Cys Pro
35 40 45
Ile Ile Ala Ile Lys Asn Asn Leu Lys Asn Val Lys Met Asn Leu Asn
50 55 60
Asp Ile Asn Leu Phe Glu Leu Asn Glu Ala Phe Ala Ala Gln Ser Leu
65 70 75 80
Ala Val Leu Lys Glu Leu Glu Leu Asn Pro Asn Ile Val Asn Val Asn
85 90 95
Gly Gly Ala Ile Ala Ile Gly His Pro Ile Gly Ala Ser Gly Ala Arg
100 105 110
Ile Leu Val Thr Leu Leu His Glu Met Lys Lys Ser Gly His Gly Val
115 120 125
Gly Cys Ala Ser Leu Cys Val Gly Gly Gly Gln Gly Leu Ser Val Val
130 135 140
Leu Glu Gln Lys
145
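The relationship between the clone g2 nucleotide listing (SEQ ID NO:259) and its first open reading frame (SEQ ID NO:260) can be checked by translating the first 60-base row with the standard genetic code. The sketch below is illustrative only: the function name `translate` and the choice to start reading at the second nucleotide are assumptions made here by comparing the two listings, not statements from the specification.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def translate(dna, offset=0):
    """Translate a listing row (block spacing allowed) into three-letter residues.

    `offset` is the 0-based position of the first codon; translation stops
    at the first stop codon or when fewer than three bases remain.
    """
    seq = "".join(ch for ch in dna.upper() if ch in "ACGT")[offset:]
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return " ".join(residues)

# First row of SEQ ID NO:259, copied from the listing above.
first_row = "GAATTCATCA GGGATCAATG ATGGCGCGAG CATTATCATT TTATGCAGCG CTAAAAAAGC"
peptide = translate(first_row, offset=1)  # offset=1 is an assumed reading frame
print(peptide)
```

Under these assumptions the output begins Asn Ser Ser Gly, in agreement with residues 1 to 19 of SEQ ID NO:260.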
(2) INFORMATION FOR SEQ ID NO: 261:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 232 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g2ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:261:
Met Asn Lys Val Ile Thr Asn Leu Asn Lys Ala Leu Ser Gly Leu Lys
1 5 10 15
Asn Gly Asp Thr Ile Leu Val Gly Gly Phe Gly Leu Cys Gly Ile Pro
20 25 30
Glu Tyr Ala Ile Asp Tyr Ile Tyr Lys Lys Gly Ile Lys Asp Leu Ile
35 40 45
Val Val Ser Asn Asn Cys Gly Val Asp Asp Phe Gly Leu Gly Ile Leu
50 55 60
Leu Glu Lys Lys Gln Ile Lys Lys Ile Ile Ala Ser Tyr Val Gly Glu
65 70 75 80
Asn Lys Ile Phe Glu Ser Gln Met Leu Asn Gly Glu Ile Glu Phe Val
85 90 95
Leu Thr Pro Gln Gly Thr Leu Ala Glu Asn Leu Arg Ala Gly Gly Ala
100 105 110
Gly Ile Pro Ala Tyr Tyr Thr Pro Thr Gly Val Gly Thr Leu Ile Ala
115 120 125
Gln Gly Lys Glu Ser Arg Glu Phe Asn Gly Lys Glu Tyr Ile Leu Glu
130 135 140
Arg Ala Ile Thr Gly Asp Tyr Gly Leu Ile Lys Ala Tyr Lys Ser Asp
145 150 155 160
Thr Leu Gly Asn Leu Val Phe Arg Lys Thr Ala Arg Asn Phe Asn Pro
165 170 175
Leu Cys Ala Met Ala Ser Lys Ile Cys Val Ala Glu Val Glu Glu Ile
180 185 190
Val Pro Ala Gly Glu Leu Asp Pro Asp Glu Ile His Leu Pro Gly Ile
195 200 205
Tyr Val Gln His Ile Tyr Lys Gly Glu Lys Phe Glu Lys Arg Ile Glu
210 215 220
Lys Ile Thr Thr Arg Ser Ala Lys
225 230
(2) INFORMATION FOR SEQ ID NO:262:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 207 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g2ORF3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:262:
Met Arg Glu Ala Ile Ile Lys Arg Ala Ala Lys Glu Leu Lys Glu Gly
1 5 10 15
Met Tyr Val Asn Leu Gly Ile Gly Leu Pro Thr Leu Val Ala Asn Glu
20 25 30
Val Ser Gly Met Asn Ile Val Phe Gln Ser Glu Asn Gly Leu Leu Gly
35 40 45
Ile Gly Ala Tyr Pro Leu Glu Gly Gly Val Asp Ala Asp Leu Ile Asn
50 55 60
Ala Gly Lys Glu Thr Val Thr Val Val Pro Gly Ala Ser Phe Phe Asn
65 70 75 80
Ser Ala Asp Ser Phe Ala Met Ile Arg Gly Gly His Ile Asp Leu Ala
85 90 95
Ile Leu Gly Gly Met Glu Val Ser Gln Asn Gly Asp Leu Ala Asn Trp
100 105 110
Met Ile Pro Lys Lys Leu Ile Lys Gly Met Gly Gly Ala Met Asp Leu
115 120 125
Val His Gly Ala Lys Lys Val Ile Val Ile Met Glu His Cys Asn Lys
130 135 140
Tyr Gly Glu Ser Lys Val Lys Lys Glu Cys Ser Leu Pro Leu Thr Gly
145 150 155 160
Lys Gly Val Val His Gln Leu Ile Thr Asp Leu Ala Val Phe Glu Phe
165 170 175
Ser Asn Asn Ala Met Lys Leu Val Glu Leu Gln Glu Gly Val Ser Leu
180 185 190
Asp Gln Val Arg Glu Lys Thr Glu Ala Glu Phe Glu Val His Leu
195 200 205
(2) INFORMATION FOR SEQ ID NO: 263:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 225 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g2ORF4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:263:
Met Phe Leu Leu Arg His Leu Thr Ser Ala Cys Val Phe Leu Ala Ser
1 5 10 15
Lys Cys Leu Pro Asp Ser Phe Val Leu Val Ala Leu Leu Ser Phe Ile
20 25 30
Val Phe Val Leu Val Tyr Gly Leu Thr Gly Gln Asp Ala Phe Ser Val
35 40 45
Ile Ser Ser Trp Gly Asn Gly Ala Trp Thr Leu Leu Gly Phe Ser Met
50 55 60
Gln Met Ala Leu Ile Leu Val Leu Gly Gln Ala Leu Ala Ser Ala Lys
65 70 75 80
Leu Val Gln Lys Leu Leu Lys Tyr Leu Ala Ser Leu Pro Lys Gly Tyr
85 90 95
Tyr Thr Ala Leu Trp Leu Val Thr Phe Leu Ser Leu Ile Ala Asn Trp
100 105 110
Ile Asn Trp Gly Phe Gly Leu Val Ile Ser Ala Ile Phe Ala Lys Glu
115 120 125
Ile Ala Lys Asn Val Lys Gly Val Asp Tyr Arg Leu Leu Ile Ala Ser
130 135 140
Ala Tyr Ser Gly Phe Val Ile Trp His Gly Gly Leu Ser Gly Ser Ile
145 150 155 160
Pro Leu Ser Val Ala Thr Gln Asn Glu Asn Leu Ser Lys Ile Ser Ala
165 170 175
Gly Val Ile Glu Lys Ala Ile Pro Ile Ser Gln Thr Ile Phe Ser Ala
180 185 190
Tyr Asn Leu Ile Ile Ile Gly Ile Ile Leu Val Gly Leu Pro Phe Leu
195 200 205
Met Ala Ile Ile His Pro Lys Lys Glu Glu Ile Val Glu Ile Asp Ala
210 215 220
Lys
225
(2) INFORMATION FOR SEQ ID NO:264:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4292 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:264:
AGCTTATGCG GCTGAGCCAA GCAGGCAAAC CCTCTCTAAA GTCAGTAACC GCTTCAAAGA 60
GCATGGCGCT AAATTCGATC TTCGTGTGAT GGCAACACAT GGAGGCACCA TTAGCTGGAA 120
AGCTAAAGAG CTCGCTAGGA CTATTGTGAG CGGCCCTATT GGAGGCGTGA TCGGATCTAA 180
ATTGCTAGGC GAAACGCTTG GTTATGACAA TATTGCATGC AGCGATATTG GCGGCACGAG 240
CTTTGATATG GCGCTTATCG TTAAGAGCAA TTTTAATATC GCTTCTGACC CTGATATGGC 300
AGTGCGGGGA GTTTCGTGCG CATTGATCCG CACAGCCGAT CGGTCAAACT AGGGCCTGAC 360
AGCGCGGGGT ATAAAGTTGG CACTTGTTGG AAAGACAGCG GATAGACACG GTTTCAGTAA 420
CCGATTGCCA TATTGTTTTA GGCTATTTGA ACCCAGATTA TTTCTTAGGC GGTTTGATCA 480
AATTAGATGT GGATATGGCT AAAAAACACA TTAAAGAGCA AATCGCTGAT CCGCTAGGCA 540
TTATCGTAGA AGATGCGGCT GCGGGTGTGA TTGAGCTGCT TGATCCGGAG CTTAAAGAAT 600
ACTTGCGATC CAATATTAGC GCTAAAGGGT ATAGCCCATC TGATTTTGTG TGCTTTTCAT 660
ATGGTGGCGC GGGGCCTGTG CATACATATG GCTATACAGA AAGCTTAGGT TTTAAGGATG 720
TGGTAGTGCC TGCGTGGGCG GCTGGATTTA GCGCTTTTGG TTGCGCTTGC GCTGATTTTG 780
AATACAGATA CGACAAGAGC GTGGATATTG CCATTCCGCA GTATTCTTCA GACGAGTCCA 840
AAATAGAAGC ATGCAAGATC ATTCAAGACG CATGGGATGA ATTGACTCTC AAAGTGATTG 900
AAGAGTTCAA GATCAATGGA TTTTCTCAAA AAGATGTGAT CTTAAGACCT GGATACAGGA 960
TGCAGTATAT GGGGCAATTG AATGATTTAG AGATCACTTC TCCTGTGTCA AAAGCTGCAA 1020
GCGTGGCTGA TTGGGAAGAG ATTGTCAAAG AATATGAAAA AACCTACGCT CGTGTTTATT 1080
CTGAATCAGC GTGTTCTCCA GAGCTTGGTT TTAGCGTGAC CGGCGTGATC ATGCGTGGTG 1140
TTGTGGCTAC GCAAAAACCT GTGATTCCGG TTGAAAAAGA GCATGGCGCT ACGCCCCCAA 1200
AAGAAGCCAA AATAGGCGTT AGAAAATTCT ATCGGCATAA AAAATGGGTG GATGCAGATG 1260
TGTGGCAAAT GGAAAAATTA TTGCCTGGAA ATGAAGTCAT AGGGCCTGCG ATCGTGGAAT 1320
CAGATGCGAC CACTTTCGTG ATACCCAAAG GCTTTGCGAC AAGACTAGAC AAACACCGAT 1380
TGTTCCACTT GAAAGAAATT AAATAAAGGA GTTCAAAATG GCAAATTTAT TGAAAAACGG 1440
CAAAACTTTA AAACAAGCTA GAGATGAAAT CCTAGCCAGG ACAGAAAAAA CAGGGCATTA 1500
TAATGGTCTC AAAAAACTAG AGTTTAAGGA AAGAGATCCG ATTGGTTATG AGAAGATGTT 1560
CTCTAAATTA AGAGGCGGTA TCGTGCATGC CAGAGAAACG GCTAAAAGGA TTGCGGCAAG 1620
CCCTATTGTT GAGCAAGAGG GAGAATTGTG CTTCACGCTT TATAACGCTG TGGGCGATAG 1680
CGTGCTGACT TCTACGGGTA TCATTATCCA TGTAGGGACT ATGGGATCAG CTATCAAATA 1740
CATGGTAGAG AATAATTGGG AAGACAACCC AGGCATCAAT GACAAGGATA TTTTCACCAA 1800
TAACGACTGT GCGATTGGGA ATGTGCACCC ATGCGATATT ATGACTCTTG TGCCTATTTT 1860
CCACGATGAA AAATTGATTG GGTGGGTAGG TGGTGTTACG CATGTGATTG ATACCGGTTC 1920
GGTTACTCCA GGATCGATGA GCACCGGGCA GGTTCAAAGA TTTGGGGATG GATACATGAT 1980
CACTTGCCGT AAGACAGGAG CGAATGATGA GAGCTTTAAG GATTGGTTGC ATGAATCTCA 2040
AAGATCGGTT CGCACGCCTA AATATTGGAT TCTAGATGAA AGGACTAGGA TTGCAGGATG 2100
CCATATGATT AGGGATTTAG TGATGGAAGT CATTAAAGAA GACGGCATTG ATTCTTACAT 2160
GCGATTTATT GATGAAGTGA TTGAAGAAGG AAAGAAGAAG CCTTATCTCT AGGATCAAAT 2220
CCATGAACCA TACCAGGCCA ATTATAGAAA GGTTACCTTT GTGGATGTGC CTTATGCGCA 2280
TAAGGATATT GGCGTGTGCT CTGAATTTGC TTAAGCTAGA ACACGATCAT GCACTCTCCT 2340
GTGGGAAATC ACTATCAATA AAGACGCTAA CATGGAAATT AGATTTTGAT GGCGCGTCCA 2400
GGTGGGGATG GCACTCTTTC AATTGCAACC AAGTGTCTTT CACTAGCGGT ATTTGGGTGA 2460
TGATGACTCA AACGCTGATA CCCACTTCTC GCATCAATGA TGGCGCTTAT TTTGCGACTC 2520
AATTCAGACT CAAAAAAGGG ACTTGGATGA ATCCAGATGA CAGACGCACC GGGCATGCTT 2580
ATGCATGGCA TTTCTTGGTA TCAGGCTGGA GCGCTTTGTG GAGAGGCTTG TCTCAAGCGT 2640
ATTACAGCCG AGGGTATTTA GAAGAGGTGA ATTCCGGGAA CGCTAACACT TCCAATTGGT 2700
TGCAAGGCGG TGGTATCAAC CAGGATGGAG AAATCCATGC GGTGAATAGC TTTGAGACAA 2760
GTTCTTGTGG GACTGGAGCT TGCGCGATAA AAGACGGCTT AAATCACGCA GCGGCTATTT 2820
GGAATCCAAA AGGCGATATG GGCGATGTTG AAATTTGGGA AATGGCAGAG CCTCTTCTTT 2880
ATTTGGGCAG GAATGTCAAA GCCAATACCG GTGGGTATGG GAAATATCGA GGCGGTAACG 2940
GGTTTGAAAC CTTAAGAATG GTGTGGGGTG TGCATGATTG GACCATGTTC TTTATGGGTA 3000
ATGGCTACAT GAATAGCGAT TGGGGTATGA TGGGGGGCTA TCCACCGGCC AGTGGCTATA 3060
GGTTTGAAGC GCACAACACC GACTTGGAAA ACAGGATTAA AAATAACGCC AGCTTGCCTT 3120
TGGGGGGCGA TTTTAACCCA ACGGATAGAG ATTATGAAAA GCATATTTCT CATGCGTCTC 3180
AAGTCAAAAG GGATAAGCAA TGCATCACCA CTGAGAACTG CTTTGACAAT TATGATTTGT 3240
ATTTGAATTA CATCAAAGGT GGTCCTGGAT TTGGCGATCC GATTGAAAGG GATTTGAATG 3300
CGATTTTAGA AGATCTCAAC AGCAAACAGC TATTGCCAGA ATACGCTTAC AAGGTTTATG 3360
GCGCAGTGGT GAGTCAAAAT AAAGACGGCG TGTGGGTCGG CGATGAAGCC AAAACGAAAG 3420
CCAGAAGAAA AGAAATTCTT GAAAACAGAA AGGCTAGATC CATACCGGTA AAACAATGGA 3480
TGGAGCAAGA AAGAAACGCT ATCCTTGAAA AAGAGGCTTC CAAACAGGTT AAGCACATGT 3540
ATGCGACTAG CTTTGATCTT TCGCCTAAGT TTTTAAGCGA TTTTAAAACA TTTTGGAACT 3600
TGCCAAAGAG CTGGACTATG AAAGAAGATG AGCTTGGCGT ATTCACCTAT GGTTCTAAAT 3660
ACAGGATGGA TTTGAGCAAA TTGCCTGATG TGCGCACAGT TCTGTTGGTT GATGAGAAAT 3720
AAAGAAAGGA GAATGGTTAT GTCAAAATAC ACACAAGAAC AAATTAAAAA TTTGGTAGAG 3780
GGGAACTTGG ATTGGAACAC TGTCTTAAAA ATGCTTAGCA TGCCTAAAGA TCATGAAAGA 3840
TTCCAAATGT ATTTGAAGGT GTTGCAAGAT AAGGTAGATT TTGATGACAA AATCGTCTTA 3900
CCCTTGGGGC CACATTTGTT TGTGGTGCAA GATTCTCAAA AGAAATGGGT CATTAAGTGT 3960
TCATGCGGTC ATGCATTCTG TGCTCCAGAA GAAAACTGGA AATTGCATGC AAACATCTAT 4020
GTGCACGATA CAGCAGAAAA AATGGAAGAA GTGTATCCTA AACTCTTAGC CAGTGATACT 4080
AACTGGCAAG TGTATCGGGA GTATATTTGC CCGGATTGCG GCATCCTTTT AGATGTTGAA 4140
GCCCCAACTC CTTGGTATCC TGTGATCCAT GATTTTGAGC CTGATATAGA GGTGTTTTAT 4200
AAAGAATGGC TAGGCATACA GCCCCCAGAA AGACGCTAAA ATCGCTCAAT CCTTTTTATG 4260
AAGAGGCTTT ATGCCTCTTG GCGCTAAAAG CT 4292
(2) INFORMATION FOR SEQ ID NO: 265:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g9ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:265:
Ala Tyr Ala Ala Glu Pro Ser Arg Gln Thr Leu Ser Lys Val Ser Asn
1 5 10 15
Arg Phe Lys Glu His Gly Ala Lys Phe Asp Leu Arg Val Met Ala Thr
20 25 30
His Gly Gly Thr Ile Ser Trp Lys Ala Lys Glu Leu Ala Arg Thr Ile
35 40 45
Val Ser Gly Pro Ile Gly Gly Val Ile Gly Ser Lys Leu Leu Gly Glu
50 55 60
Thr Leu Gly Tyr Asp Asn Ile Ala Cys Ser Asp Ile Gly Gly Thr Ser
65 70 75 80
Phe Asp Met Ala Leu Ile Val Lys Ser Asn Phe Asn Ile Ala Ser Asp
85 90 95
Pro Asp Met Ala Val Arg Gly Val Ser Cys Ala Leu Ile Arg Thr Ala
100 105 110
Asp Arg Ser Asn
115
(2) INFORMATION FOR SEQ ID NO:266:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 303 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g9ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:266:
Met Ala Lys Lys His Ile Lys Glu Gln Ile Ala Asp Pro Leu Gly Ile
1 5 10 15
Ile Val Glu Asp Ala Ala Ala Gly Val Ile Glu Leu Leu Asp Pro Glu
20 25 30
Leu Lys Glu Tyr Leu Arg Ser Asn Ile Ser Ala Lys Gly Tyr Ser Pro
35 40 45
Ser Asp Phe Val Cys Phe Ser Tyr Gly Gly Ala Gly Pro Val His Thr
50 55 60
Tyr Gly Tyr Thr Glu Ser Leu Gly Phe Lys Asp Val Val Val Pro Ala
65 70 75 80
Trp Ala Ala Gly Phe Ser Ala Phe Gly Cys Ala Cys Ala Asp Phe Glu
85 90 95
Tyr Arg Tyr Asp Lys Ser Val Asp Ile Ala Ile Pro Gln Tyr Ser Ser
100 105 110
Asp Glu Ser Lys Ile Glu Ala Cys Lys Ile Ile Gln Asp Ala Trp Asp
115 120 125
Glu Leu Thr Leu Lys Val Ile Glu Glu Phe Lys Ile Asn Gly Phe Ser
130 135 140
Gln Lys Asp Val Ile Leu Arg Pro Gly Tyr Arg Met Gln Tyr Met Gly
145 150 155 160
Gln Leu Asn Asp Leu Glu Ile Thr Ser Pro Val Ser Lys Ala Ala Ser
165 170 175
Val Ala Asp Trp Glu Glu Ile Val Lys Glu Tyr Glu Lys Thr Tyr Ala
180 185 190
Arg Val Tyr Ser Glu Ser Ala Cys Ser Pro Glu Leu Gly Phe Ser Val
195 200 205
Thr Gly Val Ile Met Arg Gly Val Val Ala Thr Gln Lys Pro Val Ile
210 215 220
Pro Val Glu Lys Glu His Gly Ala Thr Pro Pro Lys Glu Ala Lys Ile
225 230 235 240
Gly Val Arg Lys Phe Tyr Arg His Lys Lys Trp Val Asp Ala Asp Val
245 250 255
Trp Gln Met Glu Lys Leu Leu Pro Gly Asn Glu Val Ile Gly Pro Ala
260 265 270
Ile Val Glu Ser Asp Ala Thr Thr Phe Val Ile Pro Lys Gly Phe Ala
275 280 285
Thr Arg Leu Asp Lys His Arg Leu Phe His Leu Lys Glu Ile Lys
290 295 300
(2) INFORMATION FOR SEQ ID NO: 267:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 264 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g9ORF3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:267:
Met Ala Asn Leu Leu Lys Asn Gly Lys Thr Leu Lys Gln Ala Arg Asp
1 5 10 15
Glu Ile Leu Ala Arg Thr Glu Lys Thr Gly His Tyr Asn Gly Leu Lys
20 25 30
Lys Leu Glu Phe Lys Glu Arg Asp Pro Ile Gly Tyr Glu Lys Met Phe
35 40 45
Ser Lys Leu Arg Gly Gly Ile Val His Ala Arg Glu Thr Ala Lys Arg
50 55 60
Ile Ala Ala Ser Pro Ile Val Glu Gln Glu Gly Glu Leu Cys Phe Thr
65 70 75 80
Leu Tyr Asn Ala Val Gly Asp Ser Val Leu Thr Ser Thr Gly Ile Ile
85 90 95
Ile His Val Gly Thr Met Gly Ser Ala Ile Lys Tyr Met Val Glu Asn
100 105 110
Asn Trp Glu Asp Asn Pro Gly Ile Asn Asp Lys Asp Ile Phe Thr Asn
115 120 125
Asn Asp Cys Ala Ile Gly Asn Val His Pro Cys Asp Ile Met Thr Leu
130 135 140
Val Pro Ile Phe His Asp Glu Lys Leu Ile Gly Trp Val Gly Gly Val
145 150 155 160
Thr His Val Ile Asp Thr Gly Ser Val Thr Pro Gly Ser Met Ser Thr
165 170 175
Gly Gln Val Gln Arg Phe Gly Asp Gly Tyr Met Ile Thr Cys Arg Lys
180 185 190
Thr Gly Ala Asn Asp Glu Ser Phe Lys Asp Trp Leu His Glu Ser Gln
195 200 205
Arg Ser Val Arg Thr Pro Lys Tyr Trp Ile Leu Asp Glu Arg Thr Arg
210 215 220
Ile Ala Gly Cys His Met Ile Arg Asp Leu Val Met Glu Val Ile Lys
225 230 235 240
Glu Asp Gly Ile Asp Ser Tyr Met Arg Phe Ile Asp Glu Val Ile Glu
245 250 255
Glu Gly Lys Lys Lys Pro Tyr Leu
260
(2) INFORMATION FOR SEQ ID NO: 268:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 499 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g9ORF4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:268:
Met Asn His Thr Arg Pro Ile Ile Glu Arg Leu Pro Leu Trp Met Cys
1 5 10 15
Leu Met Arg Ile Arg Ile Leu Ala Cys Ala Leu Asn Leu Leu Lys Leu
20 25 30
Glu His Asp His Ala Leu Ser Cys Gly Lys Ser Leu Ser Ile Lys Thr
35 40 45
Leu Thr Trp Lys Leu Asp Phe Asp Gly Ala Ser Arg Trp Gly Trp His
50 55 60
Ser Phe Asn Cys Asn Gln Val Ser Phe Thr Ser Gly Ile Trp Val Met
65 70 75 80
Met Thr Gln Thr Leu Ile Pro Thr Ser Arg Ile Asn Asp Gly Ala Tyr
85 90 95
Phe Ala Thr Gln Phe Arg Leu Lys Lys Gly Thr Trp Met Asn Pro Asp
100 105 110
Asp Arg Arg Thr Gly His Ala Tyr Ala Trp His Phe Leu Val Ser Gly
115 120 125
Trp Ser Ala Leu Trp Arg Gly Leu Ser Gln Ala Tyr Tyr Ser Arg Gly
130 135 140
Tyr Leu Glu Glu Val Asn Ser Gly Asn Ala Asn Thr Ser Asn Trp Leu
145 150 155 160
Gln Gly Gly Gly Ile Asn Gln Asp Gly Glu Ile His Ala Val Asn Ser
165 170 175
Phe Glu Thr Ser Ser Cys Gly Thr Gly Ala Cys Ala Ile Lys Asp Gly
180 185 190
Leu Asn His Ala Ala Ala Ile Trp Asn Pro Lys Gly Asp Met Gly Asp
195 200 205
Val Glu Ile Trp Glu Met Ala Glu Pro Leu Leu Tyr Leu Gly Arg Asn
210 215 220
Val Lys Ala Asn Thr Gly Gly Tyr Gly Lys Tyr Arg Gly Gly Asn Gly
225 230 235 240
Phe Glu Thr Leu Arg Met Val Trp Gly Val His Asp Trp Thr Met Phe
245 250 255
Phe Met Gly Asn Gly Tyr Met Asn Ser Asp Trp Gly Met Met Gly Gly
260 265 270
Tyr Pro Pro Ala Ser Gly Tyr Arg Phe Glu Ala His Asn Thr Asp Leu
275 280 285
Glu Asn Arg Ile Lys Asn Asn Ala Ser Leu Pro Leu Gly Gly Asp Phe
290 295 300
Asn Pro Thr Asp Arg Asp Tyr Glu Lys His Ile Ser His Ala Ser Gln
305 310 315 320
Val Lys Arg Asp Lys Gln Cys Ile Thr Thr Glu Asn Cys Phe Asp Asn
325 330 335
Tyr Asp Leu Tyr Leu Asn Tyr Ile Lys Gly Gly Pro Gly Phe Gly Asp
340 345 350
Pro Ile Glu Arg Asp Leu Asn Ala Ile Leu Glu Asp Leu Asn Ser Lys
355 360 365
Gln Leu Leu Pro Glu Tyr Ala Tyr Lys Val Tyr Gly Ala Val Val Ser
370 375 380
Gln Asn Lys Asp Gly Val Trp Val Gly Asp Glu Ala Lys Thr Lys Ala
385 390 395 400
Arg Arg Lys Glu Ile Leu Glu Asn Arg Lys Ala Arg Ser Ile Pro Val
405 410 415
Lys Gln Trp Met Glu Gln Glu Arg Asn Ala Ile Leu Glu Lys Glu Ala
420 425 430
Ser Lys Gln Val Lys His Met Tyr Ala Thr Ser Phe Asp Leu Ser Pro
435 440 445
Lys Phe Leu Ser Asp Phe Lys Thr Phe Trp Asn Leu Pro Lys Ser Trp
450 455 460
Thr Met Lys Glu Asp Glu Leu Gly Val Phe Thr Tyr Gly Ser Lys Tyr
465 470 475 480
Arg Met Asp Leu Ser Lys Leu Pro Asp Val Arg Thr Val Leu Leu Val
485 490 495
Asp Glu Lys
(2) INFORMATION FOR SEQ ID NO: 269:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 168 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g9ORF5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:269:
Met Val Met Ser Lys Tyr Thr Gln Glu Gln Ile Lys Asn Leu Val Glu
1 5 10 15
Gly Asn Leu Asp Trp Asn Thr Val Leu Lys Met Leu Ser Met Pro Lys
20 25 30
Asp His Glu Arg Phe Gln Met Tyr Leu Lys Val Leu Gln Asp Lys Val
35 40 45
Asp Phe Asp Asp Lys Ile Val Leu Pro Leu Gly Pro His Leu Phe Val
50 55 60
Val Gln Asp Ser Gln Lys Lys Trp Val Ile Lys Cys Ser Cys Gly His
65 70 75 80
Ala Phe Cys Ala Pro Glu Glu Asn Trp Lys Leu His Ala Asn Ile Tyr
85 90 95
Val His Asp Thr Ala Glu Lys Met Glu Glu Val Tyr Pro Lys Leu Leu
100 105 110
Ala Ser Asp Thr Asn Trp Gln Val Tyr Arg Glu Tyr Ile Cys Pro Asp
115 120 125
Cys Gly Ile Leu Leu Asp Val Glu Ala Pro Thr Pro Trp Tyr Pro Val
130 135 140
Ile His Asp Phe Glu Pro Asp Ile Glu Val Phe Tyr Lys Glu Trp Leu
145 150 155 160
Gly Ile Gln Pro Pro Glu Arg Arg
165
(2) INFORMATION FOR SEQ ID NO: 270:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1624 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: k4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:
GAATTCCGGG AACGCTAACA CTTCCAATTG GTTGCAAGGC GGTGGTATCA ACCAGGATGG 60
AGAAATCCAT GCGGTGAATA GCTTTGAGAC AAGTTCTTGT GGGACTGGAG CTTGCGCGAT 120
AAAAGACGGC TTAAATCACG CAGCGGCTAT TTGGAATCCA AAAGGCGATA TGGGCGATGT 180
TGAAATTTGG GAAATGGCAG AGCCTCTTCT TTATTTGGGC AGGAATGTCA AAGCCAATAC 240
CGGTGGGTAT GGGAAATATC GAGGCGGTAA CGGGTTTGAA ACCTTAAGAA TGGTGTGGGG 300
TGTGCATGAT TGGACCATGT TCTTTATGGG TAATGGCTAC ATGAATAGCG ATTGGGGTAT 360
GATGGGGGGC TATCCACCGG CCAGTGGCTA TAGGTTTGAA GCGCACAACA CCGACTTGGA 420
AAACAGGATT AAAAATAACG CCAGCTTGCC TTTGGGGGGC GATTTTAACC CAACGGATAG 480
AGATTATGAA AAGCATATTT CTCATGCGTC TCAAGTCAAA AGGGATAAGC AATGCATCAC 540
CACTGAGAAC TGCTTTGACA ATTATGATTT GTATTTGAAT TACATCAAAG GTGGTCCTGG 600
ATTTGGCGAT CCGATTGAAA GGGATTTGAA TGCGATTTTA GAAGATCTCA ACAGCAAACA 660
GCTATTGCCA GAATACGCTT ACAAGGTTTA TGGCGCAGTG GTGAGTCAAA ATAAAGACGG 720
CGTGTGGGTC GGCGATGAAG CCAAAACGAA AGCCAGAAGA AAAGAAATTC TTGAAAACAG 780
AAAGGCTAGA TCCATACCGG TAAAACAATG GATGGAGCAA GAAAGAAACG CTATCCTTGA 840
AAAAGAGGCT TCCAAACAGG TTAAGCACAT GTATGCGACT AGCTTTGATC TTTCGCCTAA 900
GTTTTTAAGC GATTTTAAAA CATTTTGGAA CTTGCCAAAG AGCTGGACTA TGAAAGAAGA 960
TGAGCTTGGC GTATTCACCT ATGGTTCTAA ATACAGGATG GATTTGAGCA AATTGCCTGA 1020
TGTGCGCACA GTTCTGTTGG TTGATGAGAA ATAAAGAAAG GAGAATGGTT ATGTCAAAAT 1080
ACACACAAGA ACAAATTAAA AATTTGGTAG AGGGGAACTT GGATTGGAAC ACTGTCTTAA 1140
AAATGCTTAG CATGCCTAAA GATCATGAAA GATTCCAAAT GTATTTGAAG GTGTTGCAAG 1200
ATAAGGTAGA TTTTGATGAC AAAATCGTCT TACCCTTGGG GCCACATTTG TTTGTGGTGC 1260
AAGATTCTCA AAAGAAATGG GTCATTAAGT GTTCATGCGG TCATGCATTC TGTGCTCCAG 1320
AAGAAAACTG GAAATTGCAT GCAAACATCT ATGTGCACGA TACAGCAGAA AAAATGGAAG 1380
AAGTGTATCC TAAACTCTTA GCCAGTGATA CTAACTGGCA AGTGTATCGG GAGTATATTT 1440
GCCCGGATTG CGGCATCCTT TTAGATGTTG AAGCCCCAAC TCCTTGGTAT CCTGTGATCC 1500
ATGATTTTGA GCCTGATATA GAGGTGTTTT ATAAAGAATG GCTAGGCATA CAGCCCCCAG 1560
AAAGACGCTA AAATCGCTCA ATCCTTTTTA TGAAGAGGCT TTATGCCTCT TGGCGCTAAA 1620
AGCT 1624
(2) INFORMATION FOR SEQ ID NO: 271:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1899 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: g11
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:
AGCTTATGCG GCTGAGCCAA GCAGGCAAAC CCTCTCTAAA GTCAGTAACC GCTTCAAAGA 60
GCATGGCGCT AAATTCGATC TTCGTGTGAT GGCAACACAT GGAGGCACCA TTAGCTGGAA 120
AGCTAAAGAG CTCGCTAGGA CTATTGTGAG CGGCCCTATT GGAGGCGTGA TCGGATCTAA 180
ATTGCTAGGC GAAACGCTTG GTTATGACAA TATTGCATGC AGCGATATTG GCGGCACGAG 240
CTTTGATATG GCGCTTATCG TTAAGAGCAA TTTTAATATC GCTTCTGACC CTGATATGGC 300
AGTGCGGGGA GTTTCGTGCG CATTGATCCG CACAGCCGAT CGGTCAAACT AGGGCCTGAC 360
AGCGCGGGGT ATAAAGTTGG CACTTGTTGG AAAGACAGCG GATAGACACG GTTTCAGTAA 420
CCGATTGCCA TATTGTTTTA GGCTATTTGA ACCCAGATTA TTTCTTAGGC GGTTTGATCA 480
AATTAGATGT GGATATGGCT AAAAAACACA TTAAAGAGCA AATCGCTGAT CCGCTAGGCA 540
TTATCGTAGA AGATGCGGCT GCGGGTGTGA TTGAGCTGCT TGATCCGGAG CTTAAAGAAT 600
ACTTGCGATC CAATATTAGC GCTAAAGGGT ATAGCCCATC TGATTTTGTG TGCTTTTCAT 660
ATGGTGGCGC GGGGCCTGTG CATACATATG GCTATACAGA AAGCTTAGGT TTTAAGGATG 720
TGGTAGTGCC TGCGTGGGCG GCTGGATTTA GCGCTTTTGG TTGCGCTTGC GCTGATTTTG 780
AATACAGATA CGACAAGAGC GTGGATATTG CCATTCCGCA GTATTCTTCA GACGAGTCCA 840
AAATAGAAGC ATGCAAGATC ATTCAAGACG CATGGGATGA ATTGACTCTC AAAGTGATTG 900
AAGAGTTCAA GATCAATGGA TTTTCTCAAA AAGATGTGAT CTTAAGACCT GGATACAGGA 960
TGCAGTATAT GGGGCAATTG AATGATTTAG AGATCACTTC TCCTGTGTCA AAAGCTGCAA 1020
GCGTGGCTGA TTGGGAAGAG ATTGTCAAAG AATATGAAAA AACCTACGCT CGTGTTTATT 1080
CTGAATCAGC GTGTTCTCCA GAGCTTGGTT TTAGCGTGAC CGGCGTGATC ATGCGTGGTG 1140
TTGTGGCTAC GCAAAAACCT GTGATTCCGG TTGAAAAAGA GCATGGCGCT ACGCCCCCAA 1200
AAGAAGCCAA AATAGGCGTT AGAAAATTCT ATCGGCATAA AAAATGGGTG GATGCAGATG 1260
TGTGGCAAAT GGAAAAATTA TTGCCTGGAA ATGAAGTCAT AGGGCCTGCG ATCGTGGAAT 1320
CAGATGCGAC CACTTTCGTG ATACCCAAAG GCTTTGCGAC AAGACTAGAC AAACACCGAT 1380
TGTTCCACTT GAAAGAAATT AAATAAAGGA GTTCAAAATG GCAAATTTAT TGAAAAACGG 1440
CAAAACTTTA AAACAAGCTA GAGATGAAAT CCTAGCCAGG ACAGAAAAAA CAGGGCATTA 1500
TAATGGTCTC AAAAAACTAG AGTTTAAGGA AAGAGATCCG ATTGGTTATG AGAAGATGTT 1560
CTCTAAATTA AGAGGCGGTA TCGTGCATGC CAGAGAAACG GCTAAAAGGA TTGCGGCAAG 1620
CCCTATTGTT GAGCAAGAGG GAGAATTGTG CTTCACGCTT TATAACGCTG TGGGCGATAG 1680
CGTGCTGACT TCTACGGGTA TCATTATCCA TGTAGGGACT ATGGGATCAG CTATCAAATA 1740
CATGGTAGAG AATAATTGGG AAGACAACCC AGGCATCAAT GACAAGGATA TTTTCACCAA 1800
TAACGACTGT GCGATTGGGA ATGTGCACCC ATGCGATATT ATGACTCTTG TGCCTATTTT 1860
CCACGATGAA AAATTGATTG GGTGGGTAGG TGGTGTTAC 1899
(2) INFORMATION FOR SEQ ID NO:272:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y104F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:272:
CGCGCCATGG CCATGAAAAA AAATATCTTA AATTTAGCG 39
(2) INFORMATION FOR SEQ ID NO:273:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y104R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:273:
GTGCGGATCC CTTGTTGAGA ACAATTTTAG CGTGC 35
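Each primer record states a length and carries a short 5' tail ahead of the gene-specific portion. The sketch below checks the stated length against the listed bases for primers Y104F and Y104R (SEQ ID NOs: 272 and 273) and reports where the motifs CCATGG and GGATCC occur. These motifs match the recognition sequences of NcoI and BamHI; that the tails were designed as cloning sites for those enzymes is an assumption made here, and the helper names `clean` and `check_primer` are hypothetical.

```python
# Recognition sequences of two common restriction enzymes (assumed relevant here).
SITES = {"NcoI": "CCATGG", "BamHI": "GGATCC"}

def clean(seq):
    """Remove the listing's block spacing and any trailing position count."""
    return "".join(ch for ch in seq.upper() if ch in "ACGT")

def check_primer(seq, stated_length):
    """Compare the stated length with the listed bases and locate known motifs.

    Returns the actual length, whether it matches the stated length, and the
    0-based position of the first occurrence of each motif that is present.
    """
    bases = clean(seq)
    found = {name: bases.find(motif) for name, motif in SITES.items() if motif in bases}
    return {
        "length": len(bases),
        "length_ok": len(bases) == stated_length,
        "sites": found,
    }

# Sequences and stated lengths copied from SEQ ID NOs: 272 and 273.
y104f = check_primer("CGCGCCATGG CCATGAAAAA AAATATCTTA AATTTAGCG 39", 39)
y104r = check_primer("GTGCGGATCC CTTGTTGAGA ACAATTTTAG CGTGC 35", 35)
print(y104f)
print(y104r)
```

The same check can be run over the remaining primer records to flag any entry whose listed bases disagree with its stated length.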
(2) INFORMATION FOR SEQ ID NO:274:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y108F2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:274:
CGCGCCATGG GAATGCTCAC CCAAGAAGTT G 31
(2) INFORMATION FOR SEQ ID NO: 275:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y108R2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:275:
CGCGCCATGG CAATCGCTCG CATAAGCATT GGG 33
(2) INFORMATION FOR SEQ ID NO: 276:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y124F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:276:
CCGCGCCATG GAGGGCATGC AATTTGATAG A 31
(2) INFORMATION FOR SEQ ID NO: 277:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y124R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:277:
GGCGCGGATC CACCGCCAGA GAGTTTAGCC AA 32
(2) INFORMATION FOR SEQ ID NO:278:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y128F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:
CGCGCCATGG GAATGGTTAA TGGCAAGCCA GAC 33
(2) INFORMATION FOR SEQ ID NO: 279:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y128R1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:
CGCGGATCCA GTCGCAAGCT TTAGAGTAGT 30
(2) INFORMATION FOR SEQ ID NO: 280:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y128F2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280:
CGCGCCATGG GAATGGTATT AGGAAGCTTA CACC 34
(2) INFORMATION FOR SEQ ID NO: 281:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y128R4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:
GGCGCGGATC CAAGTTCTAT TTTCAATTCC TTG 33
(2) INFORMATION FOR SEQ ID NO:282:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y128F2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:
CGCGCCATGG GAATGGTATT AGGAAGCTTA CACC 34
(2) INFORMATION FOR SEQ ID NO:283:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y152R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:283:
GGCGCGGATC CATTCCTTTC TACCGCTTGT TT 32
(2) INFORMATION FOR SEQ ID NO:284:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y143F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:284:
CCGCGCCATG GGACAAAAAG CGGTTGATAA CGAT 34
(2) INFORMATION FOR SEQ ID NO: 285:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y139R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285:
GGCGCGGATC CCATTCCACT ATAATAAATA TC 32
(2) INFORMATION FOR SEQ ID NO: 286:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y146F2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:286:
CGCGCCATGG GAATGGTATG CACCCCTATT GTGG 34
(2) INFORMATION FOR SEQ ID NO:287:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y146R2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287:
GCGCGGATCC AAGCTTTTCA ACTTTGATGA G 31
(2) INFORMATION FOR SEQ ID NO: 288:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y146F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:288:
CGCGCCATGG GAATGAACCT TTCTGAAATT G 31
(2) INFORMATION FOR SEQ ID NO: 289:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y146R3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:
GGCGCGGATC CAAGCTTTTC AACTTTGATG AG 32
(2) INFORMATION FOR SEQ ID NO: 290:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y108F2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:
CGCGCCATGG GAATGCTCAC CCAAGAAGTT G 31
(2) INFORMATION FOR SEQ ID NO: 291:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y150R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291:
GGCGCGGATC CGTTTTTAGC CAAATTTTTA GT 32
(2) INFORMATION FOR SEQ ID NO: 292:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y153F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:292:
CGCGCCATGG GAATGGGCGA TGTTGAAATT TGG 33
(2) INFORMATION FOR SEQ ID NO:293:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y153R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:293:
GCGCGGATCC TTTCTCATCA ACCAACAGAA C 31
(2) INFORMATION FOR SEQ ID NO: 294:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y173F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:294:
CGCGCCATGG GAATGGCAAT TTCAAAAGAA GAAG 34
(2) INFORMATION FOR SEQ ID NO: 295:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y173R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:
GCGCGGATCC CTTGACTTCA ACCTTAGCAC C 31
(2) INFORMATION FOR SEQ ID NO:296:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y175F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:296:
CGCGCCATGG GAATGAAGAG ATCTTCTGCA TTTAG 35
(2) INFORMATION FOR SEQ ID NO: 297:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y175R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297:
GCGCGGATCC CTTCACTAAT TTGACATCCA C 31
(2) INFORMATION FOR SEQ ID NO: 298:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y184F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:
CCGCGCCATG GGAATGAATT TATTTGAAAA AATG 34
(2) INFORMATION FOR SEQ ID NO: 299:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y184R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299:
GGCGCGGATC CGCCTTTTCT TAAAGATTCT AA 32
(2) INFORMATION FOR SEQ ID NO: 300:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y212F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:
CCGCGCCATG GGAAAAAAAG TAGGCGGTAA AGAA 34
(2) INFORMATION FOR SEQ ID NO: 301:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y212R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:
GGCGCGGATC CAATGCCTTC TTCAACCGCC GC 32
(2) INFORMATION FOR SEQ ID NO: 302:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y261F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:302:
CCGCGCCATG GGAGTTACTC AAGGCATTGT TTCA 34
(2) INFORMATION FOR SEQ ID NO: 303: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y241R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:303:
GGCGCGGATC CTTTCACCAA AATGATCCTA TA 32
(2) INFORMATION FOR SEQ ID NO: 304: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer Y28F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:304:
CCGCGCCATG GGAATGAAAA TTTTAGTGAT TCAA 34
(2) INFORMATION FOR SEQ ID NO: 305: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer Y28R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305:
GGCGCGGATC CGGCTTCTTG GAACGCTTTC AT 32
(2) INFORMATION FOR SEQ ID NO: 306: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y291F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:306:
CCGCGCCATG GGAATGAAGT TTCAACCATT AGGA 34
(2) INFORMATION FOR SEQ ID NO: 307:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y291R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:307:
GGCGCGGATC CGCAACCCTC ACTGATTTTA TG 32
(2) INFORMATION FOR SEQ ID NO: 308:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y56F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:
CCGCGCCATG GGACCTGGAT TTGGCGATCC GATT 34
(2) INFORMATION FOR SEQ ID NO: 309:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y117R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:
GGCGCGGATC CCGCATACAT GTGCTTAACC TG 32
(2) INFORMATION FOR SEQ ID NO: 310:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Y89F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310:
CCGCGCCATG GGAATCAAAA CCATCAGTTT TGGT 34
(2) INFORMATION FOR SEQ ID NO: 311:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer Y89R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:
GGCGCGGATC CCTTCACTAA TTTGACATCC AC 32
(2) INFORMATION FOR SEQ ID NO: 312: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer Y98F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:312:
CCGCGCCATG GGAATGAAAA AACCCTACAG AAAG 34
(2) INFORMATION FOR SEQ ID NO:313: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer Y98R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:
GGCGCGGATC CGCCGTTATC AATTTTGGCC TC 32
(2) INFORMATION FOR SEQ ID NO: 314: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: primer Z14F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:
CGCGCCATGG GAGAATTCCC TGGTGATGAC ACTC 34
(2) INFORMATION FOR SEQ ID NO: 315: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z14R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:315:
GCGCGGATCC CATTATCGCC AGGCATAACC A 31
(2) INFORMATION FOR SEQ ID NO:316:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z25F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316:
CGCGCCATGG GAGAATTCCC TTTGACTTTC CCTAG 35
(2) INFORMATION FOR SEQ ID NO: 317:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z25R3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:317:
GGCGCGGATC CACCCGCTTG AATTTTTTGA GC 32
(2) INFORMATION FOR SEQ ID NO:318:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z25F3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318:
CCGCGCCATG GGATACATTC CTTATGTA 28
(2) INFORMATION FOR SEQ ID NO: 319:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z25R4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:
GGCGCGGATC CATTCATCAA ACCCTTAAGC CT 32
(2) INFORMATION FOR SEQ ID NO: 320:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z25F3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320:
CCGCGCCATG GGATACATTC CTTATGTA 28
(2) INFORMATION FOR SEQ ID NO: 321: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer Z25R5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:321:
GGCGCGGATC CCACGATAGA AGACGCCACG CT 32
(2) INFORMATION FOR SEQ ID NO:322: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1857 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: f11
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:322:
TTAAATACTA CATAGAAAAT AGTGAAAATA TTGGCGATAT TGCGTTTTAA GTGAAAAACC 60
CCCTATCCCC TTAAGAGAGC TTAATTAATG GTCAATCATT TTAAATTATT TTAACATAAA 120
GACTTATTTT AAAATTTTGT TTCAATATGA AATTTTAATC CCTTGTAAAA GCTTTAAATT 180
GTGAAAAACA AGCAAACAAC CCCCTTAAAT TGTATAATAA AAACACTTTA AGCCAACAAA 240
GGATAATCAT GAGAAAACGC TTCACTCCAC TTTTGTTATT CAGGAAATAA TACAGATGAG 300
AAAAACGATT TCAGCGTTGT TTTTATCAGC GTGCATAGGG TTATCGTCTG TTTATGCAGA 360
TAACGCTTTG ATTTTGCAAA CCGATTTTAG TCTAAAAGAT GGGGCCGTCT CGGCGATGAA 420
AGGCGTCGCT TTCAGCGTTG ATTCCCATCT TAAAATCTTT GATTTAACGC ACGAAATCCC 480
CCCGTATAAC ATCTGGGAAG GCGCTTACCG CTTGTATCAG ACCGCCAGTT ATTGGCCAAA 540
AGGTTCGGTA TTTGTGAGCG TAGTTGATCC GGGCGTAGGC ACTAAGCGTA AATCGGTGGT 600
ACTAAAAACT AAAAACGGCC AGTATTTCGT CTCGCCGGAT AACGGCACGC TGACTTTGGT 660
GGCACAAACT TTGGGGATTG ATAGCGTGCG TGAAATTGAT GAAAAAGCTA ACCGCTTGAA 720
AGGTTCTGAA AAATCCTATA CTTTCCATGG TCGTGATGTG TATGCTTACA CCGGTGCACG 780
CTTGGCTTCT GGGGCGATCA CATTCGAGCA GGTCGGGCCA GAGCTTCCCC CAAAAGTCGT 840
TGAAATTCCT TACCAAAAAG CGAAAGCCAC AAAAGGGGAA GTGAAAGGTA ATATCCCGAT 900
TTTGGATATT CAATATGGCA ATGTTTGGAG CAACATCAGC GATAAATTAC TCAATCAAGC 960
AAAAATCAAA CTCAATGACA CGCTGTGTGT AACGATTTTT AAAGGTTTTA AGAACCAATA 1020
CGAAGGGAAA ATGCCGTATG TTGCGAGCTT TGGCGATGTG CCAGAAGCCC AGCCGTTGGT 1080
TTATTTAAAC AGCTTGTTGA ATGTTTCCGT GGCGCTGAAT AGGGATAATT TCGCGCAAAA 1140
ATACCAAATT AAATCCGGTG CTGACTGGAA TATTGATATC AAGAAGTGCG CTAAGTAAAG 1200
CGCTTTTTAG AAAATTAAGG GGAGTTAGAG ATTTTTAACT ATCAAACTTT GATTAACATA 1260
GCCTAATTAA TGATAAATAA ACTATAAAAC AAACGCTTGT TGTTAAAATT TTGTTTTAAT 1320
ATGAAACACT AATCCCTTGT AAAAACTTTA ATTTTCAAAA AATAAACAAA CCATCCCCTT 1380
AAATTGTATA ATAAAGACAC TATAAACTAA AAAGAGATAA CCATGAGAAA ACTATTCATC 1440
CCACTTTTAT TATTCAGCGC TTTAGAAGCG AATGAGAAAA ATGGCTTTTT CATAGAAGCC 1500
GGGTTTGAAA CTGGGCTATT AGAAGGCCGT CAAACGCAAG AAAAAAGACA CACCACCACA 1560
AAAAACACTT ACGCAACTTA TAATTATTTA CCCACAGACA CGATTTTAAA AAGAGCGGCT 1620
AATTTGTTCA CGGACGCTAA ATCCATATCA CAATTAAATT TTTCATCTTT ATCTCCTGTT 1680
AAAGTGTTGT ATATCGGTGG TAAATTAACT ATAGAAAACT TCCTACCTTA TAATTTAAGC 1740
AACGTGAAGC TTAGTTTTAC AGACGCTCAA GGCAATGTAA TCAATCTAGG CGTGATAGAG 1800
ACTATCCCCA AACACTCTAA AATTGTTTTA CCCTGGGAGG CATTTGATAG CCTAAAA 1857
(2) INFORMATION FOR SEQ ID NO: 323:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 145 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: antigen 1 encoded by f11ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:323:
Met Arg Lys Leu Phe Ile Pro Leu Leu Leu Phe Ser Ala Leu Glu Ala
1 5 10 15
Asn Glu Lys Asn Gly Phe Phe Ile Glu Ala Gly Phe Glu Thr Gly Leu
20 25 30
Leu Glu Gly Arg Gln Thr Gln Glu Lys Arg His Thr Thr Thr Lys Asn
35 40 45
Thr Tyr Ala Thr Tyr Asn Tyr Leu Pro Thr Asp Thr Ile Leu Lys Arg
50 55 60
Ala Ala Asn Leu Phe Thr Asp Ala Lys Ser Ile Ser Gln Leu Asn Phe
65 70 75 80
Ser Ser Leu Ser Pro Val Lys Val Leu Tyr Ile Gly Gly Lys Leu Thr
85 90 95
Ile Glu Asn Phe Leu Pro Tyr Asn Leu Ser Asn Val Lys Leu Ser Phe
100 105 110
Thr Asp Ala Gln Gly Asn Val Ile Asn Leu Gly Val Ile Glu Thr Ile
115 120 125
Pro Lys His Ser Lys Ile Val Leu Pro Trp Glu Ala Phe Asp Ser Leu
130 135 140
Lys
145
(2) INFORMATION FOR SEQ ID NO:324:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: antigen 2 encoded by f11ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:
Met Arg Lys Thr Ile Ser Ala Leu Phe Leu Ser Ala Cys Ile Gly Leu
1 5 10 15
Ser Ser Val Tyr Ala Asp Asn Ala Leu Ile Leu Gln Thr Asp Phe Ser
20 25 30
Leu Lys Asp Gly Ala Val Ser Ala Met Lys Gly Val Ala Phe Ser Val
35 40 45
Asp Ser His Leu Lys Ile Phe Asp Leu Thr His Glu Ile Pro Pro Tyr
50 55 60
Asn Ile Trp Glu Gly Ala Tyr Arg Leu Tyr Gln Thr Ala Ser Tyr Trp
65 70 75 80
Pro Lys Gly Ser Val Phe Val Ser Val Val Asp Pro Gly Val Gly Thr
85 90 95
Lys Arg Lys Ser Val Val Leu Lys Thr Lys Asn Gly Gln Tyr Phe Val
100 105 110
Ser Pro Asp Asn Gly Thr Leu Thr Leu Val Ala Gln Thr Leu Gly Ile
115 120 125
Asp Ser Val Arg Glu Ile Asp Glu Lys Ala Asn Arg Leu Lys Gly Ser
130 135 140
Glu Lys Ser Tyr Thr Phe His Gly Arg Asp Val Tyr Ala Tyr Thr Gly
145 150 155 160
Ala Arg Leu Ala Ser Gly Ala Ile Thr Phe Glu Gln Val Gly Pro Glu
165 170 175
Leu Pro Pro Lys Val Val Glu Ile Pro Tyr Gln Lys Ala Lys Ala Thr
180 185 190
Lys Gly Glu Val Lys Gly Asn Ile Pro Ile Leu Asp Ile Gln Tyr Gly
195 200 205
Asn Val Trp Ser Asn Ile Ser Asp Lys Leu Leu Asn Gln Ala Lys Ile
210 215 220
Lys Leu Asn Asp Thr Leu Cys Val Thr Ile Phe Lys Gly Phe Lys Asn
225 230 235 240
Gln Tyr Glu Gly Lys Met Pro Tyr Val Ala Ser Phe Gly Asp Val Pro
245 250 255
Glu Ala Gln Pro Leu Val Tyr Leu Asn Ser Leu Leu Asn Val Ser Val
260 265 270
Ala Leu Asn Arg Asp Asn Phe Ala Gln Lys Tyr Gln Ile Lys Ser Gly
275 280 285
Ala Asp Trp Asn Ile Asp Ile Lys Lys Cys Ala Lys
290 295 300
(2) INFORMATION FOR SEQ ID NO: 325: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer GP for c5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325:
CGCCATGGCT TATAAAGACT GCGTA 25
(2) INFORMATION FOR SEQ ID NO: 326: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer GQ for c5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:326:
GCGGATCCGG GAGATTTTCT AAGACA 26
(2) INFORMATION FOR SEQ ID NO: 327: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer GR for g2ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327:
CGCCATGGGC AACAAGGTTA TAACTAAT 28
(2) INFORMATION FOR SEQ ID NO: 328:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer GS for g2ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:328:
GCGGATCCTT TCGCGCTCCT TGTTGTG 27
(2) INFORMATION FOR SEQ ID NO: 329:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 32 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: forward primer GJ for c2 (minus transmembrane region)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329:
CGCCATGGGA AAACACCACA AAGAAAAAAA AG 32
(2) INFORMATION FOR SEQ ID NO: 330:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 29 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: reverse primer Gl for c2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:
GCGGATCCAA AAGCCGGGAT AATCGCCCC 29
(2) INFORMATION FOR SEQ ID NO: 331:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer GV for c13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331:
CGCCATGGCT TTGGAGCGTT TTAAGG 26
(2) INFORMATION FOR SEQ ID NO: 332:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 26 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer GW for c13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:332:
GCGGATCCAG GGCATTTTAT ATTGAG 26
(2) INFORMATION FOR SEQ ID NO:333: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer GL for f3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:333:
CGCCATGGAA TTCCGTTTTT ATGCGCGT 28
(2) INFORMATION FOR SEQ ID NO:334: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer GM for f3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:334:
GCGGATCCGA ACAACAAGCG TTCCTC 26
(2) INFORMATION FOR SEQ ID NO: 335: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer GN for g9ORF4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:
CGCCATGGGC CTAGAACACG ATCATGCA 28
(2) INFORMATION FOR SEQ ID NO: 336: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer GO for g9ORF4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:336:
GCGGATCCTT TCTCATCAAC CAACAG 26
(2) INFORMATION FOR SEQ ID NO: 337:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: forward primer HO for d11ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 337:
CGCCATGGCG TTTAATTATG ATGAA 25
(2) INFORMATION FOR SEQ ID NO:338:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 29 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: reverse primer HP for d11ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:338:
GCGGATCCTT GTTTTCCTTG TGCTTTTTC 29
(2) INFORMATION FOR SEQ ID NO: 339:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 29 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: forward primer HY for b8c7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 339:
CCATGGGACC CAAAGTTAAA GAGCGAGTG 29
(2) INFORMATION FOR SEQ ID NO: 340:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 72 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 02c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 340:
Val Phe Asn Leu Asn Leu Ile Thr Leu Asn Ser Lys Tyr Ile Arg Arg
1 5 10 15
Leu Thr Val Asn Gly Ser Asn Gln Met Pro Arg Phe Ser Thr Asp Gly
20 25 30
Arg Asn Ile Met Tyr Ile Lys Lys Thr Pro Gln Glu Tyr Ala Met Gly
35 40 45
Leu Ile Leu Leu Asp Tyr Asn Gln Ser Phe Leu Phe Pro Leu Lys Asn
50 55 60
Val Lys Ile Gln Ala Phe Asp Trp
65 70
(2) INFORMATION FOR SEQ ID NO: 341:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 299 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 03, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 341:
Met Lys Lys Asn Ile Leu Asn Leu Ala Leu Val Gly Ala Leu Ser Ala
1 5 10 15
Ser Phe Leu Met Ala Lys Pro Ala His Asn Ala Asp Asn Ala Thr His
20 25 30
Asn Thr Lys Lys Thr Thr Asp Ser Ser Pro Gly Val Leu Ala Thr Val
35 40 45
Asp Gly Arg Pro Ile Thr Lys Ser Asp Phe Asp Met Ile Lys Gln Arg
50 55 60
Asn Pro Asn Phe Asp Phe Asp Lys Leu Lys Glu Lys Glu Lys Glu Ala
65 70 75 80
Leu Ile Glu Gln Ala Ile Arg Thr Ala Leu Val Glu Asn Glu Ala Lys
85 90 95
Ala Glu Lys Leu Asp Gln Thr Pro Glu Phe Lys Ala Met Met Glu Ala
100 105 110
Val Lys Lys Gln Ala Leu Val Glu Phe Trp Ala Lys Lys Gln Ala Glu
115 120 125
Glu Val Lys Lys Val Gln Ile Pro Glu Lys Glu Met Gln Asp Phe Tyr
130 135 140
Asn Ala Asn Lys Asp Gln Leu Phe Val Lys Gln Glu Ala His Ala Arg
145 150 155 160
His Ile Leu Val Lys Thr Glu Asp Glu Ala Lys Arg Ile Ile Ser Glu
165 170 175
Ile Asp Lys Gln Pro Lys Ala Lys Lys Glu Ala Lys Phe Ile Glu Leu
180 185 190
Ala Asn Arg Asp Thr Ile Asp Pro Asn Ser Lys Asn Ala Gln Asn Gly
195 200 205
Gly Asp Leu Gly Lys Phe Gln Lys Asn Gln Met Ala Pro Asp Phe Xaa
210 215 220
Lys Ala Ala Phe Ala Leu Thr Xaa Gly Asp Tyr Thr Lys Thr Pro Val
225 230 235 240
Lys Thr Glu Phe Gly Tyr His Ile Ile Tyr Leu Ile Ser Lys Asp Ser
245 250 255
Pro Val Thr Tyr Thr Tyr Glu Gln Ala Lys Pro Thr Ile Lys Gly Met
260 265 270
Leu Gln Glu Lys Leu Phe Gln Glu Arg Met Asn Gln Arg Ile Glu Glu
275 280 285
Leu Arg Lys His Ala Lys Ile Val Ile Asn Lys
290 295
(2) INFORMATION FOR SEQ ID NO: 342:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 443 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 04a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:342:
Val Ser Pro Leu Lys Thr Ile Arg Ile Tyr Ser Tyr His Asp Ser Ile
1 5 10 15
Lys Asp Ser Ile Lys Ala Val Val Asn Ile Ser Thr Glu Lys Lys Ile
20 25 30
Lys Asn Asn Phe Ile Gly Gly Gly Val Phe Asn Asp Pro Phe Phe Gln
35 40 45
Gln Phe Phe Gly Asp Leu Gly Gly Met Ile Pro Lys Glu Arg Met Glu
50 55 60
Arg Ala Leu Gly Ser Gly Val Ile Ile Ser Lys Asp Gly Tyr Ile Val
65 70 75 80
Thr Asn Asn His Val Ile Asp Gly Ala Asp Lys Ile Lys Val Thr Ile
85 90 95
Pro Gly Ser Asn Lys Glu Tyr Ser Ala Thr Leu Val Gly Thr Asp Ser
100 105 110
Glu Ser Asp Leu Ala Val Ile Arg Ile Thr Lys Asp Asn Leu Pro Thr
115 120 125
Ile Lys Phe Ser Asp Ser Asn Asp Ile Ser Val Gly Asp Leu Val Phe
130 135 140
Ala Ile Gly Asn Pro Phe Gly Val Gly Glu Ser Val Thr Gln Gly Ile
145 150 155 160
Val Ser Ala Leu Asn Lys Ser Gly Ile Gly Ile Asn Ser Tyr Glu Asn
165 170 175
Phe Ile Gln Thr Asp Ala Ser Ile Asn Pro Gly Asn Ser Gly Gly Ala
180 185 190
Leu Ile Asp Ser Arg Gly Gly Leu Val Gly Ile Asn Thr Ala Ile Ile
195 200 205
Ser Lys Thr Gly Gly Asn His Gly Ile Gly Phe Ala Ile Pro Ser Asn
210 215 220
Met Val Lys Asp Thr Val Thr Gln Leu Ile Lys Thr Gly Lys Ile Glu
225 230 235 240
Arg Gly Tyr Leu Gly Val Gly Leu Gln Asp Leu Ser Gly Asp Leu Gln
245 250 255
Asn Ser Tyr Asp Asn Lys Glu Gly Ala Val Val Ile Ser Val Glu Lys
260 265 270
Asp Ser Pro Ala Lys Lys Ala Gly Ile Leu Val Trp Asp Leu Ile Thr
275 280 285
Glu Val Asn Gly Lys Lys Val Lys Asn Thr Asn Glu Leu Arg Asn Leu
290 295 300
Ile Gly Ser Met Leu Pro Asn Gln Arg Val Thr Leu Lys Val Ile Arg
305 310 315 320
Asp Lys Lys Glu Arg Ala Phe Thr Leu Thr Leu Ala Glu Arg Lys Asn
325 330 335
Pro Asn Lys Lys Glu Thr Ile Ser Ala Gln Asn Gly Ala Gln Gly Gln
340 345 350
Leu Asn Gly Leu Gln Val Glu Asp Leu Thr Gln Lys Thr Lys Arg Ser
355 360 365
Met Arg Leu Ser Asp Asp Val Gln Gly Val Leu Val Ser Gln Val Asn
370 375 380
Glu Asn Ser Pro Ala Glu Gln Ala Gly Phe Arg Gln Gly Asn Ile Ile
385 390 395 400
Thr Lys Ile Glu Glu Val Glu Val Lys Ser Val Ala Asp Phe Asn His
405 410 415
Ala Leu Glu Lys Tyr Lys Gly Lys Pro Lys Arg Phe Leu Val Leu Asp
420 425 430
Leu Asn Gln Gly Tyr Arg Ile Ile Leu Val Lys
435 440
(2) INFORMATION FOR SEQ ID NO:343:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 04b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:343:
Leu Ala Asn Lys Ala Ser Leu Asn Ala Gly Asn Ile Gln Ile Gln Asn
1 5 10 15
Met Pro Lys Val Lys Glu Arg Val Ser Val Pro Ser Lys Asp Asp Thr
20 25 30
Asp Leu Phe Leu Pro Arg Phe Tyr
35 40
(2) INFORMATION FOR SEQ ID NO:344:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 28 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 04c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:344:
Met Ser Leu Ile Arg Val Asn Gly Glu Ala Phe Lys Leu Ser Leu Glu
1 5 10 15
Ser Leu Glu Glu Asp Pro Phe Glu Thr Lys Glu Thr
20 25
(2) INFORMATION FOR SEQ ID NO: 345: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 155 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 05a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 345:
Met Arg Ile Ile Glu Gly Lys Leu Gln Leu Gln Gly Asn Glu Arg Val
1 5 10 15
Ala Ile Leu Thr Ser Arg Phe Asn His Ile Ile Thr Asp Arg Leu Gln
20 25 30
Glu Gly Ala Met Asp Cys Phe Lys Arg His Gly Gly Asp Glu Asp Leu
35 40 45
Leu Asp Ile Val Leu Val Pro Gly Ala Tyr Glu Leu Pro Phe Ile Leu
50 55 60
Asp Lys Leu Leu Glu Ser Glu Lys Tyr Asp Gly Val Cys Val Leu Gly
65 70 75 80
Ala Ile Ile Arg Gly Gly Thr Pro His Phe Asp Tyr Val Ser Ala Glu
85 90 95
Ala Thr Lys Gly Ile Ala His Ala Met Leu Lys Tyr Ser Met Pro Val
100 105 110
Ser Phe Gly Val Leu Thr Asp Asn Ile Glu Gln Ala Ile Glu Arg Ala
115 120 125
Gly Ser Lys Ala Gly Asn Lys Gly Phe Glu Ala Met Ser Thr Leu Ile
130 135 140
Glu Leu Leu Ser Leu Cys Gln Thr Leu Lys Gly
145 150 155
(2) INFORMATION FOR SEQ ID NO:346:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 39 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 05b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:346:
Leu Phe Ala Glu Thr His Val Asp Pro Lys Asn Ala Leu Ser Asp Gly
1 5 10 15
Ala Asn Met Leu Lys Pro Ser Glu Leu Glu His Leu Val Thr Asp Met
20 25 30
Leu Lys Ile Gln Asn Leu Phe
35
(2) INFORMATION FOR SEQ ID NO: 347: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 229 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 06a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 347:
Ser Gly Tyr His Ala Glu His Leu Ala Gly Lys Glu Ala Leu Phe Lys
1 5 10 15
Val Lys Leu Arg Gln Ile Gln Ala Arg Glu Val Leu Glu Ile Asn Asp
20 25 30
Glu Leu Ala Lys Ile Val Leu Ala Asn Glu Glu Asp Ala Thr Leu Lys
35 40 45
Leu Leu Lys Glu Arg Val Glu Gly Gln Leu Phe Leu Glu Asn Lys Ala
50 55 60
Arg Leu Tyr Asn Glu Glu Leu Lys Glu Lys Leu Ile Glu Asn Leu Asp
65 70 75 80
Glu Lys Ile Val Phe Asp Leu Pro Lys Thr Ile Ile Glu Gln Glu Met
85 90 95
Asp Leu Leu Phe Arg Asn Ala Leu Tyr Ser Met Gln Ala Glu Glu Val
100 105 110
Lys Ser Leu Gln Glu Ser Gln Glu Lys Ala Lys Glu Lys Arg Glu Ser
115 120 125
Phe Arg Asn Asp Ala Thr Lys Ser Val Lys Ile Thr Phe Ile Ile Asp
130 135 140
Ala Leu Ala Lys Arg Arg Lys Ile Gly Val His Asp Asn Glu Val Phe
145 150 155 160
Gln Thr Leu Tyr Tyr Glu Ala Met Met Thr Gly Arg Asn Pro Glu Ser
165 170 175
Leu Ile Glu Gln Tyr Arg Lys Asn Asn Met Leu Ala Ala Val Lys Met
180 185 190
Ala Met Ile Glu Asp Arg Val Leu Ala Tyr Leu Leu Asp Lys Asn Leu
195 200 205
Pro Lys Glu Gln Gln Glu Ile Leu Glu Lys Met Arg Pro Asn Ala Gln
210 215 220
Lys Ile Gln Ala Gly
225
(2) INFORMATION FOR SEQ ID NO:348:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 152 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 06b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:348:
Met Met Gly Tyr Ile Pro Tyr Val Ile Glu Asn Thr Asp Arg Gly Glu
1 5 10 15
Arg Ser Tyr Asp Ile Tyr Ser Arg Leu Leu Lys Asp Arg Ile Val Leu
20 25 30
Leu Ser Gly Glu Ile Asn Asp Ser Val Ala Ser Ser Ile Val Ala Gln
35 40 45
Leu Leu Phe Leu Glu Ala Glu Asp Pro Glu Lys Asp Ile Gly Leu Tyr
50 55 60
Ile Asn Ser Pro Gly Gly Val Ile Thr Ser Gly Leu Ser Ile Tyr Asp
65 70 75 80
Thr Met Asn Phe Ile Arg Pro Asp Val Ser Thr Ile Cys Ile Gly Gln
85 90 95
Ala Ala Ser Met Gly Ala Phe Leu Leu Ser Cys Gly Ala Lys Gly Lys
100 105 110
Arg Phe Ser Leu Pro His Ser Arg Ile Met Ile His Gln Pro Leu Gly
115 120 125
Gly Ala Gln Gly Gln Ala Ser Asp Ile Glu Ile Ile Ser Asn Glu Ile
130 135 140
Leu Arg Leu Lys Gly Leu Met Asn
145 150
(2) INFORMATION FOR SEQ ID NO: 349:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 139 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 07, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 349:
His Leu Leu Ala Asp Thr Tyr Ala Leu Ala Ala Thr Thr Gly Asn Val
1 5 10 15
Phe Ile Leu Lys Ala Asp Tyr Asn Ile His Lys Trp Gly Leu Thr Leu
20 25 30
Thr Trp Leu Ser Arg Phe Val Thr Asn Met Phe Tyr Glu Gly Tyr Ser
35 40 45
Ile Tyr Tyr Pro Gln Tyr Gly Leu Met Lys Ile His Lys Pro Gly Tyr
50 55 60
Gly Val His Asn Val Phe Ile Asn Trp Thr Pro Thr Ser Lys Lys Trp
65 70 75 80
Gln Gly Leu Arg Ile Ser Ala Val Phe Asn Asn Ile Leu Asn Lys Gln
85 90 95
Tyr Val Asp Gln Thr Ser Val Phe Gln Ala Ser Ala Asp Ala Pro Ala
100 105 110
Ser Asp Met Ile Pro Lys Gly Lys Arg Met Ala Leu Pro Ala Pro Gly
115 120 125
Phe Asn Ala Arg Phe Glu Val Ser Tyr Gln Phe
130 135
(2) INFORMATION FOR SEQ ID NO: 350: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 08a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 350:
Met Lys Phe Gln Pro Leu Gly Glu Arg Val Leu Val Glu Arg Leu Glu
1 5 10 15
Glu Glu Asn Lys Thr Ser Ser Gly Ile Ile Ile Pro Asp Asn Ala Lys
20 25 30
Glu Lys Pro Leu Met Gly Val Val Lys Ala Val Ser His Lys Ile Ser
35 40 45
Glu Gly Cys Lys Cys Val Lys Glu Gly Asp Val Ile Ala Phe Gly Lys
50 55 60
Tyr Lys Gly Ala Glu Ile Val Leu
65 70
(2) INFORMATION FOR SEQ ID NO: 351:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 08b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 351:
Met Ile Leu Lys Ser Ser Ile Asp Arg Leu Leu Gln Thr Ile Asp Ile
1 5 10 15
Val Glu Val Ile Ser Ser Tyr Val Asp Leu Arg Lys Ser Gly Ser Asn
20 25 30
Tyr Met Ala Cys Cys Pro Phe His Glu Glu Arg Ser Ala Ser Phe Ser
35 40 45
Val Asn Gln Ile
50
(2) INFORMATION FOR SEQ ID NO: 352:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 269 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 09a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:352:
Met Ile Leu Arg Ala Ser Val Leu Ser Ala Leu Leu Leu Val Ser Leu
1 5 10 15
Gly Ala Ala Pro Lys His Ser Val Ser Ala Asn Asp Lys Arg Met Gln
20 25 30
Asp Asn Leu Val Ser Val Ile Glu Lys Gln Thr Asn Lys Lys Val Arg
35 40 45
Ile Leu Glu Ile Lys Pro Leu Lys Ser Ser Gln Asp Leu Lys Met Val
50 55 60
Val Ile Glu Asp Pro Asp Thr Lys Tyr Asn Ile Pro Leu Val Val Ser
65 70 75 80
Lys Asp Gly Asn Leu Ile Ile Gly Leu Ser Ser Ile Phe Phe Ser Tyr
85 90 95
Lys Ser Asp Asp Val Arg Leu Val Ala Glu Thr Asn Gln Asn Val Gln
100 105 110
Ala Ala Leu Thr Leu Pro Ser Lys Ser Ser Ala Lys Val Glu Ala Leu
115 120 125
Val Cys Val Asn Glu Asn Ile Pro Ala Asp Tyr Ala Ile Glu Leu Pro
130 135 140
Ser Thr Asn Ala Glu Asn Lys Asp Lys Ile Leu Tyr Ile Val Ser Asp
145 150 155 160
Pro Met Cys Pro His Cys Gln Lys Glu Leu Thr Lys Leu Arg Asp His
165 170 175
Leu Lys Glu Asn Thr Val Arg Met Val Val Val Gly Trp Leu Gly Val
180 185 190
Asn Ser Ala Lys Lys Ala Ala Leu Ile Gln Glu Glu Met Ala Lys Ala
195 200 205
Arg Ala Arg Gly Ala Ser Val Glu Asp Lys Ile Ser Ile Leu Glu Lys
210 215 220
Ile Tyr Ser Thr Gln Tyr Asp Ile Asn Ala Gln Lys Glu Pro Glu Asp
225 230 235 240
Leu Arg Thr Lys Val Glu Asn Thr Thr Lys Lys Ile Phe Glu Ser Gly
245 250 255
Val Ile Lys Gly Val Pro Phe Leu Tyr His Tyr Lys Ala
260 265
(2) INFORMATION FOR SEQ ID NO:353:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 109 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 09b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:353:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly
1 5 10 15
Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala
20 25 30
Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys
35 40 45
Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val
50 55 60
Val Glu Glu Tyr Pro Ser Ser Lys Thr His Ile Ile Val Arg Asp Leu
65 70 75 80
Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile
85 90 95
Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly Thr Ser Lys
100 105
(2) INFORMATION FOR SEQ ID NO: 354:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 125 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 10a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:354:
Met Ala Ile Ser Lys Glu Glu Val Leu Glu Tyr Ile Gly Ser Leu Ser
1 5 10 15
Val Leu Glu Leu Ala Glu Leu Val Lys Met Phe Glu Glu Lys Phe Gly
20 25 30
Val Ser Ala Thr Pro Thr Val Val Ala Gly Ala Ala Val Ala Gly Gly
35 40 45
Ala Ala Ala Glu Ser Glu Glu Lys Thr Glu Phe Asn Val Ile Leu Ala
50 55 60
Asp Ser Gly Ala Glu Lys Ile Lys Val Ile Lys Val Val Arg Glu Ile
65 70 75 80
Thr Gly Leu Gly Leu Lys Glu Ala Lys Asp Ala Thr Glu Lys Thr Pro
85 90 95
His Val Leu Lys Glu Gly Val Asn Lys Glu Glu Ala Glu Thr Ile Lys
100 105 110
Lys Lys Leu Glu Glu Val Gly Ala Lys Val Glu Val Lys
115 120 125
(2) INFORMATION FOR SEQ ID NO: 355:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 84 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 10b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:355:
Val Phe Leu Trp Gly Asp Asp Gln Ile Ala Leu Ser Lys Leu Val Phe
1 5 10 15
Asp Phe Gln Lys Glu His Lys Asp His Phe Val Leu Lys Ala Gly Leu
20 25 30
Phe Asp Lys Glu Ser Val Ser Val Ala His Val Glu Ala Val Ser Lys
35 40 45
Leu Pro Ser Lys Glu Glu Leu Met Gly Met Leu Leu Ser Val Trp Thr
50 55 60
Ala Pro Val Arg Tyr Phe Val Thr Gly Leu Asp Asn Leu Arg Lys Ala
65 70 75 80
Lys Glu Glu Asn
(2) INFORMATION FOR SEQ ID NO: 356:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 167 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 11a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:356:
Met Leu Ser Lys Asp Ile Ile Lys Leu Leu Asn Glu Gln Val Asn Lys
1 5 10 15
Glu Met Asn Ser Ser Asn Leu Tyr Met Ser Met Ser Ser Trp Cys Tyr
20 25 30
Thr His Ser Leu Asp Gly Ala Gly Leu Phe Leu Phe Asp His Ala Ala
35 40 45
Glu Glu Tyr Glu His Ala Lys Lys Leu Ile Val Phe Leu Asn Glu Asn
50 55 60
Asn Val Pro Val Gln Leu Thr Ser Ile Ser Ala Pro Glu His Lys Phe
65 70 75 80
Glu Gly Leu Thr Gln Ile Phe Gln Lys Ala Tyr Glu His Glu Gln His
85 90 95
Ile Ser Glu Ser Ile Asn Asn Ile Val Asp His Ala Ile Lys Gly Lys
100 105 110
Asp His Ala Thr Phe Asn Phe Leu Gln Trp Tyr Val Ser Glu Gln His
115 120 125
Glu Glu Glu Val Leu Phe Lys Asp Ile Leu Asp Lys Ile Glu Leu Ile
130 135 140
Gly Asn Glu Asn His Gly Leu Tyr Leu Ala Asp Gln Tyr Ile Lys Gly
145 150 155 160
Ile Ala Lys Ser Arg Lys Ser
165
(2) INFORMATION FOR SEQ ID NO: 357:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 11b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 357:
Val Gln Lys Leu Ala Val Phe Asp Phe Asp Ser Thr Leu Val Asn Ala
1 5 10 15
Glu Thr Ile Glu Ser Leu Ala Arg Ala Trp Gly Val Phe Asp Glu Val
20 25 30
Lys
(2) INFORMATION FOR SEQ ID NO: 358:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 303 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 12a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 358:
Met Lys Lys Ile Ile Leu Ala Cys Leu Met Ala Phe Val Gly Ala Asn
1 5 10 15
Leu Ser Ala Glu Pro Lys Trp Tyr Gly Lys Ala Tyr Asn Lys Thr Asn
20 25 30
Thr Gln Lys Gly Tyr Leu Tyr Gly Asn Gly Ser Ala Thr Ser Lys Glu
35 40 45
Ala Ser Lys Gln Lys Ala Leu Ala Asp Leu Val Ala Ser Ile Ser Val
50 55 60
Val Val Asn Ser Gln Ile His Ile Gln Lys Ser Arg Val Asp Asn Lys
65 70 75 80
Leu Lys Ser Ser Asp Ser Gln Thr Ile Asn Leu Lys Thr Asp Asp Leu
85 90 95
Glu Leu Asn Asn Val Glu Ile Val Asn Gln Glu Ala Gln Lys Gly Ile
100 105 110
Tyr Tyr Thr Arg Val Arg Ile Asn Gln Asn Leu Phe Leu Gln Gly Leu
115 120 125
Arg Asp Lys Tyr Asn Ala Leu Tyr Gly Gln Phe Ser Thr Leu Met Pro
130 135 140
Lys Val Cys Lys Gly Val Phe Leu Gln Gln Ser Lys Ser Met Gly Asp
145 150 155 160
Leu Leu Ala Lys Ala Met Pro Ile Glu Arg Ile Leu Lys Ala Tyr Ser
165 170 175
Val Pro Val Gly Ser Leu Glu Asn Tyr Glu Lys Ile Tyr Tyr Gln Asn
180 185 190
Ala Phe Lys Pro Lys Val Gln Ile Thr Phe Asp Asn Asn Ser Asp Thr
195 200 205
Glu Ile Lys Asn Ala Leu Ile Ser Ala Tyr Ala Arg Val Leu Thr Pro
210 215 220
Ser Asp Glu Glu Lys Leu Tyr Gln Ile Lys Asn Glu Val Phe Thr Asp
225 230 235 240
Ser Ala Asn Gly Thr Thr Arg Ile Arg Val Val Val Ser Ala Ser Asp
245 250 255
Cys Gln Gly Thr Pro Val Leu Asn Arg Ser Leu Glu Val Asp Glu Lys
260 265 270
Asn Lys Asn Phe Ala Ile Thr Arg Leu Gln Ser Leu Leu Tyr Lys Glu
275 280 285
Leu Lys Asp Tyr Ala Asn Lys Glu Gly Gln Gly Asn Thr Gly Leu
290 295 300
(2) INFORMATION FOR SEQ ID NO: 359:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 210 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 12b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 359:
Met Val Leu Lys Thr Lys Leu Lys Ile Ile Ser Ser Val Ile Leu Asn
1 5 10 15
Thr Leu Leu Trp Val Gly Cys Ser Ser Glu Met Ala Thr Tyr Gln Asn
20 25 30
Val Asn Asp Ala Thr Lys Asn Thr Thr Ala Ser Ile Asn Ser Thr Asp
35 40 45
Leu Leu Leu Thr Ala Asn Ala Met Leu Asp Ser Met Phe Ser Asp Pro
50 55 60
Asn Phe Glu Gln Leu Lys Gly Lys His Leu Ile Glu Val Ser Asp Val
65 70 75 80
Ile Asn Asp Thr Thr Gln Pro Asn Leu Asp Met Asn Leu Leu Thr Thr
85 90 95
Glu Ile Ala Arg Gln Leu Arg Leu Arg Ser Asn Gly Arg Phe Asn Ile
100 105 110
Thr Arg Ala Asn Gly Arg Asn Gly Ile Ala Ala Asp Ser Arg Met Val
115 120 125
Lys Gln Arg Glu Lys Glu Arg Glu Ser Glu Glu Tyr Asn Gln Asp Thr
130 135 140
Thr Val Glu Lys Gly Thr Leu Lys Ala Ala Asp Leu Ser Leu Ser Gly
145 150 155 160
Lys Val Ser Ser Ile Ala Ala Ser Ile Ser Ser Ser Arg Gln Arg Leu
165 170 175
Asp Tyr Asp Phe Thr Leu Ser Leu Thr Asn Arg Lys Thr Gly Glu Glu
180 185 190
Val Trp Ser Asp Val Lys Pro Ile Val Lys Asn Ala Ser Asn Lys Arg
195 200 205
Met Phe
210
(2) INFORMATION FOR SEQ ID NO: 360:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 12c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 360:
Met Lys Asn Gln Val Lys Lys Ile Leu Gly Met Ser Val Ile Ala Ala
1 5 10 15
Met Val Ile Val Gly Cys Ser His Ala Pro Lys Ser Gly Ile Ser Lys
20 25 30
Ser Asn Lys Ala Tyr Lys Glu Ala Thr Lys Gly Ala Pro Asp Trp Val
35 40 45
Val Gly Asp Leu Glu Lys Val Ala Lys Tyr Glu Lys Tyr Ser Gly Val
50 55 60
Phe Leu Gly Arg Ala Glu Asp Leu Ile Thr Asn Asn Asp Val Asp Tyr
65 70 75 80
Ser Thr Asn Gln Ala Thr Ala Lys Ala Arg Ala Asn Leu Ala Ala Asn
85 90 95
Leu Lys Ser Thr Leu Gln Lys Asp Leu Glu Asn Glu Lys Thr Arg Thr
100 105 110
Val Asp Ala Ser Gly Lys Arg Ser Ile Ser Gly Thr Asp Thr Glu Lys
115 120 125
Ile Ser Gln Leu Val Asp Lys Glu Leu Ile Ala Ser Lys Met Leu Ala
130 135 140
Arg Tyr Val Gly Lys Asp Arg Val Phe Val Leu Val Gly Leu Asp Lys
145 150 155 160
Gln Ile Val Asp Lys Val Arg Glu Glu Leu Gly Met Val Lys Lys
165 170 175
(2) INFORMATION FOR SEQ ID NO: 361:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 130 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 12d, antigen 4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 361:
Leu Trp Ile Lys Cys Ala Lys Ser Trp Ala Trp Leu Lys Ser Arg Val
1 5 10 15
Leu Met Lys Arg Leu Ala Ile Ala Leu Val Leu Val Leu Gly Val Ala
20 25 30
Trp Gly Lys Ser Leu Pro Lys Trp Ala Lys Asp Cys Ser Lys Glu Val
35 40 45
Gln Ile Glu Lys Thr Leu Thr Lys Asp Glu Lys Ile Leu Val Cys Gly
50 55 60
Val Ser Asp Ile Leu Leu Ser Asp Met Asp Tyr Ser Leu Ser Ser Ala
65 70 75 80
Arg Gln Asn Ala Leu Glu Lys Val Met Glu Ala Phe Lys Gly Asp Lys
85 90 95
Ile Glu Ile Lys Ala Ser Glu Leu Lys Ala Thr Phe Ile Asp Thr Asp
100 105 110
Lys Val Tyr Val Leu Leu Arg Ile Thr Lys Lys His Val Ala Leu Met
115 120 125
Asn Glu
130
(2) INFORMATION FOR SEQ ID NO:362:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 256 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 13, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:362:
Met Gly Tyr Ala Ser Lys Leu Ala Leu Lys Ile Cys Leu Ala Ser Leu
1 5 10 15
Cys Leu Phe Ser Ala Leu Gly Ala Glu His Leu Glu Gln Lys Arg Asn
20 25 30
Phe Ile Tyr Lys Gly Glu Glu Ala Tyr Asn Asn Lys Glu Tyr Glu Arg
35 40 45
Ala Ala Ser Phe Tyr Lys Ser Ala Ile Lys Asn Gly Glu Pro Leu Ala
50 55 60
Tyr Val Leu Leu Gly Ile Met Tyr Glu Asn Gly Arg Gly Val Pro Lys
65 70 75 80
Asp Tyr Lys Lys Ala Ala Glu Tyr Phe Gln Lys Ala Val Asp Asn Asp
85 90 95
Ile Pro Arg Gly Tyr Asn Asn Leu Gly Val Met Tyr Lys Glu Gly Arg
100 105 110
Gly Val Pro Lys Asp Glu Lys Lys Ala Val Glu Tyr Phe Arg Ile Ala
115 120 125
Thr Glu Lys Gly Tyr Thr Asn Ala Tyr Ile Asn Leu Gly Ile Met Tyr
130 135 140
Met Glu Gly Arg Gly Val Pro Ser Asn Tyr Val Lys Ala Thr Glu Cys
145 150 155 160
Phe Arg Lys Ala Met His Lys Gly Asn Val Glu Ala Tyr Ile Leu Leu
165 170 175
Gly Asp Ile Tyr Tyr Ser Gly Asn Asp Gln Leu Gly Ile Glu Pro Asp
180 185 190
Lys Asp Lys Ala Ile Val Tyr Tyr Lys Met Ala Ala Asp Met Ser Ser
195 200 205
Ser Arg Ala Tyr Glu Gly Leu Ala Glu Ser Tyr Gln Tyr Gly Leu Gly
210 215 220
Val Glu Lys Asp Lys Lys Lys Ala Glu Glu Tyr Met Gln Lys Ala Cys
225 230 235 240
Asp Phe Asp Ile Asp Glu Asn Cys Lys Lys Lys Asn Thr Ser Ser Arg
245 250 255
(2) INFORMATION FOR SEQ ID NO: 363:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 276 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 14, antigen, groEL epitope, partial sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:363:
Lys Lys Val Gly Gly Lys Glu Glu Ile Thr Gln Val Ala Thr Ile Ser
1 5 10 15
Ala Asn Ser Asp His Asn Ile Gly Lys Leu Ile Ala Asp Ala Met Glu
20 25 30
Lys Val Gly Lys Asp Gly Val Ile Thr Val Glu Glu Ala Gln Gly Ile
35 40 45
Glu Asp Glu Leu Asp Val Val Glu Gly Met Gln Phe Asp Arg Gly Tyr
50 55 60
Leu Ser Pro Tyr Phe Val Thr Asn Ala Glu Lys Met Thr Ala Gln Leu
65 70 75 80
Asp Asn Ala Tyr Ile Leu Leu Thr Asp Lys Lys Ile Ser Ser Met Lys
85 90 95
Asp Ile Leu Pro Leu Leu Glu Lys Thr Met Lys Glu Gly Lys Pro Leu
100 105 110
Leu Ile Ile Ala Glu Asp Ile Glu Gly Glu Ala Leu Thr Thr Leu Val
115 120 125
Val Asn Lys Leu Arg Gly Val Leu Asn Ile Ala Ala Val Lys Ala Pro
130 135 140
Gly Phe Gly Asp Arg Arg Lys Glu Met Leu Lys Asp Ile Ala Ile Leu
145 150 155 160
Thr Gly Gly Gln Val Ile Ser Glu Glu Leu Gly Leu Thr Leu Glu Asn
165 170 175
Ala Glu Val Glu Phe Leu Gly Lys Ala Gly Arg Ile Val Ile Asp Lys
180 185 190
Asp Asn Thr Thr Ile Val Asp Gly Lys Gly His Ser Asp Asp Val Lys
195 200 205
Asp Arg Val Ala Gln Ile Lys Thr Gln Ile Ala Ser Thr Thr Ser Asp
210 215 220
Tyr Asp Lys Glu Lys Leu Gln Glu Arg Leu Ala Lys Leu Ser Gly Gly
225 230 235 240
Val Ala Val Ile Lys Val Gly Ala Ala Ser Glu Val Glu Met Lys Glu
245 250 255
Lys Lys Asp Arg Val Asp Asp Ala Leu Ser Ala Thr Lys Ala Ala Val
260 265 270
Glu Glu Gly Ile
275
(2) INFORMATION FOR SEQ ID NO: 364:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 260 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 15, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 364:
Pro Lys Ile Phe Thr Leu Leu Glu Leu Leu Arg Leu Phe Leu Asn His
1 5 10 15
Arg Lys Thr Ile Ile Ile Arg Arg Thr Ile Phe Glu Leu Glu Lys Ala
20 25 30
Lys Ala Arg Ala His Ile Leu Glu Gly Tyr Leu Ile Ala Leu Asp Asn
35 40 45
Ile Asp Gly Ile Val Arg Leu Ile Lys Thr Ser Pro Ser Pro Glu Ala
50 55 60
Ala Lys Asn Ala Leu Met Glu Arg Phe Thr Leu Ser Glu Ile Gln Ser
65 70 75 80
Lys Ala Ile Leu Glu Met Arg Leu Gln Arg Leu Thr Gly Leu Glu Arg
85 90 95
Asp Lys Ile Lys Glu Glu Tyr Gln Asn Leu Leu Glu Leu Ile Asp Asp
100 105 110
Leu Asn Gly Ile Leu Lys Ser Glu Asp Arg Leu Asn Gly Val Val Lys
115 120 125
Thr Glu Leu Leu Glu Val Lys Glu Gln Phe Ser Ser Pro Arg Arg Thr
130 135 140
Glu Ile Gln Glu Ser Tyr Glu Asn Ile Asp Ile Glu Asp Leu Ile Ala
145 150 155 160
Asn Glu Pro Met Val Val Ser Met Ser Tyr Lys Gly Tyr Val Lys Arg
165 170 175
Val Gly Leu Lys Ala Tyr Glu Lys Gln Asn Arg Gly Gly Lys Gly Lys
180 185 190
Leu Ser Gly Ser Thr Tyr Glu Asp Asp Phe Ile Glu Asn Phe Phe Val
195 200 205
Ala Asn Thr His Asp Ile Leu Leu Phe Ile Thr Asn Lys Gly Gln Leu
210 215 220
Tyr His Leu Lys Val Tyr Lys Ile Pro Glu Ala Ser Arg Ile Ala Met
225 230 235 240
Gly Lys Ala Ile Val Asn Leu Ile Ser Leu Ala Pro Asp Glu Lys Ile
245 250 255
Met Ala Thr Leu
260
(2) INFORMATION FOR SEQ ID NO: 365:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 144 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 16a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:365:
Met Asn Leu Phe Glu Lys Met Thr Asp Gln Leu His Glu Ala Leu Asp
1 5 10 15
Ser Ala Leu Ala Leu Ala Leu His His Lys Asn Ala Glu Val Thr Pro
20 25 30
Met His Met Leu Phe Ala Met Leu Asn Asn Ser Gln Gly Ile Leu Ile
35 40 45
Gln Ala Leu Gln Lys Met Pro Val Asp Ile Glu Ala Leu Lys Leu Ser
50 55 60
Val Gln Ser Glu Leu Asn Lys Leu Ala Lys Val Ser Gln Ile Ser Lys
65 70 75 80
Gln Asn Ile Gln Leu Asn Gln Ala Leu Ile Gln Ser Leu Glu Asn Ala
85 90 95
Gln Gly Leu Met Ala Lys Thr Gly Asp Ser Phe Ile Ala Thr Asp Val
100 105 110
Tyr Leu Leu Ala Asn Met Ser Leu Phe Glu Ser Val Leu Lys Pro Tyr
115 120 125
Leu Asp Thr Lys Glu Leu Gln Lys Thr Leu Glu Ser Leu Arg Lys Gly
130 135 140
(2) INFORMATION FOR SEQ ID NO: 366:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 16b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 366:
Gly Arg Phe Ser Val Gly Leu Glu Ile Glu Lys Glu Tyr Cys Glu Leu
1 5 10 15
Ser Lys Lys Arg Ile Leu Glu Ser Leu Ser Leu Val
20 25
(2) INFORMATION FOR SEQ ID NO: 367:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 167 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 17a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:367:
Met Lys Ile Leu Val Ile Gln Gly Pro Asn Leu Asn Met Leu Gly His
1 5 10 15
Arg Asp Pro Arg Leu Tyr Gly Met Val Thr Leu Asp Gln Ile His Glu
20 25 30
Ile Met Gln Thr Phe Val Lys Gln Gly Asn Leu Asp Val Glu Leu Glu
35 40 45
Phe Phe Gln Thr Asn Phe Glu Gly Glu Ile Ile Asp Lys Ile Gln Glu
50 55 60
Ser Val Gly Ser Asp Tyr Glu Gly Ile Ile Ile Asn Pro Gly Ala Phe
65 70 75 80
Ser His Thr Ser Ile Ala Ile Ala Asp Ala Ile Met Leu Ala Gly Lys
85 90 95
Pro Val Ile Glu Val His Leu Thr Asn Ile Gln Ala Arg Glu Glu Phe
100 105 110
Arg Lys Asn Ser Tyr Thr Gly Ala Ala Cys Gly Gly Val Ile Met Gly
115 120 125
Phe Gly Pro Leu Gly Tyr Asn Met Ala Leu Met Ala Met Val Asn Ile
130 135 140
Leu Ala Glu Met Lys Ala Phe Gln Glu Ala Gln Gln Asn Asn Pro Asn
145 150 155 160
Asn Pro Ile Asn Asn Gln Lys
165
(2) INFORMATION FOR SEQ ID NO: 368:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 151 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 17b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 368:
Met Lys Gly Leu Glu Arg Glu Ser His Phe Thr Leu Asn Glu Asn Ala
1 5 10 15
Met Phe Phe Glu Cys Ala Tyr Ser Cys Asp Asn Ala Leu Phe Leu Gln
20 25 30
Leu Asp Asp Arg Ser Phe Phe Ile Thr Asp Ser Arg Tyr Thr Gln Glu
35 40 45
Ala Lys Glu Ser Val Gln Pro Lys Asn Gly Val Leu Ala Glu Val Val
50 55 60
Glu Ser Ser Asp Leu Ala Arg Ser Ala Val Asp Leu Ile Ala Lys Ser
65 70 75 80
Ser Val Lys Lys Leu Phe Phe Asp Pro Asn Gln Val Asn Leu Gln Thr
85 90 95
Tyr Lys Arg Leu Asn Ser Ala Leu Gly Asp Lys Val Ala Leu Glu Gly
100 105 110
Val Pro Ser Tyr His Arg Gln Lys Arg Ile Ile Lys Asn Glu His Glu
115 120 125
Ile Gln Leu Leu Lys Lys Ser Gln Ala Leu Asn Val Glu Ala Phe Glu
130 135 140
Asn Phe Ala Glu Tyr Val Lys
145 150
(2) INFORMATION FOR SEQ ID NO: 369:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 17c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 369:
Gly Phe Leu Ile Trp Leu Phe Phe Leu Leu Val Ile Val Lys Ile Phe
1 5 10 15
Trp Ser Gly Ile Lys Gln Asn Ser Leu Ile Ser Leu Phe Ile Leu Thr
20 25 30
Thr Leu Ala Phe Tyr Leu Ile Phe Gly Ile Gly Phe Asp Pro Phe Asp
35 40 45
Phe Phe Ile Thr Gly Ser Phe Phe Val Gly Ile Ile Met Met Ala Val
50 55 60
Phe Leu Lys Lys Asp Lys Ser Ala Phe
65 70
(2) INFORMATION FOR SEQ ID NO: 370:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 18a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 370:
Met Thr Asn Glu Thr Ile Asn Gln Gln Pro Gln Thr Glu Ala Ala Phe
1 5 10 15
Asn Pro Gln Gln Phe Ile Asn Asn Leu Gln Val Ala Phe Leu Lys Val
20 25 30
Asp Asn Ala Val Ala Ser Tyr Asp Pro Asp Gln Lys Pro Ile Val Asp
35 40 45
Lys Asn Asp Arg Asp Asn Arg Gln Ala Phe Asp Gly Ser Ser Gln Leu
50 55 60
Arg Glu Glu Tyr Ser Asn Lys Ala Ile Gln Asn Pro Thr Lys Lys Asn
65 70 75 80
Gln Tyr Phe Ser Asp Phe Ile Asn Glu Ser Asn Asp Leu Ile Asn Lys
85 90 95
Asp Asn Leu Ile Asp Ile Gly Ser Ser Ile Lys Ser Phe Gln Lys Phe
100 105 110
Gly Thr Gln Arg Tyr Arg Ile Phe Thr Ser Trp Val Ser His Gln Asn
115 120 125
Asp Pro Ser Lys Ile Asn Thr Arg Ser Ile Arg Asn Phe Met Glu Asn
130 135 140
Ile Ile Gln Pro Pro Ile Pro Asp Asp Lys Glu Lys Ala Glu Phe Leu
145 150 155 160
Lys Ser Ala Lys Gln Ser Phe Ala Gly Ile Ile Ile Gly Asn Gln Ile
165 170 175
Arg Thr Asp Gln Lys Phe Met Gly Val Phe Asp
180 185
(2) INFORMATION FOR SEQ ID NO: 371:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 18b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 371:
Met Pro Ala Ser Ile Gly Ser Leu Val Ser Gln Leu Phe Tyr Lys Glu
1 5 10 15
Lys Leu Lys Asn Gly Val Ile Lys Asn Thr Ser Gln Phe Tyr Asp Pro
20 25 30
Lys Asn Ile Ile Arg Trp Ile Asn Val Glu Gly Glu His Lys Leu Glu
35 40 45
Lys Thr Ser Ser Tyr Asn Lys Asn Gln Val Gln Lys Ile Ile Glu Leu
50 55 60
Leu Glu Gln Ile Asn Arg Ile Leu Asn Gln Arg Lys Ile Lys Lys Thr
65 70 75 80
Ile Gly Ile Ile Thr Pro Tyr Asn Ala Gln Lys Arg Cys Leu Arg Ser
85 90 95
Glu Val Glu Lys Tyr Gly Phe Lys Asn Phe Asp Glu Leu Lys Ile Asp
100 105 110
Thr Val Asp Ala Phe Gln Gly Glu Lys Ala Asp Ile Ile Ile Tyr Ser
115 120 125
Thr Val Lys Thr Tyr Gly Asn Leu Ser Phe Leu Ile Asp Ser Lys Arg
130 135 140
Leu Asn Val Leu Phe Leu Gly Gln Lys Lys Ile Ser Phe Leu Trp Ala
145 150 155 160
Lys Ser Leu Ser Leu Arg Ile Cys Glu Ala Met Arg Arg Ile Ser Leu
165 170 175
Ala Leu Phe Cys Lys Ser Val Asp Arg Glx
180 185
(2) INFORMATION FOR SEQ ID NO:372:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 291 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 19a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 372:
Met Leu Glu Asn Val Gln Lys Ser Leu Phe Arg Val Leu Cys Leu Gly
1 5 10 15
Ala Leu Cys Leu Gly Gly Leu Met Ala Glu Pro Asp Pro Lys Glu Leu
20 25 30
Val Gly Leu Gly Ala Lys Ser Tyr Lys Glu Gln Asp Phe Thr Gln Ala
35 40 45
Lys Lys Tyr Phe Glu Lys Ala Cys Asp Leu Lys Glu Asn Ser Gly Cys
50 55 60
Phe Asn Leu Gly Val Leu Tyr Tyr Gln Gly Gln Gly Val Glu Lys Asn
65 70 75 80
Leu Lys Lys Ala Ala Ser Phe Tyr Ala Lys Ala Cys Asp Leu Asn Tyr
85 90 95
Ser Asn Gly Cys His Leu Leu Gly Asn Leu Tyr Tyr Ser Gly Gln Gly
100 105 110
Val Ser Pro His Thr Asn Lys Ala Leu Gln Tyr Tyr Ser Lys Ala Cys
115 120 125
Asp Leu Lys Tyr Ser Glu Gly Cys Ala Ser Leu Gly Gly Ile Tyr His
130 135 140
Asp Gly Gly Lys Trp Tyr Thr Arg Asp Phe Lys Lys Ala Val Glu Tyr
145 150 155 160
Phe Thr Lys Ala Cys Asp Leu Asn Asp Gly Asp Gly Cys Thr Ile Leu
165 170 175
Gly Ser Leu Tyr Asp Ala Gly Arg Gly Thr Pro Lys Asp Leu Lys Lys
180 185 190
Ala Leu Ala Ser Phe Asp Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly
195 200 205
Cys Phe Asn Ala Gly Asn Met Tyr His His Gly Asp Gly Val Ala Lys
210 215 220
Asn Phe Lys Glu Ala Leu Asp Arg Tyr Ser Lys Ala Cys Glu Met Gln
225 230 235 240
Asn Gly Gly Gly Cys Phe Asn Leu Gly Ala Met Gln Tyr Asn Gly Glu
245 250 255
Gly Ala Thr Arg Asn Glu Lys Gln Ala Ile Glu Asn Phe Lys Lys Gly
260 265 270
Cys Lys Leu Gly Ala Lys Gly Ala Cys Asp Ile Leu Lys Gln Val Lys
275 280 285
Ile Lys Val
290
(2) INFORMATION FOR SEQ ID NO: 373:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 19b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 373:
Phe Asn Ala Pro Phe Lys Arg Val Lys Phe Cys Pro Pro Gly Gly Ile
1 5 10 15
Ser Ala Asp Asn Met Arg Ser Tyr Leu Asn Leu Glu Asn Val Leu Cys
20 25 30
Val Gly Gly Ser Trp Leu Thr Pro Lys Asp Leu Ile Pro Asn Lys Glu
35 40 45
Trp Asp Lys Ile Thr Glu Ile Cys Lys Arg Ala Leu Thr Leu Arg
50 55 60
(2) INFORMATION FOR SEQ ID NO:374:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 20a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 374:
Met Leu Gly Ser Val Lys Lys Thr Leu Phe Gly Val Leu Cys Leu Gly
1 5 10 15
Thr Leu Cys Leu Arg Trp Leu Met Ala Glu Pro Asp Ala Lys Glu Leu
20 25 30
Val Asn Leu Gly Ile Glu Ser Ala Lys Lys Gln Asp Phe Ala Gln Ala
35 40 45
Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys
50 55 60
Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp
65 70 75 80
Leu Lys Lys Ala Ile Gln Phe Tyr Thr Lys Gly Cys Glu Leu Asn Asp
85 90 95
Gly Tyr Gly Cys Asn Leu Leu Gly Asn Leu Tyr Tyr Asn Gly Gln Gly
100 105 110
Val Ser Lys Asp Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys
115 120 125
Asp Leu Asn His Ala Glu Gly Cys Met Val Leu Gly Ser Leu His His
130 135 140
Tyr Gly Val Gly Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr
145 150 155 160
Glu Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly Cys Ile Asn Ala Gly
165 170 175
Tyr Ile Tyr Ser Val Thr Lys Asn Phe Lys Glu Ala Ile Val Arg Tyr
180 185 190
Ser Lys Ala Cys Glu Leu Asn Asp Gly Arg Gly Cys Tyr Asn Leu Gly
195 200 205
Val Met Gln Tyr Asn Ala Gln Gly Thr Thr Lys Asp Glu Lys Gln Ala
210 215 220
Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Gly Val Lys Glu Ala Cys
225 230 235 240
Asp Ala Leu Lys Glu Leu Lys Ile Glu Leu
245 250
(2) INFORMATION FOR SEQ ID NO: 375:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 20b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 375:
Ser Leu Lys Glu Leu Glu Leu Leu Glu Lys Val Phe Leu Gly Val Leu
1 5 10 15
Glu Asp Leu Ser Glu
20
(2) INFORMATION FOR SEQ ID NO:376:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 160 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 21a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:376:
Arg Asn Ser Phe Leu Gln Asp Val Pro Tyr Trp Met Leu Gln Asn Arg
1 5 10 15
Ser Glu Tyr Ile Thr Gln Gly Val Asp Ser Ser His Ile Val Asp Gly
20 25 30
Lys Lys Thr Glu Glu Ile Glu Lys Ile Ala Thr Lys Arg Ala Thr Ile
35 40 45
Arg Val Ala Gln Asn Ile Val His Lys Leu Lys Glu Ala Tyr Leu Ser
50 55 60
Lys Ser Asn Arg Ile Lys Gln Lys Ile Thr Asn Glu Met Phe Ile Gln
65 70 75 80
Met Thr Gln Pro Ile Tyr Asp Ser Leu Met Asn Val Asp Arg Leu Gly
85 90 95
Ile Tyr Ile Asn Pro Asn Asn Glu Glu Val Phe Ala Leu Val Arg Ala
100 105 110
Arg Gly Phe Asp Lys Asp Ala Leu Ser Glu Gly Leu His Lys Met Ala
115 120 125
Leu Asp Asn Gln Ala Val Ser Ile Leu Val Ala Lys Val Glu Glu Ile
130 135 140
Phe Lys Asp Ser Val Asn Tyr Gly Asp Val Lys Val Pro Ile Ala Met
145 150 155 160
(2) INFORMATION FOR SEQ ID NO: 377:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 21b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 377:
Glu Phe Arg Phe Tyr Ala Arg Phe Glu Glu Ile Pro Pro Arg Phe Ile
1 5 10 15
Glu Ser Leu Leu Ala Val Glu Asp Thr Leu Phe Phe Glu His Gly Gly
20 25 30
Ile Asn Leu Asp Ala Ile Met Arg Ala Met Ile Lys Asn Ala Lys Ser
35 40 45
Gly Arg Tyr Thr Glu Gly Gly Ser Thr Leu Thr Gln Gln Leu Val Lys
50 55 60
Asn Met Val Leu Thr Arg Glu Lys Thr Leu Thr Arg Lys Leu Lys Glu
65 70 75 80
Ala Ile Ile Ser Leu Arg Ile Glu Lys Val Leu Ser Lys Glu Glu Ile
85 90 95
Leu Glu Arg Tyr Leu Asn Gln Thr Phe Phe Gly His Gly Tyr Tyr Gly
100 105 110
Val Lys Thr Ala Ser Leu Gly Tyr Phe Lys Lys Pro Leu Asp Lys Leu
115 120 125
Thr Leu Lys Glu Ile Thr Met Leu Val Ala Leu Pro Arg Ala Pro Ser
130 135 140
Phe Tyr Asp Pro Thr Lys Asn Leu Glu Phe Ser Leu Ser Arg Ala Asn
145 150 155 160
Asp Ile Leu Arg Arg Leu Tyr Ser Leu Gly Trp Ile Ser Ser Asn Glu
165 170 175
Leu Lys Gly Ala Leu Asn Glu Val Pro Ile Val Tyr Asn Gln Thr Ser
180 185 190
Thr Gln Asn Ile Ala Pro Tyr Val Val Asp Glu Val Leu Lys Gln Leu
195 200 205
Asp Gln Leu Asp Gly Leu Lys Thr Gln Gly Tyr Thr Ile Lys Leu Thr
210 215 220
Ile Asp Leu Asp Tyr Gln Arg Leu Ala Leu Glu Ser Leu Arg Phe Gly
225 230 235 240
His Gln Lys Ile Leu Glu Lys Ile Ala Lys Glu Lys Pro Lys Thr Asn
245 250 255
Ala Ser Asp Glu Asp Glu Asp Asn Leu Asn Ala Ser Met Ile Val Thr
260 265 270
Asp Thr Ser Thr Gly Lys Ile Leu Ala Leu Val Gly Gly Ile Asp Tyr
275 280 285
Lys Lys Ser Ala Phe Asn Arg Ala Thr Gln Ala Lys Val Thr Leu Gln
290 295 300
Glu Ala Leu Ser His Ser Leu Asn Leu Ala Thr Ile Asn Leu Ser Asp
305 310 315 320
Gln Leu Gly Phe Glu Lys Ile Tyr Gln Ser Leu Ser Asp Met Gly Phe
325 330 335
Lys Asn Leu Pro Lys Asp Leu Ser Ile Val Leu Gly Ser Phe Ala Ile
340 345 350
Ser Pro Ile Asp Ala Ala Glu Lys Tyr Ser Leu Phe Ser Asn Tyr Gly
355 360 365
Thr Met Leu Lys Pro Met Leu Ile Glu Ser Ile Thr Asn Gln Gln Asn
370 375 380
Asp Val Lys Thr Phe Thr Pro Met Glu Thr Lys Lys Ile Thr Ser Lys
385 390 395 400
Glu Gln Ala Phe Leu Thr Leu Ser Val Leu Ile Asn Ala Val Glu Asn
405 410 415
Gly Thr Gly Arg Leu Ala Arg Thr Lys Gly Leu Glu Ile Ala Gly Lys
420 425 430
Thr Gly Thr Ser Asn Asn Asn Ile Asp Ala Trp Phe Ile Gly Phe Thr
435 440 445
Pro Thr Leu Gln Ser Val Ile Trp Phe Gly Arg Asp Asp Asn Thr Pro
450 455 460
Ile Ser Lys Arg Ala Thr Gly Gly Val Val Ser Ala Pro Val Tyr Ser
465 470 475 480
Tyr Phe Met Arg Asn Ile Leu Ala Ile Glu Pro Ser Leu Lys Arg Lys
485 490 495
Phe Asp Val Pro Lys Gly Leu Arg Lys Glu Ile Val Asp Lys Ile Pro
500 505 510
Tyr Tyr Ser Thr Pro Asn Ser Ile Thr Pro Thr Pro Lys Arg Thr Asp
515 520 525
Asp Ser Glu Glu Arg Leu Leu Phe
530 535
(2) INFORMATION FOR SEQ ID NO: 378:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 22, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:378:
Ser Phe Val Ala Glu Lys Leu Lys Asn Asp Tyr Glu Asn Lys Met Lys
1 5 10 15
Val Leu Asp Ser Glu Gln Arg Ser Arg Ile Glu Arg Ile Val Tyr Leu
20 25 30
Gln Ile Leu Asp Asn Ala Trp Arg Glu His Leu Tyr Thr Met Asp Asn
35 40 45
Leu Lys Thr Gly Ile Asn Leu Arg Gly Tyr Asn Gln Lys Asp Pro Leu
50 55 60
Val Glu Tyr Lys Lys Glu Ser Tyr Asn Leu Phe Leu Glu Phe Ile Glu
65 70 75 80
Asp Ile Lys Met Glu Ala Ile Lys Thr Phe Ser Lys Ile Gln Phe Glu
85 90 95
Asn Glu Gln Asp Ser Ser Asp Ala Glu Arg Tyr Leu Asp Asn Phe Ser
100 105 110
Glu Glu Arg Glu Tyr Glu Ser Val Thr Tyr Arg His Glu Glu Ala Leu
115 120 125
Asp Glu Asp Leu Asn Val Ala Met Lys Ala
130 135
(2) INFORMATION FOR SEQ ID NO: 379:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 57 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 23, antigen, short flagellin A immunoreactive epitope
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 379:
Thr Ala Ser Gly Asp Ile Ser Leu Thr Phe Lys Gln Val Asp Gly Val
1 5 10 15
Asn Asp Val Thr Leu Glu Ser Val Lys Val Ser Ser Ser Ala Gly Thr
20 25 30
Gly Ile Gly Val Leu Ala Glu Val Ile Asn Lys Asn Ser Asn Arg Thr
35 40 45
Gly Val Lys Ala Tyr Ala Ser Val Ile
50 55
(2) INFORMATION FOR SEQ ID NO: 380:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 634 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 24, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 380:
Leu Ile Thr Val Val Lys Arg Asn Gly Arg Ile Glu Pro Leu Asp Ile
1 5 10 15
Thr Lys Ile Gln Lys Tyr Thr Lys Asp Ala Thr Asp Asn Leu Glu Gly
20 25 30
Val Ser Gln Ser Glu Leu Glu Val Asp Ala Arg Leu Gln Phe Arg Asp
35 40 45
Lys Ile Thr Thr Glu Glu Ile Gln Gln Thr Leu Ile Lys Thr Ala Val
50 55 60
Asp Lys Ile Asp Ile Asp Thr Pro Asn Trp Ser Phe Val Ala Ser Arg
65 70 75 80
Leu Phe Leu Tyr Asp Leu Tyr His Lys Leu Ser Gly Phe Thr Gly Tyr
85 90 95
Arg His Leu Lys Glu Tyr Phe Glu Asn Ala Glu Glu Lys Gly Arg Ile
100 105 110
Leu Lys Gly Phe Lys Glu Lys Phe Asp Leu Glu Phe Leu Asn Ser Gln
115 120 125
Ile Lys Pro Glu Arg Asp Phe Gln Phe Asn Tyr Leu Gly Ile Lys Thr
130 135 140
Leu Tyr Asp Arg Tyr Leu Leu Lys Asp Ala Asn Asn Asn Pro Ile Glu
145 150 155 160
Leu Pro Gln His Met Phe Met Ser Ile Ala Met Phe Leu Ala Gln Asn
165 170 175
Glu Gln Glu Pro Asn Lys Ile Ala Leu Glu Phe Tyr Glu Val Leu Ser
180 185 190
Lys Phe Glu Ala Met Cys Ala Thr Pro Thr Leu Ala Asn Ala Arg Thr
195 200 205
Thr Lys His Gln Leu Ser Ser Cys Tyr Ile Gly Ser Thr Pro Asp Asn
210 215 220
Ile Glu Gly Ile Phe Asp Ser Tyr Lys Glu Met Ala Leu Leu Ser Lys
225 230 235 240
Tyr Gly Gly Gly Ile Gly Trp Asp Phe Ser Leu Val Arg Ser Ile Gly
245 250 255
Ser Tyr Ile Asp Gly His Lys Asn Ala Ser Ala Gly Thr Ile Pro Phe
260 265 270
Leu Lys Ile Ala Asn Asp Val Ala Ile Ala Val Asp Gln Leu Gly Thr
275 280 285
Arg Lys Gly Ala Ile Ala Val Tyr Leu Glu Ile Trp His Ile Asp Trp
290 295 300
Arg Asp Pro Leu Ile Gln Gly Arg Glu Val Gly Val Glu Thr Arg Gly
305 310 315 320
Ala Leu Asp Leu Cys Pro Ala Ser Val Gly Gly Arg Phe Val Phe Glu
325 330 335
Lys Gly Phe Arg Thr Met Arg Cys Gly Leu Leu Tyr Asp Pro Tyr Glu
340 345 350
Cys Lys Asp Leu Thr Asp Leu Tyr Gly Gln Asp Phe Glu Lys Arg Tyr
355 360 365
Leu Glu Tyr Glu Lys Asp Pro Lys Ile Ile Lys Glu Tyr Ile Asn Ala
370 375 380
Lys Asp Leu Trp Thr Lys Ile Leu Met Asn Tyr Phe Glu Ala Gly Leu
385 390 395 400
Pro Phe Leu Ala Phe Lys Asp Asn Ala Asn Arg Cys Asn Pro Asn Ala
405 410 415
His Ala Gly Ile Ile Arg Ser Ser Asn Leu Cys Thr Glu Ile Phe Gln
420 425 430
Asn Thr Ala Pro Asn His Tyr Tyr Met Gln Ile Glu Tyr Thr Asp Gly
435 440 445
Thr Ile Glu Phe Phe Glu Glu Lys Gln Leu Val Thr Thr Asp Ser Asn
450 455 460
Ile Thr Lys Cys Ala Asn Lys Leu Thr Ser Thr Asp Ile Leu Lys Gly
465 470 475 480
Lys Gln Ile Tyr Ile Ala Thr Lys Val Ala Lys Asp Gly Gln Thr Ala
485 490 495
Val Cys Asn Leu Ala Ser Ile Asn Leu Ser Lys Ile Asn Thr Glu Glu
500 505 510
Asp Ile Lys Arg Val Val Pro Ile Met Val Arg Leu Leu Asp Asn Val
515 520 525
Ile Asp Leu Asn Phe Tyr Pro Asn Arg Lys Val Lys Ala Thr Asn Leu
530 535 540
Gln Asn Arg Ala Ile Gly Leu Gly Val Met Gly Glu Ala Gln Met Leu
545 550 555 560
Ala Glu His Lys Ile Ala Trp Gly Ser Lys Glu His Leu Gln Lys Ile
565 570 575
Asp Ala Leu Met Glu Gln Ile Ser Tyr His Ala Ile Asp Thr Ser Ala
580 585 590
Asn Leu Ala Lys Glu Lys Gly Val Tyr Lys Asp Phe Glu Asn Ser Gln
595 600 605
Trp Ser Lys Gly Ile Phe Pro Ile Asp Lys Ala Asn Asn Glu Ala Leu
610 615 620
Lys Leu Thr Glu Lys Gly Leu Phe Tyr His
625 630
(2) INFORMATION FOR SEQ ID NO: 381:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 225 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 25a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 381:
Met Phe Leu Leu Arg His Leu Thr Ser Ala Cys Val Phe Leu Ala Ser
1 5 10 15
Lys Cys Leu Pro Asp Ser Phe Val Leu Val Ala Leu Leu Ser Phe Ile
20 25 30
Val Phe Val Leu Val Tyr Gly Leu Thr Gly Gln Asp Ala Phe Ser Val
35 40 45
Ile Ser Ser Trp Gly Asn Gly Ala Trp Thr Leu Leu Gly Phe Ser Met
50 55 60
Gln Met Ala Leu Ile Leu Val Leu Gly Gln Ala Leu Ala Ser Ala Lys
65 70 75 80
Leu Val Gln Lys Leu Leu Lys Tyr Leu Ala Ser Leu Pro Lys Gly Tyr
85 90 95
Tyr Thr Ala Leu Trp Leu Val Thr Phe Leu Ser Leu Ile Ala Asn Trp
100 105 110
Ile Asn Trp Gly Phe Gly Leu Val Ile Ser Ala Ile Phe Ala Lys Glu
115 120 125
Ile Ala Lys Asn Val Lys Gly Val Asp Tyr Arg Leu Leu Ile Ala Ser
130 135 140
Ala Tyr Ser Gly Phe Val Ile Trp His Gly Gly Leu Ser Gly Ser Ile
145 150 155 160
Pro Leu Ser Val Ala Thr Gln Asn Glu Asn Leu Ser Lys Ile Ser Ala
165 170 175
Gly Val Ile Glu Lys Ala Ile Pro Ile Ser Gln Thr Ile Phe Ser Ala
180 185 190
Tyr Asn Leu Ile Ile Ile Gly Ile Ile Leu Val Gly Leu Pro Phe Leu
195 200 205
Met Ala Ile Ile His Pro Lys Lys Glu Glu Ile Val Glu Ile Asp Ala
210 215 220
Lys
225
(2) INFORMATION FOR SEQ ID NO:382:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 232 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 25b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:382:
Met Asn Lys Val Ile Thr Asn Leu Asn Lys Ala Leu Ser Gly Leu Lys
1 5 10 15
Asn Gly Asp Thr Ile Leu Val Gly Gly Phe Gly Leu Cys Gly Ile Pro
20 25 30
Glu Tyr Ala Ile Asp Tyr Ile Tyr Lys Lys Gly Ile Lys Asp Leu Ile
35 40 45
Val Val Ser Asn Asn Cys Gly Val Asp Asp Phe Gly Leu Gly Ile Leu
50 55 60
Leu Glu Lys Lys Gln Ile Lys Lys Ile Ile Ala Ser Tyr Val Gly Glu
65 70 75 80
Asn Lys Ile Phe Glu Ser Gln Met Leu Asn Gly Glu Ile Glu Phe Val
85 90 95
Leu Thr Pro Gln Gly Thr Leu Ala Glu Asn Leu Arg Ala Gly Gly Ala
100 105 110
Gly Ile Pro Ala Tyr Tyr Thr Pro Thr Gly Val Gly Thr Leu Ile Ala
115 120 125
Gln Gly Lys Glu Ser Arg Glu Phe Asn Gly Lys Glu Tyr Ile Leu Glu
130 135 140
Arg Ala Ile Thr Gly Asp Tyr Gly Leu Ile Lys Ala Tyr Lys Ser Asp
145 150 155 160
Thr Leu Gly Asn Leu Val Phe Arg Lys Thr Ala Arg Asn Phe Asn Pro
165 170 175
Leu Cys Ala Met Ala Ser Lys Ile Cys Val Ala Glu Val Glu Glu Ile
180 185 190
Val Pro Ala Gly Glu Leu Asp Pro Asp Glu Ile His Leu Pro Gly Ile
195 200 205
Tyr Val Gln His Ile Tyr Lys Gly Glu Lys Phe Glu Lys Arg Ile Glu
210 215 220
Lys Ile Thr Thr Arg Ser Ala Lys
225 230
(2) INFORMATION FOR SEQ ID NO: 383:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 207 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 25c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 383:
Met Arg Glu Ala Ile Ile Lys Arg Ala Ala Lys Glu Leu Lys Glu Gly
1 5 10 15
Met Tyr Val Asn Leu Gly Ile Gly Leu Pro Thr Leu Val Ala Asn Glu
20 25 30
Val Ser Gly Met Asn Ile Val Phe Gln Ser Glu Asn Gly Leu Leu Gly
35 40 45
Ile Gly Ala Tyr Pro Leu Glu Gly Gly Val Asp Ala Asp Leu Ile Asn
50 55 60
Ala Gly Lys Glu Thr Val Thr Val Val Pro Gly Ala Ser Phe Phe Asn
65 70 75 80
Ser Ala Asp Ser Phe Ala Met Ile Arg Gly Gly His Ile Asp Leu Ala
85 90 95
Ile Leu Gly Gly Met Glu Val Ser Gln Asn Gly Asp Leu Ala Asn Trp
100 105 110
Met Ile Pro Lys Lys Leu Ile Lys Gly Met Gly Gly Ala Met Asp Leu
115 120 125
Val His Gly Ala Lys Lys Val Ile Val Ile Met Glu His Cys Asn Lys
130 135 140
Tyr Gly Glu Ser Lys Val Lys Lys Glu Cys Ser Leu Pro Leu Thr Gly
145 150 155 160
Lys Gly Val Val His Gln Leu Ile Thr Asp Leu Ala Val Phe Glu Phe
165 170 175
Ser Asn Asn Ala Met Lys Leu Val Glu Leu Gln Glu Gly Val Ser Leu
180 185 190
Asp Gln Val Arg Glu Lys Thr Glu Ala Glu Phe Glu Val His Leu
195 200 205
(2) INFORMATION FOR SEQ ID NO: 384:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 148 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 25d, antigen 4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 384:
Asn Ser Ser Gly Ile Asn Asp Gly Ala Ser Ile Ile Ile Leu Cys Ser
1 5 10 15
Ala Lys Lys Ala Gln Lys Leu Gly Leu Lys Ala Met Ala Thr Ile Arg
20 25 30
Gly Phe Gly Leu Gly Gly Cys Ser Pro Asp Ile Met Gly Ile Cys Pro
35 40 45
Ile Ile Ala Ile Lys Asn Asn Leu Lys Asn Val Lys Met Asn Leu Asn
50 55 60
Asp Ile Asn Leu Phe Glu Leu Asn Glu Ala Phe Ala Ala Gln Ser Leu
65 70 75 80
Ala Val Leu Lys Glu Leu Glu Leu Asn Pro Asn Ile Val Asn Val Asn
85 90 95
Gly Gly Ala Ile Ala Ile Gly His Pro Ile Gly Ala Ser Gly Ala Arg
100 105 110
Ile Leu Val Thr Leu Leu His Glu Met Lys Lys Ser Gly His Gly Val
115 120 125
Gly Cys Ala Ser Leu Cys Val Gly Gly Gly Gln Gly Leu Ser Val Val
130 135 140
Leu Glu Gln Lys
145
(2) INFORMATION FOR SEQ ID NO: 385:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 26, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 385:
Met Ser Glu Gln Arg Lys Glu Ser Leu Gln Asn Asn Pro Asn Leu Ser
1 5 10 15
Lys Lys Asp Val Lys Ile Val Glu Lys Ile Leu Ser Lys Asn Asp Ile
20 25 30
Lys Ala Ala Glu Met Lys Glu Arg Tyr Leu Lys Glu Gly Leu Tyr Val
35 40 45
Leu Asn Phe Met Ser Ser Pro Gly Ser Gly Lys Thr Thr Met Leu Glu
50 55 60
Asn Leu Ala Asp Phe Lys Asp Phe Lys Phe Cys Val Val Glu Gly Asp
65 70 75 80
Leu Gln Thr Asn Arg
85
(2) INFORMATION FOR SEQ ID NO:386:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 196 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 27, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 386:
Tyr Met Pro Met Ile Val Glu Ser Val Tyr Met Met Leu Ala Cys Ala
1 5 10 15
Arg Ile Gly Ala Ile His Ser Ile Val Phe Ala Gly Phe Ser Pro Glu
20 25 30
Ala Leu Arg Asp Arg Ile Asn Asp Ala Gln Ala Lys Leu Val Ile Thr
35 40 45
Ala Asp Gly Thr Phe Arg Lys Gly Lys Pro Tyr Met Leu Lys Leu Ala
50 55 60
Leu Asp Lys Ala Leu Glu Asn Asn Ala Cys Pro Ser Val Glu Lys Ala
65 70 75 80
Leu Ile Val Ile Arg Asn Ala Arg Glu Ile Asp Tyr Val Arg Gly Arg
85 90 95
Asp Phe Val Tyr Asn Glu Met Val Asn Tyr Gln Ser Asp Lys Cys Glu
100 105 110
Pro Glu Met Met Asp Ser Glu Asp Pro Leu Phe Leu Leu Tyr Thr Ser
115 120 125
Gly Ser Thr Gly Lys Pro Lys Gly Val Gln His Ser Ser Ala Gly Tyr
130 135 140
Leu Leu Trp Ala Gln Met Thr Met Glu Trp Val Phe Asp Ile Arg Asp
145 150 155 160
Asn Asp Asn Phe Trp Cys Thr Ala Asp Ile Gly Trp Ile Thr Gly His
165 170 175
Thr Tyr Val Val Tyr Gly Pro Leu Ala Cys Gly Ala Thr Thr Leu Ile
180 185 190
Leu Glu Gly Thr
195
(2) INFORMATION FOR SEQ ID NO: 387: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 160 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 28a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 387:
Met Asn Leu Ser Glu He Glu Glu Leu He Lys Glu Phe Lys Ala Ser 1 5 10 15 Asp Leu Gly His Leu Lys Leu Lys His Glu His Phe Glu Leu Val Leu 20 25 30
Asp Lys Glu Ser Ala Tyr Ala Lys Lys Asn Ala Leu Asn Pro Ala His
35 40 45
Ser Gin Ala Ser He Gin Ala Pro He Met Val Glu Ala Ser Met Pro 50 55 60
Ser Thr Gin Ala Pro Val Pro Met Val Cys Thr Pro He Val Asp Lys 65 70 75 80
Lys Glu Asp Phe Val Leu Ser Pro Met Val Gly Thr Phe Tyr His Ala 85 90 95 Pro Ser Pro Gly Ala Glu Pro Tyr Val Lys Ala Gly Asp Thr Leu Lys 100 105 110
Lys Gly Gin He Val Gly He Val Glu Ala Met Lys He Met Asn Glu
115 120 125
He Glu Val Glu Tyr Pro Cys Lys Val Val Ser Val Glu Val Gly Asp 130 135 140
Ala Gin Pro Val Glu Tyr Gly Thr Lys Leu He Lys Val Glu Lys Leu 145 150 155 160
(2) INFORMATION FOR SEQ ID NO:388:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 28b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 388:
Met Asn Lys Val Asn Lys Glu Asn Lys Lys Val Glu Lys Lys Glu Leu
1 5 10 15
Ser Arg Ile Leu Ile Ala Asn Arg Gly Glu Ile Ala Leu Arg Ala Ile
20 25 30
Gln Thr Ile Gln Asp
35
(2) INFORMATION FOR SEQ ID NO: 389: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 190 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 28c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 389: Val Arg Met Gly Leu Lys Ala Asp Ser Trp He Lys Lys Met Ser Leu 1 5 10 15
Glu His Gly Met He Ser Pro Phe Cys Glu Lys Gin Val Gly Lys Asn
20 25 30
Val He Ser Tyr Gly Leu Ser Ser Tyr Gly Tyr Asp He Arg Val Gly 35 40 45
Ser Glu Phe Met Leu Phe Asp Asn Lys Asn Ala Leu He Asp Pro Lys
50 55 60
Asn Phe Asp Pro Asn Asn Ala Thr Lys He Asp Ala Ser Lys Glu Gly
65 70 75 80 Tyr Phe He Leu Pro Ala Asn Ala Phe Ala Leu Ala His Thr He Glu
85 90 95
Tyr Phe Lys Met Pro Lys Asp Thr Leu Ala He Cys Leu Gly Lys Ser 100 105 110
Thr Tyr Ala Arg Cys Gly He He Val Asn Val Thr Pro Phe Glu Pro
115 120 125
Glu Phe Glu Gly Tyr He Thr He Glu He Ser Asn Thr Thr Asn Leu 130 135 140
Pro Ala Lys Val Tyr Ala Asn Glu Gly He Ala Gin Val Val Phe Leu 145 150 155 160
Gin Gly Asp Glu Met Cys Glu Gin Ser Tyr Lys Asp Arg Gly Gly Lys 165 170 175 Tyr Gin Gly Gin Val Gly He Thr Leu Pro Lys He Leu Lys 180 185 190
(2) INFORMATION FOR SEQ ID NO:390: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 378 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 28d, antigen 4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 390: Met Leu Arg Phe Val Ser Lys Thr He Cys Leu Ser Leu He Gly Leu 1 5 10 15
Phe Asn Pro Leu Glu Ala Phe Gin Lys His Gin Lys Asp Gly Phe Phe
20 25 30
He Glu Ala Gly Phe Glu Thr Gly Leu Leu Glu Gly Val Gin Asn Lys 35 40 45
Glu Gin Thr He Thr Thr Gin Lys He Gin Lys Asn Pro Leu Thr His
50 55 60
Pro Gin He Lys Glu Gin Pro Lys Glu Gin Asn Lys Ser Asp Thr Ala 65 70 75 80 Thr Pro Gin Ser Ala Tyr Gly Lys Tyr Tyr He Pro Gin Ser Thr He
85 90 95
Leu Lys Asn Ala Thr Ala Leu Phe Thr Thr Asp Asn He Glu Lys Asn
100 105 110
Gly Leu Thr Phe Tyr Ser Gin Asn Pro Val Tyr Ala Asn Met Val Asn 115 120 125
Gly Ser Val Thr He Gin Asn Phe Leu Pro Tyr Asn Leu Asn Asn He
130 135 140
Glu Leu Ser Tyr Thr Asp Ala Gin Gly Lys Val Val Asn Leu Gly Val 145 150 155 160 He Glu Thr He Pro Lys Asp Ser Gin He He Leu Pro Ala Ser Leu
165 170 175
Phe Asn Asp Glu Phe Glu Gin Ala Asp Ser Phe Asn Tyr Gin Gin Leu
180 185 190
Gin Ala Thr Ala Thr Gin Phe Ser Asp Ala Asn Thr Gin Ser Leu Phe 195 200 205
Glu Lys Leu Ser Gin He Thr Thr Asn Val Thr Met Ser Tyr Glu Asn
210 215 220
Ala Asp Thr Asn Asn Phe Lys Gly Asn Cys Asn Asp Cys Val Ser Asp 225 230 235 240 Phe Thr Pro Gin Thr Ala Glu Glu Leu Thr Asn Leu Met Leu Asp Met
245 250 255
He Ala Val Phe Asp Ser Lys Ser Trp Glu Glu Ala He Leu Asn Ala
260 265 270
Pro Phe Gin Phe Ser Asn Ser Pro Ser Glu Cys Gly Ser Asp Phe Pro 275 280 285
Lys Cys Val Asn Pro Phe Asn Asn Gly Arg Val Ala Pro He Tyr Glu
290 295 300
Lys Tyr Val Leu Thr Pro Gin Ser Val He Asp Ala Phe Arg Arg Ala 305 310 315 320 He Asn Leu Glu Val Asn He Met Lys Ser Gly Phe Leu Gly Leu Gly
325 330 335
Tyr Glu Leu Asp Asp Asn Asp Gly Asn Leu Gly He Ala Ala Ser Ala 340 345 350
Leu Asn Pro Glu Lys Leu Phe Gly Lys Thr Leu Asn Lys Val Asp He
355 360 365
Val Glu Leu Arg Asp He He His Glu Phe 370 375
(2) INFORMATION FOR SEQ ID NO: 391:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 136 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 29a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 391:
Met Ala Ile Phe Asp Asn Asn Asn Lys Ser Ala Asn Ala Lys Thr Gly
1 5 10 15
Pro Ala Thr Ile Ile Ala Gln Gly Thr Lys Ile Lys Gly Glu Leu His
20 25 30
Leu Asp Tyr His Leu His Ile Asp Gly Glu Leu Glu Gly Val Val His
35 40 45
Ser Lys Ser Thr Val Val Ile Gly Gln Thr Gly Ser Val Val Gly Glu
50 55 60
Ile Phe Ala Asn Lys Leu Val Val Asn Gly Lys Phe Thr Gly Thr Val
65 70 75 80
Glu Ala Glu Val Val Glu Ile Met Pro Leu Gly Arg Leu Asp Gly Lys
85 90 95
Ile Ser Ser Gln Glu Leu Val Val Glu Arg Lys Gly Ile Leu Ile Gly
100 105 110
Glu Thr Arg Pro Lys Asn Ile Gln Gly Gly Ala Leu Leu Ile Asn Glu
115 120 125
Gln Glu Lys Lys Ile Glu Asn Lys
130 135
(2) INFORMATION FOR SEQ ID NO: 392: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 142 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 29b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 392: Gly Leu Asp Leu Ser Thr Ala He Asn Thr Pro Val Tyr Ala Ser Ala 1 5 10 15
Ser Gly Val Val Gly Leu Ala Ser Lys Gly Trp Asn Gly Gly Tyr Gly
20 25 30
Asn Leu He Lys Val Phe His Pro Phe Gly Phe Lys Thr Tyr Tyr Ala 35 40 45
His Leu Asn Lys He Val Val Lys Thr Gly Glu Phe Val Lys Lys Gly
50 55 60
Gin Leu He Gly Tyr Ser Gly Asn Thr Gly Met Ser Thr Gly Pro His 65 70 75 80 Leu His Tyr Glu Val Arg Phe Leu Asp Gin Pro He Asn Pro Met Ser
85 90 95
Phe Thr Lys Trp Asn Met Lys Asn Phe Glu Glu Val Leu Asn Lys Glu
100 105 110
Arg Ser He Arg Trp Gin Ser Leu He Thr He He Asn Arg Leu Met 115 120 125
Gln Lys Gln Asp Gln Arg Leu Ser Ser Leu Lys Ala Gln Lys
130 135 140
(2) INFORMATION FOR SEQ ID NO:393:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 29c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 393:
Met Ile Gln Ser Ser Leu Tyr Arg Ala Leu Ile Arg Gly Phe Asp Tyr
1 5 10 15
Gln Ile Leu Ala Cys Lys Asp Phe Lys Glu Ser Glu Leu Ala Lys Glu
20 25 30
Val Ile Ser Tyr Phe Lys Pro Asn Thr Lys Ala Ile Leu Phe Pro Glu
35 40 45
Phe Arg Ala Lys Lys Asn Asp Asp Leu Arg Ser Phe Phe Glu Glu Phe
50 55 60
Leu Gln Leu Leu Gly
65
(2) INFORMATION FOR SEQ ID NO: 394:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 238 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 30a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 394:
Glu Phe Pro Gly Asp Asp Thr Pro He Val Ala Gly Ser Ala Leu Arg
1 5 10 15
Ala Leu Glu Glu Ala Lys Ala Gly Asn Val Gly Glu Trp Gly Glu Lys 20 25 30 Val Leu Lys Leu Met Ala Glu Val Asp Ala Tyr He Pro Thr Pro Glu 35 40 45
Arg Asp Thr Glu Lys Thr Phe Leu Met Pro Val Glu Asp Val Phe Ser
50 55 60
He Ala Gly Arg Gly Thr Val Val Thr Gly Arg He Glu Arg Gly Val 65 70 75 80
Val Lys Val Gly Asp Glu Val Glu He Val Gly He Arg Ala Thr Gin
85 90 95
Lys Thr Thr Val Thr Gly Val Glu Met Phe Arg Lys Glu Leu Glu Lys 100 105 110 Gly Glu Ala Gly Asp Asn Val Gly Val Leu Leu Arg Gly Thr Lys Lys 115 120 125
Glu Glu Val Glu Arg Gly Met Val Leu Cys Lys Pro Gly Ser He Thr
130 135 140
Pro His Lys Lys Phe Glu Gly Glu He Tyr Val Leu Ser Lys Glu Glu 145 150 155 160
Gly Gly Arg His Thr Pro Phe Phe Thr Asn Tyr Arg Pro Gin Phe Tyr
165 170 175
Val Arg Thr Thr Asp Val Thr Gly Ser He Thr Leu Pro Glu Gly Val 180 185 190 Glu Met Val Met Pro Gly Asp Asn Val Lys He Thr Val Glu Leu He 195 200 205
Ser Pro Val Ala Leu Glu Leu Gly Thr Lys Phe Ala He Arg Glu Gly
210 215 220
Gly Arg Thr Val Gly Ala Gly Val Val Ser Asn He He Glu 225 230 235
(2) INFORMATION FOR SEQ ID NO: 395:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 30b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 395:
Met Lys Val Lys Ile Gly Leu Lys Cys Ser Asp Cys Glu Asp Ile Asn
1 5 10 15
Tyr Ser Thr Thr Lys Asn Ala Lys Thr Asn Thr Glu Lys Leu Glu Leu
20 25 30
Lys Lys Phe Cys Pro Arg Glu Asn Lys His Thr Leu His Lys Glu Ile
35 40 45
Lys Leu Lys Ser
50
(2) INFORMATION FOR SEQ ID NO: 396:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 174 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 31, antigen (xi) SEQUENCE DESCRIPTION: SEQ ID NO:396:
Glu Ala Arg Lys Leu Leu Glu Gin Glu Val Lys Lys Ser Val Lys Ala
1 5 10 15
Tyr Leu Asp Cys Val Ser Lys Ala Arg Asn Glu Lys Glu Lys Gin Glu 20 25 30
Cys Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Phe Leu Glu Lys Glu
35 40 45
Leu Gin Gin Lys Asp Lys Ala He Lys Asp Cys Leu Lys Asn Ala Asp 50 55 60 Pro Asn Asp Arg Ala Ala He Met Lys Cys Leu Asp Gly Leu Ser Asp 65 70 75 80
Glu Glu Lys Leu Lys Tyr Leu Gin Glu Ala Arg Glu Lys Ala Val Ala
85 90 95
Asp Cys Leu Ala Met Ala Lys Thr Asp Glu Glu Lys Arg Lys Cys Gin 100 105 110
Asn Leu Tyr Ser Asp Leu He Gin Glu He Gin Asn Lys Arg Thr Gin
115 120 125
Ser Lys Gin Asn Gin Leu Ser Lys Thr Glu Arg Leu His Gin Ala Ser 130 135 140 Glu Cys Leu Asp Asn Leu Asp Asp Pro Thr Asp Gin Gin Val He Glu 145 150 155 160
Gin Cys Leu Glu Gly Leu Ser Asp Ser Glu Arg Ala Leu He 165 170 (2) INFORMATION FOR SEQ ID NO:397:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 243 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 32a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:397:
Met Val Gin Phe Gin Asn Thr Leu He Lys Phe His Ala He Ser Phe 1 5 10 15
Phe Glu Asn Ser Asn Leu He Tyr Asn Ala Lys Leu Asn Lys Thr Cys
20 25 30
Tyr Lys Glu Asn Ser Asn Thr He He Leu Arg He Lys Met Leu Thr 35 40 45
Gin Glu Asp Val Leu Asn Ala Leu Lys Thr He He Tyr Pro Asn Phe
50 55 60
Glu Lys Asp He Val Ser Phe Gly Phe Val Lys Ser He Ala Leu His 65 70 75 80 Asp Asn Gin Leu Gly Leu Leu He Glu He Pro Ser Ser Ser Glu Glu
85 90 95
Thr Ser Ala He Leu Arg Glu Asn He Ser Glu Ala Val Gin Lys He
100 105 110
Gly Val Lys Ala Leu Asn Leu Asp He Lys Thr Pro Pro Lys Pro Gin 115 120 125
Ala Pro Lys Pro Thr Thr Lys Asn Leu Ala Lys Asn He Lys His Val
130 135 140
Val Met He Ser Ser Gly Lys Gly Gly Val Gly Lys Ser Thr Thr Ser 145 150 155 160 Val Asn Leu Ser He Ala Leu Ala Asn Leu Asn Gin Lys Val Gly Leu
165 170 175
Leu Asp Ala Asp Val Tyr Gly Pro Asn He Pro Arg Met Met Gly Leu
180 185 190
Gin Ser Ala Asp Val He Met Asp Pro Ser Gly Lys Lys Leu He Pro 195 200 205
Leu Lys Ala Phe Gly Val Ser Val Met Ser Met Gly Leu Leu Tyr Asp
210 215 220
Glu Gly Gin Ser Leu He Trp Arg Gly Pro Met Leu Met Arg Ala He 225 230 235 240 Glu Gin Met
(2) INFORMATION FOR SEQ ID NO: 398: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 106 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 32b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 398: Met Lys Phe Tyr Lys Arg Val Leu Lys Leu His His Phe Arg Asn Leu 1 5 10 15
Gly Arg Asn Ser Pro Met Glu Leu Leu Leu Asn Ser Ser Phe Glu Lys
20 25 30
His Gly Gly Leu Val Val Leu Val Gly Glu Asn Asn Val Gly Lys Ser 35 40 45
Asn Val Leu Glu Ala Leu Lys He Phe Asn Asp Ala Asp Val Lys Leu
50 55 60
Cys Ser Glu Lys Asp Tyr Phe Lys Ala His Glu Ser Glu Asp Ala Val 65 70 75 80 Leu Asn Leu Glu Glu Glu Thr He Leu Asp His Lys Thr He Gly Phe
85 90 95
Ser Cys Val Asp Leu Lys He Gin Thr Lys 100 105 (2) INFORMATION FOR SEQ ID NO: 399:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 267 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 33, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 399: Phe Asn Asp Asp Pro Asn Arg Thr Leu Tyr Asn Tyr Leu Asn He Ala 1 5 10 15
Glu He Glu Asp Lys Asn Pro Leu Arg Ala Phe Tyr Glu Cys He Ser
20 25 30
Asn Gly Gly Asn Tyr Glu Glu Cys Leu Lys Leu He Lys Asp Lys Lys 35 40 45
Leu Gin Asp Gin Met Lys Lys Thr Leu Glu Ala Tyr Asn Asp Cys He
50 55 60
Lys Asn Ala Lys Thr Glu Glu Glu Arg He Lys Cys Leu Asp Leu He 65 70 75 80 Lys Asp Glu Asn Leu Lys Lys Ser Leu Leu Asn Gin Gin Lys Val Gin
85 90 95
Val Ala Leu Asp Cys Leu Lys Asn Ala Lys Thr Asp Glu Glu Arg Asn
100 105 110
Glu Cys Leu Lys Leu He Asn Asp Pro Asp He Arg Glu Lys Phe Arg 115 120 125
Lys Glu Leu Glu Leu Gin Lys Glu Leu Gin Glu Tyr Lys Asp Cys He
130 135 140
Lys Asn Ala Lys Thr Glu Ala Glu Lys Asn Glu Cys Leu Lys Gly Leu 145 150 155 160 Ser Lys Glu Ala He Glu Arg Leu Lys Gin Gin Ala Leu Asp Cys Leu
165 170 175
Lys Asn Ala Lys Thr Asp Glu Glu Arg Asn Glu Cys Leu Lys Asn He
180 185 190
Pro Gin Asp Leu Gin Lys Glu Leu Leu Ala Asp Met Ser Val Lys Ala 195 200 205
Tyr Lys Asp Cys Val Ser Lys Ala Arg Asn Glu Lys Glu Lys Lys Glu
210 215 220
Cys Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Lys Leu Glu Gin Gin 225 230 235 240 Val Leu Asp Cys Leu Lys Asn Ala Lys Thr Asp Glu Glu Arg Lys Lys
245 250 255
Cys Leu Lys Asp Leu Pro Lys Asp Leu Gin Ser 260 265 (2) INFORMATION FOR SEQ ID NO: 400:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 248 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 34a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 400:
Met Gin Phe Thr Gly Lys Asn Val Leu He Thr Gly Ala Ser Lys Gly
1 5 10 15
He Gly Ala Glu He Ala Lys Thr Leu Ala Ser Met Gly Leu Lys Val 20 25 30
Trp He Asn Tyr Arg Ser Asn Ala Glu Val Ala Asp Ala Leu Lys Asn
35 40 45
Glu Leu Glu Glu Lys Gly Cys Lys Ala Ala Val He Lys Phe Asp Ala 50 55 60 Ala Ser Glu Ser Asp Phe He Glu Ala He Gin Thr He Val Gin Ser 65 70 75 80
Asp Gly Gly Leu Ser Tyr Asn Leu Val Asn Asn Ala Gly Val Val Arg
85 90 95
Asp Lys Leu Ala He Lys Met Lys Thr Glu Asp Phe His His Val He 100 105 110
Asp Asn Asn Leu Thr Ser Ala Phe He Gly Cys Arg Glu Ala Leu Lys 115 120 125 Val Met Ser Lys Ser Arg Phe Gly Ser Val Val Asn Val Ala Ser He
130 135 140
He Gly Glu Arg Gly Asn Met Gly Gin Thr Asn Tyr Ser Ala Ser Lys 145 150 155 160 Gly Gly Met He Ala Met Ser Lys Ser Phe Ala Tyr Glu Gly Ala Leu
165 170 175
Arg Asn He Arg Phe Asn Ser Val Thr Pro Gly Phe He Glu Thr Asp
180 185 190
Met Asn Ala Asn Leu Lys Asp Glu Leu Lys Ala Asp Tyr Val Lys Asn 195 200 205
He Pro Leu Asn Arg Leu Gly Ser Ala Lys Glu Val Ala Glu Ala Val
210 215 220
Ala Phe Leu Leu Ser Asp His Ser Ser Tyr He Thr Gly Glu Thr Leu 225 230 235 240 Lys Val Asn Gly Gly Leu Tyr Met
245
(2) INFORMATION FOR SEQ ID NO: 401:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 70 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 34b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 401:
Met Pro Gly Ile Lys Val Arg Glu Gly Asp Ala Phe Asp Glu Ala Tyr
1 5 10 15
Arg Arg Phe Lys Lys Gln Thr Asp Arg Asn Leu Val Val Thr Glu Cys
20 25 30
Arg Ala Arg Arg Phe Phe Glu Ser Lys Thr Glu Lys Arg Lys Lys Gln
35 40 45
Lys Ile Ser Ala Lys Lys Lys Val Leu Lys Arg Leu Tyr Met Leu Arg
50 55 60
Arg Tyr Glu Ser Arg Leu
65 70
(2) INFORMATION FOR SEQ ID NO: 402:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 343 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 35a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 402:
Met Ala Thr Lys Leu Thr Pro Lys Gin Lys Ala Gin Leu Asn Glu Leu 1 5 10 15 Ser Met Ser Glu Lys He Ala He Leu Leu He Gin Val Gly Glu Asp
20 25 30
Thr Thr Gly Glu He Leu Arg His Leu Asp He Asp Ser He Thr Glu
35 40 45
He Ser Lys Gin He Val Gin Leu Asn Gly Thr Asp Lys Gin He Gly 50 55 60
Ala Ala Val Leu Glu Glu Phe Phe Ala He Phe Gin Ser Asn Gin Tyr
65 70 75 80
He Asn Thr Gly Gly Leu Glu Tyr Ala Arg Glu Leu Leu Thr Arg Thr 85 90 95 Leu Gly Ser Glu Glu Ala Lys Lys Val Met Asp Lys Leu Thr Lys Ser
100 105 110
Leu Gin Thr Gin Lys Asn Phe Ala Tyr Leu Gly Lys He Lys Pro Pro 115 120 125
Gin Leu Ala Asp Phe He He Asn Glu His Pro Gin Thr He Ala Leu
130 135 140
He Leu Ala His Met Glu Ala Pro Asn Ala Ala Glu Thr Leu Ser Tyr 145 150 155 160
Phe Pro Asp Glu Met Lys Ala Glu He Ser He Arg Met Ala Asn Leu
165 170 175
Gly Glu He Ser Pro Gin Val Leu Lys Arg Val Ser Thr Val Leu Glu 180 185 190 Asn Lys Leu Glu Ser Leu Thr Ser Tyr Lys He Glu Val Gly Gly Leu 195 200 205
Arg Ala Val Ala Glu He Phe Asn Arg Leu Gly Gin Lys Ser Ala Lys
210 215 220
Thr Thr Leu Ala Arg He Glu Ser Val Asp Asn Lys Leu Ala Gly Ala 225 230 235 240
He Lys Glu Met Met Phe Thr Phe Glu Asp He Val Lys Leu Asp Asn
245 250 255
Phe Ala He Arg Glu He Leu Lys Val Ala Asp Lys Lys Asp Leu Ser 260 265 270 Leu Ala Leu Lys Thr Ser Thr Lys Asp Leu Thr Asp Lys Phe Leu Asn 275 280 285
Asn Met Ser Ser Arg Ala Ala Glu Gin Phe Val Glu Glu Met Gin Tyr
290 295 300
Leu Gly Ala Val Lys He Lys Asp Val Asp Val Ala Gin Arg Lys He 305 310 315 320
He Glu He Val Gin Asn Leu Gin Glu Lys Gly Val He Gin Thr Gly
325 330 335
Glu Glu Glu Asp Val He Glu 340
(2) INFORMATION FOR SEQ ID NO:403:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 35b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 403:
Met Ser Leu Asn Ser Arg Lys Asn Leu Ile Gln Lys Asp His Leu Asn
1 5 10 15
Lys His Asp Ile Gln Lys Tyr Glu Phe Lys Ser Met Ala Asn Leu Pro
20 25 30
Pro Lys Thr Asn Pro Asn Gly Ala Ser Leu Glu Thr Pro Asn Leu Glu
35 40 45
Glu Pro Leu Glu Lys Lys Ala Ile Glu Asn Asp Leu Ile Asp Cys Leu
50 55 60
Leu Lys Lys Thr Asp Glu Leu Ser Ser His Leu Val Lys Leu Gln Met
65 70 75 80
(2) INFORMATION FOR SEQ ID NO:404:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 74 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 35c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:404:
Glu Glu Met Asp Glu Glu Glu Asp Glu Leu Asn Lys Leu Gly Asp Leu
1 5 10 15
Arg Lys Lys Val Glu Asp Gln Leu Gly Leu Asn Ala Thr Phe Ser Glu
20 25 30
Glu Glu Val Arg Tyr Glu Ile Ile Leu Glu Lys Ile Arg Gly Thr Leu
35 40 45
Lys Glu Arg Pro Asp Glu Ile Ala Thr Leu Phe Lys Leu Leu Ile Lys
50 55 60
Asp Glu Ile Ser Ser Asp Ser Ala Lys Gly
65 70
(2) INFORMATION FOR SEQ ID NO:405:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 170 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 36, antigen (xi) SEQUENCE DESCRIPTION: SEQ ID NO:405:
Asn Ser Tyr Asn Tyr Thr Ser Asp Lys Ala Gly Thr Tyr Tyr Leu Thr
1 5 10 15
Ser Asn He Lys Gly Phe Asn Gin Asn Asn Glu He Pro Gly Thr Tyr 20 25 30
Asn Ala Gin Asn Gin Pro Leu Gin Ala Leu His He Tyr Asn Gin Ala
35 40 45
He Thr Lys Gin Asp Leu Ser Val He Ala Ser Leu Gly Lys Glu Phe 50 55 60 Leu Pro Lys He Ala Asn Leu Leu Ser Ser Gly Ala Leu Asp Asn Leu 65 70 75 80
Asn Leu Asn Ser Pro Asp Ser Phe Glu Thr Leu Phe Gly He Phe Glu
85 90 95
Lys Tyr Gly He Thr Leu Thr Gin Ala Asn Trp Lys Ser Leu Leu Gly 100 105 110
He He Asn Asn Phe Ser Asn Thr Ala Asn Tyr His Phe Ser Gin Gly
115 120 125
Asn Leu Val Val Gly Ala He Lys Glu Gly Gin Thr Asn Thr Asn Ser 130 135 140 Val Val Trp Phe Gly Gly Asp Gly Tyr Lys Glu Pro Cys Ala Val Gly 145 150 155 160
Asp Asn Thr Cys Gin Met Phe Arg Gin Thr 165 170 (2) INFORMATION FOR SEQ ID NO:406:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 765 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 37a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 406:
Met Ala Asn Leu Leu Lys Asn Gly Lys Thr Leu Lys Gin Ala Arg Asp
1 5 10 15
Glu He Leu Ala Arg Thr Glu Lys Thr Gly His Tyr Asn Gly Leu Lys 20 25 30
Lys Leu Glu Phe Lys Glu Arg Asp Pro He Gly Tyr Glu Lys Met Phe
35 40 45
Ser Lys Leu Arg Gly Gly He Val His Ala Arg Glu Thr Ala Lys Arg 50 55 60 He Ala Ala Ser Pro He Val Glu Gin Glu Gly Glu Leu Cys Phe Thr 65 70 75 80
Leu Tyr Asn Ala Val Gly Asp Ser Val Leu Thr Ser Thr Gly He He 85 90 95
He His Val Gly Thr Met Gly Ser Ala He Lys Tyr Met Val Glu Asn
100 105 110
Asn Trp Glu Asp Asn Pro Gly He Asn Asp Lys Asp He Phe Thr Asn 115 120 125
Asn Asp Cys Ala He Gly Asn Val His Pro Cys Asp He Met Thr Leu
130 135 140
Val Pro He Phe His Asp Glu Lys Leu He Gly Trp Val Gly Gly Val 145 150 155 160 Thr His Val He Asp Thr Gly Ser Val Thr Pro Gly Ser Met Ser Thr
165 170 175
Gly Gin Val Gin Arg Phe Gly Asp Gly Tyr Met He Thr Cys Arg Lys
180 185 190
Thr Gly Ala Asn Asp Glu Ser Phe Lys Asp Trp Leu His Glu Ser Gin 195 200 205
Arg Ser Val Arg Thr Pro Lys Tyr Trp He Leu Asp Glu Arg Thr Arg
210 215 220
He Ala Gly Cys His Met He Arg Asp Leu Val Met Glu Val He Lys 225 230 235 240 Glu Asp Gly He Asp Ser Tyr Met Arg Phe He Asp Glu Val He Glu
245 250 255
Lys Glu Arg Arg Ser Leu He Ser Arg He Lys Ser Met Thr He Pro
260 265 270
Gly Asn Tyr Arg Lys Val Thr Phe Val Asp Val Pro Tyr Ala His Lys 275 280 285
Asp He Gly Val Cys Ser Glu Phe Ala Lys Leu Asp Thr He Met His
290 295 300
Ser Pro Val Glu He Thr He Asn Lys Asp Ala Thr Trp Lys Leu Asp 305 310 315 320 Phe Asp Gly Ala Ser Arg Trp Gly Trp His Ser Phe Asn Cys Asn Gin
325 330 335
Val Ser Phe Thr Ser Gly He Trp Val Met Met Thr Gin Thr Leu He
340 345 350
Pro Thr Ser Arg He Asn Asp Gly Ala Tyr Phe Ala Thr Gin Phe Arg 355 360 365
Leu Lys Lys Gly Thr Trp Met Asn Pro Asp Asp Arg Arg Thr Gly His
370 375 380
Ala Tyr Ala Trp His Phe Leu Val Ser Gly Trp Ser Ala Leu Trp Arg 385 390 395 400 Gly Leu Ser Gin Ala Tyr Tyr Ser Arg Gly Tyr Leu Glu Glu Val Asn
405 410 415
Ser Gly Asn Ala Asn Thr Ser Asn Trp Leu Gin Gly Gly Gly He Asn
420 425 430
Gin Asp Gly Glu He His Ala Val Asn Ser Phe Glu Thr Ser Ser Cys 435 440 445
Gly Thr Gly Ala Cys Ala He Lys Asp Gly Leu Asn His Ala Ala Ala
450 455 460
He Trp Asn Pro Glu Gly Asp Met Gly Asp Val Glu He Trp Glu Met 465 470 475 480 Ala Glu Pro Leu Leu Tyr Leu Gly Arg Asn Val Lys Ala Asn Thr Gly
485 490 495
Gly Tyr Gly Lys Tyr Arg Gly Gly Asn Gly Phe Glu Thr Leu Arg Met
500 505 510
Val Trp Gly Val His Asp Trp Thr Met Phe Phe Met Gly Asn Gly Tyr 515 520 525
Met Asn Ser Asp Trp Gly Met Met Gly Gly Tyr Pro Pro Ala Ser Gly
530 535 540
Tyr Arg Phe Glu Ala His Asn Thr Asp Leu Glu Asn Arg He Lys Asn 545 550 555 560 Asn Ala Ser Leu Pro Leu Gly Gly Asp Phe Asn Pro Thr Asp Arg Asp
565 570 575
Tyr Glu Lys His He Ser His Ala Ser Gin Val Lys Arg Asp Lys Gin
580 585 590
Cys He Thr Thr Glu Asn Cys Phe Asp Asn Tyr Asp Leu Tyr Leu Asn 595 600 605
Tyr He Lys Gly Gly Pro Gly Phe Gly Asp Pro He Glu Arg Asp Leu 610 615 620 Asn Ala He Leu Glu Asp Leu Asn Ser Lys Gin Leu Leu Pro Glu Tyr
625 630 635 640
Ala Tyr Lys Val Tyr Gly Ala Val Val Ser Gin Asn Lys Asp Gly Val
645 650 655 Trp Val Gly Asp Glu Ala Lys Thr Lys Ala Arg Arg Lys Glu He Leu
660 665 670
Glu Asn Arg Lys Ala Arg Ser He Pro Val Lys Gin Trp Met Glu Gin
675 680 685
Glu Arg Asn Ala He Leu Glu Lys Glu Ala Ser Lys Gin Val Lys His 690 695 700
Met Tyr Ala Thr Ser Phe Asp Leu Ser Pro Lys Phe Leu Ser Asp Phe 705 710 715 720
Lys Thr Phe Trp Asn Leu Pro Lys Ser Trp Thr Met Lys Glu Asp Glu 725 730 735 Leu Gly Val Phe Thr Tyr Gly Ser Lys Tyr Arg Met Asp Leu Ser Lys 740 745 750
Leu Pro Asp Val Arg Thr Val Leu Leu Val Asp Glu Lys 755 760 765 (2) INFORMATION FOR SEQ ID NO: 407:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 334 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 37b, antigen 2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:407:
He Asp Thr Val Ser Val Thr Asp Cys His He Val Leu Gly Tyr Leu
1 5 10 15
Asn Pro Asp Tyr Phe Leu Gly Gly Leu He Lys Leu Asp Val Asp Met 20 25 30
Ala Lys Lys His He Lys Glu Gin He Ala Asp Pro Leu Gly He He
35 40 45
Val Glu Asp Ala Ala Ala Gly Val He Glu Leu Leu Asp Pro Glu Leu 50 55 60 Lys Glu Tyr Leu Arg Ser Asn He Ser Ala Lys Gly Tyr Ser Pro Ser 65 70 75 80
Asp Phe Val Cys Phe Ser Tyr Gly Gly Ala Gly Pro Val His Thr Tyr
85 90 95
Gly Tyr Thr Glu Ser Leu Gly Phe Lys Asp Val Val Val Pro Ala Trp 100 105 110
Ala Ala Gly Phe Ser Ala Phe Gly Cys Ala Cys Ala Asp Phe Glu Tyr
115 120 125
Arg Tyr Asp Lys Ser Val Asp He Ala He Pro Gin Tyr Ser Ser Asp
130 135 140 Glu Ser Lys He Glu Ala Cys Lys He He Gin Asp Ala Trp Asp Glu
145 150 155 160
Leu Thr Leu Lys Val He Glu Glu Phe Lys He Asn Gly Phe Ser Gin
165 170 175
Lys Asp Val He Leu Arg Pro Gly Tyr Arg Met Gin Tyr Met Gly Gin 180 185 190
Leu Asn Asp Leu Glu He Thr Ser Pro Val Ser Lys Ala Ala Ser Val
195 200 205
Ala Asp Trp Glu Glu He Val Lys Glu Tyr Glu Lys Thr Tyr Ala Arg
210 215 220 Val Tyr Ser Glu Ser Ala Cys Ser Pro Glu Leu Gly Phe Ser Val Thr
225 230 235 240
Gly Val He Met Arg Gly Val Val Ala Thr Gin Lys Pro Val He Pro
245 250 255
Val Glu Lys Glu His Gly Ala Thr Pro Pro Lys Glu Ala Lys He Gly 260 265 270
Val Arg Lys Phe Tyr Arg His Lys Lys Trp Val Asp Ala Asp Val Trp 275 280 285 Gin Met Glu Lys Leu Leu Pro Gly Asn Glu Val He Gly Pro Ala He
290 295 300
Val Glu Ser Asp Ala Thr Thr Phe Val He Pro Lys Gly Phe Ala Thr 305 310 315 320 Arg Leu Asp Lys His Arg Leu Phe His Leu Lys Glu He Lys
325 330
(2) INFORMATION FOR SEQ ID NO:408:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 119 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:408:
Lys Glu Tyr Glu Lys Thr Tyr Ala Arg Val Tyr Ser Glu Ser Ala Cys
1 5 10 15
Ser Pro Glu Leu Gly Phe Ser Val Thr Gly Val Ile Met Arg Gly Val
20 25 30
Val Ala Thr Gln Lys Pro Val Ile Pro Val Glu Lys Glu His Gly Ala
35 40 45
Thr Pro Pro Lys Glu Ala Lys Ile Gly Val Arg Lys Phe Tyr Arg His
50 55 60
Lys Lys Trp Val Asp Ala Asp Val Trp Gln Met Glu Lys Leu Leu Pro
65 70 75 80
Gly Asn Glu Val Ile Gly Pro Ala Ile Val Glu Ser Asp Ala Thr Thr
85 90 95
Phe Val Ile Pro Lys Gly Phe Ala Thr Arg Leu Asp Lys His Arg Leu
100 105 110
Phe His Leu Lys Glu Ile Lys
115
(2) INFORMATION FOR SEQ ID NO: 409:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:409:
Met Ala Asn Leu Leu Lys Asn Gly Lys Thr Leu Lys Gln Ala Arg Asp
1 5 10 15
Glu Ile Leu Ala Arg Thr Glu Lys Thr Gly His Tyr Asn Gly Leu Lys
20 25 30
Lys Leu Glu Phe Lys Glu Arg Asp Pro Ile Gly Tyr Glu
35 40 45
(2) INFORMATION FOR SEQ ID NO: 410:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 410:
Ala Tyr Ala Ala Glu Pro Ser Arg Gln Thr Leu Ser Lys Val Ser Asn
1 5 10 15
Arg Phe Lys Glu His Gly Ala Lys Phe Asp Leu Arg Val Met Ala Thr
20 25 30
His Gly Gly Thr Ile Ser Trp Lys Ala Lys Glu Leu Ala Arg Thr Ile
35 40 45
Val Ser Gly Pro Ile Gly Gly Val Ile Gly Ser Lys Leu Leu Gly Glu
50 55 60
Thr Leu Gly Tyr Asp Asn Ile Ala Cys Ser Asp Ile Gly Gly Thr Ser
65 70 75 80
Phe Asp Met Ala Leu Ile Val Lys Ser Asn Phe Asn Ile Ala Ser Asp
85 90 95
Pro Asp Met Ala Val Arg Gly Val Ser Cys Ala Leu Ile Arg Thr Ala
100 105 110
Asp Arg Ser Asn Glx
115
(2) INFORMATION FOR SEQ ID NO: 411: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 304 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38d, antigen 4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 411: Met Ala Lys Lys His He Lys Glu Gin He Ala Asp Pro Leu Gly He
1 5 10 15
He Val Glu Asp Ala Ala Ala Gly Val He Glu Leu Leu Asp Pro Glu
20 25 30
Leu Lys Glu Tyr Leu Arg Ser Asn He Ser Ala Lys Gly Tyr Ser Pro 35 40 45
Ser Asp Phe Val Cys Phe Ser Tyr Gly Gly Ala Gly Pro Val His Thr
50 55 60
Tyr Gly Tyr Thr Glu Ser Leu Gly Phe Lys Asp Val Val Val Pro Ala 65 70 75 80 Trp Ala Ala Gly Phe Ser Ala Phe Gly Cys Ala Cys Ala Asp Phe Glu
85 90 95
Tyr Arg Tyr Asp Lys Ser Val Asp He Ala He Pro Gin Tyr Ser Ser
100 105 110
Asp Glu Ser Lys He Glu Ala Cys Lys He He Gin Asp Ala Trp Asp 115 120 125
Glu Leu Thr Leu Lys Val He Glu Glu Phe Lys He Asn Gly Phe Ser
130 135 140
Gin Lys Asp Val He Leu Arg Pro Gly Tyr Arg Met Gin Tyr Met Gly 145 150 155 160 Gin Leu Asn Asp Leu Glu He Thr Ser Pro Val Ser Lys Ala Ala Ser
165 170 175
Val Ala Asp Trp Glu Glu He Val Lys Glu Tyr Glu Lys Thr Tyr Ala
180 185 190
Arg Val Tyr Ser Glu Ser Ala Cys Ser Pro Glu Leu Gly Phe Ser Val 195 200 205
Thr Gly Val He Met Arg Gly Val Val Ala Thr Gin Lys Pro Val He
210 215 220
Pro Val Glu Lys Glu His Gly Ala Thr Pro Pro Lys Glu Ala Lys He 225 230 235 240 Gly Val Arg Lys Phe Tyr Arg His Lys Lys Trp Val Asp Ala Asp Val
245 250 255
Trp Gin Met Glu Lys Leu Leu Pro Gly Asn Glu Val He Gly Pro Ala
260 265 270
He Val Glu Ser Asp Ala Thr Thr Phe Val He Pro Lys Gly Phe Ala 275 280 285
Thr Arg Leu Asp Lys His Arg Leu Phe His Leu Lys Glu He Lys Glx 290 295 300 (2) INFORMATION FOR SEQ ID NO:412:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 265 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38e, antigen 5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:412:
Met Ala Asn Leu Leu Lys Asn Gly Lys Thr Leu Lys Gin Ala Arg Asp 1 5 10 15 Glu He Leu Ala Arg Thr Glu Lys Thr Gly His Tyr Asn Gly Leu Lys 20 25 30
Lys Leu Glu Phe Lys Glu Arg Asp Pro He Gly Tyr Glu Lys Met Phe
35 40 45
Ser Lys Leu Arg Gly Gly He Val His Ala Arg Glu Thr Ala Lys Arg 50 55 60
He Ala Ala Ser Pro He Val Glu Gin Glu Gly Glu Leu Cys Phe Thr 65 70 75 80
Leu Tyr Asn Ala Val Gly Asp Ser Val Leu Thr Ser Thr Gly He He 85 90 95 He His Val Gly Thr Met Gly Ser Ala He Lys Tyr Met Val Glu Asn 100 105 110
Asn Trp Glu Asp Asn Pro Gly He Asn Asp Lys Asp He Phe Thr Asn
115 120 125
Asn Asp Cys Ala He Gly Asn Val His Pro Cys Asp He Met Thr Leu 130 135 140
Val Pro He Phe His Asp Glu Lys Leu He Gly Trp Val Gly Gly Val
145 150 155 160
Thr His Val He Asp Thr Gly Ser Val Thr Pro Gly Ser Met Ser Thr
165 170 175 Gly Gin Val Gin Arg Phe Gly Asp Gly Tyr Met He Thr Cys Arg Lys
180 185 190
Thr Gly Ala Asn Asp Glu Ser Phe Lys Asp Trp Leu His Glu Ser Gin
195 200 205
Arg Ser Val Arg Thr Pro Lys Tyr Trp He Leu Asp Glu Arg Thr Arg 210 215 220
He Ala Gly Cys His Met He Arg Asp Leu Val Met Glu Val He Lys 225 230 235 240
Glu Asp Gly He Asp Ser Tyr Met Arg Phe He Asp Glu Val He Glu 245 250 255 Glu Gly Lys Lys Lys Pro Tyr Leu Glx 260 265
(2) INFORMATION FOR SEQ ID NO:413: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 500 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38f, antigen 6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:413:
Met Asn His Thr Arg Pro Ile Ile Glu Arg Leu Pro Leu Trp Met Cys
1 5 10 15
Leu Met Arg He Arg He Leu Ala Cys Ala Leu Asn Leu Leu Lys Leu
20 25 30
Glu His Asp His Ala Leu Ser Cys Gly Lys Ser Leu Ser He Lys Thr 35 40 45
Leu Thr Trp Lys Leu Asp Phe Asp Gly Ala Ser Arg Trp Gly Trp His 50 55 60 Ser Phe Asn Cys Asn Gin Val Ser Phe Thr Ser Gly He Trp Val Met 65 70 75 80
Met Thr Gin Thr Leu He Pro Thr Ser Arg He Asn Asp Gly Ala Tyr 85 90 95 Phe Ala Thr Gin Phe Arg Leu Lys Lys Gly Thr Trp Met Asn Pro Asp 100 105 110
Asp Arg Arg Thr Gly His Ala Tyr Ala Trp His Phe Leu Val Ser Gly
115 120 125
Trp Ser Ala Leu Trp Arg Gly Leu Ser Gin Ala Tyr Tyr Ser Arg Gly 130 135 140
Tyr Leu Glu Glu Val Asn Ser Gly Asn Ala Asn Thr Ser Asn Trp Leu
145 150 155 160
Gin Gly Gly Gly He Asn Gin Asp Gly Glu He His Ala Val Asn Ser
165 170 175 Phe Glu Thr Ser Ser Cys Gly Thr Gly Ala Cys Ala He Lys Asp Gly
180 185 190
Leu Asn His Ala Ala Ala He Trp Asn Pro Lys Gly Asp Met Gly Asp
195 200 205
Val Glu He Trp Glu Met Ala Glu Pro Leu Leu Tyr Leu Gly Arg Asn 210 215 220
Val Lys Ala Asn Thr Gly Gly Tyr Gly Lys Tyr Arg Gly Gly Asn Gly
225 230 235 240
Phe Glu Thr Leu Arg Met Val Trp Gly Val His Asp Trp Thr Met Phe
245 250 255 Phe Met Gly Asn Gly Tyr Met Asn Ser Asp Trp Gly Met Met Gly Gly
260 265 270
Tyr Pro Pro Ala Ser Gly Tyr Arg Phe Glu Ala His Asn Thr Asp Leu
275 280 285
Glu Asn Arg He Lys Asn Asn Ala Ser Leu Pro Leu Gly Gly Asp Phe 290 295 300
Asn Pro Thr Asp Arg Asp Tyr Glu Lys His He Ser His Ala Ser Gin
305 310 315 320
Val Lys Arg Asp Lys Gin Cys He Thr Thr Glu Asn Cys Phe Asp Asn
325 330 335 Tyr Asp Leu Tyr Leu Asn Tyr He Lys Gly Gly Pro Gly Phe Gly Asp
340 345 350
Pro He Glu Arg Asp Leu Asn Ala He Leu Glu Asp Leu Asn Ser Lys
355 360 365
Gin Leu Leu Pro Glu Tyr Ala Tyr Lys Val Tyr Gly Ala Val Val Ser 370 375 380
Gin Asn Lys Asp Gly Val Trp Val Gly Asp Glu Ala Lys Thr Lys Ala
385 390 395 400
Arg Arg Lys Glu He Leu Glu Asn Arg Lys Ala Arg Ser He Pro Val
405 410 415 Lys Gin Trp Met Glu Gin Glu Arg Asn Ala He Leu Glu Lys Glu Ala
420 425 430
Ser Lys Gin Val Lys His Met Tyr Ala Thr Ser Phe Asp Leu Ser Pro
435 440 445
Lys Phe Leu Ser Asp Phe Lys Thr Phe Trp Asn Leu Pro Lys Ser Trp 450 455 460
Thr Met Lys Glu Asp Glu Leu Gly Val Phe Thr Tyr Gly Ser Lys Tyr 465 470 475 480
Arg Met Asp Leu Ser Lys Leu Pro Asp Val Arg Thr Val Leu Leu Val 485 490 495 Asp Glu Lys Glx 500
(2) INFORMATION FOR SEQ ID NO:414:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 184 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38g, antigen 7
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 414:
Met Cys Ala Gln Phe Cys Trp Leu Met Arg Asn Lys Glu Arg Arg Met
1 5 10 15
Val Met Ser Lys Tyr Thr Gln Glu Gln Ile Lys Asn Leu Val Glu Gly
20 25 30
Asn Leu Asp Trp Asn Thr Val Leu Lys Met Leu Ser Met Pro Lys Asp
35 40 45
His Glu Arg Phe Gln Met Tyr Leu Lys Val Leu Gln Asp Lys Val Asp
50 55 60
Phe Asp Asp Lys Ile Val Leu Pro Leu Gly Pro His Leu Phe Val Val
65 70 75 80
Gln Asp Ser Gln Lys Lys Trp Val Ile Lys Cys Ser Cys Gly His Ala
85 90 95
Phe Cys Ala Pro Glu Glu Asn Trp Lys Leu His Ala Asn Ile Tyr Val
100 105 110
His Asp Thr Ala Glu Lys Met Glu Glu Val Tyr Pro Lys Leu Leu Ala
115 120 125
Ser Asp Thr Asn Trp Gln Val Tyr Arg Glu Tyr Ile Cys Pro Asp Cys
130 135 140
Gly Ile Leu Leu Asp Val Glu Ala Pro Thr Pro Trp Tyr Pro Val Ile
145 150 155 160
His Asp Phe Glu Pro Asp Ile Glu Val Phe Tyr Lys Glu Trp Leu Gly
165 170 175
Ile Gln Pro Pro Glu Arg Arg Glx
180
(2) INFORMATION FOR SEQ ID NO:415: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 291 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 39, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:415: Asn Gly Tyr Thr Pro Gly Tyr Tyr Gly Pro Met Ser He Glu Asn Phe 1 5 10 15
Lys Lys He Asn Glu Ala Tyr Gin He Leu Gin Thr Ala Leu Lys Lys
20 25 30
Gly Leu Pro Ala Leu Lys Glu Asn Asn Gly Thr Val Asn Val Ser Tyr 35 40 45
Thr Tyr Thr Cys Ser Gly Glu Gly Asn Asp Asn Cys Ser Glu Lys Ala
50 55 60
Thr Gly Val Lys Asn Gin His Gly Gly Thr Thr Thr Lys He Gin Thr 65 70 75 80 He Asp Gly Lys Ser Val Thr Thr Thr He Ser Ser Lys Val Val Asp
85 90 95
Ser Thr Ala Leu Gly Asn Thr Gin Arg Val Ser Tyr Thr Glu He Thr
100 105 110
Asn Gin Leu Asp Gly Val Pro Asn Ser Ala Gin Ala Leu Leu Ala Gin 115 120 125
Ala Ser Thr Leu He Asn Thr He Asn Thr Ala Cys Pro Tyr Phe His
130 135 140
Ala Asn Asn Ser Ser Glu Ala Asn Ala Pro Lys Phe Ser Thr Thr Thr 145 150 155 160 Gly Lys He Cys Gly Ala Phe Ser Glu Glu He Ser Ala He Gin Lys
165 170 175
Met He Thr Asp Ala Gin Glu Leu Val Asn Gin Thr Ser Val He Asn
180 185 190
Glu His Glu Gin Ser Thr Pro Val Gly Gly Asn Asn Gly Lys Pro Phe 195 200 205
Asn Pro Phe Thr Asp Ala Ser Phe Ala Gin Gly Met Leu Ala Asn Ala 210 215 220 Ser Ala Gin Ala Lys Met Leu Asn Leu Ala His Gin Val Gly Gin Ala
225 230 235 240
He Asn Pro Asp Asn Leu Thr Gly Ser Phe Lys Asn Phe Val Thr Gly
245 250 255 Phe Leu Ala Thr Cys Asn Asn Pro Ser Thr Ala Gly Thr Ser Gly Thr
260 265 270
Gin Gly Ser Ala Leu Gly Thr Val Thr Thr Gin Thr Phe Ala Ser Ala
275 280 285
Lys Ser Lys 290
(2) INFORMATION FOR SEQ ID NO:416:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 40a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:416:
Leu Pro Leu Val Ala Pro Ser Lys Glu Thr Ile Lys Leu Leu Glu Lys
1 5 10 15
Thr Leu Gln Gln Tyr Glu Val Ile Ala
20 25
(2) INFORMATION FOR SEQ ID NO:417:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 262 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 40b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 417:
Met Asn Gly Ser Asn His Met Lys Asn Lys Thr Leu Val He Ser Gly
1 5 10 15
Ala Thr Arg Gly He Gly Lys Ala He Leu Tyr Arg Phe Ala Gin Ser 20 25 30 Gly Val Asn He Ala Phe Thr Tyr Asn Lys Asn Val Glu Glu Ala Asn 35 40 45
Lys He He Glu Asp Val Glu Gin Lys Tyr Ser He Lys Ala Lys Ala
50 55 60
Tyr Ser Leu Asn Val Leu Glu Pro Glu Gin Tyr Thr Glu Leu Phe Lys 65 70 75 80
Gin He Asp Ala Asp Phe Asp Arg Val Asp Phe Phe He Ser Asn Ala
85 90 95
He He Tyr Gly Arg Ser Val Val Gly Gly Phe Ala Pro Phe Met Arg 100 105 110 Leu Lys Pro Lys Gly Leu Asn Asn He Tyr Thr Ala Thr Val Leu Ala 115 120 125
Phe Val Val Gly Ala Gin Glu Ala Ala Lys Arg Met Gin Lys He Gly
130 135 140
Gly Gly Ala He Val Ser Leu Ser Ser Thr Gly Asn Leu Val Tyr Met 145 150 155 160
Pro Asn Tyr Ala Gly His Gly Asn Ser Lys Asn Ala Val Glu Thr Met
165 170 175
Val Lys Tyr Ala Ala Val Asp Leu Gly Glu Phe Asn He Arg Val Asn 180 185 190 Ala Val Ser Gly Gly Pro He Asp Thr Asp Ala Leu Lys Ala Phe Pro 195 200 205
Asp Tyr Val Glu He Lys Glu Lys Val Glu Glu Gin Ser Pro Leu Lys 210 215 220
Arg Met Gly Asn Pro Asn Asp Leu Ala Gly Ala Ala Tyr Phe Leu Cys 225 230 235 240
Ala Glu Thr Gin Ser Gly Trp Leu Thr Gly Gin Thr He Val Val Asp 245 250 255
Gly Gly Thr Thr Phe Lys 260
(2) INFORMATION FOR SEQ ID NO: 418:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 41a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:418:
Met Arg Lys Thr He Ser Ala Leu Phe Leu Ser Ala Cys He Gly Leu
1 5 10 15
Ser Ser Val Tyr Ala Asp Asn Ala Leu He Leu Gin Thr Asp Phe Ser 20 25 30 Leu Lys Asp Gly Ala Val Ser Ala Met Lys Gly Val Ala Phe Ser Val
35 40 45
Asp Ser His Leu Lys He Phe Asp Leu Thr His Glu He Pro Pro Tyr
50 55 60
Asn He Trp Glu Gly Ala Tyr Arg Leu Tyr Gin Thr Ala Ser Tyr Trp 65 70 75 80
Pro Lys Gly Ser Val Phe Val Ser Val Val Asp Pro Gly Val Gly Thr
85 90 95
Lys Arg Lys Ser Val Val Leu Lys Thr Lys Asn Gly Gin Tyr Phe Val 100 105 110 Ser Pro Asp Asn Gly Thr Leu Thr Leu Val Ala Gin Thr Leu Gly He
115 120 125
Asp Ser Val Arg Glu He Asp Glu Lys Ala Asn Arg Leu Lys Gly Ser
130 135 140
Glu Lys Ser Tyr Thr Phe His Gly Arg Asp Val Tyr Ala Tyr Thr Gly 145 150 155 160
Ala Arg Leu Ala Ser Gly Ala He Thr Phe Glu Gin Val Gly Pro Glu
165 170 175
Leu Pro Pro Lys Val Val Glu He Pro Tyr Gin Lys Ala Lys Ala Thr 180 185 190 Lys Gly Glu Val Lys Gly Asn He Pro He Leu Asp He Gin Tyr Gly 195 200 205
Asn Val Trp Ser Asn He Ser Asp Lys Leu Leu Asn Gin Ala Lys He
210 215 220
Lys Leu Asn Asp Thr Leu Cys Val Thr He Phe Lys Gly Ser Lys Lys 225 230 235 240
Gin Tyr Glu Gly Lys Met Pro Tyr Val Ala Ser Phe Gly Asp Val Pro
245 250 255
Glu Gly Gin Pro Leu Val Tyr Leu Asn Ser Leu Leu Asn Val Ser Val 260 265 270 Ala Leu Asn Arg Asp Asp Phe Ala Gin Lys Tyr Gin He Lys Ser Gly 275 280 285
Ala Asp Trp Asn Ile Asp Ile Lys Lys Cys Ala Lys
290 295 300
(2) INFORMATION FOR SEQ ID NO: 419:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 114 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 41b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:419: Met Asn He Lys Thr His Ser Ser Asn Glu Lys Glu Arg Phe Val Arg 1 5 10 15
He Glu Glu Asp Glu Lys Lys Gly Leu Phe Ala Gly Thr Ala Asn Glu
20 25 30
Asn Ser His Gly Leu Ser Leu Met Ala Leu He Gly Val Leu Val Phe 35 40 45
Gly Gly Ala Phe Leu Ala Leu Leu Ala Pro Lys He Tyr Leu Ser Asn
50 55 60
Asn He Tyr Tyr He Ser Arg Lys He Asn Thr Leu Glu Asp Gin Lys 65 70 75 80 Arg Leu Leu Leu Glu Glu Gin Gin He Leu Lys Asn Glu Leu Glu Lys
85 90 95
Glu Arg Phe Lys Tyr Tyr He Glu Asn Ser Glu Asn He Gly Asp He
100 105 110
Ala Phe
(2) INFORMATION FOR SEQ ID NO: 420:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 338 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 42a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:420:
Leu Asn Ala His Leu Trp Gly Lys Gin Asp Asn Ser Phe Leu Gly Val 1 5 10 15
Ala Glu Arg Ala Tyr Lys Ser Gly Asn Tyr Ser Lys Ala Thr Ser Tyr
20 25 30
Phe Lys Lys Ala Cys Asn Asp Gly Val Ser Glu Gly Cys Thr Gin Leu 35 40 45 Gly He He Tyr Glu Asn Gly Gin Gly Thr Arg He Asp Tyr Lys Lys 50 55 60
Ala Leu Glu Tyr Tyr Lys Thr Ala Cys Gin Ala Asp Asp Arg Glu Gly 65 70 75 80
Cys Phe Gly Leu Gly Gly Leu Tyr Asp Glu Gly Leu Gly Thr Thr Gin 85 90 95
Asn Tyr Gin Glu Ala He Asp Ala Tyr Ala Lys Ala Cys Val Leu Lys
100 105 110
His Pro Glu Ser Cys Tyr Asn Leu Gly He He Tyr Asp Arg Lys He 115 120 125 Lys Gly Asn Ala Asp Gin Ala Val Thr Tyr Tyr Gin Lys Ser Cys Asn 130 135 140
Phe Asp Met Ala Lys Gly Cys Tyr Val Leu Gly Val Ala Tyr Glu Lys 145 150 155 160
Gly Phe Leu Glu Val Lys Gin Ser Asn His Lys Ala Val He Tyr Tyr 165 170 175
Leu Lys Ala Cys Arg Leu Asp Asp Gly Gin Ala Cys Arg Ala Leu Gly
180 185 190
Ser Leu Phe Glu Asn Gly Asp Ala Gly Leu Asp Glu Asp Phe Glu Val 195 200 205 Ala Phe Asp Tyr Leu Gin Lys Ala Cys Gly Leu Asn Asn Ser Gly Gly 210 215 220
Cys Ala Ser Leu Gly Ser Met Tyr Met Leu Gly Arg Tyr Val Lys Lys 225 230 235 240
Asp Pro Gin Lys Ala Phe Asn Tyr Phe Lys Gin Ala Cys Asp Met Gly 245 250 255
Ser Ala Val Ser Cys Ser Arg Met Gly Phe Met Tyr Ser Gin Gly Asp 260 265 270 Ala Val Pro Lys Asp Leu Arg Lys Ala Leu Asp Asn Tyr Glu Arg Gly
275 280 285
Cys Asp Met Gly Asp Glu Val Gly Cys Phe Ala Leu Ala Gly Met Tyr
290 295 300 Tyr Asn Met Lys Asp Lys Glu Asn Ala He Met He Tyr Asp Lys Gly
305 310 315 320
Cys Lys Leu Gly Met Lys Gin Ala Cys Glu Asn Leu Thr Lys Leu Arg
325 330 335
Gly Tyr
(2) INFORMATION FOR SEQ ID NO: 421:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 131 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 42b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 421:
He Gin Gly Ala Lys Tyr Arg Lys Ala Phe Ser Val Glu Glu Met He 1 5 10 15
Pro Ser Met Gly Gin Gly Ala Leu Gly Val Glu Met Leu Lys Asn His
20 25 30
Lys His Phe He Thr Leu Gin Lys Leu Asn Asp Glu Lys Ser Ala Phe 35 40 45 Cys Cys His Leu Glu Arg Glu Phe Val Lys Gly Leu Asn Gly Gly Cys 50 55 60
Gin He Pro He Gly Val His Ala Ser Leu Met Gly Asp Arg Val Lys 65 70 75 80
He Gin Ala Val Leu Gly Leu Pro Asn Gly Lys Glu Val He Thr Lys 85 90 95
Glu Lys Gin Gly Asp Lys Thr Lys Ala Phe Asp Leu Val Gin Glu Leu
100 105 110
Leu Glu Glu Phe Leu Gin Ser Gly Ala Lys Glu He Leu Glu Lys Ala 115 120 125 Gin Leu Phe 130
(2) INFORMATION FOR SEQ ID NO:422: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 42c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:422: Met Arg Leu Phe He Ala Leu Val Leu Phe Trp Trp Trp Leu Ser Leu 1 5 10 15
Ser Ala Lys Glu Ala Asp Phe He Ser Asp Leu Glu Tyr Gly Met Ala
20 25 30
Leu Tyr Lys Asn Pro Arg Gly Val Ala Cys Ala Lys Cys His Gly He 35 40 45
Lys Gly Glu Gin Gin Glu He Thr Phe Tyr Tyr Glu Lys Gly Glu Lys
50 55 60
Lys He Leu Tyr Ala Pro Lys He Asn His Leu Asp Phe Lys Thr Phe
65 70 75 80 Lys Asp Ala Leu Ser Leu Gly Lys Gly Met Met Pro Lys Tyr Asn Leu
85 90 95
Asn Leu Glu Glu He Gin Ala He Tyr Leu Tyr He Thr Ser Leu Glu 100 105 110
His Lys Glu Glu Arg Lys Asp Ser Pro Lys Pro
115 120
(2) INFORMATION FOR SEQ ID NO:423:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 43a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 423:
Met He Lys Lys Cys Leu Phe Pro Ala Ala Gly Tyr Gly Thr Arg Phe
1 5 10 15
Leu Pro He Thr Lys Thr He Pro Lys Glu Met Leu Pro He Val Asp 20 25 30
Lys Pro Leu He Gin Tyr Ala Val Glu Glu Ala Met Glu Ala Gly Cys
35 40 45
Glu Val Met Ala He Val Thr Gly Arg Asn Lys Arg Ser Leu Glu Asp 50 55 60 Tyr Phe Asp Thr Ser Tyr Glu He Glu His Gin He Gin Gly Thr Asn 65 70 75 80
Lys Glu Asn Ala Leu Lys Ser He Arg Asn He He Glu Lys Cys Cys
85 90 95
Phe Ser Tyr Val Arg Gin Lys Gin Met Lys Gly Leu Gly His Ala He 100 105 110
Leu Thr Gly Glu Ala Leu He Gly Asn Glu Pro Phe Ala Val He Leu
115 120 125
Ala Asp Asp Leu Cys He Ser His Asp His Pro Ser Val Leu Lys Gin
130 135 140 Met Thr Ser Leu Tyr Gin Lys Tyr Gin Cys Ser He Val Ala He Glu
145 150 155 160
Glu Val Ala Leu Glu Glu Val Ser Lys Tyr Gly Val He Arg Gly Glu
165 170 175
Trp Leu Glu Glu Gly Val Tyr Glu He Lys Asp Met Val Glu Lys Pro 180 185 190
Asn Gin Glu Asp Ala Pro Ser Asn Leu Ala Val He Gly Arg Tyr He
195 200 205
Leu Thr Pro Asp He Phe Glu He Leu Ser Glu Thr Lys Pro Gly Lys
210 215 220 Asn Asn Glu He Gin He Thr Asp Ala Leu Gly Thr Gin Ala Lys Arg
225 230 235 240
Lys Arg He He Ala Tyr Gin Phe Lys Gly Lys Arg Tyr Asp Cys Gly
245 250 255
Ser Val Glu Gly Tyr He Glu Ala Ser Asn Ala Tyr Tyr Lys Lys Arg 260 265 270
Leu
(2) INFORMATION FOR SEQ ID NO:424:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 43b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 424:
Leu Lys Lys Glu Phe Asn His Pro Leu Phe Val Ala Tyr Ala Tyr Asn
1 5 10 15
Ala Gly Pro Gly Phe Leu Arg Arg Trp Leu Glu Ser Ser Lys Arg Phe
20 25 30
Lys Glu Lys Asn His Phe Glu Pro Trp Leu Ser Met Glu Leu Met Pro
35 40 45
Tyr Ser Glu Thr Arg Leu Tyr Gly Phe Arg Val Met Leu Asn Tyr Leu
50 55 60
Ile Tyr Gln Glu Ile Phe Gly Asn Phe Ile Pro Ile Asp Gly Phe Leu
65 70 75 80
Glu Gln Thr Leu Asn Ser Lys Asp Lys Pro
85 90
(2) INFORMATION FOR SEQ ID NO: 425:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 135 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 43c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 425:
Val Gly Asn Leu Thr Tyr Tyr Ala Tyr Met Tyr Leu He Leu Phe Val 1 5 10 15
Cys Leu Leu Pro Val Leu Leu Met Gly Leu Val Trp Arg Leu Thr Arg
20 25 30
Pro Pro Leu Lys Gin Asn He Pro Asn Lys Ser Leu Ser Leu Glu Asn 35 40 45 Leu Asn Glu Gin He Lys Asn Leu Lys Ser Val Pro Ala Leu Glu Lys 50 55 60
Leu Lys Asn Asp Phe Asn Glu Arg Phe Lys He Cys Pro Lys Asp Lys 65 70 75 80
Glu Thr Leu Trp Leu Glu Thr He Gin Lys Leu Val Ala Ser Glu Phe 85 90 95
Phe Glu Leu Glu Asp Ala He Asn Phe Gly Gin Glu Leu Glu Asn Ala
100 105 110
Asn Pro Asn Tyr Gin Gin Lys He Ala Asn Ala Thr Gly Leu Ala Leu 115 120 125 Lys Asn Lys Lys Glu Lys Gly 130 135
(2) INFORMATION FOR SEQ ID NO: 426: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 44, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:426: Asp Trp Ser Asn Arg Lys Met Gin Leu Glu Leu Phe Pro He Asp Leu 1 5 10 15
Pro Tyr Ala Ser Glu Lys Glu He Ala He Ala Lys Met Gin His Leu
20 25 30
Pro Lys Leu Val Arg Xaa Ala Leu Lys Cys Met Gly Phe Asp Arg Val 35 40 45
Ser Lys Glu He Val Phe Glu Tyr Glu Pro Lys Leu Leu Lys Pro Ser
50 55 60
Arg Leu Thr Tyr Phe Phe Gly Tyr Phe Gin Asp Pro Arg Tyr Phe Asp 65 70 75 80 Ala He Ser Pro Leu He Lys Gin Thr Phe Thr Leu Pro Pro Pro Pro
85 90 95
Pro Lys Met Glu Lys Leu Leu Lys Lys Lys Arg Lys Asn He Ser Ala 100 105 110
Ser Leu Leu
115
(2) INFORMATION FOR SEQ ID NO: 427:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 239 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 45a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:427:
Leu Lys His Leu Thr Pro Leu Thr His Thr He Phe Lys Ala Leu Trp
1 5 10 15
Leu Gly Thr Ala Leu Ser Ala Ser Leu Ser Leu Ala Ala Ala Glu Ser 20 25 30
Pro Thr Lys Thr Glu Pro Lys Pro Ala Lys Gly Val Lys Asn Lys Pro
35 40 45
Lys Ser Pro Val Thr Lys Val Met Met Thr Asn Cys Asp Asn He Lys 50 55 60 Asp Phe Asn Ala Lys Gin Lys Glu Val Leu Lys Ala Ala Tyr Gin Phe 65 70 75 80
Gly Ser Lys Glu Asn Leu Gly Tyr Glu Met Ala Gly He Ala Trp Lys
85 90 95
Glu Ser Cys Ala Gly Val Tyr Lys He Asn Phe Ser Asp Pro Ser Ala 100 105 110
Gly Val Tyr His Ser Tyr He Pro Ser Val Leu Lys Ser Tyr Gly His
115 120 125
Asn Asp Ser Pro Phe Leu Arg Asn Val Met Gly Glu Leu Leu He Lys
130 135 140 Asp Asp Ala Phe Ala Ser Glu Val Ala Leu Lys Glu Leu Leu Tyr Trp
145 150 155 160
Lys Thr Arg Tyr His Asp Asn Leu Lys Asp Met He Lys Ser Tyr Asn
165 170 175
Lys Gly Ser Arg Trp Glu Lys Asn Glu Lys Ser Asn Ala Asp Ala Glu 180 185 190
Lys Tyr Tyr Glu Glu He Gin Asp Arg He Arg Arg Leu Lys Glu Ser
195 200 205
Lys He Phe Asp Ser Gin Ser Ser Asn Asp Gin Glu Leu Gin Lys Ser 210 215 220 Ala Asn Ser Asn Leu Asp Leu Asp Pro He Gly Asn Ala Met Pro 225 230 235
(2) INFORMATION FOR SEQ ID NO: 428: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 45b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:428: Leu His Ser Asp Glu Leu Leu Val Glu He Leu Val Glu Glu Leu Pro 1 5 10 15
Ala Gin Ala Leu Leu Asn Glu Tyr Lys Glu Met Pro Lys Lys Leu His
20 25 30
Ala Leu Phe Gin Lys Arg Thr Leu Glu Val Gly Asn He Glu He Phe 35 40 45
Tyr Thr Pro Arg Arg Leu Cys Leu Leu He Lys Asp Phe Pro Leu Leu 50 55 60 Thr Gin Glu Thr Lys Glu Glu Phe Phe Gly Pro Pro Val Lys He Ala 65 70 75 80
Cys Asn Asn Glu Asp Lys Thr Gin Gly Leu Asn Ala Leu Gly Leu Gly 85 90 95 Phe Tyr Gin Lys Leu Gly Leu Lys Asp His Gin His Phe Gin Thr Ala 100 105 110
Phe Lys Asn Asn Lys Glu Val Leu Tyr His Ala
115 120
(2) INFORMATION FOR SEQ ID NO: 429:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 271 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 46a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 429:
Met Asn He Phe Lys Arg Val He Ser Val Gly Val He Val Leu Gly
1 5 10 15
Leu Phe Asn Leu Leu Asp Ala Lys His His Lys Glu Lys Lys Glu Asn 20 25 30
His Lys He Thr Arg Glu Leu Lys Val Gly Ala Asn Pro Val Pro His
35 40 45
Ala Gin He Leu Gin Ser Val Val Asp Asp Leu Lys Glu Lys Gly He 50 55 60 Lys Leu Val He Val Ser Phe Thr Asp Tyr Val Leu Pro Asn Leu Ala 65 70 75 80
Leu Asn Asp Gly Ser Leu Asp Ala Asn Tyr Phe Gin His Arg Pro Tyr
85 90 95
Leu Asp Arg Phe Asn Leu Asp Arg Lys Met His Leu Val Gly Leu Ala 100 105 110
Asn He His Val Glu Pro Leu Arg Phe Tyr Ser Gin Lys He Thr Asp
115 120 125
He Lys Asn Leu Lys Lys Gly Ser Val He Ala Val Pro Asn Asp Pro
130 135 140 Ala Asn Gin Gly Arg Ala Leu He Leu Leu His Lys Gin Gly Leu He
145 150 155 160
Ala Leu Lys Asp Pro Ser Asn Leu Tyr Ala Thr Glu Phe Asp He Val
165 170 175
Lys Asn Pro Tyr Asn He Lys He Lys Pro Leu Glu Ala Ala Leu Leu 180 185 190
Pro Lys Val Leu Gly Asp Val Asp Gly Ala He Val Thr Gly Asn Tyr
195 200 205
Ala Leu Gin Ala Lys Leu Thr Gly Ala Leu Phe Ser Glu Asp Lys Asp
210 215 220 Ser Pro Tyr Ala Asn Leu He Ala Ala Arg Glu Asp Asn Ala Gin Asp
225 230 235 240
Glu Ala He Lys Ala Leu He Glu Val Leu Gin Ser Glu Lys Thr Arg
245 250 255
Lys Phe He Leu Asp Thr Tyr Lys Gly Ala He He Pro Ala Phe 260 265 270
(2) INFORMATION FOR SEQ ID NO: 430:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 97 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 46b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:430:
Val Ala Asp Ile Thr Lys Ser Ile Ser Arg Asp Tyr Asp Val Leu Phe
1 5 10 15
Glu Glu Ala Ile Ala Leu Arg Gly Ala Phe Leu Ile Asp Lys Asn Met
20 25 30
Lys Val Arg His Ala Val Ile Asn Asp Leu Pro Leu Gly Arg Asn Ala
35 40 45
Asp Glu Met Leu Arg Met Val Asp Ala Leu Leu His Phe Glu Glu His
50 55 60
Gly Glu Val Cys Pro Ala Gly Trp Arg Lys Gly Asp Lys Gly Met Lys
65 70 75 80
Ala Thr His Gln Gly Val Ala Glu Tyr Leu Lys Glu Asn Ser Ile Lys
85 90 95
Leu
(2) INFORMATION FOR SEQ ID NO: 431:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 242 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 46c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 431:
Arg Asp Trp Lys Lys Thr Gly His Gly Asn Ser Asn Leu Tyr Lys Ala 1 5 10 15 He Arg Glu Ser Val Asp Val Tyr Phe Tyr Lys Phe Gly Leu Glu He 20 25 30
Ser He Glu Lys Leu Ser Lys Thr Leu Arg Glu Val Gly Phe Gly Glu
35 40 45
Lys Thr Gly Val Asp Leu Pro Asn Glu Phe Val Gly He Val Pro Asp 50 55 60
Asn Leu Trp Lys Leu Lys Arg Phe Asn Gin Asn Trp Arg Val Gly Asp 65 70 75 80
Thr Leu He Thr Ala He Gly Gin Gly Ser Phe Leu Ala Thr Pro Leu 85 90 95 Gin Val Leu Ala Tyr Thr Gly Leu He Ala Thr Gly Lys Leu Ala Thr 100 105 110
Pro His Phe Ala He Asn Asn Gin Gin Pro Leu Lys Asp Pro Leu Asn
115 120 125
Ser Phe Gin Lys Lys Lys Leu Gin Ala Leu Arg Val Gly Met Tyr Glu 130 135 140
Val Cys Asn His Lys Asp Gly Thr Ala Tyr His Ser Thr Arg Gly Ser
145 150 155 160
Lys Val Thr Leu Ala Cys Lys Thr Gly Thr Ala Gin Val Val Glu He
165 170 175 Ala Gin Asn He Val Asn Arg Met Lys Glu Lys Asp Met Glu Tyr Phe
180 185 190
His Arg Ser His Ala Trp He Thr Ala Phe Leu Pro Tyr Glu Lys Pro
195 200 205
Lys Tyr Ala He Thr He Leu Val Lys His Gly Glu Arg Gly Ser Lys 210 215 220
Leu Gly Gly Leu Leu Met Lys Met Asn Asn Lys Leu Tyr Glu Leu Gly 225 230 235 240
Tyr Leu
(2) INFORMATION FOR SEQ ID NO:432:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 47a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 432:
Met Phe Gin He Arg Trp His Ala Arg Ala Gly Gin Gly Ala He Thr
1 5 10 15
Gly Ala Lys Gly Leu Ala Asp Val He Ser Lys Thr Gly Lys Glu Val 20 25 30 Gin Ala Phe Ala Ser Tyr Gly Ser Ala Lys Arg Gly Ala Ala Met Met 35 40 45
Ala Tyr Asn Arg Val Asp Asp Glu Pro He Leu Asn His Glu Arg Phe
50 55 60
Met Gin Pro Asp Tyr Val Leu Val He Asp Pro Gly Leu Val Phe He 65 70 75 80
Glu Asn He Phe Pro Asn Glu Lys Glu Asp Thr Thr Tyr He He Thr
85 90 95
Ser Tyr Leu Asn Lys Glu Glu Leu Phe Glu Lys Lys Pro Glu Leu Lys 100 105 110 Thr Arg Lys Val Phe Leu Val Asp Cys Leu Lys He Ser Met Glu Thr 115 120 125
Leu Lys Arg Pro He Pro Asn Thr Pro Met Leu Gly Ala Leu Met Lys
130 135 140
Val Ser Gly Met Leu Glu He Gly Ala Phe Lys Glu Ala Phe Lys Lys 145 150 155 160
Val Leu Gly Lys Lys Leu Thr Gin Glu Val He Asp Ala Asn Met Leu
165 170 175
Ala He Gin Arg Ala Tyr Glu Glu Val Gin 180 185
(2) INFORMATION FOR SEQ ID NO:433:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 108 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 47b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:433:
Met Lys Asp Trp Asn Glu Phe Glu Met Gly Ala Val Leu Phe Pro Phe 1 5 10 15 Glu Lys Asn Ala Gin Ser Glu Met Glu Lys His Asn Asp Glu Arg His 20 25 30
Tyr Thr Glu Gin Ser Tyr Phe Thr Thr Ser Val Ala His Trp Arg Val
35 40 45
Ala Lys Pro Val His Asn Asn Asn He Cys He Asn Cys Phe Asn Cys 50 55 60
Trp Val Tyr Cys Pro Asp Ala Ala He Leu Ser Arg Glu Gly Lys Leu 65 70 75 80
Lys Gly Val Asp Tyr Ser His Cys Lys Gly Cys Gly Val Cys Val Asp 85 90 95 Val Cys Pro Thr Asn Pro Lys Ser Leu Trp Met Phe 100 105
(2) INFORMATION FOR SEQ ID NO:434: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 48a, antigen 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 434:
Met Leu Asn Lys Phe Lys Lys He Val Gly Val Gly Val Leu Val Gly 1 5 10 15 Cys Leu Gly Val Leu Gin Ala Lys Asn Ser Leu Phe Val Leu Pro Tyr 20 25 30
Glu Gin Arg Asp Ala Leu Asn Ala Leu Val Ser Gly He Ser Asn Ala
35 40 45
Arg Glu Ser Val Lys He Ala He Tyr Ser Phe Thr His Arg Asp He 50 55 60
Ala Arg Ala He Lys Ser Val Ala Ser Arg Gly He Lys Val Gin He 65 70 75 80
He Tyr Asp Tyr Glu Ser Asn His His Asn Lys Gin Ser Thr He Gly 85 90 95 Tyr Leu Asp Lys Tyr Pro Asn Thr Lys Val Cys Leu Leu Lys Gly Leu 100 105 110
Lys Ala Lys Asn Gly Asn Tyr Tyr Gly He Met His Gin Lys Val Ala
115 120 125
He He Asp Asp Lys He Val Phe Leu Gly Ser Ala Asn Trp Ser Lys 130 135 140
Asn Ala Phe Glu Asn Asn Tyr Glu Val Leu Leu Lys Thr Asp Asp Thr 145 150 155 160
Glu Thr He Leu Lys Ala Lys Ser Tyr Tyr Gin Lys Met Leu Gly Ser 165 170 175 Cys Val Gly Phe 180
(2) INFORMATION FOR SEQ ID NO: 435:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 70 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 48b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 435:
Met Lys Met Ile Leu Phe Asn Gln Asn Pro Met Ile Thr Lys Leu Leu
1 5 10 15
Glu Ser Val Ser Lys Lys Leu Glu Leu Pro Met Glu Asn Phe Asn His
20 25 30
Tyr Gln Glu Leu Ser Ala Cys Leu Lys Glu Asp Pro Glu Trp Ile Leu
35 40 45
Ile Ala Asp Asp Glu Cys Leu Glu Lys Leu Asp Gln Val Asp Trp Leu
50 55 60
Glu Leu Lys Glu Ile Ile
65 70
(2) INFORMATION FOR SEQ ID NO:436:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 48c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:436:
Thr Thr Asp Ala Leu Lys Val Gln Met Arg Arg Val Phe Ala Phe Tyr
1 5 10 15
Val Gly Tyr Asn Tyr His Phe
20
(2) INFORMATION FOR SEQ ID NO: 437:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 267 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 49, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:437:
Met Ser Glu Lys Glu Arg Leu Asn Glu Val He Leu Glu Glu Glu Asn 1 5 10 15 Asn Gly Ser Gly Thr Lys Lys Val Phe Leu He Val Ala He Ala He 20 25 30
He He Leu Ala Val Leu Leu Met Val Phe Trp Lys Ser Thr Arg Val
35 40 45
Ala Pro Lys Glu Thr Phe Leu Gin Thr Asp Ser Gly Met Gin Lys He 50 55 60
Gly Asn Thr Lys Asp Glu Lys Lys Asp Asp Glu Phe Glu Ser Leu Asn 65 70 75 80
Leu Asp Pro Ser Lys Gin Glu Asp Lys Leu Asp Lys Val Ala Asp Asn 85 90 95 Val Lys Lys Gin Glu Asn Asp Ala Phe Asn Met Pro Thr Gin Thr Asp 100 105 110
Gin Thr Gin Thr Glu Met Lys Thr Thr Glu Glu Thr Gin Glu Ala Gin
115 120 125
Lys Gly Leu Lys Val Val Glu His Thr Ser Thr Gin Lys Glu Ser Gin 130 135 140
Ala Val Ala Lys Lys Glu He Ser His Lys Lys Pro Lys Ala Thr Pro
145 150 155 160
Lys Asp Lys Glu Ala His Lys Asp Lys Asp Lys His Ala Val Lys Glu
165 170 175 Leu Lys Val Lys Lys Glu Ala His Lys Glu Val Pro Lys Lys Ala Asn
180 185 190
Ser Lys Thr Thr Leu Thr Lys Gly His Tyr Leu Gin Val Gly Val Phe
195 200 205
Ala His Thr Pro Asn Lys Ala Phe Leu Gin Ala Phe Asn Gin Phe Pro 210 215 220
His Lys He Glu Asp Arg Gly Ala Thr Lys Arg Tyr Leu He Gly Pro 225 230 235 240
Tyr Lys Ser Lys Gin Glu Ala Leu Met His Ala Asp Glu Val Ser Lys 245 250 255 Lys Met Thr Lys Pro Val Val He Glu Ala Arg 260 265
(2) INFORMATION FOR SEQ ID NO:438: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 50a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:438: Met Ala Phe Asn Tyr Asp Glu Tyr Leu Arg Val Asp Lys He Pro Thr 1 5 10 15
Leu Trp Cys Trp Gly Cys Gly Asp Gly Val He Leu Lys Ser He He
20 25 30
Arg Thr He Asp Ala Leu Gly Trp Lys Met Asp Asp Val Cys Leu Val 35 40 45
Ser Gly He Gly Cys Ser Gly Arg Met Ser Ser Tyr Val Asn Cys Asn 50 55 60 Thr Val His Thr Thr His Gly Arg Ala Val Ala Tyr Ala Thr Gly He 65 70 75 80
Lys Met Ala Asn Pro Ser Lys His Val He Val Val Ser Gly Asp Gly 85 90 95 Glu Gly Phe Ala He Gly Gly Asn His Thr Met His Ala Cys Arg Arg 100 105 110
Asn He Asp Leu Asn Phe He Leu Val Asn Asn Phe He Tyr Gly Leu
115 120 125
Thr Asn Ser Gin Thr Ser Pro Thr Thr Pro Asn Gly Met Trp Thr Val 130 135 140
Thr Ala Gin Trp Gly Asn He Asp Asn Gin Phe Asp Pro Cys Ala Leu
145 150 155 160
Thr Thr Ala Ala Gly Ala Ser Phe Val Ala Arg Glu Ser Val Leu Asp
165 170 175 Pro Gin Lys Leu Glu Lys Val Leu Lys Glu Gly Phe Ser His Lys Gly
180 185 190
Phe Ser Phe Phe Asp Val His Ser Asn Cys His He Asn Leu Gly Arg
195 200 205
Lys Asn Lys Met Gly Glu Ala Ser Gin Met Leu Lys Trp Met Glu Ser 210 215 220
Arg Leu Val Ser Lys Arg Gin Phe Glu Ala Met Ser Pro Glu Glu Arg 225 230 235 240
Val Asp Lys Phe Pro Thr Gly Val Leu Lys His Asp Thr Asp Arg Lys 245 250 255 Glu Tyr Cys Glu Ala Tyr Gin Glu He He Glu Lys Ala Gin Gly Lys
260 265 270
Gin
(2) INFORMATION FOR SEQ ID NO: 439:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 50b, antigen 2 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:439:
Ala Tyr Thr Glu Thr Val Arg Ala Phe Asn Leu Ala Glu Met Leu Met
1 5 10 15
Thr Pro Val Phe Leu Leu Met Asp Glu Thr Val Gly His Met Tyr Gly 20 25 30
Lys Val Gin He Pro Asp Leu Glu Glu Val Gin Lys Thr Thr He Asn
35 40 45
Arg Lys Glu Phe Val Gly Asp Lys Lys Asp Tyr Lys Pro Tyr Gly Val 50 55 60 Ala Gin Asp Glu Pro Ala Val Leu Asn Pro Phe Phe Lys Gly Tyr Arg 65 70 75 80
Tyr His Val Ser Gly Leu His His Gly Pro He Gly Phe Pro Thr Glu
85 90 95
Asp Ala Lys He Gly Gly Asp Leu Thr Asp Arg Leu Phe Asn Lys He 100 105 110
Glu Ser Lys Gin Asp He He Asn Glu Asn Glu Glu Met Asp Leu Glu
115 120 125
Gly Ala Glu He Val He He Ala Tyr Gly Ser Val Ser Leu Pro Val
130 135 140 Lys Glu Ala Leu Lys Asp Tyr His Lys Glu Ser Lys Gin Lys Val Gly
145 150 155 160
Phe Phe Arg Pro Lys Thr Leu Trp Pro Ser Pro Ala Lys Arg Leu Lys
165 170 175
Glu He Gly Asp Lys Tyr Glu Lys He Leu Val He Glu Leu Asn Lys 180 185 190
Gly Gin Tyr Leu Glu Lys He Glu Arg Ala Met Gin Arg Lys Val His 195 200 205 Phe Leu Gly Gin Ala Asn Gly Arg Thr He Ser Pro Lys Gin He He
210 215 220
Ala Lys Leu Lys Glu Leu 225 230
(2) INFORMATION FOR SEQ ID NO: 440:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 180 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 50c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 440:
Met Glu Ala Gin Leu Arg Phe Thr Gly Val Gly Gly Gin Gly Val Leu 1 5 10 15 Leu Ala Gly Glu He Leu Ala Glu Ala Lys He Val Ser Gly Gly Tyr 20 25 30
Gly Thr Lys Thr Ser Thr Tyr Thr Ser Gin Val Arg Gly Gly Pro Thr
35 40 45
Lys Val Asp He Leu Leu Asp Arg Asn Glu He He Phe Pro Tyr Ala 50 55 60
Lys Glu Gly Glu He Asp Phe Met Leu Ser Val Ala Gin He Ser Tyr 65 70 75 80
Asn Gin Phe Lys Ser Asp He Lys Gin Gly Gly He Val Val He Asp 85 90 95 Pro Asn Leu Val Thr Pro Thr Lys Glu Asp Glu Glu Lys Tyr Gin Leu 100 105 110
Tyr Lys He Pro He He Ser He Ala Lys Asp Glu Val Gly Asn He
115 120 125
He Thr Gin Ser Val Val Ala Leu Ala He Thr Val Glu Leu Thr Lys 130 135 140
Cys Val Glu Glu Asn He Val Leu Asp Thr Met Leu Lys Lys Val Pro 145 150 155 160
Ala Lys Val Ala Asp Thr Asn Lys Lys Ala Phe Glu He Gly Lys Lys 165 170 175 His Ala Leu Glu 180
(2) INFORMATION FOR SEQ ID NO: 441: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 51, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 441: Met Xaa Val Leu Asn Ala Lys Glu Cys Val Ser Pro He Thr Arg Ser 1 5 10 15
Val Lys Tyr His Gin Gin Ser Ala Glu He Arg Ala Leu Gin Leu Gin
20 25 30
Ser Tyr Lys Met Ala Lys Met Ala Leu Asp Asn Asn Leu Lys Leu Val 35 40 45
Lys Asp Lys Lys Pro Ala Val He Leu Asp Leu Asp Glu Thr Val Leu
50 55 60
Asn Thr Phe Asp Tyr Ala Gly Tyr Leu He Lys Asn Cys He Lys Tyr 65 70 75 80 Thr Pro Glu Thr Trp Asp Lys Phe Glu Lys Glu Gly Ser Leu Thr Leu
85 90 95
Val Pro Gly Val Leu Asp Phe Leu Glu Tyr Ala Asn Ser Lys Gly Val 100 105 110
Lys He Phe Tyr He Ser Asn Arg Thr Gin Lys Asn Lys Ala Phe Thr
115 120 125
Leu Lys Thr Leu Lys Ser Phe Lys Leu Pro Gin Val Ser Glu Glu Ser 130 135 140
Val Leu Leu Lys Glu Lys Gly Lys Pro Lys Ala Val Arg Arg Glu Leu
145 150 155 160
Val Ala Lys Asp Tyr Ala He Val Leu Gin Val Gly Asp Thr Leu His
165 170 175 Asp Phe Asp Ala Leu Phe Ala Lys Asp Ala Lys Asn Ser Gin Glu Gin
180 185 190
Arg Ala Lys Val Leu Gin Asn Ala Gin Lys Phe Gly Thr Glu Trp He
195 200 205
He Leu Pro Asn Ser Leu Tyr Gly Thr Trp Glu Asp Glu Pro He Lys 210 215 220
Ala Trp Gin Asn Lys Lys 225 230
(2) INFORMATION FOR SEQ ID NO: 442:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 52, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:442:
Val Ser Ser Leu Phe Lys Met Arg Ile Leu Ser Phe Lys Lys Asn Lys
1 5 10 15
Arg Ala Val Phe Ser Leu Tyr Leu Phe Ile Ala Leu Leu Ala Leu Ser
20 25 30
Leu Leu Ala Pro Leu Trp Val Asn Asp Arg Pro Leu Phe Ile
35 40 45
(2) INFORMATION FOR SEQ ID NO:443:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 104 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 53a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:443:
Glu Phe Ala Gln Lys His Lys Thr Asp Ser Val Ile Gln Gly Lys Val
1 5 10 15
Val Ser Ile Lys Asp Phe Gly Val Phe Ile Asn Ala Asp Gly Ile Asp
20 25 30
Val Leu Ile Lys Asn Glu Asp Leu Asn Pro Leu Lys Lys Asp Glu Ile
35 40 45
Lys Ile Gly Gln Glu Ile Thr Cys Val Val Val Ala Ile Glu Lys Ser
50 55 60
Asn Asn Lys Val Arg Ala Ser Val His Arg Leu Glu Arg Lys Lys Glu
65 70 75 80
Lys Glu Glu Leu Gln Ala Phe Asn Thr Ser Asp Asp Lys Met Thr Leu
85 90 95
Gly Asp Ile Leu Lys Glu Lys Leu
100
(2) INFORMATION FOR SEQ ID NO:444:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 53b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:444:
Met Arg Phe Lys Ser Val Val Ala Phe Ile Ser Leu Ala Val Ala Leu
1 5 10 15
Gly Val Leu Ala Tyr Leu Phe Leu Ser Val Lys Lys Glu Met Pro Ala
20 25 30
Thr Ser His Ala Ile Ser Gln Thr His Ala Ile Ser Gln Thr Asn Glu
35 40 45
Gly Leu Ser Gln Thr Asp Ala Lys Asn His Asp Ile
50 55 60
(2) INFORMATION FOR SEQ ID NO: 445:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 148 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 54, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:445:
Met Thr Pro Glu Leu Asn Leu Lys Ser Leu Gly Ala Lys Thr Pro Tyr
1 5 10 15
He Phe Glu Tyr Asn Ser Gin Leu Leu Glu Ala Phe Pro Asn Pro Asn 20 25 30 Pro Asn Leu Asp Pro Leu He Thr Leu Glu Cys Lys Glu Phe Thr Ser 35 40 45
Leu Cys Pro He Thr Ser Gin Pro Asp Phe Gly Val He Phe He Arg
50 55 60
Tyr He Pro Lys Asp Lys Met Val Glu Ser Lys Ser Leu Lys Leu Tyr 65 70 75 80
Leu Phe Ser Tyr Arg Asn His Gly Ser Phe His Glu Ser Cys He Asn
85 90 95
Thr He Leu Leu Asp Leu He Gin Leu Leu Glu Pro Lys Tyr Leu Glu 100 105 110 Val Tyr Gly Asp Phe Val Ser Arg Gly Gly He Ala He Lys Pro Phe 115 120 125
Val Asn Tyr Ala He Lys Glu Tyr Gin Glu Phe Lys Glu Lys Arg Leu
130 135 140
Leu Asn Ala Lys 145
(2) INFORMATION FOR SEQ ID NO:446:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 232 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 55, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:446:
Met He Thr Met Asn He Asn Gly Lys Met He Glu Cys Gin Glu Gly 1 5 10 15
Gin Ser Val Leu Glu Ala Ala Arg Ser Ala Gly He Tyr He Pro Thr 20 25 30 He Cys Tyr Leu Ser Gly Cys Ser Pro Thr Val Ala Cys Lys Met Cys
35 40 45
Met Val Glu Met Asp Gly Lys Arg He Tyr Ser Cys Asn Thr Lys Ala 50 55 60 Lys Asn Asn Ala Thr He Leu Thr Asn Thr Pro Thr Leu Met Asp Glu 65 70 75 80
Arg Lys Ser He Met Gin Thr Tyr Asp Val Asn His Pro Leu Glu Cys
85 90 95
Gly Val Cys Asp Lys Ser Gly Glu Cys Glu Leu Gin Asp Met Thr His 100 105 110
Leu Thr Gly Val Glu His Gin Pro Tyr Ala Val Ala Asp Asp Phe Lys
115 120 125
Ala Leu Asp Phe Trp Ala Lys Ala Leu Tyr Asp Pro Asn Leu Cys He
130 135 140 Met Cys Glu Arg Cys Val Thr Thr Cys Lys Asp Asn Val Gly Glu Asn
145 150 155 160
Asn Leu Lys Ala Thr Lys Ala Asp Leu His Ala Pro Asp Lys Phe Lys
165 170 175
Asp Ser Met Ser Lys Asp Ala Phe Ser Val Trp Ser Arg Lys Gin Lys 180 185 190
Gly He He Ser Phe Val Gly Ser Val Pro Cys Tyr Asp Cys Gly Glu
195 200 205
Cys He Ala Val Cys Pro Val Gly Ala Leu Ser Tyr Lys Asp Phe Ala 210 215 220 Tyr Thr Ala Asn Ala Trp Glu Leu 225 230
(2) INFORMATION FOR SEQ ID NO:447:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 56, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:447:
Met Ser Phe Ala Pro Met Leu Leu Ala Thr Ile Asn Asn Ser Ile Gly
1 5 10 15
Asn Lys Asp Lys His Val Ser Leu Glu Tyr Leu Ile Gly Leu Phe Met
20 25 30
(2) INFORMATION FOR SEQ ID NO:448:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 81 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 57, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:448:
Gly Val Ile Ile Met Leu Leu Met Gly Asn Lys Glu Glu Ser Lys Glu
1 5 10 15
Asn Ala Ser Lys Asn Thr Gln Glu Val Gln Ala Asn Pro Met Ala Asn
20 25 30
Lys Asn Gln Glu Ala Lys Glu Gly Ser Asn Ile Gln Gln Tyr Leu Val
35 40 45
Leu Gly Pro Leu Tyr Ala Ile Asp Ala Pro Phe Ala Val Asn Leu Val
50 55 60
Ser Gln Asn Gly Arg Arg Tyr Leu Lys Ala Ser Ile Ser Leu Glu Leu
65 70 75 80
Ser
(2) INFORMATION FOR SEQ ID NO:449:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 140 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 58a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:449:
Met Lys Gln Thr Thr Ile Asn His Ser Val Glu Leu Val Gly Ile Gly
1 5 10 15
Leu His Lys Gly Val Pro Val Lys Leu Val Leu Glu Pro Leu Gly Glu
20 25 30
Asn Gln Gly Ile Val Phe Tyr Arg Ser Asp Leu Gly Val Asn Leu Pro
35 40 45
Leu Lys Pro Glu Asn Ile Val Asp Thr Lys Met Ala Thr Val Leu Gly
50 55 60
Lys Asp Asn Ala Arg Ile Ser Thr Ile Glu His Leu Leu Ser Ala Val
65 70 75 80
His Ala Tyr Gly Ile Asp Asn Leu Lys Ile Ser Val Asp Asn Glu Glu
85 90 95
Ile Pro Ile Met Asp Gly Ser Ala Leu Thr Tyr Cys Met Leu Leu Asp
100 105 110
Glu Ala Gly Ile Lys Glu Leu Asp Ala Pro Lys Lys Val Met Glu Ile
115 120 125
Lys Gln Ala Val Glu Ile Arg Glu Ser Asp Lys Phe
130 135 140
(2) INFORMATION FOR SEQ ID NO: 450:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 140 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 58b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:450:
Met Lys Gln Thr Thr Ile Asn His Ser Val Glu Leu Val Gly Ile Gly
1 5 10 15
Leu His Lys Gly Val Pro Val Lys Leu Val Leu Glu Pro Leu Gly Glu
20 25 30
Asn Gln Gly Ile Val Phe Tyr Arg Ser Asp Leu Gly Val Asn Leu Pro
35 40 45
Leu Lys Pro Glu Asn Ile Val Asp Thr Lys Met Ala Thr Val Leu Gly
50 55 60
Lys Asp Asn Ala Arg Ile Ser Thr Ile Glu His Leu Leu Ser Ala Val
65 70 75 80
His Ala Tyr Gly Ile Asp Asn Leu Lys Ile Ser Val Asp Asn Glu Glu
85 90 95
Ile Pro Ile Met Asp Gly Ser Ala Leu Thr Tyr Cys Met Leu Leu Asp
100 105 110
Glu Ala Gly Ile Lys Glu Leu Asp Ala Pro Lys Lys Val Met Glu Ile
115 120 125
Lys Gln Ala Val Glu Ile Arg Glu Ser Asp Lys Phe
130 135 140
(2) INFORMATION FOR SEQ ID NO:451:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 58c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:451:
Val Arg Pro Asn Glu Trp Leu Asn Glu Arg Trp Ile Lys Thr Asn Ile
1 5 10 15
Ile Thr Pro Ile Glu Gln Ala Lys Arg Leu Leu Met Lys Gly
20 25 30
(2) INFORMATION FOR SEQ ID NO:452:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 309 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 59, antigen (xi) SEQUENCE DESCRIPTION: SEQ ID NO:452:
Met Lys Val Leu Ser Tyr Leu Lys Asn Phe Tyr Leu Phe Leu Ala Met
1 5 10 15
Gly Ala He Met Gin Pro Ser Glu Asn Met Gly Ala Gin His Gin Lys 20 25 30
Thr Asp Glu Arg Val He Tyr Leu Ala Gly Gly Cys Phe Trp Gly Leu
35 40 45
Glu Ala Tyr Met Glu Arg He Tyr Gly Val He Asp Ala Ser Ser Gly 50 55 60 Tyr Ala Asn Gly Lys Thr Ser Ser Thr Asn Tyr Glu Lys Leu His Glu 65 70 75 80
Ser Asp His Ala Glu Ser Val Lys Val He Tyr Asp Pro Lys Lys He
85 90 95
Ser Leu Asp Lys Leu Leu Arg Tyr Tyr Phe Lys Val Val Asp Pro Val 100 105 110
Ser Val Asn Lys Gin Gly Asn Asp Val Gly Arg Gin Tyr Arg Thr Gly
115 120 125
He Tyr Tyr Val Asn Ser Ala Asp Lys Glu Val He Asp His Ala Leu
130 135 140 Lys Ala Leu Gin Lys Glu Val Lys Gly Lys He Ala He Glu Val Glu
145 150 155 160
Pro Leu Lys Asn Tyr Val Arg Ala Glu Glu Tyr His Gin Asp Tyr Leu
165 170 175
Lys Lys His Pro Ser Gly Tyr Cys His He Asp Leu Lys Lys Ala Asp 180 185 190
Glu Val He Val Asp Asp Asp Lys Tyr Thr Lys Pro Ser Asp Glu Val
195 200 205
Leu Lys Lys Lys Leu Thr Lys Leu Gin Tyr Glu Val Thr Gin Asn Lys
210 215 220 His He Glu Lys Pro Phe Ala His Glu Tyr His His Lys Glu Glu Glu
225 230 235 240
Gly He Tyr Val Asp He Thr Thr Gly Asp Pro Leu Phe Phe Ser Ala
245 250 255
Asp Lys Tyr Asp Ser Gly Cys Gly Trp Pro Ser Phe Ser Lys Pro He 260 265 270
Asn Lys Asp Val Val Lys Tyr Glu Asp Asp Glu Ser Leu Asn Arg Lys
275 280 285
Arg He Glu Val Leu Ser Arg He Gly Lys Ala His Leu Gly His Val 290 295 300 Phe Asn Asp Gly Pro 305 (2) INFORMATION FOR SEQ ID NO: 453:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 111 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 60, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:453:
Met Phe Val Val Phe He Glu Gly Phe Gly Leu Ala He Ser Leu Cys 1 5 10 15 Ala Ala Val Gly Ala Gin Ser Leu Phe He Val Glu Arg Gly Met Ala 20 25 30
Arg Asn Tyr Val Phe Leu He Cys Ala Leu Cys Phe Met Cys Asp He
35 40 45
Val Leu Met Ser Met Gly Val Phe Gly Val Gly Ala Tyr Phe Ala Lys 50 55 60
Asn Leu Tyr Leu Ser Leu Phe Leu Asn Leu Phe Gly Ala Val Phe Thr 65 70 75 80
Gly Phe Tyr Ala Phe Leu Ala Leu Lys Thr Leu Phe Gin Thr Phe Lys 85 90 95 Lys Lys Gin Val Gin Thr Pro Lys Lys Leu Ser Leu Lys Lys Thr 100 105 110
(2) INFORMATION FOR SEQ ID NO:454:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 82 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 61, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:454:
Leu Lys Gln Arg Thr Leu Ser Ile Ile Lys Pro Asp Ala Leu Lys Asn
1 5 10 15
Lys Val Val Gly Lys Ile Ile Asp Arg Phe Glu Ser Asn Gly Leu Glu
20 25 30
Val Val Ala Met Lys Arg Leu His Leu Ser Val Lys Asp Ser Glu Asn
35 40 45
Phe Tyr Ala Ile His Arg Glu Arg Pro Phe Phe Asn Asp Leu Ile Glu
50 55 60
Phe Met Val Ser Gly Pro Val Val Val Leu Val Leu Glu Gly Glu Asp
65 70 75 80
Ala Val
(2) INFORMATION FOR SEQ ID NO: 455: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 216 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 62, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 455: Ala Leu Tyr Asn Asn Asn Asn Arg Met Asp Thr Cys Val Val Arg Asn 1 5 10 15
Thr Asp Asp He Lys Ala Cys Gly Met Ala He Gly Asn Gin Ser Met 20 25 30
Val Asn Asn Pro Asp Asn Tyr Lys Tyr Leu He Gly Lys Ala Trp Lys
35 40 45
Asn He Gly He Ser Lys Thr Ala Asn Gly Ser Lys He Ser Val Tyr 50 55 60
Tyr Leu Gly Asn Ser Thr Pro Thr Glu Asn Gly Gly Asn Thr Thr Asn 65 70 75 80
Leu Pro Thr Asn Thr Thr Ser Asn Val Arg Ser Ala Asn Asn Ala Leu 85 90 95 Ala Gin Asn Ala Pro Phe Ala Gin Pro Ser Ala Thr Pro Asn Leu Val 100 105 110
Ala He Asn Gin His Asp Phe Gly Thr He Glu Ser Val Phe Glu Leu
115 120 125
Ala Asn Arg Ser Lys Asp He Asp Thr Leu Tyr Thr His Ser Gly Ala 130 135 140
Lys Gly Arg Asp Leu Leu Gin Thr Leu Leu He Asp Ser His Asp Ala 145 150 155 160
Gly Tyr Ala Arg Gin Met He Asp Asn Thr Ser Thr Gly Glu He Thr 165 170 175 Lys Gin Leu Asn Ala Ala Thr Thr Thr Leu Asn Asn He Ala Ser Leu
180 185 190
Glu His Lys Thr Ser Ser Leu Gin Thr Leu Ser Leu Ser Asn Ala Met
195 200 205
He Leu Asn Ser Arg Leu Val Asn 210 215
(2) INFORMATION FOR SEQ ID NO:456:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 368 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 63, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:456:
Met Ser Lys He Ala Asp Asp Gin Asn Phe Asn Asp Glu Glu Glu Asn 1 5 10 15
Phe Ala Lys Leu Phe Lys Lys Glu Leu Glu Lys Glu Glu Thr Leu Glu
20 25 30
Lys Gly Thr He Lys Glu Gly Leu He Val Ser He Asn Glu Asn Asp 35 40 45 Gly Tyr Ala Met Val Ser Val Gly Gly Lys Thr Glu Gly Arg Leu Ala 50 55 60
Leu Asn Glu He Thr Asp Glu Lys Gly Gin Leu Leu Tyr Gin Lys Asn 65 70 75 80
Asp Pro He He Val His Val Ser Glu Lys Gly Glu His Pro Ser Val 85 90 95
Ser Tyr Lys Lys Ala He Ser Gin Gin Lys He Gin Ala Lys He Glu
100 105 110
Glu Leu Gly Glu Asn Tyr Glu Asn Ala He He Glu Gly Lys He Val 115 120 125 Gly Lys Asn Lys Gly Gly Tyr He Val Glu Ser Gin Gly Val Glu Tyr 130 135 140
Phe Leu Ser Arg Ser His Ser Ser Leu Lys Asn Asp Ala Asn His He 145 150 155 160
Gly Lys Arg He Lys Ala Cys He He Arg Val Asp Lys Glu Asn His 165 170 175
Ser He Asn He Ser Arg Lys Arg Phe Phe Glu Val Asn Asp Lys Arg
180 185 190
Gin Leu Glu Val Ser Lys Glu Leu Leu Glu Ala Thr Glu Pro Val Leu 195 200 205 Gly Val Val Arg Gin He Thr Pro Phe Gly He Phe Val Glu Ala Lys 210 215 220
Gly He Glu Gly Leu Val His Tyr Ser Glu He Ser His Lys Gly Pro 225 230 235 240
Val Asn Pro Glu Lys Tyr Tyr Lys Glu Gly Asp Glu Val Tyr Val Lys
245 250 255
Ala He Ala Tyr Asp Ala Glu Lys Arg Arg Leu Ser Leu Ser He Lys 260 265 270
Ala Thr He Glu Asp Pro Trp Glu Glu He Gin Asp Lys Leu Lys Pro
275 280 285
Gly Tyr Ala He Lys Val Val Val Ser Asn He Glu His Tyr Gly Val
290 295 300 Phe Val Asp He Gly Asn Asp He Glu Gly Phe Leu His Val Ser Glu
305 310 315 320
He Ser Trp Asp Lys Asn Val Ser His Pro Ser His Tyr Leu Ser Val
325 330 335
Gly Gin Glu He Asp Val Lys He He Asp He Asp Pro Lys Asn Arg 340 345 350
Arg Leu Arg Val Ser Leu Lys Gin Leu Thr Asn Arg Pro Phe Asp Val 355 360 365
(2) INFORMATION FOR SEQ ID NO: 457:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 508 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 64a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:457:
Glu Phe Ala Gin Lys Ala Asn Leu Asn Leu Ala Asp Val He Lys Thr
1 5 10 15
Leu Phe Asn Leu Gly Leu Met Val Thr Lys Asn Asp Phe Leu Asp Lys 20 25 30 Asp Ser He Glu He Leu Ala Glu Glu Phe His Leu Glu He Ser Val 35 40 45
Gin Asn Thr Leu Glu Glu Phe Glu Val Glu Glu Val Leu Glu Gly Val
50 55 60
Lys Lys Glu Arg Pro Pro Val Val Thr He Met Gly His Val Asp His 65 70 75 80
Gly Lys Thr Ser Leu Leu Asp Lys He Arg Asp Lys Arg Val Ala His
85 90 95
Thr Glu Ala Gly Gly He Thr Gin His He Gly Ala Tyr Met Val Glu 100 105 110 Lys Asn Asp Lys Trp Val Ser Phe Pro Asn Val Asn Pro Asp Lys Leu 115 120 125
Lys Ala Glu Cys Ala Glu Leu Gly Tyr Asn Pro Val Asp Trp Gly Gly
130 135 140
Glu His Glu Phe He Pro Val Ser Ala Lys Thr Gly Asp Gly He Asp 145 150 155 160
Asn Leu Leu Asp Thr He Leu He Gin Ala Asp He Met Glu Leu Lys
165 170 175
Ala He Glu Glu Gly Ser Ala Arg Ala Val Val Leu Glu Gly Ser Val 180 185 190 Glu Lys Gly Arg Gly Ala Val Ala Thr Val He Val Gin Ser Gly Thr 195 200 205
Leu Ser Val Gly Asp Ser Phe Phe Val Glu Thr Ala Phe Gly Lys Val
210 215 220
Arg Thr Met Thr Asp Asp Gin Gly Lys Ser He Gin Asn Leu Lys Pro 225 230 235 240
Ser Met Val Ala Leu He Thr Gly Leu Ser Glu Val Pro Pro Ala Gly
245 250 255
Ser Val Leu He Gly Val Glu Asn Asp Ser He Ala Arg Leu Gin Ala 260 265 270 Gin Lys Arg Ala Thr Tyr Leu Arg Gin Lys Ala Leu Ser Lys Ser Thr 275 280 285
Lys Val Ser Phe Asp Glu Leu Ser Glu Met Val Val Asn Lys Glu Leu 290 295 300
Lys Asn He Pro Val Val He Lys Ala Asp Thr Gin Gly Ser Leu Glu 305 310 315 320
Ala He Lys Asn Ser Leu Leu Glu Leu Asn Asn Glu Glu Val Ala He 325 330 335
Gin Val He His Ser Gly Val Gly Gly He Thr Glu Asn Asp Leu Ser
340 345 350
Leu Val Ser Ser Ser Glu His Ala Val He Leu Gly Phe Asn He Arg 355 360 365 Pro Thr Gly Asn Val Lys Asn Lys Ala Lys Glu Tyr Asn Val Ser He 370 375 380
Lys Thr Tyr Thr Val He Tyr Ala Leu He Glu Glu Met Arg Ser Leu 385 390 395 400
Leu Leu Gly Leu Met Ser Pro He He Glu Glu Glu His Thr Gly Gin 405 410 415
Ala Glu Val Arg Glu Thr Phe Asn He Pro Lys Val Gly Met He Ala
420 425 430
Gly Cys Val Val Ser Asp Gly Val He Ala Arg Gly He Lys Ala Arg 435 440 445 Leu He Arg Asp Gly Val Val Val His Thr Gly Glu He Leu Ser Leu 450 455 460
Lys Arg Phe Lys Asn Asp Val Lys Glu Val Ser Lys Gly Tyr Glu Cys 465 470 475 480
Gly He Met Leu Asp Asn Tyr Asn Glu He Lys Val Gly Asp Val Phe 485 490 495
Glu Thr Tyr Lys Glu He His Lys Lys Arg Thr Leu 500 505
(2) INFORMATION FOR SEQ ID NO:458:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 64b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:458:
Met Asn Ala His Lys Glu Arg Leu Glu Ser Asn Leu Leu Glu Leu Leu
1 5 10 15
Gln Glu Ala Leu Ala Ser Leu Asn Asp Ser Glu Leu Asn
20 25
(2) INFORMATION FOR SEQ ID NO: 459:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 132 amino acids (B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 65a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 459:
Glu Phe Leu Asn Pro Thr Phe Val Thr Gin Tyr Pro He Glu He Ser 1 5 10 15 Pro Leu Ala Arg Arg Asn Asp Ser Asn Pro Asn He Ala Asp Arg Phe 20 25 30
Glu Leu Phe He Ala Gly Lys Glu He Ala Asn Gly Phe Ser Glu Leu
35 40 45
Asn Asp Pro Leu Asp Gin Leu Glu Arg Phe Lys Asn Gin Val Ala Glu 50 55 60
Lys Glu Lys Gly Asp Glu Glu Ala Gin Tyr Met Asp Glu Asp Tyr Val 65 70 75 80 Trp Ala Leu Ala His Gly Met Pro Pro Thr Ala Gly Gin Gly He Gly
85 90 95
He Asp Arg Leu Val Met Leu Phe Thr Gly Ser Lys Ser He Lys Asp 100 105 110 Val He Leu Phe Pro Ala Met Arg Pro Val Lys Asn Asp Phe Asn Val 115 120 125
Glu Ser Glu Glu
130
(2) INFORMATION FOR SEQ ID NO:460:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 65b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:460:
Met Ala Tyr Phe Leu Glu Gln Thr Asp Ser Glu Ile Phe Glu Leu Ile
1 5 10 15
Phe Glu Glu Tyr Lys Arg Gln Asn Glu His Leu Glu Met Ile Ala Ser
20 25 30
(2) INFORMATION FOR SEQ ID NO:461:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 127 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 66a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 461:
Tyr Glu Trp Gly Ala Tyr Leu Lys Arg Thr Gly Leu Gly Glu His Glu 1 5 10 15
Met Ala Phe Ala Gly Trp Met Ala Tyr He Ala Asp Pro Asp Asn Phe
20 25 30
Leu Tyr Thr Leu Trp Ser Lys Gin Ala Ala Ser Ala He Pro Thr Gin 35 40 45 Asn Gly Ser Phe Tyr Lys Ser Asp Ala Phe Ser Asp Leu Leu He Lys 50 55 60
Ala Lys Arg Val Ser Asp Gin Lys Glu Arg Glu Ala Leu Tyr Leu Lys 65 70 75 80
Ala Gin Glu He He His Lys Asp Ala Pro Tyr Val Pro Leu Ala Tyr 85 90 95
Pro Tyr Ser Val Val Pro His Leu Ser Lys Val Lys Gly Tyr Lys Thr
100 105 110
Thr Gly Val Ser Val Asn Arg Phe Phe Lys Val Tyr Leu Glu Lys 115 120 125
(2) INFORMATION FOR SEQ ID NO:462:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 66b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:462:
Met Leu Ser Phe Ile Ile Lys Arg Ile Leu Trp Ala Ile Pro Thr Leu
1 5 10 15
Phe Gly Val Ser Ile Ile Val Phe Met Met Val His Leu Val Pro Gly
20 25 30
Asp Pro Ala
35
(2) INFORMATION FOR SEQ ID NO: 463: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 358 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 67, antigen
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:463: Ala Ala Gly Asn He Gly Gly Gly Gly Phe Ala Val He His Leu Ala 1 5 10 15
Asn Gly Glu Asn Val Ala Leu Asp Phe Arg Glu Lys Ala Pro Leu Lys
20 25 30
Ala Thr Lys Asn Met Phe Leu Asp Lys Gin Gly Asn Val Val Pro Lys 35 40 45
Leu Ser Glu Asp Gly Tyr Leu Ala Ala Gly Val Pro Gly Thr Val Ala
50 55 60
Gly Met Glu Ala Met Leu Lys Lys Tyr Gly Thr Lys Lys Leu Ser Gin 65 70 75 80 Leu He Asp Pro Ala He Lys Leu Ala Glu Asn Gly Tyr Val He Ser
85 90 95
Gin Arg Thr Ser Arg Asn Pro Lys Arg Ser Lys Gly Ser Gly Phe Leu
100 105 110
Lys Tyr Thr Ser Ser Lys Lys Val Phe Phe Leu Lys Lys Asp Thr Leu 115 120 125
He He Lys Lys Gly He Cys Leu Ser Gin Lys Asp Leu Ala Gin Thr
130 135 140
Leu Asn Gin He Lys Thr Leu Gly Ala Lys Gly Phe Tyr Gin Gly Gin 145 150 155 160 Val Ala Glu Leu He Glu Lys Asp Met Lys Lys Asn Gly Gly He He
165 170 175
Thr Lys Glu Asp Leu Ala Ser Tyr Asn Val Lys Trp Arg Lys Pro Val
180 185 190
Val Gly Ser Tyr Arg Gly Tyr Lys He He Ser Met Ser Pro Pro Ser 195 200 205
Ser Gly Gly Thr His Leu He Gin He Leu Asn Val Met Glu Asn Ala
210 215 220
Asp Leu Ser Thr Leu Gly Tyr Gly Ala Ser Lys Asn He His He Ala 225 230 235 240 Ala Glu Ala Met Arg Gin Ala Tyr Ala Asp Arg Ser Val Tyr Met Gly
245 250 255
Asp Ala Asp Phe Ala Ser Val Pro Val Asp Lys Leu He Asn Lys Ala
260 265 270
Tyr Ala Lys Lys He Phe Asp Thr He Gin Pro Asp Thr Val Thr Pro 275 280 285
Ser Ser Gin He Lys Pro Gly Met Gly Gin Leu His Glu Gly Ser Asn
290 295 300
Thr Thr His Tyr Ser Val Ala Asp Arg Trp Gly Asn Ala Val Ser Val 305 310 315 320 Thr Tyr Thr He Asn Ala Ser Tyr Gly Ser Ala Ala Ser He Asp Gly
325 330 335
Ala Gly Phe Leu Leu Asn Asn Glu Met Asp Asp Phe Ser He Lys Pro
340 345 350
Gly Asn Pro Asn Leu Tyr 355
(2) INFORMATION FOR SEQ ID NO: 464: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 234 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 68a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 464:
Val Ala Lys Lys Val Phe Lys Arg Leu Glu Lys Leu Phe Ser Lys He
1 5 10 15
Gin Asn Asp Lys Ala Tyr Gly Val Glu Gin Gly Val Glu Val Val Lys 20 25 30 Ser Leu Ala Ser Ala Lys Phe Asp Glu Thr Val Glu Val Ala Leu Arg 35 40 45
Leu Gly Val Asp Pro Arg His Ala Asp Gin Met Val Arg Gly Ala Val
50 55 60
Val Leu Pro His Gly Thr Gly Lys Lys Val Arg Val Ala Val Phe Ala 65 70 75 80
Lys Asp He Lys Gin Asp Glu Ala Lys Asn Ala Gly Ala Asp Val Val
85 90 95
Gly Gly Asp Asp Leu Ala Glu Glu He Lys Asn Gly Arg He Asp Phe 100 105 110 Asp Met Val He Ala Thr Pro Asp Met Met Ala Val Val Gly Lys Val 115 120 125
Gly Arg He Leu Gly Pro Lys Gly Leu Met Pro Asn Pro Lys Thr Gly
130 135 140
Thr Val Thr Met Asp He Ala Lys Ala Val Thr Asn Ala Lys Ser Gly 145 150 155 160
Gin Val Asn Phe Arg Val Asp Lys Lys Gly Asn Val His Pro Pro He
165 170 175
Gly Lys Ala Ser Phe Pro Glu Glu Lys He Lys Glu Asn Met Leu Glu 180 185 190 Leu Val Lys Thr He Asn Arg Leu Lys Pro Ser Ser Ala Lys Gly Lys 195 200 205
Tyr He Arg Asn Ala Ala Leu Ser Leu Thr Met Ser Pro Ser Val Ser
210 215 220
Leu Asp Ala Gin Glu Leu Met Asp Val Lys 225 230
(2) INFORMATION FOR SEQ ID NO: 465:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 68b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:465:
Ala Ser Gly Val Glu Lys Gly Ser Asp Asn Pro Leu Lys Asn Lys Ile
1 5 10 15
Ala Lys Leu Thr His Lys Gln Val Glu Glu Ile Ala Gln Leu Lys Met
20 25 30
Glu Asp Leu Asn Thr Ser Thr Met Glu Ala Ala Lys Lys Ile Val Met
35 40 45
Gly Ser Ala Arg Ser Met Gly Val Glu Val Val Asp
50 55 60
(2) INFORMATION FOR SEQ ID NO:466: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 443 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 69a, antigen 1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:466:
Met Ser Lys Leu Asn Met Thr Ser Arg Glu He Val Ala Tyr Leu Asp 1 5 10 15 Glu Tyr He He Gly Gin Lys Glu Ala Lys Lys Ser He Ala He Ala 20 25 30
Phe Arg Asn Arg Tyr Arg Arg Leu Gin Leu Glu Lys Ser Leu Gin Glu
35 40 45
Glu He Thr Pro Lys Asn He Leu Met He Gly Ser Thr Gly Val Gly 50 55 60
Lys Thr Glu He Ala Arg Arg He Ala Lys He Met Glu Leu Pro Phe 65 70 75 80
Val Lys Val Glu Ala Ser Lys Tyr Thr Glu Val Gly Phe Val Gly Arg 85 90 95 Asp Val Glu Ser Met Val Arg Asp Leu Val Asn Asn Ser Val Leu Leu 100 105 110
Val Glu Asn Glu His Lys Glu Lys Leu Lys Asp Lys He Glu Glu Ala
115 120 125
Val He Glu Lys He Ala Lys Lys Leu Leu Pro Pro Leu Pro Asn Gly 130 135 140
Val Ser Glu Glu Lys Lys Gin Glu Tyr Ala Asn Ser Leu Leu Lys Met
145 150 155 160
Gin Gin Arg He Ala Gin Gly Glu Leu Asp Ser Lys Glu He Glu He
165 170 175 Glu Val Arg Lys Lys Ser He Glu He Asp Ser Asn Val Pro Pro Glu
180 185 190
He Leu Arg Val Gin Glu Asn Val He Lys Phe Phe His Lys Glu Gin
195 200 205
Asp Lys Val Lys Lys Thr Leu Ser Val Lys Glu Ala Lys Glu Ala Leu 210 215 220
Lys Ala Glu He Ser Asp Thr Leu Leu Asp Ser Glu Ala He Lys Met
225 230 235 240
Glu Gly Leu Lys Arg Ala Glu Ser Ser Gly Val He Phe He Asp Glu
245 250 255 He Asp Lys He Ala Val Ser Ser Lys Glu Gly Ser Arg Gin Asp Pro
260 265 270
Ser Lys Glu Gly Val Gin Arg Asp Leu Leu Pro He Val Glu Gly Ser
275 280 285
Val Val Asn Thr Lys Tyr Gly Ser He Lys Thr Glu His He Leu Phe 290 295 300
He Ala Ala Gly Ala Phe His Leu Ser Lys Pro Ser Asp Leu He Pro
305 310 315 320
Glu Leu Gin Gly Arg Phe Pro Leu Arg Val Glu Leu Glu Asn Leu Thr
325 330 335 Glu Glu He Met Tyr Met He Leu Thr Gin Thr Lys Thr Ser He He
340 345 350
Lys Gin Tyr Gin Ala Leu Leu Lys Val Glu Gly Val Glu He Ala Phe
355 360 365
Glu Asp Asp Ala He Lys Glu Leu Ala Lys Leu Ser Tyr Asn Ala Asn 370 375 380
Gin Lys Thr Glu His He Gly Ala Arg Arg Leu His Thr Thr He Glu 385 390 395 400
Lys Val Leu Glu Asp He Thr Phe Glu Pro Glu Asp Tyr Ser Gly Gin 405 410 415 Asn He Thr Thr Thr Lys Glu Leu Val Gin Ser Asn Leu Glu Asp Leu 420 425 430
Val Ala Asp Glu Asn Leu Val Lys Tyr He Leu 435 440 (2) INFORMATION FOR SEQ ID NO: 467:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 137 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 69b, antigen 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 467: Ser Leu Tyr His Asn Gin Val Leu Ser Gly Phe Ala Gly Ser Thr Ala 1 5 10 15
Asp Ala Phe Ser Leu Phe Asp Met Phe Glu Arg He Leu Glu Ser Lys
20 25 30
Lys Gly Asp Leu Phe Lys Ser Val Val Asp Phe Ser Lys Glu Trp Arg 35 40 45
Lys Asp Lys Tyr Leu Arg Arg Leu Glu Ala Met Met He Val Leu Ser
50 55 60
Leu Asp Leu He Phe He Leu Ser Gly Thr Gly Asp Val Leu Glu Ala 65 70 75 80 Glu Asp Asn Lys He Ala Ala He Gly Ser Gly Gly Asn Tyr Ala Leu
85 90 95
Gly Ala Ala Arg Ala Leu Asp His Phe Ala His Leu Gin Pro Arg Lys
100 105 110
Leu Val Glu Glu Ser Leu Lys He Ala Gly Asp Leu Cys He Tyr Thr 115 120 125
Asn Thr Asn He Lys He Leu Glu Leu 130 135
(2) INFORMATION FOR SEQ ID NO: 468:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 69c, antigen 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:468:
Met Met Lys Thr Lys Ala Gly Phe Val Ser Leu Met Gly Lys Pro Asn
1 5 10 15
Ala Gly Lys Ser Thr Leu Leu Asn Thr Leu
20 25
(2) INFORMATION FOR SEQ ID NO:469:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 1 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:469:
CTCCTTAAGA GAGAGCGCTA AATTAGAATT AGAAGAGCAT GCCAATAACC CTTTTGTGAA 60
AGAGATTTGC TCTTTTATTT TAGAGAGTTC TAGGGGCGTG GCGTATAAGT CAAGCGAATA 120
TTCTAGCGAA GAAAAACAAG AGGAATAACA TGAACGAAAC GCTTTATTGC AGTTTTTGCA 180
AAAAACCAGA ATCCAGAGAT CCCAAAAAAC GCCGCATTAT TTTTGCGAGC AACCT 235
(2) INFORMATION FOR SEQ ID NO:470:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 313 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 1 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:470:
ATTTTATTCA TTTGCGCTGG AGCGTTTGAT GGGTTAGCGG AAATCATTAA AAAACGCACC 60
ACGCAAAATG TGTTGGGTTT CACTCAAGAA AAGATGAGCA AAAAAGAGCA AGAAGCGATT 120
TTGCATTTAG TCCAAACCCA TGATTTGGTA ACTTATGGGC TTATCCCTGA GCTTATCGGC 180
CGTTTGCCGG TTTTAAGCAC GCTAGATAGT ATCAGTCTAG AAGCGATGGT GGATATTTTG 240
CAAAAACCTA AAAACGCTCT CATCAAGCAA TACCAGCAAC TTTTCAAAAT GGATGAAGTG 300
GATTTGATTT TTG 313
(2) INFORMATION FOR SEQ ID NO: 471:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1236 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 471: CGGTGTTTAA TTTGAATCTG ATCACTCTAA ACAGCAAATA TATCCGCAGA CTTACCGTGA 60
ATGGCTCTAA CCAGATGCCT CGTTTTTCTA CGGATGGGAG AAATATCATG TATATCAAAA 120
AGACACCACA AGAATACGCC ATGGGGCTTA TTTTGCTAGA CTATAATCAG AGTTTTTTAT 180
TCCCTTTAAA GAATGTGAAA ATACAAGCCT TTGATTGGTA AGGTTAAATT AAGCGAATGT 240
GAGTTAATAT TTACACTTAT TAAAATTTTA TTCTTGGAGA ATTTATAATG AAGAGATCTT 300 CTGCATTTAG TTTCTTGGTA GCTTTTTTAT TGGTAGCTGG CTGTAGTCAT AAAATGGATA 360
ATAAGACTGT GGCTGGCGAT GTGAGCGCTA AAACGGTTCA GACTGCACCT GTTACTACAG 420
AACCAGCTCC AGAGAAAGAA GAGCCTAAAC AAGAGCCAGC TCCAGTGGTT GAAGAAAAGC 480
CGGCTATTGA AAGCGGGACT ATCATCGCTT CTATTTATTT TGATTTTGAC AAGTATGAAA 540
TCAAAGAATC CGATCAAGAG ACTTTAGATG AGATCGTGCA AAAAGCTAAA GAAAACCACA 600 TGCAAGTGCT TTTGGAAGCC AATACCGATG AATTTGGCTC TAGCGAATAC AACCAAGCGC 660
TTGGCGTTAA AAGGACTTTG AGCGTGAAAA ACGCTTTAGT CATTAAAGGG GTAGAAAAAG 720
ATATGATCAA AACCATCAGT TTTGGTGAAA CCAAACCCAA ATGCGCCCAA AAAACTAGAG 780
AATGTTACAA AGAAAACAGA AGAGTGGATG TCAAATTAGT GAAGTAATTC TAGGATGAAA 840
AGGTTTTTTT TTATCCCTTT TATCGCTCCC TTTTTTCTCA ATGGGGAGCC TTCAGCGTTT 900 GATTTRCAAA GCGGAGCCAC CAAAAAAGAA CTCAAGCAGT TGCAAGTCAA TAGTAAGAAT 960
TTTTCCAATA TTTTGACCAA AATCCATTCG CAAGTAGAGG CTAACACTCA AGCTCAAGAG 1020
GGTTTGAGAA GCGTTTATGA GGGGCAGGCT AATAAGATTA AAGATCTCAA TAACGCTATC 1080
CTTTCCCAAG AAGAATCCTT GCGAGCCTTA AAAGCCTTGC AAGAAGTGCA AGCTAACACA 1140
TTAAAGCAGC AATCGCAAAC TTTAGAGGAT TTAAGGAATG AGATTCACGC TAACCAACAG 1200 GCCATCCAAC AGCTAGACAA GCAAAATAAA GAGATG 1236
(2) INFORMATION FOR SEQ ID NO:472:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1082 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:472: GGCGAACATT TTTACCCACT TACCCACTTT GAACGGAATG AATAACTTTA AGCATTTCTT 60 ACCTCTCAAT GTGAGTTTTC TGCAGTCATG ATAGCTGATT TTGTTTTAAA TTTGCTATAA 120 TGTAAATTTA ATGATGAAAA TTAGTTTAGA GTGGAGAACA CACAATGAAA AAAAATATCT 180 TAAATTTAGC GTTAGTGGGC GCGTTGAGTG CGTCGTTTTT GATGGCTAAG CCGGCTCATA 240
ACGCAGATAA CGCTACGCAT AACACCAAAA AAACGACTGA TTCTTCACCC GGCGTGTTAG 300
CGACAGTGGA TGGCAGACCT ATCACTAAAA GCGATTTTGA TATGATTAAG CAACGAAATC 360
CTAATTTTGA TTTTGACAAG CTTAAAGAGA AAGAAAAAGA AGCCTTGATT GAGCAAGCTA 420 TCCGCACCGC ACTTGTAGAA AATGAGGCTA AGGCAGAAAA GCTCGATCAG ACTCCAGAAT 480
TTAAAGCGAT GATGGAAGCG GTTAAAAAAC AGGCTTTAGT GGAATTTTGG GCTAAAAAAC 540
AGGCTGAAGA AGTGAAAAAA GTCCAAATCC CAGAAAAAGA AATGCAAGAT TTTTACAACG 600
CTAATAAAGA TCAGCTTTTT GTCAAGCAAG AAGCCCATGC TAGGCATATT TTAGTGAAAA 660
CCGAAGATGA GGCTAAACGG ATTATTTCTG AGATTGACAA ACAGCCAAAG GCTAAAAAAG 720 AAGCCAAATT CATTGAGTTA GCCAATCGGG ATACGATTGA TCCTAACAGC AAGAACGCGC 780
AAAATGGCGG TGATTTGGGG AAATTCCAAA AGAACCAAAT GGCTCCGGAT TTTNCTAAAG 840
CCGCTTTCGC TTTAACTYCT GGGGATTACA CTAAAACCCC TGTTAAAACA GAGTTTGGTT 900
ATCATATTAT CTATTTGATT TCTAAAGATA GCCCTGTAAC TTATACTTAT GAGCAAGCTA 960
AACCTACCAT TAAGGGGATG TTACAAGAAA AGCTTTTCCA AGAACGCATG AATCAACGCA 1020 TTGAGGAATT AAGGAAGCAC GCTAAAATTG TTMTCAACAA GTAGATGAGG TGTTATCATG 1080
TT 1082
(2) INFORMATION FOR SEQ ID NO:473: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1540 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:473:
CTTCGTGTGC CTACCACTAA TTTGATTTCT TGCGAATAAA GCTAGCTTGA ATGCGGGCAA 60
TATCCAAATC CAAAACATGC CCAAAGTTAA AGAGCGAGTG AGTGTCCCCT CTAAAGACGA 120
TACGGATCTA TTCTTACCAC GATTCTATTA AGGATTCGAT TAAGGCGGTG GTGAATATCT 180
CYACTGAAAA RAAGATTAAA AACAATTTTA TAGGTGGCGG TGTGTTTAAT GACCCCTTTT 240 TCCAACAATT TTTTGGGGAT TTGGGTGGCA TGATCCCTAA AGAAAGAATG GAAAGGGCTT 300
TAGGCAGCGG TGTCATCATT TCTAAAGACG GCTATATTGT AACTAATAAC CATGTGATTG 360
ATGGCGCGGA TAAGATTAAA GTGACCATTC CAGGGAGCAA TAAAGAATAT TCCGCTACTT 420
TAGTAGGCAC GGATTCTGAA AGCGATTTAG CCGTGATTCG CATCACTAAA GACAACTTGC 480
CCACGATCAA ATTCTCTGAT TCTAACGATA TTTCAGTGGG CGATTTGGTT TTTGCGATTG 540 GTAACCCTTT TGGCGTGGGT GAAAGCGTTA CTCAAGGCAT TGTTTCAGCG CTCAATAAAA 600
GCGGGATTGG GATCAACAGC TATGAGAATT TCATTCAAAC AGACGCCTCT ATTAATCCTG 660
GAAATTCCGG CGGCGCTTTA ATTGATAGCC GTGGAGGGTT AGTGGGGATT AATACCGCTA 720
TCATCTCTAA AACTGGGGGC AACCACGGCA TTGGCTTTGC CATCCCTTCT AACATGGTTA 780
AAGATATTGT AACCCAACTC ATCAAAACCG GTAAGATTGA AAGAGGTTAC TTGGGCGTGG 840 GCTTGCAAGA TTTGAGCGGC GATTTGCAAA ATTCTTATGA CAATAAAGAA GGGGCCGTAG 900
TCATTAGCGT AGAAAAAGAC TCCCCGGCTA AAAAAGCAGG GATTTTGGTG TGGGATTTGA 960
TCACCGAAGT CAATGGCAAA AAGGTTAAAA ACACGAACGA ATTGAGAAAT CTAATCGGCT 1020
CTATGCTACC CAATCAAAGG GTAACCTTAA AGGTCATTAG AGACAAAAAA GAACGCGCCT 1080
TCACCCTCAC ACTTGCTGAA AGGAAAAACC CTAACAAAAA AGAAACCATT TCTGCTCAAA 1140 ACGGCGCGCA AGGCCAATTG AACGGGCTTC AAGTAGAAGA TTTAACCCAA AAAACCAAAA 1200
GGTCTATGCG TTTGAGCGAT GATGTTCAAG GGGTTTTAGT CTCTCAAGTG AATGAAAATT 1260
CCCCAGCAGA GCAAGCCGGA TTTAGGCAAG GTAACATTAT CACAAAAATT GAAGAGGTTG 1320
AAGTTAAAAG CGTTGCGGAT TTTAACCATG CTTTAGAAAA GTATAAAGGC AAACCCAAAC 1380
GATTCTTAGT TTTAGATTTG AATCAAGGTT ATAGGATCAT TTTGGTGAAA TGATAGAGGT 1440 GGGTTGTTAG TCGCATGTCT TTGATTAGAG TGAATGGGGA AGCTTTTAAA CTCTCTTTAG 1500
AAAGTTTAGA AGAAGACCCT TTTGAAACTA AAGAAACGCT 1540
(2) INFORMATION FOR SEQ ID NO: 474: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 222 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 5 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:474:
GGTTGTTCGC TGAAACGCAT GTTGATCCTA AAAACGCCCT AAGCGATGGG GCGAACATGC 60
TAAAACCTAG CGAGCTAGAA CACTTAGTAA CCGACATGTT AAAAATCCAA AATTTATTTT 120 AAAGGAATTT CATGCGAATC ATAGAAGGGA AATTGCAATT GCAAGGGAAT GAAAGAGTCG 180
CTATTTTAAC ATCGCGCTTC AATCATATCA TCACAGACAG AT 222
(2) INFORMATION FOR SEQ ID NO:475:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 241 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 5 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 475
ATGTGAGCGC GGAAGCGACT AAGGGCATTG CCCATGCGAT GCTCAAATAC AGCATGCCGG 60
TAAGCTTTGG CGTGCTGACC ACAGACAATA TTGAACAAGC GATTGAAAGA GCGGGCAGTA 120
AAGCCGGCAA TAAGGGCTTT GAAGCGATGA GCACCCTCAT TGAATTGTTG AGCTTGTGCC 180
AAACTCTCAA GGGTTAAAAT GGCGACACGA ACTCGAGCCA GGGGGGCTGT GGTTGAATTG 240
T 241
(2) INFORMATION FOR SEQ ID NO: 476:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1166 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 476:
TAGCGGATAC CACGCAGAGC ATTTAGCCGG CAAAGAAGCC CTTTTTAAAG TGAAATTGCG 60
CCAGATTCAA GCGCGTGAAG TGTTAGAAAT CAATGACGAA CTCGCTAAAA TCGTGCTAGC 120
TAATGAAGAG GATGCGACCT TAAAGCTTTT AAAAGAAAGG GTTGAGGGGC AGTTGTTTTT 180
AGAAAATAAA GCCAGGCTCT ATAATGAAGA GTTGAAAGAA AAATTGATTG AAAATTTAGA 240
TGAAAAGATT GTTTTTGATT TGCCTAAAAC GATCATAGAG CAAGAAATGG ATTTGTTGTT 300 CAGGAACGCT CTTTATTCCA TGCAAGCTGA GGAAGTCAAA TCCTTACAAG AAAGTCAAGA 360
AAAAGCCAAA GAAAAGCGTG AGAGCTTTAG GAACGATGCC ACAAAAAGCG TGAAAATCAC 420
TTTTATCATT GACGCTTTAG CGAAAAGAAG AAAAATTGGC GTGCATGACA ATGAAGTCTT 480
TCAAACCTTG TATTATGAAG CGATGATGAC AGGGCGAAAC CCAGAAAGTC TCATTGAACA 540
ATACCGCAAA AATAACATGT TAGCGGCGGT GAAAATGGCG ATGATTGAAG ATAGGGTGTT 600 AGCTTATTTG TTGGATAAAA ACCTGCCTAA AGAGCAACAA GAAATTTTGG AAAAAATGAG 660
GCCCAACGCT CAAAAAATTC AAGCGGGTTA AACGGCTAAA AAGGAGAGAT GATGGGATAC 720
ATTCCTTATG TAATAGAGAA TACCGATCGT GGGGAGCGCA GCTATGATAT TTACTCGCGC 780
CTTTTAAAGG ATCGCATTGT TTTATTGAGC GGTGAGATTA ACGATAGCGT GGCGTCTTCT 840
ATCGTGGCCC AACTCTTGTT TTTGGAAGCT GAAGACCCTG AAAAAGACAT TGGCTTGTAT 900 ATCAATTCTC CCGGTGGGGT GATAACAAGC GGTCTTAGCA TCTATGATAC CATGAATTTT 960
ATCCGCCCTG ATGTTTCCAC GATTTGCATC GGTCAAGCGG CTTCTATGGG GGCGTTTTTA 1020
CTGAGCTGTG GGGCTAAGGG CAAGCGCTTT TCACTACCCC ATTCAAGGAT TATGATCCAC 1080
CAGCCTTTAG GGGGGGCTCA AGGGCAAGCG AGCGATATTG AAATCATTTC TAACGAGATC 1140
CTTAGGCTTA AGGGTTTGAT GAATTC 1166
(2) INFORMATION FOR SEQ ID NO: 477:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 7 (start)
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 266...266
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 477:
GGCATTTATT GGCGGACACT TACGCTCTAG CTGCAACGAC TGGGAATGTG TTTATCCTAA 60
AAGCCGATTA TAATATCCAC AAATGGGGAC TTACTTTAAC CTGGCTCTCG CGTTTTGTAA 120
CTAACATGTT TTATGAAGGT TATTCTATCT ACTATCCGCA ATACGGCTTG ATGAAAATCC 180 ATAAACCCGG GTATGGCGTG CATAATGTCT TTATCAACTG GACCCCCACT TCTAAAAAAT 240
GGCAGGGTTT AAGGATTTCA GCCGT 265
(2) INFORMATION FOR SEQ ID NO: 478:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 660 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 8
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 309-463
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 478:
TCTAAAACGA TTTCTGCGCC TTTGTATTTG CCAAAAGCGA TCACATCGCC TTCTTTAACG 60
CATTTGCAAC CCTCACTGAT TTTATGGCTA ACGGCTTTGA CTACGCCCAT TAAAGGCTTT 120
TCTTTAGCGT TATCAGGGAT GATGATGCCT GAACTGGTTT TGTTCTCTTC TTCAAGTCTT 180 TCTACTAAGA CCCTTTCTCC TAATGGTTGA AACTTCATTT CAGTTCTCCT AAGTTTTGAT 240
AAATAAAATA GAAATTTAGC GCTTATTGTT ATTAAGCGAC ATAACTATAG CGCATTTTTA 300
GGGAGAAGNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 360
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 420
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNGAAAGTC AAACCAAAAA 480
AGCCATTTAA AGAAAGCATG AAAAATGATT CTTAAAAGTT CCATTGATCG CCTTTTGCAA 540
ACGATAGACA TTGTAGAAGT CATTAGCTCC TATGTGGATT TGAGGAAGTC AGGCTCGAAT 600
TACATGGCTT GTTGCCCTTT TCATGAAGAA AGGAGCGCGA GTTTTAGCGT CAATCAAATT 660
(2) INFORMATION FOR SEQ ID NO: 479:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1206 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 479:
AGCTTATTCC TTTTATTGTA AGGATTTAGG CTATTGAACT TTAGGAGTTT TAATGATATT 60
AAGAGCGAGT GTGTTGAGCG CGTTACTTCT TGTAAGCTTA GGGGCAGCCC CTAAACATTC 120
AGTTTCAGCT AATGACAAAC GGATGCAGGA TAATTTAGTG AGCGTGATTG AAAAACAGAC 180
CAATAAAAAG GTGCGTATTT TAGAAATCAA ACCTTTAAAA TCCAGCCAGG ATTTAAAAAT 240
GGTCGTTATT GAAGATCCGG ACACTAAATA CAATATCCCG CTTGTGGTGA GTAAGGATGG 300
CAATTTAATC ATAGGGCTTA GCAGCATATT CTTTAGCTAT AAAAGCGATG ATGTGCGATT 360
AGTTGCGGAA ACCAATCAGA ATGTTCAAGC TGCGTTAACG CTACCCAGCA AAAGCAGCGC 420
GAAAGTTGAA GCGCTAGTTT GTGTGAATGA GAATATACCG GCTGATTATG CGATAGAGTT 480
GCCCTCTACT AACGCTGAAA ATAAGGATAA AATCCTTTAT ATTGTCTCTG ATCCCATGTG 540
CCCGCATTGC CAAAAAGAGC TCACTAAACT TAGGGATCAC TTAAAAGAAA ACACCGTGAG 600 AATGGTTGTA GTGGGGTGGC TTGGAGTCAA TTCGGCTAAA AAAGCGGCTT TAATCCAAGA 660
AGAAATGGCG AAAGCTAGGG CTAGGGGAGC GAGCGTGGAA GATAAAATCT CTATTCTTGA 720
AAAGATTTAT TCCACCCAAT ACGATATTAA CGCTCAAAAA GAGCCTGAAG ATTTACGCAC 780
TAAAGTGGAA AATACCACTA AAAAGATTTT TGAATCTGGC GTGATTAAGG GTGTGCCTTT 840
CTTGTACCAT TATAAGGCAT GATATAAGGT TACTCTCATG AAAAAACCCT ACAGGAAGAT 900 TTCTGATTAT GCGATCGTGG GTGGTTTGAG CGCGTTAGTG ATGGTGAGCA TTGTGGGGTG 960
TAAGAGCAAT GCCGATGACA AACCAAAAGA GCAAAGCTCT TTAAGTCAAA GCGTTCAAAA 1020
AGGTGCGTTT GTGATTTTAG AAGAGCAAAA GGATAAATCT TACAAGGTTG TTGAAGAATA 1080
CCCTAGCTCA AAAACCCACA TCATAGTGCG CGATTTGCAA GGCAATGAAC GAGTGTTAAG 1140
CAATGAAGAG ATTCAAAAGC TCATCAAAGA AGAAGAAGCC AAAATTGATA ACGGCACGAG 1200
CAAGCT 1206
(2) INFORMATION FOR SEQ ID NO: 480:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 930 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 10
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 480:
CGGTTTTTTA GGGGAATTTT TTTTGACATA TTTCAAGCCT TTGATCAAAA ATTTTGAGTT 60
ATTTAATGCA TTAACTCAAG TAAGCATTGC AAAAGGATCT TTAAACTTTA GAACTATCCA 120
TGATAAAAAT ACCAAAAACC CACCTGCAAT ACTAACCACT TTAGGCCACA CGCTCACAAA 180
CAAAATCATG GAGCAAAATC ATCTGCCGCA CTAGATCAGA GAAAGGCGCA AAAGGCCTTT 240
CTGTTTTTAA GTCTTACTTG ACTTCAACCT TAGCACCTAC TTCTTCAAGT TTCTTCTTGA 300 TGGTTTCAGC TTCTTCTTTA TTCACGCCCT CTTTAAGCAC ATGAGGGGTT TTTTCGGTAG 360
CGTCTTTAGC TTCTTTCAGG CCAAGTCCAG TGATTTCACG AACCACTTTA ATCACCTTAA 420
TTTTTTCAGC ACCGCTATCG GCTAAAATCA CATTAAATTC GGTTTTTTCT TCGCTCTCAG 480
CCGCTGCACC GCCAGCTACA GCCGCACCCG CTACGACCGT TGGAGTCGCG CTCACGCCAA 540
ATTTTTCCTC AAACATTTTA ACCAATTCAG CAAGCTCTAA AACGCTCAAT GAACCAATAT 600 ACTCTAACAC TTCTTCTTTT GAAATTGCCA TAATCCAATC CTTCAAATTT TTTTAATTAA 660
AGCCATCATA GGCTCTTAGT TTTCTTCTTT CGCTTTACGC AAATTGTCTA AACCGGTCAC 720
AAAATAACGC ACCGGAGCCG TCCAAACAGA AAGCAACATT CCCATAAGCT CTTCTTTGCT 780
TGGGAGTTTT GAAACCGCTT CCACATGAGC TACGCTAACG CTTTCTTTAT CAAACAAGCC 840
CGCTTTCAAC ACAAAGTGAT CTTTATGCTC TTTTTGGAAA TCAAACACGA GTTTAGAGAG 900 AGCGATTTGA TCATCGCCCC ACAAAAACAC 930
(2) INFORMATION FOR SEQ ID NO:481:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 945 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 11
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 481:
AATACTTTTT CTAAAAAATC CATGGTTAAA TTGCCCTTTC GTTTTAAAGA TAAACATTGT 60
AGCATTTTTA GATTTAAGAA TGCTTTTTAT ATATTATATA AAAATATCCC CTTTTAACCC 120
CCTATTGATR CCAACCCCTT TTTGAACCTA ATTCTCATTA AATAGATTTT TATGATAAAA 180
TCTAAACTTT ATCAAGCCAT TAGCCGGTGT TCTTTCTCAT TTTTGTAAAT TTTTAAAAAT 240
TTTCATACTC YYRTTTACTT TTTCATTATC ATTTATGCTA TAATTATGGG ACAACTTAAA 300 CCAACACAAA GGAGATACTA TGTTATCAAA AGACATCATT AAGTTGCTAA ACGAACAAGT 360
GAATAAGGAA ATGAACTCTT CCAACTTGTA TATGAGCATG AGTTCTTGGT GCTATACCCA 420
TAGCTTAGAT GGCGCGGGGC TTTTCTTATT TGATCATGCG GCTGAAGAAT ACGAGCATGC 480 TAAAAAGCTT ATCGTCTTCT TGAATGAAAA CAATGTGCCT GTGCAATTGA CTAGCATCAG 540
CGCCCCTGAG CATAAGTTTG AAGGTTTGAC TCAAATTTTC CAAAAAGCCT ATGAACATGA 600
GCAACACATC AGCGAGTCTA TTAACAATAT CGTTGATCAC GCCATAAAAG GCAAAGATCA 660
TGCGACTTTC AATTTCTTGC AATGGTATGT GTCTGAACAG CATGAAGAAG AAGTGCTTTT 720 CAAGGATATT TTGGATAAAA TTGAGTTGAT TGGTAATGAA AACCATGGCT TGTATTTGGC 780
TGATCAGTAT ATCAAAGGGA TCGCTAAAAG CAGGAAATCT TAATTTTAGG GTCATTGRGT 840
GCAAAAACTA GCCGTTTTTG ATTTTGACTC CACGCTAGTC AATGCTGAGA CGATTGAGTC 900
TTTAGCGAGG GCGTGGGGGG TGTTTGATGA AGTGAAAAMG ATCAC 945
(2) INFORMATION FOR SEQ ID NO: 482:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3069 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 12
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:482:
AGCTTTAAAC CACGATTTAA TGAGCGCTAA AATCAGGGTT TTAAAAAAAT AAAGATTGTA 60
TTAAGAGAGG TTCTTTATGC AATCCATGCA AAGCGTAATA GATTGTGAGA GCACCCAAGT 120 TGTTAAAATT GTTGTAAAAT TCGTTTTTAT TGATTGTCAT TCTTTTTGCT CGCTCGTGTA 180
GGCTCTTAAA AAAGATTTTA GTTTTTGTTA ATTTTAGTTT TTAGCGGGTG TGGTTTTAGA 240
TGAGCTTGTG GGATAAAGAG TATTGAATGA TATTTAGTTT ATTTGTGGCG AGAACAAAAG 300
CCACAAACCT TTAGAGAGTT ATTTTAATGG GTGCGAAAAC GCCACGCCAA ACAAAATGAT 360
AACGCTACCG GACTGAATCA ATAACCAAAA GTGCAAGCAA GATTAATATC CTGCTTATAA 420 CCCCGTATTG CCTTGCCCTT CTTTATTGGC ATAATCTTTC AGTTCTTTAT AAAGCAAAGA 480
TTGCAAGCGC GTGATAGCAA AATTCTTATT CTTTTCATCC ACTTCAAGGC TTCTATTCAG 540
TACGGGCGTG CCTTGACAAT CGCTCGCGCT AACGACCACT CTAATGCGTG TGGTGCCATT 600
AGCGCTGTCT GTGAAAACTT CATTTTTGAT TTGATAGAGT TTTTCTTCAT CGCTAGGGGT 660
TAGCACTCTG GCATAAGCGC TTATGAGAGC GTTTTTAATC TCTGTATCGC TGTTGTTATC 720 AAAAGTGATT TGCACTTTAG GTTTGAAAGC GTTTTGATAA TAGATTTTTT CATAATTTTC 780
TAACGAACCC ACCGGAACAG AATACGCTTT TAAAATCCTT TCTATAGGCA TCGCTTTAGC 840
CAATAAATCC CCCATGCTCT TGGATTGCTG TAAAAAAACC CCTTTACAAA CCTTAGGCAT 900
TAAGGTGGAA AACTGCCCAT AAAGAGCGTT ATACTTATCC CTTAAACCCT GCAAAAACAA 960
GTTTTGATTG ATCCTTACTC TGGTGTAGTA GATCCCTTTT TGCGCTTCTT GATTGACAAT 1020 TTCTACATTA TTCAATTCCA AGTCATCGGT CTTTAAGTTA ATCGTTTGCG AATCACTGGA 1080
TTTTAACTTA TTATCCACAC GACTTTTTTG GATATGGATT TGGGAATTAA CCACCACGCT 1140
AATAGACGCC ACTAAATCCG CTAACGCTTT TTGTTTAGAA GCTTCTTTAG AAGTGGCTGA 1200
ACCATTCCCA TAAAGATAGC CTTTTTGGGT GTTTGTTTTG TTATAGGCCT TGCCATACCA 1260
CTTAGGCTCC GCGCTTAAAT TAGCACCCAC AAAAGCCATA AGGCATGCAA GAATAATCTT 1320 TTTCATTATT AATCCTTACT CGTTCATTAA AGCGACATGC TTCTTAGTGA TTCTTAGAAG 1380
CACATAAACT TTATCCGTAT CAATAAAAGT GGCCTTTAGC TCACTAGCCT TAATCTCTAT 1440
TTTATCCCCC TTGAAAGCTT CCATCACTTT CTCTAAGGCG TTTTGTCTGG CTGAGGACAA 1500
GCTATAGTCC ATATCTGAAA GCAATATATC GCTCACCCCA CACACTAAAA TTTTTTCGTC 1560
TTTGGTTAGG GTCTTTTCAA TCTGCACTTC TTTTGAGCAA TCTTTTGCCC ACTTAGGCAA 1620 GGACTTCCCC CACGCCACTC CCAACACCAA AACAAGCGCA ATAGCAAGGC GTTTCATTAA 1680
AACCCTACTT TTTAACCATG CCCAACTCTT CGCGCACTTT ATCCACAATT TGCTTATCCA 1740
AGCCCACTAA AACAAAAACC CTATCTTTAC CGACATAGCG GGCAAGCATT TTAGAAGCGA 1800
TCAATTCCTT ATCCACTAAT TGAGAAATTT TTTCAGTATC AGTGCCGCTT ATGGACCTTT 1860
TACCAGAAGC GTCTACCGTT CTGGTTTTTT CATTTTCCAA ATCTTTTTGT AAAGTGGATT 1920 TTAAATTCGC CGCTAAATTA GCCCTAGCCT TCGCTGTAGC CTGGTTAGTA GAATAATCCA 1980
CATCATTATT GGTGATCAAA TCTTCAGCCC TTCCTAAAAA GACCCCTGAA TACTTTTCAT 2040
ACTTCGCCAC TTTTTCTAAA TCCCCTACCA CCCAATCAGG AGCGCCTTTA GTCGCTTCTT 2100
TGTAAGCCTT ATTGCTTTTA CTGATACCTG ATTTTGGGGC ATGACTACAA CCTACGATCA 2160
CCATCGCTGC TATCACACTC ATCCCTAAAA TTTTTTTAAC TTGATTTTTC ATCTCTCATC 2220 CTTTCAAATA TAAATTTAAA ACATACGCTT ATTGCTAGCG TTCTTCACAA TAGGCTTAAC 2280
ATCGCTCCAT ACCTCTTCAC CCGTTTTCCT GTTGGTAAGG CTTAGGGTGA AGTCATAGTC 2340
CAAGCGCTGC CTAGAACTAC TAATAGAGGC TGCGATACTA GATACTTTAC CACTTAAAGA 2400
TAAATCAGCG GCTTTTAAAG TGCCTTTTTC TACAGTGGTG TCTTGATTAT ACTCTTCACT 2460
CTCTCGTTCT TTTTCGCGCT GCTTCACCAT TCGGCTATCG GCTGCAATGC CATTCCTTCC 2520 GTTCGCCCTT GTGATATTGA ACCTCCCATT AGATCGCAAC CGCAACTGCC GCGCAATTTC 2580
AGTCGTCAAA AGATTCATGT CCAAATTGGG TTGCGTGGTG TCGTTAATCA CATCTGAAAC 2640
TTCAATCAAA TGCTTGCCCT TGAGTTGCTC AAAATTAGGG TCGCTAAACA TGGAATCTAA 2700 CATCGCGTTA GCGGTTAGCA ATAAATCCGT GCTATTAATG CTTGCAGTCG TATTTTTAGT 2760
GGCATCATTC ACATTTTGAT AAGTCGCCAT CTCGCTTGAG CAACCCACCC ACAATAAAGT 2820
GTTCAAAATC ACCGAGCTTA TAATTTTTAA TTTTGTTTTC AATACCATAG TGATAGAAAC 2880
TCCCTATTCA AAATTTAAAA CAGATACTTC CGTTCCTAAT TTTAAACTTC TAATCTAAAC 2940 ACACAGTGTT TTTGCGGATT TTAGCATTTT ATGCGTCTTT TTTTTCACAT TTTACAAGTT 3000
CTAGCCCTAC AAATTTCACA CTGCAAGCAT TAGTACCACA AATTCGCTCA TTTTTAGCTT 3060
AATTAAAGA 3069
(2) INFORMATION FOR SEQ ID NO: 483:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1003 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 483:
ATGAAAAATG CAATATCTTT TTATCGCTTC AAATCAATTG TTCAAAAAAC ATGCCACCTA 60
GCATAAAACA AAAAATCCTT GAATCAAGTC CAAGTTTGAG AATTATCGGC TTGAAGTGTT 120
CTTTTTCTTA CAATTTTCAT CAATGTCAAA ATCGCATGCT TTTTGCATGT ATTCTTCAGC 180 CTTTTTCTTA TCTTTTTCCA CGCCTAACCC ATACTGATAA GACTCTGCTA ACCCTTCATA 240
AGCTCTAGAA GAGCTCATAT CAGCCGCCAT TTTATAATAG ACAATCGCCT TATCTTTGTC 300
TGGCTCAATA CCCAATTGAT CATTCCCACT ATAATAAATA TCCCCTAAAA GGATATAAGC 360
TTCTACATTA CCCTTATGCA TCGCTTTTCT AAAACACTCT GTCGCTTTCA CATAGTTGCT 420
TGGAACGCCC CTACCCTCCA TATACATGAT GCCTAAGTTG ATATAAGCGT TAGTATAGCC 480 TTTCTCTGTA GCTATTCTAA AATACTCCAC GGCTTTCTTT TCATCTTTAG GAACGCCCCT 540
ACCCTCTTTA TACATCACGC CTAAATTGTT ATACCCTCTA GGTATATCGT TATCAACCGC 600
TTTTTGAAAA TATTCAGCCG CTTTCTTGTA ATCTTTAGGC ACACCCCTAC CATTTTCATA 660
CATGATTCCT AAAAGAACAT AAGCAAGCGG CTCGCCATTT TTAATCGCGC TCTTATAAAA 720
AGAAGCCGCC CGCTCATATT CCTTATTATT ATAGGCTTCC TCCCCTTTAT AAATAAAATT 780 CCTTTTTTGT TCAAGGTGTT CTGCACCAAG AGCGCTAAAT AAACATAAAC TTGCCAAACA 840
AATCTTCAAG GCTAATTTGC TTGCGTATCC CATTGTTACT CCTCTGCCTT AATAAGATCA 900
AACATAATGC CAATGGCTAC CAATCTTAAA AAAGAAACTG TATGCGTTCT AAAACCAACA 960
AACATTCTCA AACTCCATTC TGCATACACC ATTTTAGATC TAA 1003 (2) INFORMATION FOR SEQ ID NO: 484:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 831 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 14
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:484:
CAATGCCTTC TTCAACCGCC GCTTTAGTCG CGCTCAACGC GTCATCCACC CGGTCTTTTT 60
TCTCTTTCAT TTCCACTTCA CTCGCAGCGC CCACTTTAAT CACAGCCACA CCGCCAGAGA 120 GTTTAGCCAA TCTTTCTTGC AATTTTTCTT TGTCATAATC GCTTGTCGTG CTTGCAATTT 180
GGGTTTTGAT TTGCGCAACT CTGTCTTTAA CATCATCGCT ATGGCCTTTG CCATCTACGA 240
TCGTGGTGTT GTCTTTGTCA ATCACAATCC TTCCGGCTTT GCCTAAAAAC TCCACTTCAG 300
CGTTTTCTAA AGTCAAGCCC AATTCTTCGC TAATAACTTG ACCGCCGGTT AAAATAGCGA 360
TGTCTTTGAG CATTTCTTTT CTTCTGTCCC CAAAGCCTGG AGCTTTAACC GCTGCGATAT 420 TCAACACGCC TCTTAATTTA TTCACCACTA GAGTCGTTAA AGCTTCGCCC TCAATGTCTT 480
CAGCGATGAT TAAAAGCGGT TTGCCCTCTT TCATGGTTTT TTCCAATAGC GGGAGGATGT 540
CTTTCATGCT AGAGATTTTT TTGTCCGTTA AAAGGATGTA AGCGTTATCC AATTGAGCGG 600
TCATTTTCTC AGCGTTCGTT ACAAAATAAG GGGAGAGGTA GCCTCTATCA AATTGCATGC 660
CCTCTACGAC ATCTARTTCA TCTTCAATGC CCTGAGCCTC TTCAACGGTG ATCACGCCGT 720 CTTTGCCCAC TTTTTCCATA GCGTCAGCGA TGAGTTTCCC GATATTGTGA TCGGAGTTTG 780
CAGAAATAGT CGCCACTTGG GTGATTTCTT CTTTACCGCC TACTTTTTTG C 831
(2) INFORMATION FOR SEQ ID NO: 485:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 261 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 15 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 485:
AGCCTAAGAT TTTCACGCTT TTAGAGTTGT TGCGCCTTTT CTTAAACCAC AGAAAAACCA 60 TTATTATAAG ACGCACGATT TTTGAATTGG AAAAGGCTAA GGCTAGAGCG CATATTTTAG 120
AGGGCTATTT GATCGCACTA GACAATATTG ATGGAATCGT GCGACTCATT AAAACAAGCC 180
CAAGCCCAGA AGCGGCTAAA AACGCCTTAA TGGAGCGTTT CACTTTGAGC GAGATCCAAA 240
GCAAAGCCAT TTTAGAAATG C 261
(2) INFORMATION FOR SEQ ID NO: 486:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 270 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 15 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:486:
AAAGGCTATG TGAAAAGAGT GGGTTTAAAA GCTTATGAAA AGCAAAATCG TGGCGGTAAG 60
GGCAAGCTTT CAGGCAGCAC TTATGAAGAC GATTTCATTG AAAACTTTTT TGTGGCTAAC 120 ACGCATGATA TTTTGCTCTT TATCACCAAT AAGGGGCAAT TGTATCATTT GAAAGTCTAT 180
AAAATCCCAG AAGCGAGCCG GATCGCTATG GGTAAAGCCA TTGTGAATTT AATCTCGCTC 240
GCTCCGGATG AAAAGATCAT GGCAACCCTA 270
(2) INFORMATION FOR SEQ ID NO: 487
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 594 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 16
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 229-305
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 487:
TAGGGCGTTT TAGCGTGGGT TTAGAGATTG AAAAAGAATA TTGTGAGTTG TCTAAAAAGC 60
GTATTTTGGA GAGTTTGTCG TTAGTGTGAG CGTTTTAAAA ACCTTTGAGG GTAAAATAGT 120 GTACAATACT AAAGATTTTA AAACTCAAAA AGGATTGATA ATGAATTTAT TTGAAAAAAT 180
GACTGACCAA TTGCATGAGG CTTTAGACAG CGCGCTCGCT TTAGCCTTNN NNNNNNNNNN 240
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 300
NNNNNAAGCC TTACAAAAAA TGCCTGTGGA TATTGAAGCT TTAAAACTTA GCGTTCAAAG 360
CGAGTTGAAT AAGCTCGCTA AAGTTTCACA AATCAGCAAG CAAAATATCC AATTAAACCA 420 AGCTCTAATC CAAAGTTTAG AAAACGCTCA AGGCTTGATG GCTAAAACGG GCGATTCTTT 480
CATCGCTACC GATGTGTATC TTTTGGCGAA CATGAGCCTT TTTGAAAGCG TTTTAAAGCC 540
TTATTTAGAC ACTAAGGAAT TGCAAAAAAC TTTAGAATCT TTAAGAAAAG GCAG 594
(2) INFORMATION FOR SEQ ID NO: 488:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1323 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 17
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:488:
TTTTTCACAT ACTCGGCAAA ATTTTCAAAA GCTTCAACAT TCAGCGCTTG AGATTTTTTG 60 AGAAGTTGGA TTTCATGCTC GTTTTTAATG ATGCGTTTTT GGCGGTGGTA ACTAGGCACG 120
CCCTCTAAAG CAACCTTATC CCCAAGCGCT GAATTTAAAC GCTTGTAGGT TTGTAAATTC 180 ACTTGATTGG GGTCAAAAAA GAGTTTTTTA ACCGAACTTT TTGCGATCAA ATCAACCGCA 240
CTTCGCGCTA AATCGCTAGA TTCTACCACT TCCGCTAAAA CGCCATTTTT AGGCTGAACG 300
CTTTCTTTAG CTTCTTGAGT GTAGCGAGAA TCAGTGATAA AAAACGAGCG ATCGTCTAAT 360 TGCAAAAACA AAGCGTTATC GCAACTATAA GCACACTCAA AAAACATCGC ATTCTCATTG 420
AGCGTGAAAT GCGATTCTCT TTCTAATCCT TTCATGGGTA GCCCCCTTTT ATTTTTGATT 480 GTTAATGGGG TTATTAGGGT TGTTTTGTTG GGCTTCTTGG AACGCTTTCA TCTCGGCTAA 540
AATATTGACC ATCGCCATTA AAGCCATGTT GTAACCAAGC GGGCCAAATC CCATGATCAC 600
GCCTCCACAA GCCGCTCCGG TGTAAGAATT TTTCCTGAAT TCTTCTCTAG CTTGAATGTT 660 AGTGAGATGC ACTTCAATAA CGGGTTTGCC CGCTAGCATG ATCGCATCCG CAATCGCAAT 720
AGAAGTGTGC GAAAACGCTC CAGGGTTAAT GATAATCCCT TCATAATCGC TGCCCACGCT 780
CTCTTGGATC TTGTCAATGA TTTCGCCCTC AAAATTGGTT TGAAAAAACT CTAATTCCAC 840
ATCTAAATTG CCTTGTTTCA CGAAAGTTTG CATGATTTCA TGGATTTGGT CTAAGGTTAC 900
CATGCCATAA AGTCTTGGGT CTCTGTGTCC TAACATGTTT AAATTAGGCC CTTGAATCAC 960 TAAAATTTTC ATTATTTATT TCTCCTTGTT TAATATGGAA TGAAACCGCA TTATAGCATA 1020
AGTTTTTATC TTACCCCTTA AATCCCCCTA AGAACGCATT GTCAAGAAAC TACCGCTTGA 1080 CTAACATCAA ACCCTTTGAT GCTAAAAAGC GCTTTTATCC TTTTTTAAAA AAACAGCCAT 1140
CATGATTATT CCTACAAAAA AACTTCCCGT AATGAAGAAA TCAAAGGGGT CAAACCCGAT 1200
GCCAAAAATG AGGTAAAAAG CGAGTGTCGT TAATATAAAG AGCGATATTA AAGAATTTTG 1260
CTTGATCCCA CTCCAAAAAA TCTTTACAAT AACTAATAAA AAAAAGAGCC AGATCAAAAA 1320
GCC 1323
(2) INFORMATION FOR SEQ ID NO: 489:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 489:
GAATTCAATA AATGCAACCA AAAATGCTAG GTTGCACCAC TATTTGAACA AATGGCATGA 60
GTGTATAAAA GACAAGGATT TTAGAGATAT TGATAATGAC ATCAAACAAC TTTTAAT CT 120
CCTTAGCAGA AATCCCAAAT ATCTTCAGTG TCTGGGATGA ATGCTACCAA TTCATGGTAT 180
CATATCCCCA TACATTCGTA TCTAGCGCAG GAAGTGCACA AAGTTACGCC TTCGGAGATA 240 TGATGTGTGA GACCTGTAGG GAATGCGTTG GGAGCTCAAA CTCTGTAAAA TCCCTATGAT 300
TAGGGACACA GAGTGAGAAC CAAATTCTCC CTACGGGCAA CATCAGCCTA GGAAGCCCAA 360
TCGTCTTTAG CGGTTGGGCA CTTCACTAAG TCAGCATCAA GTTTTAGGGG CTACTTGTGT 420
TGGTCTTGCT GTTAAGCATC TTGGGATTGA ATTGATGGAA TTTGATGTTA CCATCATAGA 480
TGAGACAGGC AGGGCCACAG CACCAGAAAT CTTGATTCCT GCACTTCGCA CTAAAAAACT 540 GATCTTAATA GGCGATCACA ACCAGCTCCC ACCTAGCATT GATAGGTACC TCCTAGAACA 600
ATTAGAGAGC GATGATATTC AAAACTTGGA TGCCATTGAT CGCCAATTAT TGGAAGAGAG 660
TTTTTTGGAA AATCTCTATA AGTATATTCC AGAGAGTAAT AAGGCCATGC TTAATGAGTA 720
ATTTAGAATG CCTGCTTCTA TTGGATCGCT AGTTAGTCAG CTTTTTTATA AAGAGAAACT 780
TAAGAATGGA GTGATCAAAA ATACCTCGCA ATTTTACGAT CCTAAGAATA TTATCCGTTG 840 GATTAATGTT GAAGGGGAGC ATAAACTAGA AAAAACAAGT AGCTATAACA AAAATCAAGT 900
TCAAAAAATC ATAGAGCTTT TAGAGCAAAT CAATCGCATT CTTAATCAAA GAAAAATCAA 960
AAAAACCATA GGAATTATCA CACCTTATAA TGCCCAAAAA AGATGCTTGC GATCAGAAGT 1020
GGAAAAATAC GGCTTCAAGA ATTTTGATGA GCTCAAAATA GACACTGTGG ATGCCTTTCA 1080
AGGCGAGAAG GCAGATATTA TTATTTATTC CACCGTGAAA ACTTATGGTA ATCTTTCTTT 1140
CTTGATAGAT TCTAAACGCT TGAATGTAGC TATTTCTAGG GCAAAAGAAA ATCTCATTTT 1200
TGTGGGCAAA AAGTCTTTCT TTGAGAATTT GCGAAGCGAT GAGAAGAATA TCTTTAGCGC 1260
TATTTTGCAA GTCTGTAGAT AGGTAATCTT TTCCAAAGAT AATCATTAGG CATTATCCGC 1320
TTCAAAACGC TCCTAAATTG CAAACTTATT TTTTTTGAAT GCTTTACTTT ATGGTGAGCC 1380
ATAACTTTAT AATCTACCAA TCCATCGCAT GACTTTTAAA ATACTCAAAG ATCCTAGATG 1440
AGAGCTTGAG TTGGATTGAC TTTAGTTTAT TTTAATTTTT CTTTATTTTG AAATATCTTG 1500
AAATGCTTAG CTCAATCAAC ATTTAACAAA AAAGCCAAAA CATTTTTTAA GAAGAAAAAA 1560
CCCTAAAACC CAATATCAGT TTGATTGCTA AAATAAAAGC TACCAAAGTC TTTGGGCGTG 1620
TGGTGCGATT CTTTCTCTAT AACGGCGTCT TTAACGCAAG CGACACGCAG AGCGTCAAAG 1680
CGAGTTTCAA CGCTAGCACA ACCGCCGAGT AAAGACGCTC CAATAAACAT ACTATTTCTA 1740
ATGGTTTTCA TTTTATATCC TTTTGTTTTA AAATTTTTAA TAACTCAAAT ACTTTAATCA 1800
TGTATTTATG GATAGTTAAG ATTTATTATG AAAAAAAAGT AAATGGAACT CAAAACAATC 1860
AAAATAACGC TCTAAATCCC AACCAGAAAC GCTACCCTTT GTAAATCCTT AATAATTTTT 1920
GCTATAATAA AGCCCTAACT GAAATTTATC ATTTTATTTT AGTTAGGCTC CTTGAATTAG 1980
AATTATAGTA GACTTGTTAT ACCTTGTTCT AAATATTGTG GTATACTAAC AATGTTCAAA 2040
GACATGAATT GATTACTCAA GTGTGTAGCG ATTTTTATCA GTCTTTGATA CCAATAAGAT 2100
ACCGATAGGT ATGAAACTAG GTATAGTAAG GAGAAACAAT GACTAACGAA ACCATTAACC 2160
AACAACCACA AACCGAAGCG GCTTTTAACC CGCAGCAATT TATCAACAAT CTTCAAGTGG 2220
CTTTTCTTAA AGTTGATAAC GCTGTCGCTT CATACGATCC TGATCAAAAA CCAATCGTTG 2280
ATAAGAATGA TAGGGATAAC AGGCAAGCTT TTGATGGATC TTCGCAATTA AGGGAAGAAT 2340
ACTCCAATAA AGCGATCCAA AATCCTACCA AAAAGAATCA GTATTTTTCA GACTTTATCA 2400
ATGAGAGCAA TGATTTAATC AACAAAGACA ATCTCATTGA TATAGGTTCT TCCATAAAAA 2460
GCTTTCAGAA ATTTGGGACT CAGCGTTACC GAATTTTCAC AAGTTGGGTG TCCCATCAAA 2520
ACGATCCGTC TAAAATCAAC ACCCGATCGA TCCGAAATTT TATGGAAAAT ATCATACAAC 2580
CCCCTATCCC TGATGACAAA GAAAAAGCAG AGTTTTTGAA ATCTGCCAAA CAATCTTTTG 2640
CAGGAATCAT TATAGGGAAT CAAATCCGAA CGGATCAAAA GTTCATGGGC GTGTTTGATG 2700
AATTC 2705
(2) INFORMATION FOR SEQ ID NO: 490
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1660 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 19
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:490:
GCTCGAAATC TTGAACGAAT GCCTAAATTT AGTCATCGCT CACCCCAAAA ATAACGCTTT 60
AGAATGGCTG ATTGAATGGG TTAAGGGTCA TTATTTACCT AATAATGCTA TAAACCATTC 120
GCCAATAGGC ACAAAAAATT AAAAACAGAG AAAACATGAT AACGATGGAT GCGATTCAAT 180
GGCCTAAGAA ATGGATTTTA GGAGAGACTG ATAGTTTCGT GTCTAATGAA GTCATTGTCA 240
AAGGTTTGGA TTTTAAAAAA GTGGTGCAGC ATTTAATGGC GTATTTGTGC TAATAAGTGA 300
GGGTTAAATG AAAATAGGAT GGATTGGACT TGGGGCTATG GGGACTCCTA TGGCGACTCG 360
CGAGCCGGCC GCGAATTCGC GGCCGCTCGC TATCAATAGT GAAGTCCTGC AAACCTCAAA 420
ATTGGGCTAT CGCACTTTGA AATTTTTCCC GGCAGATATT GCGGGGGCGT CAAGCTTTTA 480
AACCCCTTTT TAATGCCCCT TTTAAACGGG TGAAATTTTG CCCTCCTGGG GGGATTAGCG 540
CAGATAACAT GCGTTCTTAT TTAAATTTAG AAAATGTTTT GTGCGTGGGG GGGAGCTGGC 600
TTACCCCTAA AGATTTGATA CCAAACAAAG AGTGGGATAA AATCACAGAA ATTTGTAAGA 660
GAGCGCTAAC TTTAAGATAC CCAAAAGATC ATGGCATTTT GTATTTGCTT AATAACACTA 720
TAAAAAAANT TTTAATTAGG AGATACATCA TGTTAGAAAA TGTCCAAAAA TCCCTTTTTA 780
GGGTTTTGTG CTTGGGAGCG TTGTGTTTAG GGGGGCTAAT GGCAGAGCCA GACCCTAAAG 840
AGCTTGTGGG TTTGGGCGCG AAGAGTTACA AAGAGCAAGA TTTCACTCAA GCTAAGAAAT 900
ATTTTGAGAA AGCGTGCGAT TTGAAAGAAA ATAGCGGGTG TTTTAATTTA GGGGTGCTTT 960
ATTATCAAGG GCAAGGGGTG GAAAAGAACT TGAAAAAAGC CGCCTCCTTT TACGCTAAAG 1020
CTTGCGATTT GAATTACAGC AATGGGTGTC ATTTGCTAGG GAATTTATAT TACAGTGGGC 1080
AAGGCGTCTC CCCACACACC AATAAAGCCT TACAATACTA CTCTAAAGCG TGCGATTTGA 1140
AATACTCTGA AGGGTGCGCG AGCTTAGGGG GGATTTATCA TGATGGTGGA AAATGGTACA 1200
CTAGGGATTT TAAAAAAGCG GTGGAATATT TCACTAAAGC GTGCGATTTA AACGATGGCG 1260
ATGGTTGCAC GATATTAGGG AGCTTGTATG ATGCAGGCAG AGGCACGCCT AAGGATTTGA 1320
AAAAGGCGCT CGCTTCGTTT GATAAAGCTT GCGACTTAAA AGACAGCCCG GGGTGTTTTA 1380
ACGCAGGGAA TATGTATCAT CATGGCGATG GCGTGGCGAA GAATTTTAAA GAGGCTCTCG 1440 ATCGTTATTC TAAAGCATGC GAGATGCAAA ACGGCGGAGG GTGTTTCAAT TTAGGGGCTA 1500
TGCAATACAA TGGCGAAGGT GCAACAAGGA ATGAAAAGCA AGCCATAGAA AACTTTAAAA 1560
AAGGCTGTAA ATTGGGCGCT AAAGGGGCAT GCGATATTCT CAAGCAGGTC AAAATCAAAG 1620
TTTAGTTTGG ATTAAGGTTG ANCAAGCGGT TTAAAAAGCG 1660
(2) INFORMATION FOR SEQ ID NO: 491:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 918 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:491:
TGAGCTTGAA AGAATTAGAG CTTTTAGAAA AAGTGTTTTT AGGGGTTTTA GAGGACTTGA 60 GTGAGAAATA AAATAAAYAA ACATTAAGTA AGGCTTATCA ATATTTGATT ACAATTATAA 120
NGGGTTACAT TTTTTTAATA GGAGATATAC CATGCTAGGA AGCGTTAAAA AAACCCTTTT 180
TGGGGTCTTG TGTTTGGGCA CATTGTGTTT GAGATGGTTA ATGGCAGAGC CAGACGCTAA 240
AGAGCTTGTG AATTTAGGCA TAGAGAGCGC AAAGAAGCAA GATTTCGCTC AAGCTAAAAC 300
GCATTTTGAA AAAGCTTGTG AGTTAAAAAA TGGCTTTGGA TGTGTTTTTT TAGGGGCGTT 360 CTATGAAGAA GGGAAAGGAG TGGGAAAAGA CTTGAAAAAA GCCATCCAAT TTTACACTAA 420
AGGTTGTGAA TTAAATGATG GTTATGGGTG CAACCTGCTA GGAAATTTAT ACTATAACGG 480
ACAAGGCGTG TCTAAAGACG CCAAAAAAGC CTCACAATAC TACTCTAAAG CTTGCGACTT 540
AAACCATGCT GAAGGGTGTA TGGTATTAGG AAGCTTACAC CATTATGGCG TAGGCACGCC 600
TAAGGATTTA AGAAAGGCTC TTGATTTGTA TGAAAAAGCT TGCGATTTAA AAGACAGCCC 660 AGGGTGTATT AATGCAGGAT AYATATATAG TGTAACAAAG AATTTTAAGG AGGCTATCGT 720
TCGTTATTCT AAAGCATGCG AGTTGAACGA TGGTAGGGGG TGTTATAATT TAGGGGTTAT 780
GCAATACAAC GCTCAAGGCA CAACAAAGGA TGAAAAACAA GCGGTAGAAA ACTTTAAAAA 840
AGGTTGCAAA TCAGGCGTTA AAGAAGCATG CGACGCTCTC AAGGAATTGA AAATAGAACT 900
TTARTTTCAA TRAAGTTA 918
(2) INFORMATION FOR SEQ ID NO: 492:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2395 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 21
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 492:
AGGAAACAGC TATGACCATG ATTACGCCAA GCTCGAAATT AACCCTCACT AAAGGGAACA 60 AAAGCTGGAG CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA 120
GGAATTCTTT CTTGCAAGAT GTGCCTTATT GGATGTTGCA AAATCGCAGT GAGTATATCA 180
CGCAAGGGGT GGATAGCTCG CACATTGTGG ATGGTAAGAA AACTGAAGAG ATAGAAAAAA 240
TCGCTACCAA AAGAGCGACA ATAAGAGTGG CGCAAAATAT TGTGCATAAA CTCAAAGAGG 300
CTTACCTTTC TAAATCCAAC CGCATCAAGC AAAAGATCAC TAATGAAATG TTTATCCAAA 360 TGACACAGCC CATTTATGAC AGCTTGATGA ATGTGGATCG TTTAGGGATT TATATCAATC 420
CTAACAATGA GGAAGTGTTT GCGTTAGTGC GCGCGCGTGG TTTTGATAAG GACGCTTTGA 480
GCGAAGGGTT GCATAAAATG GCATTAGACA ATCAAGCGGT GAGTATCCTT GTGGCTAAAG 540
TGGAAGAAAT CTTTAAAGAT TCTGTCAATT ACGGAGATGT TAAAGTCCCT ATAGCCATGT 600
AGGATTAGAA CAACAAGGGT TCCTCACTAT CGTCTGTTCT TTTGGGGGTG GGGGTGATGG 660 AATTAGGAGT TGAGTAGTAG GGGATTTTAT CCACGATTTC TTTACGCAAG CCTTTGGGGA 720
CATCAAACTT TCTTTTTAAA GAAGGTTCAA TCGCTAAAAT ATTACGCATG AAATACGAAT 780
ACACAGGTGC ACTCACAACG CCTCCTGTCG CTCGTTTGCT AATAGGCGTG TTATCGTCTC 840
TCCCAAACCA GATCACGCTT TGCAAGGTGG GGGTAAAGCC AATGAACCAA GCGTCAATAT 900
TGTTGTTAGA AGTCCCGGTT TTACCGGCGA TTTCTAAACC TTTAGTGCGA GCCAAACGCC 960 CTGTGCCGTT TTCTACCGCA TTTATCAGCA CTGAAAGGGT TAAAAAAGCC TGTTCTTTGG 1020
AGGTGATCTT TTTGGTTTCC ATAGGCGTGA AAGTTTTGAC ATCGTTTTGT TGGTTAGTGA 1080
TGCTTTCAAT GAGCATGGGT TTGAGCATGG TGCCGTAATT AGAAAATAAA GAATACTTTT 1140
CAGCCGCATC AATGGGTGAG ATAGCAAAGC TCCCTAACAC GATAGACAAG TCCTTAGGGA 1200
GGTTTTTAAA CCCCATATCG CTTAAAGATT GATAAATTTT TTCAAAGCCA AGCTGATCGC 1260
TTAAATTGAT CGTGGCTAGA TTTAACGAAT GGCTTAAGGC TTCTTGCAAG GTTACAAGCC 1320
CTAAAAACTT GCGAGAATAA TTGCTGGGGT GCCATGCGTG GTTTTGTTCG CTGTTTTTAC 1380
TATAATTGCC ATTTTCAAAG TTTCGCGCGG TATCAGGGAT TTTGGAAGTG GTGGAATAGC 1440
CATTATCAAA AGCGATTTGA TACACAAAAG GCTTTATCGC GCTCCCAAAC TGCCGTTTGG 1500
CTTGCGTGGC GCGATTGAAA GCGCTTTTTT TATAATCAAT CCCCCCCACT AAAGCTAAAA 1560
TCTTACCGGT GCTCGTGTCT GTAACTATCA TGCTGGCGTT CAAGTTGTCT TCATCTTCAT 1620
CAGAGGCGTT AGTTTTTGGC TTTTCTTTAG CGATTTTTTC TAAGATTTTT TGATGCCCAA 1680
AACGCAAGGA CTCTAACGCT AAGCGTTGGT AATCCAAATC TATCGTGAGC TTTATGGTAT 1740
AGCCTTGAGT TTTTAACCCG TCTAATTGAT CCAATTGCTT CAACACTTCA TCCACGACAT 1800
AGGGAGCGAT ATTTTGCGTG GAAGTTTGGT TATAGACGAT TGGCACTTCA TTGAGAGCGC 1860
CTTTGAGCTC GTTAGAAGAA ATCCAGCCTA AAGAATACAA CCGCCTTAAA ATATCATTAG 1920
CTCTAGAGAG TGAAAATTCT AAATTTTTGG TAGGATCATA AAAACTCGGA GCCCTAGGCA 1980
AGGCGACTAA CATGGTGATT TCTTTAAGCG TGAGTTTGTC AAGGGGTTTT TTAAAATACC 2040
CTAAACTTGC GGTTTTCACG CCATAATACC CATGCCCAAA AAAAGTTTGG TTCAAATAAC 2100
GCTCTAAAAT TTCTTCTTTG CTTAAGACTT TTTCAATGCG TAAAGAAATG ATAGCTTCTT 2160
TGAGTTTTCT GGTTAATGTT TTTTCTCGCG TGAGCACCAT GTTTTTAACG AGTTGTTGGG 2220
TTAGAGTGCT GCCCCCTTCA GTGTAACGAC CGCTTTTAGC GTTTTTAATC ATAGCGCGCA 2280
TGATAGCGTC TAAATTGATC CCCCCATGTT CAAAAAAGAG GGTGTCTTCT ACCGCTAAAA 2340
GGCTTTCAAT AAATCGTGGG GGGATTTCTT CAAAACGCGC ATAAAAACGG AATTC 2395
(2) INFORMATION FOR SEQ ID NO:493
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 22, clone b5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:493:
AGCTTTGTAG CTGAAAAACT CAAAAACGAT TATGAAAACA AAATGAAAGT TTTGGATAGC 60
GAACAAAGAA GCCGGATCGA ACGCATCGTG TATTTGCAGA TTTTAGACAA CGCATGGCGA 120
GAGCACCTTT ATACGATGGA TAATCTCAAA ACTGGTATCA ATTTAAGAGG CTACAACCAA 180
AAAGACCCCC TTGTAGAATA CAAAAAAGAG AGTTACAACC TTTTCTTAGA ATTCATTGAA 240 GACATCAAAA TGGAAGCGAT CAAAACCTTT TCTAAGATCC AGTTTGAAAA TGAGCAAGAT 300
TCTAGCGATG CGGAGCGTTA TTTGGATAAC TTTAGCGAAG AAAGAGAGTA TGAGAGCGTA 360
ACTTACCGCC ATGAAGAAGC CTTAGACGAA GATTTGAATG TGGCCATGAA AGCT 414
(2) INFORMATION FOR SEQ ID NO: 494:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 171 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 23
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:494:
ACGGCTTCTG GGGATATTAG CTTGACTTTT AAACAAGTGG ATGGCGTGAA TGATGTAACT 60
TTAGAGAGCG TAAAAGTTTC TAGTTCAGCA GGCACGGGGA TCGGTGTGTT AGCGGAAGTG 120
ATTAACAAAA ATTCTAACCG AACAGGGGTT AAAGCTTATG CGAGCGTTAT C 171
(2) INFORMATION FOR SEQ ID NO: 495:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1960 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 24, clone D6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 495:
TTCAAAGGAA AAATTTTGAT TACGGTGGTT AAACGAAACG GGCGCATTGA GCCTTTGGAC 60
ATTACAAAAA TCCAAAAATA CACTAAGGAC GCTACGGATA ATTTAGAGGG CGTGAGCCAA 120
AGTGAGCTGG AAGTGGATGC GAGGTTGCAA TTCAGGGACA AGATCACTAC TGAAGAAATC 180
CAACAAACTT TGATTAAAAC CGCTGTGGAT AAGATAGATA TTGACACGCC TAATTGGAGT 240
TTTGTCGCCT CAAGGCTTTT TTTGTATGAT TTATACCATA AACTGAGTGG TTTTACAGGA 300
TATAGGCATT TGAAAGAGTA TTTTGAAAAC GCTGAAGAAA AGGGCCGCAT CCTTAAGGGC 360
TTTAAGGAAA AATTTGATTT AGAGTTTTTA AATAGCCAGA TCAAGCCTGA AAGGGATTTC 420
CAATTCAATT ATTTAGGGAT TAAAACCTTG TATGATCGCT ATTTGTTAAA AGACGCTAAC 480
AATAACCCTA TTGAATTGCC CCAACACATG TTTATGAGCA TTGCGATGTT TTTAGCGCAA 540
AACGAACAAG AACCCAATAA AATCGCTTTA GAATTTTATG AAGTTTTGAG TAAATTTGAA 600
GCGATGTGCG CGACCCCCAC TCTAGCGAAC GCTCGCACCA CCAAGCACCA ACTCAGCTCA 660
TGCTATATTG GCAGCACGCC GGATAATATT GAGGGGATTT TTGACAGCTA TAAGGAAATG 720
GCGCTGTTGT CCAAATACGG CGGGGGGATC GGCTGGGATT TTTCTTTGGT GCGCTCTATT 780
GGGAGTTATA TTGATGGGCA TAAAAATGCG AGCGCTGGCA CGATCCCTTT TTTAAAAATC 840
GCTAACGATG TGGCGATTGC GGTGGATCAA TTAGGCACAC GAAAGGGTGC GATTGCGGTG 900
TATTTGGAAA TTTGGCACAT TGATTGGCGG GATCCATTGA TTCAAGGAAG AGAGGTCGGG 960
GTTGAAACGC GAGGAGCGCT TGATTTGTGC CCAGCATCTG TGGGTGGGCG ATTTGTTTTT 1020
GAAAAGGGTT TTAGAACCAT GCGATGTGGA CTGTTGTATG ACCCTTATGA GTGTAAGGAT 1080
TTGACTGACC TTTACGGGCA GGATTTTGAA AAACGCTATT TAGAATATGA AAAAGATCCT 1140
AAAATCATTA AGGAGTACAT CAACGCTAAA GATTTATGGA CAAAAATCTT AATGAATTAT 1200
TTTGAAGCCG GCTTGCCTTT CTTAGCCTTT AAAGATAACG CCAATCGGTG CAACCCAAAC 1260
GCTCATGCAG GAATCATTCG ATCCAGCAAT TTATGCACGG AGATTTTCCA AAATACCGCG 1320
CCTAACCACT ACTACATGCA AATAGAATAC ACCGATGGCA CCATAGAGTT TTTTGAAGAA 1380
AAGCAGTTGG TAACGACAGA TAGTAATATC ACTAAATGCG CTAACAAGCT CACTAGCACC 1440
GATATTCTTA AAGGCAAACA AATCTATATC GCTACTAAAG TCGCTAAAGA CGGGCAAACG 1500
GCGGTGTGTA ATCTAGCGAG CATTAATTTA AGCAAAATCA ACACTGAAGA AGACATTAAA 1560
AGGGTTGTGC CGATCATGGT CAGGCTTTTA GACAATGTGA TTGATTTGAA TTTCTACCCT 1620
AACCGCAAAG TCAAAGCCAC TAATTTGCAA AATAGGGCCA TAGGGTTAGG GGTTATGGGT 1680
GAAGCGCAAA TGCTCGCAGA ACACAAAATC GCTTGGGGAT CTAAAGAGCA TTTACAAAAA 1740
ATTGACGCTT TAATGGAGCA AATCAGCTAC CATGCGATTG ACACGAGCGC GAATTTAGCG 1800
AAAGAAAAAG GGGTTTATAA GGATTTTGAA AATTCACAAT GGAGTAAGGG GATTTTCCCC 1860
ATTGATAAAG CCAATAATGA AGCCTTAAAG CTCACCGAAA AAGGGCTTTT CTATCACCGC 1920
TTGCGATTGG CAAGGTTTGA AGGAGAAAGT CAAGGCCAAC 1960
(2) INFORMATION FOR SEQ ID NO: 496:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2474 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 25, clone g2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 496:
GAATTCATCA GGGATCAATG ATGGCGCGAG CATTATCATT TTATGCAGCG CTAAAAAAGC 60
GCAAAAATTA GGGTTAAAAG CCATGGCTAC TATCAGGGGG TTTGGTTTGG GTGGTTGCAG 120
TCCGGATATA ATGGGTATAT GCCCTATTAT TGCGATTAAA AACAATCTTA AAAATGTCAA 180
AATGAATCTC AATGACATCA ATCTTTTTGA ACTCAATGAA GCCTTTGCCG CGCAAAGTCT 240
AGCCGTGTTA AAAGAGCTTG AATTAAACCC CAATATAGTG AATGTGAATG GAGGCGCGAT 300
AGCGATTGGC CACCCTATTG GTGCGAGCGG CGCTAGGATC TTAGTAACTT TATTGCATGA 360
AATGAAAAAA AGCGGTCATG GCGTGGGTTG CGCGTCATTG TGCGTGGGTG GCGGACAAGG 420
CCTTTCTGTG GTATTGGAAC AAAAATAAGG AGAAAATGAG ATGAACAAGG TTATAACTAA 480
TTTAAACAAA GCATTGAGCG GGTTAAAAAA CGGGGACACT ATTTTAGTGG GCGGTTTTGG 540
GCTGTGCGGG ATACCCGAAT ACGCCATTGA TTACATTTAT AAGAAAGGCA TTAAGGATTT 600
GATTGTCGTG AGTAATAATT GCGGCGTTGA TGATTTTGGG CTTGGCATTC TTTTAGAAAA 660
AAAACAGATT AAAAAGATTA TCGCTTCCTA TGTGGGAGAA AATAAGATTT TTGAATCGCA 720
AATGCTGAAC GGAGAAATTG AATTCGTTTT GACACCGCAA GGCACGCTGG CTGAAAACTT 780
GCGCGCTGGA GGGGCTGGGA TACCCGCTTA CTACACCCCA ACCGGTGTTG GGACTTTGAT 840
CGCTCAAGGC AAGGAATCAC GGGAATTTAA CGGCAAAGAG TATATTTTAG AAAGAGCGAT 900
CACAGGCGAT TACGGGCTTA TCAAAGCCTA TAAAAGCGAC ACTCTTGGGA ATTTGGTGTT 960
TAGAAAAACG GCTAGAAATT TCAACCCCTT GTGTGCGATG GCATCAAAAA TATGCGTCGC 1020
TGAAGTGGAA GAAATTGTCC CGGCTGGGGA ATTAGACCCA GATGAAATAC ACTTGCCAGG 1080
AATCTATGTG CAACACATCT ATAAAGGCGA GAAATTTGAA AAACGGATAG AAAAAATCAC 1140
AACAAGGAGC GCGAAATGAG AGAGGCTATC ATTAAAAGAG CGGCAAAGGA ATTGAAAGAG 1200
GGCATGTATG TGAATTTAGG GATAGGCTTG CCCACGCTTG TGGCTAATGA AGTGAGCGGG 1260
ATGAATATCG TTTTCCAGAG CGAGAACGGG CTGTTAGGGA TTGGCGCTTA CCCTTTAGAG 1320
GGGGGCGTTG ATGCGGATCT CATCAATGCA GGAAAGGAAA CCGTAACCGT GGTGCCGGGC 1380
GCTTCGTTTT TCAACAGCGC GGATTCGTTT GCAATGATTC GTGGGGGGCA TATTGATCTA 1440
GCGATTTTAG GAGGGATGGA AGTTTCACAA AATGGGGATT TGGCTAATTG GATGATCCCT 1500
AAAAAGCTCA TAAAAGGCAT GGGAGGGGCT ATGGATTTGG TGCATGGCGC TAAAAAAGTG 1560
ATTGTCATCA TGGAGCATTG CAACAAATAC GGGGAGTCTA AAGTGAAAAA AGAATGCTCA 1620
TTGCCCTTAA CAGGAAAGGG CGTGGTGCAT CAATTGATAA CGGATTTAGC GGTGTTTGAA 1680
TTTTCCAATA ACGCCATGAA ATTAGTGGAA TTGCAAGAGG GTGTCAGCCT TGATCAAGTG 1740
AGAGAAAAAA CAGAAGCCGA ATTTGAAGTG CACCTATAGC TTATAAAAGG GGTGTTTATG 1800
TTTTTATTAA GGCATTTGAC TTCAGCGTGC GTGTTTTTAG CGTCTAAATG TTTGCCGGAC 1860
TCCTTTGTCT TGGTCGCTCT TTTATCGTTT ATCGTGTTTG TTCTTGTTTA TGGTTTGACA 1920
GGGCAAGACG CTTTTTCTGT CATTTCTAGT TGGGGGAATG GCGCTTGGAC GCTTTTAGGT 1980
TTTTCTATGC AAATGGCTCT TATTTTGGTG CTAGGTCAGG CTTTGGCTAG CGCTAAATTA 2040
GTCCAAAAAC TTTTAAAATA TCTAGCGTCT TTACCTAAAG GGTATTACAC GGCTTTATGG 2100
TTGGTTACTT TTTTATCGTT AATCGCTAAT TGGATCAACT GGGGTTTTGG CTTGGTGATC 2160
AGCGCAATTT TTGCAAAAGA GATCGCCAAA AATGTTAAAG GGGTGGATTA CAGGCTGCTT 2220
ATTGCTAGCG CTTATTCGGG TTTTGTCATC TGGCATGGGG GTTTATCAGG CTCTATCCCT 2280
TTAAGCGTTG CCACCCAAAA TGAAAATTTA TCCAAAATAA GTGCTGGGGT GATTGAAAAA 2340
GCTATTCCTA TCAGTCAGAC GATTTTTTCT GCCTATAATT TAATCATTAT AGGGATCATT 2400
CTTGTAGGGT TACCCTTTTT AATGGCAATA ATCCACCCTA AAAAAGAAGA AATCGTTGAG 2460
ATTGACGCCA AGCT 2474
(2) INFORMATION FOR SEQ ID NO: 497:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 389 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 26, clone Y237.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:497:
GCGACGATAA GGGGTTAGGG TAAAGCATAT TTTTGATCTA CAAAAAATCA GCCAAGAAGA 60
TAAAAGAAGT GGTTAAATCA TAGTTGTTTA ATTAAGCACG CTACAATACG AAAATTTTAA 120
AAGAAAGGAA AAGCATGAGC GAACAACGAA AAGAATCTTT ACAAAATAAC CCTAATTTGA 180
GTAAAAAAGA CGTCAAAATC GTGGAAAAGA TTTTGAGCAA GAACGACATT AAAGCCGCTG 240
AAATGAAAGA GCGCTATCTC AAAGAAGGGC TGTATGTGTT AAATTTCATG AGCTCTCCTG 300
GCAGCGGTAA AACCACGATG CTAGAAAATC TAGCGGATTT TAAAGACTTT AAGTTTTGCG 360
TGGTAGAGGG CGATTTGCAA ACCAACAGA 389
(2) INFORMATION FOR SEQ ID NO:498:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 590 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 27
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 498:
CGTATATGCC CATGATTGTA GAAAGCGTTT ATATGATGCT CGCATGCGCT AGGATTGGAG 60
CGATCCATAG CATCGTTTTT GCTGGGTTTA GCCCTGAAGC CTTAAGGGAT AGGATCAACG 120
ATGCTCAAGC TAAATTAGTT ATCACAGCGG ATGGGACTTT TAGAAAAGGC AAACCTTACA 180
TGCTCAAGCT AGCCCTTGAC AAGGCTCTAG AAAATAACGC CTGCCCTAGC GTGGAAAAAG 240
CGCTCATTGT GATACGAAAC GCCAGAGAGA TTGACTATGT GAGAGGGCGC GATTTTGTCT 300
ATAATGAAAT GGTCAATTAC CAATCCGATA AATGCGAACC TGAAATGATG GACTCTGAAG 360
ATCCTTTATT CTTGCTCTAT ACAAGCGGAT CAACCGGAAA GCCTAAAGGC GTTCAACACA 420
GCAGTGCAGG GTATTTGTTA TGGGCGCAAA TGACGATGGA GTGGGTTTTT GATATTAGAG 480
ATAACGATAA TTTTTGGTGC ACCGCTGATA TTGGTTGGAT CACAGGGCAC ACTTATGTGG 540
TTTATGGGCC TTTAGCTTGC GGGGCAACGA CTTTGATACT AGAAGGCACG 590
(2) INFORMATION FOR SEQ ID NO: 499:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2588 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 28
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 499:
CTCGTCTTGA ATGGTTTGGA TCGCTCTTAA AGCGATCTCG CCTCTATTAG CGATCAAAAT 60
GCGTGAAAGC TCTTTTTTTT CTACCTTTTT ATTTTCTTTA TTCACTTTAT TCATGGATTT 120
TAAAGCTTTT CAACTTTGAT GAGTTTCGTG CCGTATTCTA CCGGTTGAGC GTCTCCCACT 180
TCAACAGAAA CCACCTTGCA AGGGTATTCC ACTTCAATTT CATTCATGAT TTTCATCGCT 240
TCTACAATGC CCACGATTTG CCCTTTTTTA AGCGTATCGC CCGCTTTGAC ATAAGGCTCA 300
GCCCCAGGCG AGGGCGCATG ATAAAAAGTG CCTACCATAG GCGAAAGCAC GAAATCTTCT 360
TTTTTATCCA CAATAGGGGT GCATACCATA GGCACAGGGG CTTGAGTGCT TGGCATGCTC 420
GCTTCTACCA TGATGGGGGC TTGGATGGAG GCTTGAGAAT GAGCGGGATT TAGTGCATTT 480
TTTTTCGCAT AAGCGGATTC TTTATCCAAA ACCAACTCAA AATGCTCATG CTTTAATTTC 540
AAATGCCCCA AATCAGAAGC TTTAAATTCT TTGATCAACT CTTCAATTTC AGAAAGGTTC 600
ATAACGATTT ATTCCTTATA TTGTTTTAAA ATATGCCACA AAACATAATC TTACTAAGCG 660
TTTTTAGCAA AAAATTATTG TTATAATAAC ATAATTATTA ACAAACTTTA AAGGCTTTGT 720
GCGTATGGGA TTGAAAGCGG ATTCTTGGAT TAAAAAAATG AGTTTAGAGC ATGGCATGAT 780
TAGCCCTTTT TGCGAAAAGC AAGTCGGTAA GAATGTGATC AGCTATGGTT TGAGCAGTTA 840
CGGGTATGAT ATTAGAGTGG GGAGTGAGTT CATGCTCTTT GATAACAAAA ACGCTTTAAT 900
TGACCCTAAA AACTTTGACC CTAACAACGC GACTAAAATT GATGCGAGTA AAGAAGGGTA 960
TTTTATCTTG CCCGCTAACG CGTTCGCCCT AGCTCATACG ATAGAGTATT TTAAAATGCC 1020
TAAAGACACC TTAGCGATTT GTTTAGGCAA AAGCACTTAC GCCAGGTGTG GGATCATTGT 1080
GAATGTTACG CCTTTTGAGC CAGAATTTGA AGGGTATATC ACGATTGAAA TTTCTAACAC 1140
CACCAACTTA CCGGCTAAAG TCTATGCCAA TGAGGGGATC GCGCAAGTGG TGTTTTTACA 1200
AGGCGATGAA ATGTGCGAGC AAAGCTATAA AGACAGAGGC GGTAAGTATC AAGGGCAAGT 1260
GGGCATCACT TTGCCTAAAA TTTTAAAGTG ATTTTGAAGA AAGATAAAAA CACTTATCCT 1320
TGTGAGCGTT GTTTATTTTT TAATACAAAC TATACCGCAG TTTCTCAAGA AATATATTAT 1380
AATACGAAAG TTCAGTTTGA TTGAGTTACA CACTCTTTGA GAACAAGACG CTAAATTATT 1440
TAGGAAATTA CCATGCTAAG ATTCGTTAGT AAAACGATTT GCTTGTCTTT AATCGGCTTG 1500
TTCAACCCTT TAGAAGCCTT TCAAAAACAC CAAAAAGACG GCTTTTTTAT AGAAGCCGGG 1560
TTTGAAACCG GGTTATTAGA AGGCGTGCAA AATAAAGAGC AAACCATAAC CACCCAAAAA 1620
ATCCAAAAAA ACCCCCTAAC CCACCCACAA ATTAAAGAAC AGCCTAAAGA ACAAAACAAA 1680
AGCGATACAG CCACCCCACA AAGTGCTTAC GGAAAATACT ACATACCCCA AAGCACCATT 1740
TTAAAAAACG CAACGGCTTT ATTCACCACG GATAATATAG AAAAAAATGG CTTAACTTTT 1800
TATTCTCAAA ACCCTGTGTA TGCGAATATG GTTAATGGGA GCGTAACCAT ACAAAACTTT 1860
CTGCCCTACA ATTTAAACAA TATTGAGCTG AGCTATACAG ACGCTCAAGG CAAGGTAGTC 1920
AATTTAGGCG TGATAGAAAC TATCCCTAAA GATTCTCAAA TCATTCTGCC TGCAAGCTTG 1980
TTTAATGATT AGAATTTGAA CAAGCTGATA GCTTTAATTA CCAACAACTT CAAGCTACTG 2040
CCACACAATT TTCTGATGCT AACACGCAAA GTTTGTTTGA AAAGCTCAGC CAAATCACGA 2100
CCAATGTAAC GATGAGTTAT GAAAACGCCG ATACCAACAA TTTTAAAGGT AATTGCAATG 2160
ATTGTGTGTC AGATTTCACC CCACAAACCG CAGAAGAATT GACCAATTTA ATGCTAGATA 2220
TGATTGCGGT GTTTGACTCT AAATCGTGGG AAGAAGCCAT TTTAAACGCT CCTTTCCAAT 2280
TTTCTAACAG CCCATCAGAG TGCGGCTCTG ACTTTCCTAA ATGCGTGAAT CCTTTCAATA 2340
ACGGGCGTGT CGCTCCCATC TATGAAAAAT ACGTGCTAAC CCCACAATCC GTTATAGATG 2400
CGTTTAGAAG AGCAATCAAT CTTGAAGTGA ACATCATGAA ATCAGGGTTT TTAGGGCTAG 2460
GGTATGAACT TGATGATAAT GATGGCAATC TAGGGATAGC CGCTTCTGCA TTAAATCCCG 2520
AAAAATTGTT TGGTAAAACT TTGAACAAAG TTGATATTGT GGAATTAAGA GACATTATCC 2580
ATGAATTC 2588
(2) INFORMATION FOR SEQ ID NO: 500:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 972 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 29
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:500:
CCCCCTAAAA GCTGTAAAAA TTCTTCAAAA AACGAACGCA AATCATCATT TTTTTTAGCC 60
CTAAACTCCG GGAAAAGAAT AGCTTTGGTA TTGGGCTTAA AATAGCTTAT GACTTCTTTA 120
GCGAGCTCAG ACTCTTTAAA ATCCTTACAA GCGAGTATTT GATAATCAAA GCCTCTAATT 180
AAGGCTCTAT AAAGACTAGA TTGGATCATT CCCTATTTAT TTTCAATTTT CTTTTCTTGC 240
TCATTGATTA ACAACGCCCC CCCTTGAATG TTCTTAGGGC GAGTTTCCCC AATCAAAATC 300
CCCTTCCTTT CCACCACAAG CTCTTGGCTA GAGATCTTAC CATCAAGGCG CCCTAAAGGC 360
ATGATTTCTA CCACTTCCGC TTCCACCGTG CCAGTGAACT TGCCATTGAC CACTAATTTA 420
TTAGCAAAAA TCTCACCCAC TACCGAGCCG GTTTGCCCGA TCACCACTGT GCTTTTAGAA 480
TGCACCACCC CTTCTAATTC GCCATCTATG TGCAAATGGT AATCCAAGTG AAGCTCACCC 540
TTTATTTTTG TGCCTTGAGC GATGATAGTC GCTGGTCCTG TTTTTGCATT AGCCGATTTA 600
TTATTGTTAT CAAAGATTGC CATCTGATGC TCCTTTCTTT ATTGAGAACT TCTTCAAAAT 660
TTTTCATGTT CCATTTGGTG AAACTCATGG GGTTTATGGG TTGATCTAAG AAGCGCACTT 720
CATAATGCAA ATGCGGTCCT GTACTCATTC CTGTATTACC ACTATACCCA ATCAACTGCC 780
CTTTTTTGAC AAATTCGCCC GTTTTTACGA CGATTTTATT CAAATGGGCG TAGTAGGTTT 840
TAAAACCAAA AGGGTGGAAA ACTTTAATCA AATTCCCATA CCCCCCATTC CACCCTTTGC 900
TCGCTAACCC CACTACCCCG CTCGCGCTCG CATACACAGG GGTGTTAATA GCGGTGCTTA 960
AATCAAGCCC GG 972
(2) INFORMATION FOR SEQ ID NO: 501:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 970 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 30
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 501:
GAATTCCCTG GTGATGACAC TCCTATCGTA GCGGGTTCAG CTTTAAGAGC TTTAGAGGAA 60
GCAAAGGCTG GTAATGTGGG TGAATGGGGT GAAAAAGTGC TTAAGCTCAT GGCTGAAGTG 120
GATGCCTATA TCCCTACTCC AGAAAGAGAC ACTGAAAAAA CTTTCTTGAT GCCGGTTGAA 180
GATGTGTTCT CTATTGCGGG TAGAGGGACT GTGGTTACAG GTAGGATTGA AAGAGGTGTG 240
GTGAAAGTAG GCGATGAAGT GGAAATCGTT GGTATCAGAG CTACACAAAA AACGACTGTA 300
ACCGGTGTGG AAATGTTTAG AAAAGAGCTA GAAAAAGGTG AGGCCGGCGA TAATGTGGGC 360
GTGCTTTTGA GAGGAACTAA AAAAGAAGAA GTAGAACGCG GTATGGTTCT ATGCAAACCA 420
GGTTCTATCA CTCCGCACAA GAAATTTGAG GGAGAAATTT ATGTCCTTTC TAAAGAAGAA 480
GGCGGGAGAC ACACTCCATT CTTCACCAAT TACCGCCCGC AATTCTATGT GCGCACGACT 540
GATGTGACTG GCTCTATCAC CCTTCCTGAA GGCGTAGAAA TGGTTATGCC TGGCGATAAT 600
GTGAAAATCA CTGTAGAGTT GATTAGCCCT GTTGCGTTAG AGTTGGGAAC TAAATTTGCG 660
ATTCGTGAAG GCGGTAGGAC CGTTGGTGCT GGTGTTGTGA GCAACATTAT TGAATAATAT 720
TAGCAAAAAG AGTTACCATA AAGGGTCATT ATGAAAGTTA AAATAGGGTT GAAGTGTTCT 780
GATTGTGAAG ATATCAATTA CAGCACAACC AAGAACGCTA AAACTAACAC TGAAAAACTG 840
GAGCTTAAGA AGTTCTGCCC AAGGGAAAAC AAACACACTC TTCATAAAGA AATCAAATTG 900
AAGAGCTAGT TCTTTCTTTT GTGTTGTGAT TGAAAAGGAG GGGAGGTTAG GTCAGTAGCT 960
CCAATGGTAG 970
(2) INFORMATION FOR SEQ ID NO: 502:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 523 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 31, clone Z9.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 502:
TGAAGCGAGA AAACTACTAG AACAAGAAGT TAAAAAGAGC GTCAAGGCTT ATTTGGATTG 60
CGTATCAAAA GCTAGGAATG AAAAAGAGAA ACAAGAATGC GAGAAATTAC TCACGCCTGA 120
AGCGAGGAAA TTTTTAGAGA AAGAACTCCA ACAAAAAGAT AAAGCGATAA AAGATTGCTT 180
GAAAAACGCC GATCCTAACG ACAGAGCGGC TATCATGAAG TGTTTGGATG GTTTGAGCGA 240
TGAAGAGAAG CTCAAATACC TGCAAGAAGC TAGAGAAAAG GCTGTTGCGG ATTGTTTGGC 300
TATGGCTAAA ACCGATGAAG AAAAAAGGAA ATGCCAAAAC CTTTATAGCG ATTTGATCCA 360
AGAAATCCAA AATAAAAGGA CACAAAGCAA ACAAAATCAA TTGAGTAAAA CAGAAAGATT 420
GCATCAAGCA AGCGAGTGTT TGGATAACTT AGATGACCCT ACCGATCAAC AAGTCATAGA 480
GCAATGTTTA GAGGGCTTGA GCGATAGTGA AAGGGCGCTA ATT 523
(2) INFORMATION FOR SEQ ID NO: 503:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1012 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 32
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 503:
CATCTGCTCA ATCGCTCGCA TAAGCATTGG GCCTCTCCAA ATGAGACTCT GCCCCTCATC 60
ATACAAAAGC CCCATGCTCA TCACAGAAAC GCCAAAAGCT TTTAAAGGAA TGAGTTTTTT 120
ACCGCTAGGA TCCATGATCA CATCAGCGCT TTGCAAGCCC ATCATTCTAG GGATATTAGG 180
GCCATACACA TCAGCGTCTA GTAACCCCAC TTTTTGGTTT AAATTCGCTA AAGCGATGCT 240
TAAATTCACG CTGGTGGTGC TTTTACCCAC ACCGCCCTTA CCTGAGCTTA TCATCACTAC 300
ATGCTTGATG TTTTTAGCCA AATTTTTAGT AGTGGGCTTT GGRGCTTGCG GCTTTGGCGG 360
GGTTTTAATA TCCAAATTCA AAGCTTTCAC GCCTATTTTC TGCACCGCTT CAGAGATATT 420
TTCCCTTAAA ATCGCGCTCG TTTCTTCAGA GCTTGAGGGG ATTTCTATTA AAAGCCCTAA 480
TTGATTGTCA TGCAAAGCGA TGCTTTTAAC AAAACCAAAG CTGACAATAT CCTTTTCAAA 540
ATTAGGGTAA ATGATCGTTT TTAACGCGTT TAAGACATCT TCTTGGGTGA GCATTTTAAT 600
CCTTAAAATG ATAGTATTTG AATTTTCTTT ATAGCATGTT TTGTTTAATT TTGCATTATA 660
AATTAAATTT GAGTTTTCAA AAAAGGATAT GGCATGAAAT TTTATAAGCG TGTTTTGAAA 720
TTGCACCATT TTAGGAATTT GGGTAGAAAC TCGCCTATGG AGTTGCTTTT AAATTCAAGT 780
TTTGAAAAAC ATGGAGGATT GGTGGTTTTA GTGGGGGAAA ATAATGTCGG CAAGAGCAAT 840
GTTTTAGAAG CTTTGAAAAT CTTTAATGAT GCGGATGTCA AACTTTGTAG TGAGAAAGAT 900
TATTTCAAAG CCCATGAGTC TGAAGACGCC GTTTTGAACT TGGAAGAAGA AACAATCCTT 960
GATCATAAAA CCATAGGTTT TTCTTGCGTG GATTTGAAGA TCCAAACTAA AG 1012
(2) INFORMATION FOR SEQ ID NO: 504:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 804 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 33
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 504:
TCTTCAATGA CGATCCCAAT AGAACCTTAT ACAACTATTT GAATATTGCA GAAATTGAGG 60
ACAAAAACCC ATTGAGAGCC TTTTATGAGT GTATTAGTAA TGGTGGCAAC TATGAAGAAT 120
GTTTGAAGCT TATCAAAGAC AAAAAACTTC AAGATCARAT GAAAAAGACT CTAGAGGCTT 180
ATAATGACTG CATCAAAAAT GCCAAAACTG AAGAAGAAAG GATCAAGTGT TTAGATTTAA 240
TCAAAGATGA AAACTTGAAA AAAAGCTTAC TGAACCAACA AAAAGTTCAG GTGGCGCTAG 300
ATTGTTTGAA AAACGCTAAA ACCGATGAAG AACGAAACGA GTGCCTAAAA CTCATAAATG 360
ACCCTGATAT TAGAGAGAAA TTCCGTAAGG AATTAGAGCT TCAAAAAGAG CTTCAAGAGT 420
ATAAGGATTG TATCAAAAAC GCCAAAACAG AGGCTGAGAA AAATGAATGC TTGAAAGGCT 480
TGTCTAAAGA AGCTATAGAA AGATTGAAAC AGCAAGCGCT AGATTGTTTG AAAAACGCTA 540
AAACCGATGA AGAACGAAAC GAGTGCTTGA AAAATATTCC CCAAGACTTG CAAAAAGAAC 600
TACTAGCTGA TATGAGCGTC AAGGCTTATA AGGACTGCGT TTCAAAAGCT AGGAATGAAA 660
AAGAGAAAAA AGAATGCGAG AAATTGCTCA CGCCTGAAGC GAGGAAAAAG TTAGAACAAC 720
AGGTTCTAGA TTGTTTGAAA AACGCTAAAA CCGATGAAGA ACGAAAAAAG TGTTTGAAAG 780
ATCTCCCTAA AGACTTACAA AGCG 804
(2) INFORMATION FOR SEQ ID NO: 505:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1055 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 34
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 505:
GGAAAAGAGG TGATAGCAGT ATGCCAGGGA TTAAGGTTAG AGAAGGCGAT GCGTTTGATG 60
AAGCTTACAG GAGATTCAAA AAGCAAACCG ATCGCAATTT GGTGGTAACA GAATGCCGTG 120
CAAGGAGATT CTTTGAATCT AAAACTGAAA AGCGCAAAAA GCAAAAAATC AGTGCTAAAA 180
AGAAAGTCTT AAAGCGTCTT TATATGTTAA GGCGTTATGA ATCAAGACTA TAAGAACTTG 240
AAAAAATTTA AAAATTAAGG ATTATTGAAT AATGCAATTC ACAGGGAAAA ATGTTCTCAT 300
TACCGGAGCT TCTAAAGGCA TTGGGGCTGA AATCGCTAAA ACTCTCGCTT CTATGGGGCT 360
GAAAGTTTGG ATCAATTACC GCAGTAATGC TGAAGTGGCT GACGCTTTGA AAAATGAGCT 420
TGAAGAAAAA GGCTGTAAGG CAGCTGTCAT TAAATTTGAT GCGGCTTCTG AGAGCGATTT 480
TATTGAAGCG ATACAAACCA TTGTCCAAAG CGATGGGGGG TTGTCTTAAC TTGGTGAATA 540
ACGCCGGTGT GGTGCGCGAT AAATTAGCGA TCAAAATGAA AACAGAAGAC TTTCACCATG 600
TCATAGACAA TAACCTCACT TCAGCCTTTA TAGGTTGCAG AGAGGCTTTA AAGGTGATGA 660
GCAAGAGTCG TTTTGGGAGC GTGGTCAATG TCGCTTCTAT CATTGGTGAA AGAGGCAATA 720
TGGGGCAGAC AAACTACTCA GCGAGTAAGG GAGGAATGAT TGCGATGAGC AAGTCCTTTG 780
CTTATGAGGG AGCTTTAAGG AATATTCGTT TCAACTCTGT AACACCAGGC TTTATAGAAA 840
CCGACATGAA CGCCAATTTG AAAGACGAAC TCAAAGCGGA TTATGTTAAA AACATTCCTT 900
TAAACAGACT AGGGTCTGCT AAGGAAGTGG CAGAAGCGGT AGCGTTTCTT TTGAGTGATC 960
ACTCTAGTTA CATCACTGGA GAGACTCTCA AAGTCAATGG CGGGCTTTAT ATGTAGTCCT 1020
AAACAAAGGG TTCTTTTAGC GATAAAAGTT TGTAC 1055
(2) INFORMATION FOR SEQ ID NO:506:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 391 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 35 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 506:
TGAAGAAATG GATGAAGAAG AAGATGAATT GAACAAACTG GGCGATTTGA GAAAAAAAGT 60
AGAAGATCAA TTAGGGCTTA ATGCAACCTT TAGCGAAGAA GAAGTAAGAT ACGAAATTAT 120
ATTAGAAAAG ATTAGAGGGA CTCTTAAAGA ACGCCCTGAT GAAATCGCAA CACTCTTTAA 180
ACTCCTAATC AAAGATGAAA TCTCTTCAGA CAGCGCGAAA GGTTAAATAA AGGTTAAAAA 240
TGGCAACCAA GCTTACCCCC AAACAAAAGG CTCAATTAAA CGAACTCTCC ATGAGTGAAA 300
AAATCGCTAT TTTACTCATT CAAGTGGGCG AAGACACCAC GGGCGAAATT TTAAGGCATT 360
TAGACATTGA CTCTATTACA GAGATTTCTA A 391
(2) INFORMATION FOR SEQ ID NO: 507:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 191 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 35 (middle)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 507:
TTATTTAGGC AAAATCAAGC CCCCACAACT CGCTGATTTC ATCATTAACG AACACCCTCA 60
AACCATTGCC TTGATTTTAG CCCACATGGA AGCCCCTAAT GCGGCTGAGA CTTTGAGCTA 120
TTTCCCTGAT GAAATGAAAG CCGAAATCTC TATTAGAATG GCGAATTTAG GCGAAATATC 180
GCCCCAAGTG C 191
(2) INFORMATION FOR SEQ ID NO: 508:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 35 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 508:
ATGTGGATGT GGCTCAAAGG AAAATCATTG AAATCGTGCA GAACTTGCAA GAAAAAGGCG 60
TGATCCAAAC CGGTGAAGAG GAAGATGTCA TTGAATAGCC GCAAAAATTT GATCCAAAAA 120
GACCATTTGA ATAAGCATGA CATTCAAAAA TACGAATTTA AGAGCATGGC AAACTTACCC 180
CCTAAAACTA ATCCTAATGG CGCGTCTTTA GAAACGCCTA ATCTAGAAGA GCCTTTGGAA 240
AAAAAAGCGA TAGAAAACGA TTTGATTGAT TGCTTATTGA AAAAAACCGA TGAGCTCTCA 300
AGCCATTTAG TGAAATTGCA AATGCA 326
(2) INFORMATION FOR SEQ ID NO: 509:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 36
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 509:
GAATTCCTAT AATTATACAA GCGATAAGGC TGGCACTTAT TATTTGACGA GCAACATCAA 60
AGGTTTTAAT CAAAACAATG AGATACCAGG AACTTATAAC GCGCAAAACC AACCCTTACA 120
AGCCTTGCAC ATTTACAATC AGGCTATCAC TAAGCAGGAT TTAAGCGTTA TTGCCAGCTT 180
GGGTAAGGAA TTTTTGCCTA AAATAGCCAA TCTTTTATCT TCAGGGGCTT TGGACAATCT 240
CAATCTCAAT AGCCCGGATA GTTTTGAAAC TCTTTTTGGT ATCTTTGAAA AATATGGTAT 300
CACTTTAACT CAAGCAAATT GGAAGAGTTT ATTGGGTATT ATCAATAATT TTTCCAACAC 360
GGCTAATTAT CATTTCTCTC AAGGCAATCT CGTGGTGGGG GCGATCAAAG AAGGGCAAAC 420
GAACACTAAT AGCGTGGTGT GGTTTGGAGG CGATGGGTAT AAAGAGCCAT GTGCGGTTGG 480
AGATAACACT TGCCAGATGT TTAGACAGAC T 511
(2) INFORMATION FOR SEQ ID NO: 510:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4126 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 37
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 510:
AGCTTTTAGC GCCAAGAGGC ATAAAGCCTC TTCATAAAAA GGATTGAGCG ATTTTAGCGT 60
CTTTCTGGGG GCTGTATGCC TAGCCATTCT TTATAAAACA CCTCTATATC AGGCTCAAAA 120
TCATGGATCA CAGGATACCA AGGAGTTGGG GCTTCAACAT CTAAAAGGAT GCCGCAATCC 180
GGGCAAATAT ACTCCCGATA CACTTGCCAG TTAGTATCAC TGGCTAAGAG TTTAGGATAC 240
ACTTCTTCCA TTTTTTCTGC TGTATCGTGC ACATAGATGT TTGCATGCAA TTTCCAGTTT 300
TCTTCTGGAG CACAGAATGC ATGACCGCAT GAACACTTAA TGACCCATTT CTTTTGAGAA 360
TCTTGCACCA CAAACAAATG TGGCCCCAAG GGTAAGACGA TTTTGTCATC AAAATCTACC 420
TTATCTTGCA ACACCTTCAA ATACATTTGG AATCTTTCAT GATCTTTAGG CATGCTAAGC 480
ATTTTTAAGA CAGTGTTCCA ATCCAAGTTC CCCTCTACCA AATTTTTAAT TTGTTCTTGT 540
GTGTATTTTG ACATAACCAT TCTCCTTTCT TTATTTCTCA TCAACCAACA GAACTGTGCG 600
CACATCAGGC AATTTGCTCA AATCCATCCT GTATTTAGAA CCATAGGTGA ATACGCCAAG 660
CTCATCTTCT TTCATAGTCC AGCTCTTTGG CAAGTTCCAA AATGTTTTAA AATCGCTTAA 720
AAACTTAGGC GAAAGATCAA AGCTAGTCGC ATACATGTGC TTAACCTGTT TGGAAGCCTC 780
TTTTTCAAGG ATAGCGTTTC TTTCTTGCTC CATCCATTGT TTTACCGGTA TGGATCTAGC 840
CTTTCTGTTT TCAAGAATTT CTTTTCTTCT GGCTTTCGTT TTGGCTTCAT CGCCGACCCA 900
CACGCCGTCT TTATTTTGAC TCACCACTGC GCCATAAACC TTGTAAGCGT ATTCTGGCAA 960
TAGCTGTTTG CTGTTGAGAT CTTCTAAAAT CGCATTCAAA TCCCTTTCAA TCGGATCGCC 1020
AAATCCAGGA CCACCTTTGA TGTAATTCAA ATACAAATCA TAATTGTCAA AGCAGTTCTC 1080
AGTGGTGATG CATTGCTTAT CCCTTTTGAC TTGAGACGCA TGAGAAATAT GCTTTTCATA 1140
ATCTCTATCC GTTGGGTTAA AATCGCCCCC CAAAGGCAAG CTGGCGTTAT TTTTAATCCT 1200
GTTTTCCAAG TCGGTGTTGT GCGCTTCAAA CCTATAGCCA CTGGCCGGTG GATAGCCCCC 1260
CATCATACCC CAATCGCTAT TCATGTAGCC ATTACCCATA AAGAACATGG TCCAATCATG 1320
CACACCCCAC ACCATTCTTA AGGTTTCAAA CCCGTTACCG CCTCGATATT TCCCATACCC 1380
ACCGGTATTG GCTTTGACAT TCCTGCCCAA ATAAAGAAGA GGCTCTGCCA TTTCCCAAAT 1440
TTCAACATCG CCCATATCGC CTTCTGGATT CCAAATAGCC GCTGCGTGAT TTAAGCCGTC 1500
TTTTATCGCG CAAGCTCCAG TCCCACAAGA ACTTGTCTCA AAGCTATTCA CCGCATGGAT 1560
TTCTCCATCC TGGTTGATAC CACCGCCTTG CAACCAATTG GAAGTGTTAG CGTTCCCGGA 1620
ATTCACCTCT TCTAAATACC CTCGGCTGTA ATACGCTTGA GACAAGCCTC TCCACAAAGC 1680
GCTCCAGCCT GATACCAAGA AATGCCATGC ATAAGCATGC CCGGTGCGTC TGTCATCTGG 1740
ATTCATCCAA GTCCCTTTTT TGAGTCTGAA TTGAGTCGCA AAATAAGCGC CATCATTGAT 1800
GCGAGAAGTG GGTATCAGCG TTTGAGTCAT CATCACCCAA ATACCGCTAG TGAAAGACAC 1860
TTGGTTGCAA TTGAAAGAGT GCCATCCCCA CCTGGACGCG CCATCAAAAT CTAATTTCCA 1920
TGTAGCGTCT TTATTGATAG TGATTTCCAC AGGAGAGTGC ATGATCGTGT CTAGCTTAGC 1980
AAATTCAGAG CACACGCCAA TATCCTTATG CGCATAAGGC ACATCCACAA AGGTAACCTT 2040
TCTATAATTG CCTGGTATGG TCATGGATTT GATCCTAGAG ATAAGGCTTC TTCTTTCCTT 2100
CTCAATCACT TCATCAATAA ATCGCATGTA AGAATCAATG CCGTCTTCTT TAATGACTTC 2160
CATCACTAAA TCCCTAATCA TATGGCATCC TGCAATCCTA GTCCTTTCAT CTAGAATCCA 2220
ATATTTAGGC GTGCGAACCG ATCTTTGAGA TTCATGCAAC CAATCCTTAA AGCTCTCATC 2280
ATTCGCTCCT GTCTTACGGC AAGTGATCAT GTATCCATCC CCAAATCTTT GAACCTGCCC 2340
GGTGCTCATC GATCCTGGAG TAACCGAACC GGTATCAATC ACATGCGTAA CACCACCTAC 2400
CCACCCAATC AATTTTTCAT CGTGGAAAAT AGGCACAAGA GTCATAATAT CGCATGGGTG 2460
CACATTCCCA ATCGCACAGT CGTTATTGGT GAAAATATCC TTGTCATTGA TGCCTGGGTT 2520
GTCTTCCCAA TTATTCTCTA CCATGTATTT GATAGCTGAT CCCATAGTCC CTACATGGAT 2580
AATGATACCC GTAGAAGTCA GCACGCTATC GCCCACAGCG TTATAAAGCG TGAAGCACAA 2640
TTCTCCCTCT TGCTCAACAA TAGGGCTTGC CGCAATCCTT TTAGCCGTTT CTCTGGCATG 2700
CACGATACCG CCTCTTAATT TAGAGAACAT CTTCTCATAA CCAATCGGAT CTCTTTCCTT 2760
AAACTCTAGT TTTTTGAGAC CATTATAATG CCCTGTTTTT TCTGTCCTGG CTAGGATTTC 2820
ATCTCTAGCT TGTTTTAAAG TTTTGCCGTT TTTCAATAAA TTTGCCATTT TGAACTCCTT 2880
TATTTAATTT CTTTCAAGTG GAACAATCGG TGTTTGTCTA GTCTTGTCGC AAAGCCTTTG 2940
GGTATCACGA AAGTGGTCGC ATCTGATTCC ACGATCGCAG GCCCTATGAC TTCATTTCCA 3000
GGCAATAATT TTTCCATTTG CCACACATCT GCATCCACCC ATTTTTTATG CCGATAGAAT 3060
TTTCTAACGC CTATTTTGGC TTCTTTTGGG GGCGTAGCGC CATGCTCTTT TTCAACCGGA 3120
ATCACAGGTT TTTGCGTAGC CACAACACCA CGCATGATCA CGCCGGTCAC GCTAAAACCA 3180
AGCTCTGGAG AACACGCTGA TTCAGAATAA ACACGAGCGT AGGTTTTTTC ATATTCTTTG 3240
ACAATCTCTT CCCAATCAGC CACGCTTGCA GCTTTTGACA CAGGAGAAGT GATCTCTAAA 3300
TCATTCAATT GCCCCATATA CTGCATCCTG TATCCAGGTC TTAAGATCAC ATCTTTTTGA 3360
GAAAATCCAT TGATCTTGAA CTCTTCAATC ACTTTGAGAG TCAATTCATC CCATGCGTCT 3420
TGAATGATCT TGCATGCTTC TATTTTGGAC TCGTCTGAAG AATACTGCGG AATGGCAATA 3480
TCCACGCTCT TGTCGTATCT GTATTCAAAA TCAGCGCAAG CGCAACCAAA AGCGCTAAAT 3540
CCAGCCGCCC ACGCAGGCAC TACCACATCC TTAAAACCTA AGCTTTCTGT ATAGCCATAT 3600
GTATGCACAG GCCCCGCGCC ACCATATGAA AAGCACACAA AATCAGATGG GCTATACCCT 3660
TTAGCGCTAA TATTGGATCG CAAGTATTCT TTAAGCTCCG GATCAAGCAG CTCAATCACA 3720
CCCGCAGCCG CATCTTCTAC GATAATGCCT AGCGGATCAG CGATTTGCTC TTTAATGTGT 3780
TTTTTAGCCA TATCCACATC TAATTTGATC AAACCGCCTA AGAAATAATC TGGGTTCAAA 3840
TAGCCTAAAA CAATATGGCA ATCGGTTACT GAAACCGTGT CTATCCGCTG TCTTTCCAAC 3900
AAGTGCCAAC TTTATACCCC GCGCTGTCAG GCCCTAGTTT GACCGATCGG CTGTGCGGAT 3960
CAATGCGCAC GAAACTCCCC GCACTGCCAT ATCAGGGTCA GAAGCGATAT TAAAATTGCT 4020
CTTAACGATA AGCGCCATAT CAAAGCTCGT GCCGCCAATA TCGCTGCATG CAATATTGTC 4080
ATAACCAAGC GTTTCGCCTA GCAATTTAGA TCCGATCACG CCTCCA 4126
(2) INFORMATION FOR SEQ ID NO: 511:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 610 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 38, clone Y163.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 511:
ACATCGTCAA TAATTCTCAA AACGCTTTAA CGCTAGCCAA CAACGCTAAC ATCAGCAATT 60
CAACAGGCTA TCGAATGAGC TATGGCGGGA ATATTGATCA AGCGCGCTCT ACCCAACTGT 120
TAAACAACAC CACAAACACT TTGGCTAAAG TTACCGCTCT AAACAACAAG CTTAAAGCTA 180
ACCCATGGCT TGGGAATTTC GCTGCCGGTA ACAGCTCTCA AGTGAATGCG TTTAACGGGT 240
TTATCACTAA AATCGGTTAT AAGCAATTCT TTGGGGAAAA CAAGAATGTG GGCTTACGCT 300
ACTACGGCTT CTTCAGCTAT AATGGTGCGG GCGTGGGTAA TGGCCCTACT TACAATCAAG 360
TCAATCTGCT CACTTATGGG GTGGGGACTG ATGTGCTTTA CAATGTGTTT AGCCGCTCTT 420
TTGGTAGCCG AAGTCTTAAT GCGGGCTTCT TTGGGGGGAT CCAACTCGCA GGGGACACTT 480
ACATCAGCAC GCTAAGAAAC AGCCCTCAGC TTGCGAATAG ACCCACAGCG ACGAAATTCC 540
AATTCTTGTT TGATGTGGGC TTACGCATGA ACTTTGGTAT CTTGAAAAAA GACTTGAAAA 600
GCCACGAGCG 610
(2) INFORMATION FOR SEQ ID NO: 512:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 927 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 39
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 512:
CCTCTTCAAC CACCATCTTC AACAACGAGC CAGGGTATCG ATCCACTTCC ATCACTTGCT 60
CTTTGCGGAA GCGAAAGTTT GAGTGGTAAC CGTGCCAAGA GCTGAACCTT GTGTGCCACT 120
AGTGCCAGCT GTTGAAGGGT TGTTGCATGT GGCTAAAAAG CCTGTAACAA AATTTTTAAA 180
ACTCCCGGTA AGATTGTCAG GGTTAATGGC TTGCCCCACC TGATGGGCTA AATTGAGCAT 240
TTTGGCTTGC GCGCTAGCGT TAGCGAGCAT GCCTTGAGCG AAGCTAGCGT CTGTGAAAGG 300
GTTGAAAGGC TTGCCATTAT TACCGCCTAC TGGGGTTGAT TGTTCATGCT CGTTAATGAC 360
GCTCGTTTGA TTGACCAGCT CTTGCGCGTC CGTGATCATC TTTTGGATCG CGCTGATTTC 420
TTCTGAAAAA GCGCCGCATA TTTTCCCAGT AGTAGTAGAG AATTTTGGGG CATTAGCCTC 480
ACTACTATTA TTAGCATGGA AATACGGGCA TGCCGTGTTG ATGGTGTTAA TGAGCGTGCT 540
CGCTTGCGCT AAGAGCGCTT GAGCGCTATT AGGCACACCG TCTAATTGGT TAGTGATTTC 600
GGTGTAGGAG ACACGTTGGG TGTTACCTAG TGCGGTTGAA TCAACGACTT TTGAACTGAT 660
CGTGGTGGTT ACGCTTTTGC CGTCTATGGT TTGGATTTTA GTCGTGGTTC CGCCATGTTG 720
GTTTTTTACA CCTGTGGCTT TTTCCGAGCA GTTATCATTC CCTTCCCCTG AGCATGTGTA 780
AGTATAGCTT ACATTCACCG TTCCGTTGTT TTCTTTGAGC GCGGGTAAGC CTTTTTTTAA 840
AGCCGTTTGG AGGATCTGAT AGGCTTCGTT AATCTTTTTA AAATTTTCAA TGCTCATAGG 900
GCCGTAGTAT CCAGGCGTAT ACCCGTT 927
(2) INFORMATION FOR SEQ ID NO:513:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 200 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 40 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:513:
GACTGCCTTT AGTGGCTCCA AGCAAAGAAA CGATCAAACT TTTAGAAAAA ACTTTACAAC 60
AATATGAGGT AATTGCATGA ATGGTTCCAA TCACATGAAA AATAAAACCC TAGTGATCAG 120
CGGCGCGACT AGAGGGATTG GCAAGGCGAT ATTGTATCGC TTCGCTCAAA GCGGCGTGAA 180
TATCGCTTTC ACTTACAATA 200
(2) INFORMATION FOR SEQ ID NO: 514:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 223 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 40 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 514:
GATCTAGCCG GAGCGGCTTA TTTTTTATGC GCTGAAACCC AAAGCGGTTG GCTTACAGGG 60
CAAACGATCG TTGTAGATGG CGGGACCACT TTTAAATAAA GATATTTCTT GCAAAACATT 120
ATCCACATCC ACCAAAACAA AGAGTTGCAA TCCATTAAAA AATGCTTGTT GGGCTATTTC 180
TTCGCCCCTT TGTGTGGGGC TGTTTTGTTG GTGCTTTTTC TTG 223
(2) INFORMATION FOR SEQ ID NO: 515:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 41 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 515:
GGCGCTGAAT AGGGATGATT TCGCGCAAAA ATACCAAATT AAATCCGGTG CTGACTGGAA 60
TATTGATATC AAGAAGTGCG CTAAGTAAAG CGCTGTTTAG AAAATTAAGG GG 112
(2) INFORMATION FOR SEQ ID NO:516:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2534 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 42
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 516:
CTGCAGGAAT TCAAGGAGCG AAATACCGCA AGGCTTTTAG CGTGGAAGAA ATGATTCCTA 60
GCATGGGTCA GGGGGCTTTA GGGGTAGAAA TGCTCAAAAA CCACAAGCAT TTTATTACGC 120
TCCAAAAACT CAACGACGAG AAAAGCGCGT TTTGCTGCCA TTTAGAAAGG GAGTTTGTTA 180
AGGGGCTTAA TGGGGGGTGT CAGATCCCTA TAGGCGTGCA TGCGAGTTTA ATGGGCGATA 240
GGGTTAAAAT CCAGGCGGTT TTAGGCTTGC CTAACGGGAA AGAAGTCATC ACTAAAGAAA 300
AGCAAGGGGA TAAAACTAAA GCGTTTGATT TAGTTCAAGA GCTTTTAGAA GAATTTTTGC 360
AAAGCGGGGC CAAAGAGATT TTAGAAAAGG CGCAGTTGTT TTAATGCGTT TGTTTATCGC 420
GCTAGTTTTG TTTTGGTGGT GGTTAAGTTT GAGCGCTAAG GAAGCGGATT TCATCTCTGA 480
TTTAGAATAC GGGATGGCTC TTTATAAAAA CCCTAGGGGT GTTGCGTGTG CGAAATGCCA 540
TGGCATTAAA GGCGAACAAC AAGAAATCAC CTTTTATTAT GAAAAAGGCG AAAAAAAAAT 600
CCTCTACGCC CCTAAAATCA ACCATTTGGA TTTTAAAACC TTTAAAGATG CCTTGAGTTT 660
AGGCAAAGGC ATGATGCCTA AATACAATCT CAATTTAGAA GAAATCCAAG CGATTTATCT 720
TTACATCACC TCTTTAGAGC ATAAAGAAGA GCGTAAGGAT TCTCCTAAGC CTTAATCAAA 780
GCGCTTGATT TATGTTAAAA TGGAGCGTTG CATTTTTGTT TTGATTAAAG AAGGGTTCTA 840
AAAATCAGAA TTTAAAAGAA GGTAAAAATG AGTGTCAAAA TTTTAAAAAT ATTAGTTTGT 900
GGGTTATTTT TTTTGAATGC CCATTTATGG GGGAAACAAG ACAATAGTTT TTTGGGGGTT 960
GCTGAAAGAG CCTATAAAAG CGGGAATTAT TCTAAAGCCA CATCTTATTT TAAAAAAGCA 1020
TGCAACGATG GGGTGAGTGA AGGTTGCACG CAATTAGGAA TCATTTATGA AAACGGGCAA 1080
GGCACTAGAA TAGATTATAA AAAAGCCCTA GAATATTATA AAACCGCATG CCAGGCTGAC 1140
GATAGGGAAG GGTGTTTTGG TTTAGGGGGG CTTTATGATG AGGGGTTAGG CACGACTCAA 1200
AATTATCAAG AGGCCATTGA CGCTTATGCT AAGGCGTGCG TTTTAAAACA CCCTGAGAGT 1260
TGCTACAATT TAGGCATTAT TTATGACCGA AAAATCAAGG GCAATGCCGA TCAAGCGGTT 1320
ACCTACTATC AAAAAAGCTG TAATTTTGAT ATGGCTAAGG GGTGTTATGT TTTGGGCGTG 1380
GCTTATGAAA AAGGCTTTTT AGAAGTCAAA CAAAGCAACC ATAAAGCCGT CATCTATTAT 1440
TTGAAAGCAT GCCGATTGGA TGATGGGCAG GCTTGCCGCG CGTTAGGGAG TTTGTTTGAA 1500
AATGGCGATG CAGGGCTTGA TGAAGATTTT GAAGTGGCGT TTGATTACTT GCAAAAAGCT 1560
TGCGGGTTAA ACAATTCTGG TGGTTGCGCG AGTTTAGGCT CTATGTATAT GTTAGGCAGG 1620
TATGTCAAAA AAGATCCCCA AAAGGCTTTT AATTATTTCA AACAAGCATG CGATATGGGA 1680
AGCGCGGTGA GTTGCTCTAG GATGGGCTTT ATGTATTCCC AAGGGGACGC YGTTCCAAAA 1740
GATTTGAGGA AAGCCCTTGA TAATTATGAA AGGGGTTGCG ATATGGGCGA TGAAGTGGGT 1800
TGCTTCGCTC TAGCGGGGAT GTATTACAAC ATGAAAGATA AAGAAAACGC CATAATGATT 1860
TATGACAAGG GCTGTAAGCT AGGCATGAAA CAAGCATGCG AAAACCTCAC TAAACTTAGG 1920
GGGTATTGAA AATTTAACCA ACCCCCTAAA TTATCATTGC GCTTGACTCA AAACTTTTCA 1980
AAGATTTGGC TCTGTTTTAA GAGCTAAAGC AGAAACCCAC CCTATTAATT TTTTAATCTT 2040
TGGTGTTTTT AGGGCTTTGT CTATTTTCAA AAAGAAAACT TTTTGAATGT TTTTTGCGGT 2100
TGTTTGGTTG TATTTGTAGT GTATTTTTAT GGTGTAAATT TTTGTTAGGT TAGCTTGAAG 2160
TGGGTTTTAG GTTTAAAAGT CCTATAAAAA ATGTTTTAGC GTGTTTTTGC GCTATGGATA 2220
GATATGCGTT TGGTTGTGTT TTTCCCAATG GCTTTAATTT ATGGCTTTTG CGTGGTTATT 2280
ATTATAAGCA CGCTATAAAC ACGAATTACA CGATAACAGA GCGGTATACG CACGCTATAA 2340
AAAGACTTGA TAAAAATAAC GAAAAATAGT TAAATTTCAA GCGTTCTTTT AAAAATTGTT 2400
GTTAGGTGAG ACAGATAAAA ACGCTTTTAG TTTAAAGATA GAGTTTTAGG GGTTTTTTGT 2460
GTTGGTTTAG TTATTCTTTA TTTTTTTAAA AAATGGGATT TTTAAAACTC ATAAAGAGAT 2520
AGGGGGTATT TTGA 2534
(2) INFORMATION FOR SEQ ID NO: 517:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 327 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 43 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 517:
TTGAAAAAAG AATTCAACCA CCCCCTTTTT GTCGCCTACG CTTATAACGC TGGGCCTGGG 60
TTTTTAAGGA GGTGGTTAGA AAGCTCCAAA CGATTTAAAG AAAAAAATCA TTTTGAGCCA 120
TGGCTTAGCA TGGAGCTTAT GCCTTATAGT GAAACCCGTT TGTATGGCTT TAGAGTCATG 180
CTCAATTACT TGATTTATCA AGAAATTTTT GGGAATTTCA TCCCTATTGA TGGATTTTTA 240
GAACAAACTC TTAACTCAAA GGACAAACCA TGATTAAAAA ATGCCTTTTT CCTGCCGCGG 300 GCTATGGCAC GCGCTTTTTG CCGATCA 327
(2) INFORMATION FOR SEQ ID NO:518:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 810 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 43 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 518: GGCACTCAAG CCAAAAGAAA GCGCATCATC GCTTATCAGT TCAAGGGCAA ACGATACGAT 60
TGCGGGAGCG TGGAAGGCTA TATTGAAGCG AGTAACGCTT ATTATAAAAA ACGCTTATAA 120
ATTTTATCAA CGTGGGCAAT TTGACTTATT ACGCTTACAT GTATTTGATC CTCTTTGTAT 180 GCTTGTTGCC TGTGTTATTA ATGGGGCTTG TTTGGAGGCT TACTCGCCCC CCCTTAAAGC 240
AAAATATTCC TAATAAAAGC CTCTCTTTAG AAAATTTAAA CGAACAAATC AAAAACCTTA 300
AAAGCGTACC AGCTTTAGAA AAACTGAAAA ACGACTTCAA TGAGCGTTTT AAAATTTGCC 360
CCAAAGATAA AGAAACTCTG TGGTTAGAAA CGATCCAAAA ATTAGTCGCT TCAGAATTTT 420 TTGAATTAGA AGACGCTATT AATTTTGGGC AAGAATTAGA AAACGCTAAC CCTAATTACC 480
AACAAAAAAT CGCTAACGCT ACCGGCTTAG CCCTTAAGAA TAAAAAAGAA AAAGGATAGA 540
ATTGGATTTT TTAGAGATTG TAGGACAAGT CCCTTTAAAA GGAGAGGTAG AAATTTCAGG 600
GGCGAAAAAC TCCGCGCTCC CCATTTTAGC CGGTACGCTT TTAAGCCGCC AAGAAGTCAA 660
AATCAAATCC TTGCCCCAAG TGGTGGATAT AAAGGCGATG GCGTTATTGT TGCAAAATTT 720 AGGCGCAAGC TTAGAATGGC TTAATCCTAA CACGCTCCAA ATCAGCGCTA AATCCCTGAG 780
CCACACCGAA GCCACTTACG ATTTGGTGCA 810
(2) INFORMATION FOR SEQ ID NO: 519: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 424 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 44
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 251-279
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 519:
GATTGGAGCA ATAGGAAAAT GCAATTAGAG CTTTTCCCTA TTGATTTGCC CTATGCGAGT 60
GAAAAAGAAA TCGCTATAGC TAAAATGCAA CACCTCCCCA AGCTAGTAAG AGASGCGCTC 120
AAATGCATGG GGTTTGATAG GGTGAGCAAA GAAATCGTTT TTGAATACGA GCCTAAATTA 180
TTAAAGCCAA GCCGCTTGAC TTATTTTTTT GGCTATTTCC AAGATCCACG ATATTTTGAT 240
GCTATATCCC NNNNNNNNNN NNNNNNNNNN NNNNNNNNNC CCCCCCCCCC GAAAATGGAA 300
AAATTATTAA AAAAAAAGAG GAAGAATATC AGCGCAAGCT TGCTTTGATT TTAGCCGCTA 360
AAAACAGCGT ATTTGTGCAT GTAAGAAGAG GGGATTATGT GGGGATTGGC TGTCAGCTTG 420
GTAT 424
(2) INFORMATION FOR SEQ ID NO: 520:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1198 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 45
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 520:
AGCGTGATAA AGCACTTCTT TATTGTTTTT GAACGCTGTT TGGAAATGCT GGTGATCTTT 60 TAACCCTAAT TTTTGATAAA ACCCTAAGCC TAATGCGTTC AACCCTTGCG TTTTATCTTC 120
ATTGTTGCAT GCGATTTTAA CCGGAGGCCC AAAAAATTCC TCTTTGGTTT CTTGGGTTAA 180
AAGGGGAAAA TCTTTAATGA GCAAACACAA ACGCCTAGGG GTGTAAAAAA TCTCTATATT 240
TCCCACTTCT AAAGTGCGTT TTTGAAAAAG AGCATGGAGT TTTTTAGGCA TTTCTTTATA 300
TTCATTCAAT AACGCTTGTG CGGGCAATTC TTCAACTAAA ATCTCTACTA ACAATTCATC 360 TGAATGCAAA ATCTCAATTC TCCCTAAAAA ACAAAATCAC TTTTAAGACT AAATCATGTT 420
AGAATTATAC TTGAATTTAC ACTCAGTTTA GTTTATTTCT TAATACAAAA GGTAGGCGTT 480
TTGAAACATT TAACCCCACT CACTCACACC ATCTTTAAAG CCTTATGGCT AGGCACAGCC 540
TTAAGCGCAT CTTTAAGTTT AGCCGCAGCA GAAAGCCCCA CTAAAACAGA GCCTAAGCCC 600
GCTAAAGGGG TTAAAAACAA ACCCAAATCG CCCGTTACTA AAGTCATGAT GACCAATTGC 660 GACAATATTA AAGATTTTAA CGCTAAGCAA AAAGAAGTCT TAAAAGCCGC TTATCAATTC 720
GGCTCTAAAG AAAATTTAGG CTATGAAATG GCAGGCATTG CATGGAAAGA GTCATGCGCA 780
GGGGTTTATA AAATCAATTT TTCCGATCCG AGCGCGGGTG TGTACCATTC TTATATCCCT 840 AGCGTTCTAA AAAGCTATGG GCATAATGAT AGCCCCTTTT TGCGTAATGT GATGGGGGAA 900
TTGCTCATTA AAGACGATGC GTTTGCTTCT GAAGTGGCTT TAAAAGAGTT GCTCTATTGG 960
AAAACACGCT ACCATGACAA TYTAAAAGAC ATGATTAAAT CTTACAACAA GGGCAGTCGT 1020
TGGGAAAAAA ACGAGAAGTC TAACGCTGAT GCTGAAAAAT ATTACGAAGA GATACAAGAC 1080 AGGATTAGGC GTTTGAAAGA ATCTAAAATC TTTGATTCGC AGTCCAGTAA TGACCAAGAA 1140
TTGCAAAAAA GTGCTAATAG CAACCTGGAT TTAGACCCTA TCGGCAACGC CATGCCCC 1198
(2) INFORMATION FOR SEQ ID NO: 521: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2466 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 46
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 521:
GTGGCTGATA TTACTAAGAG CATTTCTAGA GACTATGATG TGCTGTTTGA AGAAGCGATC 60
GCTTTGAGAG GAGCTTTTTT GATTGACAAA AACATGAAAG TAAGACACGC AGTGATCAAT 120
GACTTRCCAT TAGGTAGGAA TGCAGATGAA ATGCTTCGCA TGGTAGACGC TCTCTTGCAC 180
TTTGAAGAAC ATGGTGAAGT ATGTCCAGCA GGTTGGAGAA AAGGCGATAA AGGCATGAAA 240 GCAACTCACC AAGGCGTTGC AGAGTATCTC AAAGAAAATT CCATTAAGCT TTAATCGGCT 300
TAATTCTTTT AACCAAGAGA CTTCTTTCTT GGTTGAATCC TTTCTTTTTA ATTCTTCGCC 360
AGTTTCTAAA GTTTTCGTAT CGTTTTATCT TTTTATCATT TTTGAATTAA TTTTTTTTAA 420
TAAAGGGTTT TTTATGAATA TATTCAAGCG TGTTATTAGT GTAGGGGTAA TTGTTTTAGG 480
TTTGTTTAAC CTTTTAGACG CCAAACACCA CAAAGAAAAA AAAGAAAACC ACAAAATCAC 540 TCGTGAGCTT AAAGTGGGCG CTAACCCCGT TCCGCATGCG CAAATCTTGC AATCAGTCGT 600
GGACGATTTG AAAGAGAAAG GGATCAAATT AGTGATCGTA TCTTTTACSG ATTATGTGTT 660
GCCTAATTTA GCGCTCAATG ACGGCTCTTT AGACGCGAAT TACTTCCAGC ACCGCCCTTA 720
TTTGGATCGG TTTAATTTGG ACAGAAAAAT GCACCTTGTT GGTTTGGCCA ATATCCATGT 780
GGAGCCTTTA AGATTTTATT CTCAAAAAAT CACGGACATT AAAAACCTTA AAAAAGGCTC 840 AGTGATTGCT GTGCCAAATG ATCCGGCCAA TCAAGGCAGG GCGTTGATTT TACTCCATAA 900
ACAAGGCCTT ATCGCTCTCA AAGATCCAAG CAATCTATAC GCTACGGAGT TTGATATTGT 960
CAAAAACCCT TACAACATCA AAATCAAGCC TTTAGAAGCC GCGCTATTGC CTAAAGTTTT 1020
AGGGGATGTG GATGGAGCTA TTGTAACAGG GAATTATGCC TTGCAAGCAA AACTCACCGG 1080
AGCTTTATTT TCAGAAGACA AGGACTCGCC TTATGCCAAT CTAATAGCCG CTCGTGAGGA 1140 TAATGCGCAA GATGAAGCCA TAAAAGCGCT GATTGAAGTT TTGCAGAGTG AAAAGACCAG 1200
GAAATTTATT TTGGATACCT ATAAGGGGGC GATTATCCCG GCTTTTTAAG CCATTTGGAT 1260
AACATTCATC TTTAAGATGA ATGGGAGTGC TTTAGGCGCT AATAAAAAGC TCTTTTCTCA 1320
TTGAGCATTA AAAACGCTAA AAGTATTTTT AAAATAAAAC TATAAAAGAG CTTGCGCTAA 1380
TCAAGCGACA ATTTCTTAAA AAGCGTTATT TCTTAAGGGG GGAAATGGCG CAAAATAACC 1440 CCCTATCCCT TTAAGAAAAT AATGATAAAA TCCAAAACGA TGAAGCCAAA TAACTAAAAC 1500
CCCATTTTTT AAAAATATTA AGCAAGACTT TAACCATCAT GCGATTAGAA TAAAGCCAAA 1560
AACACAGCAA GTCTAAATCT CATTAAAGTT ACGCTTTTAG AATTTGCTCC AATAAAGACC 1620
AGTTTTAAAC CACCCGTTAG ACTGGGTTAT AGAGCGGTGG AATGAGTTGG AAACAAGTTG 1680
GCTAATGGTC TTTAGCGTTT AAAAGGTTGC GTTTTTGAAA AGGTTTAAAA TTTGATTAAA 1740 GATAGCCAAG CTCATAGAGT TTATTGTTCA TTTTCATTAA CAAGCCCCCC AGTTTTGACC 1800
CCCTTTCCCC ATGTTTTTAC TAAAATAGTG ATAGCGTATT TGGGTTTTTC ATAAGGCAAG 1860
AATGCGGTAA TCCACGCATG GGATCGATGG AAATATTCCA TATCCTTTTC TTTCATGCGA 1920
TTGACGATGT TTTGAGCGAT TTCCACGACT TGCGCGGTGC CGGTTTTACA CGCTAAAGTA 1980
ACCTTAGAAC CCCTTGTGGA ATGATAAGCG GTGCCGTCTT TATGGTTACA CACTTCATAC 2040 ATGCCAACGC GCAAGGCTTG GAGCTTCTTT TTTTGAAAGC TATTTAAGGG ATCTTTGAGC 2100
GGTTGTTGGT TGTTGATAGC AAAATGAGGC GTTGCCAGTT TGCCCGTAGC AATGAGTCCC 2160
CGTGTAGGCT AAAACCTGCA AGGGCGTGGC TAAAAAAGAG CCTTGCCCAA TAGCGGTAAT 2220
GAGCGTGTCC CCAACGCGCC AATTTTGATT GAAGCGTTTG AGTTTCCACA AATTATCCGG 2280
CACAATCCCC ACAAATTCAT TCGGCAAATC AACGCCCGTT TTTTCCCCAA AGCCCACTTC 2340 CCTTAAGGTT TTAGAGAGTT TTTCTATAGA AATTTCAAGC CCAAATTTAT AAAAATACAC 2400
ATCCACAGAC TCCCTAATGG CTTTATACAA ATTAGAATTG CCATGCCCTG TTTTTTTCCA 2460
GTCTCT 2466
(2) INFORMATION FOR SEQ ID NO: 522:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 308 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 47 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 522: GAATTCAATT TATCTTATAC GATATTAAGG AGACATATTA CCATGTTTCA AATTAGATGG 60
CATGCACGAG CGGGTCAAGG TGCTATCACC GGCGCTAAAG GGTTGGCTGA TGTGATTTCA 120
AAAACAGGCA AAGAAGTGCA AGCGTTCGCC TCTTATGGCT CAGCTAAAAG GGGGGCTGCT 180
ATGATGGCTT ATAACCGCGT TGATGATGAG CCTATTTTAA ATCATGAACG CTTCATGCAG 240
CCTGATTATG TGCTGGTGAT TGATCCCGGT TTGGTTTTCA TTGAAAACAT CTTCCCCAAT 300 GAAAAAGA 308
(2) INFORMATION FOR SEQ ID NO: 523:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 285 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 47 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:523: CCCTTTTGAA AAAAACGCGC AAAGCGAAAT GGAAAAACAC AACGATGAGC GCCATTACAC 60
CGAGCAAAGT TACTTCACCA CTTCAGTGGC TCATTGGCGC GTGGCTAAGC CTGTGCATAA 120
CAATAATATT TGCATCAATT GCTTTAATTG TTGGGTTTAT TGCCCAGACG CTGCTATTCT 180
TTCAAGAGAG GGTAAGTTAA AGGGCGTGGA TTATTCTCAT TGTAAAGGCT GTGGCGTGTG 240
CGTGGATGTT TGCCCTACCA ACCCTAAATC GCTATGGATG TTTGA 285
(2) INFORMATION FOR SEQ ID NO: 524:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 975 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 48
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 524:
ACACCACTGA TGCACTCAAG GTTCAAATGC GTAGGGTCTT TGCCTTTTAT GTGGGGTATA 60 ATTACCACTT CTAAAGGGCT TTTAAAACCC AACGCAACTC CCTAACATCT TTTGGTAATA 120
GCTCTTGGCT TTGAGGATCG TTTCTGTGTC ATCGGTTTTT AAAAGCACTT CGTAATTGTT 180
TTCAAAAGCG TTTTTGCTCC AATTCGCTGA GCCTAAAAAC ACGATCTTAT CATCAATGAT 240
CGCCACTTTT TGGTGCATGA TGCCGTAATA ATTCCCGTTT TTAGCCTTAA GCCCTTTCAA 300
TAAGCACACT TTCGTGTTAG GGTATTTGTC TAAATAGCCA ATAGTGGATT GCTTGTTATG 360 ATGATTGCTT TCATAATCAT AAATGATTTG CACCTTAATC CCCCTACTCG CTACGCTTTT 420
AATCGCTCTT GCAATATCTC TGTGCGTGAA ACTATAGATA GCGATTTTCA CGCTCTCTCT 480
GGCGTTACTA ATGCCAGAAA CTAAAGCGTT CAAAGCGTCT CTTTGTTCAT AAGGCAAGAC 540
AAATAAGCTG TTTTTAGCTT GCAAAACCCC TAAACAGCCC ACTAACACGC CCACGCCAAC 600
GATTTTTTTA AACTTGTTTA ACATCAAATG ATCCTTATTT TAAATTACAC TCTATGATAA 660 AAACAACCAA CCAAGTGGAA TTTTGATTAT AATAAAAGAT TTAAAAAAAG CGATAAAATG 720
GATAGGCTTA AAGAAGCGTT TCTTATCAAA TAAGGAGTCC CTAAAATGAA AATGATTCTA 780
TTCAACCAAA ACCCCATGAT CACAAAGCTG CTTGAGAGCG TCTCTAAGAA ATTAGAATTG 840
CCTATGGAAA ATTTTAACCA CTATCAAGAG CTATCTGCGT GCCTTAAAGA AGATCCAGAA 900
TGGATTTTGA TAGCCGATGA TGAATGTTTG GAAAAACTAG ACCAAGTGGA TTGGCTAGAA 960 TTAAAAGAGA TCATC 975
(2) INFORMATION FOR SEQ ID NO: 525: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 331 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 49 (end) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:525:
AAAGCCTTTT TGCAAGCGTT TAACCAATTC CCCCATAAGA TTGAAGATAG GGGGGCTACT 60
AAACGCTATC TCATAGGCCC TTATAAGAGC AAGCAAGAAG CCTTAATGCA TGCTGATGAA 120
GTCAGCAAAA AAATGACTAA ACCGGTTGTC ATAGAAGCGC GGTAGGGGTT TGAGCTAAAA 180 ATTCTTATTT CTTTCTTGGG TGTTTTTAGT GGATCGCCCA AGCTCTAAAA CGATGGGGTA 240
GCTTACTCCA TCGGCTCGGA TAATTTTAAC TAAAGAATCA AGATAAGCAG ATTTTTTTGA 300
TCAATATACC TATTTGGTCT TTGAGGAATT C 331
(2) INFORMATION FOR SEQ ID NO: 526:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2059 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 50, clone dll (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 526:
AGCTTACACA GAGACCGTTC GCGCGTTCAA TTTGGCTGAA ATGCTCATGA CTCCTGTATT 60
CTTGCTCATG GATGAAACCG TGGGGCATAT GTATGGTAAG GTGCAAATCC CAGATTTAGA 120
AGAAGTGCAA AAGACGACTA TTAATCGTAA GGAATTTGTG GGCGATAAAA AAGACTACAA 180 ACCTTATGGG GTCGCACAAG ATGAGCCGGC TGTTTTAAAC CCTTTCTTTA AAGGTTATCG 240
CTACCATGTT TCAGGCTTGC ACCATGGGCC CATTGGCTTT CCTACTGAAG ACGCTAAAAT 300
CGGTGGGGAT TTGACTGACA GATTATTTAA TAAGATTGAA TCCAAACAAG ACATTATCAA 360
CGAAAATGAG GAAATGGATT TAGAGGGCGC TGAAATCGTT ATCATCGCTT ATGGTTCGGT 420
TTCTCTACCG GTTAAAGAAG CCTTGAAAGA TTACCATAAA GAAAGCAAGC AAAAAGTCGG 480 CTTTTTCAGG CCCAAAACCT TATGGCCAAG CCCGGCTAAA CGCTTGAAAG AAATAGGGGA 540
TAAATACGAA AAAATCCTTG TGATTGAATT GAATAAAGGG CAGTATTTAG AAAAAATTGA 600
AAGGGCTATG CAAAGAAAGG TGCATTTCTT GGGGCAAGCC AATGGGCGCA CGATTTCGCC 660
TAAACAAATC ATCGCAAAAT TGAAGGAGCT TTAAAATGGC GTTTAATTAT GATGAATATT 720
TGCGTGTGGA TAAAATACCC ACTTTGTGGT GTTGGGGCTG TGGCGATGGC GTGATTTTGA 780 AATCCATTAT CCGCACGATT GACGCTTTAG GCTGGAAAAT GGATGATGTG TGCTTGGTGA 840
GYGGGATTGG TTGCAGCGGG CGCATGAGYT CGTATGTGAA TTGCAACACC GTTCACACCA 900
CGCATGGTAG GGCTGTAGCG TATGCGACAG GGATTAAAAT GGCTAACCCT AGTAAGCATG 960
TGATCGTGGT TTCTGGTGAT GGCGAAGGCT TTGCTATTGG AGGCAATCAC ACCATGCATG 1020
CATGCAGAAG AAACATTGAT TTGAATTTTA TTTTAGTGAA TAATTTCATT TATGGTTTGA 1080 CCAACTCCCA AACTTCGCCC ACCACGCCTA ATGGCATGTG GACGGTTACG GCTCAATGGG 1140
GGAATATTGA TAACCAATTT GACCCATGCG CTTTAACCAC CGCTGCAGGG GCGAGCTTTG 1200
TGGCTAGAGA GAGCGTTTTA GACCCTCAAA AATTAGAAAA AGTGCTTAAA GAAGGTTTCT 1260
CGCACAAGGG CTTTAGCTTC TTTGATGTCC ATAGTAATTG CCATATCAAT TTAGGGCGCA 1320
AGAATAAAAT GGGCGAAGCG TCTCAAATGC TAAAATGGAT GGAAAGCCGA TTGGTGAGCA 1380 AACGCCAATT TGAAGCCATG AGCCCTGAAG AAAGGGTGGA TAAATTCCCT ACAGGCGTTT 1440
TAAAGCATGA CACGGACAGG AAAGAATATT GCGAAGCGTA TCAAGAAATC ATTGAAAAAG 1500
CACAAGGAAA ACAATAATGG AAGCGCAATT ACGATTTACG GGCGTTGGAG GGCAAGGCGT 1560
GCTGTTAGCG GGGGAGATTT TAGCTGAGGC TAAGATCGTG AGTGGGGGCT ATGGCACTAA 1620
GACTTCCACT TACACTTCGC AAGTGCGTGG AGGGCCCACT AAGGTGGATA TTTTGCTAGA 1680 CAGAAATGAA ATTATTTTCC CTTATGCTAA AGAGGGCGAG ATTGATTTCA TGCTTTCAGT 1740
CGCTCAAATC AGCTACAACC AGTTTAAAAG CGATATTAAA CAAGGCGGTA TCGTTGTCAT 1800
TGATCCCAAT CTAGTAACCC CCACTAAAGA AGATGAAGAA AAGTATCAGC TTTATAAAAT 1860
CCCTATCATT AGTATCGCTA AAGATGAAGT GGGTAACATC ATCACGCAAT CTGTGGTAGC 1920
GTTAGCCATT ACCGTGGAGC TTACTAAATG TGTAGAAGAA AATATCGTGC TAGACACCAT 1980 GCTTAAAAAA GTCCCTGCAA AAGTCGCTGA CACCAACAAA AAAGCCTTTG AAATTGGCAA 2040
AAAACATGCT TTAGAAGCT 2059
(2) INFORMATION FOR SEQ ID NO: 527:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1563 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 51
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 527:
GAATTCTTTT TTATTCATGG GGTTTTGTAA GGGGGCTTGG ATTTTGGCTT CCAATACGAC 60 AGGCTTGGTT ACGCCATGAA GAGTCAAATC CCCATGGATT TTACCATCTT CGTATTTGGT 120
CATTTTAAAG CTCCCTTTGG GGTATTTCAT GGCATCAAAA AACTCTGCTG TTTTTAGGTG 180
GTCGTCTCTT TTTTTGTTCC TAGTGTTAAT GCTTTTAATA TCAGTTTTGC CTTCAAAAAC 240
ATTGAGAGCT TTGGTATTAG GATCGGCATC AATTTTGCCA TCAAAACTAT CAAACACGCC 300
TCTTGTTTCA TTGAATTTGA AGTGTTTGAC CTCAAACCAC ACGCTAGAGT TTGCCTTATC 360 AATCGTATAA GGTTTTGCAA ACGCCAGACT AACGCCCAAA AGGGTGAACA TTAACGCTTT 420
TTTCATTGTT TCCTCGCTTC ATTTTGAATT AAAAACGCCA ACTATACAAC AAATTGGTTA 480
ATGGTAAAAA TACTCCTAAC CATCTGTTTT AAGGTATAAT ACAAAAAATC CCCATTAGGT 540
AGGTGTTGTT ATGATAAAAA AGACCTTTGC ATCAGTTTTA TTAGGATTGA GTTTGGTGAG 600
TTGTTTTAAA TGCCAAAGAA TGCGTCTCGC CCATAACAAG AAGCGTTAAG TATCATCAGC 660 AAAGCGCTGA GATCAGAGCC TTGCAATTGC AAAGTTACAA AATGGCGAAA ATGGCGCTAG 720
ACAATAATCT CAAGCTCGTT AAAGACAAAA AGCCAGCCGT CATCTTGGAT TTAGATGAAA 780
CCGTTTTGAA CACTTTTGAT TATGCGGGCT ATTTGATCAA AAATTGCATC AAATACACCC 840
CAGAAACTTG GGATAAATTT GAAAAAGAAG GCTCTCTCAC GCTCGTTCCT GGAGTGCTAG 900
ACTTTTTAGA ATACGCTAAT TCTAAGGGCG TTAAGATTTT TTACATTTCT AACCGCACGC 960 AAAAAAATAA GGCATTCACC TTAAAAACGC TCAAAAGTTT TAAACTCCCC CAAGTGAGTG 1020
AAGAATCCGT TTTATTAAAA GAAAAAGGCA AGCCTAAAGC CGTTAGGCGA GAATTAGTCG 1080
CTAAGGATTA TGCGATTGTT TTACAAGTGG GCGACACTTT GCATGATTTT GATGCACTTT 1140
TTGCTAAAGA CGCTAAAAAC AGCCAAGAAC AACGAGCTAA AGTCTTGCAA AACGCTCAAA 1200
AATTCGGCAC AGAGTGGATC ATTTTACCTA ATTCTCTCTA TGGCACATGG GAAGATGAGC 1260 CTATAAAAGC ATGGCAGAAT AAAAAATAAA ATTTATCCAT CGCATAAGAA CGATTTTTGC 1320
TAACATGACA AAAAATTTGA TCTCTTAGGC GGAGCTATGG ATTTTGTAGG GTTTGAAGAT 1380
TTAAAATGCA AAGATAAAGA AAACTCTCAA AAAGTTTTTG TGATCCGTAA CGATAAGTTA 1440
GGCGATTTTA TTTTAGCCAT ACCCGCTTTA ATCGCTCTAA AGCAAGCTTT TTTAGAAAAA 1500
GGCAAGGAAG TGTATTTGGG CGTGGTTGTG CCTAGCTATA CCACCCCAAT CGCTTTAGAA 1560 TTC 1563
(2) INFORMATION FOR SEQ ID NO:528:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 424 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 52, clone Y196-1.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 528: TAATTTTAAG ACAATCATTC TTAATTTGCT ATTAAGTGTT TCAAAAAAGC ATCAAATTGA 60
CATCTTTTAA GCAAAATGTT ATAATGCTCG CTTAATTTTA TTTGGGGTGT TAGCTCAGTT 120
GGGAGAGCAC AACGCTGGCA GCGTTGAGGT CAGGGGTTCG AACCCCCTAC ACTCCACCAT 180
TATCACGCTT ATCCTCTTTC TTTATGGATC AAACCCATTT TCTTGCTAGT ATTACCCATA 240
ATTAATTATA ATCTTAGGAA TTGTTTTTCA TCAAGGGAGT TTGAGCGTGA GCAGTTTGTT 300 TAAAATGCGC ATTCTGAGTT TTAAAAAGAA TAAGCGGGCG GTGTTTTCGC TCTATCTTTT 360
TATCGCTTTA TTAGCGCTTT CTCTTTTAGC CCCCTTGTGG GTCAATGATC GCCCCTTATT 420
CATC 424
(2) INFORMATION FOR SEQ ID NO: 529:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 535 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 53
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 529: GAATTCGCTC AAAAGCATAA AACAGACAGC GTGATTCAAG GCAAAGTGGT GAGCATTAAG 60
GATTTTGGCG TTTTCATTAA TGCTGATGGC ATTGATGTGC TGATCAAAAA TGAAGATTTG 120
AACCCCTTGA AAAAAGATGA AATCAAAATA GGCCAAGAAA TCACATGCGT GGTGGTTGCA 180
ATTGAAAAAT CTAACAACAA GGTGCGTGCT TCTGTGCATA GGTTAGAGCG CAAAAAAGAA 240
AAAGAAGAAT TGCAAGCTTT TAACACGAGC GATGATAAAA TGACTTTAGG GGATATTCTT 300 AAAGAAAAAC TCTAAAGAGT GATTTTAAAA GCATGAGAGT GGCATGAGAT TTAAGAGTGT 360
TGTTGCTTTT ATTTCCCTAG CTGTCGCTCT TGGCGTTTTA GCCTATTTGT TTTTAAGCGT 420
TAAAAAAGAA ATGCCCGCTA CTTCTCATGC GATCTCTCAA ACACATGCGA TCTCTCAAAC 480
CAATGAAGGC CTCTCTCAAA CAGACGCAAA AAACCATGAC ATCAACCTAG AAGAA 535
(2) INFORMATION FOR SEQ ID NO: 530:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 708 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 54
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 332-455
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 530:
GATGATGAGG GAAATGTTGT TGCTAGAGGT GAGGAGGCGG TAAATTATGA GGGGTAGCCC 60 CCCCTGTGGG GTCTCCTTTT TTAATCTCTT TTCTTGCTCT TTTCATTCTC TCTCTAATTT 120
GTCCGTTTAA TTTGGAACGG CTTTAATCTT AAAACCCTAT CTTTGCACGT TATTTCGCAT 180
TCAAAAGGCG TTTTTCTTTA AATTCCTGGT ATTCTTTGAT CGCATAATTC ACAAAGGGTT 240
TGATCGCAAT CCCGCCCCTA GAGACAAAAT CCCCATACAC TTCCAAATAC TTTGGTTCTA 300
GCAATTGGAT TAAATCCAGT AATATCGTAT TNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 360
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 420
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNGAAGT GATGGGGCAA AGGCTTGTAA 480
ATTCCTTGCA CTCTAGCGTG ATTAAGGGGT CTAAATTGGG GTTTGGGTTA GGGAAAGCTT 540
CTAATAGCTG GCTGTTGTAT TCAAAAATAT AGGGCGTTTT AGCGCCTAAG GATTTGAGGT 600
TTAGTTCAGG GGTCATTAAT CTTTCCTTGA TTTAAGTTAT AATAAGGCTA TTTTAACCCA 660 TTTTTTAGGA AACTCATGAA CAAACGCATA GAAACGATTA CGGCTTCA 708
(2) INFORMATION FOR SEQ ID NO: 531:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 735 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear (vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 55, clone Y105.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 531:
GTTAACTCCC ATGCGTTAGC CGTGTAAGCG AAATCTTTAT AGCTCAAAGC GCCCACAGGG 60
CATACCGCAA TGCATTCCCC GCAATCATAG CAAGGCACGC TGCCCACAAA AGAAATAATG 120
CCTTTTTGCT TGCGACTCCA CACGCTAAAA GCGTCTTTGG ACATGCTGTC TTTAAATTTA 180
TCCGGAGCAT GCAAGTCGGC TTTAGTGGCT TTGAGGTTGT TTTCGCCCAC ATTGTCCTTG 240
CAAGTGGTTA CGCACCTTTC GCACATGATG CACAAATTAG GGTCATACAA GGCTTTTGCC 300
CAAAAATCCA GCGCTTTAAA ATCATCAGCC ACCGCATAAG GTTGGTGCTC CACGCCGGTT 360
AAATGCGTCA TGTCTTGCAA TTCGCACTCC CCGCTCTTAT CGCACACGCC ACACTCTAAG 420
GGGTGGTTGA CATCATAAGT TTGCATGATG CTTTTTCTTT CATCCATGAG CGTGGGGGTG 480
TTAGTGAGAA TGGTGGCGTT ATTTTTGGCT TTCGTGTTGC AGCTATAAAT GCGTTTGCCA 540
TCCATTTCAA CCATGCACAT TTTGCATGCG ACTGTGGGCG AGCAACCGCT TAAATAGCAA 600
ATAGTAGGGA TGTAGATCCC AGCACTCCTA GCAGCCTCTA AAACGCTTTG TCCCTCTTGG 660
CATTCAATCA TTTTGCCATT GATATTCATT GTGATCATGA AATTCCTTGA GAGTGGGATA 720
AAATAGGGAT ATAAG 735
(2) INFORMATION FOR SEQ ID NO: 532:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 822 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 56, clone Y103-2.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:532:
CAAAGAAAAT TTAAAGCTCA TTAGTCTAAT CACACCTAAA ATCTCTAATT TAGAGATTTA 60
CTTACGCAAC GCACTAGATT ATTGCCTAAC TCAAATTAAA GGGAATGAGT GGGTGTTTGA 120
TGAAGTTTCT TTAATCCCCT TAATAGAAGA ATTGAAAGAC AAGAAAAAAG AAATCACGCA 180
TTCTTTGGTC TTGTCTAAAA TGTCTTTAGA AGCGGTGATT AAGCTTATCT TTTTTTACAA 240
ATTAGAGGGG GTAGCATTAG ATTTGAGAGC CTATAGTCTT AAGGCTTATT ACAAAGATAA 300
TAAGGATACT TTGCTTATTA AGGGTAGAAA ACAACATCTT TCTAATTATG CTAAAGCCTA 360
TATTGCTTTA AACTTACTAT GGACAATTAG AAATCGTGCG TATCATTGGG AAAATTTACT 420
AAAGCTAAGG GCAAACAACC GCCCACGCAT TACAACACGC TTTATCAGAG AATTAGAAAA 480
GCCTACAAGT AAAAGTTTTA ACTTTGGTAT CATGCCTAAC AAAATTGTTT CATTTTTAGA 540
TGATTTAATC AAAAGTGTTG GAAACAAAGA CTTGGAAAAA CTAAGTAGTC TATAAGCTAT 600
AAGAAAGTGG GCTTCGGTCA GCCCTACGAA TGTAGGCGTA TTATAGCTAA ACAATCATTA 660
TAAGTCAAAC CAAAACCAAC ACAAAATTTG CTAAACTACA ATCAAATCAA TTTAGGGAGA 720
ATAAAAATGT CATTTGCTCC TATGTTATTA GCTACAATCA ATAACTCTAT TGGCAATAAA 780
GATAAGCATG TGAGTTTAGA GTATCTTATA GGGCTTTTTA TG 822
(2) INFORMATION FOR SEQ ID NO:533:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 57, clone Y103-1/T3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:533:
TGGGGGTGAT TATCATGCTG CTTATGGGGA ATAAGGAAGA ATCTAAAGAA AACGCTTCTA 60
AAAACACCCA AGAAGTTCAA GCTAATCCTA TGGCGAACAA GAATCAAGAG GCTAAAGAAG 120 GCTCTAATAT CCAGCAATAT TTGGTGCTTG GGCCTTTGTA TGCGATTGAT GCGCCTTTTG 180
CGGTGAATTT AGTCTCTCAA AATGGCAGAC GCTACCTTAA GGCTTCTATT TCGTTAGAAT 240
TGAGTAA 247
(2) INFORMATION FOR SEQ ID NO: 534:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 58 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:534: TGTCCGGCCC AATGAATGGC TTAATGAGCG ATGGATAAAA ACCAATATCA TCACCCCCAT 60
AGAGCAAGCC AAACGGCTTT TAATGAAAGG ATAGTCATGT TAAAAACGAA TCAAAAAAAT 120
GTGCATGCGT TTGAAATTGA AAAGCAAGAG CCTAAAGCGG TCATAGGATT TTTAGAAAAA 180
AACCATGCCC TTTTGCAGTA TTTTCTTATT ATATTTAAAT ATGATATTGA ACCAGAATTC 240
AAAGCCATTT TGCACAAACA CCAGCTTTTG TTTTTAG 277
(2) INFORMATION FOR SEQ ID NO: 535:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 480 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 58 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 535:
AATAAAAATA TTAAAATAAT CACTAAAAAT GACGATATAC TAGACATAAA GGAAGTATTA 60 TGAAACAAAC AACCATTAAC CACTCTGTGG AATTAGTAGG GATAGGCTTG CACAAGGGCG 120
TTCCTGTGAA GCTTGTTTTA GAGCCTTTAG GGGAAAATCA AGGCATTGTT TTTTACCGCT 180
CTGATTTGGG CGTGAATCTC CCCTTAAAAC CTGAAAACAT CGTGGATACC AAAATGGCAA 240
CCGTGTTGGG TAAGGATAAT GCTAGGATTT CTACGATTGA GCATTTGCTT TCAGCTGTCC 300
ATGCGTATGG CATTGACAAT CTTAAGATCT CTGTGGATAA CGAAGAAATC CCTATCATGG 360 ATGGGAGTGC TTTGACTTAT TGCATGCTTT TAGATGAAGC AGGGATTAAA GAACTAGACG 420
CTCCTAAAAA GGTGATGGAA ATCAAGCAAG CCGTTGAGAT TAGAGAGAGC GATAAGTTTG 480
(2) INFORMATION FOR SEQ ID NO: 536: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 237 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 59 (start)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:536:
AAACTCAATT TAAAGAACAA CTACCATTGT TTTTAGTTAT AATGGCTAAC ACCATTAAGA 60
TCAAAACAAA NGTTATTCAA TGAAGGTATT ATCTTATTTG AAAAATTTTT ATCTTTTTTN 120
AGCGATGGGA GCAATTATGC AACCGAGTGA AAACATGGGG GCTCAACACC AAAAAACCGA 180
TGAAAGAGTG ATCTACTTGG CTGGGGGGTG CTTTTGGGGG CTAGAGGCGT ATATGGA 237
(2) INFORMATION FOR SEQ ID NO: 537:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 261 base pairs (B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 59 (end)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 537:
AACACATTGA GAAACCCTTT GCCCACGAGT ATCACCACAA AGAAGAAGAG GGCATTTATG 60 TGGATATTAC CACAGGCGAC CCGTTATTTT TTTCTGCGGA TAAATACGAC TCCGGTTGCG 120
GATGGCCAAG CTTTTCTAAG CCCATCAATA AAGACGTGGT GAAATACGAA GACGATGAGA 180
GCCTTAATAG GAAACGCATT GAAGTGTTGA GCCGTATTGG TAAGGCGCAT TTAGGGCATG 240 TGTTTAACGA TGGGCCTTAA G 261
(2) INFORMATION FOR SEQ ID NO:538: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 408 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 60, clone Y133.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 538:
TACCATTTTA GAATGATTGG TTTGTATGGT GTTGGTAGTT TTTGGCTTAT TTTGAGTGAA 60
AGGATTTAAT CAAGATGTTT GTGGTTTTTA TAGAAGGTTT TGGTTTAGCG ATTTCTTTGT 120
GCGCGGCTGT GGGAGCGCAA TCCTTGTTCA TTGTAGAGCG AGGTATGGCT AGGAATTATG 180
TGTTTTTGAT TTGCGCTCTG TGTTTTATGT GCGATATTGT GTTAATGAGC ATGGGCGTGT 240 TTGGCGTGGG GGCTTATTTC GCTAAAAACC TTTATTTGAG TTTGTTTTTG AATTTATTCG 300
GGGCGGTTTT TACCGGATTT TATGCTTTTT TAGCTTTAAA AACCCTTTTT CAAACCTTTA 360
AAAAGAAGCA AGTCCAAACC CCTAAAAAAT TATCCTTAAA AAAGACCT 408
(2) INFORMATION FOR SEQ ID NO: 539:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 61 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:539:
GAATTCACTT GGGAAACAAC CAACAAAGCT GAGGAGATTA AAGCGTTCTT TAAGCGTTGA 60
AAAACTATTT TAAGGGTAAT ATTTGGTAAA ATTTTTGTAT AATCAACAAT TCACAAGGAG 120
TTTAAATTGA AACAAAGAAC GCTGTCTATT ATTAAGCCTG ATGCACTTAA GAACAAAGTG 180 GTAGGCAAAA TCATTGATCG CTTTGAGAGT AACGGCTTGG AAGTGGTTGC TATGAAACGC 240
TTGCATTTGA GCGTTAAAGA CTCTGAAAAC TTTTATGCGA TCCACAGAGA GAGACCCTTT 300
TTTAACGATC TAATAGAATT TATGGTCAGT GGTCCGGTAG TGGTTTTGGT TTTAGAGGGC 360
GAAGATGCGG TG 372
(2) INFORMATION FOR SEQ ID NO: 540:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 651 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: (B) CLONE: cluster 62
(ix) FEATURE:
(A) NAME/KEY: Other
(B) LOCATION: 235-357
(D) OTHER INFORMATION: "note: where N is A, C, G, or T/U"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:540:
TAGCCCTTTA TAACAACAAT AACCGCATGG ATACTTGTGT GGTGCGAAAT ACTGATGACA 60 TTAAAGCATG CGGTATGGCT ATCGGCAATC AAAGCATGGT GAACAACCCT GACAATTACA 120
AGTATCTTAT CGGTAAAGCA TGGAAAAATA TAGGCATCAG TAAAACGGCT AACGGCTCTA 180
AAATTTCGGT GTATTATTTA GGCAACTCTA CGCCTACTGA GAATGGTGGC AATANNNNNN 240
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 300
NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNTTG 360
GCACTATTGA GAGCGTGTTT GAATTGGCTA ACCGCTCTAA AGATATTGAC ACGCTTTATA 420
CTCATTCAGG TGCGAAAGGT AGGGATCTCT TGCAAACCTT ATTGATTGAT AGCCATGATG 480 CGGGTTACGC TAGACAAATG ATTGATAACA CAAGCACCGG TGAAATCACC AAGCAATTGA 540
ATGCGGCCAC TACCACTTTA AACAACATAG CCAGTTTAGA GCATAAAACC AGCAGCTTAC 600
AAACCTTGAG CTTGAGCAAT GCGATGATTT TAAATTCTCG TTTAGTCAAT C 651
(2) INFORMATION FOR SEQ ID NO: 541:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1200 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 63, clone e6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 541:
AAATAGCTAA TTGTTAGTTT TTAAACGATT TTGTAAGCCA AAAAAGGATA CAATACCCTT 60
AAGGATTTTA TATTTATTTT ATAGTAAGGC AGTCAATGAG CAAGATAGCA GATGATCAGA 120
ACTTTAATGA CGAGGAGGAA AACTTCGCAA AACTCTTTAA AAAAGAATTA GAAAAAGAAG 180
AAACCTTAGA AAAAGGCACT ATCAAAGAAG GGCTAATCGT TTCCATCAAT GAGAATGATG 240
GTTATGCCAT GGTGAGCGTG GGCGGTAAGA CAGAAGGCCG TTTGGCTTTG AATGAGATCA 300
CCGATGAAAA GGGGCAGTTG CTGTATCAAA AAAATGACCC CATTATCGTG CATGTGTCCG 360
AAAAAGGTGA ACACCCTAGC GTTTCCTACA AAAAGGCCAT TTCCCAACAA AAGATTCAAG 420
CTAAAATTGA AGAATTAGGC GAAAACTATG AAAACGCCAT TATTGAAGGC AAGATTGTAG 480
GCAAGAATAA AGGGGGCTAT ATCGTGGAGT CTCAAGGCGT GGAGTATTTT CTCTCCCGCT 540
CGCACTCTTC TTTAAAGAAT GACGCAAACC ATATCGGCAA ACGCATTAAA GCGTGCATCA 600
TTCGTGTGGA TAAGGAAAAC CATTCTATCA ATATTTCTCG CAAACGATTC TTTGAAGTCA 660
ATGACAAACG ACAGCTTGAA GTTTCTAAAG AATTGTTAGA AGCCACAGAG CCGGTATTGG 720
GGGTTGTGCG CCAGATCACC CCTTTTGGCA TTTTTGTAGA AGCTAAGGGG ATTGAGGGCT 780
TAGTCCATTA TTCTGAAATC AGCCATAAAG GACCAGTCAA TCCTGAAAAA TACTACAAAG 840
AGGGCGATGA AGTCTATGTC AAAGCCATCG CTTATGATGC AGAAAAAAGA CGCCTTTCAC 900
TCTCCATAAA AGCGACCATA GAAGACCCAT GGGAAGAGAT CCAAGACAAG CTAAAACCCG 960
GATACGCCAT TAAGGTGGTG GTGAGCAATA TTGAACATTA TGGGGTGTTT GTGGATATTG 1020
GTAATGATAT TGAAGGCTTT TTGCATGTTT CTGAAATCTC TTGGGATAAA AATGTCAGCC 1080
ACCCTAGCCA TTACTTGAGC GTGGGGCAAG AAATTGATGT GAAGATCATT GACATTGATC 1140
CTAAAAACCG CCGCTTAAGG GTTTCTTTAA AACAACTCAC TAACAGGCCT TTTGATGTTT 1200
(2) INFORMATION FOR SEQ ID NO: 542:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1795 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 64, clone D5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:542:
GAATTCGCGC AAAAAGCGAA TTTGAATCTG GCTGATGTGA TTAAAACCCT CTTTAATTTA 60
GGGCTTATGG TAACTAAAAA CGACTTTTTG GATAAGGATA GTATAGAAAT TTTAGCCGAA 120
GAGTTCCATT TAGAAATTTC TGTTCAAAAC ACTTTAGAAG AATTTGAAGT GGAAGAAGTG 180
CTAGAGGGGG TGAAAAAAGA GCGCCCGCCT GTGGTTACTA TCATGGGGCA TGTTGATCAT 240 GGTAAAACTT CGCTATTGGA TAAAATCCGT GATAAAAGAG TCGCTCACAC GGAAGCTGGG 300
GGGATCACTC AGCACATTGG CGCTTACATG GTAGAAAAGA ATGATAAGTG GGTGTCTTTC 360
ATTGACACCC CAGGGCATGA AGCCTTTAGC CAGATGCGTA ATCGTGGGGC TCAAGTTACA 420
GATATTGCAG TGATTGTGAT AGCGGCTGAT GATGGCGTGA AGCAACAGAC TATTGAAGCG 480
TTAGAGCATG CAAAGGCCGC TAATGTGCCT GTGATTTTTG CGATGAATAA AATGGATAAG 540 CCTAATGTGA ATCCGGACAA ACTCAAAGCC GAATGCGCTG AGCTTGGCTA TAACCCTGTG 600
GATTGGGGCG GAGAGCATGA GTTTATCCCT GTTTCGGCTA AAACGGGCGA TGGCATTGAC 660
AATTTATTAG ACACCATTCT TATCCAAGCG GATATTATGG AATTGAAAGC CATAGAAGAG 720 GGCAGCGCTA GAGCGGTTGT TTTAGAAGGA AGCGTGGAAA AAGGGCGTGG GGCAGTGGCC 780
ACTGTGATTG TCCAAAGCGG GACTTTGAGC GTGGGGGATA GTTTTTTTGT CGAAACCGCG 840
TTTGGTAAAG TAAGAACGAT GACTGATGAT CAAGGCAAGA GCATTCAAAA TTTAAAACCC 900
TCTATGGTGG CTCTCATCAC AGGCTTGAGC GAAGTGCCTC CTGCGGGATC TGTTTTAATA 960 GGGGTAGAAA ACGATTCTAT CGCGCGCTTG CAAGCTCAAA AGAGGGCGAC TTATTTGCGC 1020
CAAAAAGCGT TGAGTAAAAG CACTAAAGTG TCTTTTGATG AGCTTTCAGA AATGGTCGTT 1080
AATAAGGAAT TGAAAAACAT TCCTGTAGTC ATTAAAGCGG ACACGCAAGG AAGCTTAGAA 1140
GCCATTAAAA ACAGCCTGTT GGAGCTTAAT AACGAAGAAG TGGCGATTCA AGTGATCCAC 1200
TCAGGGGTGG GGGGCATTAC TGAGAATGAT TTAAGCCTAG TCTCTAGCAG TGAGCATGCC 1260 GTGATTTTAG GCTTTAATAT CCGCCCCACC GGTAATGTGA AAAATAAGGC TAAAGAATAC 1320
AATGTGAGCA TTAAAACTTA CACGGTGATT TATGCCTTGA TTGAGGAAAT GCGATCGCTG 1380
TTATTAGGCT TGATGAGTCC TATTATTGAA GAAGAGCATA CTGGGCAAGC GGAAGTGAGA 1440
GAAACCTTTA ATATCCCTAA AGTTGGCATG ATAGCCGGGT GTGTGGTGAG CGATGGGGTG 1500
ATCGCTCGTG GCATTAAGGC GCGTTTGATT AGAGATGGCG TGGTGGTTCA TACCGGTGAA 1560 ATCCTTTCTT TGAAACGCTT TAAAAATGAT GTGAAAGAAG TTTCTAAGGG CTATGAGTGT 1620
GGGATCATGC TAGACAATTA TAACGAAATT AAAGTGGGCG ATGTGTTTGA AACCTATAAA 1680
GAAATCCATA AAAAAAGAAC CCTTTAATGA ACGCTCATAA AGAACGCTTA GAATCCAATC 1740
TTTTAGAATT ACTACAAGAG GCTTTAGCGA GTTTGAACGA CAGTGAGTTG AATTC 1795
(2) INFORMATION FOR SEQ ID NO: 543:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 495 base pairs
(B) TYPE: nucleic acid (C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 65, clone Y313.ASM
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 543:
GAATTCCTTA ACCCCACTTT TGTAACCCAA TACCCTATTG AGATTAGCCC CTTAGCCAGA 60
CGCAACGATA GTAACCCTAA TATTGCTGAC AGGTTTGAAT TGTTTATCGC AGGGAAAGAA 120 ATCGCTAACG GCTTTAGCGA GTTGAACGAT CCTTTAGATC AATTAGAACG CTTTAAAAAT 180
CAAGTGGCTG AAAAAGAGAA AGGCGATGAA GAAGCCCAAT ACATGGATGA AGATTATGTG 240
TGGGCCCTAG CCCATGGAAT GCCCCCAACC GCAGGGCAAG GCATAGGCAT TGATAGATTA 300
GTGATGCTAT TCACTGGATC TAAAAGCATT AAAGATGTGA TTTTATTCCC AGCGATGCGT 360
CCTGTTAAAA ACGATTTTAA TGTGGAGAGT GAAGAATAAT GGCGTATTTT TTAGAACAAA 420 CGGATAGTGA AATTTTTGAA CTTATCTTTG AAGAATACAA GCGGCAAAAT GAGCATTTAG 480
AAATGATAGC GAGCG 495
(2) INFORMATION FOR SEQ ID NO:544: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 501 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 66, clone Y120.asm
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 544:
TATGAGTGGG GGGCTTATTT GAAAAGAACG GGATTAGGCG AGCATGAAAT GGCGTTTGCA 60
GGCTGGATGG CATACATAGC GGATCCGGAT AATTTCTTAT ACACCTTATG GAGCAAGCAA 120
GCCGCCTCAG CCATACCCAC TCAAAATGGT TCCTTTTATA AGAGCGACGC CTTTTCCGAT 180
CTGCTCATAA AGGCTAAACG GGTTTCGGAT CAAAAAGAGA GGGAAGCCCT TTATTTAAAG 240
GCTCAAGAAA TTATCCATAA AGATGCACCC TATGTGCCTT TAGCCTACCC TTATTCAGTG 300
GTGCCGCACT TGTCTAAAGT CAAAGGCTAT AAAACGACTG GAGTGAGCGT GAATCGCTTC 360
TTTAAGGTGT ATTTAGAAAA ATAAAAGGGG TTGCATGCTG AGTTTTATCA TTAAGCGTAT 420
TTTGTGGGCG ATCCCCACGC TGTTTGGAGT GAGCATTATC GTGTTTATGA TGGTGCATTT 480 AGTGCCAGGA GATCCGGCGT T 501
(2) INFORMATION FOR SEQ ID NO:545: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1078 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 67, B23 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 545:
CTGCAGCAGG CAATATTGGT GGTGGAGGTT TTGCGGTTAT CCATTTGGCT AATGGTGAAA 60
ATGTTGCCTT AGATTTTAGA GAAAAAGCCC CCTTGAAAGC CACTAAAAAC ATGTTTTTAG 120
ACAAGCAAGG CAATGTAGTC CCTAAACTCA GTGAAGATGG CTATTTGGCG GCTGGAGTTC 180
CTGGAACGGT GGCGGGCATG GAAGCGATGT TGAAAAAATA CGGCACTAAA AAACTATCGC 240
AACTCATTGA TCCTGCCATT AAATTGGCTG AAAATGGTTA TGTGATTTCA CAAAGGACAA 300
GCAGAAACCC TAAAAGAAGC AAGGGGAGCG GTTTTTTAAA ATACACTTCT AGCAAAAAAG 360
TATTTTTTTT AAAAAAGGAC ACCTTGATTA TCAAGAAGGG GATTTGTTTG TCCCAAAAAG 420
ATTTAGCCCA GACTTTGAAT CAAATCAAAA CGCTAGGCGC TAAAGGCTTT TATCAAGGGC 480
AAGTCGCTGA GCTTATCGAG AAAGACATGA AAAAAAATGG AGGGATTATC ACTAAAGAAG 540
ATTTAGCCAG TTACAATGTG AAATGGCGCA AACCCGTGGT AGGGAGTTAT CGTGGGTATA 600
AGATCATTTC TATGTCGCCA CCAAGTTCAG GAGGCACGCA TTTGATCCAG ATTTTAAATG 660
TCATGGAGAA TGCGGATTTA AGCACCCTTG GGTATGGGGC TTCTAAGAAT ATCCATATCG 720
CTGCCGAAGC GATGCGTCAA GCTTATGCGG ACAGATCGGT TTATATGGGA GACGCTGATT 780
TTGCTTCGGT GCCGGTGGAT AAATTGATTA ATAAAGCGTA TGCCAAAAAG ATTTTTGACA 840
CTATCCAGCC AGATACGGTT ACGCCAAGCT CTCAAATCAA ACCAGGAATG GGGCAGTTGC 900
ATGAGGGGAG CAACACCACG CATTATTCTG TAGCGGACAG GTGGGGGAAT GCAGTCAGCG 960
TTACTTACAC CATTAACGCT TCTTATGGAA GCGCTGCTAG TATTGATGGC GCAGGATTTT 1020
TATTGAACAA TGAAATGGAT GATTTTTCCA TAAAGCCTGG GAATCCTAAT CTCTATGG 1078
(2) INFORMATION FOR SEQ ID NO: 546:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 984 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 68, B9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:546:
AAGCTTCTGG GGTTGAAAAA GGTTCTGACA ACCCGCTCAA AAATAAGATC GCAAAGCTCA 60
CCCACAAGCA AGTGGAAGAG ATTGCGCAAT TGAAAATGGA AGATTTAAAC ACAAGCACCA 120
TGGAAGCGGC CAAAAAAATC GTTATGGGCA GCGCTAGGAG CATGGGCGTA GAAGTTGTGG 180
ATTGATTGGG TTTTGTTGGA ATTGAAAGAA ATTTTTAAGG ATTAGAATCG TGGCAAAAAA 240
AGTATTTAAA AGATTGGAAA AACTTTTTTC TAAAATTCAA AACGATAAAG CGTATGGCGT 300
AGAGCAGGGC GTAGAGGTGG TTAAATCCCT CGCTTCAGCC AAATTTGATG AAACCGTGGA 360
AGTAGCGTTA AGGCTAGGGG TTGATCCAAG GCATGCGGAT CAAATGGTGC GCGGTGCGGT 420
GGTGCTTCCT CATGGAACAG GGAAAAAAGT AAGAGTGGCC GTTTTTGCAA AAGACATCAA 480
GCAAGATGAA GCCAAGAACG CTGGGGCTGA TGTCGTTGGC GGAGATGATT TGGCTGAAGA 540
AATCAAAAAT GGTCGCATTG ATTTTGACAT GGTGATTGCA ACGCCTGATA TGATGGCGGT 600
TGTCGGTAAA GTGGGTAGGA TTTTAGGCCC TAAGGGTTTG ATGCCAAACC CTAAAACCGG 660
AACCGTTACG ATGGATATTG CTAAAGCGGT TACTAACGCT AAAAGCGGTC AAGTGAATTT 720
CAGGGTGGAT AAAAAGGGCA ATGTTCATCC CCCTATTGGT AAGGCGAGTT TTCCTGAAGA 780
AAAAATCAAA GAAAACATGC TTGAGTTGGT TAAAACGATC AACCGCCTAA AACCCAGTAG 840
TGCGAAAGGC AAGTATATTA GAAACGCCGC TCTTTCGCTC ACCATGTCGC CTTCAGTGAG 900
TTTGGACGCG CAAGAATTGA TGGATGTTAA ATAGCGTTAG GAGTTTTTAA TCTTAGGCTG 960
AAGATCGTAA GAGCTAAAAA GCTT 984
(2) INFORMATION FOR SEQ ID NO: 547:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1878 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: cluster 69, A3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:547:
AGCTTGTATC ACAACCAGGT TTTAAGCGGG TTTGCCGGGA GCACGGCGGA CGCTTTTAGC 60
TTGTTTGATA TGTTTGAACG CATTTTAGAG AGCAAAAAAG GGGATTTGTT TAAAAGCGTG 120
GTGGATTTCA GCAAAGAATG GCGCAAGGAC AAGTATTTAC GCCGACTAGA AGCGATGATG 180
ATCGTTTTAA GTTTGGATCT CATTTTCATT TTGAGCGGCA CGGGCGATGT TTTAGAAGCT 240
GAAGACAATA AAATCGCTGC TATTGGGAGT GGGGGGAATT ACGCCTTAGG CGCGGCTAGG 300
GCTTTAGATC ATTTCGCTCA TTTACAGCCT AGAAAACTTG TAGAATAGTC CTTAAAAATC 360
GCAGGGGATC TTTGCATTTA CACCAACACA AATATTAAAA TTTTGGAGCT TTAATGTCTA 420
AATTGAATAT GACCTCACGA GAAATTGTCG CTTATTTAGA TGAATACATC ATTGGGCAAA 480
AGGAAGCTAA AAAGTCTATC GCTATCGCTT TTAGGAATCG TTACAGGCGT TTGCAACTGG 540
AAAAATCCTT ACAAGAAGAA ATCACGCCTA AAAACATTTT AATGATTGGT TCTACTGGCG 600
TGGGTAAGAC TGAAATCGCA AGACGAATAG CAAAAATCAT GGAACTCCCC TTTGTGAAAG 660
TGGAAGCGAG CAAATACACA GAAGTGGGTT TTGTGGGGCG CGATGTGGAG TCTATGGTAA 720
GGGATTTAGT CAATAACAGC GTGCTTTTAG TGGAAAATGA GCATAAAGAA AAATTAAAAG 780
ACAAGATTGA AGAAGCGGTT ATAGAAAAAA TCGCTAAAAA ACTCCTACCC CCCTTGCCTA 840
ATGGCGTGAG CGAAGAAAAA AAACAAGAAT ACGCTAACAG CCTTTTAAAA ATGCAACAAA 900
GGATCGCGCA AGGCGAGTTG GATAGTAAAG AAATTGAAAT TGAAGTGCGT AAAAAAAGCA 960
TAGAGATTGA TTCTAATGTG CCGCCTGAAA TTTTAAGGGT TCAAGAAAAT GTGATTAAGT 1020
TTTTCCATAA AGAACAGGAT AAAGTCAAAA AAACTTTAAG CGTTAAAGAG GCTAAAGAAG 1080
CCCTAAAAGC AGAAATCAGC GACACGCTTT TAGACAGCGA AGCCATTAAA ATGGAAGGTT 1140
TGAAGCGCGC GGAAAGTTCA GGGGTGATTT TTATTGATGA AATTGATAAG ATCGCTGTAA 1200
GCTCTAAAGA AGGAAGCCGT CAAGATCCCA GTAAAGAGGG GGTTCAAAGG GATTTGTTGC 1260
CGATTGTAGA GGGGAGCGTG GTGAATACGA AGTATGGTTC TATTAAAACA GAGCATATTT 1320
TATTCATTGC AGCAGGGGCG TTTCATCTTT CTAAACCAAG CGATTTGATC CCTGAATTGC 1380
AGGGGCGTTT CCCTTTAAGG GTGGAGTTAG AAAATTTAAC CGAAGAAATC ATGTATATGA 1440
TTTTAACCCA AACTAAGACC TCTATCATCA AGCAATACCA AGCCCTTTTA AAAGTGGAGG 1500
GCGTAGAAAT TGCGTTTGAA GACGATGCGA TCAAAGAGTT AGCCAAACTT TCTTATAACG 1560
CCAATCAAAA AACCGAACAT ATAGGCGCTA GAAGGTTGCA CACCACCATT GAAAAAGTGC 1620
TAGAAGACAT TACTTTTGAA CCGGAGGATT ATTCGGGGCA AAATATTACT ACCACTAAAG 1680
AATTGGTTCA ATCCAACCTA GAGGATTTAG TGGCTGATGA AAATTTGGTG AAGTATATTT 1740
TATGATGAAA ACTAAGGCGG GCTTTGTATC TCTCATGGGC AAACCAAACG CTGGAAAAAG 1800
CACTCTTTTA AACACTTTAT TCGCCCTATA GTGAGTCGTA TTACAATTCA CTGGCCGTCG 1860
TTTTACAACG TCGTGACT 1878
(2) INFORMATION FOR SEQ ID NO: 548:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: primer CP
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:548:
GCGGATCCAT CCGGAGCCAT TTGGTTC 27
(2) INFORMATION FOR SEQ ID NO: 549:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1484 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: b8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 549:
CTTCGTGTGC CTACCACTAA TTTGATTTCT TGCGAATAAA GCTAGCTTGA ATGCGGGCAA 60
TATCCAAATC CAAAACATGC CCAAAGTTAA AGAGCGAGTG AGTGTCCCCT CTAAAGACGA 120
TACGATCTAT TCTTACCACG ATTCTATTAA GGATTCGATT AAGGCGGTGG TGAATATCTC 180
TACTGAAAAA AAGATTAAAA ACAATTTTAT AGGTGGCGGT GTGTTTAATG ACCCCTTTTT 240
CCAACAATTT TTTGGGGATT TGGGTGGCAT GATCCCTAAA GAAAGAATGG AAAGGGCTTT 300
AGGCAGCGGT GTCATCATTT CTAAAGACGG CTATATTGTA ACTAATAACC ATGTGATTGA 360
TGGCGCGGAT AAGATTAAAG TGACCATTCC AGGGAGCAAT AAAGAATATT CCGCTACTTT 420
AGTAGGCACG GATTCTGAAA GCGATTTAGC CGTGATTCGC ATCACTAAAG ACAACTTGCC 480
CACGATCAAA TTCTCTGATT CTAACGATAT TTCAGTGGGC GATTTGGTTT TTGCGATTGG 540
TAACCCTTTT GGCGTGGGTG AAAGCGTTAC TCAAGGCATT GTTTCAGCGC TCAATAAAAG 600
CGGGATTGGG ATCAACAGCT ATGAGAATTT CATTCAAACA GACGCCTCTA TTAATCCTGG 660
AAATTCCGGC GGCGCTTTAA TTGATAGCCG TGGAGGGTTA GTGGGGATTA ATACCGCTAT 720
CATCTCTAAA ACTGGGGGCA ACCACGGCAT TGGCTTTGCC ATCCTTTCTA ACATGGTTAA 780
AGATATTGTA ACCCAATTCA TCAAAACCGG TAAGATTGAA AGAGGTTACT TGGGCGTGGG 840
CTTGCAAGAT TTGAGCGGCG ATTTGCAAAA TTCTTATGAC AATAAAGAAG GGGCGGTAGT 900
CATTAGCGTA GAAAAAGACT CCCCGGCTAA AAAAGCAGGG ATTTTGGTGT GGGATTTGAT 960
CACCGAAGTC AATGGCAAAA AGGTTAAAAA CACGAACGAA TTGAGAAATC TAATCGGCTC 1020
TATGCTACCC AATCAAAGGG TAACCTTAAA GGTCATTAGA GACAAAAAAG AACGCGCCTT 1080
CACCCTCACA CTTGCTGAAA GGAAAAACCC TAACAAAAAA GAAACCATTT CTGCTCAAAA 1140
CGGCGCGCAA GGCCAATTGA ACGGGCTTCA AGTAGAAGAT TTAACCCAAA AAACCAAAAG 1200
GTCTATGCGT TTGAGCGATG ATGTTCAAGG GGTTTTAGTC TCTCAAGTGA ATGAAAATTC 1260
CCCAGCAGAG CAAGCCGGAT TTAGGCAAGG TAACATTATC ACACAAATTG AAGAGGTTGA 1320
AGTTAAAAGC GTTGCGGATT TTAACCATGC TTTAGAAAAG TATAAAGGCA AACCCACACG 1380
ATTCTTAGTT TTAGATTTGA ATCAAGGTTA TAGGATCATT TTGGTGAAAT GATAGAGGTG 1440
GGTTGTTAGT CGCATGTCTT TGATTAGAGT GAATGGGGAA GCTT 1484
(2) INFORMATION FOR SEQ ID NO: 550:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 451 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: b8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:550:
Met Pro Lys Val Lys Glu Arg Val Ser Val Pro Ser Lys Asp Asp Thr
1 5 10 15
Ile Tyr Ser Tyr His Asp Ser Ile Lys Asp Ser Ile Lys Ala Val Val
20 25 30
Asn Ile Ser Thr Glu Lys Lys Ile Lys Asn Asn Phe Ile Gly Gly Gly
35 40 45
Val Phe Asn Asp Pro Phe Phe Gln Gln Phe Phe Gly Asp Leu Gly Gly
50 55 60
Met Ile Pro Lys Glu Arg Met Glu Arg Ala Leu Gly Ser Gly Val Ile
65 70 75 80
Ile Ser Lys Asp Gly Tyr Ile Val Thr Asn Asn His Val Ile Asp Gly
85 90 95
Ala Asp Lys Ile Lys Val Thr Ile Pro Gly Ser Asn Lys Glu Tyr Ser
100 105 110
Ala Thr Leu Val Gly Thr Asp Ser Glu Ser Asp Leu Ala Val Ile Arg
115 120 125
Ile Thr Lys Asp Asn Leu Pro Thr Ile Lys Phe Ser Asp Ser Asn Asp
130 135 140
Ile Ser Val Gly Asp Leu Val Phe Ala Ile Gly Asn Pro Phe Gly Val
145 150 155 160
Gly Glu Ser Val Thr Gln Gly Ile Val Ser Ala Leu Asn Lys Ser Gly
165 170 175
Ile Gly Ile Asn Ser Tyr Glu Asn Phe Ile Gln Thr Asp Ala Ser Ile
180 185 190
Asn Pro Gly Asn Ser Gly Gly Ala Leu Ile Asp Ser Arg Gly Gly Leu
195 200 205
Val Gly Ile Asn Thr Ala Ile Ile Ser Lys Thr Gly Gly Asn His Gly
210 215 220
Ile Gly Phe Ala Ile Leu Ser Asn Met Val Lys Asp Ile Val Thr Gln
225 230 235 240
Phe Ile Lys Thr Gly Lys Ile Glu Arg Gly Tyr Leu Gly Val Gly Leu
245 250 255
Gln Asp Leu Ser Gly Asp Leu Gln Asn Ser Tyr Asp Asn Lys Glu Gly
260 265 270
Ala Val Val Ile Ser Val Glu Lys Asp Ser Pro Ala Lys Lys Ala Gly
275 280 285
Ile Leu Val Trp Asp Leu Ile Thr Glu Val Asn Gly Lys Lys Val Lys
290 295 300
Asn Thr Asn Glu Leu Arg Asn Leu Ile Gly Ser Met Leu Pro Asn Gln
305 310 315 320
Arg Val Thr Leu Lys Val Ile Arg Asp Lys Lys Glu Arg Ala Phe Thr
325 330 335
Leu Thr Leu Ala Glu Arg Lys Asn Pro Asn Lys Lys Glu Thr Ile Ser
340 345 350
Ala Gln Asn Gly Ala Gln Gly Gln Leu Asn Gly Leu Gln Val Glu Asp
355 360 365
Leu Thr Gln Lys Thr Lys Arg Ser Met Arg Leu Ser Asp Asp Val Gln
370 375 380
Gly Val Leu Val Ser Gln Val Asn Glu Asn Ser Pro Ala Glu Gln Ala
385 390 395 400
Gly Phe Arg Gln Gly Asn Ile Ile Thr Gln Ile Glu Glu Val Glu Val
405 410 415
Lys Ser Val Ala Asp Phe Asn His Ala Leu Glu Lys Tyr Lys Gly Lys
420 425 430
Pro Thr Arg Phe Leu Val Leu Asp Leu Asn Gln Gly Tyr Arg Ile Ile
435 440 445
Leu Val Lys
450
(2) INFORMATION FOR SEQ ID NO: 551:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: dllORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 551:
Ala Tyr Thr Glu Thr Val Arg Ala Phe Asn Leu Ala Glu Met Leu Met
1 5 10 15
Thr Pro Val Phe Leu Leu Met Asp Glu Thr Val Gly His Met Tyr Gly
20 25 30
Lys Val Gln Ile Pro Asp Leu Glu Glu Val Gln Lys Thr Thr Ile Asn
35 40 45
Arg Lys Glu Phe Val Gly Asp Lys Lys Asp Tyr Lys Pro Tyr Gly Val
50 55 60
Ala Gln Asp Glu Pro Ala Val Leu Asn Pro Phe Phe Lys Gly Tyr Arg
65 70 75 80
Tyr His Val Ser Gly Leu His His Gly Pro Ile Gly Phe Pro Thr Glu
85 90 95
Asp Ala Lys Ile Gly Gly Asp Leu Thr Asp Arg Leu Phe Asn Lys Ile
100 105 110
Glu Ser Lys Gln Asp Ile Ile Asn Glu Asn Glu Glu Met Asp Leu Glu
115 120 125
Gly Ala Glu Ile Val Ile Ile Ala Tyr Gly Ser Val Ser Leu Pro Val
130 135 140
Lys Glu Ala Leu Lys Asp Tyr His Lys Glu Ser Lys Gln Lys Val Gly
145 150 155 160
Phe Phe Arg Pro Lys Thr Leu Trp Pro Ser Pro Ala Lys Arg Leu Lys
165 170 175
Glu Ile Gly Asp Lys Tyr Glu Lys Ile Leu Val Ile Glu Leu Asn Lys
180 185 190
Gly Gln Tyr Leu Glu Lys Ile Glu Arg Ala Met Gln Arg Lys Val His
195 200 205
Phe Leu Gly Gln Ala Asn Gly Arg Thr Ile Ser Pro Lys Gln Ile Ile
210 215 220
Ala Lys Leu Lys Glu Leu
225 230
(2) INFORMATION FOR SEQ ID NO: 552:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: dllORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:552:
Met Ala Phe Asn Tyr Asp Glu Tyr Leu Arg Val Asp Lys Ile Pro Thr
1 5 10 15
Leu Trp Cys Trp Gly Cys Gly Asp Gly Val Ile Leu Lys Ser Ile Ile
20 25 30
Arg Thr Ile Asp Ala Leu Gly Trp Lys Met Asp Asp Val Cys Leu Val
35 40 45
Ser Gly Ile Gly Cys Ser Gly Arg Met Ser Ser Tyr Val Asn Cys Asn
50 55 60
Thr Val His Thr Thr His Gly Arg Ala Val Ala Tyr Ala Thr Gly Ile
65 70 75 80
Lys Met Ala Asn Pro Ser Lys His Val Ile Val Val Ser Gly Asp Gly
85 90 95
Asp Gly Phe Ala Ile Gly Gly Asn His Thr Met His Ala Cys Arg Arg
100 105 110
Asn Ile Asp Leu Asn Phe Ile Leu Val Asn Asn Phe Ile Tyr Gly Leu
115 120 125
Thr Asn Ser Gln Thr Ser Pro Thr Thr Pro Asn Gly Met Trp Thr Val
130 135 140
Thr Ala Gln Trp Gly Asn Ile Asp Asn Gln Phe Asp Pro Cys Ala Leu
145 150 155 160
Thr Thr Ala Ala Gly Ala Ser Phe Val Ala Arg Glu Ser Val Leu Asp
165 170 175
Pro Gln Lys Leu Glu Lys Val Leu Lys Glu Gly Phe Ser His Lys Gly
180 185 190
Phe Ser Phe Phe Asp Val His Ser Asn Cys His Ile Asn Leu Gly Arg
195 200 205
Lys Asn Lys Met Gly Glu Ala Ser Gln Met Leu Lys Trp Met Glu Ser
210 215 220
Arg Leu Val Ser Lys Arg Gln Phe Glu Ala Met Ser Pro Glu Glu Arg
225 230 235 240
Val Asp Lys Phe Pro Thr Gly Val Leu Lys His Asp Thr Asp Arg Lys
245 250 255
Glu Tyr Cys Glu Ala Tyr Gln Glu Ile Ile Glu Lys Ala Gln Gly Lys
260 265 270
Gln
(2) INFORMATION FOR SEQ ID NO: 553:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 181 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: dllORF3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 553:
Met Glu Ala Gln Leu Arg Phe Thr Gly Val Gly Gly Gln Gly Val Leu
1 5 10 15
Leu Ala Gly Glu Ile Leu Ala Glu Ala Lys Ile Val Ser Gly Gly Tyr
20 25 30
Gly Thr Lys Thr Ser Thr Tyr Thr Ser Gln Val Arg Gly Gly Pro Thr
35 40 45
Lys Val Asp Ile Leu Leu Asp Arg Asn Glu Ile Ile Phe Pro Tyr Ala
50 55 60
Lys Glu Gly Glu Ile Asp Phe Met Leu Ser Val Ala Gln Ile Ser Tyr
65 70 75 80
Asn Gln Phe Lys Ser Asp Ile Lys Gln Gly Gly Ile Val Val Ile Asp
85 90 95
Pro Asn Leu Val Thr Pro Thr Lys Glu Asp Glu Glu Lys Tyr Gln Leu
100 105 110
Tyr Lys Ile Pro Ile Ile Ser Ile Ala Lys Asp Glu Val Gly Asn Ile
115 120 125
Ile Thr Gln Ser Val Val Ala Leu Ala Ile Thr Val Glu Leu Thr Lys
130 135 140
Cys Val Glu Glu Asn Ile Val Leu Asp Thr Met Leu Lys Lys Val Pro
145 150 155 160
Ala Lys Val Ala Asp Thr Asn Lys Lys Ala Phe Glu Ile Gly Lys Lys
165 170 175
His Ala Leu Glu Ala
180
(2) INFORMATION FOR SEQ ID NO:554:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 410 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: e6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 554:
Met Ser Lys Ile Ala Asp Asp Gln Asn Phe Asn Asp Glu Glu Glu Asn
1 5 10 15
Phe Ala Lys Leu Phe Lys Lys Glu Leu Glu Lys Glu Glu Thr Leu Glu
20 25 30
Lys Gly Thr Ile Lys Glu Gly Leu Ile Val Ser Ile Asn Glu Asn Asp
35 40 45
Gly Tyr Ala Met Val Ser Val Gly Gly Lys Thr Glu Gly Arg Leu Ala
50 55 60
Leu Asn Glu Ile Thr Asp Glu Lys Gly Gln Leu Leu Tyr Gln Lys Asn
65 70 75 80
Asp Pro Ile Ile Val His Val Ser Glu Lys Gly Glu His Pro Ser Val
85 90 95
Ser Tyr Lys Lys Ala Ile Ser Gln Gln Lys Ile Gln Ala Lys Ile Glu
100 105 110
Glu Leu Gly Glu Asn Tyr Glu Asn Ala Ile Ile Glu Gly Lys Ile Val
115 120 125
Gly Lys Asn Lys Gly Gly Tyr Ile Val Glu Ser Gln Gly Val Glu Tyr
130 135 140
Phe Leu Ser Arg Ser His Ser Ser Leu Lys Asn Asp Ala Asn His Ile
145 150 155 160
Gly Lys Arg Ile Lys Ala Cys Ile Ile Arg Val Asp Lys Glu Asn His
165 170 175
Ser Ile Asn Ile Ser Arg Lys Arg Phe Phe Glu Val Asn Asp Lys Arg
180 185 190
Gln Leu Glu Val Ser Lys Glu Leu Leu Glu Ala Thr Glu Pro Val Leu
195 200 205
Gly Val Val Arg Gln Ile Thr Pro Phe Gly Ile Phe Val Glu Ala Lys
210 215 220
Gly Ile Glu Gly Leu Val His Tyr Ser Glu Ile Ser His Lys Gly Pro
225 230 235 240
Val Asn Pro Glu Lys Tyr Tyr Lys Glu Gly Asp Glu Val Tyr Val Lys
245 250 255
Ala Ile Ala Tyr Asp Ala Glu Lys Arg Arg Leu Ser Leu Ser Ile Lys
260 265 270
Ala Thr Ile Glu Asp Pro Trp Glu Glu Ile Gln Asp Lys Leu Lys Pro
275 280 285
Gly Tyr Ala Ile Lys Val Val Val Ser Asn Ile Glu His Tyr Gly Val
290 295 300
Phe Val Asp Ile Gly Asn Asp Ile Glu Gly Phe Leu His Val Ser Glu
305 310 315 320
Ile Ser Trp Asp Lys Asn Val Ser His Pro Ser His Tyr Leu Ser Val
325 330 335
Gly Gln Glu Ile Asp Val Lys Ile Ile Asp Ile Asp Pro Lys Asn Arg
340 345 350
Arg Leu Arg Val Ser Leu Lys Gln Leu Thr Asn Arg Pro Phe Asp Val
355 360 365
Phe Glu Ser Lys His Gln Val Gly Asp Ile Val Glu Gly Gln Ser Gly
370 375 380
Asp Phe Asn Gly Phe Trp Gly Val Phe Glu Ser Gly Trp Ser Gly Trp
385 390 395 400
Leu Ala Pro Gln Ser Arg Arg Phe Leu Gly
405 410
(2) INFORMATION FOR SEQ ID NO: 555:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y175A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 555:
Met Lys Arg Ser Ser Val Phe Ser Phe Leu Val Ala Phe Leu Leu Val
1 5 10 15
Ala Gly Cys Ser His Lys Met Asp Asn Lys Thr Val Ala Gly Asp Val
20 25 30
Ser Ala Lys Thr Val Gln Thr Ala Pro Val Thr Thr Glu Pro Ala Pro
35 40 45
Glu Lys Glu Glu Pro Lys Gln Glu Pro Ala Pro Val Val Glu Glu Lys
50 55 60
Pro Ala Val Glu Ser Gly Thr Ile Ile Ala Ser Ile Tyr Phe Asp Phe
65 70 75 80
Asp Lys Tyr Glu Ile Lys Glu Ser Asp Gln Glu Thr Leu Asp Glu Ile
85 90 95
Val Gln Lys Ala Lys Glu Asn His Met Gln Val Leu Leu Glu Gly Asn
100 105 110
Thr Asp Glu Phe Gly Ser Ser Glu Tyr Asn Gln Ala Leu Gly Val Lys
115 120 125
Arg Thr Leu Ser Val Lys Asn Ala Leu Val Ile Lys Gly Val Glu Lys
130 135 140
Asp Met Ile Lys Thr Ile Ser Phe Gly Glu Thr Lys Pro Lys Cys Ala
145 150 155 160
Gln Lys Thr Arg Glu Cys Tyr Lys Glu Asn Arg Arg Val Asp Val Lys
165 170 175
Leu Met Lys
(2) INFORMATION FOR SEQ ID NO: 556:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 104 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y89A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 556:
Ile Tyr Phe Asp Phe Asp Lys Tyr Glu Ile Lys Glu Ser Asp Gln Glu
1 5 10 15
Thr Leu Asp Glu Ile Val Gln Lys Ala Lys Glu Asn His Met Gln Val
20 25 30
Leu Leu Glu Gly Asn Thr Asp Glu Phe Gly Ser Ser Glu Tyr Asn Gln
35 40 45
Ala Leu Gly Val Lys Arg Thr Leu Ser Val Lys Asn Ala Leu Val Ile
50 55 60
Lys Gly Val Glu Lys Asp Met Ile Lys Thr Ile Ser Phe Gly Glu Thr
65 70 75 80
Lys Pro Lys Cys Ala Gln Lys Thr Arg Glu Cys Tyr Lys Glu Asn Arg
85 90 95
Arg Val Asp Val Lys Leu Met Lys
100
(2) INFORMATION FOR SEQ ID NO: 557:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y261A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 557:
Val Thr Gln Gly Ile Val Ser Ala Leu Asn Lys Ser Gly Ile Gly Ile
1 5 10 15
Asn Ser Tyr Glu Asn Phe Ile Gln Thr Asp Ala Ser Ile Asn Pro Gly
20 25 30
Asn Ser Gly Gly Ala Leu Ile Asp Ser Arg Gly Gly Leu Val Gly Ile
35 40 45
Asn Thr Ala Ile Ile Ser Lys Thr Gly Gly Asn His Gly Ile Gly Phe
50 55 60
Ala Ile Pro Ser Asn Met Val Lys Asp Thr Val Thr Gln Leu Ile Lys
65 70 75 80
Thr Gly Lys Ile Glu Arg Gly Tyr Leu Gly Val Gly Leu Gln Asp Leu
85 90 95
Ser Gly Asp Leu Gln Asn Ser Tyr Asp Asn Lys Glu Gly Ala Val Val
100 105 110
Ile Ser Val Glu Lys Asp Ser Pro Ala Lys Lys Ala Gly Ile Leu Val
115 120 125
Trp Asp Leu Ile Thr Glu Val Asn Gly Lys Lys Val Lys Asn Thr Asn
130 135 140
Glu Leu Arg Asn Leu Ile Gly Ser Met Leu Pro Asn Gln Arg Val Thr
145 150 155 160
Leu Lys Val Ile Arg Asp Lys Lys Glu Arg Ala Phe Thr Leu Thr Leu
165 170 175
Ala Glu Arg Lys Asn Pro Asn Lys Lys Glu Thr Ile Ser Ala Gln Asn
180 185 190
Gly Ala Gln Gly Gln Leu Asn Gly Leu Gln Val Glu Asp Leu Thr Gln
195 200 205
Glu Thr Lys Arg Ser Met Arg Leu Ser Asp Asp Val Gln Gly Val Leu
210 215 220
Val Ser Gln Val Asn Glu Asn Ser Pro Ala Glu Gln Ala Gly Phe Arg
225 230 235 240
Gln Gly Asn Ile Ile Thr Lys Ile Glu Glu Val Glu Val Lys Ser Val
245 250 255
Ala Asp Phe Asn His Ala Leu Glu Lys Tyr Lys Gly Lys Pro Lys Arg
260 265 270
Phe Leu Val Leu Asp Leu Asn Gln Gly Tyr Arg Ile Ile Leu Val Lys
275 280 285
(2) INFORMATION FOR SEQ ID NO: 558:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 236 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Z25C
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 558:
Glu Phe Pro Leu Thr Phe Pro Ser Lys Tyr His Ala Glu His Leu Ala
1 5 10 15
Gly Lys Glu Ala Phe Phe Lys Val Lys Leu His Gln Ile Gln Ala Arg
20 25 30
Glu Met Leu Glu Ile Asn Asp Glu Leu Ala Lys Ile Val Leu Ala Asn
35 40 45
Glu Glu Asn Ala Thr Leu Lys Leu Leu Lys Glu Arg Val Glu Gly Gln
50 55 60
Leu Phe Leu Glu Asn Lys Ala Arg Leu Tyr Asn Glu Glu Leu Lys Glu
65 70 75 80
Lys Leu Ile Glu Asn Leu Asp Glu Lys Ile Val Phe Asp Leu Pro Lys
85 90 95
Thr Ile Ile Glu Gln Glu Met Asp Leu Leu Phe Arg Asn Ala Leu Tyr
100 105 110
Ser Met Gln Ala Glu Glu Val Lys Ser Leu Gln Glu Ser Gln Glu Lys
115 120 125
Ala Lys Glu Lys Arg Glu Ser Phe Arg Asn Asp Ala Thr Lys Ser Val
130 135 140
Lys Ile Thr Phe Ile Ile Asp Ala Leu Ala Lys Glu Glu Lys Ile Gly
145 150 155 160
Val His Asp Asn Glu Val Phe Gln Thr Leu Tyr Tyr Glu Ala Met Met
165 170 175
Thr Gly Gln Asn Pro Glu Asn Leu Ile Glu Gln Tyr Arg Lys Asn Asn
180 185 190
Met Leu Ala Ala Val Lys Met Ala Met Ile Glu Asp Arg Val Leu Ala
195 200 205
Tyr Leu Leu Asp Lys Asn Leu Pro Lys Glu Gln Gln Glu Ile Leu Glu
210 215 220
Lys Met Arg Pro Asn Ala Gln Lys Ile Gln Ala Gly
225 230 235
(2) INFORMATION FOR SEQ ID NO: 559:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y291A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:559:
Met Lys Phe Gln Pro Leu Gly Glu Arg Val Leu Val Glu Arg Leu Glu
1 5 10 15
Glu Glu Asn Lys Thr Ser Ser Gly Ile Ile Ile Pro Asp Asn Ala Lys
20 25 30
Glu Lys Pro Leu Met Gly Val Val Lys Ala Val Ser His Lys Ile Ser
35 40 45
Glu Gly Cys
50
(2) INFORMATION FOR SEQ ID NO: 560:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:560:
Met Ile Leu Arg Ala Ser Val Leu Ser Ala Leu Leu Leu Val Gly Leu
1 5 10 15
Gly Ala Ala Pro Lys His Ser Val Ser Ala Asn Asp Lys Arg Met Gln
20 25 30
Asp Asn Leu Val Ser Val Ile Glu Lys Gln Thr Asn Lys Lys Val Arg
35 40 45
Ile Leu Glu Ile Lys Pro Leu Lys Ser Ser Gln Asp Leu Lys Met Val
50 55 60
Val Ile Glu Asp Pro Asp Thr Lys Tyr Asn Ile Pro Leu Val Val Ser
65 70 75 80
Lys Asp Gly Asn Leu Ile Ile Gly Leu Ser Asn Ile Phe Phe Ser Asn
85 90 95
Lys Ser Asp Asp Val Gln Leu Val Ala Glu Thr Asn Gln Lys Val Gln
100 105 110
Ala Leu Asn Ala Thr Gln Gln Asn Ser Ala Lys Leu Asn Ala Ile Phe
115 120 125
Asn Glu Ile Pro Ala Asp Tyr Ala Ile Glu Leu Pro Ser Thr Asn Ala
130 135 140
Ala Asn Lys Asp Lys Ile Leu Tyr Ile Val Ser Asp Pro Met Cys Pro
145 150 155 160
His Cys Gln Lys Glu Leu Thr Lys Leu Arg Asp His Leu Lys Glu Asn
165 170 175
Thr Val Arg Met Val Val Val Gly Trp Leu Gly Val Asn Ser Ala Lys
180 185 190
Lys Ala Ala Leu Ile Gln Glu Glu Met Ala Lys Ala Arg Ala Arg Gly
195 200 205
Ala Ser Val Glu Asp Lys Ile Ser Ile Leu Glu Lys Ile Tyr Ser Thr
210 215 220
Gln Tyr Asp Ile Asn Ala Gln Lys Glu Pro Glu Asp Leu Arg Thr Lys
225 230 235 240
Val Glu Asn Thr Thr Lys Lys Ile Phe Glu Ser Gly Val Ile Lys Gly
245 250 255
Val Pro Phe Leu Tyr His Tyr Lys Ala
260 265
(2) INFORMATION FOR SEQ ID NO: 561:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 106 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y98A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 561:
Met Lys Lys Pro Tyr Arg Lys Ile Ser Asp Tyr Ala Ile Val Gly Gly
1 5 10 15
Leu Ser Ala Leu Val Met Val Ser Ile Val Gly Cys Lys Ser Asn Ala
20 25 30
Asp Asp Lys Pro Lys Glu Gln Ser Ser Leu Ser Gln Ser Val Gln Lys
35 40 45
Gly Ala Phe Val Ile Leu Glu Glu Gln Lys Asp Lys Ser Tyr Lys Val
50 55 60
Val Glu Glu Tyr Pro Ser Ser Arg Thr His Ile Ile Val Arg Asp Leu
65 70 75 80
Gln Gly Asn Glu Arg Val Leu Ser Asn Glu Glu Ile Gln Lys Leu Ile
85 90 95
Lys Glu Glu Glu Ala Lys Ile Asp Asn Gly
100 105
(2) INFORMATION FOR SEQ ID NO:562:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 125 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y173A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:562:
Met Ala Ile Ser Lys Glu Glu Val Leu Glu Tyr Ile Gly Ser Leu Ser
1 5 10 15
Val Leu Glu Leu Ser Glu Leu Val Lys Met Phe Glu Lys Lys Phe Gly
20 25 30
Val Ser Ala Thr Pro Thr Val Val Ala Gly Ala Ala Val Ala Gly Gly
35 40 45
Ala Ala Ala Glu Ser Glu Glu Lys Thr Glu Phe Asn Val Ile Leu Ala
50 55 60
Asp Ser Gly Ala Glu Lys Ile Lys Val Ile Lys Val Val Arg Glu Ile
65 70 75 80
Thr Gly Leu Gly Leu Lys Glu Ala Lys Asp Ala Thr Glu Lys Thr Pro
85 90 95
His Val Leu Lys Glu Gly Val Asn Lys Glu Glu Ala Glu Thr Ile Lys
100 105 110
Lys Lys Leu Glu Glu Val Gly Ala Lys Val Glu Val Lys
115 120 125
(2) INFORMATION FOR SEQ ID NO: 563:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 210 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 12b
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 563:
Met Leu Leu Lys Thr Lys Leu Lys Ile Ile Ser Ser Val Ile Leu Ser
1 5 10 15
Ala Leu Leu Trp Val Gly Cys Ser Ser Glu Met Ala Thr Tyr Gln Asn
20 25 30
Val Asn Asp Ala Thr Lys Asn Thr Thr Ala Ser Ile Asn Ser Thr Asp
35 40 45
Leu Leu Leu Thr Ala Asn Ala Met Leu Asp Ser Met Phe Ser Asp Pro
50 55 60
Asn Phe Glu Gln Leu Lys Gly Lys His Leu Ile Glu Val Ser Asp Val
65 70 75 80
Ile Asn Asp Thr Thr Gln Pro Asn Leu Asp Met Asn Leu Leu Thr Thr
85 90 95
Glu Ile Ala Arg Gln Leu Arg Leu Arg Ser Asn Gly Arg Phe Asn Ile
100 105 110
Thr Arg Ala Ser Gly Gly Ser Gly Ile Ala Ala Asp Ser Arg Met Val
115 120 125
Lys Gln Arg Glu Lys Glu Arg Glu Ser Glu Glu Tyr Asn Gln Asp Thr
130 135 140
Thr Val Glu Lys Gly Thr Leu Lys Ala Ala Asp Leu Ser Leu Ser Gly
145 150 155 160
Lys Val Ser Ser Ile Ala Ala Ser Ile Ser Ser Ser Arg Gln Arg Leu
165 170 175
Asp Tyr Asp Phe Thr Leu Ser Leu Thr Asn Arg Lys Thr Gly Glu Glu
180 185 190
Val Trp Ser Asp Val Lys Pro Ile Val Lys Asn Ala Ser Asn Lys Arg
195 200 205
Met Phe
210
(2) INFORMATION FOR SEQ ID NO:564:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 130 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 12d
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 564:
Leu Trp Ile Lys Cys Ala Lys Ser Trp Ala Trp Leu Lys Ser Arg Val
1 5 10 15
Leu Met Lys Arg Leu Ala Ile Ala Leu Val Leu Val Leu Gly Val Ala
20 25 30
Trp Gly Lys Ser Leu Pro Lys Trp Ala Lys Asp Cys Ser Lys Gly Val
35 40 45
Gln Ile Glu Lys Thr Gln Thr Lys Asp Glu Lys Phe Leu Val Cys Gly
50 55 60
Met Ser Asp Ile Leu Leu Ser Asp Met Asp Tyr Ser Leu Ser Ser Ala
65 70 75 80
Arg Gln Asn Ala Leu Glu Lys Val Met Glu Ala Phe Lys Gly Asp Lys
85 90 95
Ile Glu Ile Lys Ala Ser Glu Leu Lys Ala Thr Phe Ile Asp Thr Asp
100 105 110
Lys Val Tyr Val Leu Leu Lys Ile Thr Lys Lys His Val Ala Leu Met
115 120 125
Asn Glu
130
(2) INFORMATION FOR SEQ ID NO: 565:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 256 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y139A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 565:
Met Gly Tyr Ala Ser Lys Leu Ala Leu Lys Ile Cys Leu Ala Ser Leu
1 5 10 15
Cys Leu Phe Ser Ala Leu Gly Ala Glu His Leu Glu Gln Lys Arg Asn
20 25 30
Tyr Ile Tyr Lys Gly Glu Glu Ala Tyr Asn Asn Lys Glu Tyr Glu Arg
35 40 45
Ala Ala Ser Phe Tyr Lys Ser Ala Ile Lys Asn Gly Glu Pro Leu Ala
50 55 60
Tyr Val Leu Leu Gly Ile Met Tyr Glu Asn Gly Arg Gly Val Pro Lys
65 70 75 80
Asp Glu Lys Lys Ala Ala Glu Tyr Phe Gln Lys Ala Val Asp Asn Asp
85 90 95
Ile Pro Arg Gly Tyr Asn Asn Leu Gly Val Met Tyr Lys Glu Gly Arg
100 105 110
Gly Val Pro Lys Asp Glu Lys Lys Ala Val Glu Tyr Phe Arg Ile Ala
115 120 125
Thr Glu Lys Gly Tyr Thr Asn Ala Tyr Ile Asn Leu Gly Ile Met Tyr
130 135 140
Met Glu Gly Arg Gly Val Pro Ser Asn Tyr Val Lys Ala Thr Glu Cys
145 150 155 160
Phe Arg Lys Ala Met His Lys Gly Asn Val Glu Ala Tyr Ile Leu Leu
165 170 175
Gly Asp Ile Tyr Tyr Ser Gly Asn Asp Gln Leu Gly Ile Glu Pro Asp
180 185 190
Lys Asp Lys Ala Ile Val Tyr Tyr Lys Met Ala Ala Asp Met Ser Ser
195 200 205
Ser Arg Ala Tyr Glu Gly Leu Ala Glu Ser Tyr Gln Tyr Gly Leu Gly
210 215 220
Val Glu Lys Asp Lys Lys Lys Ala Glu Glu Tyr Met Gln Lys Ala Cys
225 230 235 240
Asp Phe Asp Ile Asp Lys Asn Cys Lys Lys Lys Asn Thr Ser Ser Arg
245 250 255
(2) INFORMATION FOR SEQ ID NO: 566:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 95 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y143A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 566:
Gln Lys Ala Val Asp Asn Asp Ile Pro Arg Gly Tyr Asn Asn Leu Gly
1 5 10 15
Val Met Tyr Lys Glu Gly Arg Gly Val Pro Lys Asp Glu Lys Lys Ala
20 25 30
Val Glu Tyr Phe Arg Ile Ala Thr Glu Lys Gly Tyr Thr Asn Ala Tyr
35 40 45
Ile Asn Leu Gly Ile Met Tyr Met Glu Gly Arg Gly Val Pro Ser Asn
50 55 60
Tyr Val Lys Ala Thr Glu Cys Phe Arg Lys Ala Met His Lys Gly Asn
65 70 75 80
Val Glu Ala Tyr Ile Leu Leu Gly Asp Ile Tyr Tyr Ser Gly Asn
85 90 95
(2) INFORMATION FOR SEQ ID NO: 567:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 276 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y212A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 567:
Lys Lys Val Gly Gly Lys Glu Glu Ile Thr Gln Val Ala Thr Ile Ser
1 5 10 15
Ala Asn Ser Asp His Asn Ile Gly Lys Leu Ile Ala Asp Ala Met Glu
20 25 30
Lys Val Gly Lys Asp Gly Val Ile Thr Val Glu Glu Ala Lys Gly Ile
35 40 45
Glu Asp Glu Leu Asp Val Val Glu Gly Met Gln Phe Asp Arg Gly Tyr
50 55 60
Leu Ser Pro Tyr Phe Val Thr Asn Ala Glu Lys Met Thr Ala Gln Leu
65 70 75 80
Asp Asn Ala Tyr Ile Leu Leu Thr Asp Lys Lys Ile Ser Ser Met Lys
85 90 95
Asp Ile Leu Pro Leu Leu Glu Lys Thr Met Lys Glu Gly Lys Pro Leu
100 105 110
Leu Ile Ile Ala Glu Asp Ile Glu Gly Glu Ala Leu Thr Thr Leu Val
115 120 125
Val Asn Lys Leu Arg Gly Val Leu Asn Ile Ala Ala Val Lys Ala Pro
130 135 140
Gly Phe Gly Asp Arg Arg Lys Glu Met Leu Lys Asp Ile Ala Ile Leu
145 150 155 160
Thr Gly Gly Gln Val Ile Ser Glu Glu Leu Gly Leu Ser Leu Glu Asn
165 170 175
Ala Glu Val Glu Phe Leu Gly Lys Ala Gly Arg Ile Val Ile Asp Lys
180 185 190
Asp Asn Thr Thr Ile Val Asp Gly Lys Gly His Ser His Asp Val Lys
195 200 205
Asp Arg Val Ala Gln Ile Lys Thr Gln Ile Ala Ser Thr Thr Ser Asp
210 215 220
Tyr Asp Lys Glu Lys Leu Gln Glu Arg Leu Ala Lys Leu Ser Gly Gly
225 230 235 240
Val Ala Val Ile Lys Val Gly Ala Ala Ser Glu Val Glu Met Lys Glu
245 250 255
Lys Lys Asp Arg Val Asp Asp Ala Leu Ser Ala Thr Lys Ala Ala Val
260 265 270
Glu Glu Gly Ile
275
(2) INFORMATION FOR SEQ ID NO: 568:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 65 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y124A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 568:
Glu Gly Met Gln Phe Asp Arg Gly Tyr Leu Ser Pro Tyr Phe Val Thr
1 5 10 15
Asn Ala Glu Lys Met Thr Ala Gln Leu Asp Asn Ala Tyr Ile Leu Leu
20 25 30
Thr Asp Lys Lys Ile Ser Ser Met Lys Asp Ile Leu Pro Leu Leu Glu
35 40 45
Lys Thr Met Lys Glu Gly Lys Pro Leu Leu Ile Ile Ala Glu Asp Ile
50 55 60
Glu
65
(2) INFORMATION FOR SEQ ID NO: 569:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 144 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y184A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 569:
Met Asn Leu Phe Glu Lys Met Thr Asp Gln Leu His Glu Thr Leu Asp
1 5 10 15
Ser Ala Leu Ala Leu Ala Leu His His Lys Asn Ala Glu Val Thr Pro
20 25 30
Met His Met Leu Phe Ala Met Leu Asn Asn Ser Gln Gly Ile Leu Ile
35 40 45
Gln Ala Leu Gln Lys Met Pro Val Asp Ile Gln Ala Leu Arg Leu Ser
50 55 60
Val Gln Ser Glu Leu Asn Lys Phe Ala Lys Val Ser Gln Ile Ser Lys
65 70 75 80
Gln Asn Ile Gln Leu Asn Gln Ala Leu Ile Gln Ser Leu Glu Asn Ala
85 90 95
Gln Gly Leu Met Ala Lys Arg Gly Asp Ser Phe Ile Ala Thr Asp Val
100 105 110
Tyr Leu Leu Ala Asn Met Gly Leu Phe Glu Ser Val Leu Lys Pro Tyr
115 120 125
Leu Asp Ala Lys Glu Leu Gln Lys Thr Leu Glu Ser Leu Arg Lys Gly
130 135 140
(2) INFORMATION FOR SEQ ID NO: 570:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 318 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 18b
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 570:
Ser Asp Phe Ile Asp Lys Ser Asn Asp Leu Ile Asn Lys Asp Asn Leu
1 5 10 15
Ile Asp Val Glu Ser Ser Thr Lys Ser Phe Gln Lys Phe Gly Asp Gln
20 25 30
Arg Tyr Gln Ile Phe Thr Ser Trp Val Ser His Gln Lys Asp Pro Ser
35 40 45
Lys Ile Asn Thr Arg Ser Ile Arg Asn Phe Met Glu Asn Ile Ile Gln
50 55 60
Pro Pro Ile Pro Asp Asp Lys Glu Lys Ala Glu Phe Leu Lys Ser Ala
65 70 75 80
Lys Gln Ser Phe Ala Gly Ile Ile Ile Gly Asn Gln Ile Arg Thr Asp
85 90 95
Gln Lys Phe Met Gly Val Phe Asp Glu Ser Leu Lys Glu Arg Gln Glu
100 105 110
Ala Glu Lys Asn Gly Gly Pro Thr Gly Gly Asp Trp Leu Asp Ile Phe
115 120 125
Leu Ser Phe Ile Phe Asn Lys Lys Gln Ser Ser Asp Val Lys Glu Ala
130 135 140
Ile Asn Gln Glu Pro Val Pro His Val Gln Pro Asp Ile Ala Thr Thr
145 150 155 160
Thr Thr Asp Ile Gln Gly Leu Pro Pro Glu Ala Arg Asp Leu Leu Asp
165 170 175
Glu Arg Gly Asn Phe Ser Lys Phe Thr Leu Gly Asp Met Glu Met Leu
180 185 190
Asp Val Glu Gly Val Ala Asp Ile Asp Pro Asn Tyr Lys Phe Asn Gln
195 200 205
Leu Leu Ile His Asn Asn Ala Leu Ser Ser Val Leu Met Gly Ser His
210 215 220
Asn Gly Ile Glu Pro Glu Lys Val Ser Leu Leu Tyr Ala Gly Asn Gly
225 230 235 240
Gly Phe Gly Asp Lys His Asp Trp Asn Ala Thr Val Gly Tyr Lys Asp
245 250 255
Gln Gln Gly Asn Asn Val Ala Thr Leu Ile Asn Val His Met Lys Asn
260 265 270
Gly Ser Gly Leu Val Ile Ala Gly Gly Glu Lys Gly Ile Asn Asn Pro
275 280 285
Ser Phe Tyr Leu Tyr Lys Glu Asp Gln Leu Thr Gly Ser Gln Arg Ala
290 295 300
Leu Ser Gln Glu Glu Ile Arg Asn Lys Val Asp Phe Met Glu
305 310 315
(2) INFORMATION FOR SEQ ID NO: 571:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 250 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 571:
Met Leu Gly Asn Val Lys Lys Thr Leu Phe Gly Val Leu Cys Leu Gly
1 5 10 15
Thr Leu Cys Leu Arg Gly Leu Met Ala Glu Pro Asp Ala Lys Glu Leu
20 25 30
Val Asn Leu Gly Ile Glu Ser Ala Lys Lys Gln Asp Phe Ala Gln Ala
35 40 45
Lys Thr His Phe Glu Lys Ala Cys Glu Leu Lys Asn Gly Phe Gly Cys
50 55 60
Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly Lys Gly Val Gly Lys Asp
65 70 75 80
Leu Lys Lys Ala Ile Gln Phe Tyr Thr Lys Gly Cys Glu Leu Asn Asp
85 90 95
Gly Tyr Gly Cys Asn Leu Leu Gly Asn Leu Tyr Tyr Asn Gly Gln Gly
100 105 110
Val Ser Lys Asp Ala Lys Lys Ala Ser Gln Tyr Tyr Ser Lys Ala Cys
115 120 125
Asp Leu Asn His Ala Glu Gly Cys Met Val Leu Gly Ser Leu His His
130 135 140
Tyr Gly Val Gly Thr Pro Lys Asp Leu Arg Lys Ala Leu Asp Leu Tyr
145 150 155 160
Glu Lys Ala Cys Asp Leu Lys Asp Ser Pro Gly Cys Ile Asn Ala Gly
165 170 175
Tyr Ile Tyr Ser Val Thr Lys Asn Phe Lys Glu Ala Ile Val Arg Tyr
180 185 190
Ser Lys Ala Cys Glu Leu Lys Asp Gly Arg Gly Cys Tyr Asn Leu Gly
195 200 205
Val Met Gln Tyr Asn Ala Gln Gly Thr Ala Lys Asp Glu Lys Gln Ala
210 215 220
Val Glu Asn Phe Lys Lys Gly Cys Lys Ser Ser Val Lys Glu Ala Cys
225 230 235 240
Asp Ala Leu Lys Glu Leu Lys Ile Glu Leu
245 250
(2) INFORMATION FOR SEQ ID NO: 572:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 103 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y128A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 572:
Pro Asp Ala Lys Glu Leu Val Asn Leu Gly Ile Glu Ser Ala Lys Lys
1 5 10 15
Gln Asp Phe Ala Gln Ala Lys Thr His Phe Glu Lys Ala Cys Glu Leu
20 25 30
Lys Asn Gly Phe Gly Cys Val Phe Leu Gly Ala Phe Tyr Glu Glu Gly
35 40 45
Lys Gly Val Gly Lys Asp Leu Lys Lys Ala Ile Gln Phe Tyr Thr Lys
50 55 60
Gly Cys Glu Leu Asn Asp Gly Tyr Gly Cys Asn Leu Leu Gly Asn Leu
65 70 75 80
Tyr Tyr Asn Gly Gln Gly Val Ser Lys Asp Ala Lys Lys Ala Ser Gln
85 90 95
Tyr Tyr Ser Lys Ala Cys Asp
100
(2) INFORMATION FOR SEQ ID NO: 573:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 114 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y128D
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 573:
Met Val Leu Gly Ser Leu His His Tyr Gly Val Gly Thr Pro Lys Asp
1 5 10 15
Leu Arg Lys Ala Leu Asp Leu Tyr Glu Lys Ala Cys Asp Leu Lys Asp
20 25 30
Ser Pro Gly Cys Ile Asn Ala Gly Tyr Ile Tyr Ser Val Thr Lys Asn
35 40 45
Phe Lys Glu Ala Ile Val Arg Tyr Ser Lys Ala Cys Glu Leu Lys Asp
50 55 60
Gly Arg Gly Cys Tyr Asn Leu Gly Val Met Gln Tyr Asn Ala Gln Gly
65 70 75 80
Thr Ala Lys Asp Glu Lys Gln Ala Val Glu Asn Phe Lys Lys Gly Cys
85 90 95
Lys Ser Ser Val Lys Glu Ala Cys Asp Ala Leu Lys Glu Leu Lys Ile
100 105 110
Glu Leu
(2) INFORMATION FOR SEQ ID NO: 574:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 156 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y146C
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 574:
Met Asn Leu Ser Glu Ile Glu Glu Leu Ile Lys Glu Phe Lys Ala Ser
1 5 10 15
Asp Leu Gly His Leu Lys Leu Lys His Glu His Phe Glu Leu Val Leu
20 25 30
Asp Lys Glu Ser Ala Tyr Ala Lys Lys Ser Ala Leu Asn Pro Ala His
35 40 45
Ser Pro Ala Pro Ile Met Val Glu Ala Ser Met Pro Ser Val Gln Thr
50 55 60
Pro Val Pro Met Val Cys Thr Pro Ile Val Asp Lys Lys Glu Asp Phe
65 70 75 80
Val Leu Ser Pro Met Val Gly Thr Phe Tyr His Ala Pro Ser Pro Gly
85 90 95
Ala Glu Pro Tyr Val Lys Ala Gly Asp Thr Leu Lys Lys Gly Gln Ile
100 105 110
Val Gly Ile Val Glu Ala Met Lys Ile Met Asn Glu Ile Glu Val Glu
115 120 125
Tyr Pro Cys Lys Val Val Ser Val Glu Val Gly Asp Ala Gln Pro Val
130 135 140
Glu Tyr Gly Thr Lys Leu Ile Lys Val Glu Lys Leu
145 150 155
(2) INFORMATION FOR SEQ ID NO: 575:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 89 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y146B
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 575:
Met Val Cys Thr Pro Ile Val Asp Lys Lys Glu Asp Phe Val Leu Ser
1 5 10 15
Pro Met Val Gly Thr Phe Tyr His Ala Pro Ser Pro Gly Ala Glu Pro
20 25 30
Tyr Val Lys Ala Gly Asp Thr Leu Lys Lys Gly Gln Ile Val Gly Ile
35 40 45
Val Glu Ala Met Lys Ile Met Asn Glu Ile Glu Val Glu Tyr Pro Cys
50 55 60
Lys Val Val Ser Val Glu Val Gly Asp Ala Gln Pro Val Glu Tyr Gly
65 70 75 80
Thr Lys Leu Ile Lys Val Glu Lys Leu
85
(2) INFORMATION FOR SEQ ID NO: 576:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 131 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 29b
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 576:
Met Ala Ile Phe Asp Asn Asn Asn Lys Ser Ala Asn Ala Lys Thr Gly
1 5 10 15
Pro Ala Thr Ile Ile Ala Gln Gly Thr Lys Ile Lys Gly Glu Leu His
20 25 30
Leu Asp Tyr His Leu His Val Asp Gly Glu Leu Glu Gly Val Val His
35 40 45
Ser Lys Ser Thr Val Val Ile Gly Gln Thr Gly Ser Val Val Gly Glu
50 55 60
Ile Phe Thr Asn Lys Leu Val Val Ser Gly Lys Phe Thr Gly Thr Val
65 70 75 80
Glu Ala Glu Val Val Glu Ile Met Pro Leu Gly His Leu Asp Gly Lys
85 90 95
Ile Ser Ser Gln Glu Leu Val Val Glu Arg Lys Gly Ile Leu Ile Gly
100 105 110
Glu Thr Arg Pro Lys Asn Ile Gln Gly Gly Ala Leu Leu Ile Asn Glu
115 120 125
Gln Glu Lys
130
(2) INFORMATION FOR SEQ ID NO: 577:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 236 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Z14B
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 577:
Pro Gly Asp Asp Thr Pro Ile Val Ala Gly Ser Ala Leu Arg Ala Leu
1 5 10 15
Glu Glu Ala Lys Ala Gly Asn Val Gly Glu Trp Gly Glu Lys Val Leu
20 25 30
Lys Leu Met Ala Glu Val Asp Ala Tyr Ile Pro Thr Pro Glu Arg Asp
35 40 45
Thr Glu Lys Thr Phe Leu Met Pro Val Glu Asp Val Phe Ser Ile Ala
50 55 60
Gly Arg Gly Thr Val Val Thr Gly Arg Ile Glu Arg Gly Val Val Lys
65 70 75 80
Val Gly Asp Glu Val Glu Ile Val Gly Ile Arg Pro Thr Gln Lys Thr
85 90 95
Thr Val Thr Gly Val Glu Met Phe Arg Lys Glu Leu Glu Lys Gly Glu
100 105 110
Ala Gly Asp Asn Val Gly Val Leu Leu Arg Gly Thr Lys Lys Glu Glu
115 120 125
Val Glu Arg Gly Met Val Leu Cys Lys Pro Gly Ser Ile Thr Pro His
130 135 140
Lys Lys Phe Glu Gly Glu Ile Tyr Val Leu Ser Lys Glu Glu Gly Gly
145 150 155 160
Arg His Thr Pro Phe Phe Thr Asn Tyr Arg Pro Gln Phe Tyr Val Arg
165 170 175
Thr Thr Asp Val Thr Gly Ser Ile Thr Leu Pro Glu Gly Val Glu Met
180 185 190
Val Met Pro Gly Asp Asn Val Lys Ile Thr Val Glu Leu Ile Ser Pro
195 200 205
Val Ala Leu Glu Leu Gly Thr Lys Phe Ala Ile Arg Glu Gly Gly Arg
210 215 220
Thr Val Gly Ala Gly Val Val Ser Asn Ile Ile Glu
225 230 235
(2) INFORMATION FOR SEQ ID NO: 578:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 649 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Z9A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 578:
Glu Ala Arg Lys Leu Leu Glu Glu Ala Lys Lys Ser Val Lys Ala Tyr
1 5 10 15
Leu Asp Cys Val Ser Gln Ala Lys Thr Glu Ala Glu Lys Lys Glu Cys
20 25 30
Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Leu Leu Glu Glu Xaa Ala
35 40 45
Lys Glu Ser Val Lys Ala Tyr Leu Asp Cys Val Ser Gln Ala Lys Asn
50 55 60
Glu Ala Glu Lys Lys Glu Cys Glu Lys Leu Leu Thr Leu Glu Ser Lys
65 70 75 80
Lys Lys Leu Glu Glu Ala Lys Lys Ser Val Lys Ala Tyr Leu Asp Cys
85 90 95
Val Ser Gln Ala Lys Thr Glu Ala Glu Lys Lys Glu Cys Glu Lys Leu
100 105 110
Leu Thr Pro Glu Ala Lys Lys Leu Leu Glu Gln Gln Ala Leu Asp Cys
115 120 125
Leu Lys Asn Ala Lys Thr Glu Ala Asp Lys Lys Arg Cys Val Lys Asp
130 135 140
Leu Pro Lys Asp Leu Gln Lys Lys Val Leu Ala Lys Glu Ser Leu Lys
145 150 155 160
Ala Tyr Lys Asp Cys Val Ser Lys Ala Arg Asn Glu Lys Glu Lys Lys
165 170 175
Glu Cys Glu Lys Leu Leu Thr Pro Glu Ala Lys Lys Leu Leu Glu Glu
180 185 190
Ala Lys Lys Ser Val Lys Ala Tyr Leu Asp Cys Val Ser Gln Ala Lys
195 200 205
Thr Glu Ala Glu Lys Lys Glu Cys Glu Lys Leu Leu Thr Pro Glu Ala
210 215 220
Arg Lys Leu Leu Glu Glu Ala Lys Glu Ser Val Lys Ala Tyr Lys Asp
225 230 235 240
Cys Val Ser Lys Ala Arg Asn Glu Lys Glu Lys Lys Glu Cys Glu Lys
245 250 255
Leu Leu Thr Pro Glu Ala Lys Lys Leu Leu Glu Gln Gln Val Leu Asp
260 265 270
Cys Leu Lys Asn Ala Lys Thr Glu Ala Asp Lys Lys Arg Cys Val Lys
275 280 285
Asp Leu Pro Lys Asp Leu Gln Lys Lys Val Leu Ala Lys Glu Ser Val
290 295 300
Lys Ala Tyr Leu Asp Cys Val Ser Arg Ala Arg Asn Glu Lys Glu Lys
305 310 315 320
Lys Glu Cys Glu Lys Leu Leu Thr Pro Glu Ala Lys Lys Leu Leu Glu
325 330 335
Glu Ala Lys Glu Ser Leu Lys Ala Tyr Lys Asp Cys Leu Ser Gln Ala
340 345 350
Arg Asn Glu Glu Glu Arg Arg Ala Cys Glu Lys Leu Leu Thr Pro Glu
355 360 365
Ala Arg Lys Leu Leu Glu Gln Glu Val Lys Lys Ser Ile Lys Ala Tyr
370 375 380
Leu Asp Cys Val Ser Arg Ala Arg Asn Glu Lys Glu Lys Lys Glu Cys
385 390 395 400
Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Phe Leu Ala Lys Gln Val
405 410 415
Leu Asn Cys Leu Glu Lys Ala Gly Asn Glu Glu Glu Arg Lys Ala Cys
420 425 430
Leu Lys Asn Leu Pro Lys Asp Leu Gln Glu Asn Ile Leu Ala Lys Glu
435 440 445
Ser Leu Lys Ala Tyr Lys Asp Cys Leu Ser Gln Ala Arg Asn Glu Glu
450 455 460
Glu Arg Arg Ala Cys Glu Lys Leu Leu Thr Pro Glu Ala Arg Lys Leu
465 470 475 480
Leu Glu Gln Glu Val Lys Lys Ser Val Lys Ala Tyr Leu Asp Cys Val
485 490 495
Ser Arg Ala Arg Asn Glu Lys Glu Lys Lys Glu Cys Glu Lys Leu Leu
500 505 510
Thr Pro Glu Ala Arg Lys Phe Leu Ala Lys Glu Leu Gln Gln Lys Asp
515 520 525
Lys Ala Ile Lys Asp Cys Leu Lys Asn Ala Asp Pro Asn Asp Arg Ala
530 535 540
Ala Ile Met Lys Cys Leu Asp Gly Leu Ser Asp Glu Glu Lys Leu Lys
545 550 555 560
Tyr Leu Gln Glu Ala Arg Glu Lys Ala Val Ala Asp Cys Leu Ala Met
565 570 575
Ala Lys Thr Asp Glu Glu Lys Arg Lys Cys Gln Asn Leu Tyr Ser Asp
580 585 590
Leu Ile Gln Glu Ile Gln Asn Lys Arg Thr Gln Asn Lys Gln Asn Gln
595 600 605
Leu Ser Lys Thr Glu Arg Leu His Gln Ala Ser Glu Cys Leu Asp Asn
610 615 620
Leu Asp Asp Pro Thr Asp Gln Glu Ala Ile Glu Gln Cys Leu Glu Gly
625 630 635 640
Leu Ser Asp Ser Glu Arg Ala Leu Ile
645
(2) INFORMATION FOR SEQ ID NO: 579:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 242 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 32a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 579:
Met Val Gln Phe Gln Asn Thr Leu Ile Lys Phe His Ala Leu Ser Phe
1 5 10 15
Lys Asn Ala Asn Leu Ile Tyr Asn Ala Lys Leu Asn Lys Thr Cys Tyr
20 25 30
Lys Glu Asn Ser Asn Thr Ile Ile Leu Arg Ile Lys Met Leu Thr Gln
35 40 45
Glu Asp Val Leu Asn Ala Leu Lys Thr Ile Ile Tyr Pro Asn Phe Glu
50 55 60
Lys Asp Ile Val Ser Phe Gly Phe Val Lys Asn Ile Thr Leu His Asp
65 70 75 80
Asp Gln Leu Gly Leu Leu Ile Glu Ile Pro Ser Ser Ser Glu Glu Thr
85 90 95
Ser Ala Ile Leu Arg Glu Asn Ile Ser Lys Ala Met Gln Glu Lys Gly
100 105 110
Val Lys Ala Leu Asn Leu Asp Ile Lys Thr Pro Pro Lys Pro Gln Ala
115 120 125
Pro Lys Pro Thr Thr Lys Asn Leu Ala Lys Asn Ile Lys His Val Val
130 135 140
Met Ile Ser Ser Gly Lys Gly Gly Val Gly Lys Ser Thr Thr Ser Val
145 150 155 160
Asn Leu Ser Ile Ala Leu Ala Asn Leu Asn Gln Lys Val Gly Leu Leu
165 170 175
Asp Ala Asp Val Tyr Gly Pro Asn Ile Pro Arg Met Met Gly Leu Gln
180 185 190
Ser Ala Asp Val Ile Met Asp Pro Ser Gly Lys Lys Leu Ile Pro Leu
195 200 205
Lys Ala Phe Gly Val Ser Val Met Ser Met Gly Leu Leu Tyr Asp Glu
210 215 220
Gly Gln Ser Leu Ile Trp Arg Gly Pro Met Leu Met Arg Ala Ile Glu
225 230 235 240
Gln Met
(2) INFORMATION FOR SEQ ID NO: 580:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 196 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y108B
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 580:
Met Leu Thr Gln Glu Asp Val Leu Asn Ala Leu Lys Thr Ile Ile Tyr
1 5 10 15
Pro Asn Phe Glu Lys Asp Ile Val Ser Phe Gly Phe Val Lys Asn Ile
20 25 30
Thr Leu His Asp Asp Gln Leu Gly Leu Leu Ile Glu Ile Pro Ser Ser
35 40 45
Ser Glu Glu Thr Ser Ala Ile Leu Arg Glu Asn Ile Ser Lys Ala Met
50 55 60
Gln Glu Lys Gly Val Lys Ala Leu Asn Leu Asp Ile Lys Thr Pro Pro
65 70 75 80
Lys Pro Gln Ala Pro Lys Pro Thr Thr Lys Asn Leu Ala Lys Asn Ile
85 90 95
Lys His Val Val Met Ile Ser Ser Gly Lys Gly Gly Val Gly Lys Ser
100 105 110
Thr Thr Ser Val Asn Leu Ser Ile Ala Leu Ala Asn Leu Asn Gln Lys
115 120 125
Val Gly Leu Leu Asp Ala Asp Val Tyr Gly Pro Asn Ile Pro Arg Met
130 135 140
Met Gly Leu Gln Ser Ala Asp Val Ile Met Asp Pro Ser Gly Lys Lys
145 150 155 160
Leu Ile Pro Leu Lys Ala Phe Gly Val Ser Val Met Ser Met Gly Leu
165 170 175
Leu Tyr Asp Glu Gly Gln Ser Leu Ile Trp Arg Gly Pro Met Leu Met
180 185 190
Arg Ala Ile Glu
195
(2) INFORMATION FOR SEQ ID NO: 581:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 95 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y150A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 581:
Met Leu Thr Gln Glu Asp Val Leu Asn Ala Leu Lys Thr Ile Ile Tyr
1 5 10 15
Pro Asn Phe Glu Lys Asp Ile Val Ser Phe Gly Phe Val Lys Asn Ile
20 25 30
Thr Leu His Asp Asp Gln Leu Gly Leu Leu Ile Glu Ile Pro Ser Ser
35 40 45
Ser Glu Glu Thr Ser Ala Ile Leu Arg Glu Asn Ile Ser Lys Ala Met
50 55 60
Gln Glu Lys Gly Val Lys Ala Leu Asn Leu Asp Ile Lys Thr Pro Pro
65 70 75 80
Lys Pro Gln Ala Pro Lys Pro Thr Thr Lys Asn Leu Ala Lys Asn
85 90 95
(2) INFORMATION FOR SEQ ID NO: 582:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 259 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 33
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 582:
Gly Asn Tyr Glu Glu Cys Leu Lys Leu Ile Lys Asp Lys Lys Leu Gln
1 5 10 15
Asp Gln Met Lys Lys Thr Leu Glu Ala Tyr Asn Asp Cys Ile Lys Asn
20 25 30
Ala Lys Thr Glu Glu Glu Arg Ile Lys Cys Leu Asp Leu Ile Lys Asp
35 40 45
Glu Asn Leu Lys Lys Ser Leu Leu Asn Gln Gln Lys Val Gln Val Ala
50 55 60
Leu Asp Cys Leu Lys Asn Ala Lys Thr Asp Glu Glu Arg Asn Glu Cys
65 70 75 80
Leu Lys Leu Ile Asn Asp Pro Glu Ile Arg Glu Lys Phe Arg Lys Glu
85 90 95
Leu Glu Leu Gln Lys Glu Leu Gln Glu Tyr Lys Asp Cys Ile Lys Asn
100 105 110
Ala Lys Thr Glu Ala Glu Lys Asn Lys Cys Leu Lys Gly Leu Ser Lys
115 120 125
Glu Ala Ile Glu Arg Leu Lys Gln Gln Ala Leu Asp Cys Leu Lys Asn
130 135 140
Ala Lys Thr Asp Glu Glu Arg Asn Glu Cys Leu Lys Asn Ile Pro Gln
145 150 155 160
Asp Leu Gln Lys Glu Leu Leu Ala Asp Met Ser Val Lys Ala Tyr Lys
165 170 175
Asp Cys Val Ser Lys Ala Arg Asn Glu Lys Glu Lys Gln Glu Cys Glu
180 185 190
Lys Leu Leu Thr Pro Glu Ala Arg Lys Lys Leu Glu Gln Gln Val Leu
195 200 205
Asp Cys Leu Lys Asn Ala Lys Thr Asp Glu Glu Arg Lys Lys Cys Leu
210 215 220
Lys Asp Leu Pro Lys Asp Leu Gln Ser Asp Ile Leu Ala Lys Glu Ser
225 230 235 240
Leu Lys Ala Tyr Lys Asp Cys Val Ser Gln Ala Lys Thr Glu Ala Glu
245 250 255
Lys Lys Glu
(2) INFORMATION FOR SEQ ID NO: 583:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 247 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 34
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 583:
Met Gln Phe Thr Gly Lys Asn Val Leu Ile Thr Gly Ala Ser Lys Gly
1 5 10 15
Ile Gly Ala Glu Ile Ala Lys Thr Leu Ala Ser Met Gly Leu Lys Val
20 25 30
Trp Ile Asn Tyr Arg Ser Asn Ala Glu Val Ala Asp Ala Leu Lys Asn
35 40 45
Glu Leu Glu Glu Lys Gly Tyr Lys Ala Ala Val Ile Lys Phe Asp Ala
50 55 60
Ala Ser Glu Ser Asp Phe Ile Glu Ala Ile Gln Thr Ile Val Gln Ser
65 70 75 80
Asp Gly Gly Leu Ser Tyr Leu Val Asn Asn Ala Gly Val Val Arg Asp
85 90 95
Lys Leu Ala Ile Lys Met Lys Thr Glu Asp Phe His His Val Ile Asp
100 105 110
Asn Asn Leu Thr Ser Ala Phe Ile Gly Cys Arg Glu Ala Leu Lys Val
115 120 125
Met Ser Lys Ser Arg Phe Gly Ser Val Val Asn Val Ala Ser Ile Ile
130 135 140
Gly Glu Arg Gly Asn Met Gly Gln Thr Asn Tyr Ser Ala Ser Lys Gly
145 150 155 160
Gly Met Ile Ala Met Ser Lys Ser Phe Ala Tyr Glu Gly Ala Leu Arg
165 170 175
Asn Ile Arg Phe Asn Ser Val Thr Pro Gly Phe Ile Glu Thr Asp Met
180 185 190
Asn Ala Asn Leu Lys Asp Glu Leu Lys Ala Asp Tyr Val Lys Asn Ile
195 200 205
Pro Leu Asn Arg Leu Gly Ser Ala Lys Glu Val Ala Glu Ala Val Ala
210 215 220
Phe Leu Leu Ser Asp His Ser Ser Tyr Ile Thr Gly Glu Thr Leu Lys
225 230 235 240
Val Asn Gly Gly Leu Tyr Met
245
(2) INFORMATION FOR SEQ ID NO: 584:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 343 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 35a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 584:
Met Ala Thr Lys Leu Thr Pro Lys Gln Lys Ala Gln Leu Asp Glu Leu
1 5 10 15
Ser Met Ser Glu Lys Ile Ala Ile Leu Leu Ile Gln Val Gly Glu Asp
20 25 30
Thr Thr Gly Glu Ile Leu Arg His Leu Asp Ile Asp Ser Ile Thr Glu
35 40 45
Ile Ser Lys Gln Ile Val Gln Leu Asn Gly Thr Asp Lys Gln Ile Gly
50 55 60
Ala Ala Val Leu Glu Glu Phe Phe Ala Ile Phe Gln Ser Asn Gln Tyr
65 70 75 80
Ile Asn Thr Gly Gly Leu Glu Tyr Ala Arg Glu Leu Leu Thr Arg Thr
85 90 95
Leu Gly Ser Glu Glu Ala Lys Lys Val Met Asp Lys Leu Thr Lys Ser
100 105 110
Leu Gln Thr Gln Lys Asn Phe Ala Tyr Leu Gly Lys Ile Lys Pro Gln
115 120 125
Gln Leu Ala Asp Phe Ile Ile Asn Glu His Pro Gln Thr Ile Ala Leu
130 135 140
Ile Leu Ala His Met Glu Ala Pro Asn Ala Ala Glu Thr Leu Ser Tyr
145 150 155 160
Phe Pro Asp Glu Met Lys Ala Glu Ile Ser Ile Arg Met Ala Asn Leu
165 170 175
Gly Glu Ile Ser Pro Gln Val Val Lys Arg Val Ser Thr Val Leu Glu
180 185 190
Asn Lys Leu Glu Ser Leu Thr Ser Tyr Lys Ile Glu Val Gly Gly Leu
195 200 205
Arg Ala Val Ala Glu Ile Phe Asn Arg Leu Gly Gln Lys Ser Ala Lys
210 215 220
Thr Thr Leu Ala Arg Ile Glu Ser Val Asp Asn Lys Leu Ala Gly Ala
225 230 235 240
Ile Lys Glu Met Met Phe Thr Phe Glu Asp Ile Val Lys Leu Asp Asn
245 250 255
Phe Ala Ile Arg Glu Ile Leu Lys Val Ala Asp Lys Lys Asp Leu Ser
260 265 270
Leu Ala Leu Lys Thr Ser Thr Lys Asp Leu Thr Asp Lys Phe Leu Asn
275 280 285
Asn Met Ser Ser Arg Ala Ala Glu Gln Phe Val Glu Glu Met Gln Tyr
290 295 300
Leu Gly Ala Val Lys Ile Lys Asp Val Asp Val Ala Gln Arg Lys Ile
305 310 315 320
Ile Glu Ile Val Gln Ser Leu Gln Glu Lys Gly Val Ile Gln Thr Gly
325 330 335
Glu Glu Glu Asp Val Ile Glu
340
(2) INFORMATION FOR SEQ ID NO: 585:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 36
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 585:
Pro Tyr Asn Glu Thr Glu Asn Ser Tyr Asn Tyr Thr Ser Asp Lys Val
1 5 10 15
Gly Thr Tyr Tyr Leu Thr Ser Asn Ile Lys Gly Phe Asn Gln Asn Asn
20 25 30
Lys Thr Pro Gly Thr Tyr Asn Ala Gln Asn Gln Pro Leu Gln Ala Leu
35 40 45
His Ile Tyr Asn Gln Ala Ile Thr Lys Gln Asp Leu Asn Met Ile Ala
50 55 60
Ser Leu Gly Lys Glu Phe Leu Pro Lys Ile Ala Asn Leu Leu Ser Ser
65 70 75 80
Gly Ala Leu Asp Asn Leu Asn Ser Pro Asn Ser Phe Glu Thr Leu Phe
85 90 95
Gly Ile Phe Glu Lys Tyr Gly Ile Thr Leu Asn Gln Glu Asn Trp Lys
100 105 110
Ser Leu Leu Lys Ile Ile Asn Asn Phe Ser Asn Thr Thr Asn Tyr Asp
115 120 125
Phe Ser Gln Gly Asn Leu Val Val Gly Ala Ile Lys Glu Gly Gln Thr
130 135 140
Asn Thr Lys Ser Val Val Trp Phe Gly Gly Glu Gly Tyr Lys Glu Pro
145 150 155 160
Cys Ala Val Gly Asp Asn Thr Cys Gln Met Phe Arg Gln Thr Asn Leu
165 170 175
Gly Gln Leu Leu His Ser Ser Thr Pro Tyr Leu Gly Tyr Ile Asn Ala
180 185 190
Asn Phe Arg Ala Lys Asn Ile Tyr Ile Thr Gly Thr Ile Gly Ser Gly
195 200 205
Asn Ala Trp Gly Ser Gly Gly Ser Ala Asn Val
210 215
(2) INFORMATION FOR SEQ ID NO: 586:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 294 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y153A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 586:
Met Gly Asp Val Glu Ile Trp Glu Met Ala Glu Pro Leu Leu Tyr Leu
1 5 10 15
Gly Arg Asn Val Lys Ala Asn Thr Gly Gly Tyr Gly Lys Tyr Arg Gly
20 25 30
Gly Asn Gly Phe Glu Thr Leu Arg Met Val Trp Gly Ala His Asp Trp
35 40 45
Thr Met Phe Phe Met Gly Asn Gly Tyr Met Asn Ser Asp Trp Gly Met
50 55 60
Met Gly Gly Tyr Pro Ala Ala Ser Gly Tyr Arg Phe Glu Ala His Asn
65 70 75 80
Thr Asp Leu Lys Asn Arg Ile Lys Asn Asn Ala Ser Leu Pro Leu Gly
85 90 95
Gly Asp Phe Asn Pro Thr Asp Arg Asp Tyr Glu Lys His Ile Ser His
100 105 110
Ala Ser Gln Val Lys Arg Asp Lys Gln Cys Ile Thr Thr Glu Asn Cys
115 120 125
Phe Asp Asn Tyr Asp Leu Tyr Leu Asn Tyr Ile Lys Gly Gly Pro Gly
130 135 140
Phe Gly Asp Pro Ile Glu Arg Asp Leu Asn Ala Ile Leu Glu Asp Leu
145 150 155 160
Asn Ser Lys Gln Leu Leu Pro Glu Tyr Ala Tyr Lys Val Tyr Gly Ala
165 170 175
Val Val Ser Gln Asn Lys Asp Gly Val Trp Val Gly Asp Glu Ala Lys
180 185 190
Thr Lys Ala Arg Arg Lys Glu Ile Leu Glu Asn Arg Lys Ala Arg Ser
195 200 205
Ile Pro Val Lys Gln Trp Met Glu Gln Glu Arg Asn Ala Ile Leu Glu
210 215 220
Lys Glu Ala Ser Lys Gln Val Lys His Met Tyr Ala Thr Ser Phe Asp
225 230 235 240
Leu Ser Pro Lys Phe Leu Asn Asp Phe Lys Thr Phe Trp Asn Leu Pro
245 250 255
Lys Asn Trp Ser Val Lys Glu Asp Glu Leu Gly Val Phe Thr Tyr Gly
260 265 270
Ser Lys Tyr Arg Met Asp Leu Ser Lys Leu Pro Asp Val Arg Thr Val
275 280 285
Leu Leu Val Asp Glu Lys
290
(2) INFORMATION FOR SEQ ID NO: 587:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Y56A
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 587:
Pro Gly Phe Gly Asp Pro Ile Glu Arg Asp Leu Asn Ala Ile Leu Glu
1 5 10 15
Asp Leu Asn Ser Lys Gln Leu Leu Pro Glu Tyr Ala Tyr Lys Val Tyr
20 25 30
Gly Ala Val Val Ser Gln Asn Lys Asp Gly Val Trp Val Gly Asp Glu
35 40 45
Ala Lys Thr Lys Ala Arg Arg Lys Glu Ile Leu Glu Asn Arg Lys Ala
50 55 60
Arg Ser Ile Pro Val Lys Gln Trp
65 70
(2) INFORMATION FOR SEQ ID NO: 588:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 142 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 38
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 588:
Leu Asn Asp Leu Glu Ile Thr Ser Pro Val Ser Lys Ala Ala Ser Val
1 5 10 15
Ala Asp Trp Glu Glu Ile Val Lys Glu Tyr Glu Lys Thr Tyr Ala Arg
20 25 30
Val Tyr Ser Glu Ser Ala Cys Ser Pro Glu Leu Gly Phe Ser Val Thr
35 40 45
Gly Val Ile Met Arg Gly Val Val Ala Thr Gln Lys Pro Val Ile Pro
50 55 60
Val Glu Lys Glu His Gly Ala Thr Pro Pro Lys Glu Ala Lys Ile Gly
65 70 75 80
Val Arg Lys Phe Tyr Arg His Lys Lys Trp Val Asp Ala Asp Val Trp
85 90 95
Gln Met Glu Lys Leu Leu Pro Gly Asn Glu Val Ile Gly Pro Ala Ile
100 105 110
Val Glu Ser Asp Ala Thr Thr Phe Val Ile Pro Lys Gly Phe Ala Thr
115 120 125
Arg Leu Asp Lys His Arg Leu Phe His Leu Lys Glu Ile Lys
130 135 140
(2) INFORMATION FOR SEQ ID NO: 589:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 262 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 40
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 589:
Met Asn Gly Ser Asn His Met Lys Asn Lys Thr Leu Val Ile Ser Gly
1 5 10 15
Ala Thr Arg Gly Ile Gly Lys Ala Ile Phe Val Arg Phe Ala Gln Ser
20 25 30
Gly Val Asn Ile Ala Phe Thr Tyr Asn Lys Asn Val Glu Glu Ala Asn
35 40 45
Lys Ile Ile Glu Asp Val Glu Gln Lys Tyr Ser Ile Lys Ala Lys Ala
50 55 60
Tyr Ser Leu Asn Val Leu Glu Pro Glu Gln Tyr Thr Glu Leu Phe Lys
65 70 75 80
Gln Ile Asp Ala Asp Phe Asp Arg Val Asp Phe Phe Ile Ser Asn Ala
85 90 95
Ile Ile Tyr Gly Arg Ser Val Val Gly Gly Phe Ala Pro Phe Met Arg
100 105 110
Leu Lys Pro Lys Gly Leu Asn Asn Ile Tyr Thr Ala Thr Val Leu Ala
115 120 125
Phe Val Val Gly Ala Gln Glu Ala Ala Lys Arg Met Gln Lys Ile Gly
130 135 140
Gly Gly Ala Ile Val Ser Leu Ser Ser Thr Gly Asn Leu Val Tyr Met
145 150 155 160
Pro Asn Tyr Ala Gly His Gly Asn Ser Lys Asn Ala Val Glu Thr Met
165 170 175
Val Lys Tyr Ala Ala Val Asp Leu Gly Glu Phe Asn Ile Arg Val Asn
180 185 190
Ala Val Ser Gly Gly Pro Ile Asp Thr Asp Ala Leu Lys Ala Phe Pro
195 200 205
Asp Tyr Val Glu Ile Lys Glu Lys Val Glu Glu Gln Ser Pro Leu Lys
210 215 220
Arg Met Gly Asn Pro Asn Asp Leu Ala Gly Ala Ala Tyr Phe Leu Cys
225 230 235 240
Asp Glu Thr Gln Ser Gly Trp Leu Thr Gly Gln Thr Ile Val Val Asp
245 250 255
Gly Gly Thr Thr Phe Lys
260
(2) INFORMATION FOR SEQ ID NO: 590:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 300 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 41
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 590:
Met Arg Lys Thr Ile Ser Ala Leu Phe Leu Ser Ala Cys Ile Gly Leu
1 5 10 15
Ser Ser Val Tyr Ala Asp Asn Ala Leu Ile Leu Gln Thr Asp Phe Ser
20 25 30
Leu Lys Asp Gly Ala Val Ser Ala Met Lys Gly Val Ala Phe Ser Val
35 40 45
Asp Ser His Leu Lys Ile Phe Asp Leu Thr His Glu Ile Pro Pro Tyr
50 55 60
Asn Ile Trp Glu Gly Ala Tyr Arg Leu Tyr Gln Thr Ala Ser Tyr Trp
65 70 75 80
Pro Lys Gly Ser Val Phe Val Ser Val Val Asp Pro Gly Val Gly Thr
85 90 95
Lys Arg Lys Ser Val Val Leu Lys Thr Lys Asn Gly Gln Tyr Phe Val
100 105 110
Ser Pro Asp Asn Gly Thr Leu Thr Leu Val Ala Gln Thr Leu Gly Ile
115 120 125
Asp Ser Val Arg Glu Ile Asp Glu Lys Ala Asn Arg Leu Lys Gly Ser
130 135 140
Glu Lys Ser Tyr Thr Phe His Gly Arg Asp Val Tyr Ala Tyr Thr Gly
145 150 155 160
Ala Arg Leu Ala Ser Gly Ala Ile Thr Phe Glu Gln Val Gly Pro Glu
165 170 175
Leu Pro Pro Lys Val Val Glu Ile Pro Tyr Gln Lys Ala Lys Ala Thr
180 185 190
Lys Gly Glu Val Lys Gly Asn Ile Pro Ile Leu Asp Ile Gln Tyr Gly
195 200 205
Asn Val Trp Ser Asn Ile Ser Asp Lys Leu Leu Asn Gln Ala Lys Ile
210 215 220
Lys Leu Asn Asp Thr Leu Cys Val Thr Ile Phe Lys Gly Ser Lys Lys
225 230 235 240
Gln Tyr Glu Gly Lys Met Pro Tyr Val Ala Ser Phe Gly Asp Val Pro
245 250 255
Glu Gly Gln Pro Leu Val Tyr Leu Asn Ser Leu Leu Asn Val Ser Val
260 265 270
Ala Leu Asn Arg Asp Asn Phe Ala Gln Lys Tyr Gln Ile Lys Ser Gly
275 280 285
Ala Asp Trp Asn Ile Asp Ile Lys Lys Cys Ala Lys
290 295 300
(2) INFORMATION FOR SEQ ID NO: 591:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 530 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 43
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 591:
Ala Gly Ile Val Arg Asp Tyr Tyr Leu Trp Arg Tyr Ile Ser Asp Lys
1 5 10 15
Lys Thr Ser Leu Glu Asn Ala Lys Lys Ala Tyr Glu Leu Thr Gln Asn
20 25 30
Lys Asn Asn Ala Leu Gln Lys Ala Met Gln Glu Lys Gly Ser Asp Asn
35 40 45
Ala Glu Lys Asn Pro Asp Val Lys Leu Pro Glu Asp Ile Tyr Cys Lys
50 55 60
Gln Thr Ala Leu Glu Ser Met Leu Glu Thr Thr Asp Thr Phe Gln Ala
65 70 75 80
Ser Cys Ile Ala Ile Ala Leu Lys Ser Lys Ile Arg Asp Phe Asp Lys
85 90 95
Ile Pro Ile Glu Thr Leu Lys Pro Leu Gln Ile Lys Ile Lys Glu Ala
100 105 110
Tyr Pro Val Leu Tyr Glu Glu Leu Glu Ile Leu Gln Ser Lys His Val
115 120 125
Ser Ala Ser Leu Phe Lys Ala Asn Ala Gln Val Phe Ser Ala Leu Phe
130 135 140
Asn His Leu Ser Tyr Glu Lys Lys Leu Gln Ile Phe Glu Lys His Ile
145 150 155 160
Pro Ile Lys Glu Leu Asn Arg Leu Leu Asp Glu Asn Tyr Pro Ala Phe
165 170 175
Asn Arg Leu Ile Tyr Gln Val Ile Leu Asp Pro Lys Leu Asp His Phe
180 185 190
Lys Asp Ala Leu Thr Lys Ser Asn Ala Thr His Ser Asn Ala Gln Thr
195 200 205
Phe Phe Ile Leu Gly Ile Asn Glu Ile Leu Arg Lys Lys Pro Ser Lys
210 215 220
Ala Leu Lys Tyr Phe Glu Arg Ser Glu Ala Val Val Lys Asp Asp Asp
225 230 235 240
Phe Ser Lys Asp Arg Ala Ile Phe Trp Gln Tyr Leu Val Ser Lys Lys
245 250 255
Lys Lys Thr Leu Glu Arg Leu Ser Gln Ser Pro Ala Leu Asn Leu Tyr
260 265 270
Ser Leu Tyr Ala Ser Arg Lys Leu Lys Thr Thr Pro Ser Tyr Arg Ile
275 280 285
Ile Ser Arg Ile Gln Asn Leu Ser Gln Glu Asp Pro Pro Phe Asn Thr
290 295 300
Tyr Asp Pro Phe Ser Trp Gln Ile Phe Lys Glu Lys Thr Leu Ser Leu
305 310 315 320
Lys Asp Glu Gly Ala Phe Asn Ala Met Leu Lys Ser Leu Tyr Tyr Glu
325 330 335
Lys Ser Ala Pro Glu Leu Thr Tyr Leu Leu Ser Gln Arg Asn Lys Asp
340 345 350
Lys Ile Tyr Tyr Tyr Leu Ser Pro Tyr Glu Gly Ile Ile Glu Trp Gln
355 360 365
Asn Thr Asp Glu Lys Ala Met Ala Tyr Ala Ile Ala Arg Gln Glu Ser
370 375 380
Phe Leu Leu Pro Ala Val Ile Ser Arg Ser Phe Ala Leu Gly Leu Met
385 390 395 400
Gln Ile Met Pro Phe Asn Val Gly Pro Phe Ala Lys Ser Leu Gly Met
405 410 415
Asp Asn Ile Asp Leu Asn Asp Met Phe Asn Pro Asn Ile Ala Leu Lys
420 425 430
Phe Gly Asn Tyr Tyr Leu Asn His Leu Lys Lys Glu Phe Asn His Pro
435 440 445
Leu Phe Val Ala Tyr Ala Tyr Asn Ala Gly Pro Gly Phe Leu Arg Arg
450 455 460
Trp Leu Glu Ser Ser Lys Arg Phe Lys Glu Lys Asn His Phe Glu Pro
465 470 475 480
Trp Leu Ser Met Glu Leu Met Pro Tyr Ser Glu Thr Arg Met Tyr Gly
485 490 495
Phe Arg Val Met Leu Asn Tyr Leu Ile Tyr Gln Glu Ile Phe Gly Asn
500 505 510
Phe Ile Pro Ile Asp Gly Phe Leu Glu Gln Thr Leu Asn Ser Lys Asp
515 520 525
Lys Pro
530
(2) INFORMATION FOR SEQ ID NO: 592:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 266 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 43a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 592:
Cys Leu Phe Pro Ala Ala Gly Tyr Gly Thr Arg Phe Leu Pro Ile Thr
1 5 10 15
Lys Thr Ile Pro Lys Glu Met Leu Pro Ile Val Asp Lys Pro Leu Ile
20 25 30
Gln Tyr Ala Val Glu Glu Ala Met Glu Ala Gly Cys Glu Val Met Ala
35 40 45
Ile Val Thr Gly Arg Asn Lys Arg Ser Leu Glu Asp Tyr Phe Asp Thr
50 55 60
Ser Tyr Glu Ile Glu His Gln Ile Gln Gly Thr Asn Lys Glu Asn Ala
65 70 75 80
Leu Lys Ser Ile Arg Asn Ile Ile Glu Lys Cys Cys Phe Ser Tyr Val
85 90 95
Arg Gln Lys Gln Met Lys Gly Leu Gly His Ala Ile Leu Thr Gly Glu
100 105 110
Ala Leu Ile Gly Asn Glu Pro Phe Ala Val Ile Leu Ala Asp Asp Leu
115 120 125
Cys Ile Ser His Asp His Pro Ser Val Leu Lys Gln Met Thr Ser Leu
130 135 140
Tyr Gln Lys Tyr Gln Cys Ser Ile Val Ala Ile Glu Glu Val Ala Leu
145 150 155 160
Glu Glu Val Ser Lys Tyr Gly Val Ile Arg Gly Glu Trp Leu Glu Glu
165 170 175
Gly Val Tyr Glu Ile Lys Asp Met Val Glu Lys Pro Asn Gln Glu Asp
180 185 190
Ala Pro Ser Asn Leu Ala Val Ile Gly Arg Tyr Ile Leu Thr Pro Asp
195 200 205
Ile Phe Glu Ile Leu Ser Glu Thr Lys Pro Gly Lys Asn Asn Glu Ile
210 215 220
Gln Ile Thr Asp Ala Leu Arg Thr Gln Ala Lys Arg Lys Arg Ile Ile
225 230 235 240
Ala Tyr Gln Phe Lys Gly Lys Arg Tyr Asp Cys Gly Ser Val Glu Gly
245 250 255
Tyr Ile Glu Ala Ser Asn Ala Tyr Tyr Lys
260 265
(2) INFORMATION FOR SEQ ID NO: 593:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 130 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 44
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 593:
Gln Lys His Ser Asn Thr Pro Val Leu Leu Asp Ile Thr Ser Phe Asp
1 5 10 15
Trp Ser Asp Arg Lys Met Gln Leu Glu Leu Phe Pro Ile Asp Leu Pro
20 25 30
Tyr Ala Ser Ala Lys Glu Ile Ala Ile Ala Lys Met Gln His Leu Pro
35 40 45
Lys Leu Val Arg Asp Ala Leu Lys Cys Met Gly Phe Asp Arg Val Ser
50 55 60
Gln Glu Ile Val Phe Glu Tyr Glu Pro Lys Leu Leu Lys Pro Ser Arg
65 70 75 80
Leu Thr Tyr Phe Phe Gly Tyr Phe Gln Asp Pro Arg Tyr Phe Asp Ala
85 90 95
Ile Ser Pro Leu Ile Lys Gln Thr Phe Thr Leu Pro Pro Pro Pro Pro
100 105 110
Lys Ile Ile Arg Ile Ile Ile Lys Lys Arg Lys Asn Ile Ser Ala Ser
115 120 125
Phe Leu
130
(2) INFORMATION FOR SEQ ID NO: 594:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 236 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 45a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 594:
Leu Lys His Leu Thr Pro Leu Thr His Thr Ile Phe Lys Ala Leu Trp
1 5 10 15
Leu Gly Thr Ala Leu Ser Ala Ser Leu Ser Leu Ala Ala Thr Glu Ser
20 25 30
Pro Thr Lys Thr Glu Pro Lys Pro Ala Lys Gly Val Lys Asn Lys Pro
35 40 45
Lys Ser Pro Val Thr Lys Val Met Met Thr Asn Cys Asp Asn Ile Lys
50 55 60
Asp Phe Asn Ala Lys Gln Lys Glu Val Leu Lys Ala Ala Tyr Gln Phe
65 70 75 80
Gly Ser Lys Glu Asn Leu Gly Tyr Glu Met Ala Gly Ile Ala Trp Lys
85 90 95
Glu Ser Cys Ala Gly Val Tyr Lys Ile Asn Phe Ser Asp Pro Ser Ala
100 105 110
Gly Val Tyr His Ser Tyr Ile Pro Ser Val Leu Lys Ser Tyr Gly His
115 120 125
Asn Asp Ser Pro Phe Leu Arg Asn Val Met Gly Glu Leu Leu Ile Lys
130 135 140
Asp Asp Ala Phe Ala Ser Glu Val Ala Leu Lys Glu Leu Leu Tyr Trp
145 150 155 160
Lys Thr Arg Tyr His Asp Asn Leu Lys Asp Met Ile Lys Ser Tyr Asn
165 170 175
Lys Gly Ser Arg Trp Glu Arg Ser Glu Lys Ser Asn Ala Asp Ala Glu
180 185 190
Lys Tyr Tyr Glu Glu Ile Gln Asp Arg Ile Arg Arg Leu Lys Glu Ser
195 200 205
Lys Ile Phe Asp Ser Gln Ser Ser Asn Asp Gln Glu Leu Gln Lys Ser
210 215 220
Ala Asn Ser Asn Leu Asp Leu Asp Pro Ile Gly Asn
225 230 235
(2) INFORMATION FOR SEQ ID NO: 595:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 45b
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:595:
Leu His Ser Asp Glu Leu Leu Val Glu Ile Leu Val Glu Glu Leu Pro
1 5 10 15
Ala Gln Ala Leu Leu Asn Glu Tyr Lys Glu Met Pro Lys Lys Leu His
20 25 30
Ala Leu Phe Asn Lys Arg Ala Leu Glu Val Gly Asn Ile Glu Ile Phe
35 40 45
Tyr Thr Pro Arg Arg Leu Cys Leu Leu Ile Lys Asp Phe Pro Leu Leu
50 55 60
Thr Gln Glu Thr Lys Glu Glu Phe Phe Gly Pro Pro Val Lys Ile Ala
65 70 75 80
Cys Asn His Gln Asp Lys Thr Gln Gly Leu Asn Ala Leu Gly Leu Gly
85 90 95
Phe Tyr Gln Lys Leu Gly Leu Lys Asp His Gln Tyr Phe Gln Thr Ala
100 105 110
Phe Lys Asn Asn Lys Glu Val Leu Tyr His Ala Lys Ile His Glu Lys
115 120 125
Glu Pro Thr Lys Asp Leu Ile Met Pro Ile Val Leu Glu Phe Leu Glu
130 135 140
Gly Leu
145
(2) INFORMATION FOR SEQ ID NO:596:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 271 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 46b
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:596:
Met Asn Ala Phe Lys Arg Ile Ile Ser Val Gly Val Ile Ala Leu Gly
1 5 10 15
Leu Phe Asn Leu Leu Asp Ala Lys His His Lys Glu Lys Lys Glu Asn
20 25 30
His Lys Ile Thr Arg Glu Leu Lys Val Gly Ala Asn Pro Val Pro His
35 40 45
Ala Gln Ile Leu Gln Ser Val Val Asp Asp Leu Lys Glu Lys Gly Ile
50 55 60
Lys Leu Val Ile Val Ser Phe Thr Asp Tyr Val Leu Pro Asn Leu Ala
65 70 75 80
Leu Asn Asp Gly Ser Leu Asp Ala Asn Tyr Phe Gln His Arg Pro Tyr
85 90 95
Leu Asp Arg Phe Asn Leu Asp Arg Lys Met His Leu Val Gly Leu Ala
100 105 110
Asn Ile His Val Glu Pro Leu Arg Phe Tyr Ser Gln Lys Ile Thr Asp
115 120 125
Ile Lys Asn Leu Lys Lys Gly Ser Val Ile Ala Val Pro Asn Asp Pro
130 135 140
Ala Asn Gln Gly Arg Ala Leu Ile Leu Leu His Lys Gln Gly Leu Ile
145 150 155 160
Ala Leu Lys Asp Pro Ser Asn Leu Tyr Ala Thr Glu Phe Asp Ile Val
165 170 175
Lys Asn Pro Tyr Asn Ile Lys Ile Lys Pro Leu Glu Ala Ala Leu Leu
180 185 190
Pro Lys Val Leu Gly Asp Val Asp Gly Ala Ile Ile Thr Gly Asn Tyr
195 200 205
Ala Leu Gln Ala Lys Leu Thr Gly Ala Leu Phe Ser Glu Asp Lys Asp
210 215 220
Ser Pro Tyr Ala Asn Leu Ile Ala Ala Arg Glu Asp Asn Ala Gln Asp
225 230 235 240
Glu Ala Ile Lys Thr Leu Ile Glu Ala Leu Gln Ser Glu Lys Thr Arg
245 250 255
Lys Phe Ile Leu Asp Thr Tyr Lys Gly Ala Ile Ile Pro Ala Phe
260 265 270
(2) INFORMATION FOR SEQ ID NO:597:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 47a
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:597:
Met Phe Gln Ile Arg Trp His Ala Arg Ala Gly Gln Gly Ala Ile Thr
1 5 10 15
Gly Ala Lys Gly Leu Ala Asp Val Ile Ser Lys Thr Gly Lys Glu Val
20 25 30
Gln Ala Phe Ala Ser Tyr Gly Ser Ala Lys Arg Gly Ala Ala Met Met
35 40 45
Ala Tyr Asn Arg Val Asp Asp Glu Pro Ile Leu Asn His Glu Arg Phe
50 55 60
Met Gln Pro Asp Tyr Val Leu Val Ile Asp Pro Gly Leu Val Phe Ile
65 70 75 80
Glu Asn Ile Phe Ala Asn Glu Lys Glu Asp Thr Thr Tyr Ile Ile Thr
85 90 95
Ser Tyr Leu Asn Lys Glu Glu Leu Phe Glu Lys Lys Pro Glu Leu Lys
100 105 110
Thr Arg Lys Val Phe Leu Val Asp Cys Leu Lys Ile Ser Met Glu Thr
115 120 125
Leu Lys Arg Pro Ile Pro Asn Thr Pro Met Leu Gly Ala Leu Met Lys
130 135 140
Val Ser Gly Met Leu Glu Ile Gly Ala Phe Lys Glu Ala Phe Lys Lys
145 150 155 160
Val Leu Gly Lys Lys Leu Thr Gln Glu Val Ile Asp Ala Asn Met Leu
165 170 175
Ala Ile Gln Arg Ala Tyr Glu Glu Val Gln
180 185
(2) INFORMATION FOR SEQ ID NO:598:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 49
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:598:
Glu Glu Glu Asn Asn Gly Ser Gly Thr Lys Lys Val Phe Leu Ile Val
1 5 10 15
Ala Ile Ala Ile Ile Ile Leu Ala Val Leu Leu Met Val Phe Trp Lys
20 25 30
Ser Thr Arg Val Ala Pro Lys Glu Thr Phe Leu Gln Thr Asp Ser Gly
35 40 45
Met Gln Lys Ile Gly Asn Thr Lys Asp Glu Lys Lys Asp Asp Glu Phe
50 55 60
Glu Ser Leu Asn Leu Asp Pro Ser Lys Gln Glu Asp Lys Leu Asp Lys
65 70 75 80
Val Ala Asp Asn Val Lys Lys Gln Glu Asn Asp Ala Phe Asn Met Pro
85 90 95
Thr Gln Thr Asp Gln Thr Gln Thr Glu Met Lys Thr Thr Glu Glu Thr
100 105 110
Gln Glu Ala Gln Lys Gly Leu Lys Val Val Glu His Thr Ser Thr Gln
115 120 125
Lys Glu Ser Gln Ala Val Ala Lys Lys Glu Ile Ser His Lys Lys Pro
130 135 140
Lys Ala Thr Pro Lys Asp Lys Glu Ala His Lys Asp Lys Asp Lys His
145 150 155 160
Ala Val Lys Glu Leu Lys Val Lys Lys Glu Ala His Lys Glu Val Pro
165 170 175
Lys Lys Ala Asn Ser Lys Thr Thr Leu Thr Lys Gly His Tyr Leu Gln
180 185 190
Val Gly Val Phe Ala His Thr Pro Asn Lys Ala Phe Leu Gln Ala Phe
195 200 205
Asn Gln Phe Pro His Lys Ile Glu Asp Arg Gly Ser Thr Lys Arg Tyr
210 215 220
Leu Ile Gly Pro Tyr Lys Asn Lys Gln Glu Ala Leu Met His Ala Asp
225 230 235 240
Glu Val Ser Lys Lys Met Thr Lys Pro Val Val Ile Glu Ala Arg
245 250 255
(2) INFORMATION FOR SEQ ID NO:599:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 273 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 50
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:599:
Met Ala Phe Asn Tyr Asp Glu Tyr Leu Arg Val Asp Lys Ile Pro Thr
1 5 10 15
Leu Trp Cys Trp Gly Cys Gly Asp Gly Val Ile Leu Lys Ser Ile Ile
20 25 30
Arg Thr Ile Asp Ala Leu Gly Trp Lys Met Asp Asp Val Cys Leu Val
35 40 45
Ser Gly Ile Gly Cys Ser Gly Arg Met Ser Ser Tyr Val Asn Cys Asn
50 55 60
Thr Val His Thr Thr His Gly Arg Ala Val Ala Tyr Ala Thr Gly Ile
65 70 75 80
Lys Met Ala Asn Pro Ser Lys His Val Ile Val Val Ser Gly Asp Gly
85 90 95
Asp Gly Phe Ala Ile Gly Gly Asn His Thr Met His Ala Cys Arg Arg
100 105 110
Asn Ile Asp Leu Asn Phe Ile Leu Val Asn Asn Phe Ile Tyr Gly Leu
115 120 125
Thr Asn Ser Gln Thr Ser Pro Thr Thr Pro Asn Gly Met Trp Thr Val
130 135 140
Thr Ala Gln Trp Gly Asn Ile Asp Asn Gln Phe Asp Pro Cys Ala Leu
145 150 155 160
Thr Thr Ala Ala Gly Ala Ser Phe Val Ala Arg Glu Ser Val Leu Asp
165 170 175
Pro Gln Lys Leu Glu Lys Val Leu Lys Glu Gly Phe Ser His Lys Gly
180 185 190
Phe Ser Phe Phe Asp Val His Ser Asn Cys His Ile Asn Leu Gly Arg
195 200 205
Lys Asn Lys Met Gly Glu Ala Ser Gln Met Leu Lys Trp Met Glu Ser
210 215 220
Arg Leu Val Ser Lys Arg Gln Phe Glu Ala Met Ser Pro Glu Glu Arg
225 230 235 240
Val Asp Lys Phe Pro Thr Gly Val Leu Lys His Asp Thr Asp Arg Lys
245 250 255
Glu Tyr Cys Glu Ala Tyr Gln Glu Ile Ile Glu Lys Ala Gln Gly Lys
260 265 270
Gln
(2) INFORMATION FOR SEQ ID NO:600:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 51
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:600:
Met Ser Val Leu Asn Ala Lys Glu Cys Val Ser Pro Ile Thr Arg Ser
1 5 10 15
Val Lys Tyr His Gln Gln Ser Ala Glu Ile Arg Ala Leu Gln Leu Gln
20 25 30
Ser Tyr Lys Met Ala Lys Met Ala Leu Asp Asn Asn Leu Lys Leu Val
35 40 45
Lys Asp Lys Lys Pro Ala Val Ile Leu Asp Leu Asp Glu Thr Val Leu
50 55 60
Asn Thr Phe Asp Tyr Ala Gly Tyr Leu Ile Lys Asn Cys Ile Lys Tyr
65 70 75 80
Thr Pro Glu Thr Trp Asp Lys Phe Glu Lys Glu Gly Ser Leu Thr Leu
85 90 95
Ile Pro Gly Ala Leu Asp Phe Leu Glu Tyr Ala Asn Ser Lys Gly Val
100 105 110
Lys Ile Phe Tyr Ile Ser Asn Arg Thr Gln Lys Asn Lys Ala Phe Thr
115 120 125
Leu Lys Thr Leu Lys Ser Phe Lys Leu Pro Gln Val Ser Glu Glu Ser
130 135 140
Val Leu Leu Lys Glu Lys Gly Lys Pro Lys Ala Val Arg Arg Glu Leu
145 150 155 160
Val Ala Lys Asp Tyr Ala Ile Val Leu Gln Val Gly Asp Thr Leu His
165 170 175
Asp Phe Asp Ala Ile Phe Ala Lys Asp Ala Lys Asn Ser Gln Glu Gln
180 185 190
Arg Ala Lys Val Leu Gln Asn Ala Gln Lys Phe Gly Thr Glu Trp Ile
195 200 205
Ile Leu Pro Asn Ser Leu Tyr Gly Thr Trp Glu Asp Glu Pro Ile Lys
210 215 220
Ala Trp Gln Asn Lys Lys
225 230
(2) INFORMATION FOR SEQ ID NO:601:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 216 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for Z3C
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:601:
Ile Thr Arg Ser Val Lys Tyr His Gln Gln Ser Ala Glu Ile Arg Ala
1 5 10 15
Leu Gln Leu Gln Ser Tyr Lys Met Ala Lys Met Ala Leu Asp Asn Asn
20 25 30
Leu Lys Leu Val Lys Asp Lys Lys Pro Ala Val Ile Leu Asp Leu Asp
35 40 45
Glu Thr Val Leu Asn Thr Phe Asp Tyr Ala Gly Tyr Leu Ile Lys Asn
50 55 60
Cys Ile Lys Tyr Thr Pro Glu Thr Trp Asp Lys Phe Glu Lys Glu Gly
65 70 75 80
Ser Leu Thr Leu Ile Pro Gly Ala Leu Asp Phe Leu Glu Tyr Ala Asn
85 90 95
Ser Lys Gly Val Lys Ile Phe Tyr Ile Ser Asn Arg Thr Gln Lys Asn
100 105 110
Lys Ala Phe Thr Leu Lys Thr Leu Lys Ser Phe Lys Leu Pro Gln Val
115 120 125
Ser Glu Glu Ser Val Leu Leu Lys Glu Lys Gly Lys Pro Lys Ala Val
130 135 140
Arg Arg Glu Leu Val Ala Lys Asp Tyr Ala Ile Val Leu Gln Val Gly
145 150 155 160
Asp Thr Leu His Asp Phe Asp Ala Ile Phe Ala Lys Asp Ala Lys Asn
165 170 175
Ser Gln Glu Gln Arg Ala Lys Val Leu Gln Asn Ala Gln Lys Phe Gly
180 185 190
Thr Glu Trp Ile Ile Leu Pro Asn Ser Leu Tyr Gly Thr Trp Glu Asp
195 200 205
Glu Pro Ile Lys Ala Trp Gln Asn
210 215
(2) INFORMATION FOR SEQ ID NO:602:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 130 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
(B) CLONE: expressed antigen for cluster 61
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:602:
Leu Lys Gln Arg Thr Leu Ser Ile Ile Lys Pro Asp Ala Leu Lys Lys
1 5 10 15
Lys Val Val Gly Lys Ile Ile Asp Arg Phe Glu Ser Asn Gly Leu Glu
20 25 30
Val Val Ala Met Lys Arg Leu His Leu Ser Val Lys Asp Ala Glu Asn
35 40 45
Phe Tyr Ala Ile His Arg Glu Arg Pro Phe Phe Lys Asp Leu Ile Glu
50 55 60
Phe Met Val Ser Gly Pro Val Val Val Met Val Leu Glu Gly Lys Asp
65 70 75 80
Ala Val Ala Lys Asn Arg Asp Leu Met Gly Ala Thr Asp Pro Lys Leu
85 90 95
Ala Gln Lys Gly Thr Ile Arg Ala Asp Phe Ala Glu Ser Ile Asp Ala
100 105 110
Asn Ala Val His Gly Ser Asp Ser Leu Glu Asn Ala His Asn Glu Ile
115 120 125
Ala Phe
130
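
The listing above gives each polypeptide as three-letter residue codes interleaved with position numbers, together with a declared LENGTH. A minimal sketch of how such an entry might be converted to a one-letter string and checked against its declared length follows. The function names `listing_to_one_letter` and `check_declared_length` are hypothetical and form no part of the application; the example input is the first sixteen residues of SEQ ID NO:602, written with the standard codes Gln and Ile.

```python
# Standard three-letter to one-letter amino acid codes (IUPAC).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def listing_to_one_letter(text):
    """Convert three-letter residue text to a one-letter string.

    Purely numeric tokens are treated as position markers and skipped.
    Any other unrecognized token raises ValueError, so damaged codes
    are reported rather than silently dropped.
    """
    residues = []
    for token in text.split():
        if token.isdigit():
            continue  # position marker such as 1, 5, 10, 15
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


def check_declared_length(text, declared_length):
    """Return True if the residue count matches the declared LENGTH."""
    return len(listing_to_one_letter(text)) == declared_length


# Hypothetical usage: first sixteen residues of SEQ ID NO:602.
fragment = """
Leu Lys Gln Arg Thr Leu Ser Ile Ile Lys Pro Asp Ala Leu Lys Lys
1 5 10 15
"""
print(listing_to_one_letter(fragment))      # LKQRTLSIIKPDALKK
print(check_declared_length(fragment, 16))  # True
```

Raising on an unknown token, rather than skipping it, is a design choice for this sketch: a misread residue code then surfaces as an error instead of shortening the sequence unnoticed.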

Claims

IT IS CLAIMED:
1. A H. pylori antigen comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS: 340-468, where said antigen is in substantially purified form and is characterized by immunoreactivity with H. pylori positive anti-sera.
2. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS: 340-349.
3. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:350-359.
4. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:360-369.
5. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS: 370-379.
6. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:380-389.
7. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS: 390-399.
8. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:400-409.
9. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:410-419.
10. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:420-429.
11. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:430-439.
12. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:440-449.
13. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:450-459.
14. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a cluster antigen sequence selected from the group consisting of SEQ ID NOS:460-468.
15. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 2, 4, 5, 7, 9, 10, 12, 14, 17, 21, 25-28, 36, 37, 39, 44, 48, 55, 59, 61, 69, 249, 250, 252, 254, 256, 258, 260-263, 265-269, 323, 324, and 550-554.
16. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a polypeptide sequence selected from the group consisting of SEQ ID NOS: 555-602.
17. A H. pylori antigen of claim 1, comprising a polypeptide sequence selected from the group consisting of SEQ ID NOs: 555-602.
18. A H. pylori antigen of claim 1, comprising at least 6 contiguous amino acids contained within a sequence selected from the group consisting of SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
19. A H. pylori antigen in substantially purified form and characterized by immunoreactivity with H. pylori positive anti-sera, where said antigen is encoded by a polynucleotide sequence at least 18 nucleotides in length and capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of SEQ ID NOs:469-547.
20. The H. pylori antigen of claim 19, encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 1-10, corresponding to SEQ ID NOS:469, 470 (cluster 1); SEQ ID NO:471 (cluster 2); SEQ ID NO:472 (cluster 3); SEQ ID NO:473 (cluster 4); SEQ ID NOs:474, 475 (cluster 5); SEQ ID NO:476 (cluster 6); SEQ ID NO:477 (cluster 7); SEQ ID NO:478 (cluster 8); SEQ ID NO:479 (cluster 9); and SEQ ID NO:480 (cluster 10).
21. The H. pylori antigen of claim 19, encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 11-20, corresponding to SEQ ID NO:481 (cluster 11); SEQ ID NO:482 (cluster 12); SEQ ID NO: 483 (cluster 13); SEQ ID NO:484 (cluster 14); SEQ ID NOs:485, 486 (cluster 15); SEQ ID NO:487 (cluster 16); SEQ ID NO:488 (cluster 17); SEQ ID NO:489 (cluster 18); SEQ ID NO:490 (cluster 19); and SEQ ID NO:491 (cluster 20).
22. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 21-30, corresponding to SEQ ID NO: 492 (cluster 21); SEQ ID NO:493 (cluster 22); SEQ ID NO:494 (cluster 23); SEQ ID NO:495 (cluster 24); SEQ ID NOs:496 (cluster 25); SEQ ID NO:497 (cluster 26); SEQ ID NO:498 (cluster 27); SEQ ID NO:499 (cluster 28); SEQ ID NO:500 (cluster 29); and SEQ ID NO:501 (cluster 30).
23. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 31-40, corresponding to SEQ ID NO:502 (cluster 31); SEQ ID NO:503 (cluster 32); SEQ ID NO:504 (cluster 33); SEQ ID NO:505 (cluster 34); SEQ ID NOs:506, 507, 508 (cluster 35); SEQ ID NO:509 (cluster 36); SEQ ID NO:510 (cluster 37); SEQ ID NO:511 (cluster 38); SEQ ID NO:512 (cluster 39); and SEQ ID NOs:513, 514 (cluster 40).
24. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 41-50, corresponding to SEQ ID NO:515 (cluster 41); SEQ ID NO:516 (cluster 42); SEQ ID NOs:517, 518 (cluster 43); SEQ ID NO:519 (cluster 44); SEQ ID NOs:520 (cluster 45); SEQ ID NO:521 (cluster 46); SEQ ID NOs:522, 523 (cluster 47); SEQ ID NO:524 (cluster 48); SEQ ID NO:525 (cluster 49); and SEQ ID NO:526 (cluster 50).
25. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 51-60, corresponding to SEQ ID NO:527 (cluster 51); SEQ ID NO:528 (cluster 52); SEQ ID NOs:529 (cluster 53); SEQ ID NO:530 (cluster 54); SEQ ID NO:531 (cluster 55); SEQ ID NO:532 (cluster 56); SEQ ID NOs:533 (cluster 57); SEQ ID NOs:534, 535 (cluster 58); SEQ ID NOs:536, 537 (cluster 59); and SEQ ID NO:538 (cluster 60).
26. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence spanning a cluster region selected from the group consisting of clusters 61-69, corresponding to SEQ ID NO:539 (cluster 61); SEQ ID NO:540 (cluster 62); SEQ ID NOs:541 (cluster 63); SEQ ID NO:542 (cluster 64); SEQ ID NO:543 (cluster 65); SEQ ID NO:544 (cluster 66); SEQ ID NOs:545 (cluster 67); SEQ ID NO:546 (cluster 68); and SEQ ID NO:547 (cluster 69).
27. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:322, SEQ ID NO:549.
28. The H. pylori antigen of claim 19, where said antigen is encoded by a polynucleotide capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
29. The H. pylori antigen of claim 19, encoded by a contiguous series of nucleotides spanning a cluster region selected from the group consisting of SEQ ID NOs:469-547.
30. The H. pylori antigen of claim 27, encoded by a contiguous series of nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:322, SEQ ID NO:549.
31. The H. pylori antigen of claim 19, encoded by a contiguous series of nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
32. A H. pylori antigen-coding polynucleotide in substantially purified form, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence at least 18 nucleotides in length spanning a cluster region selected from the group consisting of SEQ ID NOs:469-547.
33. The polynucleotide of claim 32, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:322, SEQ ID NO:549.
34. The polynucleotide of claim 32, capable of selectively hybridizing, under conditions of stringent hybridization, to a DNA sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
35. The polynucleotide of claim 32, comprising at least 18 contiguous nucleotides spanning a cluster region selected from the group consisting of SEQ ID NOs:469-547.
36. The polynucleotide of claim 32, comprising at least 18 contiguous nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24, SEQ ID NO:35, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:47, SEQ ID NO:51, SEQ ID NO:54, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NOs:70-230, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:253, SEQ ID NO:255, SEQ ID NO:257, SEQ ID NO:259, SEQ ID NO:264, SEQ ID NO:270, SEQ ID NO:271, SEQ ID NO:322, SEQ ID NO:549.
37. The polynucleotide of claim 32, comprising at least 18 contiguous nucleotides contained within a sequence selected from the group consisting of SEQ ID NO:43 (A22), SEQ ID NO:38 (C1), SEQ ID NOs:94, 95 (Y124A), SEQ ID NOs:169, 172 (Y261A), SEQ ID NO:253 (c5), SEQ ID NO:20 (C7), SEQ ID NOs:51, 54 (B2), SEQ ID NO:60 (Y104B), and SEQ ID NO:98 (Y128D).
38. A diagnostic kit for use in screening a biological fluid for the presence of an anti-H. pylori antibody, comprising a substantially purified H. pylori antigen of claim 1 that is immunoreactive with at least one anti-H. pylori antibody, and a reporter for detecting binding of said antibody to the antigen.
39. A diagnostic kit for use in screening a biological fluid for the presence of an anti-H. pylori antibody, comprising a substantially purified H. pylori antigen of claim 18 that is immunoreactive with at least one anti-H. pylori antibody, and a reporter for detecting binding of said antibody to the antigen.
40. A diagnostic kit for use in screening a biological fluid for the presence of an anti-H. pylori antibody, comprising a substantially purified H. pylori antigen of claim 19 that is immunoreactive with at least one anti-H. pylori antibody, and a reporter for detecting binding of said antibody to the antigen.
41. The kit of claim 38, which includes at least two H. pylori antigens having different antibody specificities.
42. The kit of claim 38, wherein said polypeptide antigen is attached to a solid support.
43. The kit of claim 38, further comprising a non-attached reporter-labelled anti-human antibody, wherein binding of said anti-H. pylori antibody to said polypeptide antigen can be detected by binding of the reporter-labelled antibody to said anti-H. pylori antibody.
44. A method of detecting H. pylori infection in a subject, comprising reacting a biological fluid from a subject with a purified H. pylori polypeptide antigen of claim 1, and detecting the presence of antibody bound to said antigen.
45. A method of detecting H. pylori infection in a subject, comprising reacting a biological fluid from a subject with a purified H. pylori polypeptide antigen of claim 18, and detecting the presence of antibody bound to said antigen.
46. A H. pylori vaccine composition, comprising a H. pylori polypeptide antigen of claim 1 which is characterized by its ability to reduce the level of H. pylori infection in a Rhesus monkey or mouse challenged with the peptide, and then infected with H. pylori.
47. The vaccine composition of claim 46, comprising an antigen selected from the group consisting of SEQ ID NO:44 (A22), SEQ ID NO:39 (C1), SEQ ID NO:568 (Y124A), SEQ ID NO:557 (Y261A), SEQ ID NO:254 (c5), SEQ ID NO:21 (C7), SEQ ID NO:55 (B2), SEQ ID NO:61 (Y104B), SEQ ID NO:573 (Y128D).
48. The vaccine composition of claim 46, comprising an antigen selected from the group consisting of SEQ ID NO:565 (Y139), SEQ ID NO:575 (Y146B), SEQ ID NO:555 (Y175A), SEQ ID NO:44 (A22), SEQ ID NO:569 (Y184A), SEQ ID NO:578 (Z9A), SEQ ID NO:557 (Y261A) and SEQ ID NO:575 (Y146B).
49. A H. pylori vaccine composition, comprising a H. pylori polypeptide antigen of claim 1, characterized by its ability to invoke a long-lasting antigenic response in a subject challenged with said antigen and subjected to antimicrobial treatment.
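
Several of the claims above recite an antigen "comprising at least 6 contiguous amino acids contained within" a listed sequence. Purely as an illustration of that sequence relationship, and not as a test of claim scope, the sketch below checks whether a candidate peptide shares any window of six consecutive residues with a reference sequence. The function names and both candidate peptides are hypothetical; the reference string is the one-letter form of the first sixteen residues of SEQ ID NO:602, one of the sequences named in claim 16.

```python
def shared_windows(candidate, reference, k=6):
    """Return every k-residue window of `candidate` that also occurs in `reference`."""
    # Collect all k-residue windows of the reference for fast lookup.
    ref_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return [
        candidate[i:i + k]
        for i in range(len(candidate) - k + 1)
        if candidate[i:i + k] in ref_windows
    ]


def has_contiguous_match(candidate, reference, k=6):
    """True if the candidate contains at least k contiguous residues of the reference."""
    return bool(shared_windows(candidate, reference, k))


# First sixteen residues of SEQ ID NO:602 in one-letter code.
REFERENCE = "LKQRTLSIIKPDALKK"

# Hypothetical candidate peptides, invented for this example only.
print(shared_windows("MGSIIKPDAW", REFERENCE))        # ['SIIKPD', 'IIKPDA']
print(has_contiguous_match("MGSIIKPDAW", REFERENCE))  # True
print(has_contiguous_match("MGSIIKPAAW", REFERENCE))  # False
```

The sketch covers only the sequence-identity element. The further requirements of the claims, such as substantial purity and immunoreactivity with H. pylori positive anti-sera, are experimental properties that no string comparison can establish.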
PCT/US1998/008487 1997-04-25 1998-04-25 ANTIGENIC COMPOSITION AND METHOD OF DETECTION FOR $i(HELICOBACTER PYLORI) WO1998049314A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CA002288211A CA2288211A1 (en) 1997-04-25 1998-04-25 Antigenic composition and method of detection for helicobacter pylori
EP98918806A EP0977864A2 (en) 1997-04-25 1998-04-25 Antigenic composition and method of detection for helicobacter pylori
AU71660/98A AU7166098A (en) 1997-04-25 1998-04-25 Antigenic composition and method of detection for (helicobacter pylori)
JP54726398A JP2001517091A (en) 1997-04-25 1998-04-25 Antigenic compositions and methods for detecting HELICOBACTER PYLORI

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US4510797P 1997-04-25 1997-04-25
US6195897P 1997-10-14 1997-10-14
US60/045,107 1997-10-14
US60/061,958 1997-10-14

Publications (3)

Publication Number Publication Date
WO1998049314A2 true WO1998049314A2 (en) 1998-11-05
WO1998049314A3 WO1998049314A3 (en) 1999-01-14
WO1998049314A8 WO1998049314A8 (en) 1999-04-08

Family

ID=26722381

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/008487 WO1998049314A2 (en) 1997-04-25 1998-04-25 ANTIGENIC COMPOSITION AND METHOD OF DETECTION FOR $i(HELICOBACTER PYLORI)

Country Status (5)

Country Link
EP (1) EP0977864A2 (en)
JP (1) JP2001517091A (en)
AU (1) AU7166098A (en)
CA (1) CA2288211A1 (en)
WO (1) WO1998049314A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996040893A1 (en) * 1995-06-07 1996-12-19 Astra Aktiebolag Nucleic acid and amino acid sequences relating to helicobacter pylori for diagnostics and therapeutics
WO1997012909A1 (en) * 1995-10-04 1997-04-10 Pasteur Merieux Serums & Vaccins Novel membrane proteins of helicobacter pylori

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
See also references of EP0977864A2 *
TOMB J F ET AL: "THE COMPLETE GENOME SEQUENCE OF THE GASTRIC PATHOGEN HELICOBACTER PYLORI" NATURE, vol. 388, 7 August 1997, pages 539-547, XP002066695 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999048919A1 (en) * 1998-03-20 1999-09-30 Cortecs (Uk) Limited H.pylori antigen and its use
WO2000046242A2 (en) * 1999-02-04 2000-08-10 American Cyanamid Company 19 KILODALTON PROTEIN OF $i(HELICOBACTER PYLORI)
WO2000046242A3 (en) * 1999-02-04 2000-11-23 American Cyanamid Co 19 KILODALTON PROTEIN OF $i(HELICOBACTER PYLORI)
US6617116B2 (en) 2000-01-28 2003-09-09 Genelabs Diagnostics Pte. Ltd. Assay devices and methods of analyte detection
US6849414B2 (en) 2000-01-28 2005-02-01 Genelabs Diagnostics Pte Ltd. Assay devices and methods of analyte detection
WO2001083531A1 (en) * 2000-04-27 2001-11-08 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Method for identifying helicobacter antigens
WO2002040516A2 (en) * 2000-11-15 2002-05-23 Ludwig Deml Helicobacter cysteine rich protein a (hcpa) and uses thereof
WO2002040516A3 (en) * 2000-11-15 2002-08-08 Ludwig Deml Helicobacter cysteine rich protein a (hcpa) and uses thereof
WO2002077022A1 (en) * 2001-03-23 2002-10-03 Nordic Bio Ab Immunogenic cell surface proteins of helicobacter pylori
EP1492886A4 (en) * 2002-04-03 2007-11-21 Syngenta Participations Ag Detection of wheat and barley fungal pathogens which are resistant to certain fungicides using the polymerase chain reaction
EP1492886A2 (en) * 2002-04-03 2005-01-05 Syngenta Participations AG Detection of wheat and barley fungal pathogens which are resistant to certain fungicides using the polymerase chain reaction
WO2003102025A1 (en) * 2002-05-30 2003-12-11 Japan Science And Technology Agency Protein inducing cell death of helicobacter pylori
WO2009040529A1 (en) * 2007-09-28 2009-04-02 Ulive Enterprises Limited Bacterial vaccine
WO2013019098A1 (en) * 2011-08-03 2013-02-07 Universiti Sains Malaysia Helicobacter pylori proteins for diagnostic kit and vaccine
CN106573038A (en) * 2014-06-30 2017-04-19 默多克儿童研究所 Helicobacter therapeutic
US20170165347A1 (en) * 2014-06-30 2017-06-15 Murdoch Childrens Research Institute Helicobacter therapeutic
US11819546B2 (en) 2014-06-30 2023-11-21 Murdoch Childrens Research Institute Helicobacter therapeutic

Also Published As

Publication number Publication date
CA2288211A1 (en) 1998-11-05
AU7166098A (en) 1998-11-24
EP0977864A2 (en) 2000-02-09
WO1998049314A8 (en) 1999-04-08
WO1998049314A3 (en) 1999-01-14
JP2001517091A (en) 2001-10-02

Similar Documents

Publication Publication Date Title
O'Toole et al. Isolation and biochemical and molecular analyses of a species-specific protein antigen from the gastric pathogen Helicobacter pylori
US6025164A (en) Bacterial antigens and vaccine compositions
KR100886095B1 (en) Novel Streptococcus pneumoniae open reading frames encoding polypeptide antigens and a composition comprising the same
KR100923598B1 (en) Surface Proteins of Streptococcus pyogenes
Kostrzynska et al. Molecular characterization of a conserved 20-kilodalton membrane-associated lipoprotein antigen of Helicobacter pylori
WO1996012825A1 (en) CagB AND CagC GENES OF HELICOBACTER PYLORI AND RELATED METHODS AND COMPOSITIONS
US6630582B1 (en) Treatment and prevention of helicobacter infection
AU756010B2 (en) Identification of polynucleotides encoding novel helicobacter polypeptides in the helicobacter genome
EP1012157A1 (en) BORRELIA BURGDORFERI POLYNUCLEOTIDES AND SEQUENCES
WO1998049314A2 (en) ANTIGENIC COMPOSITION AND METHOD OF DETECTION FOR HELICOBACTER PYLORI
AU726892B2 (en) Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof
US5721349A (en) Vacuolating toxin-deficient H. pylori
AU734052B2 (en) Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof
WO1997037044A9 (en) Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof
US6143872A (en) Borrelia burgdorferi Osp A and B proteins and immunogenic peptides
WO1998024475A1 (en) Nucleic acid and amino acid sequences relating to helicobacter pylori and vaccine compositions thereof
WO2000069903A1 (en) LAWSONIA DERIVED GENE AND RELATED SodC POLYPEPTIDES, PEPTIDES AND PROTEINS AND THEIR USES
US20040047871A1 (en) Recombinant fusobacterium necrophorum leukotoxin vaccine and preparation thereof
AU735391B2 (en) Helicobacter polypeptides and corresponding polynucleotide molecules
WO2002038594A9 (en) Novel therapeutic compositions for treating infection by lawsonia spp
EP0980204A1 (en) 76 kDa, 32 kDa, AND 50 kDa HELICOBACTER POLYPEPTIDES AND CORRESPONDING POLYNUCLEOTIDE MOLECULES
Oliaro et al. Identification of an immunogenic 18-kDa protein of Helicobacter pylori by alkaline phosphatase gene fusions
JPH06189773A (en) Treponema hyodysenteriae vaccine
CZ388697A3 (en) Sequence of nucleic acids and amino acids relating to helicobacter pylori for diagnosis and therapy
AU1546202A (en) Enterococcus faecalis polynucleotides and polypeptides

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: C1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: C1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

CFP Corrected version of a pamphlet front page
CR1 Correction of entry in section i

Free format text: PAT. BUL. 44/98 UNDER (22) REPLACE "27.04.98" BY "25.04.98"

ENP Entry into the national phase

Ref document number: 2288211

Country of ref document: CA

Ref country code: CA

Ref document number: 2288211

Kind code of ref document: A

Format of ref document f/p: F

Ref country code: JP

Ref document number: 1998 547263

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1998918806

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1998918806

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642